WO2018095400A1 - Audio signal processing method and related device - Google Patents

Audio signal processing method and related device

Info

Publication number
WO2018095400A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
audio
audio signal
noise
video
Prior art date
Application number
PCT/CN2017/112803
Other languages
English (en)
French (fr)
Inventor
冯银华
龚连银
Original Assignee
深圳市道通智能航空技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市道通智能航空技术有限公司
Publication of WO2018095400A1

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10 Simultaneous control of position or course in three dimensions
    • G05D1/101 Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43072 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of multiple content streams on the same device
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Definitions

  • the present application relates to the field of drone technology, and in particular, to an audio signal processing method and related equipment.
  • an unmanned aerial vehicle (UAV), also referred to as a drone, can perform a variety of flight tasks, such as aerial photography, agricultural crop protection, cargo transport, and area inspection.
  • the drone can perform these flight tasks under the control of a remote controller or a terminal in communication with it.
  • the embodiments of the present application provide an audio signal processing method and related device that enable a drone to obtain an audio and video signal while performing an aerial photography task, thereby improving the user experience.
  • an embodiment of the present application provides an audio signal processing method, including:
  • an audio signal processing apparatus including:
  • a receiving unit configured to collect an ambient sound of the drone to obtain a first audio signal
  • a processing unit configured to filter out the noise signal from the first audio signal to obtain a second audio signal
  • the processing unit is further configured to synthesize the second audio signal and the collected video signal into an audio and video signal
  • a sending unit configured to send the audio and video signal to the terminal, where the audio and video signal is used for playing by the terminal.
  • an embodiment of the present application provides a drone, including:
  • An audio and video collection device wherein the audio and video collection device is disposed in the center casing or the arm;
  • the audio and video collection device, the processor, and the communication interface are electrically coupled;
  • the audio and video collection device is configured to collect an ambient sound of the drone to obtain a first audio signal
  • the processor is configured to filter a noise signal from the first audio signal to obtain a second audio signal
  • the processor is further configured to synthesize the second audio signal and the collected video signal into an audio and video signal; and send the audio and video signal to the communication interface;
  • the communication interface is configured to send the audio and video signal to a terminal, and the audio and video signal is used for playing by the terminal.
  • an embodiment of the present application provides a computer readable storage medium storing computer instructions for being executed by a processor to implement the method of the first aspect.
  • in the embodiments of the present application, a first audio signal is obtained by collecting ambient sound; a second audio signal is obtained by filtering a noise signal out of the first audio signal; and the second audio signal and the collected video signal are synthesized into an audio and video signal, which is sent to the terminal. The terminal can play the received audio and video signal, so that the drone obtains an audio and video signal while performing an aerial photography task, which improves the user's sense of on-site immersion and the user experience.
  • FIG. 1 is a schematic flowchart diagram of an audio signal processing method according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart diagram of another audio signal processing method according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of an audio signal processing apparatus according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a drone provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart diagram of an audio signal processing method according to an embodiment of the present application. As shown in FIG. 1, the method can include at least the following steps.
  • Step 101: Collect an ambient sound of the drone to obtain a first audio signal.
  • a sound collection device with which the drone is equipped, for example a microphone or a sound sensor, can be used to collect the ambient sound of the drone.
  • the environmental sound of the drone includes the live sound in the external environment where the drone is located, and noise.
  • the collected noise may include noise of an external environment in which the drone is located and noise of an internal environment in the drone.
  • the noise of the external environment may be noise generated when the UAV propeller rotates, or noise emitted from the scene.
  • the noise of the internal environment in the drone refers to the noise emitted by the components disposed in the casing of the drone during operation, for example, the noise emitted by the fan inside the casing.
  • while collecting sound with the sound collection device, the drone captures images with its camera to form a video signal, so that the audio signal and the video signal obtained by the drone are synchronized.
  • Step 102: Filter out the noise signal from the first audio signal to obtain a second audio signal.
  • the noise characteristics may include at least one of a frequency band characteristic, a loudness characteristic, a timbre characteristic, a tonal characteristic, and the like.
  • the drone can determine, according to a preset noise feature, the noise signal that matches that feature. For example, a frequency, an amplitude, and a phase can be derived from the feature and used to determine the matching noise signal.
  • for example, the noise feature corresponding to the noise produced by the rotating propellers can be preset in the drone.
  • the drone can also preset noise features according to the environment in which it is located. For example, if the environment is a concert, the noise feature corresponding to the clamor of the audience can be preset; if the environment is a natural environment, the noise feature corresponding to wind can be preset.
  • noise names representing the preset noise features may be sent to the terminal, which offers them to the user as options. The terminal determines the noise name selected by the user from the user's selection operation and may send it to the drone, and the drone then determines, from all the preset noise features, the noise feature represented by the selected name.
  • the noise signal matched with the preset noise feature may be determined according to a preset noise feature corresponding to the environment or a preset noise feature selected by the user.
  • in one implementation, if the preset noise feature includes a frequency band, the part of the first audio signal that lies in that band is determined to be the noise signal matching the preset noise feature.
  • alternatively, a frequency, a phase, and an amplitude are determined from the preset noise feature, and it is determined whether a waveform signal with that frequency, phase, and amplitude can be resolved from the first audio signal; if so, that waveform signal is determined to be the noise signal.
  • filtering out the noise signal in the first audio signal may include removing the noise signal entirely, so that the first audio signal no longer contains it, or attenuating the noise signal so that it no longer acts as noise, for example by reducing its loudness or pitch, which is not limited herein.
  • filtering the noise signal may be by any of the following methods.
  • the anti-noise signal corresponding to the noise signal can be determined.
  • the anti-noise signal is used to cancel the above determined noise signal.
  • the above noise signals are all canceled, or the noise signal is weakened, which is not limited herein.
  • in one implementation, the anti-noise signal has the same amplitude and frequency as the noise signal and the opposite phase.
  • the anti-noise signal can also be implemented in other ways, which is not limited herein.
  • the anti-noise signal can be superimposed with the first audio signal to achieve the effect of filtering out the noise signal.
  • the anti-noise signal is superimposed with the first audio signal to obtain a second audio signal, which can be used to represent the live sound of the external environment in which the drone is located.
  • Step 103: Synthesize the second audio signal and the collected video signal into an audio and video signal, and send the audio and video signal to a terminal, where the audio and video signal is used for playing by the terminal.
  • the drone can synthesize the second audio signal and the collected video signal into an audio and video signal in real time, or synthesize a segment of the second audio signal and a segment of the collected video signal into an audio and video segment. Specifically, the audio signal and the video signal corresponding to each time point are determined and synthesized into the audio and video signal for that time point, which yields the audio and video segment.
  • the drone can send the audio and video signal obtained in this way to the terminal, so that the terminal can play it, which improves the user's sense of immersion.
  • in the embodiments of the present application, a first audio signal is obtained by collecting ambient sound; a second audio signal is obtained by filtering a noise signal out of the first audio signal; and the second audio signal and the collected video signal are synthesized into an audio and video signal, which is sent to the terminal. The terminal can play the received audio and video signal, so that the drone obtains an audio and video signal while performing an aerial photography task, which improves the user's sense of on-site immersion and the user experience.
  • FIG. 2 is a schematic flowchart diagram of another audio signal processing method according to an embodiment of the present application. As shown in FIG. 2, the method can include at least the following steps.
  • Step 201: Collect an ambient sound of the drone to obtain a first audio signal.
  • Step 202: Filter out the noise signal from the first audio signal to obtain a second audio signal.
  • Step 203: Perform optimization processing on the second audio signal.
  • the optimization process may include a general processing manner such as equalization processing of the second audio signal, which is not limited herein.
  • some of the audio signals in the second audio signal may also be enhanced.
  • the third audio signal matching the sound feature may be determined from the second audio signal according to the sound feature. Further, the third audio signal is subjected to enhancement processing to highlight the playback effect of the third audio signal.
  • the drone may sequentially determine whether the second audio signal includes an audio signal that matches the sound feature; if included, the audio signal is the third audio signal.
  • the drone may select one or more sound features from the plurality of sound features, and further determine an audio signal corresponding to each of the one or more sound features in the second audio signal.
  • the drone can first identify the target object from the video signal.
  • the drone can identify the target object according to a preset identification rule or according to the indication information sent by the terminal.
  • after the drone recognizes the target object, it can determine the sound feature corresponding to the target object according to a preset correspondence between sounding objects and sound features.
  • the drone determines whether the second audio signal includes an audio signal that matches the sound feature and, if so, determines that audio signal to be the third audio signal and enhances it.
  • alternatively, after the drone recognizes the target object, it determines the time at which the target object appears in the video signal, determines the audio signal at that time in the second audio signal to be the third audio signal corresponding to the target object, and enhances the third audio signal.
  • Step 204: Synthesize the processed second audio signal with the acquired video signal into the audio and video signal.
  • Step 205: Send the audio and video signal to a terminal, where the audio and video signal is used for playing by the terminal.
  • the drone can collect the ambient sound of the environment in which it is located to obtain the first audio signal.
  • the drone can filter out the noise signal from the first audio signal to obtain the second audio signal.
  • the filtered noise signal may include a noise signal of an external environment and a noise signal of an internal environment in the drone.
  • the second audio signal may be optimized, for example, equalized processing or the like of the second audio signal.
  • the processed second audio signal and the video signal are synchronously combined to obtain an audio and video signal.
  • the drone can transmit the audio and video signals to the terminal, and the audio and video signals are played by the terminal.
  • the terminal may receive the user's selection operation, determine the target object selected by the user according to that operation, and send indication information to the drone, where the indication information instructs the drone to identify the target object in the captured video signal.
  • the indication information may include information such as an object feature of the target object, and is not limited herein.
  • the drone can identify the target object from the video signal according to the indication information, and can determine the third audio signal according to the above implementation manner, and perform enhancement processing.
  • the user can select one or more of the animals as the target object.
  • the drone can preset the sound characteristics of a plurality of animals, and determine the third audio signal that matches the sound characteristics of the target object by determining the sound characteristics of the target object.
  • the drone may determine the third audio signal according to the time when the target object appears in the video after the target object is identified, which is not limited herein.
  • when the terminal again receives the audio and video signal sent by the drone and plays the audio related to the animal sound selected by the user, that audio has already been enhanced, so the animal sound plays back with better sound quality, which improves the user experience.
  • FIG. 3 is a schematic structural diagram of an audio signal processing apparatus according to an embodiment of the present application.
  • the apparatus 300 may include a receiving unit 310, a processing unit 320, and a transmitting unit 330.
  • the receiving unit 310 is configured to collect an ambient sound of the drone to obtain a first audio signal.
  • the processing unit 320 is configured to filter out the noise signal from the first audio signal to obtain a second audio signal
  • the processing unit 320 is further configured to synthesize the second audio signal and the collected video signal into an audio and video signal;
  • the sending unit 330 is configured to send the audio and video signal to the terminal, where the audio and video signal is used for playing by the terminal.
  • the functions of the above functional units may be implemented by a combination of related components of the drone and related computer instructions stored in the memory, which is not limited herein.
  • FIG. 4 is a schematic structural diagram of an unmanned aerial vehicle according to an embodiment of the present application.
  • the drone 400 includes a center housing 401, an arm 402, an audio and video collection device 403, a processor 404, a communication interface 405, and a memory 406.
  • the center housing 401 and the arm 402 may be integrally connected or may be connected in other forms, which is not limited herein.
  • a plurality of systems such as a vision system, a flight control system, etc., may be built into the center housing 401 or the arm 402.
  • the above system may be implemented by a combination of hardware and software.
  • the audio and video collection device 403, the processor 404, the communication interface 405, and the memory 406 can be electrically coupled to each other, for example, by a communication bus, and the like.
  • the audio and video collection device 403 may be disposed inside the center housing 401 and/or the arm 402, or outside the center housing 401 and/or the arm 402. Alternatively, the audio and video collection device 403 may be connected to the center housing 401 and/or the arm 402, which is not limited herein.
  • the audio and video collection device may include independent audio collection devices, such as a microphone, a microphone array, or a sound sensor, and independent video collection devices, such as a camera. Alternatively, the audio and video collection device may integrate these independent devices to collect sound and images simultaneously.
  • the drone 400 may also include other components, such as a rechargeable battery, a picture transmission system, a pan/tilt interface, or various sensors for collecting information (such as an infrared sensor, an obstacle sensor, etc.), etc.
  • the processor 404 may be an integrated circuit chip with signal processing capabilities. Alternatively, it may be a general purpose processor, a dedicated audio and video processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the drone 400 can also include one or more memories 406.
  • the memory may include a read only memory, a random access memory, a nonvolatile random access memory, etc., which is not limited herein.
  • the memory may include a computer program or computer instructions, etc., and the processor 404 may retrieve the computer program stored in the memory 406 to implement the above method.
  • the communication interface 405 can include components such as a transceiver, an antenna, and the like for enabling a communication connection with an external device, such as a communication connection with the terminal.
  • the audio and video collection device 403 is configured to collect an ambient sound of the drone to obtain a first audio signal
  • the processor 404 is configured to filter a noise signal from the first audio signal to obtain a second audio signal
  • the processor 404 is further configured to synthesize the second audio signal and the collected video signal into an audio and video signal; and send the audio and video signal to the communication interface;
  • the communication interface 405 is configured to send the audio and video signal to a terminal, and the audio and video signal is used for playing by the terminal.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

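An audio signal processing method and related device. The method includes: collecting ambient sound of a drone to obtain a first audio signal (101); filtering a noise signal out of the first audio signal to obtain a second audio signal (102); and synthesizing the second audio signal and a collected video signal into an audio and video signal and sending the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal (103). This enables a drone to obtain an audio and video signal while performing an aerial photography task, improving the user experience.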

Description

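Audio signal processing method and related device
This application claims priority to Chinese Patent Application No. 201611059030.5, filed with the Chinese Patent Office on November 24, 2016 and entitled "Method for acquiring live sound by an unmanned aerial vehicle, method for producing video with sound, and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of unmanned aerial vehicle technology, and in particular to an audio signal processing method and related device.
Background
With the development of unmanned aerial vehicle technology, an unmanned aerial vehicle (UAV), also referred to as a drone, can perform a variety of flight tasks, for example aerial photography, agricultural crop protection, cargo transport, and area inspection. The drone can perform these flight tasks under the control of a remote controller or a terminal that communicates with it.
How a drone can obtain an audio and video signal while performing an aerial photography task has become a topic of active research for those skilled in the art.
Summary
The embodiments of this application provide an audio signal processing method and related device that enable a drone to obtain an audio and video signal while performing an aerial photography task, improving the user experience.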
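In a first aspect, an embodiment of this application provides an audio signal processing method, including:
collecting ambient sound of a drone to obtain a first audio signal;
filtering a noise signal out of the first audio signal to obtain a second audio signal; and
synthesizing the second audio signal and a collected video signal into an audio and video signal, and sending the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal.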
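In a second aspect, an embodiment of this application provides an audio signal processing apparatus, including:
a receiving unit, configured to collect ambient sound of a drone to obtain a first audio signal;
a processing unit, configured to filter a noise signal out of the first audio signal to obtain a second audio signal,
the processing unit being further configured to synthesize the second audio signal and a collected video signal into an audio and video signal; and
a sending unit, configured to send the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal.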
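In a third aspect, an embodiment of this application provides a drone, including:
a center casing;
an arm, where the arm is connected to the center casing;
an audio and video collection device, where the audio and video collection device is disposed on the center casing or the arm;
a processor; and
a communication interface;
where the audio and video collection device, the processor, and the communication interface are electrically coupled;
the audio and video collection device is configured to collect ambient sound of the drone to obtain a first audio signal;
the processor is configured to filter a noise signal out of the first audio signal to obtain a second audio signal;
the processor is further configured to synthesize the second audio signal and a collected video signal into an audio and video signal, and to send the audio and video signal to the communication interface; and
the communication interface is configured to send the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal.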
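In a fourth aspect, an embodiment of this application provides a computer-readable storage medium that stores computer instructions, where the computer instructions are to be executed by a processor to implement the method of the first aspect.
In the embodiments of this application, a first audio signal is obtained by collecting ambient sound; a second audio signal is obtained by filtering a noise signal out of the first audio signal; and the second audio signal and the collected video signal are synthesized into an audio and video signal, which is sent to the terminal. The terminal can play the received audio and video signal, so that the drone obtains an audio and video signal while performing an aerial photography task, which improves the user's sense of on-site immersion and the user experience.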
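Brief Description of the Drawings
FIG. 1 is a schematic flowchart of an audio signal processing method according to an embodiment of this application;
FIG. 2 is a schematic flowchart of another audio signal processing method according to an embodiment of this application;
FIG. 3 is a schematic structural diagram of an audio signal processing apparatus according to an embodiment of this application;
FIG. 4 is a schematic structural diagram of a drone according to an embodiment of this application.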
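Detailed Description
The embodiments of this application are described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of an audio signal processing method according to an embodiment of this application. As shown in FIG. 1, the method may include at least the following steps.
Step 101: Collect ambient sound of the drone to obtain a first audio signal.
A sound collection device with which the drone is equipped, for example a microphone or a sound sensor, can be used to collect the ambient sound of the drone.
The ambient sound of the drone includes the live sound in the external environment in which the drone is located, as well as noise.
Optionally, the collected noise may include noise from the external environment in which the drone is located and noise from the internal environment of the drone. Noise from the external environment may be the noise produced by the drone's rotating propellers, noise produced at the scene, and so on. For example, if the drone is at a concert, the singer's voice and the cheering of the audience are the live sound, and the clamor of the crowd is noise from the external environment. Noise from the internal environment of the drone is the noise produced by components inside the drone's casing during operation, for example the noise of a fan inside the casing.
Optionally, while collecting sound with the sound collection device, the drone captures images with its camera to form a video signal, so that the audio signal and the video signal obtained by the drone are synchronized.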
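Step 102: Filter the noise signal out of the first audio signal to obtain a second audio signal.
Optionally, different noise signals match different noise features. A noise feature may include at least one of a frequency band feature, a loudness feature, a timbre feature, a pitch feature, and the like. The drone can determine, according to a preset noise feature, the noise signal that matches that feature. For example, a frequency, an amplitude, and a phase can be derived from the feature and used to determine the matching noise signal.
For example, the noise feature corresponding to the noise produced by the rotating propellers can be preset in the drone. The drone can also preset noise features according to the environment in which it is located. For example, if the environment is a concert, the noise feature corresponding to the clamor of the audience can be preset; if the environment is a natural environment, the noise feature corresponding to wind can be preset, and so on.
Further, noise names representing the preset noise features may be sent to the terminal, which offers them to the user as options. The terminal determines the noise name selected by the user from the user's selection operation and may send it to the drone, and the drone then determines, from all the preset noise features, the noise feature represented by the selected name.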
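The name-based selection described above could be sketched as a lookup table held on the drone. This is only an illustration: the names `PRESET_NOISE_FEATURES`, `noise_names_for_terminal`, and `features_for_selection`, and every feature value, are hypothetical and are not specified in this application.

```python
# Hypothetical preset noise features, keyed by the noise name shown to the user.
# The band limits (Hz) are illustrative values, not taken from the application.
PRESET_NOISE_FEATURES = {
    "propeller": {"band_hz": (80.0, 400.0)},
    "crowd": {"band_hz": (300.0, 3000.0)},
    "wind": {"band_hz": (0.0, 200.0)},
}


def noise_names_for_terminal():
    """Names the drone sends to the terminal to be offered as options."""
    return sorted(PRESET_NOISE_FEATURES)


def features_for_selection(selected_names):
    """Resolve the names returned by the terminal to preset noise features.

    Unknown names are ignored rather than raising, so a stale option on the
    terminal does not interrupt processing.
    """
    return {
        name: PRESET_NOISE_FEATURES[name]
        for name in selected_names
        if name in PRESET_NOISE_FEATURES
    }
```

Keying the table by name keeps the message exchanged with the terminal small: only short strings travel over the link, while the features themselves stay on the drone.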
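For example, after the drone obtains the first audio signal, it can determine the noise signal that matches a preset noise feature according to the preset noise feature corresponding to its environment or the preset noise feature selected by the user.
In one implementation, if the preset noise feature includes a frequency band, the part of the first audio signal that lies in that band is determined to be the noise signal matching the preset noise feature. Alternatively, a frequency, a phase, and an amplitude are determined from the preset noise feature, and it is determined whether a waveform signal with that frequency, phase, and amplitude can be resolved from the first audio signal; if so, that waveform signal is determined to be the noise signal.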
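One way to check whether a waveform with a preset frequency, phase, and amplitude can be resolved from the first audio signal is to correlate the signal with a complex exponential at that frequency (a single-bin DFT). The sketch below assumes a pure sinusoidal noise model and a signal that spans a whole number of periods; the function names, the `preset` dictionary layout, and the tolerances are hypothetical and are not part of this application.

```python
import cmath
import math


def estimate_component(signal, sample_rate, freq_hz):
    """Estimate amplitude and phase of the sinusoid at freq_hz in signal.

    Correlates the signal with a complex exponential (a single-bin DFT).
    The estimate is exact when the signal spans a whole number of periods
    of freq_hz. Phase follows the convention A * cos(2*pi*f*t + phase).
    """
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate)
        for k, x in enumerate(signal)
    )
    return 2.0 * abs(acc) / n, cmath.phase(acc)


def matches_preset(signal, sample_rate, preset, amp_tol=0.1, phase_tol=0.1):
    """Return True if a component matching the preset waveform is present."""
    amp, phase = estimate_component(signal, sample_rate, preset["freq_hz"])
    # Compare phases on the circle so that values near +pi and -pi agree.
    phase_err = abs(cmath.phase(cmath.exp(1j * (phase - preset["phase"]))))
    return abs(amp - preset["amplitude"]) <= amp_tol and phase_err <= phase_tol
```

Real propeller or wind noise is not a single stationary sinusoid, so a practical system would likely track several components over short frames; the sketch only shows the matching step for one component.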
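Filtering out the noise signal in the first audio signal may include removing the noise signal entirely, so that the first audio signal no longer contains it, or attenuating the noise signal so that it no longer acts as noise, for example by reducing its loudness or pitch, which is not limited herein.
Optionally, the noise signal can be filtered out in any of the following ways.
Method 1: An anti-noise signal corresponding to the noise signal can be determined. The anti-noise signal is used to cancel the noise signal determined above, for example by canceling it completely or by attenuating it, which is not limited herein. In one implementation, the anti-noise signal has the same amplitude and frequency as the noise signal and the opposite phase. The anti-noise signal can also be implemented in other ways, which is not limited herein.
The anti-noise signal can then be superimposed on the first audio signal to filter out the noise signal. Superimposing the anti-noise signal on the first audio signal yields the second audio signal, which can represent the live sound of the external environment in which the drone is located.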
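The anti-noise approach can be illustrated for the simplest case, in which the noise is a single sinusoid with known amplitude, frequency, and phase. The sketch below is an assumption-laden illustration rather than the implementation of this application; the helper names `sinusoid`, `anti_noise`, and `superimpose` are hypothetical.

```python
import math


def sinusoid(amplitude, freq_hz, phase, sample_rate, n):
    """Sample amplitude * cos(2*pi*freq_hz*t + phase) at n points."""
    return [
        amplitude * math.cos(2.0 * math.pi * freq_hz * k / sample_rate + phase)
        for k in range(n)
    ]


def anti_noise(amplitude, freq_hz, phase, sample_rate, n):
    """Same amplitude and frequency as the noise, phase shifted by pi."""
    return sinusoid(amplitude, freq_hz, phase + math.pi, sample_rate, n)


def superimpose(first_audio, anti):
    """Add the anti-noise signal to the first audio signal sample by sample."""
    if len(first_audio) != len(anti):
        raise ValueError("signals must have the same length")
    return [a + b for a, b in zip(first_audio, anti)]
```

Because cos(x + π) = -cos(x), the superposition removes the noise component and leaves the remaining content of the first audio signal unchanged. Cancellation is only as good as the estimate of the noise parameters; an error in amplitude or phase leaves a residual.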
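Method 2: If the noise signal occurs only within a certain frequency band, the first audio signal can be passed through a band filter to obtain the second audio signal, thereby filtering out the noise signal.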
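The application does not specify the band filter. As one possible sketch, a band-stop filter can be realized in the frequency domain by zeroing the DFT bins that fall inside the noise band and transforming back. The function name `band_stop` and the direct O(n²) DFT are choices made here for clarity and are assumptions, not the method of this application.

```python
import cmath
import math


def band_stop(signal, sample_rate, low_hz, high_hz):
    """Remove components between low_hz and high_hz (inclusive) from signal.

    Uses a direct O(n^2) DFT for clarity; a real implementation would use
    an FFT or a time-domain filter.
    """
    n = len(signal)
    spectrum = [
        sum(x * cmath.exp(-2j * math.pi * m * k / n) for k, x in enumerate(signal))
        for m in range(n)
    ]
    for m in range(n):
        # Bin m and its mirror n - m both represent the same physical frequency.
        freq_hz = min(m, n - m) * sample_rate / n
        if low_hz <= freq_hz <= high_hz:
            spectrum[m] = 0.0
    return [
        sum(
            c * cmath.exp(2j * math.pi * m * k / n)
            for m, c in enumerate(spectrum)
        ).real / n
        for k in range(n)
    ]
```

Unlike the anti-noise approach, this removes everything in the band, including any live sound that happens to share those frequencies, which is why the text restricts it to noise that occurs only within a certain band.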
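Other methods, or a combination of the two methods above, can also be used, which is not limited herein.
Step 103: Synthesize the second audio signal and the collected video signal into an audio and video signal, and send the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal.
For example, the drone can synthesize the second audio signal and the collected video signal into an audio and video signal in real time, or synthesize a segment of the second audio signal and a segment of the collected video signal into an audio and video segment. Specifically, the audio signal and the video signal corresponding to each time point are determined and synthesized into the audio and video signal for that time point, which yields the audio and video segment.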
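The per-time-point synthesis can be pictured as pairing each video frame with the audio samples that fall in its time span. The sketch below is a simplified illustration; the function name `synthesize`, the tuple layout, and the assumption that the audio starts at time 0 are hypothetical, and a real system would write the result into a container format rather than a Python list.

```python
def synthesize(frames, audio, sample_rate):
    """Pair each video frame with the audio samples from its time span.

    frames: list of (timestamp_seconds, frame) tuples sorted by timestamp.
    audio: list of samples whose first sample is at time 0.
    Returns a list of (timestamp, frame, audio_chunk) tuples.
    """
    segment = []
    for i, (ts, frame) in enumerate(frames):
        start = round(ts * sample_rate)
        # The chunk ends where the next frame begins, or at the end of the audio.
        if i + 1 < len(frames):
            end = round(frames[i + 1][0] * sample_rate)
        else:
            end = len(audio)
        segment.append((ts, frame, audio[start:end]))
    return segment
```

For instance, two frames at 0.0 s and 0.5 s combined with eight audio samples at a sample rate of 8 Hz give two entries, each carrying four samples.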
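For example, the drone can send the audio and video signal obtained in this way to the terminal, so that the terminal can play it, which improves the user's sense of immersion.
In the embodiments of this application, a first audio signal is obtained by collecting ambient sound; a second audio signal is obtained by filtering a noise signal out of the first audio signal; and the second audio signal and the collected video signal are synthesized into an audio and video signal, which is sent to the terminal. The terminal can play the received audio and video signal, so that the drone obtains an audio and video signal while performing an aerial photography task, which improves the user's sense of on-site immersion and the user experience.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of another audio signal processing method according to an embodiment of this application. As shown in FIG. 2, the method may include at least the following steps.
Step 201: Collect ambient sound of the drone to obtain a first audio signal.
Step 202: Filter the noise signal out of the first audio signal to obtain a second audio signal.
For the implementation of steps 201 and 202, refer to the foregoing embodiment; details are not repeated here.
Step 203: Perform optimization processing on the second audio signal.
For example, the optimization processing may include general-purpose processing such as equalization of the second audio signal, which is not limited herein.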
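Optionally, part of the second audio signal may also be enhanced.
In a specific implementation, a third audio signal that matches a sound feature can be determined from the second audio signal according to that sound feature. The third audio signal is then enhanced to make it more prominent during playback.
If the drone is preset with multiple sound features, it can determine in turn whether the second audio signal includes an audio signal that matches each sound feature; if so, that audio signal is the third audio signal.
Optionally, the drone can select one or more sound features from the multiple sound features and then determine the audio signal in the second audio signal that corresponds to each of the selected sound features.
In one implementation, the drone can first identify a target object in the video signal. The drone can identify the target object according to a preset identification rule or according to indication information sent by the terminal.
Further, after the drone identifies the target object, it can determine the sound feature corresponding to the target object according to a preset correspondence between sounding objects and sound features.
The drone determines whether the second audio signal includes an audio signal that matches the sound feature and, if so, determines that audio signal to be the third audio signal and enhances it.
Alternatively, after the drone identifies the target object, it determines the time at which the target object appears in the video signal, determines the audio signal at that time in the second audio signal to be the third audio signal corresponding to the target object, and enhances the third audio signal.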
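The time-based variant can be sketched as applying a gain to the part of the second audio signal that coincides with the interval in which the target object appears in the video. The application does not say how the enhancement is performed; a constant linear gain, the function name `enhance_interval`, and the default gain value are assumptions made for illustration.

```python
def enhance_interval(audio, sample_rate, start_s, end_s, gain=2.0):
    """Return a copy of audio with samples in [start_s, end_s) scaled by gain."""
    start = max(0, round(start_s * sample_rate))
    end = min(len(audio), round(end_s * sample_rate))
    return [
        x * gain if start <= k < end else x
        for k, x in enumerate(audio)
    ]
```

A production implementation would probably fade the gain in and out at the interval boundaries to avoid audible clicks, and limit the result to the valid sample range.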
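Step 204: Synthesize the processed second audio signal and the collected video signal into the audio and video signal.
Step 205: Send the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal.
For the implementation of steps 204 and 205, refer to the foregoing embodiment, which is not limited herein.
For example, the drone can collect the ambient sound of its environment to obtain a first audio signal. The drone can filter the noise signal out of the first audio signal to obtain a second audio signal. The filtered noise signal may include a noise signal of the external environment and a noise signal of the internal environment of the drone. Further, the second audio signal can be optimized, for example by equalization. The processed second audio signal and the video signal are synchronously synthesized to obtain an audio and video signal. The drone can send the audio and video signal to the terminal, which plays it.
Further, the terminal may receive the user's selection operation, determine the target object selected by the user according to that operation, and send indication information to the drone, where the indication information instructs the drone to identify the target object in the captured video signal. For example, the indication information may include information such as object features of the target object, which is not limited herein.
Further, the drone can identify the target object in the video signal according to the indication information, determine the third audio signal in the manner described above, and enhance it.
For example, if several kinds of animals appear in the picture that the terminal plays to the user, the user can select one or more of them as the target object. The drone can preset the sound features of multiple animals and, by determining the sound feature of the target object, determine the third audio signal that matches it. Alternatively, after identifying the target object, the drone can determine the third audio signal according to the time at which the target object appears in the video, which is not limited herein.
Then, when the terminal again receives the audio and video signal sent by the drone and plays the audio related to the animal sound selected by the user, that audio has already been enhanced, so the animal sound plays back with better sound quality, which improves the user experience.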
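Referring to FIG. 3, FIG. 3 is a schematic structural diagram of an audio signal processing apparatus according to an embodiment of this application. As shown in FIG. 3, the apparatus 300 may include a receiving unit 310, a processing unit 320, and a sending unit 330.
The receiving unit 310 is configured to collect ambient sound of the drone to obtain a first audio signal;
the processing unit 320 is configured to filter a noise signal out of the first audio signal to obtain a second audio signal;
the processing unit 320 is further configured to synthesize the second audio signal and the collected video signal into an audio and video signal; and
the sending unit 330 is configured to send the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal.
The functional units above are also configured to perform any of the methods performed by the drone in the foregoing embodiments; details are not repeated here.
The functions of the functional units above may be implemented by related components of the drone in combination with related computer instructions stored in the memory, which is not limited herein.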
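Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a drone according to an embodiment of this application. As shown in FIG. 4, the drone 400 includes a center housing 401, an arm 402, an audio and video collection device 403, a processor 404, a communication interface 405, and a memory 406.
The center housing 401 and the arm 402 may be integrally connected or connected in another form, which is not limited herein. Multiple systems, such as a vision system and a flight control system, may be built into the center housing 401 or the arm 402, and these systems may be implemented by a combination of hardware and software.
The audio and video collection device 403, the processor 404, the communication interface 405, and the memory 406 may be electrically coupled to one another, for example through a communication bus, which is not limited herein.
The audio and video collection device 403 may be disposed inside the center housing 401 and/or the arm 402, or outside the center housing 401 and/or the arm 402. Alternatively, the audio and video collection device 403 may be connected to the center housing 401 and/or the arm 402, which is not limited herein. The audio and video collection device may include independent audio collection devices, such as a microphone, a microphone array, or a sound sensor, and independent video collection devices, such as a camera. Alternatively, the audio and video collection device may integrate these independent devices to collect sound and images simultaneously.
The drone 400 may also include other components, such as a rechargeable battery, an image transmission system, a gimbal interface, or various sensors for collecting information (such as an infrared sensor or an obstacle sensor); details are not repeated here.
The processor 404 may be an integrated circuit chip with signal processing capability. Alternatively, it may be a general-purpose processor, a dedicated audio and video processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The drone 400 may also include one or more memories 406. The memory may include a read-only memory, a random access memory, a non-volatile random access memory, and the like, which is not limited herein. The memory may store a computer program or computer instructions, and the processor 404 may invoke the computer program stored in the memory 406 to implement the method described above.
The communication interface 405 may include components such as a transceiver and an antenna, and is used to establish a communication connection with an external device, for example with the terminal.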
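The role that each component plays in implementing the method above is described below by way of example, with reference to the structure above.
For example, the audio and video collection device 403 is configured to collect ambient sound of the drone to obtain a first audio signal;
the processor 404 is configured to filter a noise signal out of the first audio signal to obtain a second audio signal;
the processor 404 is further configured to synthesize the second audio signal and the collected video signal into an audio and video signal, and to send the audio and video signal to the communication interface; and
the communication interface 405 is configured to send the audio and video signal to a terminal, where the audio and video signal is to be played by the terminal.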
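The division of roles among the components can be summarized as a small pipeline. The sketch below only models the data flow between the audio and video collection device 403, the processor 404, and the communication interface 405; the class name `DronePipeline`, its field names, and the use of plain callables are hypothetical and stand in for the actual hardware and firmware.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class DronePipeline:
    """Wires together the roles of components 403, 404, and 405."""

    capture: Callable[[], Tuple[List[float], list]]        # device 403: (audio, video)
    denoise: Callable[[Sequence[float]], List[float]]      # processor 404: noise filtering
    synthesize: Callable[[Sequence[float], list], object]  # processor 404: A/V synthesis
    send: Callable[[object], None]                         # communication interface 405

    def run_once(self):
        """Run one collect, filter, synthesize, send cycle and return the result."""
        first_audio, video = self.capture()
        second_audio = self.denoise(first_audio)
        av_signal = self.synthesize(second_audio, video)
        self.send(av_signal)
        return av_signal
```

Passing the stages in as callables keeps the sketch independent of any particular noise-filtering or synthesis method, which matches the text's statement that these are not limited to one implementation.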
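Those skilled in the art will understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments; details are not repeated here.
The foregoing is merely a description of specific embodiments of the present invention, and the scope of protection of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.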

Claims (28)

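  1. An audio signal processing method, comprising:
    collecting ambient sound of a drone to obtain a first audio signal;
    filtering a noise signal out of the first audio signal to obtain a second audio signal; and
    synthesizing the second audio signal and a collected video signal into an audio and video signal, and sending the audio and video signal to a terminal, wherein the audio and video signal is to be played by the terminal.
  2. The method according to claim 1, wherein before the filtering of the noise signal out of the first audio signal, the method further comprises:
    determining, according to a preset noise feature, the noise signal in the first audio signal that matches the preset noise feature.
  3. The method according to claim 2, wherein the noise signal comprises a noise signal of an external environment in which the drone is located and/or a noise signal of an internal environment of the drone.
  4. The method according to claim 2 or 3, wherein the preset noise feature comprises at least one of a preset frequency band feature, a preset loudness feature, a preset timbre feature, and a preset pitch feature.
  5. The method according to any one of claims 1 to 4, wherein the filtering of the noise signal out of the first audio signal comprises:
    determining an anti-noise signal corresponding to the noise signal; and
    superimposing the anti-noise signal on the first audio signal to obtain the second audio signal.
  6. The method according to any one of claims 1 to 5, wherein the method further comprises:
    performing optimization processing on the second audio signal;
    wherein the synthesizing of the second audio signal and the collected video signal into the audio and video signal comprises:
    synthesizing the processed second audio signal and the collected video signal into the audio and video signal.
  7. The method according to claim 6, wherein the performing of optimization processing on the second audio signal comprises:
    determining, according to a sound feature, a third audio signal in the second audio signal that matches the sound feature; and
    performing enhancement processing on the third audio signal.
  8. The method according to claim 7, wherein before the determining, according to the sound feature, of the third audio signal in the second audio signal that matches the sound feature, the method further comprises:
    identifying a target object in the video signal; and
    determining the sound feature corresponding to the target object.
  9. The method according to claim 8, wherein the identifying of the target object in the video signal comprises:
    identifying in the video signal, according to indication information sent by the terminal, the target object indicated by the indication information.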
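  10. An audio signal processing apparatus, comprising:
    a receiving unit, configured to collect ambient sound of a drone to obtain a first audio signal;
    a processing unit, configured to filter a noise signal out of the first audio signal to obtain a second audio signal,
    the processing unit being further configured to synthesize the second audio signal and a collected video signal into an audio and video signal; and
    a sending unit, configured to send the audio and video signal to a terminal, wherein the audio and video signal is to be played by the terminal.
  11. The apparatus according to claim 10, wherein the processing unit is further configured to:
    determine, according to a preset noise feature, the noise signal in the first audio signal that matches the preset noise feature.
  12. The apparatus according to claim 11, wherein the noise signal comprises a noise signal of an external environment in which the drone is located and/or a noise signal of an internal environment of the drone.
  13. The apparatus according to claim 11 or 12, wherein the preset noise feature comprises at least one of a preset frequency band feature, a preset loudness feature, a preset timbre feature, and a preset pitch feature.
  14. The apparatus according to any one of claims 10 to 13, wherein the processing unit is further configured to:
    determine an anti-noise signal corresponding to the noise signal; and
    superimpose the anti-noise signal on the first audio signal to obtain the second audio signal.
  15. The apparatus according to any one of claims 10 to 14, wherein the processing unit is further configured to:
    perform optimization processing on the second audio signal;
    wherein the synthesizing of the second audio signal and the collected video signal into the audio and video signal comprises:
    synthesizing the processed second audio signal and the collected video signal into the audio and video signal.
  16. The apparatus according to claim 15, wherein the processing unit is further configured to:
    determine, according to a sound feature, a third audio signal in the second audio signal that matches the sound feature; and
    perform enhancement processing on the third audio signal.
  17. The apparatus according to claim 16, wherein the processing unit is further configured to:
    identify a target object in the video signal; and
    determine the sound feature corresponding to the target object.
  18. The apparatus according to claim 17, wherein the processing unit is further configured to:
    identify in the video signal, according to indication information sent by the terminal, the target object indicated by the indication information.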
  19. 一种无人机,其特征在于,包括:
    中心机壳;
    机臂,其中,所述机臂与所述中心机壳连接;
    音视频采集装置,其中,所述音视频采集装置设置在所述中心机壳或机臂;
    处理器;以及
    通信接口;
    其中,所述音视频采集装置、所述处理器与所述通信接口电耦合;
    所述音视频采集装置用于采集无人机的环境声音,以得到第一音频信号;
    所述处理器用于从所述第一音频信号中滤除噪声信号,以得到第二音频信 号;
    所述处理器还用于将所述第二音频信号与采集的视频信号合成为音视频信号;将所述音视频信号发送至所述通信接口;
    所述通信接口用于将所述音视频信号发送至终端,所述音视频信号用于由所述终端进行播放。
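  20. The unmanned aerial vehicle according to claim 19, wherein the processor is further configured to:
    determine, according to a preset noise feature, a noise signal in the first audio signal that matches the preset noise feature.
  21. The unmanned aerial vehicle according to claim 20, wherein the noise signal comprises a noise signal of the external environment in which the unmanned aerial vehicle is located and/or a noise signal of the internal environment of the unmanned aerial vehicle.
  22. The unmanned aerial vehicle according to claim 20 or 21, wherein the preset noise feature comprises at least one of a preset frequency band feature, a preset loudness feature, a preset timbre feature, and a preset pitch feature.
  23. The unmanned aerial vehicle according to any one of claims 19 to 22, wherein the processor is further configured to:
    determine an anti-noise signal corresponding to the noise signal;
    superimpose the anti-noise signal on the first audio signal to obtain the second audio signal.
  24. The unmanned aerial vehicle according to any one of claims 19 to 23, wherein the processor is further configured to:
    perform optimization processing on the second audio signal;
    wherein the synthesizing the second audio signal and the captured video signal into an audio-video signal comprises:
    synthesizing the processed second audio signal and the captured video signal into the audio-video signal.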
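  25. The unmanned aerial vehicle according to claim 24, wherein the processor is further configured to:
    determine, according to a sound feature, a third audio signal in the second audio signal that matches the sound feature;
    perform enhancement processing on the third audio signal.
  26. The unmanned aerial vehicle according to claim 25, wherein the processor is further configured to:
    identify a target object from the video signal;
    determine the sound feature corresponding to the target object.
  27. The unmanned aerial vehicle according to claim 26, wherein the processor is further configured to:
    identify, according to indication information sent by the terminal, the target object indicated by the indication information from the video signal.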
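  28. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, the computer instructions being executed by a processor to implement the method according to any one of claims 1 to 9.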
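
The noise-filtering steps recited in claims 11 and 14 (matching a preset noise feature, deriving an anti-noise signal, and superimposing it on the first audio signal) and the enhancement step of claims 7 and 16 can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the application: the preset noise feature is reduced to a single known frequency (a hypothetical rotor tone), the matching component is estimated by projecting the signal onto a sine and a cosine at that frequency, and the names `estimate_tone`, `filter_noise` and `enhance` are hypothetical.

```python
import math


def estimate_tone(signal, freq_hz, sample_rate):
    """Estimate the sinusoidal component of `signal` at `freq_hz`.

    Assumption: the component is found by projecting the signal onto a
    cosine and a sine at that frequency (a single-bin DFT). The estimate
    is exact only when the window holds a whole number of cycles.
    """
    n = len(signal)
    w = 2.0 * math.pi * freq_hz / sample_rate
    a = 2.0 / n * sum(x * math.cos(w * i) for i, x in enumerate(signal))
    b = 2.0 / n * sum(x * math.sin(w * i) for i, x in enumerate(signal))
    return [a * math.cos(w * i) + b * math.sin(w * i) for i in range(n)]


def filter_noise(first_audio, noise_freq_hz, sample_rate):
    """Return the second audio signal: first audio plus the anti-noise signal."""
    # Noise signal matching the (assumed) preset frequency feature.
    noise = estimate_tone(first_audio, noise_freq_hz, sample_rate)
    # Anti-noise signal: same amplitude, opposite phase.
    anti_noise = [-v for v in noise]
    # Superimposing the anti-noise cancels the matched noise component.
    return [x + y for x, y in zip(first_audio, anti_noise)]


def enhance(second_audio, target_freq_hz, sample_rate, gain=2.0):
    """Boost the component matching an (assumed) sound feature by `gain`."""
    third_audio = estimate_tone(second_audio, target_freq_hz, sample_rate)
    return [x + (gain - 1.0) * t for x, t in zip(second_audio, third_audio)]


# Hypothetical usage: a 440 Hz target sound buried under a 200 Hz rotor tone.
SAMPLE_RATE = 8000
first = [
    0.5 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)
    + 1.0 * math.sin(2 * math.pi * 200 * i / SAMPLE_RATE + 0.3)
    for i in range(800)
]
second = filter_noise(first, 200, SAMPLE_RATE)
boosted = enhance(second, 440, SAMPLE_RATE, gain=2.0)
```

Real rotor noise is broadband and time-varying, so a practical system would more likely use adaptive filtering or spectral methods; the single-tone model is chosen here only to keep the superposition step easy to follow.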
PCT/CN2017/112803 2016-11-24 2017-11-24 Audio signal processing method and related device WO2018095400A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611059030.5 2016-11-24
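CN201611059030.5A CN106527478A (zh) 2016-11-24 2016-11-24 Method for acquiring on-site sound of an unmanned aerial vehicle, method for producing video with sound, and related apparatus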

Publications (1)

Publication Number Publication Date
WO2018095400A1 (zh) 2018-05-31

Family

ID=58357087

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/112803 WO2018095400A1 (zh) 2016-11-24 2017-11-24 Audio signal processing method and related device

Country Status (2)

Country Link
CN (1) CN106527478A (zh)
WO (1) WO2018095400A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
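CN106527478A (zh) * 2016-11-24 2017-03-22 深圳市道通智能航空技术有限公司 Method for acquiring on-site sound of an unmanned aerial vehicle, method for producing video with sound, and related apparatus
CN107871498A (zh) * 2017-10-10 2018-04-03 昆明理工大学 Hybrid feature combination algorithm based on the Fisher criterion for improving the speech recognition rate
CN107821379A (zh) * 2017-12-12 2018-03-23 赵有科 System for driving away using an unmanned aerial vehicle
WO2019227279A1 (zh) * 2018-05-28 2019-12-05 深圳市大疆创新科技有限公司 Noise reduction method and apparatus, and unmanned aerial vehicle
CN109559757A (zh) * 2018-11-30 2019-04-02 维沃移动通信有限公司 Noise cancellation method and mobile terminal
CN111247811A (zh) * 2018-12-24 2020-06-05 深圳市大疆创新科技有限公司 Data processing method, unmanned aerial vehicle, glasses device and storage medium
CN115209209A (zh) * 2022-09-15 2022-10-18 成都索贝数码科技股份有限公司 Method for recording and distributing short videos with professional audio by mobile phone at a live performance venue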

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090081943A1 (en) * 2007-09-26 2009-03-26 Radeum, Inc. Dba Freelinc System and method for near field communications having local security
US20160214713A1 (en) * 2014-12-19 2016-07-28 Brandon Cragg Unmanned aerial vehicle with lights, audio and video
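CN105955211A (zh) * 2016-04-28 2016-09-21 中北大学 Rotor unmanned aerial vehicle control system
CN105979167A (zh) * 2016-06-24 2016-09-28 谭圆圆 Video production method and video production apparatus
CN106527478A (zh) * 2016-11-24 2017-03-22 深圳市道通智能航空技术有限公司 Method for acquiring on-site sound of an unmanned aerial vehicle, method for producing video with sound, and related apparatus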

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9064497B2 (en) * 2012-02-22 2015-06-23 Htc Corporation Method and apparatus for audio intelligibility enhancement and computing apparatus
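CN104427068B (zh) * 2013-09-06 2019-07-12 中兴通讯股份有限公司 Voice call method and apparatus
EP3161502B1 (en) * 2014-08-29 2020-04-22 SZ DJI Technology Co., Ltd. An unmanned aerial vehicle (uav) for collecting audio data
CN106029500A (zh) * 2015-03-31 2016-10-12 深圳市大疆创新科技有限公司 Flight system, aircraft, sound device and sound processing method
CN205051857U (zh) * 2015-10-15 2016-02-24 深圳市大疆创新科技有限公司 Flying device, photographing device and recording noise reduction device thereof
CN205048699U (zh) * 2015-10-19 2016-02-24 珠海格力电器股份有限公司 Noise reduction system, electronic expansion valve and air conditioner
CN205680442U (zh) * 2016-03-24 2016-11-09 王元聪 Multi-purpose real-time speech enhancer
CN105848056A (zh) * 2016-04-01 2016-08-10 张俊斌 Novel active noise reduction method
CN105840462A (zh) * 2016-05-23 2016-08-10 苏州艾柏特精密机械有限公司 Active noise cancellation and reduction system for a compressor
CN106005454A (zh) * 2016-06-23 2016-10-12 杨珊珊 Unmanned aerial vehicle audio acquisition system and audio acquisition method thereof
CN106161218A (zh) * 2016-09-28 2016-11-23 乐视控股(北京)有限公司 Voice processing method and apparatus in real-time calls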

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
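CN111984111A (zh) * 2019-05-22 2020-11-24 中国移动通信有限公司研究院 Multimedia processing method and apparatus, and communication device
CN113419557A (zh) * 2021-06-17 2021-09-21 哈尔滨工业大学 Audio synthesis method for a moving unmanned aerial vehicle
CN113419557B (zh) * 2021-06-17 2022-07-19 哈尔滨工业大学 Audio synthesis method for a moving unmanned aerial vehicle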

Also Published As

Publication number Publication date
CN106527478A (zh) 2017-03-22

Similar Documents

Publication Publication Date Title
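WO2018095400A1 (zh) Audio signal processing method and related device
US10848889B2 (en) Intelligent audio rendering for video recording
WO2021143599A1 (zh) Scene recognition-based speech processing method and apparatus, medium and system
CN106653041A (zh) Audio signal processing device and method, and electronic device
CN104052917B (zh) Notification control device, notification control method and storage medium
JP7467615B2 (ja) Fake video detection using blockchain
CN112165590A (zh) Video recording implementation method and apparatus, and electronic device
KR101739942B1 (ko) Method for removing audio noise and image capturing apparatus applying the same
CN102611844A (zh) Method and device for processing images
CN111251307B (zh) Voice acquisition method and apparatus applied to a robot, and robot
US20210117650A1 (en) Fake video detection
WO2012177229A1 (en) Apparatus, systems and methods for identifying image objects using audio commentary
CN113439447A (zh) Room acoustics simulation using deep learning image analysis
US20210117690A1 (en) Fake video detection using video sequencing
CN107707816A (zh) Shooting method and apparatus, terminal and storage medium
US11257511B1 (en) Voice equalization based on face position and system therefor
JP6818445B2 (ja) Sound data processing device and sound data processing method
CN107087208B (zh) Panoramic video playback method and system, and storage device
US20220103639A1 (en) Server apparatus, communication system and communication method
CN111836004A (zh) Audio data transmission method, apparatus, system and device
CN113689873A (zh) Noise suppression method and apparatus, electronic device and storage medium
CN115174816A (zh) Microphone-array-based method and apparatus for directional snapshot capture of environmental noise sources
US11825191B2 (en) Method for assisting the acquisition of media content at a scene
WO2021080815A1 (en) Fake video detection
CN117501359A (zh) Transmission system, sound output method and program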

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17873270

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17873270

Country of ref document: EP

Kind code of ref document: A1