WO2020151303A1 - Audio collection apparatus, and device and method for audio processing

Publication number
WO2020151303A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
microphones
collection device
microphone
processing
Prior art date
Application number
PCT/CN2019/116331
Other languages
French (fr)
Chinese (zh)
Inventor
滕海
陈仁武
顾凤香
董敏亚
Original Assignee
阿里巴巴集团控股有限公司
Priority date
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020151303A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R5/00: Stereophonic arrangements
    • H04R5/027: Spatial or constructional arrangements of microphones, e.g. in dummy heads

Definitions

  • the present invention relates to audio processing, and in particular to an audio collection device and equipment and method for audio processing.
  • the microphone array usually includes multiple microphones used to collect audio to achieve audio enhancement, sound source localization, or de-reverberation.
  • the microphones currently used in microphone arrays are usually omnidirectional microphones without directivity. However, in voice interaction scenarios, sound from a specific direction is often of primary interest. A microphone array composed of omnidirectional microphones is prone to losing audio information, especially high-frequency information, and cannot selectively collect information from a specific direction, which increases the difficulty of subsequent processing.
  • the present invention provides a technical solution capable of enhancing the directivity of the microphone array.
  • the present invention achieves its above objectives through the following technical solutions.
  • an audio collection device includes: a bottom plate including a plurality of sound collecting structures; and a microphone array including a plurality of microphones, wherein each of the plurality of microphones is placed in a corresponding sound collecting structure, which is configured to enhance the audio collection capability of the corresponding microphone in a specific direction.
  • the sound collecting structure is a recess in the bottom plate
  • the microphone is placed at the bottom of the recess
  • the area of the opening of the recess is larger than the area of the bottom of the recess.
  • the concave portion has a conical shape.
  • the shape of the cross section of the recess is at least one of a part of a ring, a part of a parabola, and a part of a triangle.
  • each sound collecting structure has the same shape.
  • audio from multiple microphones is processed using the same processing parameters.
  • multiple microphones are connected to the same audio processing device.
  • the plurality of microphones are arranged in a linear array.
  • the plurality of microphones are arranged in an area array.
  • a device for processing audio includes: the audio collection device as described above; and an audio processing device that receives the audio collected by the audio collection device and processes the audio.
  • the audio processing device includes a preprocessing unit that amplifies and/or denoises the audio.
  • the audio processing device includes a voice recognition unit that performs voice recognition on the audio to recognize a voice command.
  • the audio processing device includes a command execution unit, and the command execution unit executes the recognized voice command.
  • each sound collecting structure in the audio collection device has the same shape, and the audio processing device uses the same processing parameters to process the audio from each microphone in the audio collection device.
  • the audio collection device is installed on a user-facing panel of the device.
  • a method for processing audio comprises: receiving audio collected by an audio collection device as described above; performing voice recognition on the received audio to recognize a voice command; and executing the recognized voice command.
  • the same processing parameters are used to process audio from multiple microphones of the audio collection device.
  • the present invention can have the following beneficial effects:
  • enhanced collection of audio, especially of the high-frequency part of the audio
  • very low cost and ease of manufacture.
  • Fig. 1 is a schematic diagram showing an audio collection device according to an embodiment of the present specification.
  • Fig. 2 is a perspective view showing an audio collecting device according to an embodiment of the present specification.
  • Fig. 3 shows an operation scene of the audio collection device according to an embodiment of the present specification.
  • FIGS. 4A-4E are examples showing the shape of the sound collecting structure of the audio collecting device according to the embodiment of the present specification.
  • FIG. 5 is an example showing the shape of the opening of the sound collecting structure of the audio collecting device according to the embodiment of the present specification.
  • FIG. 6 is a schematic diagram showing an example of the arrangement of microphones of an audio collecting device according to an embodiment of the present specification.
  • FIGS. 7A-7B are schematic diagrams showing a device including an audio collecting device according to an embodiment of the present specification.
  • Fig. 8 is a block diagram showing a device including an audio collecting device according to an embodiment of the present specification.
  • FIG. 9 is a flowchart showing a method of processing audio according to an embodiment of the present specification.
  • a cost-efficient audio collection device and audio processing equipment and method are needed.
  • the audio collection device 100 may include a microphone array 102.
  • the microphone array 102 includes a plurality of microphones 106.
  • the audio collection device 100 may further include a bottom plate 104, and the bottom plate 104 supports the microphone array 102.
  • the bottom plate 104 may be the frame of the television.
  • the bottom plate 104 may not be the frame of the TV, but another bottom plate, which may be fixed to the frame or other positions of the TV.
  • the bottom plate 104 includes a plurality of sound collecting structures 108.
  • the sound collecting structure 108 is configured to enhance the audio collection capability of the corresponding microphone in a specific direction.
  • the sound collecting structure 108 is a recess in the bottom plate.
  • the recess is configured to enhance the audio collection capability of the microphone in a specific direction.
  • the shape of the sound collecting structure 108 will be described in detail below.
  • the sound collecting structure 108 may also take other forms.
  • the sound collecting structure may partially protrude from the bottom plate (not shown in the figure).
  • each microphone 106 of the microphone array 102 is located in the sound collecting structure 108.
  • the microphone 106 is located at the bottom of the sound collecting structure 108, such as the center of the bottom. This will be further described below in conjunction with the shape of the sound collecting structure.
  • the recess constituting the sound collecting structure 108 is conical, as shown in the perspective view of FIG. 2.
  • the microphone 106 may be placed at the bottom of the cone (that is, at the apex of the cone), for example. It can be understood that, compared with the traditional solution in which the microphone is simply placed in a small hole on the side of the frame, in the sound collecting structure of the embodiments of this specification the audio collection capability in the direction faced by the axis of the cone is enhanced relative to other directions, as explained below in conjunction with FIG. 3.
  • the included angle of the cone of the sound collecting structure 108 is 74°, and the depth of the cone is 2 cm.
  • in testing under these parameters, with the point sound source 302 at a distance of 1 meter from the microphone 106, and compared with an omnidirectional microphone that does not use the sound collecting structure, the pickup of normal speech (as opposed to the single-frequency tones used in other tests) improves by 8 dB with the source at 0° (the point sound source is on the axis of the sound collecting structure) and by 4 dB with the source at 90° (the line connecting the point sound source and the sound collecting structure is perpendicular to the axis of the sound collecting structure). In other words, under these parameters, the directivity can reach 4 dB.
  • any other suitable dimensional parameters can also be used. It can be appreciated that those skilled in the art can design the cone angle and the depth of the cone of the sound collecting structure according to the common distance between the sound source and the microphone, so as to achieve the best sound collecting effect.
  • the distance between the sound source and the microphone is usually 30-50 cm.
  • the included angle of the conical shape of the sound collecting structure 108 can be designed to be 25 degrees, and the depth can be designed to be 0.35 cm.
  • the distance between the sound source and the microphone is usually 3-5 meters (or other distances, depending on the size of the television).
  • the included angle of the conical shape of the sound collecting structure 108 can be designed to be 60 degrees, and the depth can be designed to be 0.7 cm. For example, different dimensional parameters can be chosen through experimentation to achieve the best sound collecting effect.
  • the concave portion can take various other shapes, and these shapes can all be selected to enhance the audio collection capability of the microphone in a specific direction.
  • the microphone is placed at the bottom of the recess, and the area of the opening of the cross section of the recess is larger than the area of the bottom of the recess, so as to achieve a sound focusing effect.
  • the shape of the cross-section of the recess may include, but is not limited to, at least one of a part of a ring (FIG. 4A), a part of a parabola (FIG. 4B), and a part of a triangle (FIG. 4C).
  • the recessed portion can also take other shapes, as shown in Figure 4D.
  • although the openings are shown as circular in FIGS. 4A-4D, it should be appreciated that the openings may have other shapes, such as rectangular, as shown in FIG. 4E. The shape of the opening will be explained in more detail with reference to FIG. 5 below.
  • the size of the cross section of the recess at the opening is generally larger than the size of the bottom (ie, the position opposite to the opening).
  • the size of the opening of the circular ring is larger than the size of its apex.
  • the size of the opening of the parabola is larger than the size of its apex.
  • the size of the sides of the triangle is larger than the size of the vertices.
  • the microphone 106 can usually be placed at the bottom of the recess.
  • the microphone 106 may be placed at the end opposite to the opening (ie, the position farthest from the opening).
  • the microphone 106 may be placed at its apex.
  • the microphone 106 may be placed at the vertex opposite to the side of the opening.
  • FIG. 5 shows an example of the opening of the sound collecting structure of the audio collecting device according to the embodiment of the present specification.
  • the opening of the recessed portion as shown in FIG. 5 can also take various shapes, including but not limited to a circle, a square, or other polygons. It can be appreciated that these are only examples of the opening of the sound collecting structure, and the shape of the opening of the sound collecting structure of the present invention is not limited thereto.
  • the cross-sectional shape of the recessed portion can be combined with the shape of the opening shown in Fig. 5 in any suitable manner.
  • the sound collecting structure or the bottom plate is formed by 3D printing. It can also be manufactured in other ways. It can be appreciated that this simple structure is easy to manufacture and low in cost.
  • the sound collecting structures have the same shape. It has been found that this solution is particularly advantageous.
  • giving the sound collecting structures the same shape not only reduces the difficulty of manufacturing, but also makes the frequency response of each microphone consistent, thereby simplifying the subsequent audio processing.
  • the audio from the multiple microphones is processed with the same processing parameters.
  • the multiple microphones are connected to the same processor, and the same processor uses the same processing parameters to process audio from the multiple microphones.
  • FIG. 6 shows a schematic diagram of an example of the arrangement of the microphone (and the corresponding sound collecting structure) of the audio collecting device according to the embodiment of the present specification.
  • microphones can be arranged in a microphone array.
  • a plurality of microphones are arranged in a linear array.
  • the plurality of microphones may be arranged in a row or a column, that is, arranged in a one-dimensional array. This arrangement is particularly easy to manufacture.
  • the plurality of microphones may be arranged in other ways.
  • the plurality of microphones may be arranged in an area array, that is, arranged in a two-dimensional array.
  • the area array includes, but is not limited to, a square array, a circular array, an elliptical array, and the like.
  • the area array may include other two-dimensional arrays, such as L-shaped arrays, irregular arrays, and the like.
  • the interval between the microphones is uniform. Or, the interval between the microphones may also be uneven.
  • Using the combination of the microphone array and the sound collecting structure of the embodiment of the present specification can uniformly and consistently improve the sound collecting ability of each microphone in the microphone array in a specific direction, thereby achieving a better audio collection effect.
  • some devices, such as tablet computers, already include a microphone, which is usually located on the side of the frame.
  • the microphone in a traditional tablet computer is usually not facing the sound source.
  • a microphone is placed in a small hole, which is usually not specially designed to have a shape suitable for enhancing the audio collection capability in a specific direction.
  • FIGS. 7A-7B show overall diagrams of equipment in which the audio collection device according to the embodiments of the present specification can be used.
  • the device is, for example, a television.
  • the television may include a frame 702 on which the audio collection device 100 as described in the embodiment of the present specification can be implemented.
  • the audio collected by the audio collecting device 100 can simply be recorded or played back by the television.
  • the audio collected by the audio collecting device 100 can be used to control the device.
  • a television can collect the user's audio through the audio collection device, and perform voice recognition on the collected audio, so that the command that the user wants to execute can be determined.
  • the user can say "pause playback" to the TV, and the audio collection device can collect this audio, recognize the "pause playback" command from it, and pause the current video or audio playback on the TV.
  • the device is, for example, a tablet computer.
  • the tablet computer may also include a frame 706 and implement the audio collection device 100 as described in the embodiment of this specification on the frame.
  • the audio collection device may be placed on the front of the frame to face the sound source (for example, the user). Placing the audio collection device on the front of the frame helps the audio collection device according to the embodiment of the present specification to enhance audio collection in the direction of the sound source.
  • although the audio collection device is implemented in the upper frame in the example of FIGS. 7A and 7B, it can be appreciated that the audio collection device can be implemented in any position of the frame. Alternatively, the audio collection device may be implemented in other structures outside the frame (for example, the base 704, etc.).
  • the audio collection device in the embodiments of this specification can be implemented in other devices (such as vending machines, smart speakers, etc.) besides televisions and tablet computers, and the embodiments of this specification are not limited to specific devices.
  • FIG. 8 shows a block diagram of a device including an audio collection device according to an embodiment of the present specification.
  • the device may be, for example, the tablet computer, TV, vending machine, etc. described above with reference to FIGS. 7A-7B, but is not limited thereto.
  • the device may include the audio collection device 100 as described above.
  • audio in a specific direction can be enhanced, thereby facilitating subsequent processing.
  • the device may also include an audio processing device 802.
  • the audio processing device 802 receives and processes audio from the audio collection device 100.
  • the audio processing device 802 can simply store or play audio.
  • the audio processing device 802 may store the collected audio in a voice memo.
  • the audio processing device 802 may include a preprocessing unit 804.
  • the preprocessing unit can preprocess the received audio. For example, operations such as amplification and denoising can be performed on the received audio.
  • the audio processing device 802 may further include a voice recognition unit 806.
  • the voice recognition unit 806 can perform voice recognition on the collected audio, so as to be able to determine the command that the user wants to execute. For example, the user can say "pause playback" to the TV, and the audio collection device can collect this audio and recognize the "pause playback" command from it.
  • the audio processing device 802 may further include a command execution unit 808.
  • the command execution unit 808 can execute the command recognized by the voice recognition unit 806, such as pausing the current video or audio playback on the TV.
  • each function may be implemented by hardware (for example, a general-purpose processor or a dedicated audio processor) or by software, and the distribution of the functions among the software/hardware units can be designed by those skilled in the art according to actual needs. All of these fall within the scope of the embodiments of this specification.
  • multiple microphones in the audio collection device 100 are connected to the same audio processing device 802.
  • the same audio processing device can use the same processing parameters to process audio from multiple microphones, thereby reducing processing complexity and improving processing performance.
  • the method 900 may include: in step 902, receiving audio collected by the audio collecting device as described above.
  • the method 900 may further include: optionally, in step 904, preprocessing the received audio.
  • the same processing parameters are used to process audio from multiple microphones of the audio collection device.
  • the same processor can use the same processing parameters to process audio from multiple microphones, thereby reducing processing complexity and improving processing performance.
  • the method 900 may further include: in step 906, performing voice recognition on the received audio to recognize the command. Since the audio collection device of the embodiments of this specification enhances audio from a specific direction, voice commands spoken by a user in that direction can be collected more clearly, thereby improving the effect of voice recognition.
  • the method 900 may also include: in step 908, executing the recognized voice command.
  • modules or elements described or shown as separate herein may be combined into a single module or element, and modules or elements described or shown herein as a single module or element may be split into multiple modules or elements.
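
The flow of method 900 listed above (receive in step 902, optionally preprocess in step 904, recognize in step 906, execute in step 908) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `ProcessingParams`, `preprocess`, and `run_method_900`, the parameter fields, and the stand-in recognizer are all hypothetical and chosen only for illustration. The one point taken from the text is that a single parameter set is applied to every microphone channel.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Channel = List[float]  # samples from one microphone (illustrative representation)


@dataclass(frozen=True)
class ProcessingParams:
    """One parameter set shared by all microphone channels (hypothetical fields)."""
    gain: float = 2.0          # amplification factor
    noise_floor: float = 0.01  # samples below this magnitude are zeroed


def preprocess(channels: List[Channel], params: ProcessingParams) -> List[Channel]:
    """Step 904 (optional): amplify and crudely denoise every channel with the same params."""
    return [
        [0.0 if abs(s) < params.noise_floor else s * params.gain for s in channel]
        for channel in channels
    ]


def run_method_900(
    channels: List[Channel],
    params: ProcessingParams,
    recognize: Callable[[List[Channel]], Optional[str]],
    commands: Dict[str, Callable[[], None]],
) -> Optional[str]:
    """Steps 902-908: receive audio, preprocess, recognize a command, execute it."""
    # Step 902: `channels` is the audio received from the audio collection device.
    cleaned = preprocess(channels, params)   # step 904
    command = recognize(cleaned)             # step 906 (recognizer supplied by the caller)
    if command is None or command not in commands:
        return None
    commands[command]()                      # step 908
    return command


# Usage with a stand-in recognizer that always hears "pause playback".
events: List[str] = []
result = run_method_900(
    channels=[[0.001, 0.2, -0.3], [0.002, 0.19, -0.31]],
    params=ProcessingParams(),
    recognize=lambda chans: "pause playback",
    commands={"pause playback": lambda: events.append("paused")},
)
print(result, events)
```

Passing one `ProcessingParams` instance to every channel mirrors the statement that identically shaped sound collecting structures allow the same processing parameters to be used for all microphones.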

Landscapes

  • Physics & Mathematics
  • Engineering & Computer Science
  • Acoustics & Sound
  • Signal Processing
  • Circuit For Audible Band Transducer
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers

Abstract

Disclosed is an audio collection apparatus. The audio collection apparatus comprises: a bottom plate, wherein the bottom plate comprises a plurality of sound collection structures; and a microphone array, the microphone array comprising a plurality of microphones, wherein the plurality of microphones are respectively placed in corresponding sound collection structures, and the sound collection structures are constructed to enhance the audio collection capability of a corresponding microphone in a specific direction. Further disclosed are a device and method for audio processing.

Description

Audio collection device and equipment and method for processing audio
Technical field
The present invention relates to audio processing, and in particular to an audio collection device and to equipment and a method for processing audio.
Background
With the rapid development of voice interaction technology, audio collection technology based on microphone arrays has been widely used. A microphone array usually includes multiple microphones for collecting audio, so as to achieve functions such as audio enhancement, sound source localization, or de-reverberation.
The microphones currently used in microphone arrays are usually omnidirectional microphones without directivity. In voice interaction scenarios, however, sound from a specific direction is often of primary interest. A microphone array composed of omnidirectional microphones is prone to losing audio information, especially high-frequency information, and cannot selectively collect information from a specific direction, which increases the difficulty of subsequent processing.
Therefore, there is a need for a cost-effective solution that can enhance the directivity of the microphone array and enhance the collection of audio, especially the high-frequency part of the audio.
Summary of the invention
In order to overcome the defects of the prior art, the present invention provides a technical solution capable of enhancing the directivity of a microphone array.
The present invention achieves the above objectives through the following technical solutions.
In one aspect, an audio collection device is disclosed. The audio collection device includes: a bottom plate including a plurality of sound collecting structures; and a microphone array including a plurality of microphones, wherein each of the plurality of microphones is placed in a corresponding sound collecting structure, and the sound collecting structure is configured to enhance the audio collection capability of the corresponding microphone in a specific direction.
Preferably, the sound collecting structure is a recess in the bottom plate, the microphone is placed at the bottom of the recess, and the area of the opening of the recess is larger than the area of the bottom of the recess.
Preferably, the recess is conical.
Preferably, the shape of the cross section of the recess is at least one of a part of a circular ring, a part of a parabola, and a part of a triangle.
Preferably, each sound collecting structure has the same shape.
Preferably, audio from the plurality of microphones is processed using the same processing parameters.
Preferably, the plurality of microphones are connected to the same audio processing device.
Preferably, the plurality of microphones are arranged in a linear array.
Preferably, the plurality of microphones are arranged in an area array.
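
As an illustration of the linear-array and area-array arrangements mentioned above, the following Python sketch generates microphone positions for a one-dimensional row and for two kinds of two-dimensional layout (square and circular). The function names, the coordinate convention, and the spacing values are assumptions made for this example; the source does not specify any spacing.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position on the bottom plate, in cm


def linear_array(count: int, spacing_cm: float) -> List[Point]:
    """Microphones in a single row with uniform spacing (one-dimensional array)."""
    return [(i * spacing_cm, 0.0) for i in range(count)]


def square_array(rows: int, cols: int, spacing_cm: float) -> List[Point]:
    """Microphones on a rectangular grid (one kind of area array)."""
    return [(c * spacing_cm, r * spacing_cm) for r in range(rows) for c in range(cols)]


def circular_array(count: int, radius_cm: float) -> List[Point]:
    """Microphones evenly spaced on a circle (another kind of area array)."""
    step = 2.0 * math.pi / count
    return [(radius_cm * math.cos(i * step), radius_cm * math.sin(i * step))
            for i in range(count)]


# Hypothetical example: four microphones, 2 cm apart, in a row.
print(linear_array(4, 2.0))
```

Each returned position would correspond to one microphone and its sound collecting structure; non-uniform spacing could be modelled by passing an explicit list of positions instead.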
In another aspect, a device for processing audio is disclosed. The device includes: the audio collection device as described above; and an audio processing device that receives the audio collected by the audio collection device and processes the audio.
Preferably, the audio processing device includes a preprocessing unit that amplifies and/or denoises the audio.
Preferably, the audio processing device includes a voice recognition unit that performs voice recognition on the audio to recognize a voice command.
Preferably, the audio processing device includes a command execution unit that executes the recognized voice command.
Preferably, each sound collecting structure in the audio collection device has the same shape, and the audio processing device uses the same processing parameters to process the audio from each microphone in the audio collection device.
Preferably, the audio collection device is installed on a user-facing panel of the device.
In yet another aspect, a method for processing audio is disclosed. The method includes: receiving audio collected by an audio collection device as described above; performing voice recognition on the received audio to recognize a voice command; and executing the recognized voice command.
Preferably, the same processing parameters are used to process audio from the plurality of microphones of the audio collection device.
Compared with the prior art, the present invention can have the following beneficial effects:
enhanced collection of audio, especially of the high-frequency part of the audio;
enhanced collection performance for audio from a specific direction;
simplified audio processing; and
very low cost and ease of manufacture.
Of course, implementing any technical solution of the present application does not require achieving all of the above technical effects at the same time.
Description of the drawings
The above summary and the following detailed description of the present invention will be better understood when read in conjunction with the accompanying drawings. It should be noted that the drawings are only examples of the claimed invention. In the drawings, the same reference numerals denote the same or similar elements.
FIG. 1 is a schematic diagram showing an audio collection device according to an embodiment of the present specification.
FIG. 2 is a perspective view showing an audio collection device according to an embodiment of the present specification.
FIG. 3 shows an operation scene of the audio collection device according to an embodiment of the present specification.
FIGS. 4A-4E show examples of the shape of the sound collecting structure of the audio collection device according to an embodiment of the present specification.
FIG. 5 shows an example of the shape of the opening of the sound collecting structure of the audio collection device according to an embodiment of the present specification.
FIG. 6 is a schematic diagram showing an example of the arrangement of the microphones of an audio collection device according to an embodiment of the present specification.
FIGS. 7A-7B are schematic diagrams showing a device including an audio collection device according to an embodiment of the present specification.
FIG. 8 is a block diagram showing a device including an audio collection device according to an embodiment of the present specification.
FIG. 9 is a flowchart showing a method of processing audio according to an embodiment of the present specification.
Detailed description
The detailed features and advantages of the present invention are described below. The content is sufficient to enable any person skilled in the art to understand the technical content of the present invention and implement it accordingly, and from the specification, claims, and drawings disclosed herein, those skilled in the art can readily understand the related objectives and advantages of the present invention.
In order to enhance the collection of audio from a specific direction while simplifying audio processing, a cost-efficient audio collection device and audio processing equipment and method are needed.
Audio collection device
Referring to FIG. 1, there is shown a schematic diagram of an audio collection device 100 according to an embodiment of the present specification. The audio collection device 100 may include a microphone array 102. The microphone array 102 includes a plurality of microphones 106.
Preferably, the audio collection device 100 may further include a bottom plate 104 that supports the microphone array 102. For example, in the example of the television shown in FIG. 7A, the bottom plate 104 may be the frame of the television. Alternatively, the bottom plate 104 may be a separate plate rather than the frame of the television, and may be fixed to the frame or to another position on the television.
As can be seen from FIG. 1, the bottom plate 104 includes a plurality of sound collecting structures 108. Each sound collecting structure 108 is configured to enhance the audio collection capability of the corresponding microphone in a specific direction.
Preferably, refer to the perspective view of the audio collection device 100 according to an embodiment of the present specification shown in FIG. 2. As shown in FIG. 2, the sound collecting structure 108 is a recess in the bottom plate. The recess is configured to enhance the audio collection capability of the microphone in a specific direction. The shape of the sound collecting structure 108 is described in detail below.
Alternatively, the sound collecting structure 108 may take other forms. For example, the sound collecting structure may partially protrude from the bottom plate (not shown in the figures).
As shown in FIGS. 1 and 3, each microphone 106 of the microphone array 102 is located in a sound collecting structure 108. Preferably, the microphone 106 is located at the bottom of the sound collecting structure 108, for example at the center of the bottom. This is described further below in conjunction with the shape of the sound collecting structure.
Shape of the sound collecting structure
Preferably, the recess forming the sound collecting structure 108 is conical, as shown in the perspective view of FIG. 2. In this case, the microphone 106 may be placed, for example, at the bottom of the cone (that is, at the apex of the cone). It can be understood that, compared with the conventional approach of simply placing the microphone in a small hole on the side of the frame, the sound collecting structure of the embodiments of this specification enhances audio collection in the direction faced by the axis of the cone relative to other directions, as explained below with reference to FIG. 3.
Referring to FIG. 3, an operating scenario of the audio collection device 100 according to an embodiment of this specification is shown. Preferably, the included angle of the cone of the sound collecting structure 108 is 74° and the depth of the cone is 2 cm. In testing with these parameters, with the point sound source 302 at a distance of 1 meter from the microphone 106, the pickup of normal speech (as opposed to the single-frequency tones used in other tests) improved, relative to an omnidirectional microphone without the sound collecting structure, by 8 dB at 0° (the point sound source lies on the axis of the sound collecting structure) and by 4 dB at 90° (the line connecting the point sound source and the sound collecting structure is perpendicular to the axis of the sound collecting structure). In other words, with these parameters the directivity can reach 4 dB.
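The relationship between the two reported gains and the stated directivity can be checked with simple arithmetic. The following is an illustrative sketch only: the function names are hypothetical, and the conversion from decibels to a linear amplitude ratio assumes the reported figures are level gains of the form 20·log10, which the description does not state.

```python
def directivity_db(gain_on_axis_db: float, gain_off_axis_db: float) -> float:
    """Difference between the pickup gain at 0 degrees and at 90 degrees."""
    return gain_on_axis_db - gain_off_axis_db


def db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a level gain in dB to a linear amplitude ratio.

    Assumption: the gain is defined as 20 * log10(amplitude ratio).
    """
    return 10 ** (gain_db / 20)


# Values reported for the 74 degree, 2 cm cone at a distance of 1 m.
ON_AXIS_GAIN_DB = 8.0   # source on the axis of the cone (0 degrees)
OFF_AXIS_GAIN_DB = 4.0  # source perpendicular to the axis (90 degrees)

if __name__ == "__main__":
    print(directivity_db(ON_AXIS_GAIN_DB, OFF_AXIS_GAIN_DB))  # 4.0
    print(round(db_to_amplitude_ratio(ON_AXIS_GAIN_DB), 2))   # 2.51
```

Under that assumption, the 8 dB on-axis improvement corresponds to roughly a 2.5-fold increase in signal amplitude.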
It can be understood that any other suitable dimensions may also be used. Those skilled in the art can design the included angle and the depth of the cone of the sound collecting structure according to the typical distance between the sound source and the microphone, so as to achieve the best sound collecting effect.
For example, in the case of a tablet computer, the distance between the sound source and the microphone is usually 30-50 cm. In this case, the included angle of the cone of the sound collecting structure 108 may be designed to be 25 degrees and the depth to be 0.35 cm.
As another example, in the case of a television, the distance between the sound source and the microphone is usually 3-5 meters (or another distance, depending on the size of the television). In this case, the included angle of the cone of the sound collecting structure 108 may be designed to be 60 degrees and the depth to be 0.7 cm. Different dimensions can, for example, be selected through experiment to achieve the best sound collecting effect.
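To see what the three parameter sets above imply for the size of the opening, the opening diameter of a cone can be derived from its included angle and depth. This is a sketch under stated assumptions: the recess is treated as an ideal right circular cone with the microphone at the apex, the "included angle" is taken to be the full apex angle, and the function and preset names are hypothetical.

```python
import math


def cone_opening_diameter_cm(included_angle_deg: float, depth_cm: float) -> float:
    """Opening diameter of a right circular cone.

    Assumption: included_angle_deg is the full apex angle, so the wall makes
    half that angle with the axis and the radius is depth * tan(angle / 2).
    """
    half_angle_rad = math.radians(included_angle_deg / 2)
    return 2 * depth_cm * math.tan(half_angle_rad)


# (included angle in degrees, depth in cm) as given in the description.
PRESETS = {
    "tested example (1 m)": (74.0, 2.0),
    "tablet (30-50 cm)": (25.0, 0.35),
    "television (3-5 m)": (60.0, 0.7),
}

if __name__ == "__main__":
    for name, (angle_deg, depth_cm) in PRESETS.items():
        diameter = cone_opening_diameter_cm(angle_deg, depth_cm)
        print(f"{name}: opening diameter of about {diameter:.2f} cm")
```

With these assumptions the openings come out at roughly 3.0 cm, 0.16 cm, and 0.81 cm respectively, which gives a sense of how much frame area each design would occupy.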
It can be understood that other factors may also be considered when selecting the parameters of the sound collecting structure, such as ease of manufacture, manufacturing cost, and product appearance.
It can be appreciated that the recess may take various other shapes, each of which can be chosen to enhance the audio collection capability of the microphone in a specific direction. Generally, the microphone is placed at the bottom of the recess, and the area of the opening of the cross section of the recess is larger than the area of the bottom of the recess, so as to achieve the sound collecting effect.
Referring to FIGS. 4A-4E, further examples of the shape of the sound collecting structure of the audio collection device according to embodiments of this specification are shown.
As shown in FIGS. 4A-4C, the cross-sectional shape of the recess may include, but is not limited to, at least one of a part of a circular ring (FIG. 4A), a part of a parabola (FIG. 4B), and a part of a triangle (FIG. 4C). The recess may also take other shapes, as shown in FIG. 4D. Although the opening is shown as circular in FIGS. 4A-4D, it should be appreciated that the opening may have other shapes, such as a rectangle, as shown in FIG. 4E. The shape of the opening is explained in more detail below with reference to FIG. 5.
It can be appreciated that the cross section of the recess is generally larger at the opening than at the bottom (that is, the position opposite the opening). For example, with a circular-ring cross section, the ring is larger at its opening than at its apex. With a parabolic cross section, the parabola is larger at its opening than at its vertex. With a triangular cross section, the side of the triangle is larger than its vertex.
Accordingly, the microphone 106 can usually be placed at the bottom of the recess. For example, with a circular-ring cross section, the microphone 106 may be placed at the end point opposite the opening (that is, the position farthest from the opening). With a parabolic cross section, the microphone 106 may be placed at the vertex. With a triangular cross section, the microphone 106 may be placed at the vertex opposite the side at the opening. With this configuration, sound entering the recess through the opening is concentrated at the position of the microphone, which achieves a better sound collecting effect.
It can be appreciated that these are only examples of the cross section of the sound collecting structure, and the cross-sectional shape of the sound collecting structure of the present invention is not limited to them.
Referring to FIG. 5, examples of the opening of the sound collecting structure of the audio collection device according to embodiments of this specification are shown. As shown in FIG. 5, the opening of the recess may also take various shapes, including but not limited to a circle, a square, or another polygon. It can be appreciated that these are only examples of the opening of the sound collecting structure, and the shape of the opening of the sound collecting structure of the present invention is not limited to them. As mentioned above, the cross-sectional shape of the recess can be combined with the opening shapes shown in FIG. 5 in any suitable manner.
Preferably, the sound collecting structure or the bottom plate is formed by 3D printing. Other manufacturing methods may also be used. It can be appreciated that this simple structure is easy to manufacture and low in cost.
Preferably, the sound collecting structures all have the same shape. This approach has been found to be particularly advantageous. Using the same shape for the sound collecting structures not only makes manufacturing easier but also makes the frequency response of the microphones consistent, which simplifies subsequent audio processing. In this case, the audio from the plurality of microphones is processed with the same processing parameters. Preferably, the plurality of microphones are connected to the same processor, which processes the audio from the plurality of microphones with the same processing parameters.
Arrangement of the microphones
Referring to FIG. 6, a schematic diagram of examples of the arrangement of the microphones (and the corresponding sound collecting structures) of the audio collection device according to embodiments of this specification is shown.
To implement functions such as audio enhancement, sound source localization, or de-reverberation, the microphones may be arranged in a microphone array.
Preferably, as shown in FIG. 1, the plurality of microphones are arranged in a linear array. For example, the plurality of microphones may be arranged in a row or a column, that is, in a one-dimensional array. This arrangement is particularly easy to manufacture.
Alternatively, the plurality of microphones may be arranged in other ways. For example, the plurality of microphones may be arranged in an area array, that is, in a two-dimensional array. Preferably, the area array includes, but is not limited to, a square array, a circular array, an elliptical array, and the like. The area array may also be another two-dimensional array, such as an L-shaped array or an irregular array.
Preferably, the spacing between the microphones is uniform. Alternatively, the spacing between the microphones may be non-uniform.
Those skilled in the art can conceive of other arrangements, all of which fall within the scope of protection of the present invention.
Combining the microphone array with the sound collecting structures of the embodiments of this specification uniformly and consistently improves the sound collecting capability of each microphone in the array in a specific direction, which achieves a better audio collection effect.
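For the uniformly spaced linear array described above, the microphone positions and the relative arrival delays that array processing such as sound source localization typically relies on can be sketched as follows. This is not taken from the specification: the spacing value, the speed of sound, the far-field (plane wave) model, and the sign convention are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees Celsius


def linear_array_positions(num_mics: int, spacing_m: float) -> list:
    """Positions (in meters) of a uniform linear array, centered on its midpoint."""
    offset = (num_mics - 1) / 2
    return [(i - offset) * spacing_m for i in range(num_mics)]


def plane_wave_delays(positions_m: list, angle_deg: float) -> list:
    """Relative arrival delays (in seconds) for a distant source.

    Assumptions: far-field plane wave; angle_deg is measured from the
    direction perpendicular to the array (0 degrees = straight ahead).
    """
    sin_angle = math.sin(math.radians(angle_deg))
    return [x * sin_angle / SPEED_OF_SOUND_M_S for x in positions_m]


if __name__ == "__main__":
    positions = linear_array_positions(num_mics=4, spacing_m=0.04)  # hypothetical spacing
    print(positions)
    print(plane_wave_delays(positions, angle_deg=30.0))
```

A source straight ahead of the array produces identical delays at every microphone, which is the direction that the sound collecting structures in the front-facing layout are meant to favor.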
Device including the audio collection device
Some devices, such as tablet computers, already include a microphone, which is usually located on the side of the frame. The microphone in a conventional tablet computer usually does not face the sound source.
Moreover, in a conventional tablet computer the microphone is placed in a small hole, which is usually not specially designed with a shape suited to enhancing audio collection in a specific direction.
Referring to FIGS. 7A-7B, overall views of devices in which the audio collection device according to embodiments of this specification can be used are shown.
In the example of FIG. 7A, the device is a television. As shown in FIG. 7A, the television may include a frame 702, on which the audio collection device 100 described in the embodiments of this specification can be implemented.
In some embodiments, the audio collected by the audio collection device 100 may simply be recorded, played back, or otherwise processed by the television.
In other preferred embodiments, the audio collected by the audio collection device 100 may be used to control the device. For example, the television may collect the user's speech through the audio collection device and perform speech recognition on the collected audio to determine the command the user wants to execute. For example, the user may say "pause playback" to the television; the audio collection device collects this audio, the "pause playback" command is recognized from it, and playback of the video or audio currently on the television is paused.
In the example of FIG. 7B, the device is a tablet computer. As shown in FIG. 7B, the tablet computer may likewise include a frame 706, on which the audio collection device 100 described in the embodiments of this specification is implemented.
Preferably, as shown in FIGS. 7A and 7B, the audio collection device may be placed on the front of the frame so that it faces the sound source (for example, the user). Placing the audio collection device on the front of the frame helps the audio collection device according to the embodiments of this specification enhance audio collection in the direction of the sound source.
Although the audio collection device is implemented in the upper part of the frame in the examples of FIGS. 7A and 7B, it can be appreciated that the audio collection device may be implemented at any position on the frame. Alternatively, the audio collection device may be implemented in a structure other than the frame (for example, the base 704).
Those skilled in the art will appreciate that the audio collection device of the embodiments of this specification can be implemented in devices other than televisions and tablet computers (for example, vending machines and smart speakers); the embodiments of this specification are not limited to any particular device.
Block diagram of a device including the audio collection device
Referring to FIG. 8, a block diagram of a device including the audio collection device according to an embodiment of this specification is shown. The device may be, for example, the tablet computer, television, or vending machine described above with reference to FIGS. 7A-7B, but is not limited to these.
The device may include the audio collection device 100 described above. With the audio collection device of the present invention, audio from a specific direction can be enhanced, which facilitates subsequent processing.
The device may also include an audio processing device 802. The audio processing device 802 receives and processes audio from the audio collection device 100. In some simpler embodiments, the audio processing device 802 may simply store or play the audio. For example, the audio processing device 802 may store the collected audio in a voice memo.
In other preferred embodiments, the audio processing device 802 may include a preprocessing unit 804. The preprocessing unit may preprocess the received audio, for example by amplifying or denoising it.
In an embodiment, the audio processing device 802 may further include a speech recognition unit 806. The speech recognition unit 806 may perform speech recognition on the collected audio to determine the command the user wants to execute. For example, the user may say "pause playback" to the television; the audio collection device collects this audio, and the "pause playback" command is recognized from it.
In an embodiment, the audio processing device 802 may further include a command execution unit 808. The command execution unit 808 may execute the command recognized by the speech recognition unit 806, for example pausing playback of the video or audio currently on the television.
Whether each function is implemented in hardware (for example, a general-purpose processor or a dedicated audio processor) or in software, and how the functions are distributed among the software and hardware units, can be decided by those skilled in the art according to actual needs; all such designs fall within the scope of the embodiments of this specification.
Preferably, the plurality of microphones in the audio collection device 100 are connected to the same audio processing device 802.
As described above, when the sound collecting structures have the same shape, the same audio processing device can process the audio from the plurality of microphones with the same processing parameters, which reduces processing complexity and improves processing performance.
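The idea of one processing device applying a single set of parameters to every microphone channel can be illustrated with a short sketch. The specification does not say which parameters are shared; the gain and noise-gate threshold below, along with all class and function names, are hypothetical stand-ins for the amplification and denoising it mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingParams:
    """One shared parameter set (hypothetical fields)."""
    gain: float        # amplification factor applied to every sample
    noise_gate: float  # amplified samples below this magnitude are zeroed


def process_channel(samples: list, params: ProcessingParams) -> list:
    """Amplify one channel, then apply a crude noise gate."""
    processed = []
    for sample in samples:
        amplified = sample * params.gain
        processed.append(amplified if abs(amplified) >= params.noise_gate else 0.0)
    return processed


def process_array(channels: list, params: ProcessingParams) -> list:
    """Process every microphone channel with the same parameters.

    Because identically shaped structures give the microphones a consistent
    frequency response, no per-channel parameter set is needed here.
    """
    return [process_channel(channel, params) for channel in channels]


if __name__ == "__main__":
    shared = ProcessingParams(gain=2.0, noise_gate=0.01)
    print(process_array([[0.1, 0.001], [-0.2, 0.0]], shared))
    # [[0.2, 0.0], [-0.4, 0.0]]
```

If the structures had different shapes, `process_array` would instead need one `ProcessingParams` instance per channel, which is the extra complexity the identical-shape design avoids.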
Audio processing method
Referring to FIG. 9, a flowchart of a method 900 for processing audio according to an embodiment of this specification is shown. The method 900 may include: in step 902, receiving audio collected by the audio collection device described above.
The method 900 may further include: optionally, in step 904, preprocessing the received audio. Preferably, the same processing parameters are used to process the audio from the plurality of microphones of the audio collection device. For example, as described above, when the sound collecting structures have the same shape, the same processor can process the audio from the plurality of microphones with the same processing parameters, which reduces processing complexity and improves processing performance.
The method 900 may further include: in step 906, performing speech recognition on the received audio to recognize a command. Because the audio collection device of the embodiments of this specification enhances audio from a specific direction, voice commands spoken by a user located in that direction can be collected more clearly, which improves speech recognition.
The method 900 may further include: in step 908, executing the recognized voice command.
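The flow of steps 902 to 908 can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the recognizer is passed in as a plain callable because the specification names no speech recognition engine, the preprocessing is reduced to a fixed gain, and every name below is hypothetical.

```python
from typing import Callable, Dict, List, Optional

Audio = List[float]


def preprocess(audio: Audio, gain: float = 2.0) -> Audio:
    """Step 904 (optional): stand-in preprocessing that only amplifies."""
    return [sample * gain for sample in audio]


def run_method_900(
    audio: Audio,                                  # step 902: received audio
    recognize: Callable[[Audio], Optional[str]],   # hypothetical recognizer
    commands: Dict[str, Callable[[], None]],       # command name -> action
    use_preprocessing: bool = True,
) -> Optional[str]:
    """Run steps 904-908 and return the executed command name, if any."""
    if use_preprocessing:
        audio = preprocess(audio)        # step 904
    command = recognize(audio)           # step 906
    if command is None or command not in commands:
        return None                      # nothing recognized, nothing executed
    commands[command]()                  # step 908
    return command


# Usage with a stub recognizer that always "hears" the pause command.
tv_state = {"playing": True}


def pause_playback() -> None:
    tv_state["playing"] = False


if __name__ == "__main__":
    executed = run_method_900(
        audio=[0.1, 0.2, 0.1],
        recognize=lambda _audio: "pause playback",
        commands={"pause playback": pause_playback},
    )
    print(executed, tv_state)
```

Passing the recognizer and the command table as arguments mirrors the statement that the split between hardware and software units is left to the implementer.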
It should be understood that describing an element in the singular herein, or showing only one such element in the drawings, does not limit the number of that element to one. In addition, modules or elements described or shown herein as separate may be combined into a single module or element, and a module or element described or shown herein as single may be split into multiple modules or elements.
It should also be understood that the terms and expressions used herein are for description only, and the present invention is not limited to these terms and expressions. Using these terms and expressions is not intended to exclude any equivalents of the features shown and described (or portions of them), and it should be recognized that the various possible modifications are also included within the scope of the claims. Other modifications, changes, and substitutions are also possible. Accordingly, the claims should be regarded as covering all such equivalents.
Likewise, it should be noted that although the present invention has been described with reference to the present specific embodiments, those of ordinary skill in the art should recognize that the above embodiments are merely illustrative of the present invention, and that various equivalent changes or substitutions may be made without departing from the spirit of the present invention. Therefore, changes and variations to the above embodiments that are within the essential spirit of the present invention all fall within the scope of the claims of this application.

Claims (17)

  1. An audio collection device, characterized in that the audio collection device comprises:
    a bottom plate comprising a plurality of sound collecting structures; and
    a microphone array comprising a plurality of microphones,
    wherein each of the plurality of microphones is placed in a corresponding sound collecting structure, and the sound collecting structure is configured to enhance the audio collection capability, in a specific direction, of the microphone in that sound collecting structure.
  2. The audio collection device of claim 1, characterized in that the sound collecting structure is a recess in the bottom plate, the microphone is placed at the bottom of the recess, and the area of the opening of the recess is larger than the area of the bottom of the recess.
  3. The audio collection device of claim 2, characterized in that the recess is conical.
  4. The audio collection device of claim 2, characterized in that the cross-sectional shape of the recess is at least one of a part of a circular ring, a part of a parabola, and a part of a triangle.
  5. The audio collection device of claim 1, characterized in that each sound collecting structure has the same shape.
  6. The audio collection device of claim 5, characterized in that the audio from the plurality of microphones is processed with the same processing parameters.
  7. The audio collection device of claim 5, characterized in that the plurality of microphones are connected to the same audio processing device.
  8. The audio collection device of claim 1, characterized in that the plurality of microphones are arranged in a linear array.
  9. The audio collection device of claim 1, characterized in that the plurality of microphones are arranged in an area array.
  10. A device for processing audio, characterized in that the device comprises:
    the audio collection device of any one of claims 1-9; and
    an audio processing device that receives the audio collected by the audio collection device and processes the audio.
  11. The device of claim 10, characterized in that the audio processing device comprises a preprocessing unit, and the preprocessing unit amplifies and/or denoises the audio.
  12. The device of claim 10, characterized in that the audio processing device comprises a speech recognition unit, and the speech recognition unit performs speech recognition on the audio to recognize a voice command.
  13. The device of claim 12, characterized in that the audio processing device comprises a command execution unit, and the command execution unit executes the recognized voice command.
  14. The device of claim 10, characterized in that each sound collecting structure in the audio collection device has the same shape, wherein the audio processing device processes the audio from each microphone in the audio collection device with the same processing parameters.
  15. The device of claim 10, characterized in that the audio collection device is mounted on a user-facing panel of the device.
  16. A method for processing audio, characterized in that the method comprises:
    receiving audio collected by the audio collection device of any one of claims 1-9;
    performing speech recognition on the received audio to recognize a voice command; and
    executing the recognized voice command.
  17. The method of claim 16, characterized in that the same processing parameters are used to process the audio from the plurality of microphones of the audio collection device.
PCT/CN2019/116331 2019-01-23 2019-11-07 Audio collection apparatus, and device and method for audio processing WO2020151303A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910063526.7A CN109951768A (en) 2019-01-23 2019-01-23 Audio collecting device and device and method for handling audio
CN201910063526.7 2019-01-23

Publications (1)

Publication Number Publication Date
WO2020151303A1 true WO2020151303A1 (en) 2020-07-30

Family

ID=67007970

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116331 WO2020151303A1 (en) 2019-01-23 2019-11-07 Audio collection apparatus, and device and method for audio processing

Country Status (3)

Country Link
CN (1) CN109951768A (en)
TW (1) TW202029781A (en)
WO (1) WO2020151303A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109951768A (en) * 2019-01-23 2019-06-28 阿里巴巴集团控股有限公司 Audio collecting device and device and method for handling audio

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103813248A (en) * 2014-03-10 2014-05-21 金如利 Sound focusing voice pickup device
CN105898635A (en) * 2016-04-26 2016-08-24 宁波桑德纳电子科技有限公司 Pickup device for outdoor long-distance use
CN107481729A (en) * 2017-09-13 2017-12-15 百度在线网络技术(北京)有限公司 A kind of method and system that intelligent terminal is upgraded to far field speech-sound intelligent equipment
CN108616784A (en) * 2018-06-14 2018-10-02 合肥品冠慧享家智能家居科技有限责任公司 It is a kind of that there is the glass panel for improving the poly- audio fruit of microphone
CN109951768A (en) * 2019-01-23 2019-06-28 阿里巴巴集团控股有限公司 Audio collecting device and device and method for handling audio

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104066036A (en) * 2014-06-19 2014-09-24 华为技术有限公司 Pick-up device and method
CN106710601B (en) * 2016-11-23 2020-10-13 合肥美的智能科技有限公司 Noise-reduction and pickup processing method and device for voice signals and refrigerator
CN108389586A (en) * 2017-05-17 2018-08-10 宁波桑德纳电子科技有限公司 A kind of long-range audio collecting device, monitoring device and long-range collection sound method
CN207053705U (en) * 2017-05-17 2018-02-27 宁波桑德纳电子科技有限公司 Long-range split type sound collector

Also Published As

Publication number Publication date
TW202029781A (en) 2020-08-01
CN109951768A (en) 2019-06-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19911471

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19911471

Country of ref document: EP

Kind code of ref document: A1