WO2024082181A1 - Spatial audio collection method and apparatus - Google Patents
- Publication number: WO2024082181A1
- Application number: PCT/CN2022/126234
- Authority: WO, WIPO (PCT)
- Prior art keywords
- microphone
- spatial audio
- arrays
- array
- signal
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Definitions
- the present disclosure relates to the field of mobile communication technology, and in particular to a spatial audio acquisition method and device.
- spatial audio has been widely used in multimedia and instant communication of civilian equipment.
- the acquisition of spatial audio currently relies on external devices and cannot be directly acquired through smart mobile devices.
- the current spatial audio acquisition devices are too large and difficult to operate, which is not suitable for users' growing demand for high-quality audio and video acquisition.
- the present disclosure proposes a spatial audio acquisition method and device to solve the problem in the prior art that a spatial audio acquisition system cannot be integrated into a UE to perform effective and high-quality spatial audio acquisition.
- a first aspect embodiment of the present disclosure provides a spatial audio acquisition method, which is executed by a user equipment UE, wherein multiple groups of microphone arrays are arranged in the UE, and the maximum response directions of each group of arrays are mutually orthogonal.
- the method includes: performing differential beam processing on microphone signals obtained by the microphone array to obtain spatial audio signals.
- differential beam processing is performed on microphone signals acquired by a microphone array to acquire spatial audio signals, including: adding appropriate delay filtering and corresponding compensation filters to the microphone signals to acquire array signals with desired directivity; and decoding the array signals to acquire spatial audio signals.
- the method further includes: acquiring multiple directivities of the microphone array, the directivities representing the sensitivity of signals in different directions.
- the method further includes: obtaining a plurality of directivity differential arrays; and obtaining the required directivity in three-dimensional space by combining different differential arrays to obtain a spatial audio signal.
- the method further includes: decoding the spatial audio signal to output immersive multi-channel audio and/or ambisonic audio.
- the method further includes: filtering the microphone signal to obtain a low-frequency component and a high-frequency component, wherein the low-frequency component is output as a low-frequency effect and the high-frequency component is used to form a spatial audio signal.
- the microphone array is arranged in the UE in any of the following ways: the microphone array is arranged in a position close to a human voice collection component in the UE; the microphone array is arranged in a position close to an image collection component in the UE.
- the microphone array includes a predetermined number of microphones, which form three groups of microphone arrays.
- the three groups of microphone arrays are mutually orthogonal, or deviate from orthogonality by an angle within a predetermined range, and the centers of the three groups of microphone arrays coincide or are separated by a distance that does not exceed an error threshold.
- a second aspect of the present disclosure provides a spatial audio acquisition device, which is arranged to be executed in a user equipment UE.
- a plurality of microphone arrays are arranged in the UE, and the maximum response directions of each array are mutually orthogonal.
- the device includes: a spatial audio signal acquisition module, which is used to perform differential beam processing on microphone signals acquired by the microphone array to acquire spatial audio signals.
- the third aspect embodiment of the present disclosure provides a communication device, including: a transceiver; a memory; a processor, which is connected to the transceiver and the memory respectively, and is configured to control the wireless signal reception and transmission of the transceiver by executing computer executable instructions on the memory, and can implement the spatial audio acquisition method of the above-mentioned first aspect embodiment.
- the fourth aspect of the present disclosure provides a computer storage medium, wherein the computer storage medium stores computer executable instructions; after the computer executable instructions are executed by a processor, the spatial audio acquisition method of the first aspect of the present disclosure can be implemented.
- the embodiments of the present disclosure provide a spatial audio acquisition method and device, which arranges multiple groups of mutually orthogonal microphone arrays in the UE, and performs differential beam processing on the microphone signals acquired by the microphone arrays to obtain spatial audio signals.
- by setting up mutually orthogonal microphone arrays, the present disclosure keeps the acquisition system small while still controlling the directivity of the microphones, forming a pickup system that can be built into current mobile smart devices.
- the directivity of the signal collected by the pickup system is controlled to collect spatial audio, which reduces the requirements for additional electroacoustic and acoustic hardware and thereby meets the need of mobile smart devices to collect immersive audio while keeping the size of the device under control.
- FIG1 is a schematic diagram of a flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure
- FIG2 is a schematic diagram of a flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure
- FIG3 is a schematic diagram of spatial audio acquisition logic according to an embodiment of the present disclosure.
- FIG4 is a schematic diagram of a first-order differential array according to an embodiment of the present disclosure.
- FIG5 is a schematic diagram of the directivity of a microphone array according to an embodiment of the present disclosure.
- FIG6 is a schematic diagram of the directivity of the left channel and the right channel after decoding according to an embodiment of the present disclosure
- FIG7 is a schematic diagram of a first-order B format according to an embodiment of the present disclosure.
- FIG8 is a schematic diagram of a first-order B-format directional signal component according to an embodiment of the present disclosure
- FIG9 is a schematic diagram of an arrangement of a microphone array in a mobile device according to an embodiment of the present disclosure.
- FIG10 is a schematic diagram of an arrangement of a microphone array in a mobile device according to an embodiment of the present disclosure
- FIG11 is a block diagram of a spatial audio acquisition device according to an embodiment of the present disclosure.
- FIG12 is a schematic diagram of the structure of a communication device provided in an embodiment of the present disclosure.
- FIG. 13 is a schematic diagram of the structure of a chip provided in an embodiment of the present disclosure.
- Websites such as YouTube and Facebook support spatial audio content.
- audio and video codecs such as AVS support spatial audio codecs.
- the spatial audio acquisition equipment in the relevant technology cannot be built into the current mobile smart devices, and is not suitable for users' growing demand for high-quality audio and video acquisition.
- taking the smartphone, the most common mobile smart device today, as an example, the size is about 7 inches, as in the Huawei 12S PRO (length: 163.6 mm, width: 74.6 mm, thickness: 8.16 mm).
- the hardware layout in the mobile smart device is very compact, and the volume of the built-in audio acquisition system of the mobile smart device is very limited.
- the present disclosure proposes a spatial audio acquisition method and device to solve the problem in the prior art that a spatial audio acquisition system cannot be integrated into a UE to perform effective and high-quality spatial audio acquisition.
- FIG1 shows a flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure.
- the method may be implemented by a user equipment (UE).
- the UE is arranged with multiple microphone arrays, and the maximum response directions of each array are mutually orthogonal.
- the method may include the following steps.
- each microphone array may include multiple microphones.
- the present disclosure does not limit the type of microphone. To control the size of the sound pickup system, an omnidirectional miniature microphone can be used, such as a MEMS (micro-electromechanical system) microphone or an electret microphone; such microphones are small, have small error, are well suited to integration in small devices such as user equipment, and are suitable for beam control.
- the present disclosure uses an omnidirectional micro-microphone to greatly reduce the size of the spatial audio acquisition system.
- Traditional microphone beamforming includes delay-and-sum, filter-and-sum, adaptive beamforming (such as MVDR), and differential beamforming.
- Differential beamforming has the advantages of compact layout and frequency-invariant beam pattern.
- the directivity can be controlled by differential microphone beam technology to assist in obtaining spatial audio signals.
- different differential beam designs are used to combine different low-order arrays to obtain the required directivity in three-dimensional space, thereby collecting spatial audio signals.
- the solution of the present disclosure relies only on beam technology to control directivity, which can effectively reduce the dependence of the pickup system on electroacoustic and acoustic hardware.
- multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals obtained by the microphone array to obtain spatial audio signals.
- using beam technology, the present disclosure can form mutually orthogonal microphone arrays from micro-microphones, which keeps the spatial audio acquisition system small enough to be built into current mobile smart devices.
- the directivity of the microphone array is controlled to reduce the requirements for additional electroacoustic and acoustic hardware, thereby solving the requirements of mobile smart devices for collecting immersive audio while controlling the size of the device.
- Fig. 2 shows a schematic flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure.
- the method may be executed by a UE.
- the arrangement of the microphone array is first introduced.
- the microphone array includes a predetermined number of microphones, which form three groups of microphone arrays.
- the three groups of microphone arrays are mutually orthogonal, or deviate from orthogonality by an angle within a predetermined range, and the centers of the three groups of microphone arrays coincide or are separated by a distance that does not exceed an error threshold.
- the number of microphones in the microphone array of the present disclosure is not limited, and each array can be composed of any number of microphones.
- four MEMS microphones are the most cost-effective design, for example, arranged on four adjacent vertices of a regular hexahedron to form three mutually orthogonal microphone arrays.
- the present disclosure does not limit the type of microphone, and a miniature microphone (such as MEMS) can be used to control the size of the sound pickup system. Compared with previous spatial audio acquisition devices, the solution provided by the present disclosure can greatly reduce the volume.
- the arrangement of three microphone arrays at orthogonal angles is a preferred embodiment of the present invention.
- the three microphone arrays may have a certain angular offset from one another; under ideal conditions, the centers of the three arrays coincide exactly.
- any separation or distance between the centers can be regarded as an error.
- the microphone arrays on the actual device should maintain orthogonality between the arrays to reduce the interference caused by position errors.
- calibration is required based on actual conditions.
- the present disclosure uses omnidirectional miniature microphones with completely identical parameters to arrange three pairs of mutually orthogonal microphones, and the midpoints of the connecting lines of each pair of microphones coincide.
- the microphone signals constituting the array can be reused, so at least four microphones are needed to form the microphone array required by the present invention, which can be arranged at any four vertices of a regular hexahedron.
- a preferred embodiment of the present disclosure recommends the arrangement of four microphones, which are arranged at four adjacent vertices of a regular hexahedron, with the main axis directions of the microphones consistent and the spacing between the microphone arrays as small as possible.
- a three-dimensional space coordinate system is established with microphone 0 as the origin, microphone 1 is on the x-axis, microphone 2 is on the y-axis, and microphone 3 is on the z-axis.
- the distances from microphone 0 to microphone 1, microphone 2, and microphone 3 are equal, forming three pairs of orthogonal first-order differential arrays. Because miniature microphones are much smaller than traditional condenser and dynamic microphones, the spacing within each of the three pairs can be held to 4 mm, which is smaller than even the shortest wavelength (about 1.7 cm, at 20 kHz) of the target signal (20 Hz to 20 kHz), so the error caused by the microphone spacing can be ignored.
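The geometry and the spacing argument above can be checked numerically. The following is a minimal sketch, not the disclosure's own procedure: the speed of sound (343 m/s) and the helper name `max_orthogonality_error_deg` are assumptions introduced for illustration, while the 4 mm spacing and the axis assignment of microphones 0 to 3 come from the text.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value (not from the text)
SPACING = 0.004         # m, the 4 mm spacing quoted in the text

# Microphone 0 at the origin; microphones 1, 2, 3 on the x, y and z axes.
mic_positions = SPACING * np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Axis vector of each pair; microphone 0 is shared by all three pairs.
pair_axes = mic_positions[1:] - mic_positions[0]


def max_orthogonality_error_deg(axes):
    """Largest deviation from 90 degrees between any two pair axes (hypothetical helper)."""
    unit = axes / np.linalg.norm(axes, axis=1, keepdims=True)
    worst = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            cos_angle = np.clip(unit[i] @ unit[j], -1.0, 1.0)
            angle = np.degrees(np.arccos(cos_angle))
            worst = max(worst, abs(angle - 90.0))
    return worst


# Shortest wavelength in the 20 Hz to 20 kHz band, and spacing relative to it.
shortest_wavelength = SPEED_OF_SOUND / 20_000.0
spacing_ratio = SPACING / shortest_wavelength
```

On a real device, measured microphone positions could be substituted for `mic_positions` to compare the orthogonality error against whatever predetermined range the design allows.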
- the method may include the following steps.
- the microphone signals acquired by the microphone array are filtered, wherein the obtained low-frequency components are output as low-frequency effects, and the high-frequency components are used for subsequent processing to form spatial audio signals, as shown in FIG3 , which shows a logical schematic diagram of spatial audio acquisition described in the present disclosure.
- performance in the low-frequency band is poor, so the original signal of microphone 0 (i.e., a microphone signal obtained by the microphone array) can be passed through a low-pass filter that retains only the low-frequency component, which serves as the LFE channel. Because low-frequency components have longer wavelengths, they contribute less to localization by the human ear, so strengthening the low-frequency effect does not impair the sense of space. The remaining channels are passed through a high-pass filter that removes the low-frequency component, and the resulting high-frequency component is used in subsequent processing to form the spatial audio signal.
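The band split described above can be sketched as follows. This is an illustrative sketch only: the text does not specify the filter type or crossover frequency, so the one-pole filter, the 120 Hz cutoff, the 48 kHz sample rate, and the function names `one_pole_lowpass` and `split_lfe` are all assumptions. The sketch also high-passes every channel, including microphone 0, on the assumption that its signal is reused by the arrays.

```python
import numpy as np


def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """Simple one-pole low-pass filter (an assumed filter choice)."""
    a = 1.0 - np.exp(-2.0 * np.pi * cutoff_hz / sample_rate)
    y = np.empty(len(x))
    state = 0.0
    for n, sample in enumerate(x):
        state += a * (sample - state)
        y[n] = state
    return y


def split_lfe(mic_signals, cutoff_hz=120.0, sample_rate=48_000):
    """Split microphone signals into an LFE channel and high-band channels.

    mic_signals has shape (num_mics, num_samples); row 0 is microphone 0.
    Returns (lfe, high_band), where lfe is the low-passed microphone 0 signal
    and high_band holds each channel with its low-passed part subtracted.
    """
    mic_signals = np.asarray(mic_signals, dtype=float)
    lfe = one_pole_lowpass(mic_signals[0], cutoff_hz, sample_rate)
    high_band = np.stack([
        channel - one_pole_lowpass(channel, cutoff_hz, sample_rate)
        for channel in mic_signals
    ])
    return lfe, high_band
```

Because the high band is formed by subtraction, the two bands of any channel sum back to the original signal; a production design would more likely use a steeper crossover.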
- in step S201, appropriate delay filtering and corresponding compensation filters may be added to the microphone signals to obtain an array signal with the desired directivity.
- directivity represents the sensitivity of signals in different directions.
- the present disclosure obtains the required directivity in three-dimensional space by obtaining differential arrays of multiple directivities and combining different differential arrays, as shown in FIG4 , which shows a schematic diagram of a first-order differential array.
- the standard first-order differential array obtains the target signal by subtracting the microphones with the same main axis direction, and controls the directivity by adding a delay with constant angular frequency to the subtracted microphone signal:
- the output compensation filter can be expressed as: [equation not recovered from the source text], where ω is the angular frequency and the coefficient subscripted 1,1 is the delay filter coefficient,
- the direction of the differential beam can be controlled.
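How a delay steers a first-order differential pair can be illustrated with the textbook delay-and-subtract model. This is a sketch under stated assumptions, not the disclosure's formula, which is not reproduced in the text: the speed of sound, the function names, and the use of on-axis normalisation as a stand-in for the compensation filter are all illustrative choices.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def first_order_response(theta, freq_hz, spacing_m, delay_s):
    """Magnitude response of a two-microphone delay-and-subtract array.

    theta is the angle (radians) between the arrival direction and the
    array axis. The rear microphone signal is delayed by delay_s and
    subtracted from the front microphone signal.
    """
    omega = 2.0 * np.pi * freq_hz
    acoustic_delay = spacing_m * np.cos(theta) / SPEED_OF_SOUND
    return np.abs(1.0 - np.exp(-1j * omega * (delay_s + acoustic_delay)))


def normalised_pattern(theta, freq_hz, spacing_m, delay_s):
    """Response divided by the on-axis response (stand-in for compensation)."""
    on_axis = first_order_response(0.0, freq_hz, spacing_m, delay_s)
    return first_order_response(theta, freq_hz, spacing_m, delay_s) / on_axis


spacing = 0.004  # m, the 4 mm spacing from the text
cardioid_delay = spacing / SPEED_OF_SOUND  # places the null at 180 degrees
figure8_delay = 0.0                        # places the null at 90 degrees
```

In this model, a delay equal to the acoustic travel time across the pair gives a cardioid, and zero delay gives a figure-8; intermediate delays move the null between those two directions.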
- S204 Decode the array signal to obtain a spatial audio signal.
- three pairs of microphones can form the following five first-order differential arrays with different directivities:
- the required directivity in three-dimensional space is obtained, thereby collecting spatial audio signals.
- different differential beam designs are used to obtain audio signals required for spatial audio.
- different audio formats such as multi-channel audio and ambisonic (B-format) can be output, where multi-channel audio and ambisonic audio are two immersive (surround sound) formats.
- an M-S-3D recording format is constructed, and 5.1.4-channel multi-channel audio is output by decoding the spatial audio signal.
- two cardioid arrays with opposite directivities point in the positive and negative directions of the X axis.
- two figure-8 arrays point in the positive direction of the Y axis and the positive direction of the Z axis, respectively.
- the decoding method for obtaining multi-channel audio is as follows, where "+" indicates signal addition and "-" indicates addition of the inverted signal.
- the arrangement of the microphone array proposed in the present invention is as shown in the five arrays mentioned above. The directivity of array 1 in the xoy section is shown in FIG5(a). The directivity of array 3 in the xoy section is shown in FIG5(b), and the directivity of array 4 in the xoz section is shown in FIG5(c); in both, + denotes positive phase and - denotes negative phase, and identical signals of opposite phase cancel each other out.
- after decoding, the directivity of the left channel and the right channel in the coordinate-axis plane section is shown in FIG6, wherein the left channel is shown in FIG6(a) and the right channel is shown in FIG6(b).
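The decoding table itself is not reproduced in the text, so the following sketch uses the conventional mid-side rule (left = mid + side, right = mid - side) as an assumed stand-in. It combines an ideal forward cardioid with an ideal figure-8 on the Y axis; the ideal patterns, the assumption that +Y points to the listener's left, and the function names are illustrative and are not the disclosure's decoding matrix.

```python
import numpy as np


def cardioid_front(azimuth):
    """Ideal cardioid pointing along +X (azimuth measured from +X toward +Y)."""
    return 0.5 * (1.0 + np.cos(azimuth))


def figure8_y(azimuth):
    """Ideal figure-8 with its positive lobe along +Y."""
    return np.sin(azimuth)


def decode_left_right(mid, side):
    """Conventional mid-side decode: '+' adds, '-' adds the inverted signal."""
    return mid + side, mid - side


azimuth = np.linspace(-np.pi, np.pi, 721)  # 0.5 degree steps
left, right = decode_left_right(cardioid_front(azimuth), figure8_y(azimuth))

# Direction of maximum sensitivity of each decoded channel, in degrees.
left_peak_deg = np.degrees(azimuth[np.argmax(np.abs(left))])
right_peak_deg = np.degrees(azimuth[np.argmax(np.abs(right))])
```

Under these assumptions the two decoded patterns are mirror images about the X axis, with the left channel steered toward +Y and the right channel toward -Y.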
- the present disclosure can output standard ambisonic audio.
- the first-order B-format is the first-order decomposition of spherical harmonics, as shown in FIG7 .
- the standard B-format requires an omnidirectional signal (W) and three mutually orthogonal figure-8 directional signals (X, Y, Z).
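The four B-format components can be illustrated by encoding a single plane wave with ideal pickup patterns. This is a sketch of the general first-order encoding equations, not the disclosure's processing chain; the function name is hypothetical, and the W channel is left unscaled even though some B-format conventions apply a gain to it.

```python
import numpy as np


def encode_first_order_b_format(signal, azimuth, elevation):
    """Encode a plane-wave signal into W, X, Y, Z with ideal patterns.

    azimuth is measured from +X toward +Y and elevation from the
    horizontal plane toward +Z, both in radians. W is left unscaled
    (normalisation conventions vary and are an assumption here).
    """
    signal = np.asarray(signal, dtype=float)
    w = signal.copy()                                    # omnidirectional
    x = signal * np.cos(azimuth) * np.cos(elevation)     # figure-8 along X
    y = signal * np.sin(azimuth) * np.cos(elevation)     # figure-8 along Y
    z = signal * np.sin(elevation)                       # figure-8 along Z
    return np.stack([w, x, y, z])
```

For example, a source directly on the +Y axis (`azimuth = np.pi / 2`, `elevation = 0.0`) appears only in W and Y, which reflects the mutual orthogonality of the three figure-8 components.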
- the present disclosure decodes the spatial audio signal to obtain audio signals of different formats to meet the diverse needs of spatial audio acquisition.
- the microphone array in the UE may be arranged according to actual needs.
- when the handheld call requirement is taken into account, the microphone array is arranged in a position close to the human voice collection component in the UE.
- the microphone array is arranged at the lower end of the mobile smart device, closer to the human mouth, to ensure a better signal-to-noise ratio.
- FIG9 shows a schematic diagram of the arrangement of the microphone array in the mobile device, where FIG9(a) is a schematic diagram of the back side of the mobile device, and FIG9(b) is a schematic diagram of the front side.
- when the video effect is taken into account, the microphone array is arranged in a position close to the image acquisition component in the UE.
- the microphone array is arranged close to the camera and is consistent with the positive direction of the camera. By ensuring that the viewing angle is as consistent as possible with the camera, a better audio-visual effect is guaranteed.
- Figure 10 shows a schematic diagram of the arrangement of the microphone array in a mobile device, where Figure 10(a) is a schematic diagram of the back of the mobile device, and Figure 10(b) is a schematic diagram of the front.
- in the spatial audio acquisition method, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals obtained by the microphone arrays to obtain spatial audio signals.
- the present disclosure can use mutually orthogonal microphone arrays formed by micro-microphones using beam technology to control the size of the spatial audio acquisition system within a certain size, so that it can be built into current mobile devices, forming a pickup system that can be built into mobile smart devices.
- the directivity of the microphone array is controlled to reduce the requirements for additional electroacoustic and acoustic hardware, thereby solving the requirements of mobile smart devices for collecting immersive audio while controlling the size of the device.
- the present disclosure can adapt to different application scenarios by arranging microphone arrays at different positions in mobile devices.
- the method provided by the embodiment of the present application is introduced from the perspective of the user equipment.
- the user equipment may include a hardware structure and a software module, and implement the above functions in the form of a hardware structure, a software module, or a hardware structure plus a software module.
- a certain function of the above functions can be executed in the form of a hardware structure, a software module, or a hardware structure plus a software module.
- the present disclosure also provides a spatial audio acquisition device. Since the spatial audio acquisition device provided in the embodiment of the present disclosure corresponds to the spatial audio acquisition methods provided in the above-mentioned embodiments, the implementation method of the spatial audio acquisition method is also applicable to the spatial audio acquisition device provided in this embodiment, and will not be described in detail in this embodiment.
- FIG11 is a schematic diagram of the structure of a spatial audio collection device 1100 provided in an embodiment of the present disclosure.
- the spatial audio collection device 1100 is arranged in a user equipment UE for execution.
- a plurality of microphone arrays are arranged in the UE, and the maximum response directions of each array are mutually orthogonal.
- the apparatus 1100 includes: a spatial audio signal acquisition module 1110 for performing differential beam processing on microphone signals acquired by a microphone array to acquire spatial audio signals.
- in the spatial audio acquisition device, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals obtained by the microphone arrays to obtain spatial audio signals.
- the present disclosure can use mutually orthogonal microphone arrays formed by micro-microphones using beam technology to control the size of the spatial audio acquisition system within a certain size while controlling the directivity of the microphones so that it can be built into current mobile devices, forming a pickup system that can be built into mobile smart devices.
- through differential beam technology, the directivity of the signal collected by the pickup system is controlled, and the requirements for additional electroacoustic and acoustic hardware are reduced, thereby solving the requirements of mobile smart devices for collecting immersive audio while controlling the size of the device.
- the spatial audio signal acquisition module 1110 is further used to: add appropriate delay filtering and corresponding compensation filters to the microphone signal to obtain an array signal with desired directivity; and decode the array signal to obtain the spatial audio signal.
- the spatial audio signal acquisition module 1110 is further used to: acquire multiple directivities of the microphone array, where the directivities represent the sensitivity of signals in different directions.
- the spatial audio signal acquisition module 1110 is further used to: acquire a plurality of directivity differential arrays; and acquire the required directivity in three-dimensional space by combining different differential arrays to acquire the spatial audio signal.
- the spatial audio signal acquisition module 1110 is further configured to: decode the spatial audio signal to output immersive multi-channel audio and/or ambisonic audio.
- the spatial audio signal acquisition module 1110 is further used to filter the microphone signal to obtain low-frequency components and high-frequency components, wherein the low-frequency components are output as low-frequency effects and the high-frequency components are used to form a spatial audio signal.
- the microphone array is arranged in the UE in any of the following ways: the microphone array is arranged in a position close to a human voice collection component in the UE; the microphone array is arranged in a position close to an image collection component in the UE.
- the microphone array includes a predetermined number of microphones, which form three groups of microphone arrays.
- the three groups of microphone arrays are mutually orthogonal, or deviate from orthogonality by an angle within a predetermined range, and the centers of the three groups of microphone arrays coincide or are separated by a distance that does not exceed an error threshold.
- in the spatial audio acquisition device, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals obtained by the microphone arrays to obtain spatial audio signals.
- the present disclosure can use mutually orthogonal microphone arrays formed by miniature microphones using beam technology to control the size of the spatial audio acquisition system within a certain size while controlling the directivity of the microphone, so that it can be built into current mobile devices, forming a pickup system that can be built into mobile smart devices.
- through differential beam technology, the directivity of the signal collected by the pickup system is controlled to reduce the requirements for additional electroacoustic and acoustic hardware, thereby solving the requirements of mobile smart devices for collecting immersive audio while controlling the size of the device.
- the present disclosure can adapt to different application scenarios by arranging microphone arrays at different positions in mobile devices.
- FIG 12 is a schematic diagram of the structure of a communication device 1200 provided in an embodiment of the present application.
- the communication device 1200 can be a network device, or a user device, or a chip, a chip system, or a processor that supports the network device to implement the above method, or a chip, a chip system, or a processor that supports the user device to implement the above method.
- the device can be used to implement the method described in the above method embodiment, and the details can be referred to the description in the above method embodiment.
- the communication device 1200 may include one or more processors 1201.
- the processor 1201 may be a general-purpose processor or a dedicated processor, etc.
- it may be a baseband processor or a central processing unit.
- the baseband processor may be used to process the communication protocol and the communication data
- the central processing unit may be used to control the communication device (such as a base station, a baseband chip, a terminal device, a terminal device chip, a DU or a CU, etc.), execute a computer program, and process the data of the computer program.
- the communication device 1200 may further include one or more memories 1202, on which a computer program 1204 may be stored, and the processor 1201 executes the computer program 1204 so that the communication device 1200 performs the method described in the above method embodiment.
- data may also be stored in the memory 1202.
- the communication device 1200 and the memory 1202 may be provided separately or integrated together.
- the communication device 1200 may further include a transceiver 1205 and an antenna 1206.
- the transceiver 1205 may be referred to as a transceiver unit, a transceiver, or a transceiver circuit, etc., and is used to implement a transceiver function.
- the transceiver 1205 may include a receiver and a transmitter, the receiver may be referred to as a receiver or a receiving circuit, etc., and is used to implement a receiving function; the transmitter may be referred to as a transmitter or a transmitting circuit, etc., and is used to implement a transmitting function.
- the communication device 1200 may further include one or more interface circuits 1207.
- the interface circuit 1207 is used to receive code instructions and transmit them to the processor 1201.
- the processor 1201 executes the code instructions to enable the communication device 1200 to execute the method described in the above method embodiment.
- the processor 1201 may include a transceiver for implementing receiving and sending functions.
- the transceiver may be a transceiver circuit, an interface, or an interface circuit.
- the transceiver circuit, interface, or interface circuit for implementing the receiving and sending functions may be separate or integrated.
- the above-mentioned transceiver circuit, interface, or interface circuit may be used for reading and writing code/data, or the above-mentioned transceiver circuit, interface, or interface circuit may be used for transmitting or delivering signals.
- the processor 1201 may store a computer program 1203, which runs on the processor 1201 and enables the communication device 1200 to perform the method described in the above method embodiment.
- the computer program 1203 may be fixed in the processor 1201, in which case the processor 1201 may be implemented by hardware.
- the communication device 1200 may include a circuit that can implement the functions of sending or receiving or communicating in the aforementioned method embodiments.
- the processor and transceiver described in the present application may be implemented in an integrated circuit (IC), an analog IC, a radio frequency integrated circuit (RFIC), a mixed signal IC, an application specific integrated circuit (ASIC), a printed circuit board (PCB), an electronic device, etc.
- the processor and transceiver may also be manufactured using various IC process technologies, such as complementary metal oxide semiconductor (CMOS), N-type metal oxide semiconductor (NMOS), P-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), gallium arsenide (GaAs), etc.
- the communication device described in the above embodiments may be a network device or a user device, but the scope of the communication device described in the present application is not limited thereto, and the structure of the communication device may not be limited by FIG. 12.
- the communication device may be an independent device or may be part of a larger device.
- the communication device may be:
- the IC set may also include a storage component for storing data and computer programs;
- ASIC such as modem
- the communication device can be a chip or a chip system
- the schematic diagram of the chip structure shown in Figure 13 includes a processor 1301 and an interface 1302.
- the number of processors 1301 can be one or more, and the number of interfaces 1302 can be multiple.
- the chip further includes a memory 1303, and the memory 1303 is used to store necessary computer programs and data.
- the present application also provides a readable storage medium having instructions stored thereon, which implement the functions of any of the above method embodiments when executed by a computer.
- the present application also provides a computer program product, which implements the functions of any of the above method embodiments when executed by a computer.
- the computer program product includes one or more computer programs.
- the computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
- the computer program can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer program can be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave, etc.).
- the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server or data center that contains one or more available media integrated.
- Available media can be magnetic media (e.g., floppy disks, hard disks, tapes), optical media (e.g., high-density digital video discs (DVD)), or semiconductor media (e.g., solid state disks (SSD)), etc.
- “at least one” in the present application can also be described as one or more, and “a plurality” can be two, three, four or more, which is not limited in the present application.
- the technical features within one type of technical feature are distinguished by “first”, “second”, “third”, “A”, “B”, “C”, “D”, etc., and the technical features so described have no order of precedence or magnitude.
- “machine-readable medium” and “computer-readable medium” refer to any computer program product, apparatus, and/or device (e.g., disk, optical disk, memory, programmable logic device (PLD)) for providing machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal.
- “machine-readable signal” refers to any signal for providing machine instructions and/or data to a programmable processor.
- the systems and techniques described herein may be implemented in a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer with a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end components, middleware components, or front-end components.
- the components of the system may be interconnected by any form or medium of digital data communication (e.g., a communications network). Examples of communications networks include: a local area network (LAN), a wide area network (WAN), and the Internet.
- a computer system may include clients and servers.
- Clients and servers are generally remote from each other and usually interact through a communication network.
- the relationship of client and server is generated by computer programs running on respective computers and having a client-server relationship to each other.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The present disclosure relates to the technical field of mobile communications. Provided are a spatial audio collection method and apparatus. The method comprises: arranging a plurality of groups of mutually orthogonal microphone arrays in a UE, and performing differential beam processing on microphone signals acquired by the microphone arrays, so as to acquire a spatial audio signal. By means of the present disclosure, mutually orthogonal microphone arrays formed by miniature microphones that use a beam technique may be used, and the size of a spatial audio collection system is controlled to be within certain dimensions, such that the spatial audio collection system is built in an existing mobile device, and a pickup system that can be built in a mobile intelligent device is formed; in addition, the directivity of the microphone arrays is controlled by means of a differential beam technique, and additional electro-acoustic and acoustic hardware requirements are reduced, thereby meeting requirements for the mobile intelligent device regarding collecting immersive audio when the volume of the device is controlled.
Description
The present disclosure relates to the field of mobile communication technology, and in particular to a spatial audio acquisition method and device.
With the development of technology, spatial audio is now widely used in multimedia and instant communication on consumer devices. However, spatial audio acquisition currently relies on external equipment and cannot be performed directly on a smart mobile device. Existing spatial audio acquisition devices are also bulky and difficult to operate, so they do not meet users' growing demand for high-quality audio and video capture.
Summary of the invention
The present disclosure proposes a spatial audio acquisition method and device to solve the problem in the prior art that a spatial audio acquisition system cannot be integrated into a UE for effective, high-quality spatial audio acquisition.
An embodiment of the first aspect of the present disclosure provides a spatial audio acquisition method executed by a user equipment (UE). Multiple groups of microphone arrays are arranged in the UE, and the maximum response directions of the groups are mutually orthogonal. The method includes: performing differential beam processing on the microphone signals acquired by the microphone arrays to obtain a spatial audio signal.
In some embodiments, performing differential beam processing on the microphone signals acquired by the microphone arrays to obtain the spatial audio signal includes: applying an appropriate delay filter and a corresponding compensation filter to the microphone signals to obtain array signals with the desired directivity; and decoding the array signals to obtain the spatial audio signal.
In some embodiments, the method further includes: obtaining multiple directivities of the microphone arrays, where directivity characterizes the sensitivity to signals from different directions.
In some embodiments, the method further includes: obtaining differential arrays with multiple directivities; and combining different differential arrays to obtain the desired directivity in three-dimensional space, so as to obtain the spatial audio signal.
In some embodiments, the method further includes: decoding the spatial audio signal to output immersive multi-channel audio and/or Ambisonic audio.
In some embodiments, the method further includes: filtering the microphone signals to obtain a low-frequency component and a high-frequency component, where the low-frequency component is output as a low-frequency effect and the high-frequency component is used to form the spatial audio signal.
In some embodiments, the microphone arrays are arranged in the UE in either of the following ways: near the voice pickup component of the UE; or near the image capture component of the UE.
In some embodiments, the microphone arrays include a predetermined number of microphones that form three groups of microphone arrays. The three groups are mutually orthogonal, or deviate from orthogonality by an angle within a predetermined range, and their centers coincide or are separated by a distance that does not exceed an error threshold.
An embodiment of the second aspect of the present disclosure provides a spatial audio acquisition device arranged in a user equipment (UE). Multiple groups of microphone arrays are arranged in the UE, and the maximum response directions of the groups are mutually orthogonal. The device includes: a spatial audio signal acquisition module, configured to perform differential beam processing on the microphone signals acquired by the microphone arrays to obtain a spatial audio signal.
An embodiment of the third aspect of the present disclosure provides a communication device, including: a transceiver; a memory; and a processor connected to the transceiver and the memory respectively, configured to control wireless signal transmission and reception by the transceiver by executing computer-executable instructions on the memory, and capable of implementing the spatial audio acquisition method of the first-aspect embodiment.
An embodiment of the fourth aspect of the present disclosure provides a computer storage medium storing computer-executable instructions which, when executed by a processor, implement the spatial audio acquisition method of the first-aspect embodiment.
The embodiments of the present disclosure provide a spatial audio acquisition method and device in which multiple groups of mutually orthogonal microphone arrays are arranged in the UE and differential beam processing is performed on the microphone signals acquired by the arrays to obtain a spatial audio signal. By using mutually orthogonal microphone arrays, the present disclosure controls the directivity of the microphones while keeping the acquisition system within a certain size, so that it can be built into current mobile devices as a pickup system for mobile smart devices. Differential beam technology is used to control the signals collected by the pickup system in order to capture spatial audio, which reduces the need for additional electroacoustic and acoustic hardware and thereby meets the demand of mobile smart devices for immersive audio capture while keeping the device size under control.
Additional aspects and advantages of the present disclosure will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the present disclosure.
The above and/or additional aspects and advantages of the present disclosure will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of spatial audio acquisition logic according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a first-order differential array according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the directivity of a microphone array according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of the directivity of the left channel and the right channel after decoding according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of the first-order B-format according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of the directional signal components of the first-order B-format according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of an arrangement of a microphone array in a mobile device according to an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of an arrangement of a microphone array in a mobile device according to an embodiment of the present disclosure;
FIG. 11 is a block diagram of a spatial audio acquisition device according to an embodiment of the present disclosure;
FIG. 12 is a schematic diagram of the structure of a communication device provided in an embodiment of the present disclosure;
FIG. 13 is a schematic diagram of the structure of a chip provided in an embodiment of the present disclosure.
Embodiments of the present disclosure are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the accompanying drawings are exemplary; they are intended to explain the present disclosure and should not be construed as limiting it.
With the development of technology, spatial audio is now widely used on consumer devices, and websites such as YouTube and Facebook support spatial audio content. In real-time communication, audio and video codecs such as AVS support the encoding and decoding of spatial audio.
Although existing spatial audio acquisition technology can achieve good audio quality and immersive sound field reproduction, spatial audio acquisition currently relies on external equipment and cannot be performed directly on a smart mobile device. In addition, existing spatial audio acquisition devices are bulky and difficult to operate. The following table shows the 3D spatial audio acquisition devices in the prior art:
It can be seen that the spatial audio acquisition devices in the related art cannot be built into current mobile smart devices and do not meet users' growing demand for high-quality audio and video capture. Taking the most common mobile smart device, the smartphone, as an example, the size is about 7 inches, as with the Xiaomi 12S PRO (length: 163.6 mm, width: 74.6 mm, thickness: 8.16 mm). Because the hardware layout inside a mobile smart device is also very compact, the volume available for a built-in audio acquisition system is very limited.
To this end, the present disclosure proposes a spatial audio acquisition method and device to solve the problem in the prior art that a spatial audio acquisition system cannot be integrated into a UE for effective, high-quality spatial audio acquisition.
The spatial audio acquisition method and device provided by the present application are described in detail below with reference to the accompanying drawings.
FIG. 1 shows a schematic flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure. The method may be executed by a user equipment (UE). In the present disclosure, multiple groups of microphone arrays are arranged in the UE, and the maximum response directions of the groups are mutually orthogonal. As shown in FIG. 1, the method may include the following steps.
S101, performing differential beam processing on the microphone signals acquired by the microphone arrays to obtain a spatial audio signal.
In the embodiments of the present disclosure, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and each microphone array may include multiple microphones. The present disclosure does not limit the microphone type. Omnidirectional miniature microphones, such as MEMS (micro-electromechanical system) microphones or electret microphones, may be used: they are small, have small errors, are well suited to integration in small devices such as user equipment, and are suitable for beam control. Using omnidirectional miniature microphones keeps the pickup system small and greatly reduces the volume of the spatial audio acquisition system compared with previous spatial audio acquisition devices.
Conventional microphone beamforming techniques include delay-and-sum, filter-and-sum, adaptive beamforming (MVDR), and differential beamforming. Differential beamforming has the advantages of a compact layout and a frequency-invariant beam pattern. In the embodiments of the present disclosure, the directivity of the multiple groups of microphone arrays can therefore be controlled by differential microphone beamforming to help obtain the spatial audio signal. In the present disclosure, different differential beam designs and combinations of different low-order arrays are used to obtain the desired directivity in three-dimensional space and thereby capture the spatial audio signal. Because the solution relies only on beamforming to control directivity, it can effectively reduce the dependence of the pickup system on electroacoustic and acoustic hardware.
In summary, according to the spatial audio acquisition method provided by the present disclosure, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals acquired by the arrays to obtain a spatial audio signal. The present disclosure can use mutually orthogonal arrays of miniature microphones with beamforming to keep the spatial audio acquisition system within a certain size, so that it can be built into current mobile devices as a pickup system for mobile smart devices. At the same time, differential beam technology controls the directivity of the microphone arrays and reduces the need for additional electroacoustic and acoustic hardware, thereby meeting the demand of mobile smart devices for immersive audio capture while keeping the device size under control.
FIG. 2 shows a schematic flow chart of a spatial audio acquisition method according to an embodiment of the present disclosure. The method may be executed by a UE. In this embodiment, the arrangement of the microphone arrays is introduced first.
In some optional embodiments, the microphone arrays include a predetermined number of microphones that form three groups of microphone arrays. The three groups are mutually orthogonal, or deviate from orthogonality by an angle within a predetermined range, and their centers coincide or are separated by a distance that does not exceed an error threshold.
In other words, the number of microphones in the microphone arrays of the present disclosure is not limited, and each array may consist of any number of microphones. In a preferred embodiment, four MEMS microphones are the most cost-effective design, for example arranged on four adjacent vertices of a regular hexahedron to form three mutually orthogonal microphone arrays.
The present disclosure does not limit the microphone type. Miniature microphones (such as MEMS microphones) can be used to keep the pickup system small, so the solution provided by the present disclosure can be much smaller than previous spatial audio acquisition devices.
It should be understood that arranging the three groups of microphone arrays orthogonally is a preferred embodiment of the present invention. In some optional embodiments, the three groups may have a certain angular offset. Ideally the centers of the three arrays coincide exactly; in some optional embodiments, separated arrays or a certain distance between the centers can be treated as an error. On an actual device the arrays should nevertheless be kept mutually orthogonal to reduce the interference caused by position errors. In addition, because position errors, mismatch between microphones, and interference from the device itself all affect the final performance, calibration based on the actual conditions is required.
For example, the present disclosure uses omnidirectional miniature microphones with identical parameters to arrange three mutually orthogonal microphone pairs, with the midpoints of the lines connecting each pair coinciding. Because microphone signals can be shared between arrays, as few as four microphones suffice to form the microphone array required by the present invention, and they can be arranged at any four vertices of a regular hexahedron. Given the volume constraints of mobile smart devices, a preferred embodiment of the present disclosure recommends four microphones arranged on four adjacent vertices of a regular hexahedron, with the main axes of the microphones pointing in the same direction and the spacing within the array kept as small as possible.
In this example, a three-dimensional coordinate system is established with microphone 0 as the origin, microphone 1 on the x-axis, microphone 2 on the y-axis, and microphone 3 on the z-axis. Microphone 0 is equally spaced from microphones 1, 2, and 3, forming three orthogonal first-order differential pairs. Because miniature microphones are much smaller than traditional condenser and dynamic microphones, the spacing of the three pairs can be kept to 4 mm, which is well below the shortest wavelength of the target signal (about 1.7 cm at the top of the 20 Hz to 20 kHz range), so the error caused by the microphone spacing can be ignored.
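The layout in this example can be checked numerically. The following is a minimal sketch, assuming a speed of sound of 343 m/s (not stated in the source); the names `mics`, `axis`, and `wavelength` are illustrative only. It confirms that the three pairs sharing microphone 0 are mutually orthogonal and compares the 4 mm spacing with the shortest wavelength in the 20 Hz to 20 kHz band.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value (not from the source)
SPACING = 0.004         # m, the 4 mm spacing mentioned in the text

# Microphone 0 at the origin; microphones 1-3 on the x, y and z axes.
mics = [
    (0.0, 0.0, 0.0),
    (SPACING, 0.0, 0.0),
    (0.0, SPACING, 0.0),
    (0.0, 0.0, SPACING),
]


def axis(a, b):
    """Vector pointing from microphone a to microphone b."""
    return tuple(q - p for p, q in zip(mics[a], mics[b]))


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(u, v))


# Axes of the three first-order pairs (0,1), (0,2), (0,3).
axes = [axis(0, n) for n in (1, 2, 3)]
pairs_orthogonal = all(
    dot(axes[i], axes[j]) == 0.0 for i in range(3) for j in range(i + 1, 3)
)


def wavelength(freq_hz):
    """Acoustic wavelength in metres at the given frequency."""
    return SPEED_OF_SOUND / freq_hz


shortest = wavelength(20_000.0)  # shortest wavelength in the audio band
ratio = SPACING / shortest       # spacing as a fraction of that wavelength
print(pairs_orthogonal, shortest, ratio)
```

Under these assumptions the spacing is roughly a quarter of the shortest wavelength and a far smaller fraction at lower frequencies, which is where the small-spacing approximation used later holds best.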
Based on the embodiment shown in FIG. 1, and as shown in FIG. 2, the method may include the following steps.
S201, filtering the microphone signals acquired by the microphone arrays to obtain a low-frequency component and a high-frequency component.
In the embodiments of the present disclosure, the microphone signals acquired by the microphone arrays are filtered. The resulting low-frequency component is output as a low-frequency effect, and the high-frequency component is processed further to form the spatial audio signal. FIG. 3 shows a logical schematic diagram of the spatial audio acquisition described in the present disclosure.
It should be understood that a differential beam has a high-pass characteristic and therefore performs poorly at low frequencies. The original signal of microphone 0 (i.e., a microphone signal acquired by the microphone array) can therefore be passed through a low-pass filter that keeps only the low-frequency component, which is used as the LFE channel. Because low-frequency components have long wavelengths and contribute little to localization by the human ear, this strengthens the low-frequency effect without affecting the sense of space. The remaining channels are passed through a high-pass filter to remove the low-frequency component, and the resulting high-frequency component is used in the subsequent processing that forms the spatial audio signal.
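The band split in step S201 can be illustrated with a very simple crossover. The sketch below is only an assumption-laden example: the source does not specify the filter type, order, or cutoff, so a one-pole low-pass with a complementary high-pass, a 48 kHz sample rate, and a 120 Hz cutoff are all hypothetical choices, as are the function names.

```python
import math


def split_bands(samples, sample_rate, cutoff_hz):
    """Split a signal into complementary low and high bands (one-pole filter)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low, high, state = [], [], 0.0
    for x in samples:
        state += alpha * (x - state)  # one-pole low-pass
        low.append(state)
        high.append(x - state)        # complementary high-pass
    return low, high


def rms(values):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def tone(freq_hz, sample_rate, count):
    """Unit-amplitude sine test tone."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


FS = 48_000     # Hz, assumed sample rate
CUTOFF = 120.0  # Hz, assumed crossover frequency

bass = tone(40.0, FS, FS)
treble = tone(4000.0, FS, FS)
low_of_bass, high_of_bass = split_bands(bass, FS, CUTOFF)
low_of_treble, high_of_treble = split_bands(treble, FS, CUTOFF)
print(rms(low_of_bass), rms(high_of_bass), rms(low_of_treble), rms(high_of_treble))
```

In this sketch the low band of microphone 0 would play the role of the LFE channel, and the high band of every channel would go on to the differential beam stage. A practical design would likely use steeper filters than a single pole.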
S202,对麦克风信号添加适当的延时滤波和对应的补偿滤波器,获得所需指向性的阵列信号。S202, adding appropriate delay filtering and corresponding compensation filters to the microphone signal to obtain an array signal with required directivity.
应理解的是,对于步骤S201所得到的高频成分,可以添加适当的延时滤波和对应的补偿滤波器,获得所需指向性的阵列信号。It should be understood that, for the high frequency components obtained in step S201, appropriate delay filtering and corresponding compensation filters may be added to obtain an array signal with desired directivity.
S203: Obtain multiple directivities of the microphone array.
In an embodiment of the present disclosure, directivity characterizes the sensitivity to signals arriving from different directions. The present disclosure obtains differential arrays with multiple directivities and combines them to obtain the directivity required in three-dimensional space. FIG. 4 shows a schematic diagram of a first-order differential array.
Specifically, steps S202 and S203 are described in detail below.
A standard first-order differential array obtains the target signal by subtracting the signals of two microphones whose main axes point in the same direction. The directivity is controlled by adding a delay, constant over angular frequency, to the subtracted microphone signal.
The output compensation filter is denoted $H_L(\omega)$, where $\omega$ is the angular frequency and $\alpha_{1,1}$ is the delay filter coefficient.
Therefore, for a signal arriving at angle $\theta$ (the angle of incidence of the sound source at the microphone), the signal output by the array (that is, the array signal mentioned above) is expressed as $Y(\omega,\theta) = \left(X_1(\omega,\theta) - X_2(\omega,\theta)\right) H_L(\omega)$, where $X_n(\omega,\theta)$ denotes the signal of the $n$-th microphone.
Since the microphone spacing is much smaller than the wavelength, $\tau_0 - \alpha_{1,1}\tau_0 \ll 2\pi$, the amplitude difference between $X_1$ and $X_2$ is negligible, and $e^{x} \approx 1 + x$.
The directivity of the array (its sensitivity to signals from different directions) can then be derived and simplified.
The two most common directivity patterns (with the main axis at 90°) are:
Directivity | $\alpha_{1,1}$ | Angle of zero sensitivity |
Figure-8 (dipole) | 0 | 0°, 180° |
Cardioid | -1 | -90° |
Therefore, the direction of the differential beam can be controlled by controlling the delay filter coefficient.
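The relationship between the delay coefficient and the null direction can be checked numerically. The pattern used here, (sin θ − α) / (1 − α) with the main axis at 90°, is an assumed normalized first-order form chosen because it reproduces the null angles in the table above; it is not quoted from the disclosure, and the function names are hypothetical.

```python
import math

def first_order_pattern(theta_deg, alpha):
    """Normalized first-order pattern with the main axis at 90 degrees.

    Assumed form (not quoted from the source): (sin(theta) - alpha) / (1 - alpha).
    """
    theta = math.radians(theta_deg)
    return (math.sin(theta) - alpha) / (1.0 - alpha)

def null_angles(alpha, step_deg=1):
    """Scan -180..180 degrees and return the angles where sensitivity vanishes."""
    return [a for a in range(-180, 181, step_deg)
            if abs(first_order_pattern(a, alpha)) < 1e-9]

figure8_nulls = null_angles(0.0)    # delay coefficient 0
cardioid_nulls = null_angles(-1.0)  # delay coefficient -1
```

Under this assumption a coefficient of 0 places nulls on the 0°/180° line and a coefficient of -1 places a single null at -90°, while both patterns keep unit sensitivity on the main axis at 90°.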
S204: Decode the array signal to obtain the spatial audio signal.
In an embodiment of the present disclosure, according to the differential-array principle, three pairs of microphones can form the following five first-order differential arrays with different directivities:
No. | Main axis direction | Directivity | Microphones used |
Array 1 | Positive X axis | Cardioid | Microphone 0, Microphone 1 |
Array 2 | Negative X axis | Cardioid | Microphone 0, Microphone 1 |
Array 3 | Positive Y axis | Figure-8 | Microphone 0, Microphone 2 |
Array 4 | Positive Z axis | Figure-8 | Microphone 0, Microphone 3 |
Array 5 | Positive X axis | Figure-8 | Microphone 0, Microphone 1 |
Combining different first-order arrays yields the directivity required in three-dimensional space, so that spatial audio signals can be captured.
S205: Decode the spatial audio signal to output immersive multi-channel audio and/or ambisonic audio.
In an embodiment of the present disclosure, the audio signals required for spatial audio are obtained through different differential-beam designs. For example, different audio formats such as multi-channel audio and ambisonic (B-format) audio can be output; multi-channel audio and ambisonic audio are two immersive (surround sound) formats.
For example, in an optional embodiment, an M/S-3D recording format is constructed according to the M/S recording principle, and 5.1.4-channel multi-channel audio is output by decoding the spatial audio signal. Two cardioid arrays with opposite directivities point along the positive and negative X axis, and two figure-8 arrays point along the positive Y axis and the positive Z axis, respectively.
Multi-channel audio is decoded as shown in the following table, where "+" means the signal is added and "-" means the signal is added with inverted phase.
Channel | Array 1 | Array 2 | Array 3 | Array 4 |
Left | + |  | + | - |
Center | + |  |  |  |
Right | + |  | - | - |
Left surround |  | + | + | - |
Right surround |  | + | - | - |
Top front left | + |  | + | + |
Top front right | + |  |  | + |
Top rear left |  | + | + | + |
Top rear right |  | + | - | + |
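The decoding table can be read as a sign matrix applied to the four array signals. The sketch below copies the signs from the table as given and sums the signals without any gain or normalization, which is an assumption; the channel key names and the two-sample test signals are hypothetical.

```python
# Signs copied from the decoding table: +1 adds the array signal,
# -1 adds it with inverted phase, 0 leaves it out.
DECODE_TABLE = {
    #                  array 1, array 2, array 3, array 4
    "left":            (+1, 0, +1, -1),
    "center":          (+1, 0, 0, 0),
    "right":           (+1, 0, -1, -1),
    "left_surround":   (0, +1, +1, -1),
    "right_surround":  (0, +1, -1, -1),
    "top_front_left":  (+1, 0, +1, +1),
    "top_front_right": (+1, 0, 0, +1),
    "top_rear_left":   (0, +1, +1, +1),
    "top_rear_right":  (0, +1, -1, +1),
}

def decode_channels(array_signals):
    """Combine four array signals (equal-length lists) into named channels."""
    length = len(array_signals[0])
    channels = {}
    for name, signs in DECODE_TABLE.items():
        channels[name] = [
            sum(sign * signal[n] for sign, signal in zip(signs, array_signals))
            for n in range(length)
        ]
    return channels

# Hypothetical two-sample signals for arrays 1 to 4.
arrays = [[1.0, 2.0], [0.5, 0.5], [0.25, -0.25], [0.125, 0.0]]
out = decode_channels(arrays)
```

Together with the LFE channel obtained in step S201, these nine channels make up the 5.1.4 layout.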
The microphone arrays proposed in the present disclosure are arranged as the five arrays listed above. The directivity of array 1 in the xoy plane is shown in FIG. 5(a). The directivity of array 3 in the xoy plane is shown in FIG. 5(b), and the directivity of array 4 in the xoz plane is shown in FIG. 5(c). In both figures, "+" denotes positive phase and "-" denotes negative phase, and identical signals of positive and negative phase cancel each other.
After decoding, the directivities of the left and right channels in the coordinate-plane section are shown in FIG. 6, where FIG. 6(a) shows the left channel and FIG. 6(b) shows the right channel.
In another embodiment, the present disclosure can output standard ambisonic audio. It should be understood that first-order B-format is the first-order spherical-harmonic decomposition, as shown in FIG. 7. Standard B-format requires one omnidirectional signal (W) and three mutually orthogonal figure-8 signals (X, Y, Z). By selecting the corresponding arrays, the four components required for B-format can be expressed as: W = microphone 0; X = array 1; Y = array 2; Z = array 5, as shown in FIG. 8.
Therefore, the present disclosure decodes the spatial audio signal to obtain audio signals in different formats, meeting the diverse needs of spatial audio acquisition.
In addition, in an optional example, the microphone array can be laid out in the UE according to actual needs.
In one example, when handheld calls must also be supported, the microphone array is arranged in the UE close to the voice pickup component. For example, the microphone array is placed at the lower end of the mobile smart device, closer to the user's mouth, to ensure a better signal-to-noise ratio. FIG. 9 shows a schematic diagram of this arrangement in a mobile device, where FIG. 9(a) shows the back of the device and FIG. 9(b) shows the front.
In another example, when video recording must also be supported, the microphone array is arranged in the UE close to the image capture component. For example, the microphone array is placed close to the camera and aligned with the camera's forward direction. Keeping the array's orientation as consistent as possible with the camera's viewing angle ensures a better audio-visual result. FIG. 10 shows a schematic diagram of this arrangement in a mobile device, where FIG. 10(a) shows the back of the device and FIG. 10(b) shows the front.
In summary, in the spatial audio acquisition method provided by the present disclosure, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals acquired by the microphone arrays to obtain a spatial audio signal. The present disclosure can use mutually orthogonal microphone arrays formed from miniature microphones with beamforming to keep the spatial audio acquisition system within a size that can be built into current mobile devices, forming a sound pickup system that fits inside a mobile smart device. Differential beam technology controls the directivity of the microphone arrays and reduces the need for additional electroacoustic and acoustic hardware, so the requirement of mobile smart devices to capture immersive audio is met while the device volume is kept under control. In addition, outputting audio in different formats meets different application requirements, and arranging the microphone arrays at different positions in the mobile device allows adaptation to different application scenarios.
In the embodiments provided above, the method provided by the embodiments of the present application is described from the perspective of the user equipment. To implement the functions of that method, the user equipment may include a hardware structure, a software module, or both, and may implement the functions in the form of a hardware structure, a software module, or a hardware structure plus a software module. Any one of the functions may likewise be performed by a hardware structure, a software module, or a hardware structure plus a software module.
Corresponding to the spatial audio acquisition methods provided by the foregoing embodiments, the present disclosure also provides a spatial audio acquisition apparatus. Because the apparatus corresponds to those methods, the implementations of the spatial audio acquisition method also apply to the apparatus provided in this embodiment and are not described again in detail here.
FIG. 11 is a schematic structural diagram of a spatial audio acquisition apparatus 1100 provided by an embodiment of the present disclosure. The apparatus 1100 is arranged in a user equipment (UE) in which multiple groups of microphone arrays are arranged, and the maximum-response directions of the groups are mutually orthogonal.
As shown in FIG. 11, the apparatus 1100 includes a spatial audio signal acquisition module 1110, configured to perform differential beam processing on the microphone signals acquired by the microphone arrays to obtain a spatial audio signal.
In the spatial audio acquisition apparatus provided by the present disclosure, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals acquired by the microphone arrays to obtain a spatial audio signal. The present disclosure can use mutually orthogonal microphone arrays formed from miniature microphones with beamforming to control microphone directivity while keeping the spatial audio acquisition system within a size that can be built into current mobile devices, forming a sound pickup system that fits inside a mobile smart device. Differential beam technology controls the directivity of the signals captured by the pickup system and reduces the need for additional electroacoustic and acoustic hardware, so the requirement of mobile smart devices to capture immersive audio is met while the device volume is kept under control.
In some embodiments, the spatial audio signal acquisition module 1110 is further configured to: apply an appropriate delay filter and a corresponding compensation filter to the microphone signals to obtain an array signal with the required directivity; and decode the array signal to obtain the spatial audio signal.
In some embodiments, the spatial audio signal acquisition module 1110 is further configured to: obtain multiple directivities of the microphone array, where directivity characterizes the sensitivity to signals arriving from different directions.
In some embodiments, the spatial audio signal acquisition module 1110 is further configured to: obtain differential arrays with multiple directivities; and combine different differential arrays to obtain the directivity required in three-dimensional space, so as to obtain the spatial audio signal.
In some embodiments, the spatial audio signal acquisition module 1110 is further configured to: decode the spatial audio signal to output immersive multi-channel audio and/or ambisonic audio.
In some embodiments, the spatial audio signal acquisition module 1110 is further configured to: filter the microphone signals to obtain a low-frequency component and a high-frequency component, where the low-frequency component is output as the low-frequency effects channel and the high-frequency component is used to form the spatial audio signal.
In some embodiments, the microphone array is arranged in the UE in either of the following ways: close to the voice pickup component of the UE, or close to the image capture component of the UE.
In some embodiments, the microphone array includes a predetermined number of microphones that form three groups of microphone arrays. The three groups are mutually orthogonal, or deviate from orthogonality by an angle within a predetermined range, and their centers coincide or are separated by a distance that does not exceed an error threshold.
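The geometric condition above can be expressed as a simple check. This is only an illustrative sketch: the disclosure does not give numeric values for the predetermined angle range or the error threshold, so the tolerances, the example coordinates, and the function names are all hypothetical.

```python
import math

def angle_between_deg(u, v):
    """Angle between two 3-D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

def layout_is_acceptable(axes, centers, max_angle_error_deg, max_center_distance):
    """Check pairwise near-orthogonality of axes and near-coincidence of centers."""
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        if abs(angle_between_deg(axes[i], axes[j]) - 90.0) > max_angle_error_deg:
            return False
        if math.dist(centers[i], centers[j]) > max_center_distance:
            return False
    return True

# Hypothetical layout: the third axis is tilted slightly, centers are 0.5 mm apart.
axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.05, 0.0, 1.0)]
centers_mm = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0)]

loose_ok = layout_is_acceptable(axes, centers_mm, 5.0, 1.0)
strict_ok = layout_is_acceptable(axes, centers_mm, 1.0, 1.0)
```

The same layout passes with a 5° tolerance and fails with a 1° tolerance, since the tilted axis is a little under 3° away from orthogonal.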
In the spatial audio acquisition apparatus provided by the present disclosure, multiple groups of mutually orthogonal microphone arrays are arranged in the UE, and differential beam processing is performed on the microphone signals acquired by the microphone arrays to obtain a spatial audio signal. The present disclosure can use mutually orthogonal microphone arrays formed from miniature microphones with beamforming to control microphone directivity while keeping the spatial audio acquisition system within a size that can be built into current mobile devices, forming a sound pickup system that fits inside a mobile smart device. Differential beam technology controls the directivity of the signals captured by the pickup system and reduces the need for additional electroacoustic and acoustic hardware, so the requirement of mobile smart devices to capture immersive audio is met while the device volume is kept under control. In addition, outputting audio in different formats meets different application requirements, and arranging the microphone arrays at different positions in the mobile device allows adaptation to different application scenarios.
Refer to FIG. 12, which is a schematic structural diagram of a communication device 1200 provided by an embodiment of the present application. The communication device 1200 may be a network device or a user equipment; it may also be a chip, a chip system, or a processor that supports a network device in implementing the above method, or a chip, a chip system, or a processor that supports a user equipment in implementing the above method. The device can be used to implement the method described in the foregoing method embodiments; for details, see the description in those embodiments.
The communication device 1200 may include one or more processors 1201. The processor 1201 may be a general-purpose processor or a dedicated processor, for example a baseband processor or a central processing unit. The baseband processor may be used to process communication protocols and communication data, and the central processing unit may be used to control the communication device (for example, a base station, a baseband chip, a terminal device, a terminal device chip, a DU, or a CU), execute computer programs, and process the data of computer programs.
Optionally, the communication device 1200 may further include one or more memories 1202 on which a computer program 1204 may be stored. The processor 1201 executes the computer program 1204 so that the communication device 1200 performs the method described in the foregoing method embodiments. Optionally, data may also be stored in the memory 1202. The communication device 1200 and the memory 1202 may be provided separately or integrated together.
Optionally, the communication device 1200 may further include a transceiver 1205 and an antenna 1206. The transceiver 1205 may be referred to as a transceiver unit, a transceiver machine, or a transceiver circuit, and is used to implement transmitting and receiving functions. The transceiver 1205 may include a receiver and a transmitter. The receiver may be referred to as a receiving machine or a receiving circuit and implements the receiving function; the transmitter may be referred to as a transmitting machine or a transmitting circuit and implements the transmitting function.
Optionally, the communication device 1200 may further include one or more interface circuits 1207. The interface circuit 1207 is used to receive code instructions and pass them to the processor 1201. The processor 1201 runs the code instructions so that the communication device 1200 performs the method described in the foregoing method embodiments.
In one implementation, the processor 1201 may include a transceiver for implementing receiving and transmitting functions. The transceiver may be, for example, a transceiver circuit, an interface, or an interface circuit. The transceiver circuits, interfaces, or interface circuits that implement the receiving and transmitting functions may be separate or integrated together. They may be used to read and write code or data, or to transmit or pass signals.
In one implementation, the processor 1201 may store a computer program 1203. When the computer program 1203 runs on the processor 1201, it causes the communication device 1200 to perform the method described in the foregoing method embodiments. The computer program 1203 may be hard-coded in the processor 1201, in which case the processor 1201 may be implemented in hardware.
In one implementation, the communication device 1200 may include a circuit that implements the transmitting, receiving, or communication functions of the foregoing method embodiments. The processor and transceiver described in the present application may be implemented on an integrated circuit (IC), an analog IC, a radio frequency integrated circuit (RFIC), a mixed-signal IC, an application-specific integrated circuit (ASIC), a printed circuit board (PCB), an electronic device, and so on. The processor and transceiver may also be manufactured with various IC process technologies, such as complementary metal oxide semiconductor (CMOS), N-type metal oxide semiconductor (NMOS), P-type metal oxide semiconductor (PMOS), bipolar junction transistor (BJT), bipolar CMOS (BiCMOS), silicon germanium (SiGe), and gallium arsenide (GaAs).
The communication device described in the foregoing embodiments may be a network device or a user equipment, but the scope of the communication device described in the present application is not limited to these, and its structure is not limited by FIG. 12. The communication device may be a standalone device or part of a larger device. For example, the communication device may be:
(1) a standalone integrated circuit (IC), a chip, or a chip system or subsystem;
(2) a set of one or more ICs, which may optionally also include a storage component for storing data and computer programs;
(3) an ASIC, such as a modem;
(4) a module that can be embedded in other devices;
(5) a receiver, terminal device, smart terminal device, cellular phone, wireless device, handset, mobile unit, vehicle-mounted device, network device, cloud device, artificial intelligence device, and so on;
(6) others.
For the case in which the communication device is a chip or a chip system, refer to the schematic structural diagram of the chip shown in FIG. 13. The chip shown in FIG. 13 includes a processor 1301 and an interface 1302. There may be one or more processors 1301, and there may be multiple interfaces 1302.
Optionally, the chip further includes a memory 1303, which is used to store the necessary computer programs and data.
Those skilled in the art will also understand that the various illustrative logical blocks and steps listed in the embodiments of the present application may be implemented by electronic hardware, computer software, or a combination of the two. Whether such functions are implemented in hardware or software depends on the specific application and the design requirements of the overall system. Those skilled in the art may use various methods to implement the described functions for each specific application, but such implementations should not be understood as going beyond the scope of protection of the embodiments of the present application.
The present application also provides a readable storage medium storing instructions that, when executed by a computer, implement the functions of any of the foregoing method embodiments.
The present application also provides a computer program product that, when executed by a computer, implements the functions of any of the foregoing method embodiments.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs. When a computer program is loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer program may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer program may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a high-density digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
A person of ordinary skill in the art will understand that the numerical designations such as "first" and "second" used in the present application are distinctions made only for convenience of description; they are not intended to limit the scope of the embodiments of the present application, nor do they indicate any order of precedence.
"At least one" in the present application may also be described as "one or more", and "a plurality" may be two, three, four, or more; the present application imposes no limitation in this respect. In the embodiments of the present application, the technical features within a given kind of technical feature are distinguished by "first", "second", "third", "A", "B", "C", "D", and so on, and the technical features so described have no order of precedence or magnitude.
As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, a magnetic disk, optical disc, memory, or programmable logic device (PLD)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
The systems and techniques described here may be implemented in a computing system that includes a back-end component (for example, a data server), a computing system that includes a middleware component (for example, an application server), a computing system that includes a front-end component (for example, a user computer with a graphical user interface or web browser through which a user can interact with implementations of the systems and techniques described here), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.
It should be understood that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired result of the technical solutions disclosed herein can be achieved; no limitation is imposed here.
In addition, it should be understood that the various embodiments of the present application may be implemented individually or, where the solution permits, in combination with other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present application.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
The foregoing is merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (11)
- A spatial audio collection method, characterized in that the method is performed by a user equipment (UE), multiple groups of microphone arrays are arranged in the UE, and the maximum response directions of the groups of arrays are mutually orthogonal, the method comprising: performing differential beam processing on microphone signals acquired by the microphone arrays to obtain a spatial audio signal.
- The method according to claim 1, characterized in that performing differential beam processing on the microphone signals acquired by the microphone arrays to obtain the spatial audio signal comprises: applying appropriate delay filtering and corresponding compensation filters to the microphone signals to obtain an array signal with the desired directivity; and decoding the array signal to obtain the spatial audio signal.
- The method according to claim 2, characterized in that the method further comprises: obtaining a plurality of directivities of the microphone arrays, the directivities characterizing the sensitivity to signals in different directions.
- The method according to claim 3, characterized in that the method further comprises: obtaining differential arrays with the plurality of directivities; and obtaining the required directivity in three-dimensional space by combining different differential arrays, so as to obtain the spatial audio signal.
- The method according to any one of claims 1 to 4, characterized in that the method further comprises: decoding the spatial audio signal to output immersive multi-channel audio and/or ambisonic audio.
- The method according to any one of claims 1 to 5, characterized in that the method further comprises: filtering the microphone signals to obtain a low-frequency component and a high-frequency component, wherein the low-frequency component is output as a low-frequency effect and the high-frequency component is used to form the spatial audio signal.
- The method according to any one of claims 1 to 6, characterized in that the microphone arrays are arranged in the UE in either of the following ways: the microphone arrays are arranged in the UE at a position close to a voice collection component; or the microphone arrays are arranged in the UE at a position close to an image acquisition component.
- The method according to any one of claims 1 to 7, characterized in that the microphone arrays comprise a predetermined number of microphones, the predetermined number of microphones form three groups of microphone arrays, the three groups of microphone arrays are mutually orthogonal or deviate from orthogonality by an angle within a predetermined range, and the centers of the three groups of microphone arrays coincide or are separated by a distance not exceeding an error threshold.
- A spatial audio collection apparatus, characterized in that the apparatus is arranged in a user equipment (UE), multiple groups of microphone arrays are arranged in the UE, and the maximum response directions of the groups of arrays are mutually orthogonal, the apparatus comprising: a spatial audio signal acquisition module configured to perform differential beam processing on microphone signals acquired by the microphone arrays to obtain a spatial audio signal.
- A communication device, comprising: a transceiver; a memory; and a processor connected to the transceiver and the memory respectively, configured to control wireless signal transmission and reception of the transceiver by executing computer-executable instructions on the memory, and capable of implementing the method according to any one of claims 1 to 8.
- A computer storage medium, wherein the computer storage medium stores computer-executable instructions, and the computer-executable instructions, when executed by a processor, implement the method according to any one of claims 1 to 8.
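The delay, subtract, and compensate chain recited in claim 2, and the combination of orthogonal differential arrays recited in claim 4, can be illustrated with a minimal frequency-domain sketch. It is not the applicant's implementation: it assumes an ideal unit plane wave, six omnidirectional microphones forming three pairs that share one centre, and a speed of sound of 343 m/s. The names `differential_beam`, `first_order_components`, and `SPEED_OF_SOUND` are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def differential_beam(cos_theta, freq_hz, spacing_m, delay_s=0.0):
    """Compensated response of one two-microphone differential pair.

    cos_theta is the cosine of the angle between the pair axis and the
    arrival direction of a unit-amplitude plane wave (hypothetical model).
    """
    omega = 2.0 * math.pi * freq_hz
    half_phase = omega * spacing_m * cos_theta / (2.0 * SPEED_OF_SOUND)
    front = cmath.exp(1j * half_phase)   # phasor at the front microphone
    rear = cmath.exp(-1j * half_phase)   # phasor at the rear microphone
    # Delay filtering: delay the rear signal, then subtract it.
    raw = front - rear * cmath.exp(-1j * omega * delay_s)
    # Compensation filter: undo the roughly j*omega slope of the difference
    # so that the on-axis gain is close to 1 at low frequencies.
    compensation = 1.0 / (1j * omega * (spacing_m / SPEED_OF_SOUND + delay_s))
    return raw * compensation


def first_order_components(direction, freq_hz, spacing_m):
    """Return (W, X, Y, Z) for three orthogonal pairs with a shared centre.

    direction is a unit vector (ux, uy, uz) pointing toward the source.
    """
    omega = 2.0 * math.pi * freq_hz
    # Omnidirectional W: average of the six microphone phasors.
    w = sum(
        math.cos(omega * spacing_m * u / (2.0 * SPEED_OF_SOUND))
        for u in direction
    ) / 3.0
    # Figure-of-eight X, Y, Z: one zero-delay differential pair per axis.
    x, y, z = (differential_beam(u, freq_hz, spacing_m) for u in direction)
    return w, x, y, z
```

In this model, `delay_s = 0` gives a pair whose response approximates cos θ (a figure-of-eight), while `delay_s = spacing_m / SPEED_OF_SOUND` approximates a cardioid with a null toward the rear. The approximation holds only while the spacing is small compared with the wavelength.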
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/126234 WO2024082181A1 (en) | 2022-10-19 | 2022-10-19 | Spatial audio collection method and apparatus |
CN202280004436.0A CN118235431A (en) | 2022-10-19 | 2022-10-19 | Spatial audio acquisition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2022/126234 WO2024082181A1 (en) | 2022-10-19 | 2022-10-19 | Spatial audio collection method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024082181A1 (en) | 2024-04-25 |
Family
ID=90736504
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/126234 WO2024082181A1 (en) | 2022-10-19 | 2022-10-19 | Spatial audio collection method and apparatus |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118235431A (en) |
WO (1) | WO2024082181A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104321812A (en) * | 2012-05-24 | 2015-01-28 | 高通股份有限公司 | Three-dimensional sound compression and over-the-air-transmission during a call |
CN105451151A (en) * | 2014-08-29 | 2016-03-30 | 华为技术有限公司 | Method and apparatus for processing sound signal |
CN107533843A (en) * | 2015-01-30 | 2018-01-02 | Dts公司 | System and method for capturing, encoding, being distributed and decoding immersion audio |
US20190246203A1 (en) * | 2016-06-15 | 2019-08-08 | Mh Acoustics, Llc | Spatial Encoding Directional Microphone Array |
CN113661538A (en) * | 2019-04-12 | 2021-11-16 | 华为技术有限公司 | Apparatus and method for obtaining a first order ambisonic signal |
- 2022
- 2022-10-19 WO PCT/CN2022/126234 patent/WO2024082181A1/en active Application Filing
- 2022-10-19 CN CN202280004436.0A patent/CN118235431A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
CN118235431A (en) | 2024-06-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP6023779B2 (en) | Audio information processing method and apparatus | |
KR100919160B1 (en) | A stereo widening network for two loudspeakers | |
US10477310B2 (en) | Ambisonic signal generation for microphone arrays | |
JP2017022718A (en) | Generating surround sound field | |
CN112492446B (en) | Method and processor for realizing signal equalization by using in-ear earphone | |
US11863952B2 (en) | Sound capture for mobile devices | |
CN115335900A (en) | Transforming panoramical acoustic coefficients using an adaptive network | |
WO2024082181A1 (en) | Spatial audio collection method and apparatus | |
WO2021129197A1 (en) | Voice signal processing method and apparatus | |
WO2024037189A9 (en) | Acoustic image calibration method and apparatus | |
WO2024164284A1 (en) | Audio signal processing method, apparatus, device, and storage medium | |
EP3595361B1 (en) | Use of local link to support transmission of spatial audio in a virtual environment | |
US11330371B2 (en) | Audio control based on room correction and head related transfer function | |
WO2022184097A1 (en) | Virtual speaker set determination method and device | |
CN111787458B (en) | Audio signal processing method and electronic equipment | |
CN111246345B (en) | Method and device for real-time virtual reproduction of remote sound field | |
US20220141341A1 (en) | Conference terminal and multi-device coordinating method for conference | |
US20230061896A1 (en) | Method and apparatus for location-based audio signal compensation | |
US20200184988A1 (en) | Sound signal processing device | |
WO2024098221A1 (en) | Audio signal rendering method, apparatus, device, and storage medium | |
US20240365057A1 (en) | Ambisonic microphone | |
CN115002401B (en) | Information processing method, electronic equipment, conference system and medium | |
US20220310114A1 (en) | Smart audio noise reduction system | |
CN118471240B (en) | Audio playing device, audio receiving device and audio system | |
WO2024183191A1 (en) | Multimedia system for wearable apparatus, and wearable apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 202280004436.0; Country of ref document: CN |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22962371; Country of ref document: EP; Kind code of ref document: A1 |