WO2019174442A1

WO2019174442A1 - Adapterization equipment, voice output method, device, storage medium and electronic device

Info

Publication number: WO2019174442A1
Application number: PCT/CN2019/075489
Authority: WO
Inventors: 李保民; 任鹏; 蔡成亮
Original assignee: 中兴通讯股份有限公司
Priority date: 2018-03-13
Filing date: 2019-02-19
Publication date: 2019-09-19
Also published as: CN110278512A

Abstract

The present disclosure provides a sound pickup device, a sound output method, a device, a storage medium and an electronic device. The sound output method comprises: acquiring, by using a main Mic in the sound pickup device, the sound of a first sound source at the viewing angle corresponding to the main Mic; and outputting the sound acquired by the main Mic to an AR or VR device connected to the sound pickup device. The present disclosure solves the problem in the related art that a user can only experience watching a game from one specific location and is affected by surrounding noise.

Description

Sound pickup device, sound output method, device, storage medium, and electronic device

Technical field

The present disclosure relates to sound processing technology, and in particular to a sound pickup device, a sound output method, a device, a storage medium, and an electronic device.

Background technique

At present, research on Augmented Reality/Virtual Reality (AR/VR) is extensive and in-depth. Major technology media and websites are enthusiastic about AR/VR, and AR/VR products have been called another disruptive technology product after the smartphone. People interact mostly with real-world things in their daily lives, and the core idea of AR/VR products is to help people get tasks done by loading virtual information into real-world environments. In particular, with the advent of the 5th Generation mobile communication technology (5G) era, Enhanced Mobile Broadband (eMBB), as defined by the 3rd Generation Partnership Project (3GPP), is one of the important scenarios. High-traffic mobile broadband services such as 3D/Ultra HD video accelerate the development and application of AR/VR technology.

Many AR/VR vendors, and even some smart TV manufacturers, have now launched functions such as “real-time viewing”, which are designed to let each user watch an event “live” by wearing their products, thus bringing the audience a new experience.

In the related art, the overall solution for "real-time viewing" is "virtual reality" + "live perception": the live scene is presented in front of the user, and stereo sound effects are implemented through two wireless earbuds, so that users hear more powerful audio effects than usual when watching a game. The video part of this technology mainly uses real-world cameras, including a series of sensing components such as a 1080p HD camera, an infrared camera and an infrared laser projector, to capture 3D images. The audio part uses the real-world camera to output X frames per second in order to simulate the sound field. In short, the camera records the audio and video of the scene at a fixed point, and a delay is then added.

The disadvantage of this technique follows from both its implementation and its goal: the acquisition position is fixed. Even when wearing the device, the user can only experience watching the game from one specific position, with a single viewing angle. Especially in sports such as football and marathon running, multi-camera viewing cannot be achieved.

Secondly, for sports such as rugby, basketball, and baseball, the sound field is extremely noisy. Even with a real-world camera, interference from other people's noise cannot be avoided. If a noise reduction algorithm is used, its computation becomes very complicated in such an extremely noisy sound field.

Again, from a technical point of view, noise reduction filters out part of the quieter sound. If the user is at a more remote viewing point, the method also filters out useful sounds such as those of the game on the field, so that the user cannot hear them from that viewing point. Noise reduction is also unable to shield, or to extract, the sound of the viewers in the viewing direction.

Finally, because of the limitations of the device's recording mode, the live sound effect experienced by the user is relatively simple and cannot be selected. Even if so-called "3D" or "Dolby" effects are added, they are only layered on top of the original recording.

No effective solution has yet been proposed for the problem in the related art that the user can only experience watching from one specific location and is affected by surrounding noise.

Summary of the invention

The embodiments of the present disclosure provide a sound pickup device, a sound output method, a device, and a storage medium, so as to at least solve the problem in the related art that the user can only experience watching from one specific position and is affected by surrounding noise.

According to an embodiment of the present disclosure, there is provided a sound pickup device configured to be connected to an augmented reality (AR) or virtual reality (VR) device, comprising: a main microphone (Mic) and a processor, wherein the main Mic is configured to acquire the sound of a first sound source at the viewing angle corresponding to the main Mic; and the processor is connected to the main Mic and is configured to output the sound acquired by the main Mic to the AR or VR device.

According to another embodiment of the present disclosure, there is also provided a sound output method comprising: acquiring, by using a main microphone (Mic) in a sound pickup device, the sound of a first sound source at the viewing angle corresponding to the main Mic; and outputting the sound acquired by the main Mic to an augmented reality (AR) or virtual reality (VR) device connected to the sound pickup device.

According to another embodiment of the present disclosure, there is also provided a sound output device, comprising: an acquisition module configured to acquire, by using a main microphone (Mic) in a sound pickup device, the sound of a first sound source at the viewing angle corresponding to the main Mic; and an output module configured to output the sound acquired by the main Mic to an augmented reality (AR) or virtual reality (VR) device connected to the sound pickup device.

According to another embodiment of the present disclosure, there is also provided a storage medium having stored therein a computer program, wherein the computer program is configured to execute the steps of any one of the method embodiments described above at runtime.

According to another embodiment of the present disclosure, there is also provided an electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to run the computer program to perform the steps of any one of the method embodiments described above.

Through the present disclosure, the main Mic in the sound pickup device is set to acquire only the sound at its corresponding viewing angle; sound at non-corresponding viewing angles is not acquired and is thereby effectively shielded. Because the main Mic collects the sound at its corresponding viewing angle, when the sound pickup device rotates, the viewing angle that the main Mic faces changes accordingly, and so does the sound it collects. The user thus perceives the sound of the current position in real time, and sound from directions the main Mic is not facing is excluded, which deepens the user's immersion in the sound and solves the problem in the related art that the user can only experience watching from one specific position and is affected by surrounding noise.

Drawings

The drawings described herein are provided for a further understanding of the present disclosure and form a part of it. The embodiments of the present disclosure and their description serve to explain the disclosure and are not intended to limit it. In the drawings:

FIG. 1 is a block diagram showing a hardware configuration of a mobile terminal of a sound output method according to an embodiment of the present disclosure;

FIG. 2 is a flowchart of a sound output method according to an embodiment of the present disclosure;

FIG. 3 is an overall flowchart of a sound output method according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of a free sound field according to an embodiment of the present disclosure;

FIG. 5 is a schematic diagram of a sound collecting hole according to an embodiment of the present disclosure;

FIG. 6 is a schematic view of a convex surface according to an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of a linear aperture receiving a plane wave in one-dimensional space according to an embodiment of the present disclosure;

FIG. 8 is a schematic perspective view according to an embodiment of the present disclosure;

FIG. 9 is a perspective view according to an embodiment of the present disclosure;

FIG. 10 is a schematic diagram of a relationship between frequency and beam width according to an embodiment of the present disclosure;

FIG. 11 is a schematic diagram of polar coordinates in a horizontal direction according to an embodiment of the present disclosure;

FIG. 12 is test chart 1 according to an embodiment of the present disclosure;

FIG. 13 is a schematic diagram of a head and shoulder simulator according to an embodiment of the present disclosure;

FIG. 14 is test chart 2 according to an embodiment of the present disclosure;

FIG. 15 is test chart 3 according to an embodiment of the present disclosure;

FIG. 16 is test chart 4 according to an embodiment of the present disclosure;

FIG. 17 is a structural block diagram of a sound output device according to an embodiment of the present disclosure.

Detailed description

The present disclosure will be described in detail below with reference to the drawings in conjunction with the embodiments. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.

It is to be understood that the terms "first", "second", and the like in the specification and claims of the present disclosure are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence.

The method embodiment provided in Embodiment 1 of the present application can be executed in a terminal, for example, a mobile terminal or a computer terminal. Taking a mobile terminal as an example, FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a sound output method according to an embodiment of the present disclosure. As shown in FIG. 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 configured to store data, and a transmission device 106 configured for communication functions. It will be understood by those skilled in the art that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the above electronic device. For example, the mobile terminal 10 may include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1.

The memory 104 may be configured to store software programs and modules of application software, such as the program instructions/modules corresponding to the sound output method in the embodiments of the present disclosure. The processor 102 performs various functional applications and data processing, that is, implements the above method, by running the software programs and modules stored in the memory 104. The memory 104 may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal 10 over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is configured to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module configured to communicate with the Internet wirelessly.

The terminal may be a VR/AR device. With the terminal in the embodiments of the present disclosure, a user at any venue can perceive, in real time, the sound of the position the head is turned towards, while sound from non-viewing directions is excluded, which deepens immersion. The embodiments of the present disclosure also provide user-selectable gear positions, so that the user can achieve an excellent experience.

In this embodiment, a sound output method running on the mobile terminal is provided. FIG. 2 is a flowchart of a sound output method according to an embodiment of the present disclosure. As shown in FIG. 2, the flow includes the following steps:

Step S202: acquiring, by using the main microphone (Mic) in the sound pickup device, the sound of the first sound source at the viewing angle corresponding to the main Mic;

Step S204: outputting the sound acquired by the main Mic to the augmented reality (AR) or virtual reality (VR) device connected to the sound pickup device.

The terminal performing the above operations may be the above-mentioned sound pickup device, which may be connected to the AR or VR device as a part of it. The main Mic in the sound pickup device is used to acquire the sound at the viewing angle corresponding to the main Mic. The sound pickup device may contain one or more main Mics.

With the above embodiment, the main Mic in the sound pickup device is set to acquire only the sound at its corresponding viewing angle; sound at non-corresponding viewing angles is not acquired and is thereby effectively shielded. Because the main Mic collects the sound at its corresponding viewing angle, when the sound pickup device rotates, the viewing angle that the main Mic faces changes accordingly, and so does the sound it collects. The user thus perceives the sound of the current position in real time, and sound from viewing angles that do not correspond to the Mic is excluded, which deepens the user's immersion and solves the problem that the user can only experience watching from one specific position and is affected by surrounding noise.
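The effect described above can be illustrated with a minimal Python sketch. It assumes an idealized main Mic that picks up only sources whose bearing lies within a fixed half-width of the direction it faces. The function names, the 15-degree half-width and the source bearings are hypothetical and are not taken from the disclosure.

```python
def angular_offset(heading_deg: float, bearing_deg: float) -> float:
    """Signed smallest angle from the Mic heading to a source bearing, in degrees."""
    return (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0


def sources_in_view(sources: dict, heading_deg: float, half_width_deg: float = 15.0) -> list:
    """Return the names of sources inside the idealized viewing angle of the main Mic."""
    return [
        name
        for name, bearing in sources.items()
        if abs(angular_offset(heading_deg, bearing)) <= half_width_deg
    ]


# Hypothetical bearings (degrees) of three sound sources around the listener.
sources = {"A": 0.0, "B": 90.0, "C": 180.0}
facing_a = sources_in_view(sources, heading_deg=0.0)   # Mic turned towards A
facing_b = sources_in_view(sources, heading_deg=85.0)  # Mic turned roughly towards B
```

Turning the Mic changes which source is acquired, while sources outside the viewing angle are simply never picked up, so no noise reduction step is involved in this idealized model.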

In an optional embodiment, before outputting the sound acquired by the main Mic to the AR or VR device connected to the sound pickup device, the method further includes: acquiring, by using an auxiliary Mic in the sound pickup device, the sound of a second sound source at the viewing angle corresponding to the auxiliary Mic, wherein the auxiliary Mic is disposed around the main Mic. Outputting the sound acquired by the main Mic to the AR or VR device then comprises: synthesizing the sound acquired by the main Mic with the sound acquired by the auxiliary Mic, and outputting the synthesized sound to the AR or VR device connected to the sound pickup device. In this embodiment, the two operations of acquiring sound with the main Mic and acquiring sound with the auxiliary Mic need not be performed in any particular order. It should be noted that, in actual applications, only the main Mic may be provided; the purpose of the auxiliary Mic is to better sense the sound at specific surrounding positions and to increase the stereoscopic effect of the sound to a certain extent.
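As a rough illustration of the synthesis step, the sketch below mixes the main-Mic samples with the average of the auxiliary-Mic samples. The disclosure does not specify how the synthesis is performed; the weighted sum, the function name `synthesize` and the gain values are assumptions chosen only for illustration.

```python
def synthesize(main, auxiliaries, main_gain=0.7, aux_gain=0.3):
    """Mix main-Mic samples with the average of the auxiliary-Mic samples.

    main        -- list of samples from the main Mic
    auxiliaries -- list of sample lists, one per auxiliary Mic (may be empty)
    The gains are hypothetical; a real device would tune or adapt them.
    """
    if any(len(aux) != len(main) for aux in auxiliaries):
        raise ValueError("all channels must have the same number of samples")
    if not auxiliaries:
        # Only the main Mic is fitted: pass its signal through unchanged.
        return list(main)
    mixed = []
    for i, sample in enumerate(main):
        aux_mean = sum(aux[i] for aux in auxiliaries) / len(auxiliaries)
        mixed.append(main_gain * sample + aux_gain * aux_mean)
    return mixed


mixed = synthesize([1.0, 0.0], [[0.0, 1.0], [0.0, 0.5]])
```

The empty-auxiliary branch mirrors the remark that the device may be built with the main Mic alone.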

In an optional embodiment, acquiring the sound of the first sound source at the viewing angle corresponding to the main Mic by using the main Mic in the sound pickup device comprises: configuring the aperture size and/or the depth of the sound collecting hole of the main Mic; and acquiring the sound of the first sound source at that viewing angle by using the main Mic so configured. In this embodiment, both the aperture size and the depth of the sound collecting hole of the main Mic are adjustable. Preferably, the main Mic can be configured with sound collecting holes of different aperture sizes, for example 24 mm, 12 mm, 6 mm and 3 mm, or with a sound collecting hole whose aperture size is continuously adjustable; the specific configuration can be determined according to the circumstances. The adjustment may be made manually by the user, automatically according to the loudness of the sound source corresponding to the main Mic, according to configuration information input by the user, or according to configuration information from another device. The other device may be a device connected to the sound pickup device. For example, when the sound pickup device is placed on the field, the other device may be a controller that the user operates indoors and that is wirelessly connected to the sound pickup device, so that the sound pickup device can be remotely controlled.

As stated in the above embodiment, the aperture size of the sound collecting hole of the main Mic can be configured using input configuration information. If no configuration information is received within a certain period of time, a default configuration can be adopted. In this embodiment, configuring the aperture size and/or the depth of the sound collecting hole of the main Mic includes: when input configuration information is received, configuring the aperture size and/or depth of the sound collecting hole of the main Mic according to that information; when no input configuration information is received, configuring them according to a preset value. For example, the preset default aperture size is 12 mm, and the preset value can be freely set by the user.
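The fallback rule above can be sketched as follows. The 12 mm preset comes from the example in the text; the `HoleConfig` type, the `resolve_hole_config` function and the use of `None` to mean "no configuration information was received in time" are hypothetical names and conventions introduced for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HoleConfig:
    """Configuration of the sound collecting hole of the main Mic."""
    aperture_mm: float
    depth_mm: Optional[float] = None  # None: leave the depth unchanged


# Preset used when nothing is received; 12 mm is the example value in the text.
PRESET = HoleConfig(aperture_mm=12.0)


def resolve_hole_config(received: Optional[HoleConfig],
                        preset: HoleConfig = PRESET) -> HoleConfig:
    """Use the received configuration if there is one, otherwise the preset."""
    return received if received is not None else preset


default_choice = resolve_hole_config(None)
user_choice = resolve_hole_config(HoleConfig(aperture_mm=6.0, depth_mm=2.0))
```

Passing a different `preset` models the statement that the user may freely change the preset value.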

In an optional embodiment, before acquiring the sound of the first sound source at the viewing angle corresponding to the main Mic by using the main Mic in the sound pickup device, the method further includes: determining the direction and amplitude by which the sound pickup device needs to rotate; and controlling the sound pickup device to rotate according to the determined direction and amplitude. In this embodiment, the sound pickup device is freely rotatable. If the sound pickup device is a device worn on the user's head, it rotates as the user's head rotates. If the sound pickup device is placed on the field while the user is indoors, the user can control its rotation through a control device connected to it. This embodiment is mainly directed to such remote control: the sound pickup device moves under the user's control, so that the user can sense the sound in any direction, which improves the user experience. The auxiliary Mic is similar to the main Mic in that the size of its sound collecting hole can also be configured; details are not repeated here.

In an optional embodiment, before acquiring the sound of the first sound source at the viewing angle corresponding to the main Mic by using the main Mic in the sound pickup device, the method further includes: determining the direction and amplitude by which the main Mic needs to rotate; and controlling the main Mic to rotate according to the determined direction and amplitude. In this embodiment, the sound pickup device itself can be stationary while the main Mic rotates flexibly.
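One way to determine the direction and amplitude of rotation is to take the shortest signed angle between the current heading and the target heading. The sketch below assumes headings in degrees measured counterclockwise; the function name and the direction labels are hypothetical, and the disclosure does not prescribe this particular rule.

```python
def rotation_command(current_deg: float, target_deg: float):
    """Return (direction, amplitude_deg) for the shortest rotation to the target.

    Headings are assumed to increase counterclockwise. A half-turn is
    reported as clockwise by convention of the modulo arithmetic used here.
    """
    delta = (target_deg - current_deg + 180.0) % 360.0 - 180.0
    if delta == 0.0:
        return ("none", 0.0)
    direction = "counterclockwise" if delta > 0 else "clockwise"
    return (direction, abs(delta))


short_way = rotation_command(350.0, 10.0)   # crosses 0 degrees instead of turning 340
other_way = rotation_command(10.0, 350.0)
```

The same command can drive either the whole sound pickup device or only the main Mic, matching the two embodiments above.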

Hereinafter, the sound output method in the present disclosure is described as a whole with reference to the accompanying drawings, taking a wearable sound pickup device as an example:

FIG. 3 is an overall flowchart of a sound output method according to an embodiment of the present disclosure. As shown in FIG. 3, the method includes the following steps:

[001] The user starts using the device (the device is the sound pickup device);

[002] The integrated main Mic (corresponding to the main Mic described above) starts to move with the head and shoulders and records sound, and it is determined whether the default gear position is adopted (assume the default gear position is gear D; different gear positions correspond to different aperture sizes of the sound collecting hole of the main Mic, and four aperture sizes are assumed in this embodiment). If yes, go to step [004]; if not, go to step [005];

[003] The two auxiliary Mics located at the ears of the head and shoulders record sound, which is transmitted to the user's earphones as an electro-acoustic signal;

[004] After the integrated main Mic of [002] records, the user selects the default gear position D; the electro-acoustic signal is then transmitted to the user's earphones;

[005] After the integrated main Mic of [002] records, the user does not select the default gear position D;

[006] When the user does not select the default gear position (that is, while the user is selecting another gear position), the other channels continue to record: the four gear positions A, B, C and D correspond to four channels A, B, C and D, and sound is recorded simultaneously in every channel;

[007] After the user selects another gear position, the system automatically switches to the channel of that gear position and connects it to the earphone channel for sound transmission. The channels of the gear positions not selected by the user are disconnected from the earphones.
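The gear-position flow in steps [002] to [007] can be sketched as a small router that records all four channels in parallel and connects exactly one of them to the earphones. The class and method names are hypothetical, and integers stand in for recorded audio frames.

```python
class GearRouter:
    """Records every gear channel in parallel and routes one to the earphones."""

    GEARS = ("A", "B", "C", "D")

    def __init__(self, default_gear="D"):
        self.default_gear = default_gear
        self.active_gear = None  # no channel is connected to the earphones yet
        self.channels = {gear: [] for gear in self.GEARS}

    def record(self, frames):
        """Step [006]: append one frame to each of the four channels."""
        for gear in self.GEARS:
            self.channels[gear].append(frames[gear])

    def select(self, gear=None):
        """Steps [004]/[007]: connect the chosen (or default) gear channel."""
        chosen = self.default_gear if gear is None else gear
        if chosen not in self.channels:
            raise ValueError(f"unknown gear position: {chosen}")
        self.active_gear = chosen  # every other channel is implicitly disconnected

    def earphone_output(self):
        """Frames delivered to the earphones from the active channel only."""
        if self.active_gear is None:
            return []
        return list(self.channels[self.active_gear])


router = GearRouter()
router.record({"A": 1, "B": 2, "C": 3, "D": 4})
router.select()            # user keeps the default gear D
default_out = router.earphone_output()
router.select("B")         # user switches to gear B
switched_out = router.earphone_output()
```

Because every channel keeps recording, switching gears only changes which channel is connected, so no audio is lost while the user is choosing.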

In the above embodiment, the recording and output of the sound are mainly explained. The above-mentioned pickup device will be described in detail below:

First, the implementation principle of the embodiment of the present disclosure is explained:

Sound waves propagate in free space, that is, in an unbounded ideal medium. Because the boundary lies at infinity, the sound field to be solved can be regarded as a directional sphere.

If a coordinate system is established with the center of the sphere as the origin, as shown in FIG. 4:

Then there is: x=r*cosA (A is the angle between the radius of the target point and the x-axis)

y=r*cosB (B is the angle between the radius of the target point and the y-axis)

z=r*cosC (C is the angle between the radius of the target point and the z-axis)
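The three relations above are the direction-cosine form of a point at distance r from the origin. A short Python check, with hypothetical function names, confirms that the coordinates are recovered from the angles and that the squared cosines sum to 1:

```python
import math


def direction_angles(x: float, y: float, z: float):
    """Return (r, A, B, C): the radius and the angles to the x-, y- and z-axes."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the origin has no direction angles")
    return r, math.acos(x / r), math.acos(y / r), math.acos(z / r)


def to_cartesian(r: float, a: float, b: float, c: float):
    """Apply x = r*cos(A), y = r*cos(B), z = r*cos(C)."""
    return (r * math.cos(a), r * math.cos(b), r * math.cos(c))


r, a, b, c = direction_angles(1.0, 2.0, 2.0)
point = to_cartesian(r, a, b, c)
cosine_sum = math.cos(a) ** 2 + math.cos(b) ** 2 + math.cos(c) ** 2
```

Since the squared cosines always sum to 1, only two of the three angles are independent once r is fixed.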

First, a single plane is used for illustration (the other two planes are handled in the same way, adding two variables), taking the free sound field as an example. FIG. 4 shows a plane through the sphere, where 'O' is the location of the person and 'A', 'B' and 'C' are three different sound sources. The person changes viewing angle within the horizontal plane. The problem to be solved is how to distinguish the sound sources 'A', 'B' and 'C', so that the user experiences the sound at the different positions 'A', 'B' and 'C'.

A Mic's performance can be described by a series of objective parameters, including sensitivity, flatness, equivalent noise level, directivity and dynamic range. Taking a competition venue as the application environment, the sound pressure level at a typical venue generally lies between 60 dB and 110 dB. Traditional Mic diameters are generally 24 mm, 12 mm, 6 mm and 3 mm, with a frequency response of 20 Hz to 40 kHz; such a Mic can be regarded as approximately omnidirectional, and its measurable sound pressure level ranges from 30 dB to 140 dB. The ambient sound pressure level of the sound pickup device in the embodiments of the present disclosure may therefore lie within the measurable range of the Mic.

In a Mic, the sound collecting hole (also known as the sound hole or sound channel) is the key part for acquiring an external sound source; see FIG. 5 for details. In this embodiment, admitting the sound directly facing the channel while shielding sound from other, non-facing positions is achieved by rotation in the plane and by increasing the aperture and length of the sound collecting hole. In addition, the present disclosure adopts a contrast-stripping method: two omnidirectional Mics are added as auxiliary Mics to collect sound in real time, the main Mic, with the aperture and length of its sound collecting hole enlarged, rotates and collects sound, and the two are then synthesized to obtain the final sound.

Based on the above purpose, an embodiment of the present disclosure provides a sound pickup device configured to be connected to an augmented reality (AR) or virtual reality (VR) device, including a main Mic and a processor, wherein the main Mic is set to acquire the sound of the first sound source at the viewing angle corresponding to the main Mic, and the processor is connected to the main Mic and the auxiliary Mic and is arranged to output the sound acquired by the main Mic to the AR or VR device. In this embodiment, the main Mic in the sound pickup device is set to acquire only the sound at its corresponding viewing angle; sound at non-corresponding viewing angles is not acquired and is thereby effectively shielded. Because the main Mic collects the sound from the direction it directly faces, when the sound pickup device rotates, the direction the main Mic faces changes accordingly, and so does the sound it collects. The user thus perceives the sound of the current position in real time, and sound from directions the Mic is not facing is excluded, which deepens the user's immersion in the sound and solves the problem in the related art that the user can only experience watching from one specific position and is affected by surrounding noise.

In an optional embodiment, the sound pickup device further includes an auxiliary Mic, wherein the auxiliary Mic is disposed around the main Mic and is configured to acquire the sound of a second sound source at the viewing angle corresponding to the auxiliary Mic; the processor is further connected to the auxiliary Mic and is configured to synthesize the sound acquired by the main Mic with the sound acquired by the auxiliary Mic and output the synthesized sound to the AR or VR device.

In an optional embodiment, the processor is further configured to configure the aperture size and/or the depth of the sound collecting hole of the main Mic. As stated above, both the aperture size and the depth of the sound collecting hole of the main Mic are adjustable; for details, refer to the foregoing method embodiment. Similarly, the aperture size and depth of the sound collecting hole of the auxiliary Mic can also be made adjustable; their configuration is similar to that of the main Mic and is not described again here.

In an optional embodiment, the processor may configure the aperture size and/or the depth of the sound collecting hole of the main Mic as follows: when input configuration information is received, configuring the aperture size and/or depth of the sound collecting hole of the main Mic according to the configuration information; when no input configuration information is received, configuring them according to a preset value.

In an optional embodiment, the processor is further configured to: determine the direction and amplitude by which the sound pickup device needs to rotate; and control the sound pickup device to rotate according to the determined direction and amplitude.

In an optional embodiment, the processor is further configured to: determine the direction and amplitude by which the main Mic needs to rotate; and control the main Mic to rotate according to the determined direction and amplitude.

In an optional embodiment, when the sound pickup device is a device worn on the user's head, the main Mic includes a main Mic located at the front end of the wearable device, wherein the front end corresponds to the wearer's forehead. In this embodiment, there may be two auxiliary Mics, located on the wearable device at positions corresponding to the user's left ear and right ear respectively. It should be noted that placing the main Mic at the forehead and the auxiliary Mics at the ears is a preferred arrangement, for the following reasons:

A. Design algorithm for the main Mic

1. In the present embodiment, the main Mic is designed as a protruding (convex) surface. According to the principle of reflection from convex surfaces, almost all convex surfaces scatter sound, which makes them important reflecting surfaces for use as diffusers, because for a convex surface r is always negative (as shown in FIG. 6).

Substituting a negative value of r into the convex-surface equation:

b will then also be negative; each parameter is marked in FIG. 6. In summary, Q1 is the location of the user and S is the position of the main Mic. The sound waves emitted by the sound source Q2 enter the channel of the main Mic S, while other waves, such as those arriving at points A and B, are reflected by the spherical surface. Conversely, if Q2 is not in the position shown in FIG. 6 but elsewhere, its waves are necessarily scattered, that is, shielded.
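For orientation, one relation that produces the sign behaviour stated above is the spherical-mirror equation of geometric acoustics, with a the source distance, b the image distance and r the radius of curvature. That this is the relation intended in connection with FIG. 6 is an assumption, not something the text confirms.

```latex
\frac{1}{a} + \frac{1}{b} = \frac{2}{r}
\quad\Longrightarrow\quad
\frac{1}{b} = \frac{2}{r} - \frac{1}{a} < 0
\qquad \text{for } a > 0,\ r < 0 .
```

Under this assumption, a real source (a > 0) in front of a convex surface (r < 0) always gives a negative b: the reflected waves diverge as if from a point behind the surface instead of converging, which is the scattering behaviour described.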

2. Having established the phenomenon described in item 1, and since the first step of the embodiments of the present disclosure is to define the main Mic as a rotatable device that acquires the sound source it directly faces, the next step is to widen the aperture of the sound collecting hole of the main Mic.

3. The aperture here refers to the electroacoustic transducer (Mic) that converts an acoustic signal into an electrical signal.

4. To illustrate the problem, consider a Mic receiving aperture of volume V, and let $x(t, \mathbf{r})$ denote the value of the signal at time t and spatial position r. If $a(t, \mathbf{r})$ is the impulse response of an infinitesimal volume dV of the receiving aperture at position r, the received signal can be expressed as a convolution:

$$x_R(t, \mathbf{r}) = \int_{-\infty}^{\infty} x(\tau, \mathbf{r})\, a(t - \tau, \mathbf{r})\, d\tau$$

or, in the frequency domain:

$$X_R(f, \mathbf{r}) = X(f, \mathbf{r})\, A(f, \mathbf{r}) \qquad (1)$$

where $A(f, \mathbf{r})$ is the aperture function, which describes the response of the aperture as a function of spatial position.
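By way of illustration, relation (1) can be checked numerically in a discrete setting. The following is a minimal sketch, assuming sampled signals and circular convolution as a stand-in for the continuous integral; the sample values and the helper names `dft` and `circular_convolve` are hypothetical and are not part of the disclosure.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def circular_convolve(x, a):
    """Circular convolution: y[t] = sum over i of x[i] * a[(t - i) mod n]."""
    n = len(x)
    return [sum(x[i] * a[(t - i) % n] for i in range(n)) for t in range(n)]


# Hypothetical samples of the incoming signal and of the aperture impulse response.
signal = [1.0, 2.0, 0.0, -1.0]
impulse = [0.5, 0.25, 0.0, 0.0]

# Left side: transform of the convolved (received) signal.
lhs = dft(circular_convolve(signal, impulse))
# Right side: product of the two transforms, as in relation (1).
rhs = [X * A for X, A in zip(dft(signal), dft(impulse))]

max_error = max(abs(p - q) for p, q in zip(lhs, rhs))
```

Here `max_error` stays at the level of floating-point rounding, which is the discrete counterpart of convolution in time corresponding to multiplication in frequency.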

5. The solid angle subtended by the receiving aperture differs for signals arriving from different directions. FIG. 7 shows a linear aperture receiving a plane wave in one-dimensional space.

The response of the aperture is a function of the frequency of the incoming signal and its direction of incidence. Solving the wave equation shows that the aperture response and the aperture function are related by a Fourier transform. In the scenario described in the embodiments of the present disclosure (taking a game venue as an example), the far-field condition holds, and the aperture response is:

$$D_R(f, \boldsymbol{\alpha}) = \mathcal{F}_r\{A_R(f, \mathbf{r})\} = \int A_R(f, \mathbf{r})\, e^{j 2\pi \boldsymbol{\alpha} \cdot \mathbf{r}}\, d\mathbf{r} \qquad (2)$$

where $A_R$ is the aperture function of the receiving aperture, $\mathcal{F}_r\{\cdot\}$ is the three-dimensional Fourier transform,

$$\mathbf{r} = [x_a,\ y_a,\ z_a]^T$$

is the spatial location of a point on the aperture, and

$$\boldsymbol{\alpha} = \frac{1}{\lambda}\,[\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta]^T$$

is the direction vector of the wave; θ and φ are shown in Fig. 8.

To explain the problem more simply, the coordinates in Fig. 8 can be reduced to a one-dimensional linear aperture of length L along the X axis, as shown in Fig. 9.

In the case shown in Figure 9:

$$\mathbf{r} = [x_a,\ 0,\ 0]^T$$

and the aperture response simplifies to:

$$D_R(f, \alpha_x) = \int_{-L/2}^{L/2} A_R(f, x_a)\, e^{j 2\pi \alpha_x x_a}\, dx_a \qquad (3)$$

where:

$$\alpha_x = \frac{\sin\theta\cos\phi}{\lambda}$$

Expressed in terms of θ and φ:

$$D_R(f, \theta, \phi) = \int_{-L/2}^{L/2} A_R(f, x_a)\, e^{j \frac{2\pi}{\lambda} \sin\theta\cos\phi\, x_a}\, dx_a \qquad (4)$$

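The above result is obtained under the plane-wave assumption, so it applies only in the far field. For a linear aperture of length L, the far-field condition is met when the distance from the source to the aperture satisfies:

$$\text{source distance} > \frac{2L^2}{\lambda}$$

Consider a specific case: if the aperture function is uniform, that is, it does not vary with frequency or with position along the aperture, it can be written as:

$$A_R(x_a) = \mathrm{rect}(x_a / L) \qquad (5)$$

where $\mathrm{rect}(x_a/L)$ equals 1 for $|x_a| \le L/2$ and 0 otherwise.

The result of the Fourier transform is:

$$D_R(f, \alpha_x) = L\,\mathrm{sinc}(\alpha_x L) \qquad (6)$$

with $\mathrm{sinc}(u) = \sin(\pi u)/(\pi u)$.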

In summary, Fig. 10 plots the uniform aperture function and the corresponding directivity pattern. It can be seen from Fig. 10 that the zeros of the directivity pattern are located at $\alpha_x = m/L$, that is, at $\sin\theta\cos\phi = m\lambda/L$, where m is a nonzero integer. The region between the first zeros, $-\lambda/L \le \sin\theta\cos\phi \le \lambda/L$, is called the main lobe, and its extent, $2\lambda/L$, is the beam width. Therefore, for a fixed aperture length, the higher the frequency (the shorter the wavelength), the narrower the beam width, as shown in the figure.

From the above, the normalized aperture response is:

$$D_N(f, \alpha_x) = \frac{D_R(f, \alpha_x)}{L} = \mathrm{sinc}(\alpha_x L)$$

In the horizontal plane ($\theta = \pi/2$), the normalized response can be expressed as:

$$D_N\!\left(f, \tfrac{\pi}{2}, \phi\right) = \mathrm{sinc}\!\left(\frac{L}{\lambda}\cos\phi\right) \qquad (7)$$

From formula (7), the polar representation in the horizontal direction can be obtained. Fig. 11 shows the polar plots for L/λ = 0.5, 1, 2, and 4.
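The dependence of the horizontal pattern on L/λ can be sketched numerically. The example below is illustrative only: it assumes the normalized form sinc((L/λ)·cos φ) for a uniform linear aperture, with sinc(u) = sin(πu)/(πu), and the function names are hypothetical.

```python
import math


def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)


def horizontal_response(l_over_lambda, phi_deg):
    """Normalized response sinc((L/lambda) * cos(phi)) in the horizontal plane."""
    return sinc(l_over_lambda * math.cos(math.radians(phi_deg)))


def first_null_deg(l_over_lambda):
    """Angle phi (degrees) of the first null, where cos(phi) = lambda/L.

    Returns None when L < lambda, since the pattern then has no null.
    """
    if l_over_lambda < 1.0:
        return None
    return math.degrees(math.acos(1.0 / l_over_lambda))


# The four ratios used for the polar plots.
nulls = {ratio: first_null_deg(ratio) for ratio in (0.5, 1, 2, 4)}
```

As L/λ grows, the first null moves toward φ = 90° (broadside), so the main lobe narrows, which matches the statement that the beam width shrinks as frequency rises for a fixed aperture length.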

The linear aperture characteristics can be obtained by the calculations of the above formulas (1) to (7).

Therefore, the design algorithm for the main Mic follows from the linear aperture characteristics. For an array of N equally spaced sensors, the corresponding response in the horizontal direction is:

$$D\!\left(f, \tfrac{\pi}{2}, \phi\right) = \sum_{n=-(N-1)/2}^{(N-1)/2} w_n(f)\, e^{j \frac{2\pi f}{c} n d \cos\phi}$$

where $w_n(f)$ is the weight of the n-th sensor, d is the spacing between sensors, and c is the speed of sound, which appears when λ is written explicitly in terms of f in the derivation. The original formula can be found in equation (9).

It follows that the aperture characteristics of either a linear Mic or an equally spaced Mic array depend on the following:

The number of sensors N;

The spacing d between the sensors;

The frequency of the sound wave f;

Because the discrete sensor array is an approximation of the continuous aperture, the effective length of the sensor array is defined as the length of the corresponding continuous aperture, L = Nd, whereas the actual length of the sensor array is d(N-1).
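The dependence on N, d, and f listed above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: it assumes the standard far-field response of a uniformly spaced line array in the horizontal plane, uniform weights 1/N by default, and a speed of sound of 343 m/s; all names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def array_response(freq_hz, phi_deg, num_sensors, spacing_m, weights=None):
    """Magnitude of sum_n w_n * exp(j * 2*pi*f/c * n * d * cos(phi)).

    Sensor indices are centered on the middle of the array.
    """
    if weights is None:
        weights = [1.0 / num_sensors] * num_sensors  # uniform weighting (assumed)
    phase_step = (2 * math.pi * freq_hz / SPEED_OF_SOUND
                  * spacing_m * math.cos(math.radians(phi_deg)))
    offset = (num_sensors - 1) / 2
    total = sum(w * cmath.exp(1j * phase_step * (n - offset))
                for n, w in enumerate(weights))
    return abs(total)


def effective_length(num_sensors, spacing_m):
    """Length of the equivalent continuous aperture, L = N * d."""
    return num_sensors * spacing_m


def actual_length(num_sensors, spacing_m):
    """Physical length of the array, d * (N - 1)."""
    return spacing_m * (num_sensors - 1)
```

For example, `array_response(1000, 90, 4, 0.05)` evaluates the broadside response of four sensors spaced 5 cm apart at 1 kHz, and `effective_length(4, 0.05)` gives the 0.2 m equivalent aperture against a physical length of 0.15 m.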

With the scattering behavior described in Figure 6 and the dependence on frequency and incident direction described in Figure 7, the sound source the user is facing can be identified more accurately.

The present disclosure is described below in conjunction with laboratory test results:

To aid understanding and to confirm the placement of the main Mic and the two auxiliary Mics in the embodiments of the present disclosure, the following demonstrations were carried out in an audio laboratory. The tests used a B&K Head and Torso Simulator (HATS) conforming to international standards, and the test environment was a standard anechoic chamber (the test setup is shown in Figure 12).

A. Test of the two fixed auxiliary Mics

a. The laboratory test uses a standard head and torso simulator (the Danish B&K 4128C). The test setup is shown in Figure 13.

As shown in FIG. 13, the two auxiliary Mics are placed at the artificial-ear positions of the head and torso simulator. Because the human auricle is directional, the angle it forms with the main axis is exactly the main-axis angle shown in the figure. Based on previous experience with audio testing of terminal products, the angle ∠H formed in this test (the internationally used fixed-bracket angle) is the same as that used in the corresponding laboratory tests at China Mobile Lab.

b. For further demonstration, the following experiment was carried out in the environment shown in Fig. 14. In this test, each artificial ear approximates an auxiliary Mic (one for each of the left and right ears), and a standard loudspeaker plays the P.501 source signal or standard English dialogue.

The test strategy is as follows (see Figure 15 for details):

a. Place the auxiliary Mics at the positions designed in the embodiment of the present disclosure, and test the receiving distortion at 1020 Hz and at 8 frequency points;

b. Following the scale on the standard head and torso simulator, move the auxiliary Mics on both sides up by 6 cm, and test the receiving distortion at 1020 Hz and at 8 frequency points.

The test conclusions are detailed in Table 1:

Table 1

Scheme	Number of tests	Test conclusion
a. Design position of this patent	50	Pass
b. Position moved up by 6 cm	50	Fail

In summary, the two auxiliary Mics are preferably placed at the two ear positions of the head and torso simulator, as in the embodiment of the present disclosure.

B. Main Mic placement test

The test environment is set up as shown in Figure 15.

In this experiment, a standard 1/4-inch Mic (B&K 2670), which can be moved over the head and torso simulator, serves as the main Mic. The test strategy is as follows:

a. Place the Mic at the center point of the lip ring (point A), and test the receiving distortion at 1020 Hz and at 8 frequency points;

b. Place the Mic above the center of the lip ring, at the nose (point B), and test the receiving distortion at 1020 Hz and at 8 frequency points;

c. Place the Mic at the forehead, above the center of the lip ring (point C), and test the receiving distortion at 1020 Hz and at 8 frequency points.

The test conclusions are detailed in Table 2:

Table 2

Scheme	Number of tests	Test conclusion
a. Center of the lip ring	50	Fail (0% of runs passed); 3 points not passed
b. Nose	50	Fail (0% of runs passed); 4 points not passed
c. Forehead	50	Fail (0% of runs passed); 1 point not passed

Since the forehead position shows the fewest failing points, the main Mic is preferably placed at the forehead in the embodiment of the present disclosure.

In an alternative embodiment, the main Mic further includes a main Mic located at the top end of the wearable device, where the top end is arranged to correspond to the top of the wearer's head. Placing a main Mic at the position corresponding to the top of the head enables stereoscopic sound collection, so that the user perceives a more three-dimensional sound. It should be noted that, in the present disclosure, the position and the number of main Mics can be set freely according to actual needs.

In an optional embodiment, when the sound pickup device is a device worn on the user's head, the auxiliary Mic includes auxiliary Mics located on both sides of the wearable device, where the two sides are arranged to correspond to the wearer's ears.

From the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the part of the technical solutions of the present disclosure that is essential, or that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a cell phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present disclosure.

This embodiment also provides a sound output device, which is used to implement the above embodiments and preferred embodiments; what has already been described is not repeated. As used below, the term "module" refers to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

FIG. 17 is a structural block diagram of a sound output device according to an embodiment of the present disclosure. As shown in FIG. 17, the device includes:

The obtaining module 172 is configured to use the main Mic in the sound pickup device to acquire the sound of the first sound source within the viewing angle corresponding to the main Mic; the output module 174 is connected to the obtaining module 172 and is configured to output the sound acquired by the main Mic to an augmented reality (AR) or virtual reality (VR) device connected to the sound pickup device.

In an optional embodiment, the apparatus is further configured to: before outputting the sound acquired by the main Mic to the AR or VR device connected to the sound pickup device, use the auxiliary Mic in the sound pickup device to acquire the sound of the second sound source corresponding to the auxiliary Mic, where the auxiliary Mic is disposed around the main Mic. Outputting the sound acquired by the main Mic to the AR or VR device then includes: combining the sound acquired by the main Mic with the sound acquired by the auxiliary Mic, and outputting the synthesized sound to the AR or VR device connected to the sound pickup device.
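The combining step can be pictured with a small sketch. The disclosure does not specify how the main and auxiliary signals are combined; the example below simply assumes a weighted sum into left and right output channels, with a hypothetical `aux_gain` parameter and hypothetical function name.

```python
def mix_channels(main, aux_left, aux_right, aux_gain=0.5):
    """Combine main-Mic samples with left/right auxiliary-Mic samples.

    Returns (left_out, right_out). The weighted sum and the default
    gain are assumptions made for illustration.
    """
    if not (len(main) == len(aux_left) == len(aux_right)):
        raise ValueError("all channels must have the same number of samples")
    left_out = [m + aux_gain * a for m, a in zip(main, aux_left)]
    right_out = [m + aux_gain * a for m, a in zip(main, aux_right)]
    return left_out, right_out


# Hypothetical sample values for one short block of audio.
left, right = mix_channels([1.0, 0.0], [0.2, 0.4], [0.0, 0.2])
```

Keeping the auxiliary Mics on separate left and right outputs preserves the sense of what is happening on each side of the wearer while the main Mic supplies the facing source to both ears.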

In an optional embodiment, the obtaining module 172 may acquire the sound of the first sound source within the viewing angle corresponding to the main Mic as follows: configure the aperture size and/or the depth of the main Mic's pickup hole; then use the main Mic so configured to acquire the sound of the first sound source within the viewing angle corresponding to the main Mic.

In an optional embodiment, the obtaining module 172 may configure the aperture size and/or the depth of the main Mic's pickup hole as follows: if input configuration information is received, configure the aperture size and/or the depth of the pickup hole according to that information; if no input configuration information is received, configure the aperture size and/or the depth of the pickup hole according to preset values.
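The fallback logic just described can be sketched as follows. The preset values and the names `PickupHoleConfig`, `PRESET_CONFIG`, and `resolve_config` are hypothetical; the disclosure states only that preset values are used when no configuration information is input.

```python
from collections import namedtuple

# Aperture size and depth of the main Mic's pickup hole, in millimetres.
PickupHoleConfig = namedtuple("PickupHoleConfig", ["aperture_mm", "depth_mm"])

# Hypothetical preset; the disclosure does not give concrete preset values.
PRESET_CONFIG = PickupHoleConfig(aperture_mm=3.0, depth_mm=5.0)


def resolve_config(input_config=None):
    """Use the input configuration if one was received, else the preset."""
    return input_config if input_config is not None else PRESET_CONFIG


default_choice = resolve_config()
user_choice = resolve_config(PickupHoleConfig(aperture_mm=12.0, depth_mm=8.0))
```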

In an optional embodiment, the sound output device is further configured to: before using the main Mic in the sound pickup device to acquire the sound in the direction the main Mic faces, determine the direction and amplitude by which the sound pickup device needs to rotate, and control the sound pickup device to rotate according to the determined direction and amplitude.

In an optional embodiment, the apparatus is further configured to: before using the main Mic in the sound pickup device to acquire the sound of the first sound source within the viewing angle corresponding to the main Mic, determine the direction and amplitude by which the main Mic needs to rotate, and control the main Mic to rotate according to the determined direction and amplitude.
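One way to picture "direction and amplitude" is as the shortest rotation between the Mic's current heading and a target heading. The disclosure does not say how these values are determined; the sketch below assumes both headings are known yaw angles in degrees (for example, the target being the wearer's head yaw), and the direction labels are hypothetical.

```python
def rotation_command(current_deg, target_deg):
    """Return (direction, amplitude_deg) for the shortest rotation.

    Positive angular difference is labelled "counterclockwise"; this
    sign convention is an assumption made for illustration.
    """
    # Wrap the difference into the interval [-180, 180).
    delta = (target_deg - current_deg + 180.0) % 360.0 - 180.0
    if delta == 0:
        return ("none", 0.0)
    direction = "counterclockwise" if delta > 0 else "clockwise"
    return (direction, abs(delta))
```

With this convention, `rotation_command(350, 10)` asks for a 20° counterclockwise turn rather than a 340° clockwise one.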

It should be noted that each of the above modules may be implemented by software or hardware. In the latter case, this may be done in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located, in any combination, in different processors.

Embodiments of the present disclosure also provide a storage medium having stored therein a computer program, wherein the computer program is configured to execute the steps of any one of the method embodiments described above.

Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a hard disk, a magnetic disk, or an optical disc.

Embodiments of the present disclosure also provide an electronic device including a memory and a processor, where the memory stores a computer program and the processor is configured to run the computer program to perform the steps of any of the above method embodiments.

Optionally, the electronic device may further include a transmission device and an input and output device, wherein the transmission device is connected to the processor, and the input and output device is connected to the processor.

Optionally, for specific examples in this embodiment, refer to the examples described in the foregoing embodiments and optional embodiments; details are not repeated here.

It can be seen from the above embodiments that the core of the present disclosure is the definition and layout of Mics on an AR/VR product. To aid understanding, assume for now that the main Mic comes in four specifications: 24 mm, 12 mm, 6 mm, and 3 mm. The Mics are integrated into a fan-shaped acquisition structure that rotates with the user's head by the same angle. Through planar rotation, and by widening the aperture and increasing the length of the pickup hole, this structure admits the sound entering the channel and shields sound sources at positions it is not facing; auxiliary Mics are added to complement it.

Here, the main Mic is a sound pickup device capable of fan-like rotation. The four specifications (given only as an example; other specifications can also be used) are integrated, and the integrated Mic module is arranged to collect sound in the user's viewing direction. The two auxiliary Mics are omnidirectional; they collect, in real time, the sound around the user's position, and they can be set either to rotate with the user's head or not. After the user puts on the sound pickup device, operation proceeds as follows:

1. The integrated main Mic starts working. The device defaults to one of the gears A, B, C, and D (each gear corresponds to one specification and to one path); the default is gear D.

2. The two auxiliary Mics start working and collect the ambient sound on the user's left and right sides in real time.

3. When the user turns to face a certain area, the orientation of the integrated main Mic changes with the turning angle. The integrated Mic converts the sound into an electrical signal, which is delivered to the user through the earphones.

4. Similarly, the two auxiliary Mics convert the sounds on the left and right sides of the head into electrical signals, which are delivered to the user through the earphones.

5. If the user feels that the current effect does not meet the needs of live viewing, the user can switch gears with the adjustment button, which opens the path corresponding to the selected gear to achieve a better result.
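The gear behavior can be sketched as a small selector. The pairing of gears A to D with the 24 mm, 12 mm, 6 mm, and 3 mm specifications in that order is an assumption for illustration (the disclosure states only that each gear corresponds to one specification and that D is the default), and the class name is hypothetical.

```python
# Assumed pairing of gears to aperture specifications, in millimetres.
GEAR_APERTURE_MM = {"A": 24.0, "B": 12.0, "C": 6.0, "D": 3.0}
DEFAULT_GEAR = "D"


class MainMicSelector:
    """Tracks which gear (and therefore which path) is fed to the user."""

    def __init__(self):
        self.gear = DEFAULT_GEAR

    def switch(self, gear):
        """Switch to another gear, as the adjustment button would."""
        if gear not in GEAR_APERTURE_MM:
            raise ValueError("unknown gear: %r" % (gear,))
        self.gear = gear

    @property
    def aperture_mm(self):
        """Aperture specification of the currently selected gear."""
        return GEAR_APERTURE_MM[self.gear]


selector = MainMicSelector()
initial_aperture = selector.aperture_mm
selector.switch("A")
switched_aperture = selector.aperture_mm
```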

In the embodiments of the present disclosure, the approach is to redefine the Mic: through hardware modification, namely planar rotation and increasing or widening the aperture and the length of the pickup hole, sound is admitted into the channel and sound sources the user is not facing are shielded, thereby achieving sound source localization.

The embodiments of the present disclosure also describe a layout strategy and usage rules for allocating Mics on the acquisition surface. For example, since one core idea of the present disclosure is "increasing or widening the pickup hole", when sound sources are collected in a certain direction, several main Mics may be provided, each with a different pickup-hole aperture size; alternatively, a single main Mic may be provided with several selectable pickup-hole aperture sizes. For example, three main Mics A, B, and C may be provided, with apertures of a mm, b mm, and c mm and orientations of a°, b°, and c° relative to a certain horizontal position. Owing to their structural and positional characteristics, they convey to the user the sound from their respective orientations. It should be noted that the above is only one basic embodiment of the core of the present disclosure, not the complete or only embodiment. In the embodiments of the present disclosure, the number, specifications, orientations, and so on of the Mics are determined by later design requirements.

The "positioning method in the horizontal direction" described in the embodiments of the present disclosure is used to better explain the problem; it is one basic embodiment of the core of the present disclosure, not the complete or only embodiment. Acquisition over three-dimensional space can be achieved by adding a device that uses the same technical means and is oriented perpendicular to the basic embodiment. Thus, a complete embodiment of the present disclosure includes, but is not limited to, sound source localization in the horizontal direction; when two or more devices using the technique described in the present disclosure are added, omnidirectional acquisition can be achieved.

Embodiments of the present disclosure provide a stereo sensing technology: by deploying the device described in this patent at specific locations according to the size of the space, all-round sound source collection over three-dimensional space can be achieved. After putting on the sound pickup device, the user can, from any local position (at home or at the venue), experience the viewing perspective of a chosen position, which increases the user's choice and immersion, and the user's own location does not need to change. For example, a user at point B who turns the head to the left can choose to perceive the equivalent effect at any other point on the scene, such as that of turning the head to the left at point A.

Each of the received sound sources described in the embodiments of the present disclosure includes position information, and the position information is already included in the sound source at the beginning of the collection, that is, the position information is the initial sound source position. Therefore, when it is restored to the user, it not only includes the sound, but also allows the user to feel the change of the location.

Embodiments of the present disclosure also describe a rule for matching and synthesizing audio and video. In the prior art, when matched audio and video are restored, all parameters are adjusted based on time (frame) alone. The method described in this patent adds position coordinates through two parameters, horizontal rotation (Pan) and vertical rotation (Tilt), so that the sound can be matched to the orientation of the view; audio and video are combined when the time and the position coordinates match simultaneously.

Embodiments of the present disclosure describe a frame structure for the audio format. An existing audio file contains only simple elements such as the left and right channels and the audio tracks (sounds). In the embodiments of the present disclosure, position information is added: the collection point is taken as the origin, and the position information is the position of each sound source relative to that origin. The collection point also performs video acquisition, so it is also the origin for the video. Since the video and audio share the same coordinate system, they can be matched whenever their position information is the same. In summary, the position information described in this embodiment is the position of every sound source relative to the collection point.
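A frame carrying position information, together with the time-and-position matching rule, might look like the sketch below. The field names, the use of pan and tilt angles in degrees as the position information, and the matching tolerance are all assumptions made for illustration; the disclosure does not define a concrete frame layout.

```python
from collections import namedtuple

# Audio frame: time index, position relative to the collection point, samples.
AudioFrame = namedtuple("AudioFrame", ["frame_index", "pan_deg", "tilt_deg", "samples"])
# Video frame: time index and viewing orientation in the same coordinate system.
VideoFrame = namedtuple("VideoFrame", ["frame_index", "pan_deg", "tilt_deg"])


def angle_diff(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def matches(audio, video, tolerance_deg=1.0):
    """True when time and position (pan, tilt) coincide within the tolerance."""
    return (audio.frame_index == video.frame_index
            and angle_diff(audio.pan_deg, video.pan_deg) <= tolerance_deg
            and angle_diff(audio.tilt_deg, video.tilt_deg) <= tolerance_deg)


audio = AudioFrame(frame_index=10, pan_deg=359.5, tilt_deg=0.0, samples=(0.1, 0.2))
same_view = matches(audio, VideoFrame(10, 0.0, 0.0))
other_time = matches(audio, VideoFrame(11, 0.0, 0.0))
other_view = matches(audio, VideoFrame(10, 90.0, 0.0))
```

Wrapping the pan difference lets 359.5° and 0° count as the same orientation, so a frame matches only when both the time axis and the position axes agree.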

Embodiments of the present disclosure describe a rule for audio reproduction when the position (coordinate point) changes. The rule does not change the overall volume of the sound restored to the user; instead, it changes the proportion of loudness restored to the user for each sounding point within the collected sound field. Among the factors that affect user experience, such as video, audio, body sensation, and touch, this embodiment mainly concerns restoring the audio proportions.
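The idea of changing proportions while keeping the overall level fixed can be sketched by normalizing per-source weights so that they always sum to one. The weighting function used here, 1 + cos(angular offset), is purely an assumption; the disclosure does not state how the proportions are computed.

```python
import math


def loudness_shares(source_angles_deg, view_angle_deg):
    """Share of the total loudness given to each sounding point.

    Shares always sum to 1, so the overall level is unchanged; only the
    proportions move as the viewing angle changes.
    """
    raw = [1.0 + math.cos(math.radians(angle - view_angle_deg))
           for angle in source_angles_deg]
    total = sum(raw)
    if total == 0:
        # All sources directly behind the viewer: fall back to equal shares.
        return [1.0 / len(raw)] * len(raw)
    return [weight / total for weight in raw]


facing_front = loudness_shares([0, 90, 180], view_angle_deg=0)
facing_back = loudness_shares([0, 90, 180], view_angle_deg=180)
```

Turning from 0° to 180° shifts the largest share from the first source to the third, while the sum of shares stays the same.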

Embodiments of the present disclosure describe a user self-selection mode. The main Mic is described as having four gear positions, one of which is the default. If the user does not want the default gear, the user can switch freely to any other gear according to need; the other paths continue to record sound, and the sound of the selected path is fed back to the user, improving the user experience. The four gear positions are used here only to illustrate the problem; they are one basic functional characteristic of the core of the present disclosure, not the complete or only implementation. A complete embodiment of the present disclosure therefore includes, but is not limited to, four gear positions.

Embodiments of the present disclosure also describe an adaptive scheme based on multiplexed matching. With the sound source localization method described in the present disclosure, once main Mics are added in the number, specifications, and positions that later design requires, sound source localization across the whole venue in multiple planes can be realized, and the sources collected on the main Mic acquisition surfaces are recorded. For example, suppose two users A and B face viewing angles a° and b° at the same time, and the sound recorded across the whole venue is α. The system automatically matches user A's viewing angle a° against the position information contained in α and, when the match succeeds, transmits the matched sound source to user A. Likewise, the system automatically matches user B's viewing angle b° against the position information contained in α and, when the match succeeds, transmits the matched sound source to user B. It should be noted that this example is one basic functional characteristic of the core of the present disclosure, not the complete or only embodiment.
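The matching of each user's viewing angle against the position information in the recorded sound can be sketched as a simple angular filter. The source names, azimuth values, and the 30° half-width of the viewing window are hypothetical; the disclosure does not specify the matching criterion.

```python
def angle_diff(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def sources_in_view(sources, view_angle_deg, half_width_deg=30.0):
    """Names of recorded sources whose azimuth falls inside the viewing window.

    `sources` is a list of (name, azimuth_deg) pairs taken from the
    position information stored with the recording.
    """
    return [name for name, azimuth in sources
            if angle_diff(azimuth, view_angle_deg) <= half_width_deg]


# Hypothetical whole-venue recording with per-source position information.
recorded = [("stage", 10.0), ("crowd_left", 100.0), ("crowd_right", 260.0)]

for_user_a = sources_in_view(recorded, view_angle_deg=0.0)
for_user_b = sources_in_view(recorded, view_angle_deg=270.0)
```

Because each user is served from the same recording, the two users receive different sources at the same time without any change to the capture side.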

Embodiments of the present disclosure also describe a locally based, non-real-time multiplexed acquisition scheme. Once a scheme matching the later design requirements is adopted, the system automatically records the sound of the whole venue and stores the recording locally. When users then use the multiplexed matching scheme described above, multiple users can, in non-real time, perceive sound source localization from different viewing angles.

The embodiments of the present disclosure also provide a device layout scheme: the device described in the present disclosure can be placed at any point in the venue, and the number of terminal devices includes but is not limited to one.

The method described in the embodiments of the present disclosure differs from the methods in the prior art in the following respects:

A. The existing patents mainly concern the corresponding rendering of VR audio and video, and the selection and restoration of video and audio at any position in a virtual scene; they do not distinguish and collect sound sources from different directions in three-dimensional space. Most importantly, the existing patents can only be realized on the basis of a collection method such as the one in this patent; without it, the existing patented technology cannot be implemented.

B. The solution in the embodiments of the present disclosure achieves its purpose through hardware modification: planar rotation and increasing the aperture and length of the pickup hole, so that the facing sound source enters the channel and sources at other positions are shielded. The existing patents instead render and encode the audio data collected by N audio collection devices.

C. The two methods intervene at different stages of "sound source localization". In the present disclosure, the relevant sound source information (including the perceived position) is already recorded during the initial acquisition, and no later processing is needed to achieve the basic effect. The existing patent first inputs the sound source, then determines the audio data corresponding to each speaker device, and uses the determined Q speaker devices to render M channels of audio data; its intervention point is the later stage of rendering the M-channel audio data for the Q speakers.

D. The acquisition methods differ. The present disclosure adds two (including but not limited to two) auxiliary Mics and redefines an integrated main Mic with four (including but not limited to four) specifications, so that a single device at a single location can collect and localize sound sources from multiple directions. The prior art relies on the set orientation of each video capture device and then sets up N corresponding audio capture devices.

The main content of the existing patents is VR audio and video corresponding rendering, and selection and restoration of video and audio at any position in the virtual scene, and there is no distinction and collection of different azimuth sound sources in the stereo space, but the collection in the embodiment of the present disclosure The method can effectively distinguish and collect sound sources in different directions in the three-dimensional space.

The method described in the embodiment of the present disclosure records the related sound source (including the perceived position) information in the initial collection process, and does not need to be processed later in the implementation manner that achieves the basic effect.

The present disclosure describes a rule for audio reproduction when the position (coordinate point) changes: rather than changing the overall volume of the sound restored to the user, it changes the loudness proportion of each sounding point within the collected sound field. The prior art has no such function and simply restores the sound to the user after rendering.

The present disclosure implements a user self-selection mode: the main Mic has four gear positions (including but not limited to four), one of which is the default. If the user does not want the default gear, the user can switch to any other gear; the other paths continue to record sound, and the sound of the selected path is fed back to the user, improving the user experience.

The present disclosure establishes such a stereo sensing technology, and according to the size of the site and its space, deploying the device described in this patent at a specific location, can achieve all-dimensional, three-dimensional space acquisition. The user can observe the perception of the established position at any point in the local (at home, at any point in the stadium), increase the user's selectivity and immersion, and the user's location does not need to change.

The beneficial effects of the present disclosure are summarized as follows:

The solution in the embodiment of the present disclosure is simple and convenient to implement, and does not need to be changed too much;

The solution in the embodiment of the present disclosure can better avoid the deficiencies of the noise reduction technology in the prior art;

The solution in the embodiment of the present disclosure can perform gear position adjustment on a device, and the user independently selects the aperture size to increase the user's selectivity.

The specific application scenarios of the present disclosure are described below:

The solution in the embodiments of the present disclosure can be applied to existing smart phone products. When the user views a scene with the device and turns the head, the sound at the corresponding position can be perceived in real time, deepening the sense of immersion; a variety of selectable gears is also provided so that the user can obtain a better experience.

In other fields, the disclosed solution can be used in urban planning and urban modeling: for example, a user conducting an on-site inspection can wear the terminal to assess whether a proposed model is reasonable. It can be used in geography, combining geographic information such as 3D GIS to perceive certain terrain types and provide reliable reference data. It can also be used in disaster relief: when a major accident occurs, the internal structure of a building and other information affecting rescue work can be perceived in real time, so that the best rescue route can be located accurately and the best rescue means selected, greatly improving rescue efficiency; images returned from the scene can be tracked and rescue commands issued in real time.

It should be noted that the patented invention can also be applied to various fields such as military, industrial, electronic cruise, education, and the like.

It will be apparent to those skilled in the art that the various modules or steps of the present disclosure described above can be implemented by a general-purpose computing device, and that they can be centralized on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from the one herein. Alternatively, they may be fabricated separately as individual integrated circuit modules, or several of the modules or steps may be fabricated as a single integrated circuit module. As such, the disclosure is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present disclosure and is not intended to limit the disclosure. For those skilled in the art, various changes and modifications may be made to the present disclosure. Any modifications, equivalent substitutions, improvements, and the like made within the principles of the present disclosure are intended to be included within the scope of protection of the present disclosure.

Industrial applicability

As described above, the sound pickup device, the sound output method, the device, the storage medium, and the electronic device provided by the embodiments of the present disclosure have the following beneficial effect: they solve the problem in the related art that a user can only experience viewing from a certain specific location and is affected by the surrounding noise.

Claims

A sound pickup device configured to be connected to an augmented reality (AR) or virtual reality (VR) device, comprising: a main microphone (Mic) and a processor, wherein

the main Mic is configured to acquire a sound of a first sound source within the viewing angle corresponding to the main Mic; and

the processor is coupled to the main Mic and configured to output the sound acquired by the main Mic to the AR or VR device.
The sound pickup device according to claim 1, wherein the sound pickup device further comprises an auxiliary Mic, wherein

the auxiliary Mic is disposed around the main Mic and is configured to acquire a sound of a second sound source within the viewing angle corresponding to the auxiliary Mic; and

the processor is further connected to the auxiliary Mic and configured to synthesize the sound acquired by the main Mic and the sound acquired by the auxiliary Mic, and to output the synthesized sound to the AR or VR device.
The sound pickup device according to claim 1, wherein the processor is further configured to configure an aperture size and/or a depth of a sound pickup hole of the main Mic.
The sound pickup device according to claim 3, wherein the processor configures the aperture size and/or the depth of the sound pickup hole of the main Mic by:

when input configuration information is received, configuring the aperture size and/or the depth of the sound pickup hole of the main Mic according to the configuration information; and

when no input configuration information is received, configuring the aperture size and/or the depth of the sound pickup hole of the main Mic according to a preset value.
The sound pickup device according to claim 1, wherein the processor is further configured to:

determine a direction and a magnitude by which the sound pickup device needs to be rotated; and

control the sound pickup device to rotate according to the determined direction and magnitude.
The sound pickup device according to claim 1, wherein the processor is further configured to:

determine a direction and a magnitude by which the main Mic needs to be rotated; and

control the main Mic to rotate according to the determined direction and magnitude.
The sound pickup device according to any one of claims 1 to 6, wherein, when the sound pickup device is a wearable device for a user's head, the main Mic includes a main Mic located at a front end of the wearable device, wherein the front end is arranged to correspond to the wearer's forehead.
The sound pickup device according to claim 7, wherein the main Mic further includes a main Mic located at a top end of the wearable device, wherein the top end is arranged to correspond to the top of the wearer's head.
The sound pickup device according to claim 2, wherein, when the sound pickup device is a wearable device for a user's head, the auxiliary Mic includes auxiliary Mics located on two sides of the wearable device, wherein the two sides are arranged to correspond to the wearer's ears.
A sound output method, comprising:

acquiring, by using a main microphone (Mic) in a sound pickup device, a sound of a first sound source within the viewing angle corresponding to the main Mic; and

outputting the sound acquired by the main Mic to an augmented reality (AR) or virtual reality (VR) device connected to the sound pickup device.
The sound output method according to claim 10, wherein

before outputting the sound acquired by the main Mic to the AR or VR device connected to the sound pickup device, the method further comprises: acquiring, by using an auxiliary Mic in the sound pickup device, a sound of a second sound source within the viewing angle corresponding to the auxiliary Mic, wherein the auxiliary Mic is disposed around the main Mic; and

outputting the sound acquired by the main Mic to the AR or VR device connected to the sound pickup device comprises: synthesizing the sound acquired by the main Mic and the sound acquired by the auxiliary Mic, and outputting the synthesized sound to the AR or VR device connected to the sound pickup device.
The sound output method according to claim 10, wherein acquiring, by using the main Mic in the sound pickup device, the sound of the first sound source within the viewing angle corresponding to the main Mic comprises:

configuring an aperture size and/or a depth of a sound pickup hole of the main Mic; and

acquiring the sound of the first sound source within the viewing angle corresponding to the main Mic by using the main Mic configured with the aperture size and/or the depth of the sound pickup hole.
The sound output method according to claim 12, wherein configuring the aperture size and/or the depth of the sound pickup hole of the main Mic comprises:

when input configuration information is received, configuring the aperture size and/or the depth of the sound pickup hole of the main Mic according to the configuration information; and

when no input configuration information is received, configuring the aperture size and/or the depth of the sound pickup hole of the main Mic according to a preset value.
The sound output method according to claim 10, wherein, before acquiring, by using the main Mic in the sound pickup device, the sound of the first sound source within the viewing angle corresponding to the main Mic, the method further comprises:

determining a direction and a magnitude by which the sound pickup device needs to be rotated; and

controlling the sound pickup device to rotate according to the determined direction and magnitude.
The sound output method according to claim 10, wherein, before acquiring, by using the main Mic in the sound pickup device, the sound of the first sound source within the viewing angle corresponding to the main Mic, the method further comprises:

determining a direction and a magnitude by which the main Mic needs to be rotated; and

controlling the main Mic to rotate according to the determined direction and magnitude.
A sound output device, comprising:

an acquisition module configured to acquire, by using a main microphone (Mic) in a sound pickup device, a sound of a first sound source within the viewing angle corresponding to the main Mic; and

an output module configured to output the sound acquired by the main Mic to an augmented reality (AR) or virtual reality (VR) device connected to the sound pickup device.
The sound output device according to claim 16, wherein the acquisition module acquires the sound of the first sound source within the viewing angle corresponding to the main Mic by using the main Mic in the sound pickup device as follows:

configuring an aperture size and/or a depth of a sound pickup hole of the main Mic; and

acquiring the sound of the first sound source within the viewing angle corresponding to the main Mic by using the main Mic configured with the aperture size and/or the depth of the sound pickup hole.
A storage medium having a computer program stored therein, wherein the computer program is configured to perform, when run, the method of any one of claims 10 to 15.
An electronic device comprising a memory and a processor, the memory storing a computer program, and the processor being configured to run the computer program to perform the method of any one of claims 10 to 15.