WO2021143656A1 - Stereo sound pickup method and apparatus, terminal device, and computer-readable storage medium - Google Patents
- Publication number
- WO2021143656A1 (PCT/CN2021/071156)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- sound pickup
- microphone
- terminal device
- target
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/04—Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/027—Spatial or constructional arrangements of microphones, e.g. in dummy heads
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/40—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
- H04R1/406—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R29/00—Monitoring arrangements; Testing arrangements
- H04R29/004—Monitoring arrangements; Testing arrangements for microphones
- H04R29/005—Microphone arrays
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/007—Two-channel systems in which the audio signals are in digital form
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2205/00—Details of stereophonic arrangements covered by H04R5/00 but not provided for in any of its subgroups
- H04R2205/026—Single (sub)woofer with two or more satellite loudspeakers for mid- and high-frequency band reproduction driven via the (sub)woofer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
- H04R2430/20—Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2499/00—Aspects covered by H04R or H04S not otherwise provided for in their subgroups
- H04R2499/10—General applications
- H04R2499/11—Transducers incorporated or for use in hand-held devices, e.g. mobile phones, PDA's, camera's
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/13—Aspects of volume control, not necessarily automatic, in stereophonic sound systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
Definitions
- the present invention relates to the field of audio processing, and in particular, to a stereo sound pickup method, device, terminal equipment, and computer-readable storage medium.
- the direction of the stereo beam generated by a terminal device often cannot be adjusted because its configuration parameters are fixed. The terminal device therefore has difficulty adapting to the requirements of different scenes and cannot obtain a good stereo recording effect.
- the purpose of the present invention is to provide a stereo sound pickup method, device, terminal equipment and computer-readable storage medium, so that the terminal equipment can obtain better stereo recording effects in different video recording scenes.
- an embodiment of the present invention provides a stereo sound pickup method, which is applied to a terminal device, the terminal device includes a plurality of microphones, and the method includes:
- a target beam parameter group corresponding to the multiple target sound pickup data is determined from a plurality of pre-stored beam parameter groups; wherein the target beam parameter group includes the beam parameters corresponding to each of the multiple target sound pickup data;
- a stereo beam is formed according to the target beam parameter group and the multiple target sound pickup data.
- because the target beam parameter group is determined according to the posture data and camera data of the terminal device, different video recording scenes yield different posture data and camera data and therefore different target beam parameter groups. When stereo beams are formed according to the target beam parameter group and the multiple target sound pickup data, the different target beam parameter groups can be used to adjust the direction of the stereo beam, thereby effectively reducing the recording environment
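The beam-forming step can be illustrated with a minimal weighted-sum sketch. It assumes, purely for illustration, that each beam parameter is a single scalar weight per microphone and per output channel; the source does not specify the form of the beam parameters, and practical beamformers usually apply per-frequency filters. The names `form_stereo_beam` and `BeamParameterGroup` are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: one scalar weight per target pickup stream for
# each output channel. The source does not define the parameter format.
BeamParameterGroup = Dict[str, List[float]]


def form_stereo_beam(
    params: BeamParameterGroup,
    pickup: Sequence[Sequence[float]],
) -> Dict[str, List[float]]:
    """Form left/right channels as weighted sums of the target pickup data.

    pickup[m][n] is sample n of target pickup stream m. All streams must have
    the same length, and each channel needs one weight per stream.
    """
    if not pickup:
        raise ValueError("at least one pickup stream is required")
    n_samples = len(pickup[0])
    if any(len(stream) != n_samples for stream in pickup):
        raise ValueError("all pickup streams must have the same length")

    stereo: Dict[str, List[float]] = {}
    for channel in ("left", "right"):
        weights = params[channel]
        if len(weights) != len(pickup):
            raise ValueError("one beam parameter is needed per pickup stream")
        # Weighted sum across microphones, computed sample by sample.
        stereo[channel] = [
            sum(w * stream[n] for w, stream in zip(weights, pickup))
            for n in range(n_samples)
        ]
    return stereo
```

Swapping in a different parameter group changes which microphones dominate each channel, which is the sense in which the selected group steers the beam in this simplified model.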
- the camera data includes activation data, and the activation data characterizes an activated camera;
- the step of determining a target beam parameter group corresponding to the plurality of target sound pickup data from a plurality of pre-stored beam parameter groups according to the attitude data and the camera data includes: determining, according to the attitude data and the activation data, a first target beam parameter group corresponding to the plurality of target sound pickup data from a plurality of pre-stored beam parameter groups;
- the step of forming a stereo beam according to the target beam parameter group and the plurality of target sound pickup data includes: forming a first stereo beam according to the first target beam parameter group and the plurality of target sound pickup data; wherein the first stereo beam points to the shooting direction of the activated camera.
- the first target beam parameter group is determined by the posture data of the terminal device and the activation data characterizing the activated camera, and the first stereo beam is formed according to the first target beam parameter group and multiple target sound pickup data. In different video recording scenes, the direction of the first stereo beam is thus adjusted adaptively according to the attitude data and the activation data, ensuring that the terminal device can obtain a better stereo recording effect when recording video.
- the multiple beam parameter groups include a first beam parameter group, a second beam parameter group, a third beam parameter group, and a fourth beam parameter group, and the beam parameters in the first beam parameter group, the second beam parameter group, the third beam parameter group, and the fourth beam parameter group are different;
- the first target beam parameter group is the first beam parameter group
- the first target beam parameter group is the second beam parameter group
- the first target beam parameter group is the third beam parameter group
- the first target beam parameter group is the fourth beam parameter group.
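Selecting one of four pre-stored groups from the device state amounts to a table lookup. The sketch below assumes the four states are the combinations of landscape or portrait posture with the rear or front camera, as suggested by FIGS. 6 to 9; the pairing of each state with a particular group is an assumption, because the conditions are not stated in the text above. All identifiers are hypothetical.

```python
from enum import Enum


class Orientation(Enum):
    LANDSCAPE = "landscape"
    PORTRAIT = "portrait"


class Camera(Enum):
    REAR = "rear"
    FRONT = "front"


# Assumed pairing of device state to pre-stored group; the ordering is
# illustrative only and is not taken from the source text.
BEAM_GROUP_BY_STATE = {
    (Orientation.LANDSCAPE, Camera.REAR): "first_beam_parameter_group",
    (Orientation.LANDSCAPE, Camera.FRONT): "second_beam_parameter_group",
    (Orientation.PORTRAIT, Camera.REAR): "third_beam_parameter_group",
    (Orientation.PORTRAIT, Camera.FRONT): "fourth_beam_parameter_group",
}


def select_first_target_group(orientation: Orientation, camera: Camera) -> str:
    """Return the name of the pre-stored group for the given device state."""
    return BEAM_GROUP_BY_STATE[(orientation, camera)]
```

A lookup table keeps the selection constant-time and makes it easy to add further states without changing the selection logic.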
- the camera data includes activation data and zoom data, wherein the zoom data is a zoom factor of an activated camera characterized by the activation data;
- the step of determining a target beam parameter group corresponding to the plurality of target sound pickup data from a plurality of pre-stored beam parameter groups according to the attitude data and the camera data includes: determining, according to the attitude data, the activation data, and the zoom data, a second target beam parameter group corresponding to the plurality of target sound pickup data from a plurality of pre-stored beam parameter groups;
- the step of forming a stereo beam according to the target beam parameter group and the plurality of target sound pickup data includes: forming a second stereo beam according to the second target beam parameter group and the plurality of target sound pickup data; wherein the second stereo beam points to the shooting direction of the activated camera, and the width of the second stereo beam narrows as the zoom factor increases.
- the second target beam parameter group is determined by the posture data of the terminal device, the activation data characterizing the activated camera, and the zoom data, and the second stereo beam is formed according to the second target beam parameter group and multiple target sound pickup data.
- in different video recording scenarios, the direction and width of the second stereo beam can therefore be adjusted adaptively according to the attitude data, activation data, and zoom data, so that better recording robustness is achieved in noisy environments and under long-distance sound pickup conditions.
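The relationship between zoom factor and beam width can be pictured with a simple monotonic mapping. This is only a sketch: the inverse-proportional rule, the 120 degree base width, and the 30 degree floor are all assumed values chosen for illustration, and the source states only that the beam narrows as the zoom factor increases.

```python
def beam_width_degrees(
    zoom_factor: float,
    base_width: float = 120.0,  # assumed width at 1x zoom
    min_width: float = 30.0,    # assumed lower bound on the width
) -> float:
    """Return a beam width that shrinks as the zoom factor grows."""
    if zoom_factor < 1.0:
        raise ValueError("zoom_factor is expected to be at least 1.0")
    # Inverse-proportional narrowing, clamped so the beam never collapses.
    return max(min_width, base_width / zoom_factor)
```

In a system with pre-stored groups, a width computed this way would more likely be used to pick the nearest stored group than to synthesize parameters directly.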
- the step of obtaining multiple target sound pickup data from the sound pickup data of the multiple microphones includes:
- if there is abnormal sound data, eliminate the abnormal sound data in the sound pickup data of the multiple microphones to obtain the initial target sound pickup data;
- the multiple target sound pickup data used to form a stereo beam are determined by performing microphone-blocking detection on the multiple microphones and abnormal-sound processing on their sound pickup data, so that recording remains robust under interference and microphone blocking, ensuring a good stereo recording effect.
- the step of obtaining the serial numbers of the microphones that are not blocked according to the sound pickup data of the multiple microphones includes:
- a relatively accurate microphone-blocking detection result can be obtained, which benefits the subsequent determination of the multiple target sound pickup data used to form a stereo beam, so as to ensure a good stereo recording effect.
- the step of detecting whether there is abnormal sound data in the sound pickup data of each of the microphones includes:
- using the pre-trained abnormal sound detection network and the frequency domain information corresponding to the sound pickup data of each microphone, it is detected whether there is abnormal sound data in the sound pickup data of each microphone.
- the microphone pickup data is subjected to frequency domain transformation processing, and the pre-trained abnormal sound detection network and the frequency domain information corresponding to the microphone pickup data are used to detect whether there is abnormal sound data in the microphone pickup data, which makes it easier to obtain relatively clean pickup data subsequently, so as to ensure a good stereo recording effect.
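The frequency domain transformation that feeds the detection network can be sketched as framing followed by a discrete Fourier transform. The frame length, hop size, and the use of a plain magnitude spectrum are assumptions made for illustration; the detection network itself is not shown because the source gives no details of its structure. The function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def split_into_frames(
    samples: Sequence[float], frame_len: int, hop: int
) -> List[List[float]]:
    """Cut a pickup stream into fixed-length, possibly overlapping frames."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    return [
        list(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def magnitude_spectrum(frame: Sequence[float]) -> List[float]:
    """Magnitudes of DFT bins 0..N/2 of one frame (direct O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]
```

The per-frame magnitude vectors are the kind of frequency domain information that could be passed to a classifier. A production implementation would normally use an FFT library and a window function rather than this direct transform.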
- the step of eliminating abnormal sound data in the sound pickup data of the multiple microphones includes:
- when the abnormal sound is eliminated, detecting whether there is preset sound data in the abnormal sound data and taking different elimination measures based on the detection result ensures that relatively clean sound pickup data is obtained, and also prevents the sound data that the user expects to record from being completely eliminated.
- the step of obtaining multiple target sound pickup data from the sound pickup data of the multiple microphones includes:
- microphone-blocking detection is performed on the multiple microphones, and the sound pickup data corresponding to the serial numbers of the unblocked microphones is selected for subsequently forming the stereo beam, so that a blocked microphone hole does not significantly reduce the sound quality or noticeably unbalance the stereo when the terminal device records video; that is, the stereo recording effect is maintained and recording robustness is good even when a microphone is blocked.
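Once the serial numbers of the unblocked microphones are known, choosing the target pickup data is a filtering step. The sketch below assumes the pickup data is keyed by an integer microphone serial number; how blocking is detected is outside its scope. `select_target_pickup` is a hypothetical name.

```python
from typing import Dict, Iterable, List


def select_target_pickup(
    pickup_by_serial: Dict[int, List[float]],
    unblocked_serials: Iterable[int],
) -> Dict[int, List[float]]:
    """Keep only the pickup data of microphones reported as unblocked."""
    wanted = set(unblocked_serials)
    # Iterate in serial-number order so the output order is deterministic.
    return {
        serial: pickup_by_serial[serial]
        for serial in sorted(pickup_by_serial)
        if serial in wanted
    }
```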
- the step of obtaining multiple target sound pickup data from the sound pickup data of the multiple microphones includes:
- the abnormal sound data in the sound pickup data of the plurality of microphones is eliminated to obtain a plurality of target sound pickup data.
- the method further includes:
- the frequency response can be corrected to be flat, thereby obtaining a better stereo recording effect.
- the method further includes:
- the low-volume pickup data can be heard clearly, and the high-volume pickup data will not produce clipping and distortion, so that the sound recorded by the user is adjusted to an appropriate volume, improving the user's video recording experience.
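The volume adjustment described here can be sketched as a peak-based gain control with a hard limit. The target peak, the maximum gain, and the block-wise peak measurement are assumptions for illustration; the source does not describe how the gain is computed.

```python
from typing import List, Sequence


def auto_gain(
    samples: Sequence[float],
    target_peak: float = 0.5,  # assumed desired peak level
    max_gain: float = 8.0,     # assumed cap so noise is not over-amplified
    full_scale: float = 1.0,   # largest representable magnitude
) -> List[float]:
    """Scale a block toward target_peak and clamp it to full scale."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = min(target_peak / peak, max_gain)
    # Clamp as a final guard against clipping beyond full scale.
    return [max(-full_scale, min(full_scale, s * gain)) for s in samples]
```

Quiet blocks are boosted up to the gain cap, while loud blocks are attenuated toward the target peak, which corresponds to the two cases mentioned above.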
- the camera data includes the zoom factor of the activated camera
- the step of adjusting the gain of the stereo beam includes:
- the gain of the stereo beam is adjusted according to the zoom factor of the camera.
- the gain of the stereo beam is adjusted according to the zoom factor of the camera, so that the volume of the target sound source will not decrease due to the distance, thereby improving the sound effect of the recorded video.
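One way to picture zoom-dependent gain is a boost that grows with the logarithm of the zoom factor. The 3 dB per doubling slope and the 12 dB ceiling are assumed values, not taken from the source, which states only that the gain is adjusted according to the zoom factor.

```python
import math
from typing import List, Sequence


def zoom_gain_db(
    zoom_factor: float,
    db_per_doubling: float = 3.0,  # assumed boost per doubling of zoom
    max_boost_db: float = 12.0,    # assumed ceiling on the boost
) -> float:
    """Return a gain in dB that increases with the zoom factor."""
    if zoom_factor < 1.0:
        raise ValueError("zoom_factor is expected to be at least 1.0")
    return min(max_boost_db, db_per_doubling * math.log2(zoom_factor))


def apply_gain_db(samples: Sequence[float], gain_db: float) -> List[float]:
    """Apply a dB gain to a block of stereo-beam samples."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]
```

Raising the gain as the zoom increases compensates for the target sound source being farther away, so its volume does not drop in the recorded video.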
- the number of the microphones is 3 to 6, wherein at least one microphone is arranged on the front of the screen of the terminal device or the back of the terminal device.
- the number of the microphones is three: one microphone is arranged on each of the top and bottom of the terminal device, and one microphone is arranged on the front of the screen of the terminal device or on the back of the terminal device.
- the number of the microphones is six: two microphones are arranged on each of the top and bottom of the terminal device, and one microphone is arranged on each of the front of the screen of the terminal device and the back of the terminal device.
- an embodiment of the present invention provides a stereo sound pickup device, which is applied to a terminal device, the terminal device includes a plurality of microphones, and the device includes:
- a pickup data acquisition module, configured to acquire multiple target sound pickup data from the sound pickup data of the multiple microphones;
- a device parameter acquisition module, configured to acquire the posture data and camera data of the terminal device;
- the beam parameter determination module is configured to determine a target beam parameter group corresponding to the multiple target sound pickup data from a plurality of pre-stored beam parameter groups according to the attitude data and the camera data; wherein the target beam parameter group includes beam parameters corresponding to each of the multiple target sound pickup data;
- the beam forming module is configured to form a stereo beam according to the target beam parameter group and the multiple target sound pickup data.
- an embodiment of the present invention provides a terminal device, including a memory storing a computer program and a processor.
- when the computer program is read and executed by the processor, the method described in any of the foregoing embodiments is implemented.
- an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored.
- when the computer program is read and executed by a processor, the method according to any one of the foregoing embodiments is implemented.
- embodiments of the present invention also provide a computer program product which, when run on a computer, causes the computer to execute the method described in any one of the foregoing embodiments.
- an embodiment of the present invention also provides a chip system, which includes a processor and may also include a memory, configured to implement the method according to any one of the foregoing embodiments.
- the chip system can be composed of chips, or it can include chips and other discrete devices.
- FIG. 1 shows a schematic diagram of a hardware structure of a terminal device provided by an embodiment of the present invention
- FIG. 2 shows a schematic diagram of the layout when the number of microphones on the terminal device is 3 according to an embodiment of the present invention
- FIG. 3 shows a schematic diagram of the layout when the number of microphones on the terminal device is 6 according to an embodiment of the present invention
- FIG. 4 shows a schematic flowchart of a stereo sound pickup method provided by an embodiment of the present invention
- FIG. 5 shows another schematic flowchart of a stereo sound pickup method provided by an embodiment of the present invention.
- FIG. 6 shows a schematic diagram of the corresponding first stereo beam when the terminal device is in a landscape state and the rear camera is enabled
- FIG. 7 shows a schematic diagram of the corresponding first stereo beam when the terminal device is in a landscape state and the front camera is enabled
- FIG. 8 shows a schematic diagram of the corresponding first stereo beam when the terminal device is in a vertical screen state and the rear camera is enabled
- FIG. 9 shows a schematic diagram of the corresponding first stereo beam when the terminal device is in a vertical screen state and the front camera is enabled
- FIG. 10 shows another schematic flowchart of a stereo sound pickup method provided by an embodiment of the present invention.
- FIGS. 11a-11c show schematic diagrams of the width of the second stereo beam varying with the zoom factor of the activated camera
- FIG. 12 shows a schematic flowchart of a sub-step of S201 in FIG. 4;
- FIG. 13 shows a schematic flowchart of another sub-step of S201 in FIG. 4;
- FIG. 14 shows a schematic flowchart of another sub-step of S201 in FIG. 4;
- FIG. 15 shows another schematic flowchart of a stereo sound pickup method provided by an embodiment of the present invention.
- FIG. 16 shows another schematic flowchart of a stereo sound pickup method provided by an embodiment of the present invention.
- FIG. 17 shows a schematic diagram of a functional module of a stereo sound pickup device provided by an embodiment of the present invention.
- FIG. 18 shows a schematic diagram of another functional module of a stereo sound pickup device provided by an embodiment of the present invention.
- FIG. 19 shows a schematic diagram of another functional module of the stereo sound pickup device provided by an embodiment of the present invention.
- relational terms such as “first” and “second” are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations.
- the terms “include”, “comprise”, or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element defined by the phrase “including a ...” does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
- FIG. 1 shows a schematic diagram of a hardware structure of a terminal device.
- the terminal device may include a processor 110, an internal memory 120, an external memory interface 130, a sensor module 140, a camera 150, a display screen 160, an audio module 170, a speaker 171, a microphone 172, a receiver 173, a headset interface 174, a mobile communication module 180, a wireless communication module 190, a USB (Universal Serial Bus) interface 101, a charging management module 102, a power management module 103, a battery 104, buttons 105, a motor 106, an indicator 107, a subscriber identification module (SIM) card interface 108, an antenna 1, an antenna 2, etc.
- FIG. 1 is only an example.
- the terminal device of the embodiment of the present invention may have more or fewer components than the terminal device shown in FIG. 1, may combine two or more components, or may have different component configurations.
- the various components shown in FIG. 1 may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application specific integrated circuits.
- the processor 110 may include one or more processing units.
- the processor 110 may include an application processor (Application Processor, AP), a modem processor, a graphics processor (Graphics Processing Unit, GPU), an image signal processor (Image Signal Processor, ISP), a controller, a memory, a video codec, a digital signal processor (Digital Signal Processor, DSP), a baseband processor, and/or a neural network processor (Neural-network Processing Unit, NPU), etc.
- the different processing units may be independent devices or integrated in one or more processors.
- the controller can be the nerve center and command center of the terminal device.
- the controller can generate operation control signals according to the instruction operation code and timing signals, and complete the control of fetching and executing instructions.
- a memory may also be provided in the processor 110 to store instructions and data.
- the memory in the processor 110 is a cache memory.
- the memory can store instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory, which avoids repeated access, reduces the waiting time of the processor 110, and thereby improves the efficiency of the system.
- the internal memory 120 may be used to store computer programs and/or data.
- the internal memory 120 may include a storage program area and a storage data area.
- the storage program area can store the operating system, the application programs required by at least one function (such as a sound playback function, an image playback function, or a face recognition function), etc.;
- the storage data area can store data created during the use of the terminal device (such as audio data, image data) and so on.
- the processor 110 may execute various functional applications and data processing of the terminal device by running a computer program and/or data stored in the internal memory 120.
- when the computer program and/or data stored in the internal memory 120 are read and run by the processor 110, the terminal device can execute the stereo sound pickup method provided in the embodiment of the present invention, so that the terminal device can obtain a better stereo recording effect in different video recording scenes.
- the internal memory 120 may include a high-speed random access memory, and may also include a non-volatile memory.
- the non-volatile memory may include at least one magnetic disk storage device, flash memory device, Universal Flash Storage (UFS), and so on.
- the external memory interface 130 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal device.
- the external memory card communicates with the processor 110 through the external memory interface 130 to realize the data storage function. For example, audio, video, and other files can be saved on the external memory card.
- the sensor module 140 may include one or more sensors.
- for example, an acceleration sensor 140A, a gyroscope sensor 140B, a distance sensor 140C, a pressure sensor 140D, a touch sensor 140E, a fingerprint sensor 140F, an ambient light sensor 140G, a bone conduction sensor 140H, a proximity light sensor 140J, a temperature sensor 140K, an air pressure sensor 140L, a magnetic sensor 140M, etc., without being limited thereto.
- the acceleration sensor 140A can perceive changes in acceleration force, such as shaking, dropping, rising, descending, and changes in the angle at which the terminal device is held, and convert them into electrical signals.
- the acceleration sensor 140A can detect whether the terminal device is in a landscape (horizontal screen) state or a portrait (vertical screen) state.
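Landscape versus portrait detection from an accelerometer can be sketched by comparing the gravity components along the device's two in-plane axes. The axis convention (x along the short edge, y along the long edge) and the simple magnitude comparison are assumptions; real devices typically add hysteresis and ignore readings taken while the device lies flat.

```python
def screen_orientation(accel_x: float, accel_y: float) -> str:
    """Classify orientation from in-plane gravity components (m/s^2).

    Assumed convention: x runs along the short edge and y along the long
    edge, so gravity falls mostly on y when the device is held upright.
    """
    return "landscape" if abs(accel_x) > abs(accel_y) else "portrait"
```

The resulting state is the kind of posture data that can then be combined with the camera data to choose a beam parameter group.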
- the gyro sensor 140B may be used to determine the motion posture of the terminal device.
- the angular velocity of the terminal device around three axes (i.e., the x, y, and z axes) can be determined by the gyro sensor 140B.
- the gyro sensor 140B can be used for image stabilization.
- the gyro sensor 140B detects the shake angle of the terminal device, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shake of the terminal device through reverse movement to achieve anti-shake.
- the gyro sensor 140B can also be used for navigation and somatosensory game scenes.
- the distance sensor 140C may be used to measure distance.
- the terminal device can measure the distance by infrared or laser. Exemplarily, in a shooting scene, the terminal device may use the distance sensor 140C to measure distance to achieve rapid focusing.
- the pressure sensor 140D can be used to sense pressure signals and convert the pressure signals into electrical signals.
- the pressure sensor 140D may be provided on the display screen 160.
- the capacitive pressure sensor may include at least two parallel plates with conductive materials.
- the touch sensor 140E is also called “touch panel”.
- the touch sensor 140E may be disposed on the display screen 160; the touch sensor 140E and the display screen 160 together form a touchscreen.
- the touch sensor 140E is used to detect touch operations acting on or near it.
- the touch sensor 140E may transmit the detected touch operation to the application processor to determine the type of the touch event, and may provide visual output related to the touch operation through the display screen 160.
- the touch sensor 140E may also be disposed on the surface of the terminal device, which is different from the position of the display screen 160.
- the fingerprint sensor 140F can be used to collect fingerprints.
- the terminal device can use the collected fingerprint characteristics to realize functions such as fingerprint unlocking, access to locked applications, fingerprint-triggered photographing, and fingerprint-based call answering.
- the ambient light sensor 140G can be used to sense the brightness of the ambient light.
- the terminal device can adaptively adjust the brightness of the display screen 160 according to the perceived brightness of the ambient light.
- the ambient light sensor 140G can also be used to automatically adjust the white balance when taking pictures.
- the ambient light sensor 140G can also cooperate with the proximity light sensor 140J to detect whether the terminal device is in the pocket to prevent accidental touch.
- the bone conduction sensor 140H may be used to obtain vibration signals.
- the bone conduction sensor 140H can acquire the vibration signal of the bone mass that vibrates when a person speaks.
- the bone conduction sensor 140H can also contact the human pulse and receive the blood pressure pulse signal.
- the bone conduction sensor 140H may also be provided in the earphone, combined with the bone conduction earphone.
- the audio module 170 can parse out a voice signal from the vibration signal of the vocal-part bone mass obtained by the bone conduction sensor 140H, to realize the voice function.
- the application processor can parse heart rate information from the blood pressure pulse signal obtained by the bone conduction sensor 140H, to realize the heart rate detection function.
- the proximity light sensor 140J may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
- the light emitting diode may be an infrared light emitting diode.
- the terminal device emits infrared light to the outside through the light emitting diode.
- the terminal device uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, the terminal device can determine that there is an object nearby; when insufficient reflected light is detected, it can determine that there is no object nearby.
- the terminal device can use the proximity light sensor 140J to detect that the user holds the terminal device close to the ear to talk, so as to automatically turn off the screen to save power.
- the temperature sensor 140K can be used to detect temperature.
- the terminal device uses the temperature detected by the temperature sensor 140K to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 140K exceeds a threshold, the terminal device reduces the performance of the processor located near the temperature sensor 140K, so as to reduce power consumption and implement thermal protection.
- when the temperature is lower than another threshold, the terminal device heats the battery 104 to avoid abnormal shutdown of the terminal device due to low temperature.
- the terminal device boosts the output voltage of the battery 104 to avoid abnormal shutdown caused by low temperature.
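The temperature processing strategy described above can be illustrated with a minimal sketch. The threshold values, the function name `thermal_actions`, and the action labels are hypothetical; the text does not give concrete thresholds, and the sketch assumes that the voltage boost is triggered by a separate, lower threshold than battery heating.

```python
def thermal_actions(temp_c, high=45.0, low=0.0, very_low=-10.0):
    """Return the actions a device might take at a given temperature.

    All thresholds (degrees Celsius) are illustrative assumptions,
    not values taken from the description.
    """
    actions = []
    if temp_c > high:
        # Too hot: reduce performance of the processor near the sensor.
        actions.append("throttle_nearby_processor")
    if temp_c < low:
        # Cold: heat the battery to avoid an abnormal shutdown.
        actions.append("heat_battery")
    if temp_c < very_low:
        # Very cold (assumed separate threshold): boost battery output voltage.
        actions.append("boost_battery_output_voltage")
    return actions
```

For instance, under these assumed thresholds a reading of 50 degrees would trigger only throttling, while a reading of -20 degrees would trigger both low-temperature actions.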
- the air pressure sensor 140L can be used to measure air pressure.
- the terminal device calculates the altitude based on the air pressure value measured by the air pressure sensor 140L to assist positioning and navigation.
- the magnetic sensor 140M may include a Hall sensor.
- the terminal device can use the magnetic sensor 140M to detect the opening and closing of the flip holster.
- when the terminal device is a flip phone, the terminal device can detect the opening and closing of the flip cover by means of the magnetic sensor 140M, and then set features such as automatic unlocking upon flipping open according to the detected opening and closing state of the leather case or of the flip cover.
- the camera 150 is used to capture images or videos.
- the object generates an optical image through the lens and is projected to the photosensitive element.
- the photosensitive element can be a Charge Coupled Device (CCD) or a Complementary Metal-Oxide-Semiconductor (CMOS) phototransistor.
- the photosensitive element converts the light signal into an electric signal, and then transfers the electric signal to the ISP to convert it into a digital image signal.
- the ISP outputs the digital image signal to the DSP for processing, and the DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV.
- the terminal device may include one or more cameras 150, which is not limited.
- in one example, the terminal device includes two cameras 150, such as one front camera and one rear camera; in another example, the terminal device includes five cameras 150, such as three rear cameras and two front cameras.
- the terminal device can realize the shooting function through the ISP, the camera 150, the video codec, the GPU, the display screen 160, and the application processor.
- the display screen 160 is used to display images, videos, and the like.
- the display screen 160 includes a display panel, and the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, a quantum dot light-emitting diode (QLED), etc.
- the terminal device may implement a display function through a GPU, a display screen 160, an application processor, and the like.
- the terminal device can implement audio functions through the audio module 170, the speaker 171, the microphone 172, the receiver 173, the earphone interface 174, and the application processor. For example, audio playback, recording, etc.
- the audio module 170 is used to convert digital audio information into an analog audio signal for output, and is also used to convert an analog audio input into a digital audio signal.
- the audio module 170 can also be used to encode and decode audio signals.
- the audio module 170 may be provided in the processor 110, or part of the functional modules of the audio module 170 may be provided in the processor 110.
- the speaker 171, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
- the terminal device can play music, give voice prompts, etc. through the speaker 171.
- the microphone 172, also called a "mic" or "mouthpiece", is used to collect sounds (such as ambient sounds, including sounds made by people and sounds made by equipment) and convert the sound signals into audio electrical signals, that is, the sound pickup data in this embodiment.
- the terminal device can be provided with multiple microphones 172. By arranging multiple microphones 172 on the terminal device, the user can obtain high-quality stereo recording effects when using the terminal device to record video.
- the number of microphones 172 provided on the terminal device can be 3 to 6, wherein at least one microphone 172 is provided on the front of the screen of the terminal device or on the back of the terminal device, to ensure that a stereo beam pointing in the front and back directions of the terminal device can be formed.
- when the number of microphones is three, one microphone is provided on each of the top and the bottom of the terminal device (i.e., m1 and m2), and one microphone (i.e., m3) is provided on the front of the screen of the terminal device or on the back of the terminal device; as shown in Figure 3, when the number of microphones is six, two microphones are provided on each of the top and the bottom of the terminal device (i.e., m1, m2 and m3, m4), and one microphone is provided on each of the front of the screen and the back of the terminal device (i.e., m5 and m6). It can be understood that, in other embodiments, the number of microphones 172 may also be 4 or 5, with at least one microphone 172 provided on the front of the screen of the terminal device or on the back of the terminal device.
- the receiver 173, also called an "earpiece", is used to convert audio electrical signals into sound signals.
- when the terminal device answers a call or plays a voice message, the voice can be heard by bringing the receiver 173 close to the human ear.
- the earphone interface 174 is used to connect wired earphones.
- the earphone interface 174 may be a USB interface, a 3.5mm Open Mobile Terminal Platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
- the wireless communication function of the terminal device can be implemented by the antenna 1, the antenna 2, the mobile communication module 180, the wireless communication module 190, the modem processor, and the baseband processor.
- the antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in the terminal device can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
- antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
- the antenna can be used in combination with a tuning switch.
- the mobile communication module 180 can provide wireless communication solutions including 2G/3G/4G/5G, etc., which are applied to terminal devices.
- the mobile communication module 180 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like.
- the mobile communication module 180 can receive electromagnetic waves by the antenna 1, and perform processing such as filtering and amplifying the received electromagnetic waves, and transmit them to the modem processor for demodulation.
- the mobile communication module 180 can also amplify the signal modulated by the modem processor, and convert it into electromagnetic waves for radiation via the antenna 1.
- at least part of the functional modules of the mobile communication module 180 may be provided in the processor 110.
- at least part of the functional modules of the mobile communication module 180 and at least part of the modules of the processor 110 may be provided in the same device.
- the modem processor may include a modulator and a demodulator.
- the modulator is used to modulate the low-frequency baseband signal to be sent into a medium- or high-frequency signal.
- the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal.
- the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
- the application processor outputs a sound signal through an audio device (not limited to the speaker 171, the receiver 173, etc.), or displays an image or video through the display screen 160.
- the modem processor may be an independent device.
- the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 180 or other functional modules.
- the wireless communication module 190 can provide wireless communication solutions applied to the terminal device, including wireless local area network (WLAN) (such as a Wireless Fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
- the wireless communication module 190 may be one or more devices integrating at least one communication processing module.
- the wireless communication module 190 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110.
- the wireless communication module 190 may also receive the signal to be sent from the processor 110, perform frequency modulation and amplification processing on it, and convert it into electromagnetic waves to radiate through the antenna 2.
- the antenna 1 of the terminal device is coupled with the mobile communication module 180, and the antenna 2 is coupled with the wireless communication module 190, so that the terminal device can communicate with the network and other devices through wireless communication technology.
- the wireless communication technology may include Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
- GNSS can include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS), and/or a Satellite Based Augmentation System (SBAS).
- the USB interface 101 is an interface that complies with the USB standard specifications, and specifically may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on.
- the USB interface 101 can be used to connect a charger to charge the terminal device, and can also be used to transfer data between the terminal device and peripheral devices. It can also be used to connect earphones and play sound through earphones.
- the USB interface 101 can also be used to connect to other terminal devices, such as AR (Augmented Reality) devices, computers, and so on.
- the charging management module 102 is used to receive charging input from the charger.
- the charger can be a wireless charger or a wired charger.
- the charging management module 102 may receive the charging input of the wired charger through the USB interface 101.
- the charging management module 102 may receive a wireless charging input through a wireless charging coil of the terminal device. While the charging management module 102 charges the battery 104, it can also supply power to the terminal device through the power management module 103.
- the power management module 103 is used to connect the battery 104, the charging management module 102, and the processor 110.
- the power management module 103 receives input from the battery 104 and/or the charging management module 102, and supplies power to the processor 110, the internal memory 120, the camera 150, the display screen 160, and the like.
- the power management module 103 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
- the power management module 103 may be provided in the processor 110. In other embodiments, the power management module 103 and the charging management module 102 may also be provided in the same device.
- the button 105 includes a power-on button, a volume button, and so on.
- the button 105 may be a mechanical button or a touch button.
- the terminal device can receive key input, and generate key signal input related to the user settings and function control of the terminal device.
- the motor 106 can generate vibration prompts.
- the motor 106 can be used for incoming call vibrating prompts, and can also be used for touch vibration feedback.
- touch operations for different applications can correspond to different vibration feedback effects.
- different application scenarios (for example, time reminders, receiving information, alarm clocks, and games) can also correspond to different vibration feedback effects.
- the touch vibration feedback effect can also support customization.
- the indicator 107 can be an indicator light, which can be used to indicate the charging status, power change, and can also be used to indicate messages, missed calls, notifications, etc.
- the SIM card interface 108 is used to connect to a SIM card.
- the SIM card can be inserted into the SIM card interface 108 or pulled out from the SIM card interface 108 to achieve contact and separation with the terminal device.
- the terminal device can support one or more SIM card interfaces.
- the SIM card interface 108 may support Nano SIM cards, Micro SIM cards, SIM cards, etc.
- multiple cards can be inserted into the same SIM card interface 108 at the same time. The types of the multiple cards can be the same or different.
- the SIM card interface 108 can also be compatible with different types of SIM cards.
- the SIM card interface 108 may also be compatible with external memory cards.
- the terminal equipment interacts with the network through the SIM card to realize functions such as call and data communication.
- the terminal device adopts an eSIM, that is, an embedded SIM card.
- the eSIM card can be embedded in the terminal device and cannot be separated from the terminal device.
- the stereo sound pickup method provided by the embodiment of the present invention uses the posture data and the camera data of the terminal device to determine the target beam parameter group, and combines it with the target sound pickup data picked up by the microphones to form a stereo beam. Since different posture data and camera data determine different target beam parameter groups, the direction of the stereo beam can be adjusted by using different target beam parameter groups, which effectively reduces the impact of noise in the recording environment and allows the terminal device to obtain a better stereo recording effect in different video recording scenes. In addition, by detecting microphone blocking, eliminating various kinds of abnormal sound data, correcting the timbre of the stereo beam, and adjusting the gain of the stereo beam, the robustness of the recording is further enhanced while a good stereo recording effect is ensured.
- FIG. 4 is a schematic flowchart of a stereo sound pickup method provided by an embodiment of the present invention.
- the stereo sound pickup method can be implemented on a terminal device having the above-mentioned hardware structure. Referring to FIG. 4, the stereo sound pickup method may include the following steps:
- S201 Acquire multiple target sound pickup data from the sound pickup data of multiple microphones.
- the terminal device may collect sound through multiple microphones provided thereon, and then obtain multiple target sound pickup data from the sound pickup data of the multiple microphones.
- the multiple target sound pickup data can be obtained directly from the sound pickup data of the multiple microphones, or by selecting the sound pickup data of some of the multiple microphones according to a certain rule, or by processing the sound pickup data of the multiple microphones in a certain way; this is not limited here.
- the posture data of the terminal device can be obtained through the aforementioned acceleration sensor 140A, and the posture data can indicate whether the terminal device is in a horizontal screen state or a vertical screen state; the camera data can be understood as the usage of the cameras provided on the terminal device while the user is recording video with the terminal device.
- S203 Determine a target beam parameter group corresponding to the multiple target sound pickup data from the multiple pre-stored beam parameter groups according to the attitude data and the camera data; wherein the target beam parameter group includes the beam parameters respectively corresponding to the multiple target sound pickup data.
- the beam parameter group can be obtained through pre-training and stored in the terminal device, and it includes several parameters that affect stereo beam forming.
- the posture data and camera data corresponding to the terminal device can be determined in advance for the video recording scene that the terminal device may be in, and a matching beam parameter group can be set based on the posture data and camera data.
- multiple beam parameter groups can be obtained, respectively corresponding to different video recording scenes, and the multiple beam parameter groups are stored in the terminal device for subsequent use in video recording. For example, when a user uses a terminal device to take a picture or record a video, the terminal device can determine a matching target beam parameter group from multiple beam parameter groups based on the currently acquired posture data and camera data.
- as the video recording scene changes, the posture data and camera data corresponding to the terminal device change accordingly, so different target beam parameter groups can be determined from the multiple beam parameter groups based on the posture data and camera data; that is, the beam parameters corresponding to the multiple target sound pickup data change with the video recording scene.
- the beam parameters in the target beam parameter group can be understood as weight values.
- each target sound pickup data and its corresponding weight value can be used in a weighted summation to finally obtain a stereo beam.
- since the stereo beam has spatial directivity, different levels of suppression can be achieved on the sound pickup data outside the spatial direction in which the stereo beam points, thereby effectively reducing the impact of noise in the recording environment.
- since the beam parameters corresponding to the multiple target sound pickup data change with the video recording scene, the direction of the stereo beam formed from the target beam parameter group and the multiple target sound pickup data also changes with the video recording scene, so that the terminal device can obtain a better stereo recording effect in different video recording scenes.
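The weighted summation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the target sound pickup data are available as complex frequency-domain values per microphone and per frequency bin, and that the target beam parameter group holds one weight per microphone and bin for each of a left and a right beam. The function names and data layout are hypothetical.

```python
def form_beam(pickup_frames, weights):
    """Weighted sum across microphones, evaluated per frequency bin.

    pickup_frames: list over microphones, each a list of complex bin values.
    weights:       same shape as pickup_frames (assumed beam parameters).
    """
    if len(pickup_frames) != len(weights):
        raise ValueError("one weight vector is needed per microphone")
    n_bins = len(pickup_frames[0])
    return [
        sum(w[k] * x[k] for x, w in zip(pickup_frames, weights))
        for k in range(n_bins)
    ]


def form_stereo_beam(pickup_frames, left_weights, right_weights):
    """Return the (left, right) beams that together make up the stereo beam."""
    return (
        form_beam(pickup_frames, left_weights),
        form_beam(pickup_frames, right_weights),
    )
```

Switching to a different target beam parameter group only changes the weights passed in; the summation itself stays the same, which is why the beam direction can follow the recording scene.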
- the camera data of the terminal device may include activation data, and the activation data is used to characterize the activated camera. As shown in FIG.
- the above step S203 may include sub-step S203-1: determining a first target beam parameter group corresponding to multiple target sound pickup data from a plurality of pre-stored beam parameter groups according to the attitude data and the activation data;
- the above step S204 may include a sub-step S204-1: forming a first stereo beam according to the first target beam parameter group and multiple target sound pickup data, where the first stereo beam points to the shooting direction of the activated camera.
- when the terminal device is in different video recording scenes, different beam parameter groups are needed, so multiple beam parameter groups can be pre-stored in the terminal device.
- the plurality of beam parameter groups may include a first beam parameter group, a second beam parameter group, a third beam parameter group, and a fourth beam parameter group; the beam parameters in the first, second, third, and fourth beam parameter groups are different from one another.
- when the posture data indicates that the terminal device is in the landscape state and the activation data indicates that the rear camera is enabled, the first target beam parameter group is the first beam parameter group; when the posture data indicates that the terminal device is in the landscape state and the activation data indicates that the front camera is enabled, the first target beam parameter group is the second beam parameter group; when the posture data indicates that the terminal device is in the portrait state and the activation data indicates that the rear camera is enabled, the first target beam parameter group is the third beam parameter group; when the posture data indicates that the terminal device is in the portrait state and the activation data indicates that the front camera is enabled, the first target beam parameter group is the fourth beam parameter group.
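The selection of the first target beam parameter group amounts to a table lookup keyed by posture and enabled camera. The sketch below is illustrative only: the enum and function names are hypothetical, the group labels stand in for the actual pre-stored parameters, and the landscape-plus-rear-camera entry is inferred from the order of the other three cases and of Figures 6 to 9.

```python
from enum import Enum


class Posture(Enum):
    LANDSCAPE = "landscape"
    PORTRAIT = "portrait"


class Camera(Enum):
    REAR = "rear"
    FRONT = "front"


# Placeholder labels; a real device would store trained beam parameters here.
BEAM_PARAMETER_GROUPS = {
    (Posture.LANDSCAPE, Camera.REAR): "first beam parameter group",
    (Posture.LANDSCAPE, Camera.FRONT): "second beam parameter group",
    (Posture.PORTRAIT, Camera.REAR): "third beam parameter group",
    (Posture.PORTRAIT, Camera.FRONT): "fourth beam parameter group",
}


def select_first_target_group(posture, camera):
    """Pick the pre-stored group matching the posture and activation data."""
    return BEAM_PARAMETER_GROUPS[(posture, camera)]
```

Keeping the mapping in a single dictionary makes it easy to extend the key with further camera data, such as a zoom tier.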
- the direction of the first stereo beam changes according to the switching of the horizontal and vertical screen states of the terminal device and the activation of the front and rear cameras.
- the terminal device in Figure 6 is in the landscape state with the rear camera enabled for shooting; the terminal device in Figure 7 is in the landscape state with the front camera enabled for shooting; the terminal device in Figure 8 is in the portrait state with the rear camera enabled for shooting; and the terminal device in Figure 9 is in the portrait state with the front camera enabled for shooting.
- the left and right arrows indicate the directions of the left and right beams respectively.
- the first stereo beam can be understood as a composite beam of the left and right beams; the horizontal plane refers to the plane perpendicular to the vertical side of the terminal device in its current shooting posture (landscape state or portrait state), and the main axis of the formed first stereo beam lies in this horizontal plane.
- the direction of the first stereo beam will also change accordingly.
- the main axis of the first stereo beam shown in FIG. 6 is located on a horizontal plane perpendicular to the vertical side of the terminal device in the horizontal screen state.
- when the terminal device is in the portrait state, the main axis of the first stereo beam lies in the horizontal plane perpendicular to the vertical side of the terminal device in the portrait state, as shown in Figure 8.
- the shooting direction of the activated camera is generally the direction that the user needs to focus on collecting sound
- the direction of the first stereo beam also changes with the shooting direction of the activated camera. For example, in FIGS. 6 and 8 the first stereo beam points in the shooting direction of the rear camera, and in FIGS. 7 and 9 the first stereo beam points in the shooting direction of the front camera.
- in different video recording scenes, the multiple target sound pickup data correspond to different first target beam parameter groups and therefore form first stereo beams in different directions, so that the direction of the first stereo beam is adaptively adjusted according to the switching between the landscape and portrait states of the terminal device and the activation of the front and rear cameras, ensuring that the terminal device obtains a better stereo recording effect when recording video.
- the camera data may include the aforementioned activation data and zoom data, where the zoom data is the zoom factor of the activated camera represented by the activation data.
- the above step S203 may include sub-step S203-2: determining a second target beam parameter group corresponding to the multiple target sound pickup data from the multiple pre-stored beam parameter groups according to the attitude data, the activation data, and the zoom data.
- step S204 may include sub-step S204-2: forming a second stereo beam according to the second target beam parameter group and multiple target pickup data; wherein the second stereo beam points to the shooting direction of the activated camera, and The width of the second stereo beam narrows as the zoom factor increases.
- the width of the second stereo beam narrows as the zoom factor of the enabled camera increases, which makes the sound image more concentrated. When the user zooms in, the scene is usually a long-distance sound pickup scene in which the signal-to-noise ratio is lower; narrowing the second stereo beam improves the signal-to-noise ratio, so that the terminal device has better recording robustness at a low signal-to-noise ratio and thereby obtains a better stereo recording effect.
- in order for the width of the second stereo beam to narrow as the zoom factor of the activated camera increases, a target shape of the second stereo beam can be preset for each combination of attitude data, activation data, and zoom data, and a matching beam parameter group is then obtained by training with the least squares method, so that the second stereo beam formed from the beam parameter group approximates the preset target shape. In this way, the beam parameter groups corresponding to different attitude data, activation data, and zoom data are obtained.
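A least-squares fit of beam weights to a preset target shape can be sketched as follows. This is a simplified illustration under stated assumptions, not the training procedure of the embodiment: it assumes a far-field plane-wave model, microphones on a straight line, a single frequency, and a target shape given as a complex response per direction. All function names and the geometry are hypothetical.

```python
import cmath
import math


def steering_vector(mic_x, angle_deg, freq_hz, c=343.0):
    """Assumed far-field response of each microphone (positions in metres)."""
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(-2j * math.pi * freq_hz * x * s / c) for x in mic_x]


def solve_linear(a, b):
    """Solve a small square complex system by Gaussian elimination
    with partial pivoting (no handling of singular matrices)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0j] * n
    for i in reversed(range(n)):
        acc = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = acc / m[i][i]
    return x


def fit_beam_weights(mic_x, angles_deg, target, freq_hz):
    """Weights w minimising sum over angles of |a(angle) . w - target|^2."""
    rows = [steering_vector(mic_x, a, freq_hz) for a in angles_deg]
    n = len(mic_x)
    # Normal equations: (A^H A) w = A^H d
    gram = [[sum(r[i].conjugate() * r[j] for r in rows) for j in range(n)]
            for i in range(n)]
    rhs = [sum(r[i].conjugate() * d for r, d in zip(rows, target))
           for i in range(n)]
    return solve_linear(gram, rhs)
```

Under this model, a narrower target shape for a higher zoom factor simply yields a different fitted weight set, which would then be stored as the beam parameter group for that zoom setting.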
- as the zoom factor changes, the terminal device can match the second target beam parameter group corresponding to each zoom factor, and then form second stereo beams of different widths from the second target beam parameter group and the multiple target sound pickup data, to meet the user's video recording needs.
- FIGS. 11a-11c are schematic diagrams of the width of the second stereo beam changing with the zoom factor of the activated camera.
- the second stereo beam is a composite beam of the left and right beams
- the 0-degree direction is the shooting direction of the camera that is activated when the user records the video (also referred to as the target direction).
- when the activated camera uses a low zoom factor, the terminal device can match the second target beam parameter group corresponding to the low zoom factor to form a wider second stereo beam, as shown in FIG. 11a, in which the left and right beams point about 45 degrees to the left and right of the shooting direction, respectively.
- when the activated camera uses a medium zoom factor, the terminal device can match the second target beam parameter group corresponding to the medium zoom factor to form a narrowed second stereo beam, as shown in FIG. 11b, in which the directions of the left and right beams narrow to around 30 degrees to the left and right of the shooting direction.
- when the activated camera uses a high zoom factor, the terminal device can match the second target beam parameter group corresponding to the high zoom factor to form a still narrower second stereo beam, as shown in FIG. 11c, in which the directions of the left and right beams narrow further to around 10 degrees to the left and right of the shooting direction.
- in this way, different second target beam parameter groups form second stereo beams with different directions and widths, so that the direction and width of the second stereo beam are adaptively adjusted as the posture of the terminal device, the enabled camera, and the zoom factor change, which achieves better recording robustness in noisy environments and under long-distance sound pickup conditions.
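The zoom-dependent narrowing shown in FIGS. 11a-11c can be illustrated with a small lookup. The beam angles of 45, 30, and 10 degrees come from the description; the zoom breakpoints that separate low, medium, and high zoom are hypothetical, since the text does not specify them, and the function name is illustrative.

```python
from bisect import bisect_right

# Hypothetical zoom factors at which the tier changes (not from the text).
ZOOM_BREAKPOINTS = [3.0, 5.0]

# (tier name, approximate left/right beam angle in degrees from the
# shooting direction), following FIGS. 11a, 11b, and 11c.
ZOOM_TIERS = [("low", 45.0), ("medium", 30.0), ("high", 10.0)]


def select_zoom_tier(zoom_factor):
    """Map a zoom factor to its tier and approximate beam angle."""
    if zoom_factor <= 0:
        raise ValueError("zoom factor must be positive")
    return ZOOM_TIERS[bisect_right(ZOOM_BREAKPOINTS, zoom_factor)]
```

The returned tier could be combined with the posture and activation data to index the pre-stored beam parameter groups.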
- the stereo recording effect is not only interfered with by environmental noise, but is also easily affected by microphone blocking, which occurs when the user's fingers or other body parts cover the microphone while holding the terminal device, or when dirt enters the sound guide hole; and as the functions of the terminal device become more and more powerful, the self-noise of the terminal device (that is, the noise generated by the internal circuit of the terminal device) is also more and more likely to be picked up by the microphone, such as camera motor noise, WiFi interference sound, and noise caused by capacitor charging and discharging; in addition, the user's fingers or other body parts may touch the screen or rub near the microphone hole due to zooming or other operations during recording, so that abnormal sounds the user does not expect are recorded.
- These self-noise or abnormal sound interferences affect the stereo recording effect of the video to a certain extent.
- this embodiment proposes that after the sound pickup data of multiple microphones is acquired, microphone blocking detection is performed on the multiple microphones and abnormal sound processing is performed on their sound pickup data to determine the multiple target sound pickup data used to form the stereo beam, so that better recording robustness can still be achieved in the case of abnormal sound interference and/or microphone blocking, ensuring a good stereo recording effect.
- the process of acquiring multiple target sound pickup data will be described in detail.
- S201 includes the following sub-steps:
- S2011-A Obtain the serial number of the microphone without microphone jam based on the sound pickup data of multiple microphones.
- after the terminal device obtains the sound pickup data of multiple microphones, it performs time domain framing processing and frequency domain transformation processing on the sound pickup data of each microphone to obtain the corresponding time domain information and frequency domain information.
- it then compares the time domain information and the frequency domain information corresponding to the pickup data of different microphones to obtain a time domain comparison result and a frequency domain comparison result, determines the serial number of the blocked microphone according to these two results, and determines the serial numbers of the microphones that are not blocked based on the serial number of the blocked microphone.
- the same time domain information does not mean that the two signals are exactly the same. The signal needs to be further analyzed from the perspective of the frequency domain.
- analyzing the sound pickup data of the microphone from two different angles, the time domain and the frequency domain, can effectively improve the accuracy of microphone blocking detection and avoid the misjudgment caused by analysis from a single angle.
- the time domain information may be the RMS (Root-Mean-Square) value of the time domain signal corresponding to the pickup data
- the frequency domain information may be the RMS value of the high-frequency part of the frequency domain signal corresponding to the pickup data, that is, the part above a set frequency (for example, 2 kHz); the RMS value of the high-frequency part is a more obvious indicator when the microphone is blocked.
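- The two quantities can be sketched for a single frame as follows. This is a minimal illustration that assumes one frame of samples is already available; the function names are hypothetical, and a naive DFT is used only so the example has no third-party dependencies (a practical implementation would use an FFT).

```python
import cmath
import math


def time_domain_rms(frame):
    """RMS of the time domain samples in one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def high_freq_rms(frame, sample_rate, cutoff_hz=2000.0):
    """RMS of the DFT bin magnitudes above cutoff_hz for one frame.

    Naive O(n^2) DFT over the non-negative frequency bins; magnitudes are
    normalised by the frame length.
    """
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if freq <= cutoff_hz:
            continue  # keep only the high-frequency part
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitudes.append(abs(coeff) / n)
    if not magnitudes:
        return 0.0
    return math.sqrt(sum(m * m for m in magnitudes) / len(magnitudes))


# Example frames: 64 samples at 8 kHz, one tone above and one below 2 kHz.
SAMPLE_RATE = 8000
HIGH_TONE = [math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE) for t in range(64)]
LOW_TONE = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE) for t in range(64)]
```

- With these example frames, the 3 kHz tone yields a clearly non-zero high-frequency RMS while the 500 Hz tone yields a value near zero, even though both have the same time domain RMS.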
- the RMS value of the time domain signal and the RMS value of the high-frequency part of the pickup data both differ between a microphone that is blocked and one that is not. Even between microphones that are not blocked, factors such as the structure of the microphone and the shielding of the terminal device housing cause slight differences in both RMS values. Therefore, in the research and development stage of the terminal device, it is necessary to find the difference between blocked and unblocked microphones and to set the corresponding time domain threshold and frequency domain threshold according to that difference; these thresholds are used in the time domain comparison and the frequency domain comparison, respectively.
- the time domain threshold and the frequency domain threshold may be empirical values obtained by those skilled in the art through experiments.
- the serial numbers of the 3 microphones are m1, m2, and m3 respectively.
- the RMS values of the time domain signals corresponding to the pickup data of the 3 microphones are A1, A2, and A3, respectively.
- the RMS values of the high-frequency parts corresponding to the pickup data of the 3 microphones are B1, B2, and B3, respectively; when comparing the time domain information corresponding to the pickup data of the three microphones, the differences between A1 and A2, A1 and A3, and A2 and A3 can be calculated separately, and each difference is compared with the set time domain threshold.
- when the difference does not exceed the time domain threshold, the time domain information corresponding to the pickup data of the two microphones is considered consistent; when the difference is higher than the time domain threshold, the time domain information corresponding to the pickup data of the two microphones is considered inconsistent, and the magnitude relationship of the time domain information corresponding to the pickup data of the two microphones is determined;
- similarly, the differences between B1 and B2, B1 and B3, and B2 and B3 can be calculated separately, and each difference is compared with the set frequency domain threshold.
- when the difference does not exceed the frequency domain threshold, the frequency domain information corresponding to the pickup data of the two microphones is considered consistent; when the difference is higher than the frequency domain threshold, the frequency domain information corresponding to the pickup data of the two microphones is considered inconsistent, and the magnitude relationship of the frequency domain information corresponding to the pickup data of the two microphones is determined.
- when combining the time domain comparison result and the frequency domain comparison result to determine whether a microphone is blocked, if the goal is to detect as many blocked microphones as possible, a microphone can be judged blocked as soon as either the time domain information or the frequency domain information of the two microphones is inconsistent.
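- The pairwise comparison can be sketched as below. The "either domain is inconsistent" rule follows the text; treating the microphone with the lower RMS value in an inconsistent pair as the blocked one is an assumption of this sketch, as are the function name and the threshold values used in the example.

```python
from itertools import combinations


def find_blocked_mics(time_rms, high_rms, time_threshold, freq_threshold):
    """Return (blocked, unblocked) lists of microphone serial numbers.

    time_rms and high_rms map a serial number (e.g. "m1") to its RMS value.
    Assumption: in an inconsistent pair, the microphone with the lower
    value is the blocked one.
    """
    serials = sorted(time_rms)
    blocked = set()
    for a, b in combinations(serials, 2):
        # Time domain comparison against the time domain threshold.
        if abs(time_rms[a] - time_rms[b]) > time_threshold:
            blocked.add(a if time_rms[a] < time_rms[b] else b)
        # Frequency domain comparison against the frequency domain threshold.
        if abs(high_rms[a] - high_rms[b]) > freq_threshold:
            blocked.add(a if high_rms[a] < high_rms[b] else b)
    unblocked = [m for m in serials if m not in blocked]
    return sorted(blocked), unblocked


# Example values (made up for illustration): m1 is much quieter than m2 and m3.
EXAMPLE_TIME_RMS = {"m1": 0.10, "m2": 0.50, "m3": 0.52}
EXAMPLE_HIGH_RMS = {"m1": 0.01, "m2": 0.20, "m3": 0.21}
EXAMPLE_RESULT = find_blocked_mics(EXAMPLE_TIME_RMS, EXAMPLE_HIGH_RMS, 0.2, 0.1)
```

- For the example values, `EXAMPLE_RESULT` reports m1 as blocked and m2 and m3 as not blocked.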
- frequency domain transformation processing can be performed on the pickup data of each microphone to obtain the corresponding frequency domain information; a pre-trained abnormal sound detection network then uses the frequency domain information corresponding to the pickup data of each microphone to detect whether there is abnormal sound data in the sound pickup data of each microphone.
- the pre-trained abnormal sound detection network can be obtained by collecting a large amount of abnormal sound data (for example, sound data with specific frequencies) in the research and development stage of the terminal device and performing feature learning with an AI (Artificial Intelligence) algorithm.
- the frequency domain information corresponding to the sound pickup data of each microphone is input into the pre-trained abnormal sound detection network, and the detection result of whether there is abnormal sound data can be obtained.
- the abnormal sound data may include the self-noise of the terminal device, the user’s finger touching the screen or rubbing the microphone hole and other abnormal sounds.
- the abnormal sound data can be eliminated by using AI algorithms combined with time domain filtering and frequency domain filtering.
- the gain at the frequencies of the abnormal sound data can be reduced, that is, those frequency components are multiplied by a value between 0 and 1, to eliminate the abnormal sound data or reduce its intensity.
- a pre-trained sound detection network can be used to detect whether there is preset sound data in the abnormal sound data, where the pre-trained sound detection network can be obtained by feature learning using an AI algorithm, and the preset sound data can be understood as the non-noise data that the user expects to record, such as speech and music.
- when the pre-trained sound detection network detects non-noise data that the user expects to record, the abnormal sound data is not eliminated; only its intensity is reduced (for example, multiplied by a value of 0.5); when the pre-trained sound detection network detects no such data, the abnormal sound data is directly eliminated (for example, multiplied by a value of 0).
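- The two cases (reduce versus eliminate) amount to choosing a gain for the affected frequency components. The sketch below assumes the abnormal frequency bins and the presence of wanted sound have already been decided by detectors that are not shown; the function name and parameters are hypothetical, and the gains 0.5 and 0 are the example values from the text.

```python
def suppress_abnormal_bins(spectrum, abnormal_bins, contains_preset_sound,
                           reduce_gain=0.5):
    """Scale the abnormal frequency bins of one frame's spectrum.

    spectrum: list of per-bin values (real or complex).
    abnormal_bins: indices of bins flagged as abnormal sound.
    contains_preset_sound: True if sound the user wants (speech, music)
        was detected within the abnormal sound data.
    """
    # Reduce intensity when wanted sound is present, otherwise eliminate.
    gain = reduce_gain if contains_preset_sound else 0.0
    flagged = set(abnormal_bins)
    return [value * gain if i in flagged else value
            for i, value in enumerate(spectrum)]
```

- Bins that are not flagged pass through unchanged, so only the abnormal components are affected.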
- S2014-A Select the sound pickup data corresponding to the serial number of the microphone that has not blocked the microphone from the initial target sound pickup data as multiple target sound pickup data.
- for example, if the serial number of the blocked microphone is m1 and the serial numbers of the microphones that are not blocked are m2 and m3, the sound pickup data corresponding to serial numbers m2 and m3 can be selected from the initial target pickup data as the target sound pickup data, yielding multiple target sound pickup data for the subsequent formation of stereo beams.
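- The selection in S2014-A is essentially a filter by serial number. A minimal sketch, assuming the initial target pickup data is held in a dictionary keyed by microphone serial number (the data layout and function name are assumptions, not taken from the source):

```python
def select_target_pickup_data(initial_target_data, unblocked_serials):
    """Keep only the channels whose microphone is not blocked.

    initial_target_data: dict mapping serial number -> list of samples,
        already cleaned of abnormal sound data.
    unblocked_serials: serial numbers reported as not blocked.
    """
    return {serial: initial_target_data[serial]
            for serial in unblocked_serials
            if serial in initial_target_data}
```

- With m1 blocked, calling it with the unblocked serials m2 and m3 returns only those two channels.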
- S2011-A can be executed before S2012-A, after S2012-A, or at the same time as S2012-A; that is, this embodiment does not restrict the order of blocked microphone detection and abnormal sound data processing.
- through the combination of microphone blocking detection and abnormal sound processing of the microphone pickup data, it is possible to determine the multiple target pickup data used to form a stereo beam.
- when the user uses the terminal device to record video, a good stereo recording effect can still be ensured even if a microphone hole is blocked and there is abnormal sound data in the microphone's pickup data, thereby achieving better recording robustness.
- when multiple target pickup data for forming stereo beams are determined by detecting microphone blocking, S201 includes the following sub-steps:
- S2011-B Obtain the serial number of the microphone without microphone jam based on the sound pickup data of multiple microphones.
- S2011-B can refer to the aforementioned S2011-A, which will not be repeated here.
- S2012-B Select the sound pickup data corresponding to the serial number of the microphone without microphone jam from the sound pickup data of the multiple microphones as the multiple target sound pickup data.
- for example, if the serial number of the blocked microphone is m1 and the serial numbers of the microphones that are not blocked are m2 and m3, the sound pickup data of the microphones with serial numbers m2 and m3 are selected as the target sound pickup data, and multiple target sound pickup data are obtained.
- after the terminal device obtains the sound pickup data of multiple microphones, it performs blocking detection on the multiple microphones according to their sound pickup data, obtains the serial numbers of the microphones that are not blocked, and selects the pickup data corresponding to those serial numbers for subsequent formation of a stereo beam.
- when multiple target sound pickup data for forming a stereo beam are determined by performing abnormal sound processing on the sound pickup data of the microphones, S201 includes the following sub-steps:
- S2011-C can refer to the aforementioned S2012-A, which will not be repeated here.
- after the terminal device obtains the sound pickup data of multiple microphones, it can obtain relatively "clean" sound pickup data (i.e., the multiple target pickup data) for subsequent formation of stereo beams by performing abnormal sound detection and abnormal sound elimination processing on the sound pickup data of the multiple microphones.
- the influence of abnormal sound data, such as finger rubbing on the microphone and various self-noises of the terminal device, on the stereo recording effect is effectively reduced when the terminal device records video.
- the stereo sound pickup method further includes the following steps:
- the frequency response can be corrected to be flat, so as to obtain a better stereo recording effect.
- gain control may also be performed on the generated stereo beam.
- the stereo sound pickup method further includes the following steps:
- the low-volume pickup data can be heard clearly, and the high-volume pickup data will not produce clipping distortion, so that the sound recorded by the user is adjusted to an appropriate volume and the user's video recording experience is improved.
- the present embodiment proposes to adjust the gain of the stereo beam according to the zoom factor of the camera.
- as the zoom factor increases, the amount of gain amplification also increases, thereby ensuring that the target sound source in a long-distance sound pickup scene is still clear and loud.
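- One way to picture zoom-dependent gain is a monotonic mapping from zoom factor to gain. The source does not specify the mapping, so the logarithmic curve, the 3 dB per doubling step, the 12 dB cap, and the function names below are all assumptions chosen only for illustration.

```python
import math


def zoom_gain_db(zoom_factor, db_per_doubling=3.0, max_gain_db=12.0):
    """Hypothetical mapping: gain grows with log2(zoom), up to a cap."""
    zoom = max(1.0, zoom_factor)  # no extra gain at or below 1x
    return min(max_gain_db, db_per_doubling * math.log2(zoom))


def apply_gain(samples, gain_db):
    """Scale samples by a gain given in decibels."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]
```

- The cap is there so that very high zoom factors do not push loud passages toward clipping; a production design would more likely pair the gain with a limiter or dynamic range control.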
- after the terminal device forms a stereo beam according to the target beam parameter group and multiple target pickup data, it can first perform tone correction on the stereo beam and then adjust the gain of the stereo beam, in order to get a better stereo recording effect.
- FIG. 17 is a functional block diagram of a stereo sound pickup device provided by an embodiment of the present invention. It should be noted that the basic principles and technical effects of the stereo sound pickup device provided by this embodiment are the same as those of the above embodiment. For brevity, for the parts not mentioned in this embodiment, please refer to the corresponding content in the above embodiment.
- the stereo sound pickup device includes: a pickup data acquisition module 510, a device parameter acquisition module 520, a beam parameter determination module 530, and a beam forming module 540.
- the pickup data acquisition module 510 is used to acquire multiple target pickup data from the pickup data of multiple microphones.
- the pickup data acquisition module 510 can execute the above S201.
- the device parameter acquisition module 520 is used to acquire the posture data and camera data of the terminal device.
- the device parameter acquisition module 520 can execute the foregoing S202.
- the beam parameter determination module 530 is configured to determine a target beam parameter group corresponding to multiple target sound pickup data from a plurality of pre-stored beam parameter groups according to the posture data and camera data; wherein the target beam parameter group includes the beam parameters corresponding to each of the multiple target sound pickup data.
- the beam parameter determining module 530 can perform the foregoing S203.
- the beam forming module 540 is used to form a stereo beam according to the target beam parameter group and multiple target sound pickup data.
- the beam forming module 540 can perform the foregoing S204.
- the camera data may include activation data, which characterizes the activated camera
- the beam parameter determination module 530 is configured to determine, according to the posture data and the activation data, the first target beam parameter group corresponding to the multiple target sound pickup data from a plurality of pre-stored beam parameter groups.
- the beam forming module 540 may form a first stereo beam according to the first target beam parameter group and multiple target sound pickup data; wherein, the first stereo beam is directed to the shooting direction of the activated camera.
- the multiple beam parameter groups include a first beam parameter group, a second beam parameter group, a third beam parameter group, and a fourth beam parameter group, and the beam parameters in the first, second, third, and fourth beam parameter groups are different.
- when the posture data characterizes that the terminal device is in the landscape state and the activation data characterizes that the rear camera is activated, the first target beam parameter group is the first beam parameter group; when the posture data characterizes that the terminal device is in the landscape state and the activation data characterizes that the front camera is activated, the first target beam parameter group is the second beam parameter group; when the posture data characterizes that the terminal device is in the vertical screen state and the activation data characterizes that the rear camera is activated, the first target beam parameter group is the third beam parameter group; when the posture data characterizes that the terminal device is in the vertical screen state and the activation data characterizes that the front camera is activated, the first target beam parameter group is the fourth beam parameter group.
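- The four cases (as also listed in claim 3) form a simple lookup from device state to beam parameter group. A minimal sketch, in which the string keys, the group labels, and the function name are illustrative placeholders rather than anything defined in the source ("portrait" stands for the vertical screen state):

```python
# (orientation, active camera) -> which pre-stored beam parameter group to use.
BEAM_GROUP_BY_STATE = {
    ("landscape", "rear"): "first",
    ("landscape", "front"): "second",
    ("portrait", "rear"): "third",
    ("portrait", "front"): "fourth",
}


def first_target_beam_group(orientation, active_camera):
    """Return the label of the first target beam parameter group."""
    try:
        return BEAM_GROUP_BY_STATE[(orientation, active_camera)]
    except KeyError:
        raise ValueError(
            f"unsupported state: {orientation!r}, {active_camera!r}"
        ) from None
```

- In practice each label would map to the actual stored beam parameters for every target pickup channel.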
- the beam parameter determining module 530 can perform the above S203-1, and the beam forming module 540 can perform the above S204-1.
- the camera data may include activation data and zoom data, where the zoom data is the zoom factor of the activated camera represented by the activation data, and the beam parameter determination module 530 is configured to determine, according to the posture data, the activation data, and the zoom data, a second target beam parameter group corresponding to the multiple target sound pickup data from a plurality of pre-stored beam parameter groups.
- the beam forming module 540 can form a second stereo beam according to the second target beam parameter group and multiple target pickup data; wherein the second stereo beam points to the shooting direction of the activated camera, and the width of the second stereo beam narrows as the zoom factor increases.
- the beam parameter determining module 530 can perform the above S203-2, and the beam forming module 540 can perform the above S204-2.
- the pickup data acquisition module 510 may include a blocked microphone detection module 511 and/or an abnormal sound processing module 512, and a target pickup data selection module 513; through these modules, multiple target sound pickup data can be obtained from the sound pickup data of multiple microphones.
- the blocked microphone detection module 511 is configured to obtain, according to the sound pickup data of the multiple microphones, the serial numbers of the microphones that are not blocked.
- the abnormal sound processing module 512 is used to detect whether there is abnormal sound data in the sound pickup data of each microphone and, if there is abnormal sound data, to eliminate the abnormal sound data in the sound pickup data of the multiple microphones to obtain initial target sound pickup data.
- the target sound pickup data selection module 513 is used to select the sound pickup data corresponding to the serial numbers of the microphones that are not blocked from the initial target sound pickup data as the multiple target sound pickup data.
- the blocked microphone detection module 511 is used to perform time domain framing processing and frequency domain transformation processing on the pickup data of each microphone to obtain the time domain information and frequency domain information corresponding to the pickup data of each microphone.
- the time domain information and frequency domain information corresponding to the pickup data of different microphones are respectively compared, and the time domain comparison result and the frequency domain comparison result are obtained.
- the serial number of the blocked microphone is determined according to the time domain comparison result and the frequency domain comparison result, and the serial numbers of the microphones that are not blocked are determined based on the serial number of the blocked microphone.
- the abnormal sound processing module 512 is used to perform frequency domain transformation processing on the pickup data of each microphone to obtain the corresponding frequency domain information, and to detect, according to the pre-trained abnormal sound detection network and the frequency domain information corresponding to the pickup data of each microphone, whether there is abnormal sound data in the sound pickup data of each microphone.
- the pre-trained sound detection network can be used to detect whether there is preset sound data in the abnormal sound data. If there is no preset sound data, the abnormal sound data is eliminated. If there is preset sound data, the intensity of the abnormal sound data is reduced.
- the blocked microphone detection module 511 is used to obtain the serial numbers of the microphones that are not blocked based on the sound pickup data of the multiple microphones.
- the target sound pickup data selection module 513 selects the sound pickup data corresponding to the serial number of the microphone that has not blocked the microphone from the sound pickup data of the multiple microphones as the multiple target sound pickup data.
- the abnormal sound processing module 512 is used to detect whether there is abnormal sound data in the sound pickup data of each microphone; if there is abnormal sound data, the abnormal sound data in the sound pickup data of the multiple microphones is eliminated to obtain multiple target sound pickup data.
- the blocked microphone detection module 511 can execute the aforementioned S2011-A and S2011-B; the abnormal sound processing module 512 can execute the aforementioned S2012-A, S2013-A, and S2011-C; the target pickup data selection module 513 can perform the above S2014-A, S2012-B, and S2012-C.
- the stereo sound pickup device may further include a tone color correction module 550 and a gain control module 560.
- the tone color correction module 550 is used to correct the timbre of the stereo beam.
- the tone color correction module 550 can execute the above-mentioned S301.
- the gain control module 560 is used to adjust the gain of the stereo beam.
- the gain control module 560 can adjust the gain of the stereo beam according to the zoom factor of the camera.
- the gain control module 560 can execute the foregoing S401.
- the embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is read and run by a processor, the stereo sound pickup method disclosed in each of the foregoing embodiments is implemented.
- the embodiment of the present invention also provides a computer program product, which when the computer program product runs on a computer, causes the computer to execute the stereo sound pickup method disclosed in each of the foregoing embodiments.
- the embodiment of the present invention also provides a chip system, which includes a processor and may also include a memory, which is used to implement the stereo sound pickup method disclosed in each of the foregoing embodiments.
- the chip system can be composed of chips, or it can include chips and other discrete devices.
- since the target beam parameter group is determined according to the posture data and camera data of the terminal device, when the terminal device is in different video recording scenes, different posture data and camera data are obtained, and different target beam parameter groups are then determined.
- in this way, when a stereo beam is formed based on the target beam parameter group and multiple target pickup data, the different target beam parameter groups can adjust the direction of the stereo beam, thereby effectively reducing the impact of noise in the recording environment, so that the terminal device can obtain a better stereo recording effect in different video recording scenes.
- by detecting microphone blocking and eliminating various abnormal sound data, a good stereo recording effect and recording robustness can still be ensured when recording video in the presence of microphone blocking and abnormal sound data.
- each block in the flowchart or block diagram may represent a module, program segment, or part of the code, and the module, program segment, or part of the code contains one or more executable instructions for realizing the specified logic function.
- the functions noted in the blocks may also occur in a different order from the order marked in the drawings.
- each block in the block diagram and/or flowchart, and a combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or it can be realized by a combination of dedicated hardware and computer instructions.
- the functional modules in the various embodiments of the present invention may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.
- if the function is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to make a computer device (which can be a mobile phone, a tablet computer, etc.) execute all or part of the steps of the methods described in the various embodiments of the present invention.
- the aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks, optical disks, and other media that can store program codes.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Studio Devices (AREA)
- Circuit For Audible Band Transducer (AREA)
- User Interface Of Digital Computer (AREA)
Claims (19)
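- A stereo sound pickup method, applied to a terminal device, the terminal device comprising multiple microphones, characterized in that the method comprises: acquiring multiple target sound pickup data from the sound pickup data of the multiple microphones; acquiring posture data and camera data of the terminal device; determining, according to the posture data and the camera data, a target beam parameter group corresponding to the multiple target sound pickup data from multiple pre-stored beam parameter groups, wherein the target beam parameter group includes the beam parameters corresponding to each of the multiple target sound pickup data; and forming a stereo beam according to the target beam parameter group and the multiple target sound pickup data.
- The method according to claim 1, characterized in that the camera data includes activation data, the activation data characterizing the activated camera; the step of determining, according to the posture data and the camera data, a target beam parameter group corresponding to the multiple target sound pickup data from multiple pre-stored beam parameter groups comprises: determining, according to the posture data and the activation data, a first target beam parameter group corresponding to the multiple target sound pickup data from the multiple pre-stored beam parameter groups; and the step of forming a stereo beam according to the target beam parameter group and the multiple target sound pickup data comprises: forming a first stereo beam according to the first target beam parameter group and the multiple target sound pickup data, wherein the first stereo beam points to the shooting direction of the activated camera.
- The method according to claim 2, characterized in that the multiple beam parameter groups include a first beam parameter group, a second beam parameter group, a third beam parameter group, and a fourth beam parameter group, and the beam parameters in the first beam parameter group, the second beam parameter group, the third beam parameter group, and the fourth beam parameter group are different; wherein, when the posture data characterizes that the terminal device is in the landscape state and the activation data characterizes that the rear camera is activated, the first target beam parameter group is the first beam parameter group; when the posture data characterizes that the terminal device is in the landscape state and the activation data characterizes that the front camera is activated, the first target beam parameter group is the second beam parameter group; when the posture data characterizes that the terminal device is in the vertical screen state and the activation data characterizes that the rear camera is activated, the first target beam parameter group is the third beam parameter group; and when the posture data characterizes that the terminal device is in the vertical screen state and the activation data characterizes that the front camera is activated, the first target beam parameter group is the fourth beam parameter group.
- The method according to claim 1, characterized in that the camera data includes activation data and zoom data, wherein the zoom data is the zoom factor of the activated camera characterized by the activation data; the step of determining, according to the posture data and the camera data, a target beam parameter group corresponding to the multiple target sound pickup data from multiple pre-stored beam parameter groups comprises: determining, according to the posture data, the activation data, and the zoom data, a second target beam parameter group corresponding to the multiple target sound pickup data from the multiple pre-stored beam parameter groups; and the step of forming a stereo beam according to the target beam parameter group and the multiple target sound pickup data comprises: forming a second stereo beam according to the second target beam parameter group and the multiple target sound pickup data, wherein the second stereo beam points to the shooting direction of the activated camera, and the width of the second stereo beam narrows as the zoom factor increases.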
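- The method according to any one of claims 1-4, characterized in that the step of acquiring multiple target sound pickup data from the sound pickup data of the multiple microphones comprises: obtaining, according to the sound pickup data of the multiple microphones, the serial numbers of the microphones that are not blocked; detecting whether there is abnormal sound data in the sound pickup data of each microphone; if there is abnormal sound data, eliminating the abnormal sound data in the sound pickup data of the multiple microphones to obtain initial target sound pickup data; and selecting, from the initial target sound pickup data, the sound pickup data corresponding to the serial numbers of the microphones that are not blocked as the multiple target sound pickup data.
- The method according to claim 5, characterized in that the step of obtaining, according to the sound pickup data of the multiple microphones, the serial numbers of the microphones that are not blocked comprises: performing time domain framing processing and frequency domain transformation processing on the sound pickup data of each microphone to obtain the time domain information and frequency domain information corresponding to the sound pickup data of each microphone; comparing the time domain information and the frequency domain information corresponding to the sound pickup data of different microphones, respectively, to obtain a time domain comparison result and a frequency domain comparison result; determining the serial number of the blocked microphone according to the time domain comparison result and the frequency domain comparison result; and determining the serial numbers of the microphones that are not blocked based on the serial number of the blocked microphone.
- The method according to claim 5, characterized in that the step of detecting whether there is abnormal sound data in the sound pickup data of each microphone comprises: performing frequency domain transformation processing on the sound pickup data of each microphone to obtain the frequency domain information corresponding to the sound pickup data of each microphone; and detecting, according to a pre-trained abnormal sound detection network and the frequency domain information corresponding to the sound pickup data of each microphone, whether there is abnormal sound data in the sound pickup data of each microphone.
- The method according to claim 5, characterized in that the step of eliminating the abnormal sound data in the sound pickup data of the multiple microphones comprises: detecting, using a pre-trained sound detection network, whether there is preset sound data in the abnormal sound data; if there is no preset sound data, eliminating the abnormal sound data; and if there is preset sound data, reducing the intensity of the abnormal sound data.
- The method according to any one of claims 1-4, characterized in that the step of acquiring multiple target sound pickup data from the sound pickup data of the multiple microphones comprises: obtaining, according to the sound pickup data of the multiple microphones, the serial numbers of the microphones that are not blocked; and selecting, from the sound pickup data of the multiple microphones, the sound pickup data corresponding to the serial numbers of the microphones that are not blocked as the multiple target sound pickup data.
- The method according to any one of claims 1-4, characterized in that the step of acquiring multiple target sound pickup data from the sound pickup data of the multiple microphones comprises: detecting whether there is abnormal sound data in the sound pickup data of each microphone; and if there is abnormal sound data, eliminating the abnormal sound data in the sound pickup data of the multiple microphones to obtain the multiple target sound pickup data.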
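- The method according to any one of claims 1-4, characterized in that, after the step of forming a stereo beam according to the target beam parameter group and the multiple target sound pickup data, the method further comprises: correcting the timbre of the stereo beam.
- The method according to any one of claims 1-4, characterized in that, after the step of forming a stereo beam according to the target beam parameter group and the multiple target sound pickup data, the method further comprises: adjusting the gain of the stereo beam.
- The method according to claim 12, characterized in that the camera data includes the zoom factor of the activated camera, and the step of adjusting the gain of the stereo beam comprises: adjusting the gain of the stereo beam according to the zoom factor of the camera.
- The method according to any one of claims 1-4, characterized in that the number of microphones is 3 to 6, and at least one microphone is arranged on the front of the screen of the terminal device or on the back of the terminal device.
- The method according to claim 14, characterized in that the number of microphones is 3, one microphone is arranged at each of the top and the bottom of the terminal device, and one microphone is arranged on the front of the screen of the terminal device or on the back of the terminal device.
- The method according to claim 14, characterized in that the number of microphones is 6, two microphones are arranged at each of the top and the bottom of the terminal device, and one microphone is arranged on each of the front of the screen of the terminal device and the back of the terminal device.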
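- A stereo sound pickup device, applied to a terminal device, the terminal device comprising multiple microphones, characterized in that the device comprises: a pickup data acquisition module, configured to acquire multiple target sound pickup data from the sound pickup data of the multiple microphones; a device parameter acquisition module, configured to acquire posture data and camera data of the terminal device; a beam parameter determination module, configured to determine, according to the posture data and the camera data, a target beam parameter group corresponding to the multiple target sound pickup data from multiple pre-stored beam parameter groups, wherein the target beam parameter group includes the beam parameters corresponding to each of the multiple target sound pickup data; and a beam forming module, configured to form a stereo beam according to the target beam parameter group and the multiple target sound pickup data.
- A terminal device, characterized by comprising a memory storing a computer program and a processor, wherein when the computer program is read and run by the processor, the method according to any one of claims 1-16 is implemented.
- A computer-readable storage medium, characterized in that a computer program is stored thereon, and when the computer program is read and run by a processor, the method according to any one of claims 1-16 is implemented.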
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311246081.9A CN117528349A (zh) | 2020-01-16 | 2021-01-12 | 立体声拾音方法、装置、终端设备和计算机可读存储介质 |
CN202180007656.4A CN114846816B (zh) | 2020-01-16 | 2021-01-12 | 立体声拾音方法、装置、终端设备和计算机可读存储介质 |
BR112022013690A BR112022013690A2 (pt) | 2020-01-16 | 2021-01-12 | Método e aparelho de captura de som estéreo, dispositivo terminal, e meio de armazenamento legível por computador |
JP2022543511A JP7528228B2 (ja) | 2020-01-16 | 2021-01-12 | ステレオ収音方法および装置、端末デバイス、ならびにコンピュータ可読記憶媒体 |
EP21740899.6A EP4075825A4 (en) | 2020-01-16 | 2021-01-12 | STEREO SOUND CAPTURE METHOD AND APPARATUS, TERMINAL DEVICE AND COMPUTER READABLE MEMORY MEDIA |
US17/758,927 US20230048860A1 (en) | 2020-01-16 | 2021-01-12 | Stereo Sound Pickup Method and Apparatus, Terminal Device, and Computer-Readable Storage Medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010048851.9 | 2020-01-16 | ||
CN202010048851.9A CN113132863B (zh) | 2020-01-16 | 2020-01-16 | 立体声拾音方法、装置、终端设备和计算机可读存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021143656A1 true WO2021143656A1 (zh) | 2021-07-22 |
Family
ID=76771795
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/071156 WO2021143656A1 (zh) | 2020-01-16 | 2021-01-12 | 立体声拾音方法、装置、终端设备和计算机可读存储介质 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230048860A1 (zh) |
EP (1) | EP4075825A4 (zh) |
JP (1) | JP7528228B2 (zh) |
CN (3) | CN113132863B (zh) |
BR (1) | BR112022013690A2 (zh) |
WO (1) | WO2021143656A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023016032A1 (zh) * | 2021-08-12 | 2023-02-16 | 北京荣耀终端有限公司 | 一种视频处理方法及电子设备 |
CN116668892A (zh) * | 2022-11-14 | 2023-08-29 | 荣耀终端有限公司 | 音频信号的处理方法、电子设备及可读存储介质 |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
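CN115843054A (zh) * | 2021-09-18 | 2023-03-24 | 维沃移动通信有限公司 | Parameter selection method, parameter configuration method, terminal, and network-side device |
CN115134499B (zh) * | 2022-06-28 | 2024-02-02 | 世邦通信股份有限公司 | Audio and video monitoring method and system |
CN116700659B (zh) * | 2022-09-02 | 2024-03-08 | 荣耀终端有限公司 | Interface interaction method and electronic device |
CN118250608A (zh) * | 2022-12-24 | 2024-06-25 | 荣耀终端有限公司 | Microphone control method and electronic device |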
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
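KR20050062266A (ko) * | 2003-12-20 | 2005-06-23 | 엘지전자 주식회사 | External microphone device for the camcorder of a mobile communication terminal |
CN104244137A (zh) * | 2014-09-30 | 2014-12-24 | 广东欧珀移动通信有限公司 | Method and system for improving distant-scene audio recording during video recording |
CN108200515A (zh) * | 2017-12-29 | 2018-06-22 | 苏州科达科技股份有限公司 | Multi-beam conference sound pickup system and method |
CN108831474A (zh) * | 2018-05-04 | 2018-11-16 | 广东美的制冷设备有限公司 | Speech recognition device, and speech signal capture method, apparatus, and storage medium thereof |
WO2019130908A1 (ja) * | 2017-12-26 | 2019-07-04 | キヤノン株式会社 | Imaging apparatus, control method therefor, and recording medium |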
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4269883B2 (ja) * | 2003-10-20 | 2009-05-27 | ソニー株式会社 | Microphone device, playback device, and imaging device |
CN102780947B (zh) * | 2011-05-13 | 2015-12-16 | 宏碁股份有限公司 | System and method for reducing recording noise of a handheld electronic device |
EP2680616A1 (en) * | 2012-06-25 | 2014-01-01 | LG Electronics Inc. | Mobile terminal and audio zooming method thereof |
KR102060712B1 (ko) * | 2013-01-31 | 2020-02-11 | 엘지전자 주식회사 | Mobile terminal and operating method thereof |
WO2014149050A1 (en) * | 2013-03-21 | 2014-09-25 | Nuance Communications, Inc. | System and method for identifying suboptimal microphone performance |
CN104424953B (zh) * | 2013-09-11 | 2019-11-01 | 华为技术有限公司 | Speech signal processing method and apparatus |
AU2014321133A1 (en) * | 2013-09-12 | 2016-04-14 | Cirrus Logic International Semiconductor Limited | Multi-channel microphone mapping |
US9338575B2 (en) * | 2014-02-19 | 2016-05-10 | Echostar Technologies L.L.C. | Image steered microphone array |
JP6613503B2 (ja) * | 2015-01-15 | 2019-12-04 | 本田技研工業株式会社 | Sound source localization device, acoustic processing system, and method for controlling a sound source localization device |
US9716944B2 (en) * | 2015-03-30 | 2017-07-25 | Microsoft Technology Licensing, Llc | Adjustable audio beamforming |
US10122914B2 (en) * | 2015-04-17 | 2018-11-06 | mPerpetuo, Inc. | Method of controlling a camera using a touch slider |
CN106486147A (zh) * | 2015-08-26 | 2017-03-08 | 华为终端(东莞)有限公司 | Directional recording method and apparatus, and recording device |
CN111724823B (zh) * | 2016-03-29 | 2021-11-16 | 联想(北京)有限公司 | Information processing method and apparatus |
CN107026934B (zh) * | 2016-10-27 | 2019-09-27 | 华为技术有限公司 | Sound source localization method and apparatus |
JP6312069B1 (ja) * | 2017-04-20 | 2018-04-18 | 株式会社Special Medico | Personal information management method in a call system, server, and program |
- 2020
  - 2020-01-16 CN CN202010048851.9A patent/CN113132863B/zh active Active
- 2021
  - 2021-01-12 CN CN202180007656.4A patent/CN114846816B/zh active Active
  - 2021-01-12 JP JP2022543511A patent/JP7528228B2/ja active Active
  - 2021-01-12 CN CN202311246081.9A patent/CN117528349A/zh active Pending
  - 2021-01-12 US US17/758,927 patent/US20230048860A1/en active Pending
  - 2021-01-12 WO PCT/CN2021/071156 patent/WO2021143656A1/zh unknown
  - 2021-01-12 EP EP21740899.6A patent/EP4075825A4/en active Pending
  - 2021-01-12 BR BR112022013690A patent/BR112022013690A2/pt unknown
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
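KR20050062266A (ko) * | 2003-12-20 | 2005-06-23 | 엘지전자 주식회사 | External microphone device for the camcorder of a mobile communication terminal |
CN104244137A (zh) * | 2014-09-30 | 2014-12-24 | 广东欧珀移动通信有限公司 | Method and system for improving distant-scene audio recording during video recording |
WO2019130908A1 (ja) * | 2017-12-26 | 2019-07-04 | キヤノン株式会社 | Imaging apparatus, control method therefor, and recording medium |
CN108200515A (zh) * | 2017-12-29 | 2018-06-22 | 苏州科达科技股份有限公司 | Multi-beam conference sound pickup system and method |
CN108831474A (zh) * | 2018-05-04 | 2018-11-16 | 广东美的制冷设备有限公司 | Speech recognition device, and speech signal capture method, apparatus, and storage medium thereof |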
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
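WO2023016032A1 (zh) * | 2021-08-12 | 2023-02-16 | 北京荣耀终端有限公司 | Video processing method and electronic device |
CN116668892A (zh) * | 2022-11-14 | 2023-08-29 | 荣耀终端有限公司 | Audio signal processing method, electronic device, and readable storage medium |
CN116668892B (zh) * | 2022-11-14 | 2024-04-12 | 荣耀终端有限公司 | Audio signal processing method, electronic device, and readable storage medium |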
Also Published As
Publication number | Publication date |
---|---|
US20230048860A1 (en) | 2023-02-16 |
BR112022013690A2 (pt) | 2022-09-06 |
JP7528228B2 (ja) | 2024-08-05 |
CN113132863A (zh) | 2021-07-16 |
CN117528349A (zh) | 2024-02-06 |
EP4075825A1 (en) | 2022-10-19 |
CN113132863B (zh) | 2022-05-24 |
CN114846816A (zh) | 2022-08-02 |
CN114846816B (zh) | 2023-10-20 |
EP4075825A4 (en) | 2023-05-24 |
JP2023511090A (ja) | 2023-03-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
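WO2021143656A1 (zh) | | Stereo sound pickup method and apparatus, terminal device, and computer-readable storage medium |
CN111050269B (zh) | | Audio processing method and electronic device |
US20220206682A1 (en) | | Gesture Interaction Method and Apparatus, and Terminal Device |
US11956607B2 (en) | | Method and apparatus for improving sound quality of speaker |
US11606455B2 (en) | | Method for preventing mistouch by using top-emitted proximity light, and terminal |
EP3993460B1 (en) | | Method, electronic device and system for realizing functions through nfc tag |
WO2021180085A1 (zh) | | Sound pickup method and apparatus, and electronic device |
CN114697812A (zh) | | Sound collection method, electronic device, and system |
WO2020019355A1 (zh) | | Touch control method for a wearable device, wearable device, and system |
CN113810601A (zh) | | Image processing method and apparatus for a terminal, and terminal device |
US11978384B2 (en) | | Display method for electronic device and electronic device |
CN113496708A (zh) | | Sound pickup method and apparatus, and electronic device |
WO2023273476A1 (zh) | | Device detection method and electronic device |
WO2022156555A1 (zh) | | Screen brightness adjustment method and apparatus, and terminal device |
WO2020077508A1 (zh) | | Method for dynamic frequency scaling of internal memory, and electronic device |
CN114339429A (zh) | | Audio and video playback control method, electronic device, and storage medium |
US20240135946A1 (en) | | Method and apparatus for improving sound quality of speaker |
US20230162718A1 (en) | | Echo filtering method, electronic device, and computer-readable storage medium |
US20230370718A1 (en) | | Shooting Method and Electronic Device |
WO2022142795A1 (zh) | | Device identification method and device |
CN113436635B (zh) | | Self-calibration method and apparatus for a distributed microphone array, and electronic device |
CN115706755A (zh) | | Echo cancellation method, electronic device, and storage medium |
CN115378303A (zh) | | Method and apparatus for adjusting a drive waveform, electronic device, and readable storage medium |
CN113867520A (zh) | | Device control method, electronic device, and computer-readable storage medium |
WO2022105670A1 (zh) | | Display method and terminal |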
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21740899; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022543511; Country of ref document: JP; Kind code of ref document: A |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112022013690; Country of ref document: BR |
| | ENP | Entry into the national phase | Ref document number: 2021740899; Country of ref document: EP; Effective date: 20220714 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 112022013690; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20220708 |