WO2022022647A1 - Recording method and recording device for electronic device - Google Patents

Info

Publication number
WO2022022647A1
Authority
WO
WIPO (PCT)
Prior art keywords
focal length
gain
signal
initial
target
Application number
PCT/CN2021/109323
Other languages
English (en)
French (fr)
Inventor
史建兴
Original Assignee
维沃移动通信有限公司
Application filed by 维沃移动通信有限公司

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0356 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for synchronising with other signals, e.g. video signals

Definitions

  • the present application belongs to the field of communication technologies, and in particular relates to a recording method and a recording device of an electronic device.
  • the distance between the sound source and the mobile terminal usually changes.
  • as the distance between the sound source and the mobile terminal gradually increases, the sound recorded by the mobile terminal gradually becomes quieter, so that the user may not be able to hear it clearly; as the distance gradually decreases, the recorded sound gradually becomes louder, which may cause distortion (clipping). Therefore, for the situation where the distance between the sound source and the mobile terminal changes, how to improve the recording quality has become an urgent problem to be solved.
  • the purpose of the embodiments of the present application is to provide a recording method and a recording device for an electronic device, which can solve the problem of how to improve the recording quality.
  • an embodiment of the present application provides a recording method of an electronic device.
  • the electronic device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2.
  • the method includes: obtaining the shooting focal length of the camera in a video shooting state; if the shooting focal length of the camera changes from the initial focal length to the target focal length, determining the target gain according to the initial focal length, the target focal length and the initial gain, and adjusting the gain of the second voice path connected to the i-th microphone to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone and i takes the values 1, 2, ..., M in turn; performing signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th voice enhancement signal; and performing signal fusion processing on the M voice enhancement signals to obtain a first recording signal.
  • an embodiment of the present application provides a recording device.
  • the recording device includes M microphones, and each microphone is connected with a first voice channel and a second voice channel, where M is an integer greater than or equal to 2.
  • the recording device includes an acquisition module, a determination module and a processing module.
  • the acquiring module is used for acquiring the shooting focal length of the camera in a video shooting state.
  • the determining module is used to determine the target gain according to the initial focal length, the target focal length and the initial gain if the shooting focal length obtained by the acquiring module changes from the initial focal length to the target focal length, where the initial gain is the gain of the first voice path connected to the i-th microphone.
  • the processing module is used for adjusting the gain of the second voice path connected to the i-th microphone to the target gain determined by the determining module; performing signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th voice enhancement signal; and performing signal fusion processing on the M voice enhancement signals to obtain a first recording signal.
  • i takes the value of 1, 2...M in turn.
  • an embodiment of the present application provides an electronic device, the electronic device includes a processor, a memory, and a program or instruction stored in the memory and executable on the processor; when the program or instruction is executed by the processor, the steps of the method provided in the first aspect are implemented.
  • an embodiment of the present application provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or instruction is executed by a processor, the steps of the method provided in the first aspect are implemented.
  • an embodiment of the present application provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the method provided in the first aspect.
  • the shooting focal length of the camera can be obtained in a video shooting state; if the shooting focal length of the camera changes from the initial focal length to the target focal length, the target gain is determined according to the initial focal length, the target focal length and the initial gain, and the gain of the second voice path connected to the i-th microphone is adjusted to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone and i takes the values 1, 2, ..., M in turn; signal enhancement processing is performed on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th voice enhancement signal; and signal fusion processing is performed on the M voice enhancement signals to obtain a first recording signal.
  • when the distance between the sound source and the electronic device changes, the shooting focal length of the camera of the electronic device also changes. Therefore, the first voice path connected to the i-th microphone is set to a fixed gain, and the second voice path connected to the i-th microphone is set to a variable gain that changes with the shooting focal length: when the shooting focal length becomes larger, the gain of the second voice path becomes larger to record signals at longer distances; when the shooting focal length becomes smaller, the gain of the second voice path becomes smaller to record signals at closer distances. The voice signal is then enhanced by comparing the difference between the two signals, which improves the quality of the voice signal obtained by the final fusion and improves the effect of video shooting.
  • FIG. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • FIG. 2 is a first schematic diagram of a recording method of an electronic device provided by an embodiment of the present application.
  • FIG. 3 is a second schematic diagram of the recording method of the electronic device provided by the embodiment of the present application.
  • FIG. 4 is a schematic diagram of a field of view of a camera provided by an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a recording device provided by an embodiment of the present application.
  • FIG. 6 is a first schematic hardware diagram of the electronic device provided by the embodiment of the present application.
  • FIG. 7 is a second schematic hardware diagram of the electronic device provided by the embodiment of the present application.
  • the terms "first", "second" and the like in the description and claims of the present application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. The objects distinguished by "first", "second" and the like are usually of one type, and the number of objects is not limited; for example, the first object may be one or more than one.
  • "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the associated objects are in an "or" relationship.
  • the embodiment of the present application provides a recording method and a recording device for an electronic device, which can obtain the shooting focal length of a camera in a video shooting state; if the shooting focal length of the camera changes from the initial focal length to the target focal length, determine the target gain according to the initial focal length, the target focal length and the initial gain, and adjust the gain of the second voice path connected to the i-th microphone to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone and i takes the values 1, 2, ..., M in turn; perform signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th voice enhancement signal; and perform signal fusion processing on the M voice enhancement signals to obtain a first recording signal.
  • an embodiment of the present application provides an electronic device.
  • the electronic device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2.
  • take as an example an electronic device that is a mobile phone provided with two microphones (mic), a codec, and a digital signal processor (ADSP).
  • microphone 1 is connected to the first voice path 01 and the second voice path 02.
  • microphone 2 is connected to the first voice path 03 and the second voice path 04.
  • each of the four voice paths includes an analog-to-digital converter (ADC), and the other end of each of the four voice paths is connected to the encoding module of the codec.
  • the digital signal processor includes a recording enhancement module connected to the encoding module, and a noise reduction module connected to the recording enhancement module.
  • the microphone is used to collect sound signals (also known as sound source signals or sound wave signals) emitted by sound sources (such as people, musical instruments, flowing water, wind and waves, etc.);
  • the ADC is used to convert the sound signals collected by the microphone from analog signals into digital signals;
  • the encoding module is used to encode the digital signal output by the ADC to obtain an encoded signal;
  • the recording enhancement module is used to perform enhancement processing on the two encoded signals associated with each microphone to obtain one enhanced signal corresponding to that microphone, and thus two enhanced signals corresponding to the two microphones;
  • the noise reduction module is used to fuse the two enhanced signals into one fusion signal, and to perform noise reduction processing on the fusion signal to obtain the final recording signal.
  • the specific description of the recording method of the electronic device may be referred to the following embodiments, which will not be repeated here.
  • FIG. 1 is an example of an electronic device including two microphones. These two microphones can be used to collect different types of audio.
  • for example, microphone 1 can be mainly used to collect human voices and microphone 2 can be used to collect ambient sound; this does not limit the embodiments of the present application.
  • the electronic device may further include three microphones or more than three microphones, and as the number of microphones increases, the recording effect is gradually enhanced.
  • an embodiment of the present application provides a recording method of an electronic device.
  • the method may include the following S201 to S204.
  • the method will be exemplarily described below by taking the execution subject as an electronic device as shown in FIG. 1 as an example.
  • the electronic device acquires the shooting focal length of the camera.
  • the above-mentioned video shooting state is a state in which, after running the camera application, the user triggers the camera to collect and cache video frames through touch input on the video shooting control or video recording control; that is, the electronic device is in the process of video shooting.
  • the shooting focal length of the camera may change at any time.
  • in one possible implementation, the electronic device is in an auto-focus state, and the focal length of the camera is adjusted automatically as the subject moves; in another possible implementation, as the subject moves or the subject changes, the user triggers the camera to adjust the focal length through manual input. Therefore, in order to detect a change of the shooting focal length in time, after entering the video shooting state the electronic device can periodically detect the shooting focal length of the camera to determine whether it has changed, and then determine whether the gain of the voice path needs to be adjusted.
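The periodic detection described above can be illustrated with a minimal Python sketch. It is not taken from the application: `read_focal_length` is a hypothetical callback standing in for whatever camera interface the device exposes, and a real implementation would wait one detection period between polls.

```python
from typing import Callable, Iterator, Tuple


def focal_length_changes(read_focal_length: Callable[[], float],
                         initial_focal_length: float,
                         polls: int) -> Iterator[Tuple[float, float]]:
    """Poll the camera `polls` times and yield (previous, current) on each change."""
    previous = initial_focal_length
    for _ in range(polls):
        current = read_focal_length()  # hypothetical camera query
        if current != previous:
            # A change means the gain of the second voice path must be revisited.
            yield previous, current
            previous = current


# Usage with a simulated camera that zooms from x4 to x6 and then to x2.
readings = iter([4.0, 4.0, 6.0, 6.0, 2.0])
changes = list(focal_length_changes(lambda: next(readings), 4.0, polls=5))
```

Each yielded pair is a point at which the gain of the second voice path would be recomputed; polls that return an unchanged focal length produce nothing.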
  • the recording method provided by the embodiment of the present application may further include: running a camera application; after that, setting the shooting focal length of the camera to the initial focal length, setting the field of view of the camera to the initial field of view, and setting the gain of each voice path to the initial gain.
  • the initial focal length, initial field of view and initial gain may all be preset values.
  • the electronic device runs the camera application and performs initial settings. For example, it sets the initial focal length of the camera to ×4 (i.e., 4 times the focal length), sets the initial field of view of the camera to α, and sets the gain of the ADC in the first voice path and the second voice path connected to each microphone to 12 dB (decibels).
  • the gain of the first voice channel is a fixed gain, and as the shooting focal length of the camera changes, the gain of the first voice channel remains unchanged.
  • the gain of the second voice path is a variable gain. As the focal length of the camera increases, the gain of the second voice path increases, and as the focal length of the camera decreases, the gain of the second voice path decreases.
  • the above-mentioned embodiment is described by taking as an example the case where the initial gain of the second voice path is equal to the initial gain of the first voice path. It can be understood that, in actual implementation, the initial gain of the second voice path may be larger or smaller than the initial gain of the first voice path; it may be determined according to actual usage requirements, which is not limited in this embodiment of the present application.
  • the electronic device determines the target gain according to the initial focal length, the target focal length and the initial gain, and adjusts the gain of the second voice path connected to the i-th microphone to the target gain.
  • the above-mentioned initial gain is the gain of the first voice path connected to the ith microphone, that is, the fixed gain set for the first voice path connected to the ith microphone when the shooting focal length of the camera is the initial focal length.
  • i takes the values 1, 2, ..., M in sequence. That is, if the shooting focal length of the camera changes from the initial focal length to the target focal length, the electronic device determines and adjusts the gain of the second voice path connected to the first microphone according to the initial focal length, the target focal length and the gain of the first voice path connected to the first microphone; determines and adjusts the gain of the second voice path connected to the second microphone according to the initial focal length, the target focal length and the gain of the first voice path connected to the second microphone; ...; and determines and adjusts the gain of the second voice path connected to the M-th microphone according to the initial focal length, the target focal length and the gain of the first voice path connected to the M-th microphone.
  • the above-mentioned target focal length is larger than the initial focal length, or smaller than the initial focal length.
  • if the target focal length is larger than the initial focal length, the gain of the second voice path connected to the i-th microphone needs to be increased; if the target focal length is smaller than the initial focal length, the gain of the second voice path connected to the i-th microphone needs to be reduced.
  • the above-mentioned "determining the target gain according to the initial focal length, target focal length and initial gain” can be implemented in any one of the following two ways:
  • Method 1: If the target focal length is greater than the initial focal length, the electronic device uses the sum of the initial gain and the first gain as the target gain, where the first gain is the product of the target value and the preset value, and the target value is the difference between the target focal length and the initial focal length.
  • that is, target gain = initial gain + target value × preset value = initial gain + (target focal length - initial focal length) × preset value, where the preset value can be used to represent the gain difference between two consecutive focal lengths.
  • for example, the initial gain is 12 dB, the preset value is 3 dB, and the initial focal length is ×4 (i.e., 4 times the focal length).
  • Method 2: If the target focal length is smaller than the initial focal length, the electronic device uses the difference between the initial gain and the first gain as the target gain, where the first gain is the product of the target value and the preset value, and the target value is the difference between the initial focal length and the target focal length.
  • that is, target gain = initial gain - target value × preset value = initial gain - (initial focal length - target focal length) × preset value, where the preset value can be used to represent the gain difference between two consecutive focal lengths.
  • for example, the initial gain is 12 dB, the preset value is 3 dB, and the initial focal length is ×4 (i.e., 4 times the focal length). When the shooting focal length of the camera changes to ×1 (i.e., 1 times the focal length), the target gain = 12 dB - (4 - 1) × 3 dB = 3 dB.
  • a variable gain range may be set for the gain of the second voice path; that is, the gain of the second voice path can only be adjusted within the variable gain range.
  • for example, if the variable gain range is 0 to 30 dB, then, combined with the example in Method 1 above, when the shooting focal length of the camera changes to ×10 (i.e., 10 times the focal length), the target gain is 30 dB, and when the shooting focal length of the camera becomes greater than ×10, the target gain is still 30 dB.
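The two ways of determining the target gain and the variable gain range can be sketched together in Python. The function and parameter names below are illustrative assumptions, not names from the application. Because initial gain - (initial focal length - target focal length) × preset value equals initial gain + (target focal length - initial focal length) × preset value, one expression covers both Method 1 and Method 2.

```python
from typing import Tuple


def target_gain_db(initial_gain_db: float,
                   initial_focal: float,
                   target_focal: float,
                   step_db: float = 3.0,
                   gain_range_db: Tuple[float, float] = (0.0, 30.0)) -> float:
    """Target gain of the second voice path, limited to the variable gain range.

    step_db is the "preset value": the gain difference between two
    consecutive focal lengths (3 dB in the example values above).
    """
    gain = initial_gain_db + (target_focal - initial_focal) * step_db
    low, high = gain_range_db
    return max(low, min(high, gain))


# Example values from the text: 12 dB initial gain, x4 initial focal length.
zoom_in = target_gain_db(12.0, 4.0, 10.0)     # 12 + (10 - 4) * 3
zoom_out = target_gain_db(12.0, 4.0, 1.0)     # 12 - (4 - 1) * 3
beyond_x10 = target_gain_db(12.0, 4.0, 12.0)  # limited by the 0 to 30 dB range
```

With M microphones, the same function would be called once per microphone, each time with the fixed gain of that microphone's first voice path as `initial_gain_db`.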
  • the above embodiment is described by taking as an example the case where the initial gains of the first voice paths connected to the microphones are the same. It can be understood that, in actual implementation, these initial gains may also be unequal, which can be determined according to actual usage requirements.
  • a voice enhancement signal can be obtained by comparing the voice signal output from the first voice path and the voice signal output from the second voice path and performing enhancement processing.
  • the electronic device may compare the signal-to-noise ratio of the voice signal output by the first voice path with the signal-to-noise ratio of the voice signal output by the second voice path. If the signal-to-noise ratio of the voice signal output by the first voice path is greater than that of the voice signal output by the second voice path, the voice signal output by the first voice path is enhanced to obtain a voice enhancement signal; if the signal-to-noise ratio of the voice signal output by the first voice path is smaller than that of the voice signal output by the second voice path, the voice signal output by the second voice path is enhanced to obtain a voice enhancement signal.
  • the electronic device can compare the preset feature parameters of the voice signal output by the first voice path with the preset feature parameters of the voice signal output by the second voice path, and then synthesize the voice segments that meet the requirements to obtain a voice enhancement signal.
  • the preset characteristic parameters may include at least one of acoustic wave amplitude information, voiceprint information, and signal-to-noise ratio.
  • the voice signals output by the four voice paths, including the voice signal output from the second voice path 04 connected to microphone 2, are input into the recording enhancement module after analog-to-digital conversion by the ADC and encoding by the encoding module.
  • the recording enhancement module can compare the voice signal output by the first voice path 01 with the voice signal output by the second voice path 02 to obtain the first voice enhancement signal, and compare the voice signal output by the first voice path 03 with the voice signal output by the second voice path 04 to obtain the second voice enhancement signal.
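The signal-to-noise-ratio comparison between the two voice paths of one microphone can be sketched as follows. This is an illustration under stated assumptions: the application does not say how the signal-to-noise ratio is estimated or what the enhancement step does, so `snr_db` assumes that noise-only samples are available, and `select_path` only reports which path's signal would be passed on for enhancement. Choosing the second path on a tie is also an assumption, since the text covers only the greater-than and smaller-than cases.

```python
import math
from typing import Sequence


def snr_db(signal: Sequence[float], noise: Sequence[float]) -> float:
    """Estimate SNR in dB as the ratio of mean signal power to mean noise power."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


def select_path(first_snr_db: float, second_snr_db: float) -> str:
    """Return which voice path's signal should be enhanced."""
    # Tie-breaking toward the second path is an assumption of this sketch.
    return "first" if first_snr_db > second_snr_db else "second"


# Usage: the first path has unit-amplitude speech over 0.1-amplitude noise.
first = snr_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1])
chosen = select_path(first, 15.0)
```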
  • the two voice enhancement signals can be directly fused to obtain the first recording signal; or, the two voice enhancement signals can be fused by the noise reduction module to obtain a fusion signal, and noise reduction processing can then be performed on the fusion signal to obtain a first recording signal.
  • the embodiments of the present application provide multiple noise reduction processing methods:
  • the first method is that the electronic device performs noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera to eliminate noise from outside the shooting range, thereby obtaining the first recording signal.
  • the second method is that the electronic device is provided with at least three microphones, which can form a microphone array.
  • the microphone array is used to locate each sound source, so as to obtain the position information of each sound source, and the noise outside the azimuth of the sound source is then eliminated.
  • the position information can be understood as the distance information of the sound source signal from the earphone and the position information relative to the earphone.
  • the electronic device may also use other noise reduction processing methods to perform noise reduction processing on the fusion signal to obtain the first recording signal, which may be determined according to actual usage requirements, which is not limited in the embodiments of the present application.
  • an embodiment of the present application provides a recording method for an electronic device. When the distance between the sound source and the electronic device changes, the shooting focal length of the camera of the electronic device also changes. Therefore, the first voice path connected to the i-th microphone is set to a fixed gain, and the second voice path connected to the i-th microphone is set to a variable gain that changes with the shooting focal length: when the shooting focal length becomes larger, the gain of the second voice path becomes larger to record signals at longer distances; when the shooting focal length becomes smaller, the gain of the second voice path becomes smaller to record signals at closer distances. The voice signal is then enhanced by comparing the difference between the two signals, which improves the quality of the voice signal obtained by the final fusion and improves the shooting effect of the video.
  • noise reduction processing may be performed on the fusion signals first, and then a final recording signal is obtained.
  • S204 may be specifically implemented by the following S204A to S204C.
  • the electronic device performs signal fusion processing on the M speech enhancement signals to obtain a first fusion signal.
  • the electronic device acquires the field of view and the shooting direction of the camera.
  • when the electronic device is in the process of video shooting and the focal length of the camera changes, the angle of view of the camera also changes. For example, when the focal length increases, the angle of view decreases, and when the focal length decreases, the angle of view increases. When the electronic device is turned, such as moving from left to right along a horizontal line or from top to bottom along a vertical line, the shooting direction of the camera changes.
  • the embodiment of the present application can acquire the field of view and shooting direction of the camera, so as to eliminate noise from outside the shooting range according to the field of view and shooting direction of the camera.
  • the electronic device performs noise reduction processing on the first fusion signal according to the field of view angle and shooting direction of the camera to obtain a first recording signal.
  • the above noise reduction processing is used to eliminate noise from outside the shooting range, that is, to eliminate sound from outside the actual viewing area.
  • for example, when the focal length of the camera of the mobile phone is ×4 (i.e., 4 times the focal length), the field of view (also called the angle of view) of the camera can be α. Since the fusion signal includes the recording signal within the shooting range and the recording signal outside the shooting range, after the mobile phone determines the shooting range according to the field of view and shooting direction of the camera, the mobile phone can use a preset algorithm to eliminate or cancel the recording signal outside the shooting range in the fusion signal, so as to obtain the recording signal within the shooting range, that is, the first recording signal.
  • the recording method provided by the embodiment of the present application can perform noise reduction processing on the fusion signal according to the field of view angle and the shooting direction of the camera, so as to eliminate noise from outside the shooting range, thereby improving the recording quality of the finally obtained recording signal.
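The idea of eliminating sound from outside the shooting range can be sketched in Python. The application refers only to a "preset algorithm" and does not define it, so this is an illustration under strong assumptions: each sound source is taken to be already separated and located (for example by a microphone array), its azimuth and the shooting direction are expressed in degrees in the same horizontal plane, and the shooting range is the field of view centred on the shooting direction. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def in_shooting_range(source_azimuth_deg: float,
                      shooting_direction_deg: float,
                      field_of_view_deg: float) -> bool:
    """True if the source lies within half the field of view of the shooting direction."""
    # Signed angular offset wrapped into [-180, 180).
    offset = (source_azimuth_deg - shooting_direction_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= field_of_view_deg / 2.0


def keep_in_range_sources(sources: Sequence[Tuple[float, List[float]]],
                          shooting_direction_deg: float,
                          field_of_view_deg: float) -> List[float]:
    """Mix only the located sources (azimuth, samples) inside the shooting range."""
    kept = [samples for azimuth, samples in sources
            if in_shooting_range(azimuth, shooting_direction_deg, field_of_view_deg)]
    if not kept:
        return []
    return [sum(frame) for frame in zip(*kept)]


# Usage: a speaker at 10 degrees is kept, a noise source at 90 degrees is dropped.
recording = keep_in_range_sources(
    [(10.0, [1.0, 1.0]), (90.0, [5.0, 5.0])],
    shooting_direction_deg=0.0,
    field_of_view_deg=60.0,
)
```

When the focal length increases and the field of view narrows, the same check excludes more directions, which matches the behaviour described for the shooting range.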
  • the embodiment of the present application may periodically acquire the field of view and shooting direction of the camera to determine whether they have changed.
  • the recording method provided in this embodiment of the present application may further include the following S205 to S207.
  • the electronic device obtains the shooting focal length of the camera again to obtain the second fusion signal.
  • the shooting focal length obtained this time may or may not have changed relative to the shooting focal length obtained last time.
  • if the shooting focal length has not changed, there is no need to adjust the gain of the second voice path; the voice signal output by the first voice path and the voice signal output by the second voice path are compared directly to obtain a voice enhancement signal, that is, the second fusion signal.
  • if the shooting focal length has changed, the gain of the second voice path needs to be re-adjusted: the gain to be adjusted is determined according to the initial focal length, the changed focal length and the initial gain, and the gain of the second voice path is adjusted to that gain; after that, the voice signal output by the first voice path and the voice signal output by the second voice path are compared to obtain a voice enhancement signal, that is, the second fusion signal.
  • the electronic device obtains the field of view and the shooting direction of the camera again.
  • when the target object changes, the electronic device performs noise reduction processing on the second fusion signal according to the changed target object to obtain a second recording signal.
  • the above-mentioned target object includes at least one of a field of view angle and a shooting direction.
  • the electronic device may periodically acquire the field of view angle and shooting direction of the camera according to a preset period, so as to detect whether the field of view angle and shooting direction of the camera have changed. If at least one of the field of view and shooting direction of the camera changes, the electronic device needs to re-determine the noise reduction direction according to the changed field of view and shooting direction, and perform noise reduction processing on the new fusion signal. For example, if the user turns the electronic device and the position of the sound source remains the same, since the relative position of the sound source and the microphone changes, the shooting direction also changes.
  • the electronic device needs to re-determine the shooting range according to the changed shooting direction, and then use a preset algorithm to eliminate or cancel the recording signal outside the shooting range in the second fusion signal, so as to obtain the recording signal within the shooting range, that is, the second recording signal.
  • the relative position of the sound source and the electronic device may change. Therefore, by periodically acquiring the field of view and shooting direction of the camera, the electronic device can accurately determine the shooting range and denoise the voice signal more accurately.
  • the execution subject may be an electronic device, a recording device, or a control module in the recording device for executing the recording method.
  • the recording device provided by the embodiment of the present application is described by taking the recording method performed by the recording device as an example.
  • an embodiment of the present application provides a recording device 500.
  • the recording device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2.
  • the recording device 500 includes an acquisition module 501, a determination module 502 and a processing module 503.
  • the acquisition module 501 may be configured to acquire the shooting focal length of the camera in the video shooting state.
  • the determination module 502 may be configured to determine the target gain according to the initial focal length, the target focal length and the initial gain if the shooting focal length acquired by the acquisition module 501 changes from the initial focal length to the target focal length, where the initial gain is the gain of the first voice path connected to the i-th microphone.
  • the processing module 503 may be configured to adjust the gain of the second voice path connected to the i-th microphone to the target gain determined by the determination module 502; perform signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th speech enhancement signal; and perform signal fusion processing on the M speech enhancement signals to obtain the first recording signal.
  • i takes the value of 1, 2...M in turn.
  • the determination module 502 may be specifically configured to: if the target focal length is greater than the initial focal length, take the sum of the initial gain and the first gain as the target gain; or, if the target focal length is less than the initial focal length, take the difference between the initial gain and the first gain as the target gain.
  • the first gain is a product of a target value and a preset value, and the target value is an absolute value of a difference between the target focal length and the initial focal length.
  • the processing module 503 may be specifically configured to perform signal fusion processing on the M speech enhancement signals to obtain a first fusion signal.
  • the acquiring module 501 can also be used to acquire the field of view and shooting direction of the camera.
  • the processing module 503 may be specifically configured to perform noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera acquired by the acquisition module 501 to obtain a first recording signal, where the noise reduction processing is used to eliminate noise from outside the shooting range.
  • the acquiring module 501 may be configured to re-acquire the shooting focal length of the camera after obtaining the first recording signal to obtain the second fusion signal; and re-acquire the field of view and shooting direction of the camera.
  • the processing module 503 may also be configured to, when the target object changes, perform noise reduction processing on the second fusion signal according to the changed target object to obtain a second recording signal, where the target object includes at least one of the field of view and the shooting direction.
  • the processing module 503 may also be configured to run a camera application before acquiring the focal length of the camera.
  • the processing module 503 can also be used to set the shooting focal length of the camera as the initial focal length, set the field of view of the camera as the initial field of view, and set the gain of each speech channel as the initial gain.
  • An embodiment of the present application provides a recording device. Because the shooting focal length of the camera also changes when the distance between the sound source and the electronic device changes, the recording device sets the first voice path connected to the i-th microphone to a fixed gain and sets the second voice path connected to the i-th microphone to a variable gain that changes with the shooting focal length. As a result, when the shooting focal length increases, the gain of the second voice path increases to record signals from farther away, and when the shooting focal length decreases, the gain of the second voice path decreases to record signals from closer by. The recording device then enhances the voice signal by comparing the difference between the two signals, which improves the quality of the final fused voice signal and the shooting effect of the video.
  • the recording device in this embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal.
  • the apparatus may be a mobile electronic device or a non-mobile electronic device.
  • the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA).
  • the non-mobile electronic device may be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine or a self-service machine, etc., which is not specifically limited in the embodiments of the present application.
  • the recording device in the embodiment of the present application may be a device having an operating system.
  • the operating system may be the Android operating system, the iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
  • the recording device provided in the embodiment of the present application can implement each process implemented by the method embodiments in FIG. 1 to FIG. 4 , and to avoid repetition, details are not described here.
  • an embodiment of the present application further provides an electronic device 600, including a processor 601, a memory 602, and a program or instruction stored in the memory 602 and executable on the processor 601. When the program or instruction is executed by the processor 601, each process of the above recording method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
  • FIG. 7 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
  • the electronic device 700 includes but is not limited to components such as a radio frequency unit 701, a network module 702, an audio output unit 703, an input unit 704, a sensor 705, a display unit 706, a user input unit 707, an interface unit 708, a memory 709, and a processor 710.
  • the electronic device 700 may also include a power source (such as a battery) for supplying power to the various components, and the power source may be logically connected to the processor 710 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system.
  • the electronic device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2.
  • the structure of the electronic device shown in FIG. 7 does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than shown, or combine some components, or use a different arrangement of components, which is not repeated here.
  • the processor 710 may be configured to acquire the shooting focal length of the camera in the state of video shooting.
  • the processor 710 may also be configured to determine the target gain according to the initial focal length, the target focal length and the initial gain if the shooting focal length acquired by the processor 710 changes from the initial focal length to the target focal length, where the initial gain is the gain of the first voice path connected to the i-th microphone.
  • the processor 710 may also be configured to adjust the gain of the second voice path connected to the i-th microphone to the target gain; perform signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th speech enhancement signal; and perform signal fusion processing on the M speech enhancement signals to obtain the first recording signal.
  • i takes the value of 1, 2...M in turn.
  • the processor 710 may be specifically configured to: if the target focal length is greater than the initial focal length, take the sum of the initial gain and the first gain as the target gain; or, if the target focal length is less than the initial focal length, take the difference between the initial gain and the first gain as the target gain.
  • the first gain is the product of the target value and the preset value, and the target value is the absolute value of the difference between the target focal length and the initial focal length.
  • the processor 710 may be specifically configured to: perform signal fusion processing on the M speech enhancement signals to obtain a first fusion signal; acquire the field of view and shooting direction of the camera; and perform noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera to obtain a first recording signal, where the noise reduction processing is used to eliminate noise from outside the shooting range.
  • the processor 710 may be configured to: after obtaining the first recording signal, re-acquire the shooting focal length of the camera to obtain the second fusion signal; re-acquire the field of view and shooting direction of the camera; and, when the target object changes, perform noise reduction processing on the second fusion signal according to the changed target object to obtain a second recording signal.
  • the target object includes at least one of a field of view angle and a shooting direction.
  • the processor 710 may be further configured to run a camera application before acquiring the shooting focal length of the camera; set the shooting focal length of the camera to the initial focal length, set the field of view of the camera to the initial field of view, and set the gain of each voice path to the initial gain.
  • the embodiment of the present application provides an electronic device. Because the shooting focal length of the camera of the electronic device also changes when the distance between the sound source and the electronic device changes, the electronic device sets the first voice path connected to the i-th microphone to a fixed gain and sets the second voice path connected to the i-th microphone to a variable gain that changes with the shooting focal length. As a result, when the shooting focal length increases, the gain of the second voice path increases to record signals from farther away, and when the shooting focal length decreases, the gain of the second voice path decreases to record signals from closer by. The electronic device then enhances the voice signal by comparing the difference between the two signals, which improves the quality of the final fused voice signal and the shooting effect of the video.
  • the input unit 704 may include a graphics processing unit (GPU) 7041 and a microphone 7042. The graphics processing unit 7041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode.
  • the display unit 706 may include a display panel 7061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 707 includes a touch panel 7071 and other input devices 7072.
  • the touch panel 7071 is also called a touch screen.
  • the touch panel 7071 may include two parts, a touch detection device and a touch controller.
  • Other input devices 7072 may include, but are not limited to, physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be repeated here.
  • Memory 709 may be used to store software programs as well as various data including, but not limited to, application programs and operating systems.
  • the processor 710 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 710.
  • Embodiments of the present application further provide a readable storage medium on which a program or an instruction is stored. When the program or instruction is executed by a processor, each process of the above recording method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the processor is the processor in the electronic device in the above embodiment.
  • the readable storage medium includes a computer-readable storage medium, such as computer read-only memory (read-only memory, ROM), random access memory (random access memory, RAM), magnetic disk or optical disk, etc.
  • An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface coupled to the processor. The processor is used to run a program or an instruction to implement each process of the above recording method embodiment, and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip, or the like.
  • the method of the above embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Studio Devices (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

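A recording method and a recording device for an electronic device. The electronic device includes M microphones, and each microphone is connected to a first voice path and a second voice path. The recording method includes: acquiring the shooting focal length of a camera while in a video shooting state; if the shooting focal length changes from an initial focal length to a target focal length, determining a target gain according to the initial focal length, the target focal length and an initial gain, and adjusting the gain of the second voice path connected to the i-th microphone to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone; performing signal enhancement processing on the voice signal output by the first voice path and the voice signal output by the second voice path connected to the i-th microphone to obtain an i-th speech enhancement signal; and performing signal fusion processing on the M speech enhancement signals to obtain a first recording signal.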

Description

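Recording method and recording device for electronic device
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202010760783.9, filed in China on July 31, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application belongs to the field of communication technologies, and specifically relates to a recording method and a recording device for an electronic device.
Background
With the rapid development of communication technologies, most mobile terminals have a recording function. For example, a user can use a mobile terminal to shoot a short video that includes voice.
At present, when the user is in different environments, the distance between the sound source and the mobile terminal usually changes. As the distance between the sound source and the mobile terminal increases, the sound recorded by the mobile terminal gradually becomes quieter, so the user may be unable to hear it clearly. As the distance decreases, the recorded sound gradually becomes louder, which may cause distortion. Therefore, how to improve recording quality when the distance between the sound source and the mobile terminal changes has become an urgent problem to be solved.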
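Summary
The purpose of the embodiments of this application is to provide a recording method and a recording device for an electronic device that can solve the problem of how to improve recording quality.
To solve the above technical problem, the embodiments of this application are implemented as follows:
In a first aspect, an embodiment of this application provides a recording method for an electronic device. The electronic device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2. The method includes: acquiring the shooting focal length of a camera while in a video shooting state; if the shooting focal length of the camera changes from an initial focal length to a target focal length, determining a target gain according to the initial focal length, the target focal length and an initial gain, and adjusting the gain of the second voice path connected to the i-th microphone to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone and i takes the values 1, 2, ..., M in turn; performing signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain an i-th speech enhancement signal; and performing signal fusion processing on the M speech enhancement signals to obtain a first recording signal.
In a second aspect, an embodiment of this application provides a recording device. The recording device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2. The recording device includes an acquisition module, a determination module and a processing module. The acquisition module is configured to acquire the shooting focal length of a camera while in a video shooting state. The determination module is configured to, if the shooting focal length acquired by the acquisition module changes from an initial focal length to a target focal length, determine a target gain according to the initial focal length, the target focal length and an initial gain, where the initial gain is the gain of the first voice path connected to the i-th microphone. The processing module is configured to adjust the gain of the second voice path connected to the i-th microphone to the target gain determined by the determination module; perform signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain an i-th speech enhancement signal; and perform signal fusion processing on the M speech enhancement signals to obtain a first recording signal. Here, i takes the values 1, 2, ..., M in turn.
In a third aspect, an embodiment of this application provides an electronic device. The electronic device includes a processor, a memory, and a program or instruction stored in the memory and executable on the processor. When the program or instruction is executed by the processor, the steps of the method provided in the first aspect are implemented.
In a fourth aspect, an embodiment of this application provides a readable storage medium on which a program or instruction is stored. When the program or instruction is executed by a processor, the steps of the method provided in the first aspect are implemented.
In a fifth aspect, an embodiment of this application provides a chip. The chip includes a processor and a communication interface coupled to the processor, and the processor is used to run a program or instruction to implement the method provided in the first aspect.
In the embodiments of this application, the shooting focal length of a camera can be acquired while in a video shooting state; if the shooting focal length of the camera changes from an initial focal length to a target focal length, a target gain is determined according to the initial focal length, the target focal length and an initial gain, and the gain of the second voice path connected to the i-th microphone is adjusted to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone and i takes the values 1, 2, ..., M in turn; signal enhancement processing is performed on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain an i-th speech enhancement signal; and signal fusion processing is performed on the M speech enhancement signals to obtain a first recording signal. With this solution, because the shooting focal length of the camera of the electronic device also changes when the distance between the sound source and the electronic device changes, the first voice path connected to the i-th microphone is set to a fixed gain and the second voice path connected to the i-th microphone is set to a variable gain that changes with the shooting focal length. As a result, when the shooting focal length increases, the gain of the second voice path increases to record signals from farther away, and when the shooting focal length decreases, the gain of the second voice path decreases to record signals from closer by. The voice signal is then enhanced by comparing the difference between the two signals, which improves the quality of the final fused voice signal and the shooting effect of the video.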
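Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of this application;
FIG. 2 is a first schematic diagram of a recording method for an electronic device provided by an embodiment of this application;
FIG. 3 is a second schematic diagram of a recording method for an electronic device provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the field of view of a camera provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of a recording device provided by an embodiment of this application;
FIG. 6 is a first hardware schematic diagram of an electronic device provided by an embodiment of this application;
FIG. 7 is a second hardware schematic diagram of an electronic device provided by an embodiment of this application.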
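Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The terms "first", "second" and the like in the specification and claims of this application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of this application can be implemented in orders other than those illustrated or described here. Objects distinguished by "first", "second" and the like are usually of one type, and their number is not limited; for example, there may be one first object or several. In addition, "and/or" in the specification and claims indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The embodiments of this application provide a recording method and a recording device for an electronic device. The shooting focal length of a camera can be acquired while in a video shooting state; if the shooting focal length of the camera changes from an initial focal length to a target focal length, a target gain is determined according to the initial focal length, the target focal length and an initial gain, and the gain of the second voice path connected to the i-th microphone is adjusted to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone and i takes the values 1, 2, ..., M in turn; signal enhancement processing is performed on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain an i-th speech enhancement signal; and signal fusion processing is performed on the M speech enhancement signals to obtain a first recording signal. With this solution, because the shooting focal length of the camera of the electronic device also changes when the distance between the sound source and the electronic device changes, the first voice path connected to the i-th microphone is set to a fixed gain and the second voice path connected to the i-th microphone is set to a variable gain that changes with the shooting focal length. As a result, when the shooting focal length increases, the gain of the second voice path increases to record signals from farther away, and when the shooting focal length decreases, the gain of the second voice path decreases to record signals from closer by. The voice signal is then enhanced by comparing the difference between the two signals, which improves the quality of the final fused voice signal and the shooting effect of the video.
The recording method for an electronic device, the recording device and the electronic device provided by the embodiments of this application are described in detail below through specific embodiments and their application scenarios, with reference to the accompanying drawings.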
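As shown in FIG. 1, an embodiment of this application provides an electronic device. The electronic device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2.
For example, take a mobile phone as the electronic device, with two microphones (mic), one codec and one digital signal processor (adsp). Microphone 1 is connected to first voice path 01 and second voice path 02, and microphone 2 is connected to first voice path 03 and second voice path 04. Each of these 4 voice paths includes an analog-to-digital converter (ADC), and the other end of each of the 4 voice paths is connected to the encoding module of the codec. The digital signal processor includes a recording enhancement module connected to the encoding module and a noise reduction module connected to the recording enhancement module.
Specifically, the microphone collects sound signals (also called sound source signals or sound wave signals) emitted by sound sources (such as people, musical instruments, flowing water, and wind and waves). The ADC converts the sound signal collected by the microphone from an analog signal into a digital signal. The encoding module encodes the digital signal output by the ADC to obtain an encoded signal. The recording enhancement module enhances the 2 encoded signals associated with each microphone to obtain 1 enhanced signal per microphone, and thus 2 enhanced signals for the two microphones. The noise reduction module fuses the 2 enhanced signals into 1 fusion signal and performs noise reduction processing on that fusion signal to obtain the final recording signal. For details, refer to the description of the recording method in the following embodiments, which is not repeated here.
It should be noted that FIG. 1 is described using an electronic device with two microphones as an example. The two microphones can be used to collect different types of audio; for example, microphone 1 may mainly collect human voice and microphone 2 may mainly collect ambient sound, but this does not limit the embodiments of this application in any way. It can be understood that the electronic device may also include 3 or more microphones; as the number of microphones increases, the recording effect gradually improves.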
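As shown in FIG. 2, an embodiment of this application provides a recording method for an electronic device. The method may include S201 to S204 below. The method is described below by way of example, taking the electronic device shown in FIG. 1 as the execution subject.
S201. While in a video shooting state, the electronic device acquires the shooting focal length of the camera.
In the embodiments of this application, the video shooting state is the state in which, after the camera application is running, the user's touch input on a video shooting control or video recording control triggers the camera to capture and buffer video frames; that is, the electronic device is in the process of shooting video.
After the electronic device enters the video shooting state, the shooting focal length of the camera may change at any time. In one possible implementation, the electronic device is in an autofocus state, and the shooting focal length of the camera adjusts automatically as the subject moves. In another possible implementation, as the subject moves or changes, the user triggers the camera to adjust the focal length through manual input. Therefore, to detect changes in the shooting focal length promptly, the electronic device may check the shooting focal length of the camera periodically after entering the video shooting state, so as to determine whether it has changed and, in turn, whether the gain of a voice path needs to be adjusted.
Optionally, before S201, the recording method provided by the embodiments of this application may further include: running the camera application; then setting the shooting focal length of the camera to the initial focal length, setting the field of view of the camera to the initial field of view, and setting the gain of each voice path to the initial gain. The initial focal length, the initial field of view and the initial gain may all be preset values.
For example, after the user taps the icon of the camera application, the electronic device runs the camera application and performs initialization. For instance, the initial focal length of the camera is set to 4x focal length, the initial field of view of the camera is set to α, and the ADC gains in the first voice path and the second voice path connected to each microphone are all set to 12 dB (decibels).
It should be noted that, in the embodiments of this application, the gain of the first voice path is a fixed gain: it remains unchanged as the shooting focal length of the camera changes. The gain of the second voice path is a variable gain: it increases as the shooting focal length of the camera increases and decreases as the shooting focal length decreases. In addition, the above embodiment is described using the case where the initial gain of the second voice path equals the initial gain of the first voice path. It can be understood that, in an actual implementation, the initial gain of the second voice path may be greater or less than the initial gain of the first voice path, as determined by actual usage requirements; this is not limited in the embodiments of this application.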
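To make the effect of a gain expressed in dB concrete, the following is a minimal sketch of how a gain value could be applied to digital samples. It is only an illustration: the embodiment sets the gain in the ADC of each voice path, whereas this sketch scales already-digitized samples, and it assumes the usual amplitude convention (a gain of G dB multiplies the amplitude by 10^(G/20)). The names `db_to_linear` and `apply_gain_db` are hypothetical.

```python
def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear scale factor (assumed 20*log10 convention)."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain_db(samples, gain_db):
    """Scale every sample by the linear factor that corresponds to gain_db."""
    factor = db_to_linear(gain_db)
    return [sample * factor for sample in samples]


# The fixed first path and the variable second path differ only in the gain they receive.
first_path = apply_gain_db([0.01, -0.02, 0.015], 12.0)   # fixed initial gain
second_path = apply_gain_db([0.01, -0.02, 0.015], 18.0)  # gain after zooming in (example value)
```

Under this convention, each 6 dB step roughly doubles the amplitude, which is why a large second-path gain at short distances can overload the signal.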
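S202. If the shooting focal length of the camera changes from the initial focal length to the target focal length, the electronic device determines the target gain according to the initial focal length, the target focal length and the initial gain, and adjusts the gain of the second voice path connected to the i-th microphone to the target gain.
Here, the initial gain is the gain of the first voice path connected to the i-th microphone, that is, the fixed gain set for the first voice path connected to the i-th microphone when the shooting focal length of the camera is the initial focal length.
In the embodiments of this application, i takes the values 1, 2, ..., M in turn. That is, if the shooting focal length of the camera changes from the initial focal length to the target focal length, the electronic device determines and adjusts the gain of the second voice path connected to the 1st microphone according to the initial focal length, the target focal length and the gain of the first voice path connected to the 1st microphone; determines and adjusts the gain of the second voice path connected to the 2nd microphone according to the initial focal length, the target focal length and the gain of the first voice path connected to the 2nd microphone; and so on, up to determining and adjusting the gain of the second voice path connected to the M-th microphone according to the initial focal length, the target focal length and the gain of the first voice path connected to the M-th microphone.
Optionally, the target focal length is greater than the initial focal length, or less than the initial focal length.
If the target focal length is greater than the initial focal length, the gain of the second voice path connected to the i-th microphone needs to be increased; if the target focal length is less than the initial focal length, the gain of the second voice path connected to the i-th microphone needs to be decreased. Specifically, "determining the target gain according to the initial focal length, the target focal length and the initial gain" can be implemented in either of the following two manners: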
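Manner 1. If the target focal length is greater than the initial focal length, the electronic device takes the sum of the initial gain and the first gain as the target gain, where the first gain is the product of a target value and a preset value, and the target value is the absolute value of the difference between the target focal length and the initial focal length. That is, target gain = initial gain + first gain = initial gain + |target focal length - initial focal length| * preset value = initial gain + (target focal length - initial focal length) * preset value. The preset value can be used to represent the gain difference between two consecutive focal lengths.
For example, assume the initial gain is 12 dB, the preset value is 3 dB, and the initial focal length is ×4 (that is, 4x focal length). When the shooting focal length of the camera changes to ×6 (6x focal length), the target gain is 12 dB + (6 - 4) * 3 dB = 18 dB; when the shooting focal length changes to ×10 (10x focal length), the target gain is 12 dB + (10 - 4) * 3 dB = 30 dB.
Manner 2. If the target focal length is less than the initial focal length, the electronic device takes the difference between the initial gain and the first gain as the target gain, where the first gain is the product of a target value and a preset value, and the target value is the absolute value of the difference between the target focal length and the initial focal length. That is, target gain = initial gain - first gain = initial gain - |target focal length - initial focal length| * preset value = initial gain - (initial focal length - target focal length) * preset value. The preset value can be used to represent the gain difference between two consecutive focal lengths.
For example, assume the initial gain is 12 dB, the preset value is 3 dB, and the initial focal length is ×4 (that is, 4x focal length). When the shooting focal length of the camera changes to ×2 (2x focal length), the target gain is 12 dB - (4 - 2) * 3 dB = 6 dB; when the shooting focal length changes to ×1 (1x focal length), the target gain is 12 dB - (4 - 1) * 3 dB = 3 dB.
It should be noted that the embodiments of this application define a variable gain range for the second voice path; that is, the gain of the second voice path can only be adjusted within this range. For example, assume the variable gain range is 0 to 30 dB. Continuing the example in Manner 1, when the shooting focal length of the camera changes to ×10 (10x focal length), the target gain is 30 dB; when the shooting focal length changes to more than 10x focal length, the target gain remains 30 dB.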
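The two manners and the variable gain range can be sketched together as a single function. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `target_gain_db` and its parameter names are hypothetical, focal lengths are expressed as zoom factors as in the examples above, and the default step (3 dB) and range (0 to 30 dB) are simply the example values from the text.

```python
def target_gain_db(initial_gain_db, initial_zoom, target_zoom,
                   step_db=3.0, min_gain_db=0.0, max_gain_db=30.0):
    """Return the second-path gain for a new zoom factor.

    first_gain = |target_zoom - initial_zoom| * step_db is added when zooming in
    (Manner 1) and subtracted when zooming out (Manner 2); the result is then
    limited to the variable gain range [min_gain_db, max_gain_db].
    """
    first_gain = abs(target_zoom - initial_zoom) * step_db
    if target_zoom > initial_zoom:
        gain = initial_gain_db + first_gain
    elif target_zoom < initial_zoom:
        gain = initial_gain_db - first_gain
    else:
        gain = initial_gain_db  # focal length unchanged, nothing to adjust
    return max(min_gain_db, min(max_gain_db, gain))


# Worked examples from the text (initial gain 12 dB, initial zoom x4, step 3 dB).
for zoom in (1, 2, 6, 10, 12):
    print(f"x{zoom}: {target_gain_db(12.0, 4, zoom)} dB")
```

Limiting the result after the addition or subtraction keeps the two manners symmetric and reproduces the saturation at 30 dB for zoom factors above ×10.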
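The above embodiment is described using the case where the first voice paths connected to the microphones all have the same initial gain. It can be understood that, in an actual implementation, the initial gains of the first voice paths connected to the microphones may also differ, as determined by actual usage requirements.
S203. Perform signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th speech enhancement signal.
For the first voice path and the second voice path of each microphone, because the gain of the first voice path differs from the gain of the second voice path, the voice signal output by the first voice path differs from the voice signal output by the second voice path. For each microphone, one speech enhancement signal can be obtained by comparing the voice signal output by the first voice path with the voice signal output by the second voice path and performing enhancement processing.
Optionally, in one manner, the electronic device may compare the signal-to-noise ratio of the voice signal output by the first voice path with that of the voice signal output by the second voice path. If the signal-to-noise ratio of the voice signal output by the first voice path is greater than that of the voice signal output by the second voice path, enhancement processing is performed on the voice signal output by the first voice path to obtain a speech enhancement signal; if it is less, enhancement processing is performed on the voice signal output by the second voice path to obtain a speech enhancement signal.
In another manner, the electronic device may compare a preset characteristic parameter of the voice signal output by the first voice path with that of the voice signal output by the second voice path, and then synthesize the voice segments of the two voice signals whose preset characteristic parameters meet the requirements, thereby obtaining a speech enhancement signal. The preset characteristic parameter may include at least one of sound wave amplitude information, voiceprint information, signal-to-noise ratio, and the like.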
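The signal-to-noise comparison in the first manner can be sketched as follows. This is a rough illustration only: the text does not say how the signal-to-noise ratio is estimated or what the subsequent enhancement consists of, so the sketch assumes a noise power estimate is already available for each path, approximates the ratio as total mean power over noise power, and stops at selecting the path to enhance. Choosing the first path on a tie is also an assumption. All names are hypothetical.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(samples, noise_power):
    """Rough SNR estimate in dB: total mean power over an assumed noise power."""
    power = mean_power(samples)
    if power <= 0.0 or noise_power <= 0.0:
        raise ValueError("signal power and noise power must be positive")
    return 10.0 * math.log10(power / noise_power)


def pick_path_by_snr(first_path, second_path, first_noise_power, second_noise_power):
    """Return ("first" | "second", samples) for the path with the higher estimated SNR."""
    first_snr = snr_db(first_path, first_noise_power)
    second_snr = snr_db(second_path, second_noise_power)
    if first_snr >= second_snr:  # tie goes to the fixed-gain path (assumption)
        return "first", first_path
    return "second", second_path
```

The selected samples would then be passed to whatever enhancement step the implementation uses; the second manner would instead compare the paths segment by segment and stitch the better segments together.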
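S204. Perform signal fusion processing on the M speech enhancement signals to obtain the first recording signal.
For example, as shown in FIG. 1, the voice signal output by first voice path 01 connected to microphone 1, the voice signal output by second voice path 02 connected to microphone 1, the voice signal output by first voice path 03 connected to microphone 2, and the voice signal output by second voice path 04 connected to microphone 2 are each input to the recording enhancement module after analog-to-digital conversion by the ADC and encoding by the encoding module. The recording enhancement module can compare the voice signal output by first voice path 01 with the voice signal output by second voice path 02 to obtain the first speech enhancement signal, and compare the voice signal output by first voice path 03 with the voice signal output by second voice path 04 to obtain the second speech enhancement signal. After that, the two speech enhancement signals can be fused directly to obtain the first recording signal; alternatively, the noise reduction module can first fuse the two speech enhancement signals to obtain a fusion signal and then perform noise reduction processing on that fusion signal to obtain the first recording signal.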
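The text does not specify the fusion algorithm, so the following sketch uses the simplest stand-in, a sample-by-sample average of the M speech enhancement signals, purely to show the data flow from M signals to one fusion signal. The function name `fuse_signals` is hypothetical, and a real implementation might weight or align the signals differently.

```python
def fuse_signals(enhanced_signals):
    """Average M equal-length speech enhancement signals into one fusion signal."""
    if not enhanced_signals:
        raise ValueError("at least one enhanced signal is required")
    length = len(enhanced_signals[0])
    if any(len(signal) != length for signal in enhanced_signals):
        raise ValueError("all enhanced signals must have the same length")
    count = len(enhanced_signals)
    # zip(*...) walks the signals in parallel, one sample position at a time.
    return [sum(samples) / count for samples in zip(*enhanced_signals)]


fusion = fuse_signals([[1.0, 2.0], [3.0, 4.0]])  # two microphones, two samples each
```

Averaging keeps the fused level comparable to the inputs regardless of M, which is one reason it is a common default.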
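Optionally, the embodiments of this application provide several noise reduction methods:
In the first method, the electronic device performs noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera, so as to eliminate noise from outside the shooting range and obtain the first recording signal. For details, refer to the description in the following embodiments, which is not repeated here.
In the second method, the electronic device is provided with at least 3 microphones, which can form a microphone array. Using multi-source localization algorithms, such as direction finding based on high-resolution spectral estimation or steerable beamforming, each sound source can be localized to obtain its position information, and noise from outside the sound source direction can then be eliminated. The position information can be understood as the distance of the sound source signal from the earphone and its direction relative to the earphone.
Of course, it can be understood that the electronic device may also use other noise reduction methods to perform noise reduction processing on the fusion signal to obtain the first recording signal, as determined by actual usage requirements; this is not limited in the embodiments of this application.
The embodiments of this application provide a recording method for an electronic device. Because the shooting focal length of the camera of the electronic device also changes when the distance between the sound source and the electronic device changes, the first voice path connected to the i-th microphone is set to a fixed gain and the second voice path connected to the i-th microphone is set to a variable gain that changes with the shooting focal length. As a result, when the shooting focal length increases, the gain of the second voice path increases to record signals from farther away, and when the shooting focal length decreases, the gain of the second voice path decreases to record signals from closer by. The voice signal is then enhanced by comparing the difference between the two signals, which improves the quality of the final fused voice signal and the shooting effect of the video.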
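Optionally, after the M speech enhancement signals obtained from the M microphones are fused, noise reduction processing can first be performed on the fusion signal before the final recording signal is obtained. For example, with reference to FIG. 2 and as shown in FIG. 3, S204 can be implemented through S204A to S204C below.
S204A. The electronic device performs signal fusion processing on the M speech enhancement signals to obtain the first fusion signal.
S204B. The electronic device acquires the field of view and shooting direction of the camera.
While the electronic device is shooting video, the field of view of the camera changes when the shooting focal length changes: for example, the field of view decreases when the shooting focal length increases and increases when the shooting focal length decreases. When the electronic device is turned, for example moved from left to right along a horizontal line or from top to bottom along a vertical line, the shooting direction of the camera changes.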
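The inverse relationship between focal length and field of view can be illustrated with the standard rectilinear-lens approximation, FOV = 2 * atan(w / (2 * f)), where w is the sensor width and f is the focal length. The text only states the qualitative relationship and expresses focal length as a zoom factor, so the formula, the millimetre units and the function name `horizontal_fov_deg` are assumptions made for illustration.

```python
import math


def horizontal_fov_deg(sensor_width_mm, focal_length_mm):
    """Horizontal field of view in degrees for an ideal rectilinear lens focused at infinity."""
    if sensor_width_mm <= 0.0 or focal_length_mm <= 0.0:
        raise ValueError("sensor width and focal length must be positive")
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


# Doubling the focal length narrows the field of view.
wide = horizontal_fov_deg(36.0, 18.0)
narrow = horizontal_fov_deg(36.0, 36.0)
```

A device that reports the field of view directly would not need this calculation; it is shown only to make the trend described above explicit.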
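When the field of view and shooting direction of the camera change, the position of the sound source may change as well, so the captured audio contains both the main audio from within the shooting range and noise from outside it. Therefore, to denoise the voice signal more accurately, the embodiments of this application can acquire the field of view and shooting direction of the camera and use them to eliminate noise from outside the shooting range.
S204C. The electronic device performs noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera to obtain the first recording signal.
Here, the noise reduction processing is used to eliminate noise from outside the shooting range, that is, to eliminate sound from outside the actual framing area.
As shown in FIG. 4, when the user holds a horizontally placed mobile phone facing straight ahead, if the shooting focal length of the phone's camera is ×4 (that is, 4x focal length), the field of view of the camera (also called the wide angle) may be α. Because the fusion signal includes both the recording signal within the shooting range and the recording signal outside it, after the phone determines the shooting range according to the field of view and shooting direction of the camera, it can use a preset algorithm to eliminate or cancel the recording signal outside the shooting range in the fusion signal, thereby obtaining the recording signal within the shooting range, that is, the first recording signal.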
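The "preset algorithm" is not specified in the text. The sketch below covers only the geometric part of the idea: deciding whether a sound arriving from a given horizontal direction falls inside the shooting range defined by the shooting direction and the field of view. It assumes that an estimate of the arrival direction is available (for example from a microphone array) and that all angles are horizontal azimuths in degrees; the function name and parameters are hypothetical.

```python
def in_shooting_range(source_azimuth_deg, shooting_direction_deg, fov_deg):
    """Return True if the source direction lies within half the field of view of the shooting direction."""
    # Wrap the angular difference into [-180, 180) so that 350 and 10 degrees are 20 degrees apart.
    offset = (source_azimuth_deg - shooting_direction_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= fov_deg / 2.0


# A source 10 degrees off-axis is inside a 60 degree field of view; one at 40 degrees is not.
inside = in_shooting_range(10.0, 0.0, 60.0)
outside = in_shooting_range(40.0, 0.0, 60.0)
```

A noise reduction stage could keep or attenuate each localized component depending on this test; how the components are separated in the first place is outside the scope of this sketch.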
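The recording method provided by the embodiments of this application can perform noise reduction processing on the fusion signal according to the field of view and shooting direction of the camera to eliminate noise from outside the shooting range, thereby improving the recording quality of the final recording signal.
Optionally, because the relative position of the sound source and the electronic device may change when the field of view and shooting direction of the camera change, the embodiments of this application can, in order to denoise the voice signal more accurately, periodically acquire the field of view and shooting direction of the camera to determine whether they have changed. For example, after S204C, the recording method provided by the embodiments of this application may further include S205 to S207 below.
S205. The electronic device re-acquires the shooting focal length of the camera to obtain a second fusion signal.
After the electronic device re-acquires the shooting focal length of the camera according to the preset period, the shooting focal length may or may not have changed relative to the previously acquired value.
If the shooting focal length has not changed, the gain of the second voice path does not need to be adjusted, and the voice signal output by the first voice path is compared directly with the voice signal output by the second voice path to obtain a speech enhancement signal, that is, the second fusion signal.
If the shooting focal length has changed, the gain of the second voice path needs to be adjusted again: the gain to be applied is determined according to the initial focal length, the changed focal length and the initial gain, and the gain of the second voice path is set to that value; after that, the voice signal output by the first voice path is compared with the voice signal output by the second voice path to obtain a speech enhancement signal, that is, the second fusion signal. For details, refer to the description of S201 to S204 in the above embodiments, which is not repeated here.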
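S206. The electronic device re-acquires the field of view and shooting direction of the camera.
S207. When the target object changes, the electronic device performs noise reduction processing on the second fusion signal according to the changed target object to obtain a second recording signal.
Here, the target object includes at least one of the field of view and the shooting direction.
The electronic device may periodically acquire the field of view and shooting direction of the camera according to a preset period, so as to detect whether either has changed. If at least one of the field of view and the shooting direction has changed, the electronic device needs to re-determine the noise reduction direction according to the changed field of view and shooting direction, and perform noise reduction processing on the new fusion signal. For example, if the user turns the electronic device while the position of the sound source remains the same, the relative position of the sound source and the microphone changes, and so does the shooting direction. In this case, the electronic device needs to re-determine the shooting range according to the changed shooting direction, and then use a preset algorithm to eliminate or cancel the recording signal outside the shooting range in the second fusion signal, thereby obtaining the recording signal within the shooting range, that is, the second recording signal.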
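The periodic check in S205 to S207 amounts to comparing the newly acquired camera state with the previous one and deciding which parts of the pipeline to update. The sketch below shows one way to structure that decision. The `CameraState` type, its field names and the exact-equality comparison are assumptions made for illustration; a real device would likely compare against a tolerance and obtain these values from its camera framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraState:
    zoom: float           # shooting focal length, expressed as a zoom factor
    fov_deg: float        # field of view in degrees
    direction_deg: float  # shooting direction as a horizontal azimuth in degrees


def required_updates(previous, current):
    """Decide which processing stages must be refreshed after one polling period."""
    return {
        # S205: a focal length change means the second-path gain must be recomputed.
        "adjust_second_path_gain": current.zoom != previous.zoom,
        # S207: a change in field of view or direction means the shooting range,
        # and therefore the noise reduction direction, must be re-determined.
        "redetermine_shooting_range": (
            current.fov_deg != previous.fov_deg
            or current.direction_deg != previous.direction_deg
        ),
    }


before = CameraState(zoom=4.0, fov_deg=60.0, direction_deg=0.0)
after = CameraState(zoom=4.0, fov_deg=60.0, direction_deg=15.0)
updates = required_updates(before, after)
```

In this example the user has only turned the device, so the gain is left alone and only the shooting range is recomputed.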
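In the recording method provided by the embodiments of this application, the relative position of the sound source and the electronic device may change when the field of view and shooting direction of the camera change. Therefore, by periodically acquiring the field of view and shooting direction of the camera, the shooting range can be determined accurately and the voice signal can be denoised more accurately.
It should be noted that the execution subject of the recording method provided by the embodiments of this application may be an electronic device, a recording device, or a control module in the recording device for executing the recording method. In the embodiments of this application, the recording device is described by taking the recording device executing the recording method as an example.
As shown in FIG. 5, an embodiment of this application provides a recording device 500. The recording device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2. The recording device 500 includes an acquisition module 501, a determination module 502 and a processing module 503.
The acquisition module 501 may be configured to acquire the shooting focal length of the camera while in a video shooting state. The determination module 502 may be configured to, if the shooting focal length acquired by the acquisition module 501 changes from the initial focal length to the target focal length, determine the target gain according to the initial focal length, the target focal length and the initial gain, where the initial gain is the gain of the first voice path connected to the i-th microphone. The processing module 503 may be configured to adjust the gain of the second voice path connected to the i-th microphone to the target gain determined by the determination module 502; perform signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th speech enhancement signal; and perform signal fusion processing on the M speech enhancement signals to obtain the first recording signal. Here, i takes the values 1, 2, ..., M in turn.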
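Optionally, the determination module 502 may be specifically configured to: if the target focal length is greater than the initial focal length, take the sum of the initial gain and the first gain as the target gain; or, if the target focal length is less than the initial focal length, take the difference between the initial gain and the first gain as the target gain. The first gain is the product of a target value and a preset value, and the target value is the absolute value of the difference between the target focal length and the initial focal length.
Optionally, the processing module 503 may be specifically configured to perform signal fusion processing on the M speech enhancement signals to obtain the first fusion signal. The acquisition module 501 may also be configured to acquire the field of view and shooting direction of the camera. The processing module 503 may be specifically configured to perform noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera acquired by the acquisition module 501 to obtain the first recording signal, where the noise reduction processing is used to eliminate noise from outside the shooting range.
Optionally, the acquisition module 501 may be configured to, after the first recording signal is obtained, re-acquire the shooting focal length of the camera to obtain the second fusion signal, and re-acquire the field of view and shooting direction of the camera. The processing module 503 may also be configured to, when the target object changes, perform noise reduction processing on the second fusion signal according to the changed target object to obtain the second recording signal, where the target object includes at least one of the field of view and the shooting direction.
Optionally, the processing module 503 may also be configured to run the camera application before the shooting focal length of the camera is acquired. The processing module 503 may also be configured to set the shooting focal length of the camera to the initial focal length, set the field of view of the camera to the initial field of view, and set the gain of each voice path to the initial gain.
The embodiments of this application provide a recording device. Because the shooting focal length of the camera also changes when the distance between the sound source and the electronic device changes, the recording device sets the first voice path connected to the i-th microphone to a fixed gain and sets the second voice path connected to the i-th microphone to a variable gain that changes with the shooting focal length. As a result, when the shooting focal length increases, the gain of the second voice path increases to record signals from farther away, and when the shooting focal length decreases, the gain of the second voice path decreases to record signals from closer by. The recording device then enhances the voice signal by comparing the difference between the two signals, which improves the quality of the final fused voice signal and the shooting effect of the video.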
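The recording device in the embodiments of this application may be a device, or may be a component, an integrated circuit or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. For example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, an in-vehicle electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), and the non-mobile electronic device may be a server, a network attached storage (NAS) device, a personal computer (PC), a television (TV), a teller machine or a self-service machine, etc., which is not specifically limited in the embodiments of this application.
The recording device in the embodiments of this application may be a device with an operating system. The operating system may be the Android operating system, the iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of this application.
The recording device provided by the embodiments of this application can implement each process implemented by the method embodiments of FIG. 1 to FIG. 4. To avoid repetition, details are not repeated here.
Optionally, as shown in FIG. 6, an embodiment of this application further provides an electronic device 600, including a processor 601, a memory 602, and a program or instruction stored in the memory 602 and executable on the processor 601. When the program or instruction is executed by the processor 601, each process of the above recording method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
It should be noted that the electronic devices in the embodiments of this application include the mobile electronic devices and non-mobile electronic devices described above.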
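FIG. 7 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of this application.
The electronic device 700 includes but is not limited to components such as a radio frequency unit 701, a network module 702, an audio output unit 703, an input unit 704, a sensor 705, a display unit 706, a user input unit 707, an interface unit 708, a memory 709 and a processor 710.
A person skilled in the art can understand that the electronic device 700 may also include a power source (such as a battery) that supplies power to the various components. The power source may be logically connected to the processor 710 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system. The electronic device includes M microphones, and each microphone is connected to a first voice path and a second voice path, where M is an integer greater than or equal to 2. The structure of the electronic device shown in FIG. 7 does not limit the electronic device; the electronic device may include more or fewer components than shown, or combine some components, or use a different arrangement of components, which is not repeated here.
The processor 710 may be configured to acquire the shooting focal length of the camera while in a video shooting state. The processor 710 may also be configured to, if the shooting focal length acquired by the processor 710 changes from the initial focal length to the target focal length, determine the target gain according to the initial focal length, the target focal length and the initial gain, where the initial gain is the gain of the first voice path connected to the i-th microphone. The processor 710 may also be configured to adjust the gain of the second voice path connected to the i-th microphone to the target gain; perform signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain the i-th speech enhancement signal; and perform signal fusion processing on the M speech enhancement signals to obtain the first recording signal. Here, i takes the values 1, 2, ..., M in turn.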
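Optionally, the processor 710 may be specifically configured to: if the target focal length is greater than the initial focal length, take the sum of the initial gain and the first gain as the target gain; or, if the target focal length is less than the initial focal length, take the difference between the initial gain and the first gain as the target gain. The first gain is the product of a target value and a preset value, and the target value is the absolute value of the difference between the target focal length and the initial focal length.
Optionally, the processor 710 may be specifically configured to: perform signal fusion processing on the M speech enhancement signals to obtain the first fusion signal; acquire the field of view and shooting direction of the camera; and perform noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera to obtain the first recording signal, where the noise reduction processing is used to eliminate noise from outside the shooting range.
Optionally, the processor 710 may be configured to: after the first recording signal is obtained, re-acquire the shooting focal length of the camera to obtain the second fusion signal; re-acquire the field of view and shooting direction of the camera; and, when the target object changes, perform noise reduction processing on the second fusion signal according to the changed target object to obtain the second recording signal. The target object includes at least one of the field of view and the shooting direction.
Optionally, the processor 710 may also be configured to run the camera application before acquiring the shooting focal length of the camera; and to set the shooting focal length of the camera to the initial focal length, set the field of view of the camera to the initial field of view, and set the gain of each voice path to the initial gain.
The embodiments of this application provide an electronic device. Because the shooting focal length of the camera of the electronic device also changes when the distance between the sound source and the electronic device changes, the electronic device sets the first voice path connected to the i-th microphone to a fixed gain and sets the second voice path connected to the i-th microphone to a variable gain that changes with the shooting focal length. As a result, when the shooting focal length increases, the gain of the second voice path increases to record signals from farther away, and when the shooting focal length decreases, the gain of the second voice path decreases to record signals from closer by. The electronic device then enhances the voice signal by comparing the difference between the two signals, which improves the quality of the final fused voice signal and the shooting effect of the video.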
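It should be understood that, in the embodiments of this application, the input unit 704 may include a graphics processing unit (GPU) 7041 and a microphone 7042. The graphics processing unit 7041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 706 may include a display panel 7061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 707 includes a touch panel 7071 and other input devices 7072. The touch panel 7071 is also called a touch screen and may include two parts: a touch detection device and a touch controller. The other input devices 7072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse and a joystick, which are not described in detail here. The memory 709 may be used to store software programs and various data, including but not limited to application programs and an operating system. The processor 710 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 710.
An embodiment of this application further provides a readable storage medium on which a program or instruction is stored. When the program or instruction is executed by a processor, each process of the above recording method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
The processor is the processor in the electronic device in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
An embodiment of this application further provides a chip. The chip includes a processor and a communication interface coupled to the processor, and the processor is used to run a program or instruction to implement each process of the above recording method embodiment, with the same technical effect. To avoid repetition, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, a system-on-chip, or the like.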
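It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element. In addition, it should be pointed out that the scope of the methods and devices in the embodiments of this application is not limited to performing functions in the order shown or discussed; it may also include performing functions in a substantially simultaneous manner or in the reverse order depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted or combined. Furthermore, features described with reference to some examples may be combined in other examples.
From the description of the above embodiments, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods of the various embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Inspired by this application, a person of ordinary skill in the art can devise many other forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.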

Claims (13)

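  1. A recording method for an electronic device, the electronic device including M microphones, each microphone being connected to a first voice path and a second voice path, M being an integer greater than or equal to 2, the method including:
    acquiring the shooting focal length of a camera while in a video shooting state;
    if the shooting focal length of the camera changes from an initial focal length to a target focal length, determining a target gain according to the initial focal length, the target focal length and an initial gain, and adjusting the gain of the second voice path connected to the i-th microphone to the target gain, where the initial gain is the gain of the first voice path connected to the i-th microphone, and i takes the values 1, 2, ..., M in turn;
    performing signal enhancement processing on the voice signal output by the first voice path connected to the i-th microphone and the voice signal output by the second voice path connected to the i-th microphone to obtain an i-th speech enhancement signal;
    performing signal fusion processing on the M speech enhancement signals to obtain a first recording signal.
  2. The method according to claim 1, wherein the determining a target gain according to the initial focal length, the target focal length and an initial gain includes:
    if the target focal length is greater than the initial focal length, taking the sum of the initial gain and a first gain as the target gain; or,
    if the target focal length is less than the initial focal length, taking the difference between the initial gain and the first gain as the target gain;
    wherein the first gain is the product of a target value and a preset value, and the target value is the absolute value of the difference between the target focal length and the initial focal length.
  3. The method according to claim 1 or 2, wherein the performing signal fusion processing on the M speech enhancement signals to obtain a first recording signal includes:
    performing signal fusion processing on the M speech enhancement signals to obtain a first fusion signal;
    acquiring the field of view and shooting direction of the camera;
    performing noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera to obtain the first recording signal, the noise reduction processing being used to eliminate noise from outside the shooting range.
  4. The method according to claim 3, wherein after the performing noise reduction processing on the first fusion signal according to the field of view and shooting direction of the camera to obtain the first recording signal, the method further includes:
    re-acquiring the shooting focal length of the camera to obtain a second fusion signal;
    re-acquiring the field of view and shooting direction of the camera;
    when a target object changes, performing noise reduction processing on the second fusion signal according to the changed target object to obtain a second recording signal, wherein the target object includes at least one of the field of view and the shooting direction.
  5. The method according to claim 1 or 2, wherein before the acquiring the shooting focal length of a camera while in a video shooting state, the method further includes:
    running a camera application;
    setting the shooting focal length of the camera to the initial focal length, setting the field of view of the camera to an initial field of view, and setting the gain of each voice path to the initial gain.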
  6. 一种录音装置,其中,所述录音装置包括M个麦克风,且每个麦克风与一个第一语音通路和一个第二语音通路连接,M为大于或等于2的整数,所述录音装置包括获取模块、确定模块和处理模块;
    所述获取模块,用于在处于视频拍摄状态的情况下,获取摄像头的拍摄焦距;
    所述确定模块,用于若所述获取模块获取的所述拍摄焦距由初始焦距变化为目标焦距,则根据所述初始焦距、所述目标焦距和初始增益,确定目标增益,所述初始增益为与第i个麦克风连接的所述第一语音通路的增益;
    所述处理模块,用于将与第i个麦克风连接的第二语音通路的增益调整为所述确定模块确定的所述目标增益;并对与第i个麦克风连接的第一语音通路输出的语音信号和与第i个麦克风连接的第二语音通路输出的语音信号进行信号增强处理,得到第i个语音增强信号;以及对M个语音增强信号进行信号融合处理,得到第一录音信号;
    其中,i依次取值为1、2……M。
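  7. The recording apparatus according to claim 6, wherein the determination module is specifically configured to:
    if the target focal length is greater than the initial focal length, take the sum of the initial gain and a first gain as the target gain; or,
    if the target focal length is less than the initial focal length, take the difference between the initial gain and the first gain as the target gain;
    wherein the first gain is the product of a target value and a preset value, and the target value is the absolute value of the difference between the target focal length and the initial focal length.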
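  8. The recording apparatus according to claim 6 or 7, wherein the processing module is specifically configured to perform signal fusion processing on the M voice enhancement signals to obtain a first fused signal;
    the acquisition module is further configured to acquire a field of view and a shooting direction of the camera;
    the processing module is specifically configured to perform noise reduction processing on the first fused signal according to the field of view and the shooting direction of the camera acquired by the acquisition module, to obtain the first recording signal, wherein the noise reduction processing is used to eliminate noise coming from outside the shooting range.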
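  9. The recording apparatus according to claim 8, wherein the acquisition module is configured to, after the first recording signal is obtained, re-acquire the shooting focal length of the camera to obtain a second fused signal, and re-acquire the field of view and the shooting direction of the camera;
    the processing module is further configured to, when a target object has changed, perform noise reduction processing on the second fused signal according to the changed target object to obtain a second recording signal, wherein the target object comprises at least one of the field of view and the shooting direction.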
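  10. The recording apparatus according to claim 6 or 7, wherein the processing module is further configured to run a camera application before the shooting focal length of the camera is acquired;
    the processing module is further configured to set the shooting focal length of the camera to the initial focal length, set the field of view of the camera to an initial field of view, and set the gain of each voice path to the initial gain.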
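  11. An electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the recording method according to any one of claims 1 to 5.
  12. An electronic device, configured to perform the recording method according to any one of claims 1 to 5.
  13. A readable storage medium, wherein a computer program is stored on the readable storage medium, and the computer program, when executed by a processor, implements the steps of the recording method according to any one of claims 1 to 4.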
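
The gain rule recited in claim 2 and the per-microphone flow of claim 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the function names are hypothetical, gains are assumed to be expressed in decibels (the claims do not state a unit), and simple averaging is used as a stand-in for the "signal enhancement" and "signal fusion" steps, which the claims leave unspecified.

```python
from typing import Sequence


def compute_target_gain(initial_focal: float, target_focal: float,
                        initial_gain: float, preset: float) -> float:
    """Target gain for the second voice path, following the rule in claim 2."""
    # First gain = |target focal length - initial focal length| * preset value.
    first_gain = abs(target_focal - initial_focal) * preset
    if target_focal > initial_focal:
        return initial_gain + first_gain  # longer focal length: add the first gain
    if target_focal < initial_focal:
        return initial_gain - first_gain  # shorter focal length: subtract it
    return initial_gain  # focal length unchanged: keep the initial gain


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to an amplitude factor (assumption: gains are in dB)."""
    return 10.0 ** (gain_db / 20.0)


def record_sample(mic_samples: Sequence[float], initial_focal: float,
                  target_focal: float, initial_gain: float,
                  preset: float) -> float:
    """Process one sample per microphone and return one fused output sample."""
    if len(mic_samples) < 2:
        raise ValueError("claim 1 requires M >= 2 microphones")
    target_gain = compute_target_gain(initial_focal, target_focal,
                                      initial_gain, preset)
    enhanced = []
    for x in mic_samples:
        first_path = db_to_linear(initial_gain) * x   # first path keeps the initial gain
        second_path = db_to_linear(target_gain) * x   # second path uses the target gain
        # Hypothetical stand-in for "signal enhancement": average the two paths.
        enhanced.append((first_path + second_path) / 2.0)
    # Hypothetical stand-in for "signal fusion": average over the M microphones.
    return sum(enhanced) / len(enhanced)
```

With an initial focal length of 1.0, a target focal length of 3.0, an initial gain of 10.0, and a preset value of 2.0, `compute_target_gain` returns 14.0; swapping the two focal lengths gives 6.0. Keeping the first path at the initial gain while only the second path follows the focal length mirrors the two-path structure of claim 1.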
PCT/CN2021/109323 2020-07-31 2021-07-29 Recording method and recording apparatus for an electronic device WO2022022647A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010760783.9 2020-07-31
CN202010760783.9A CN111916102B (zh) 2020-07-31 2020-07-31 Recording method and recording apparatus for an electronic device

Publications (1)

Publication Number Publication Date
WO2022022647A1 (zh) 2022-02-03

Family

ID=73287363

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109323 WO2022022647A1 (zh) 2020-07-31 2021-07-29 Recording method and recording apparatus for an electronic device

Country Status (2)

Country Link
CN (1) CN111916102B (zh)
WO (1) WO2022022647A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
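CN111916102B (zh) * 2020-07-31 2024-05-28 维沃移动通信有限公司 Recording method and recording apparatus for an electronic device
CN112492430B (zh) * 2020-12-17 2023-12-15 维沃移动通信有限公司 Electronic device and recording method for an electronic device
CN112689221B (zh) * 2020-12-18 2023-05-30 Oppo广东移动通信有限公司 Recording method, recording apparatus, electronic device, and computer-readable storage medium
CN113099031B (zh) * 2021-02-26 2022-05-17 华为技术有限公司 Sound recording method and related device
CN113472943B (zh) * 2021-06-30 2022-12-09 维沃移动通信有限公司 Audio processing method, apparatus, device, and storage medium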

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510425A (zh) * 2008-02-15 2009-08-19 株式会社东芝 Voice recognition apparatus and method for performing voice recognition
CN103079148A (zh) * 2012-12-28 2013-05-01 中兴通讯股份有限公司 Method and apparatus for dual-microphone noise reduction in a terminal
CN103888703A (zh) * 2014-03-28 2014-06-25 深圳市中兴移动通信有限公司 Shooting method with enhanced recording and camera apparatus
CN104376847A (zh) * 2013-08-12 2015-02-25 联想(北京)有限公司 Voice signal processing method and apparatus
CN104699445A (zh) * 2013-12-06 2015-06-10 华为技术有限公司 Audio information processing method and apparatus
CN106713793A (zh) * 2015-11-18 2017-05-24 天津三星电子有限公司 Sound playback control method and apparatus
CN106774882A (zh) * 2012-09-17 2017-05-31 联想(北京)有限公司 Information processing method and electronic device
CN107197090A (zh) * 2017-05-18 2017-09-22 维沃移动通信有限公司 Voice signal receiving method and mobile terminal
EP3373037A1 (en) * 2017-03-10 2018-09-12 The Hi-Tech Robotic Systemz Ltd Single casing advanced driver assistance system
CN109313904A (zh) * 2016-05-30 2019-02-05 索尼公司 Video/audio processing device, video/audio processing method, and program
CN110970057A (zh) * 2018-09-29 2020-04-07 华为技术有限公司 Sound processing method, apparatus, and device
CN111050269A (zh) * 2018-10-15 2020-04-21 华为技术有限公司 Audio processing method and electronic device
CN111385728A (zh) * 2018-12-29 2020-07-07 华为技术有限公司 Audio signal processing method and apparatus
CN111916102A (zh) * 2020-07-31 2020-11-10 维沃移动通信有限公司 Recording method and recording apparatus for an electronic device
US10923124B2 (en) * 2013-05-24 2021-02-16 Google Llc Method and apparatus for using image data to aid voice recognition

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107197187A (zh) * 2017-05-27 2017-09-22 维沃移动通信有限公司 Video shooting method and mobile terminal
CN110995909B (zh) * 2019-11-20 2021-03-30 维沃移动通信有限公司 Sound compensation method and apparatus

Also Published As

Publication number Publication date
CN111916102B (zh) 2024-05-28
CN111916102A (zh) 2020-11-10

Similar Documents

Publication Publication Date Title
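WO2022022647A1 (zh) Recording method and recording apparatus for an electronic device
CN110970057B (zh) Sound processing method, apparatus, and device
CN107105367B (zh) Audio signal processing method and terminal
US20160227336A1 (en) Contextual Switching of Microphones
US20100150360A1 (en) Audio source localization system and method
US10461712B1 (en) Automatic volume leveling
CN112291672B (zh) Loudspeaker control method, control apparatus, and electronic device
CN110390953B (zh) Method, apparatus, terminal, and storage medium for detecting howling voice signals
CN115831155A (zh) Audio signal processing method, apparatus, electronic device, and storage medium
US20230014836A1 (en) Method for chorus mixing, apparatus, electronic device and storage medium
CN111462764A (zh) Audio encoding method, apparatus, computer-readable storage medium, and device
WO2023151526A1 (zh) Audio capture method, apparatus, electronic device, and peripheral component
CN113160846A (zh) Noise suppression method and electronic device
CN112735370A (zh) Voice signal processing method, apparatus, electronic device, and storage medium
WO2023016053A1 (zh) Sound signal processing method and electronic device
CN113077808B (zh) Voice processing method and apparatus, and apparatus for voice processing
CN109348021B (zh) Mobile terminal and audio playback method
WO2016109103A1 (en) Directional audio capture
CN111508513A (zh) Audio processing method and apparatus, and computer storage medium
US11646046B2 (en) Psychoacoustic enhancement based on audio source directivity
CN114758669B (zh) Audio processing model training method, audio processing method, apparatus, and electronic device
WO2024077452A1 (zh) Audio processing method, apparatus, device, and storage medium
CN113450823B (zh) Audio-based scene recognition method, apparatus, device, and storage medium
CN116913328B (zh) Audio processing method, electronic device, and storage medium
CN113380248B (zh) Voice control method, apparatus, device, and storage medium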

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21850535

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21850535

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.08.2023)
