WO2021204027A1 - Microphone array control method, apparatus, electronic device, and computer storage medium

Info

Publication number
WO2021204027A1
Authority
WO
WIPO (PCT)
Prior art keywords
microphone
combination
electronic device
microphone array
target
Prior art date
Application number
PCT/CN2021/084099
Other languages
English (en)
French (fr)
Inventor
陈祥
孙渊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021204027A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2201/00 Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/40 Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • This application belongs to the field of audio processing technology, and in particular relates to a microphone array control method, device, electronic equipment, and computer storage medium.
  • Microphone Array refers to the arrangement of microphones.
  • the microphone array is composed of a certain number of microphones, which are used to sample and process the spatial characteristics of the sound field.
  • Microphone arrays include linear arrays, cross arrays, planar arrays, spiral arrays, spherical arrays, and random arrays.
  • the embodiments of the present application provide a microphone array control method, device, electronic equipment, and computer storage medium to solve the problem that the power consumption of the microphone array is relatively high and the battery life of the electronic equipment is short in the existing microphone array application solutions.
  • the problem is not limited to the above.
  • the first aspect of the embodiments of the present application provides a microphone array control method, including:
  • the electronic device acquires the voice signal collected by the microphone array
  • the electronic device uses a microphone in the target microphone combination in the microphone array to perform a sound pickup operation.
  • the electronic device obtains the voice signal collected by the microphone array, and when the voice signal meets the preset switching condition, the electronic device enters a low power consumption working state.
  • the electronic device determines the pickup position where the pickup operation needs to be performed according to the voice signal, and obtains the parameter set corresponding to each microphone combination in the microphone array.
  • the pickup performance of each parameter set at each position is determined. Therefore, the electronic device can use the parameter set that meets the preset performance index condition at the pickup position as the target parameter set, and the microphone combination corresponding to the target parameter set as the target microphone combination.
  • by using only the microphones in the target microphone combination, the electronic device reduces the number of microphones in the working state, lowers the power consumption of the microphone array, and extends the battery life of the electronic device while preserving the pickup performance of the microphone array.
  • the preset switching condition is that the voice signal triggers an application of a preset application type.
  • the preset switching condition is that within a preset duration, an increment of the number of interactions corresponding to the sound source position of the voice signal is greater than or equal to a preset number threshold.
  • when the electronic device acquires the voice signal collected by the microphone array, if the voice signal contains a human voice signal, it can be determined that the user or other electronic device that emitted the human voice signal has interacted once with the electronic device of this embodiment. The sound source position of the voice signal is determined, and the number of interactions corresponding to that sound source position is increased by one.
  • if the increment of the number of interactions corresponding to the sound source position of the voice signal is greater than or equal to the preset number threshold, the user has interacted with the electronic device multiple times at that position and may continue to interact with the electronic device from the same position.
  • the electronic device can enter a low power consumption working state, use the sound source position of the voice signal as the pickup position, and select the target microphone combination and apply it according to the pickup position.
  • the preset switching condition is that when the electronic device is awakened by the voice signal, the remaining power of the electronic device is lower than a preset power threshold.
  • when the voice signal acquired by the electronic device contains a wake-up word, the electronic device exits the sleep state and enters the working state.
  • the preset switching condition is that the spatial information corresponding to the voice signal indicates that an obstructed area exists.
  • the electronic device can generate a detection signal for detecting spatial information, for example by emitting a sound itself, and use the microphone array to collect the voice signal that is reflected when the detection signal reaches an object.
  • the electronic device analyzes the above-mentioned voice signal to obtain the spatial information of the electronic device and determine whether there is an obstructed area around the electronic device, for example, whether the electronic device is against a wall. If the electronic device is against a wall, the area where the wall is located is the obstructed area.
  • when the electronic device detects from the spatial information that an obstructed area exists, the obstructed area (such as the area where the wall is located) usually does not need a sound pickup operation, so the electronic device can determine the unobstructed area as the sound pickup position, and select and apply the target microphone combination according to that sound pickup position.
  • using the microphone combination corresponding to the parameter set that satisfies the preset performance index condition at the pickup position corresponding to the voice signal as the target microphone combination includes:
  • the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position is taken as the target microphone combination.
  • the electronic device can analyze the voice signal collected by the microphone array. If the voice signal contains noise, it means that there is a noise source in the environment around the electronic device, and the noise source will affect the sound pickup quality of the microphone array.
  • the electronic device can determine the sound source position of the noise as the non-pickup area, determine the area outside the non-pickup area as the pickup position, and select and apply the target microphone combination according to the pickup position, to reduce the impact of the noise source on the microphone array.
  • the electronic device turns off microphones other than the target microphone combination or puts the microphones other than the target microphone combination into a dormant state.
  • the electronic device can reduce the number of microphones in the working state by turning off the microphones other than the target microphone combination or putting them into the dormant state, thereby reducing the power consumption of the microphone array and improving the battery life of the electronic device.
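The disabling step described above can be illustrated with a minimal Python sketch. The function name `apply_target_combination`, the integer microphone identifiers, and the `"active"`/`"sleep"` state labels are hypothetical; the application does not specify how microphones are addressed or how the dormant state is signalled to hardware.

```python
def apply_target_combination(all_mics, target):
    """Return the desired state of every microphone in the array.

    Microphones in the target combination stay active; all others are
    put to sleep (a real device could power them off instead).
    """
    all_mics = list(all_mics)
    target = set(target)
    unknown = target - set(all_mics)
    if unknown:
        raise ValueError(f"target contains microphones not in the array: {sorted(unknown)}")
    return {mic: ("active" if mic in target else "sleep") for mic in all_mics}


# Example: a 6-microphone array where only microphones 0 and 3 keep working.
states = apply_target_combination(range(6), [0, 3])
```

Here four of the six microphones are taken out of the working state, which is the source of the power saving the text describes.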
  • each of the microphone combinations includes at least one microphone
  • the electronic device acquires the parameter set corresponding to each microphone combination in the microphone array, and uses the microphone combination corresponding to the parameter set meeting the preset performance index condition at the pickup position as the target microphone combination, including:
  • the electronic device uses the candidate microphone combination with the least number of microphones as the first microphone combination
  • the electronic device determines the first microphone combination as the target microphone combination.
  • the electronic device may use the microphone combination corresponding to the parameter set meeting the preset performance index condition at the pickup position as the candidate microphone combination.
  • the electronic device can directly determine the candidate microphone combination as the target microphone combination.
  • the electronic device can select the target microphone combination according to the number of microphones. The fewer the microphones, the lower the power consumption of the microphone array. Therefore, the electronic device can use the candidate microphone combination with the smallest number of microphones as the first microphone combination.
  • the first microphone combination can be directly determined as the target microphone combination.
  • the method further includes:
  • the electronic device uses the first microphone combination with the lowest CPU occupancy rate as the second microphone combination;
  • the electronic device determines the second microphone combination as the target microphone combination.
  • the electronic device can obtain the CPU occupancy rate of each first microphone combination, and use the microphone combination with the lowest CPU occupancy rate as the second microphone combination.
  • the electronic device can directly determine the second microphone combination as the target microphone combination.
  • the method further includes:
  • the electronic device determines the second microphone combination with the highest sound pickup performance at the sound pickup position as the target microphone combination.
  • the electronic device can obtain the pickup performance of each second microphone combination at the pickup position, and determine the second microphone combination with the highest pickup performance at the pickup position as the target microphone combination.
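The three-tier selection described above (fewest microphones, then lowest CPU occupancy rate, then highest pickup performance) can be sketched in Python as follows. This is an illustrative sketch only: the `Candidate` class, its field names, and the numeric scores are assumptions, and the candidates passed in are taken to have already met the preset performance index condition at the pickup position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate microphone combination (hypothetical representation)."""
    mic_ids: tuple        # identifiers of the microphones in the combination
    cpu_occupancy: float  # assumed measured CPU occupancy rate, 0.0 to 1.0
    pickup_score: float   # assumed pickup performance at the pickup position


def select_target(candidates):
    """Choose the target combination using the three tiers from the text."""
    if not candidates:
        raise ValueError("no combination meets the performance index condition")
    # Tier 1: the candidates with the fewest microphones are the "first" combinations.
    fewest = min(len(c.mic_ids) for c in candidates)
    first = [c for c in candidates if len(c.mic_ids) == fewest]
    if len(first) == 1:
        return first[0]
    # Tier 2: among those, the lowest CPU occupancy gives the "second" combinations.
    lowest_cpu = min(c.cpu_occupancy for c in first)
    second = [c for c in first if c.cpu_occupancy == lowest_cpu]
    if len(second) == 1:
        return second[0]
    # Tier 3: break the remaining tie by the highest pickup performance.
    return max(second, key=lambda c: c.pickup_score)


a = Candidate((0, 1), 0.10, 0.90)
b = Candidate((2, 3), 0.10, 0.95)
c = Candidate((0, 1, 2), 0.05, 0.99)
d = Candidate((1, 3), 0.20, 0.99)
target = select_target([a, b, c, d])  # b: two microphones, lowest CPU, best pickup
```

Ordering the tiers this way puts power consumption first: pickup performance only decides between combinations that are already equally cheap in microphone count and CPU load.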
  • a second aspect of the embodiments of the present application provides a microphone array control device, including:
  • the mode switching module is configured to, if the voice signal meets the preset switching condition, obtain the parameter set corresponding to each microphone combination in the microphone array, and use the microphone combination corresponding to the parameter set that meets the preset performance index condition at the sound pickup position corresponding to the voice signal as the target microphone combination;
  • the target application module is configured to use the microphones in the target microphone combination in the microphone array to perform sound pickup operations.
  • the preset switching condition is that the voice signal triggers an application of a preset application type.
  • the preset switching condition is that within a preset duration, an increase in the number of interactions corresponding to the sound source position of the voice signal is greater than or equal to a preset number threshold.
  • the preset switching condition is that when the electronic device is awakened by the voice signal, the remaining power of the electronic device is lower than a preset power threshold.
  • the mode switching module includes:
  • the noise source sub-module is configured to determine the sound source position of the noise as a non-pickup area if the voice signal contains noise, and determine the area outside the non-pickup area as a pickup position;
  • the target combination sub-module is configured to use the microphone combination corresponding to the parameter set meeting the preset performance index condition at the pickup position as the target microphone combination.
  • the device further includes:
  • the disabling module is used to turn off microphones other than the target microphone combination or make the microphones other than the target microphone combination enter a dormant state.
  • each of the microphone combinations includes at least one microphone
  • the mode switching module includes:
  • the candidate combination sub-module is configured to obtain the parameter set corresponding to each microphone combination in the microphone array, and use the microphone combination corresponding to the parameter set meeting preset performance index conditions at the pickup position corresponding to the voice signal as the candidate microphone combination;
  • the first combination sub-module is configured to, if the number of candidate microphone combinations is greater than 1, use the candidate microphone combination with the least number of microphones as the first microphone combination;
  • the first target sub-module is configured to determine the first microphone combination as the target microphone combination if the number of the first microphone combination is one.
  • the mode switching module further includes:
  • the second combination sub-module is configured to use the first microphone combination with the lowest CPU occupancy rate as the second microphone combination if the number of the first microphone combinations is greater than one;
  • the second target sub-module is configured to determine the second microphone combination as the target microphone combination if the number of the second microphone combination is one.
  • the mode switching module further includes:
  • the third target sub-module is configured to determine the second microphone combination with the highest sound pickup performance at the sound pickup position as the target microphone combination if the number of the second microphone combinations is greater than one.
  • the third aspect of the embodiments of the present application provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of the method described above are implemented.
  • the fourth aspect of the embodiments of the present application provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the foregoing method are implemented.
  • the fifth aspect of the embodiments of the present application provides a computer program product.
  • when the computer program product runs on an electronic device, the electronic device implements the steps of the above-mentioned method.
  • when the voice signal acquired by the electronic device meets the preset switching condition, the electronic device enters a low power consumption state, identifies the pickup position at which the pickup operation needs to be performed, selects the target microphone combination according to that pickup position, and controls the microphone array according to the target microphone combination.
  • during operation of the microphone array, only some of the microphones are used to collect voice signals from the pickup position in a targeted manner, which reduces the number of working microphones. The fewer microphones working in the microphone array, the lower the power consumption of the microphone array. Reducing the number of working microphones therefore effectively reduces the power consumption of the microphone array, improves the battery life of the electronic device, and solves the problem in existing microphone array application solutions that the power consumption of the microphone array is relatively high and the battery life of the electronic device is short.
  • FIG. 1 is a schematic flowchart of a method for controlling a microphone array according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of another application scenario provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a microphone array control device provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an electronic device provided by an embodiment of the present application.
  • the term “if” can be interpreted as “when” or “once” or “in response to determining” or “in response to detecting”, depending on the context.
  • the phrase “if determined” or “if [the described condition or event] is detected” can be interpreted, depending on the context, as “once determined” or “in response to determining” or “once [the described condition or event] is detected” or “in response to detecting [the described condition or event]”.
  • the microphone array control method provided in the embodiments of the present application can be applied to electronic devices.
  • the electronic device can be any device with a voice interaction function, including but not limited to smart speakers, smart home appliances, smart phones, tablet computers, in-vehicle devices, wearable devices, and augmented reality (AR) or virtual reality (VR) devices.
  • the microphone array control method provided in this application may be specifically stored in an electronic device in the form of an application program or software, and the electronic device implements the microphone array control method provided in this application by executing the application program or software.
  • a microphone is an energy conversion device that converts sound signals into electrical signals.
  • Microphone Array refers to the arrangement of microphones, that is, the microphone array is composed of a certain number of microphones, which are used to sample and process the spatial characteristics of the sound field.
  • Microphone arrays include linear arrays, cross arrays, planar arrays, spiral arrays, spherical arrays, and random arrays.
  • the number of elements of the microphone array (that is, the number of microphones) can range from 2 to thousands. Due to cost constraints, the number of elements of a consumer microphone array generally does not exceed 8. The most common ones on the market are an array of 6 microphones and an array of 4 microphones.
  • the microphone array consumes a lot of power when the sound is picked up.
  • the more elements in the microphone array, the greater the amount of data generated, the more complex the sound pickup algorithm, and the greater the power consumption. For example, with a single microphone, only the sound signal collected by that microphone needs to be processed, using the single-microphone model; with 4 microphones, the sound signals collected by all 4 microphones need to be processed, using the model corresponding to 4 microphones; with 6 microphones, the sound signals collected by all 6 microphones need to be processed, using the model corresponding to 6 microphones.
  • most current electronic devices have limited battery capacity, and the microphone array will greatly affect the battery life of these electronic devices.
  • the microphone array control method includes:
  • the electronic device acquires the voice signal collected by the microphone array
  • the microphone array has a signal collection area, which specifically refers to the area where the microphone array can collect voice signals.
  • the signal collection area of the microphone array may be a 180° spatial area in front of the TV;
  • the signal collection area of the microphone array includes a 360° spatial area surrounding the microphone array.
  • the electronic device obtains the parameter set corresponding to each microphone combination in the microphone array, and uses the microphone combination corresponding to the parameter set that meets the preset performance index condition at the sound pickup position corresponding to the voice signal as the target microphone combination;
  • some specific switching conditions can be set in advance.
  • when a preset switching condition is met, the electronic device executes steps S102 to S103 proposed in this solution and enters the low-power working state.
  • in the low-power working state, the electronic device identifies the pickup position at which the pickup operation needs to be performed, selectively turns microphones in the microphone array on or off, and uses only some of the microphones in the microphone array to perform the pickup operation, reducing the power consumption of the microphone array.
  • the preset switching conditions for entering the low-power working state and the recognition method of the pickup position can be set according to the actual situation, and the following description is combined with specific application scenarios.
  • Some applications require users and electronic devices to conduct multiple rounds of dialogue interaction, and no wake-up words are required during the interaction, such as idiom solitaire, family KTV and other applications.
  • the user's position is relatively stable, and the microphone array does not need to collect sound signals in all directions, but only needs to collect the voice signals of the area where the user is located.
  • the electronic device can enter a low power consumption working state, and only use some microphones to perform sound pickup operations, reducing the power consumption of the microphone array.
  • after detecting the application type of the called application, the electronic device can determine whether that application type is a preset application type. If it is, the electronic device determines the position of the sound source of the application call instruction, uses that position as the sound pickup position, and enters a low power consumption working state.
  • the initial state of the electronic device 201 is the standby state, and the wake-up word of the electronic device 201 is set to "Xiaoyi". At this time, the electronic device 201 can turn on all microphones in the microphone array to monitor surrounding voice signals.
  • when the electronic device 201 receives the voice signal "Xiaoyi Xiaoyi, turn on the idiom solitaire", which contains the wake-up word and the application call instruction, it exits the standby state and recognizes the application type in the voice signal.
  • the electronic device 201 recognizes that the application type of "Idiom Solitaire" belongs to the preset application type, which means that the user will interact with the electronic device 201 for multiple rounds of dialogue.
  • the sound source position of the voice signal is used as the pickup position, and the electronic device 201 enters a low-power working state, reducing the power consumption of the microphone array.
  • Scenario 2: When the electronic device acquires the voice signal collected by the microphone array, if the voice signal contains a human voice signal, it can be determined that the user or other electronic device emitting the human voice signal has interacted once with the electronic device of this embodiment. The sound source position of the voice signal is determined, and the number of interactions corresponding to that sound source position is increased by one. If a user or other electronic device performs multiple rounds of interaction with the electronic device of this embodiment in the same area within a unit time, so that the increase in the number of interactions corresponding to a certain sound source position reaches the preset number threshold, the user or other electronic device may continue to interact with the electronic device of this embodiment in the same area. For example, the user sits on a sofa and plays an interactive game with the electronic device of this embodiment.
  • the microphone array does not need to collect voice signals in all directions.
  • the electronic device can perform steps S102 to S103 proposed in this solution: it detects the sound source position of the human voice signal, uses that sound source position as the pickup position, and enters a low-power working state, which reduces the power consumption of the microphone array.
  • the length of the unit time and the preset threshold can be set according to actual needs. For example, the unit time length can be set to 3 minutes, and the preset number threshold can be set to 10 times. If the user interacts with the electronic device 10 times in the same area within 3 minutes, the electronic device enters a low power consumption working state.
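The interaction-count condition of Scenario 2 can be sketched as a sliding-window counter per sound source position. This Python sketch is an illustration under assumptions: the class name `InteractionCounter`, the use of a hashable position label (for example a quantized direction), and timestamps in seconds are all hypothetical; the defaults reuse the 3-minute window and threshold of 10 given in the example above.

```python
from collections import defaultdict, deque


class InteractionCounter:
    """Counts interactions per sound source position within a time window."""

    def __init__(self, window_s=180.0, threshold=10):
        self.window_s = window_s    # unit time length, in seconds
        self.threshold = threshold  # preset number threshold
        self._events = defaultdict(deque)

    def record(self, position, timestamp_s):
        """Record one interaction; return True if the switching condition is met."""
        events = self._events[position]
        events.append(timestamp_s)
        # Discard interactions that are older than the window.
        while events and timestamp_s - events[0] > self.window_s:
            events.popleft()
        return len(events) >= self.threshold


counter = InteractionCounter()
# Ten interactions from the same position, 10 seconds apart.
results = [counter.record("sofa", t * 10.0) for t in range(10)]
```

Only the tenth call returns `True`, at which point the device would treat `"sofa"` as the pickup position and enter the low power consumption working state.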
  • Scenario 3: The electronic device obtains the voice signal collected by the microphone array. If the voice signal contains a wake-up word, the electronic device can exit the dormant state, enter the working state, perform a self-check, and check the remaining power. If the remaining power of the electronic device is lower than the preset power threshold, the electronic device can identify the sound source position of the voice signal containing the wake-up word, determine that sound source position as the pickup position, and execute steps S102 to S103 proposed in this solution to enter a low power consumption working state and reduce the power consumption of the microphone array.
  • the initial state of the electronic device 301 is the standby state
  • the wake-up word of the electronic device 301 is set to "Xiaoyi"
  • the preset power threshold is 20%.
  • the electronic device 301 can turn on all microphones in the microphone array to monitor surrounding sound signals.
  • when the electronic device 301 receives the voice signal "Xiaoyi Xiaoyi" containing the wake-up word, it exits the standby state and performs a self-check on the device state. At this time, the electronic device 301 detects that only 15% of its power remains, which is lower than the preset power threshold of 20%. To extend its battery life, the electronic device 301 uses the sound source position of the collected voice signal as the pickup position, and executes steps S102 to S103 proposed in this solution to enter a low power consumption working state and reduce the power consumption of the microphone array.
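The Scenario 3 check can be written as a short Python sketch. The function name is hypothetical, and matching the wake-up word against a text transcript is a simplifying assumption standing in for whatever wake-word detector the device actually uses; the 20% threshold and the "Xiaoyi" wake-up word come from the example above.

```python
WAKE_WORD = "xiaoyi"  # wake-up word from the example, lower-cased for matching


def should_enter_low_power(transcript, remaining_power, power_threshold=0.20):
    """Return True if the device was woken up while its remaining power is low.

    transcript: recognized text of the voice signal (assumed to be available).
    remaining_power: battery level as a fraction between 0.0 and 1.0.
    """
    woken = WAKE_WORD in transcript.lower()
    return woken and remaining_power < power_threshold


low = should_enter_low_power("Xiaoyi Xiaoyi", 0.15)   # 15% is below the 20% threshold
full = should_enter_low_power("Xiaoyi Xiaoyi", 0.50)  # enough power remains
```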
  • the electronic device can also periodically or aperiodically detect the environmental information of the surrounding environment through spatial sensing technology and self-voicing methods.
  • the environmental information may include one or more of spatial information, environmental noise, and other information.
  • the electronic device 501 can periodically activate the microphone array to collect voice signals from the surrounding environment, perform noise detection on the collected voice signals, and detect whether there is noise in the environment.
  • one side of the electronic device 601 is against the wall, and a noise source 602 exists on the other side.
  • the electronic device 601 detects the surrounding environment information through spatial sensing technology, and detects the area where the wall is located and the area where the noise source 602 is located.
  • the electronic device 601 can use the area where the wall is located and the area where the noise source 602 is located as the non-pickup area, determine the other areas as the pickup area (the area marked with oblique lines in FIG. 6), and use that pickup area as the pickup position. It then performs steps S102 to S103 proposed in this solution to enter a low power consumption working state, reducing the power consumption of the microphone array and the interference from the strong noise source 602.
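One way to picture the exclusion of the wall and noise-source areas is to divide the 360° space around the device into fixed sectors and keep the sectors that do not overlap any non-pickup arc. The Python sketch below is illustrative only: the sector resolution, the `(start_deg, width_deg)` arc representation, and the angles in the example are assumptions, since the application does not describe the areas in FIG. 6 numerically.

```python
def pickup_sectors(blocked_arcs, sector_deg=30):
    """Return the start angles of sectors that overlap no blocked arc.

    blocked_arcs: list of (start_deg, width_deg) arcs; an arc may wrap past 360.
    sector_deg: angular size of each sector; should divide 360 evenly.
    """
    def is_blocked(sector_start):
        # Check every whole degree of the sector against every blocked arc.
        for deg in range(sector_start, sector_start + sector_deg):
            for start, width in blocked_arcs:
                if (deg - start) % 360 < width:
                    return True
        return False

    return [s for s in range(0, 360, sector_deg) if not is_blocked(s)]


# Hypothetical layout: a wall covering 180° to 360° and a noise source at 60° to 90°.
free = pickup_sectors([(180, 180), (60, 30)])
```

The remaining sectors would serve as the pickup position when choosing the target microphone combination.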
  • the sound pickup position described in this embodiment may be a specific orientation or a certain area, and the specific definition of the sound pickup position can be set according to actual conditions, which is not limited here.
  • each microphone combination includes at least one microphone.
  • Each microphone combination can be set with one or more parameter sets.
  • the parameter sets can include one or more parameters among the pickup direction, noise suppression parameters, automatic voice level adjustment parameters, adjustable pickup distance, and various thresholds. Once a parameter set is set, the sound pickup performance of the corresponding microphone combination in each area is also determined.
  • After the electronic device obtains the sound pickup position, it can first obtain the parameter set of each microphone combination. Some of these parameter sets meet the preset performance index conditions at the pickup position, and some do not. To ensure a good user experience, the electronic device should select from these parameter sets one that meets the preset performance index conditions at the pickup position, and use the microphone combination corresponding to that parameter set as the target microphone combination.
  • Area A, Area B, and Area C in Table 1 indicate the location of the sound source of the wake-up word
  • D1 is the volume of the sound source of the wake-up word
  • D2 and D3 indicate the volume of the noise source
  • D4 indicates the preset number of tests
  • Q1 to Q9 represent the number of wake-ups in different situations
  • P1 to P9 represent the wake-up rates corresponding to Q1 to Q9, respectively.
  • the electronic device can also update the test data according to the application data through self-learning methods such as machine learning to obtain more accurate sound pickup performance index data.
  • the electronic device 701 can select the microphone combination corresponding to parameter set A as the target microphone combination, or it may select the microphone combination corresponding to parameter set B or the microphone combination corresponding to parameter set C as the target microphone combination;
  • if the parameter sets that meet the preset performance index conditions include parameter set A and parameter set B, the electronic device 701 can select the microphone combination corresponding to either parameter set A or parameter set B as the target microphone combination;
  • if the only parameter set that meets the preset performance index conditions is parameter set C, the electronic device 701 can only select the microphone combination corresponding to parameter set C as the target microphone combination.
  • the pickup performance index corresponding to the preset performance index conditions can be set according to the actual situation, and can include one or more pickup performance indices such as the wake-up rate, the false wake-up rate, and the ASR (Automatic Speech Recognition) accuracy rate.
  • the preset performance index conditions can be set to a wake-up rate greater than 95%, and an ASR accuracy rate greater than 95%.
  • the microphones in the target microphone combination should be non-damaged microphones.
  • the theoretical pickup performance of certain microphone combinations at the pickup position can meet the preset performance index conditions.
  • the actual pickup performance of these microphone combinations at the pickup position may not meet the requirements, and these microphone combinations should not be listed as target microphone combinations.
  • the target microphone combination should be selected from the two microphone combinations of [Microphone 1, Microphone 5] and [Microphone 1, Microphone 3, Microphone 4, Microphone 5].
  • the electronic device can directly select the microphone combination as the target microphone combination.
  • the electronic device may use the microphone combination corresponding to the parameter set meeting the preset performance index condition at the pickup position as the candidate microphone combination, and then select the target microphone combination from the candidate microphone combinations according to the preset decision strategy.
  • the electronic device may determine the microphone combination with the least number of microphones among the candidate microphone combinations as the first microphone combination. Because the two microphone combinations [Microphone 1, Microphone 5] and [Microphone 2, Microphone 4] each contain 2 microphones, while the microphone combination [Microphone 1, Microphone 2, Microphone 4, Microphone 5] contains 4 microphones, the first microphone combinations are [microphone 1, microphone 5] and [microphone 2, microphone 4]. Since the number of first microphone combinations is greater than 1, the target microphone combination cannot be determined directly, and the electronic device can obtain the CPU occupancy rate of each first microphone combination.
  • the CPU occupancy rate refers to the ratio of the CPU time consumed by a process in a period of time to the length of that period. It should be understood that although each first microphone combination has the same number of microphones and the same microphone model is used to process the collected voice signals, the microphones in the first microphone combinations may be placed at different positions, which can result in a different CPU occupancy rate for each first microphone combination. In that case, the first microphone combination with the lowest CPU occupancy rate may be determined as the second microphone combination. Assuming that the CPU occupancy rates of the two microphone combinations [microphone 1, microphone 5] and [microphone 2, microphone 4] are the same, both [microphone 1, microphone 5] and [microphone 2, microphone 4] are used as second microphone combinations.
  • the number of second microphone combinations is greater than 1, and the target microphone combination cannot be directly determined.
  • the electronic device can determine the second microphone combination with the best sound pickup performance as the target microphone combination according to the preset sound pickup performance index. Assuming that the preset sound pickup performance index is the wake-up rate, that the wake-up rate of the microphone combination [microphone 1, microphone 5] at the pickup position is 95%, and that the wake-up rate of the microphone combination [microphone 2, microphone 4] at the pickup position is 96%, the microphone combination [microphone 2, microphone 4] is selected as the target microphone combination.
  • the CPU occupancy rate of different microphone combinations and the pickup performance of each microphone combination in each area can be obtained according to the test data before leaving the factory.
  • the electronic device 801 uses the sound source position of the detected human voice signal as the sound pickup position.
  • the electronic device 801 obtains the parameter set of each microphone combination, and uses the microphone combination corresponding to the parameter set meeting the preset performance index condition at the pickup position as a candidate microphone combination. After screening, the electronic device 801 determines the three microphone combinations [microphone 1, microphone 5], [microphone 2, microphone 4], and [microphone 1, microphone 2, microphone 4, microphone 5] (that is, the microphone combinations in the dashed boxes in Figure 8) as candidate microphone combinations.
  • the target microphone combination is selected according to a preset decision strategy.
  • in the preset decision strategy, the number of microphones, the CPU occupancy rate, and the sound pickup performance are used as decision factors, with the priority: number of microphones > CPU occupancy rate > sound pickup performance.
  • the microphone combination with the least number of microphones is first selected as the target microphone combination.
  • the CPU occupancy rate corresponding to each candidate microphone combination can be obtained, and the candidate microphone combination with the lowest CPU occupancy rate is selected as the target microphone combination.
  • the candidate microphone combination with the best pickup performance in the pickup direction can be selected as the target microphone combination.
  • the two microphone combinations [microphone 1, microphone 5] and [microphone 2, microphone 4] both use 2 microphones, while the microphone combination [microphone 1, microphone 2, microphone 4, microphone 5] uses 4 microphones. Therefore, the microphone combination [microphone 1, microphone 2, microphone 4, microphone 5] is excluded first according to the principle of reducing the number of microphones.
  • the electronic device can use the target microphone combination to perform sound pickup operations.
  • the electronic device can use the microphones in the target microphone combination to perform the sound pickup operation, and turn off the microphones other than the target microphone combination or put them into a dormant state. By using only part of the microphones for the sound pickup operation and selecting a suitable microphone model to process the collected voice signals, the electronic device has fewer voice signals to process, which reduces algorithm complexity, saves computing power, reduces power consumption, and improves the performance of the electronic device.
  • if [Microphone 2, Microphone 4] is selected as the target microphone combination, the electronic device enables microphone 2 and microphone 4 in the microphone array, turns off microphone 1, microphone 3, microphone 5, and microphone 6 or puts them into a dormant state, loads the 2-microphone model, and uses the 2-microphone model to process the voice signals collected by microphone 2 and microphone 4.
  • an instruction to exit the application issued by the user or other electronic device may be used as a triggering condition for exiting the low-power consumption working state. For example, if the user issues an application call instruction "Xiaoyi Xiaoyi, open idiom solitaire", the electronic device starts the "idiom solitaire” application and enters a low power consumption working state.
  • the electronic device closes the "idiom solitaire” application, and the electronic device exits the low-power consumption working state according to the instruction and enters the normal working state or the sleep state.
  • the trigger condition for exiting the low-power working state may be that, after entering the low-power working state, the electronic device does not receive a human voice signal from the user within a preset duration. For example, the unit time is set to 3 minutes, the preset number threshold is set to 10 times, and the preset duration is set to 3 minutes. The user sits on the sofa and plays an interactive game with the electronic device. If the number of interactions with the electronic device reaches 10 within 3 minutes, the electronic device enters a low power consumption working state. When the user does not interact with the electronic device for 3 minutes, because the user has left or for other reasons, the electronic device exits the low power consumption working state and enters the normal working state or the sleep state.
  • the preset duration can be set according to the actual situation.
  • the preset duration can be set to 1 minute, 3 minutes, 5 minutes, and so on.
  • the trigger condition for exiting the low-power operating state may be that the power of the electronic device is greater than or equal to the preset power threshold.
  • the preset power threshold is set to 20%.
  • When the electronic device is awakened, it detects that 15% of its power remains, which is less than the preset power threshold of 20%, and it then enters a low power consumption working state.
  • the power of the electronic device gradually recovers, and when the power of the electronic device is greater than or equal to 20%, it exits the low power consumption working state and enters the normal working state.
  • the target microphone combination used in the low-power working state can also be changed. For example, suppose that the electronic device enters a low power consumption working state because it is against a wall. When the position of the electronic device changes, if the electronic device is still against the wall but the direction of the wall changes, the electronic device will change the target microphone combination accordingly. Assuming that the electronic device enters a low power consumption working state due to a strong noise source, when the position of the strong noise source changes, the electronic device can change the target microphone combination used according to the changed position of the strong noise source. When the target microphone combination needs to be changed correspondingly according to changes in environmental factors, the method for selecting the target microphone combination can refer to the above-mentioned process of selecting the target microphone combination when entering the low power consumption working state.
  • An embodiment of the present application provides a microphone array control device. For ease of description, only parts related to the present application are shown. As shown in FIG. 9, the microphone array control device includes:
  • the mode switching module 902 is configured to, if the voice signal meets the preset switching condition, obtain the parameter set corresponding to each microphone combination in the microphone array, and use the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position corresponding to the voice signal as the target microphone combination;
  • the target application module 903 is configured to use the microphones in the target microphone combination in the microphone array to perform sound pickup operations.
  • the preset switching condition is that the voice signal triggers an application of a preset application type.
  • the preset switching condition is that within a preset duration, the increment of the number of interactions corresponding to the sound source position of the voice signal is greater than or equal to a preset threshold of the number of times.
  • the preset switching condition is that when the electronic device is awakened by the voice signal, the remaining power of the electronic device is lower than a preset power threshold.
  • the preset switching condition is that the spatial information corresponding to the voice signal is that there is an obstructed area.
  • the mode switching module 902 includes:
  • the noise source sub-module is configured to determine the sound source position of the noise as a non-pickup area if the voice signal contains noise, and determine the area outside the non-pickup area as a pickup position;
  • the target combination sub-module is configured to use the microphone combination corresponding to the parameter set meeting the preset performance index condition at the pickup position as the target microphone combination.
  • the device further includes:
  • the disabling module is used to turn off microphones other than the target microphone combination or make the microphones other than the target microphone combination enter a dormant state.
  • each of the microphone combinations includes at least one microphone
  • the mode switching module 902 includes:
  • the candidate combination sub-module is configured to obtain the parameter set corresponding to each microphone combination in the microphone array, and use the microphone combination corresponding to the parameter set meeting preset performance index conditions at the pickup position corresponding to the voice signal as the candidate microphone combination;
  • the first combination sub-module is configured to, if the number of candidate microphone combinations is greater than 1, use the candidate microphone combination with the least number of microphones as the first microphone combination;
  • the first target sub-module is configured to determine the first microphone combination as the target microphone combination if the number of the first microphone combination is one.
  • mode switching module 902 further includes:
  • the second combination sub-module is configured to use the first microphone combination with the lowest CPU occupancy rate as the second microphone combination if the number of the first microphone combinations is greater than one;
  • mode switching module 902 further includes:
  • the third target sub-module is configured to determine the second microphone combination with the highest sound pickup performance at the sound pickup position as the target microphone combination if the number of the second microphone combinations is greater than one.
  • the computer program 1002 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 1001 and executed by the processor 1000 to complete this application.
  • the one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 1002 in the electronic device 100.
  • the computer program 1002 can be divided into a signal acquisition module, a mode switching module, and a target application module. The specific functions of each module are as follows:
  • the signal acquisition module is used to acquire the voice signal collected by the microphone array
  • the mode switching module is configured to, if the voice signal meets the preset switching condition, obtain the parameter set corresponding to each microphone combination in the microphone array, and use the microphone combination corresponding to the parameter set that meets the preset performance index condition at the sound pickup position corresponding to the voice signal as the target microphone combination
  • the target application module is configured to use the microphones in the target microphone combination in the microphone array to perform sound pickup operations.
  • the electronic device 100 may be an electronic device equipped with a microphone array, such as a desktop computer, a notebook, a palmtop computer, and a smart speaker.
  • the electronic device may include, but is not limited to, a processor 1000 and a memory 1001.
  • FIG. 10 is only an example of the electronic device 100, and does not constitute a limitation on the electronic device 100. The electronic device may include more or fewer components than shown, or a combination of certain components, or different components.
  • the electronic device may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 1000 can be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 1001 may be an internal storage unit of the electronic device 100, such as a hard disk or a memory of the electronic device 100.
  • the memory 1001 may also be an external storage device of the electronic device 100, such as a plug-in hard disk equipped on the electronic device 100, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc.
  • the memory 1001 may also include both an internal storage unit of the electronic device 100 and an external storage device.
  • the memory 1001 is used to store the computer program and other programs and data required by the electronic device.
  • the memory 1001 can also be used to temporarily store data that has been output or will be output.
  • the disclosed device/electronic device and method may be implemented in other ways.
  • the device/electronic device embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the present application implements all or part of the processes in the above-mentioned embodiments and methods, and can also be completed by instructing relevant hardware through a computer program.
  • the computer program can be stored in a computer-readable storage medium. When the program is executed by the processor, it can implement the steps of the foregoing method embodiments.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, etc.
  • the content contained in the computer-readable medium can be appropriately added or deleted according to the requirements of the legislation and patent practice in the jurisdiction.
  • in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

Landscapes

  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

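This application pertains to the field of audio processing technologies and provides a microphone array control method and device, an electronic device, and a computer storage medium. In the method of this application, if a voice signal obtained by the electronic device meets a preset switching condition, the electronic device enters a low-power state, identifies a sound pickup position, selects a target microphone combination for that pickup position, and controls the microphone array according to the target microphone combination. During operation, the microphone array uses only some of its microphones to collect voice signals from the pickup position, which reduces the number of working microphones. The fewer microphones working in the array, the lower its power consumption. Reducing the number of working microphones therefore effectively lowers the power consumption of the microphone array and improves the battery life of the electronic device, which solves the problem in existing microphone array application solutions that the microphone array consumes considerable power and the battery life of the electronic device is short.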

Description

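Microphone Array Control Method and Device, Electronic Device, and Computer Storage Medium
This application claims priority to Chinese patent application No. 202010270470.5, filed with the China National Intellectual Property Administration on April 8, 2020 and entitled "Microphone Array Control Method and Device, Electronic Device, and Computer Storage Medium", which is incorporated herein by reference in its entirety.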
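Technical Field
This application belongs to the field of audio processing technologies, and in particular relates to a microphone array control method and device, an electronic device, and a computer storage medium.
Background
With the development of technology, many electronic devices are equipped with microphone arrays, and the microphone array has become an important hub for human-machine voice interaction.
A microphone array is an arrangement of microphones. It consists of a certain number of microphones and is used to sample and process the spatial characteristics of a sound field. Microphone arrays include linear arrays, cross-shaped arrays, planar arrays, spiral arrays, spherical arrays, and irregular arrays.
A microphone array consumes a great deal of power in the sound pickup state: the more array elements (that is, microphones) it has, the higher its power consumption. Most current electronic devices have limited battery capacity, so the use of a microphone array greatly affects their battery life.
Therefore, how to reduce the power consumption of the microphone array has become a technical problem that those skilled in the art urgently need to solve.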
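Summary
In view of this, the embodiments of this application provide a microphone array control method and device, an electronic device, and a computer storage medium, to solve the problem in existing microphone array application solutions that the microphone array consumes considerable power and the battery life of the electronic device is short.
A first aspect of the embodiments of this application provides a microphone array control method, including:
the electronic device obtains a voice signal collected by a microphone array;
if the voice signal meets a preset switching condition, the electronic device obtains the parameter set corresponding to each microphone combination in the microphone array, and uses the microphone combination corresponding to a parameter set that meets a preset performance index condition at the pickup position corresponding to the voice signal as the target microphone combination;
the electronic device uses the microphones in the target microphone combination in the microphone array to perform a sound pickup operation.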
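It should be noted that the electronic device obtains the voice signal collected by the microphone array, and when the voice signal meets the preset switching condition, the electronic device enters a low power consumption working state.
At this point, the electronic device determines, from the voice signal, the pickup position where the sound pickup operation needs to be performed, and obtains the parameter set corresponding to each microphone combination in the microphone array.
The sound pickup performance of each parameter set at each position is determined. Therefore, the electronic device can use a parameter set that meets the preset performance index condition at the pickup position as the target parameter set, and use the microphone combination corresponding to the target parameter set as the target microphone combination.
The electronic device then uses the microphones in the target microphone combination. While ensuring the sound pickup performance of the microphone array, this reduces the number of working microphones, lowers the power consumption of the microphone array, and improves the battery life of the electronic device.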
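The screening step described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the dictionary layout, the names `THRESHOLDS`, `PARAMETER_SETS`, and `qualifying_combinations`, the area label `area_a`, and all performance figures are hypothetical. The thresholds follow the example condition given elsewhere in this document (wake-up rate greater than 95% and ASR accuracy greater than 95%).

```python
# Assumed thresholds, following the example condition in this document:
# wake-up rate greater than 95% and ASR accuracy greater than 95%.
THRESHOLDS = {"wake_rate": 0.95, "asr_accuracy": 0.95}

# Hypothetical parameter sets; the performance figures are made up for illustration.
PARAMETER_SETS = [
    {"name": "A", "mics": (1, 5),
     "performance": {"area_a": {"wake_rate": 0.97, "asr_accuracy": 0.96}}},
    {"name": "B", "mics": (2, 4),
     "performance": {"area_a": {"wake_rate": 0.93, "asr_accuracy": 0.97}}},
    {"name": "C", "mics": (1, 2, 4, 5),
     "performance": {"area_a": {"wake_rate": 0.98, "asr_accuracy": 0.98}}},
]


def meets_condition(metrics, thresholds):
    """Return True if every index strictly exceeds its threshold."""
    return all(metrics.get(name, 0.0) > limit for name, limit in thresholds.items())


def qualifying_combinations(parameter_sets, pickup_position, thresholds):
    """Return the microphone combinations whose parameter set meets the
    performance condition at the given pickup position."""
    qualifying = []
    for parameter_set in parameter_sets:
        metrics = parameter_set["performance"].get(pickup_position)
        if metrics is not None and meets_condition(metrics, thresholds):
            qualifying.append(parameter_set["mics"])
    return qualifying


print(qualifying_combinations(PARAMETER_SETS, "area_a", THRESHOLDS))
```

In this sketch a parameter set with no recorded performance for the pickup position is treated as not qualifying, which is a conservative design choice rather than something the application specifies.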
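In a possible implementation of the first aspect, the preset switching condition is that the voice signal triggers an application of a preset application type.
It should be noted that some applications, such as idiom solitaire and home KTV, require multiple rounds of dialogue between the user and the electronic device, and no wake-up word is needed during the interaction. When the user uses such applications, the user's position is relatively stable, so the microphone array does not need to collect sound signals from all directions; it only needs to collect voice signals from the area where the user is located.
Therefore, when the electronic device detects that the application type corresponding to the application call instruction is a preset application type, the voice signal has triggered an application of the preset application type. The electronic device can then enter the low power consumption working state, use the sound source position of the voice signal as the pickup position, and select and apply a target microphone combination for that pickup position.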
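In another possible implementation of the first aspect, the preset switching condition is that, within a preset duration, the increment of the number of interactions corresponding to the sound source position of the voice signal is greater than or equal to a preset number threshold.
It should be noted that when the electronic device obtains the voice signal collected by the microphone array, if the voice signal contains a human voice signal, it can be determined that the user or other electronic device that produced the human voice signal has interacted once with the electronic device of this embodiment. The electronic device determines the sound source position of the voice signal and increases the number of interactions corresponding to that sound source position by 1.
If, within the preset duration, the increment of the number of interactions corresponding to the sound source position of the voice signal is greater than or equal to the preset number threshold, the user has interacted with the electronic device multiple times at that position and may continue to interact with the electronic device at the same position.
At this point, the electronic device can enter the low power consumption working state, use the sound source position of the voice signal as the pickup position, and select and apply a target microphone combination for that pickup position.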
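In another possible implementation of the first aspect, the preset switching condition is that, when the electronic device is awakened by the voice signal, the remaining power of the electronic device is lower than a preset power threshold.
It should be noted that when the voice signal obtained by the electronic device contains the wake-up word, the electronic device exits the sleep state and enters the working state.
At this point, the electronic device can perform a self-check and check its remaining power. If the remaining power of the electronic device is lower than the preset power threshold, the electronic device can enter the low power consumption working state, use the sound source position of the voice signal as the pickup position, and select and apply a target microphone combination for that pickup position.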
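In another possible implementation of the first aspect, the preset switching condition is that the spatial information corresponding to the voice signal indicates that an obstructed area exists.
It should be noted that the electronic device can generate a detection signal for detecting spatial information, for example by emitting sound itself, and collect through the microphone array the voice signal reflected after the detection signal reaches an object.
By analyzing this voice signal, the electronic device can obtain its spatial information and determine whether an obstructed area exists around it, for example whether the electronic device is against a wall. If the electronic device is against a wall, the area where the wall is located is an obstructed area.
When the electronic device detects that the spatial information indicates an obstructed area, a sound pickup operation is usually not needed in the obstructed area (for example, the area where the wall is located). Therefore, the electronic device can determine the unobstructed area as the pickup position, and select and apply a target microphone combination for that pickup position.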
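In another possible implementation of the first aspect, using the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position corresponding to the voice signal as the target microphone combination includes:
if the voice signal contains noise, the electronic device determines the sound source position of the noise as a non-pickup area, and determines the area outside the non-pickup area as the pickup position;
the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position is used as the target microphone combination.
It should be noted that the electronic device can analyze the voice signal collected by the microphone array. If the voice signal contains noise, a noise source exists in the environment around the electronic device, and the noise source affects the pickup quality of the microphone array.
At this point, the electronic device can determine the sound source position of the noise as a non-pickup area, determine the area outside the non-pickup area as the pickup position, and select and apply a target microphone combination for that pickup position, reducing the impact of the noise source on the microphone array.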
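One way to picture the division into a non-pickup area and a pickup position is to split the space around the device into angular sectors and remove the sectors that contain a noise source. The Python sketch below does exactly that. It is an assumption-laden illustration: the application does not specify how areas are represented, and the 30° sector width, the names `sector_of` and `pickup_sectors`, and the example bearing of 95° are all hypothetical. Obstructed areas such as a wall could be excluded in the same way.

```python
SECTOR_WIDTH = 30  # degrees per sector; an assumed resolution, not from the application


def sector_of(angle_deg):
    """Map a bearing in degrees to the index of the sector that contains it."""
    return int(angle_deg % 360) // SECTOR_WIDTH


def pickup_sectors(noise_angles_deg):
    """Return the sector indices left as the pickup position after the
    sectors containing noise sources are marked as the non-pickup area."""
    all_sectors = set(range(360 // SECTOR_WIDTH))
    non_pickup = {sector_of(angle) for angle in noise_angles_deg}
    return sorted(all_sectors - non_pickup)


# A single hypothetical noise source at a bearing of 95 degrees falls in sector 3.
print(pickup_sectors([95]))
```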
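In a possible implementation of the first aspect, the method further includes:
the electronic device turns off the microphones other than the target microphone combination or puts the microphones other than the target microphone combination into a dormant state.
It should be noted that, by turning off the microphones outside the target microphone combination or putting them into a dormant state, the electronic device reduces the number of working microphones, thereby lowering the power consumption of the microphone array and improving the battery life of the electronic device.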
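As a small illustration of this step, the sketch below computes which microphones stay on and which are turned off or put to sleep. The function name `plan_microphone_states` and the state labels are hypothetical; the example values follow the 6-microphone case in this document, where [microphone 2, microphone 4] is the target microphone combination.

```python
def plan_microphone_states(array_mics, target_mics, use_sleep=False):
    """Return a mapping from microphone index to its planned state.

    Microphones in the target combination stay "on"; every other
    microphone is either turned "off" or put to "sleep".
    """
    array = list(array_mics)
    target = set(target_mics)
    unknown = target - set(array)
    if unknown:
        raise ValueError(f"target microphones not in the array: {sorted(unknown)}")
    idle_state = "sleep" if use_sleep else "off"
    return {mic: ("on" if mic in target else idle_state) for mic in array}


states = plan_microphone_states(range(1, 7), (2, 4))
print(states)
```

Here microphones 2 and 4 remain on, and microphones 1, 3, 5, and 6 are turned off; passing `use_sleep=True` plans a dormant state for them instead.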
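In a possible implementation of the first aspect, each microphone combination includes at least one microphone;
the electronic device obtaining the parameter set corresponding to each microphone combination in the microphone array, and using the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position as the target microphone combination, includes:
the electronic device obtains the parameter set corresponding to each microphone combination in the microphone array, and uses the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position as a candidate microphone combination;
if the number of candidate microphone combinations is greater than 1, the electronic device uses the candidate microphone combination with the fewest microphones as the first microphone combination;
if the number of first microphone combinations is 1, the electronic device determines the first microphone combination as the target microphone combination.
It should be noted that the electronic device can use the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position as a candidate microphone combination.
There may be one or more candidate microphone combinations. When there is only one candidate microphone combination, the electronic device can directly determine it as the target microphone combination.
When there are multiple candidate microphone combinations, the electronic device can select the target microphone combination according to the number of microphones. Because fewer microphones means lower power consumption of the microphone array, the electronic device can use the candidate microphone combination with the fewest microphones as the first microphone combination.
If there is only one first microphone combination, it can be directly determined as the target microphone combination.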
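In a possible implementation of the first aspect, the method further includes:
if the number of first microphone combinations is greater than 1, the electronic device uses the first microphone combination with the lowest central processing unit (CPU) occupancy rate as the second microphone combination;
if the number of second microphone combinations is 1, the electronic device determines the second microphone combination as the target microphone combination.
It should be noted that if there are multiple first microphone combinations, the electronic device can obtain the CPU occupancy rate of each first microphone combination and use the one with the lowest CPU occupancy rate as the second microphone combination.
If there is only one second microphone combination, the electronic device can directly determine it as the target microphone combination.
In a possible implementation of the first aspect, the method further includes:
if the number of second microphone combinations is greater than 1, the electronic device determines the second microphone combination with the highest sound pickup performance at the pickup position as the target microphone combination.
It should be noted that if there are multiple second microphone combinations, the electronic device can obtain the sound pickup performance of each second microphone combination at the pickup position, and determine the second microphone combination with the highest sound pickup performance at the pickup position as the target microphone combination.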
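The three-stage selection described above (fewest microphones, then lowest CPU occupancy rate, then best sound pickup performance) can be sketched as follows. This is a minimal sketch, not the applicant's implementation: the `Candidate` class, the `select_target` function, and the CPU figures are hypothetical. The wake-up rates of 95% and 96% follow the worked example in this document, in which both 2-microphone combinations have the same CPU occupancy rate; the wake-up rate of the 4-microphone combination is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate microphone combination (field names are illustrative)."""
    mics: tuple        # microphone indices, e.g. (1, 5)
    cpu_usage: float   # CPU occupancy rate, assumed to come from test data
    wake_rate: float   # sound pickup performance at the pickup position


def select_target(candidates):
    """Select the target microphone combination by priority:
    number of microphones > CPU occupancy rate > sound pickup performance."""
    if not candidates:
        raise ValueError("no candidate microphone combination")
    # Stage 1: keep the candidates with the fewest microphones.
    fewest = min(len(c.mics) for c in candidates)
    first = [c for c in candidates if len(c.mics) == fewest]
    if len(first) == 1:
        return first[0]
    # Stage 2: among those, keep the ones with the lowest CPU occupancy rate.
    lowest = min(c.cpu_usage for c in first)
    second = [c for c in first if c.cpu_usage == lowest]
    if len(second) == 1:
        return second[0]
    # Stage 3: among those, pick the best pickup performance.
    return max(second, key=lambda c: c.wake_rate)


candidates = [
    Candidate((1, 5), cpu_usage=0.10, wake_rate=0.95),
    Candidate((2, 4), cpu_usage=0.10, wake_rate=0.96),
    Candidate((1, 2, 4, 5), cpu_usage=0.18, wake_rate=0.97),
]
target = select_target(candidates)
print(target.mics)
```

With these inputs the 4-microphone combination is dropped in stage 1, the two 2-microphone combinations tie in stage 2, and stage 3 picks (2, 4) because of its higher wake-up rate.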
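A second aspect of the embodiments of this application provides a microphone array control device, including:
a signal acquisition module, configured to acquire the voice signal collected by the microphone array;
a mode switching module, configured to, if the voice signal meets the preset switching condition, obtain the parameter set corresponding to each microphone combination in the microphone array, and use the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position corresponding to the voice signal as the target microphone combination;
a target application module, configured to use the microphones in the target microphone combination in the microphone array to perform the sound pickup operation.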
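In a possible implementation of the second aspect, the preset switching condition is that the voice signal triggers an application of a preset application type.
In another possible implementation of the second aspect, the preset switching condition is that, within a preset duration, the increment of the number of interactions corresponding to the sound source position of the voice signal is greater than or equal to a preset number threshold.
In another possible implementation of the second aspect, the preset switching condition is that, when the electronic device is awakened by the voice signal, the remaining power of the electronic device is lower than a preset power threshold.
In another possible implementation of the second aspect, the preset switching condition is that the spatial information corresponding to the voice signal indicates that an obstructed area exists.
In another possible implementation of the second aspect, the mode switching module includes:
a noise source sub-module, configured to, if the voice signal contains noise, determine the sound source position of the noise as a non-pickup area and determine the area outside the non-pickup area as the pickup position;
a target combination sub-module, configured to use the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position as the target microphone combination.
In a possible implementation of the second aspect, the device further includes:
a disabling module, configured to turn off the microphones other than the target microphone combination or put the microphones other than the target microphone combination into a dormant state.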
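In a possible implementation of the second aspect, each microphone combination includes at least one microphone;
the mode switching module includes:
a candidate combination sub-module, configured to obtain the parameter set corresponding to each microphone combination in the microphone array, and use the microphone combination corresponding to the parameter set that meets the preset performance index condition at the pickup position corresponding to the voice signal as a candidate microphone combination;
a first combination sub-module, configured to, if the number of candidate microphone combinations is greater than 1, use the candidate microphone combination with the fewest microphones as the first microphone combination;
a first target sub-module, configured to, if the number of first microphone combinations is 1, determine the first microphone combination as the target microphone combination.
In a possible implementation of the second aspect, the mode switching module further includes:
a second combination sub-module, configured to, if the number of first microphone combinations is greater than 1, use the first microphone combination with the lowest CPU occupancy rate as the second microphone combination;
a second target sub-module, configured to, if the number of second microphone combinations is 1, determine the second microphone combination as the target microphone combination.
In a possible implementation of the second aspect, the mode switching module further includes:
a third target sub-module, configured to, if the number of second microphone combinations is greater than 1, determine the second microphone combination with the highest sound pickup performance at the pickup position as the target microphone combination.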
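A third aspect of the embodiments of this application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the foregoing method when executing the computer program.
A fourth aspect of the embodiments of this application provides a computer-readable storage medium storing a computer program, where the computer program implements the steps of the foregoing method when executed by a processor.
A fifth aspect of the embodiments of this application provides a computer program product that, when run on an electronic device, causes the electronic device to implement the steps of the foregoing method.
Compared with the prior art, the embodiments of this application have the following beneficial effects:
In the microphone array control method of this application, when the voice signal obtained by the electronic device meets the preset switching condition, the electronic device enters a low-power state, identifies the pickup position where the sound pickup operation needs to be performed, selects a target microphone combination for that pickup position, and controls the microphone array according to the target microphone combination. During operation, the microphone array uses only some of its microphones to collect voice signals from the pickup position, which reduces the number of working microphones. The fewer microphones working in the array, the lower its power consumption. Reducing the number of working microphones therefore effectively lowers the power consumption of the microphone array and improves the battery life of the electronic device, which solves the problem in existing microphone array application solutions that the microphone array consumes considerable power and the battery life of the electronic device is short.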
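Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a microphone array control method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of this application;
FIG. 3 is a schematic diagram of another application scenario provided by an embodiment of this application;
FIG. 4 is a schematic diagram of another application scenario provided by an embodiment of this application;
FIG. 5 is a schematic diagram of another application scenario provided by an embodiment of this application;
FIG. 6 is a schematic diagram of another application scenario provided by an embodiment of this application;
FIG. 7 is a schematic diagram of another application scenario provided by an embodiment of this application;
FIG. 8 is a schematic diagram of another application scenario provided by an embodiment of this application;
FIG. 9 is a schematic diagram of a microphone array control device provided by an embodiment of this application;
FIG. 10 is a schematic diagram of an electronic device provided by an embodiment of this application.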
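Detailed Description
In the following description, specific details such as particular system structures and technologies are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of this application. However, it should be clear to those skilled in the art that this application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary details do not obscure the description of this application.
To illustrate the technical solutions described in this application, specific embodiments are described below.
It should be understood that, when used in this specification and the appended claims, the term "including" indicates the presence of the described features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terms used in this specification are only for the purpose of describing particular embodiments and are not intended to limit this application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
In addition, in the description of this application, the terms "first", "second", and so on are only used to distinguish descriptions and cannot be understood as indicating or implying relative importance.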
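The microphone array control method provided by the embodiments of this application can be applied to an electronic device. The electronic device may be any device with a voice interaction function, including but not limited to smart speakers, smart home appliances, smartphones, tablet computers, in-vehicle devices, wearable devices, and augmented reality (AR)/virtual reality (VR) devices that have a voice interaction function. The microphone array control method provided by this application may be stored in the electronic device in the form of an application or software, and the electronic device implements the method by executing that application or software.
A microphone is an energy conversion device that converts sound signals into electrical signals. A microphone array is an arrangement of microphones; that is, it consists of a certain number of microphones and is used to sample and process the spatial characteristics of a sound field. Microphone arrays include linear arrays, cross-shaped arrays, planar arrays, spiral arrays, spherical arrays, and irregular arrays. The number of array elements (that is, microphones) can range from 2 to more than a thousand. Because of cost constraints, consumer-grade microphone arrays generally have no more than 8 elements, and the most common on the market are 6-microphone and 4-microphone arrays.
A microphone array consumes a great deal of power in the sound pickup state. The more elements the array has, the more data it produces, the more complex the pickup algorithm becomes, and the more power is consumed. For example, when a single microphone is used for pickup, only the sound signal collected by that microphone needs to be processed, using a single-microphone model; when 2 microphones are used, the sound signals collected by 2 microphones need to be processed, using a dual-microphone model; when 4 microphones are used, the sound signals collected by 4 microphones need to be processed, using the model corresponding to 4 microphones; when 6 microphones are used, the sound signals collected by 6 microphones need to be processed, using the model corresponding to 6 microphones. Most current electronic devices have limited battery capacity, so the microphone array greatly affects their battery life.
In summary, in existing microphone array application solutions, the microphone array consumes considerable power, which greatly shortens the battery life of the electronic device. To solve this problem, the embodiments of this application provide a microphone array control method that identifies the pickup position where the sound pickup operation needs to be performed, selects a target microphone combination for that pickup position, and controls the microphone array according to the target microphone combination. During operation, the microphone array uses only some of its microphones to collect voice signals from the pickup position, which reduces the number of working microphones. The fewer microphones working in the array, the lower its power consumption. Reducing the number of working microphones therefore effectively lowers the power consumption of the microphone array and improves the battery life of the electronic device, which solves the problem in existing microphone array application solutions that the microphone array consumes considerable power and the battery life of the electronic device is short.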
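Next, the microphone array control method provided by the embodiments of this application is described in detail from the perspective of the electronic device. Referring to the flowchart of the microphone array control method shown in FIG. 1, the method includes:
S101. The electronic device obtains the voice signal collected by the microphone array;
A microphone array specifically refers to an arrangement of multiple microphones that can be used to sample and process the spatial characteristics of a sound field. In the embodiments of this application, microphone arrays include linear arrays, cross-shaped arrays, planar arrays, spiral arrays, spherical arrays, and irregular arrays. The number of array elements, that is, the number of microphones, can be set according to actual needs. As some specific examples of this application, the microphone array may be a 2-microphone array, a 4-microphone array, a 6-microphone array, or an 8-microphone array.
The microphone array has a signal collection area, which is the area from which the microphone array can collect voice signals. For example, for an electronic device such as a television, the signal collection area of its microphone array may be the 180° spatial region in front of the television; for a device such as a speaker, the signal collection area of its microphone array includes the 360° spatial region surrounding the microphone array.
It should be noted that the microphone array in the embodiments of this application can collect voice signals emitted by a sound source located in the signal collection area, and the sound source may be a user. In some possible implementations, the sound source may also be another electronic device. For example, in a smart home system, a smart TV can issue a voice instruction to turn on a speaker; the microphone array of the smart speaker can collect the voice signal issued by the smart TV and perform speech recognition based on it to determine whether to execute the voice instruction. In practice, there may be one or more sound sources, and the microphone array can simultaneously collect the voice signals emitted by one or more sound sources.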
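S102. If the voice signal satisfies a preset switching condition, the electronic device obtains the parameter set corresponding to each microphone combination in the microphone array, and takes as the target microphone combination the microphone combination whose parameter set satisfies a preset performance indicator condition at the pickup position corresponding to the voice signal.
When a microphone array is in use, the more microphones are working, the more power the array consumes. In practical scenarios, however, it is not always necessary to keep every microphone in the array on and collect voice signals from all directions.
Therefore, specific switching conditions can be set in advance. When the voice signal collected by the microphone array satisfies these preset switching conditions, the electronic device performs steps S102 to S103 of this solution and enters a low-power working state. In the low-power working state, the electronic device identifies the pickup position where pickup is needed, selectively turns microphones in the array on and off, and performs pickup with only some of the microphones, which reduces the power consumption of the array. The preset switching conditions for entering the low-power working state and the way the pickup position is identified can be set according to the actual situation. They are described below with reference to specific application scenarios.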
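Scenario 1: The electronic device obtains the voice signal collected by the microphone array. The voice signal may contain only an application invocation instruction, or both a wake-up word and an application invocation instruction. The electronic device switches from the standby state to the working state according to the wake-up word and parses the application invocation instruction to obtain the application type of the invoked application.
Some applications, such as Idiom Solitaire (成语接龙) and home KTV, require multiple rounds of dialogue between the user and the electronic device, and no wake-up word is needed during the interaction. When a user uses such an application, the user's position is relatively stable, so the microphone array does not need to collect sound signals from all directions and only needs to collect voice signals from the region where the user is located. The electronic device can then enter the low-power working state and perform pickup with only some of the microphones, which reduces the power consumption of the array.
After detecting the application type of the invoked application, the electronic device can determine whether it is a preset application type. If it is, the electronic device determines the position information of the sound source of the application invocation instruction, takes that position information as the pickup position, and enters the low-power working state.
Take the application scenario shown in FIG. 2 as an example. In this scenario, the initial state of the electronic device 201 is the standby state, and its wake-up word is set to "小艺" (Xiaoyi). In this state, the electronic device 201 may turn on all microphones in the microphone array to monitor surrounding voice signals.
When the electronic device 201 receives the voice signal "小艺小艺,打开成语接龙" ("Xiaoyi Xiaoyi, open Idiom Solitaire"), which contains both the wake-up word and an application invocation instruction, it exits the standby state and identifies the application type in the voice signal.
Alternatively, the electronic device 201 may first receive the voice signal "小艺小艺" containing the wake-up word and exit the standby state, and then receive the voice signal "打开成语接龙" ("open Idiom Solitaire") containing the application invocation instruction and identify the application type.
The electronic device 201 recognizes that the application type of Idiom Solitaire is a preset application type, which indicates that the user will engage in multiple rounds of dialogue with the electronic device 201. It therefore identifies the sound source position of the application invocation instruction, takes that sound source position as the pickup position, and enters the low-power working state to reduce the power consumption of the microphone array.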
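The check in Scenario 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: the text command is assumed to have already been produced by speech recognition, and the names `APP_TYPES`, `PRESET_APP_TYPES`, `Decision`, and `handle_utterance`, the type labels, and the azimuth representation of the sound source are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

WAKE_WORD = "小艺小艺"
# Hypothetical mapping from spoken application names to application types.
APP_TYPES = {"成语接龙": "multi_turn", "家庭KTV": "multi_turn", "天气": "single_turn"}
# Hypothetical set of preset application types that trigger low-power mode.
PRESET_APP_TYPES = {"multi_turn"}


@dataclass
class Decision:
    enter_low_power: bool
    pickup_position: Optional[float]  # assumed: source azimuth in degrees


def handle_utterance(text: str, source_azimuth: float) -> Decision:
    """Decide whether a recognized utterance should trigger low-power mode."""
    # Drop the wake-up word (if present) and surrounding punctuation.
    command = text.replace(WAKE_WORD, "").strip(",, ")
    if not command.startswith("打开"):
        return Decision(False, None)
    app_type = APP_TYPES.get(command[len("打开"):])
    if app_type in PRESET_APP_TYPES:
        # The source of the invocation instruction becomes the pickup position.
        return Decision(True, source_azimuth)
    return Decision(False, None)
```

The same function covers both variants described above, because the wake-up word is removed when present and ignored when absent.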
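Scenario 2: When the electronic device obtains the voice signal collected by the microphone array, if the voice signal contains a human voice signal, it may be determined that the user or other electronic device that emitted the human voice signal has interacted once with the electronic device of this embodiment. The sound source position of the voice signal is determined, and the interaction count for that sound source position is incremented by 1. If, within a unit of time, the user or other electronic device interacts with the electronic device of this embodiment several times from the same region, so that the increase in the interaction count for a sound source position reaches a preset count threshold, the user or other electronic device is likely to keep interacting with the electronic device of this embodiment from the same region. An example is a user sitting on a sofa and playing an interactive game with the electronic device of this embodiment.
In this case the microphone array does not need to collect voice signals from all directions. The electronic device can perform steps S102 to S103 of this solution, detect the sound source position of the human voice signal, take that sound source position as the pickup position, and enter the low-power working state to reduce the power consumption of the microphone array.
The length of the unit of time and the preset count threshold can be set according to actual requirements. For example, the unit of time may be set to 3 minutes and the preset count threshold to 10. If the user interacts with the electronic device 10 times from the same region within 3 minutes, the electronic device enters the low-power working state.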
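One way to count interactions per region within a unit of time is a sliding window over timestamps. The sketch below assumes that regions are identified by a label and that timestamps are given in seconds. The class name `InteractionCounter` and the sliding-window interpretation of "unit of time" are assumptions made for illustration. The defaults mirror the 3-minute, 10-interaction example above.

```python
from collections import defaultdict, deque


class InteractionCounter:
    """Tracks interactions per region inside a sliding time window."""

    def __init__(self, window_s: float = 180.0, threshold: int = 10):
        self.window_s = window_s
        self.threshold = threshold
        self._events = defaultdict(deque)  # region label -> timestamps

    def record(self, region: str, now: float) -> bool:
        """Record one interaction; return True if the threshold is reached."""
        timestamps = self._events[region]
        timestamps.append(now)
        # Discard interactions that fall outside the window.
        while timestamps and now - timestamps[0] > self.window_s:
            timestamps.popleft()
        return len(timestamps) >= self.threshold
```

With these defaults, ten interactions from the same region spaced 10 seconds apart make the tenth call return `True`, while interactions spaced a minute apart never do.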
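Scenario 3: The electronic device obtains the voice signal collected by the microphone array. If the voice signal contains the wake-up word, the electronic device can exit the sleep state, enter the working state, and run a self-check that includes checking the remaining battery level. If the remaining battery level is below a preset battery threshold, the electronic device can identify the sound source position of the voice signal containing the wake-up word, determine that sound source position as the pickup position, perform steps S102 to S103 of this solution, and enter the low-power working state to reduce the power consumption of the microphone array.
The preset battery threshold can be set according to the actual situation. For example, it may be set to 15%, 20%, or 25% of the total battery capacity.
Take the application scenario shown in FIG. 3 as an example. In this scenario, the initial state of the electronic device 301 is the standby state, its wake-up word is set to "小艺" (Xiaoyi), and the preset battery threshold is 20%. In the standby state, the electronic device 301 may turn on all microphones in the microphone array to monitor surrounding sound signals.
When the electronic device 301 receives the voice signal "小艺小艺" containing the wake-up word, it exits the standby state and runs a self-check of the device status. The electronic device 301 detects that only 15% of its battery remains, which is below the preset battery threshold of 20%. To extend battery life, the electronic device 301 takes the sound source position of the collected voice signal as the pickup position, performs steps S102 to S103 of this solution, and enters the low-power working state to reduce the power consumption of the microphone array.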
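Scenario 4: The electronic device may also use spatial sensing techniques, for example by emitting sound itself, to detect information about the surrounding environment periodically or aperiodically. The environment information may include one or more of spatial information, ambient noise, and similar information.
In this case the electronic device can use certain environment information as the trigger condition for entering the low-power working state. After power-on, the electronic device enables the microphone array to monitor the environment information. The electronic device obtains the voice signal collected by the microphone array. The voice signal may be produced by the electronic device's own sound emission, may be a noise signal emitted by a noise source in the environment, or may be another type of voice signal. The electronic device analyzes the voice signal to determine the environment information and then decides, based on the detected environment information, whether to enter the low-power working state. The content of the environment information can be set according to actual requirements. For example, it may be whether an occluded region (such as the region where a wall is located) exists around the electronic device, or whether a noise source exists around the electronic device.
Take the application scenarios shown in FIG. 4, FIG. 5, and FIG. 6 as examples. In the application scenario shown in FIG. 4, the electronic device 401 is against walls on two sides.
After power-on, the electronic device 401 emits a detection signal used to detect its spatial information and enables the microphone array to collect the voice signal reflected when the detection signal hits objects. The electronic device 401 identifies its spatial information from the voice signal collected by the microphone array. Because the voice signal received when the electronic device 401 is against a wall differs considerably from the signal received when it is not, the electronic device 401 can analyze the collected voice signal to determine whether it is against a wall and in which direction the wall lies.
Users and other electronic devices do not usually issue instructions to the electronic device 401 of this embodiment from the direction of a wall. Therefore, if the electronic device 401 is against a wall, it can determine the region where the wall is located as a non-pickup region, determine the region not against the wall (that is, the region outside the non-pickup region) as the pickup region (the hatched region in FIG. 4), take the pickup region as the pickup position, perform steps S102 to S103 of this solution, and enter the low-power working state to reduce the power consumption of the microphone array.
In the application scenario shown in FIG. 5, a noise source 502 is located on one side of the electronic device 501. After power-on, the electronic device 501 can periodically enable the microphone array to collect voice signals from the surrounding environment and run noise detection on the collected signals to determine whether noise is present in the environment.
If the electronic device 501 detects noise in the voice signal, it identifies the sound source position of the noise source 502. When the noise source 502 is present, using all microphones in the array would mean that the sound signals collected by some microphones contain a large amount of noise, which degrades the quality of the sound signal. Therefore, the sound source position of the noise (the hatched region in FIG. 5) can be taken as the non-pickup region, and the region outside the non-pickup region can be determined as the pickup region. The electronic device takes the pickup region as the pickup position, performs steps S102 to S103 of this solution, and enters the low-power working state, which reduces the power consumption of the microphone array and the interference of the noise source 502 with the array.
In the application scenario shown in FIG. 6, one side of the electronic device 601 is against a wall and a noise source 602 is located on the other side. After power-on, the electronic device 601 uses spatial sensing techniques to detect the surrounding environment information and detects the region where the wall is located and the region where the noise source 602 is located. The electronic device 601 can then take the wall region and the region of the noise source 602 as non-pickup regions, determine the remaining region as the pickup region (the hatched region in FIG. 6), take the pickup region as the pickup position, perform steps S102 to S103 of this solution, and enter the low-power working state, which reduces the power consumption of the microphone array and the interference of the strong noise source 602.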
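The "pickup region is everything outside the non-pickup regions" rule in Scenario 4 can be sketched with angular sectors. The representation is an assumption chosen for illustration: regions are modelled as inclusive ranges of integer azimuths around the device, and the function name `pickup_degrees` is hypothetical. The source does not specify how regions are represented.

```python
def pickup_degrees(excluded_sectors):
    """Return the integer azimuths (0-359) outside every excluded sector.

    Each sector is an inclusive (start_deg, end_deg) pair measured in the
    same rotational direction; a sector may wrap past 360, e.g. (350, 10).
    A full-circle sector cannot be expressed in this representation.
    """
    excluded = set()
    for start, end in excluded_sectors:
        span = (end - start) % 360
        for offset in range(span + 1):
            excluded.add((start + offset) % 360)
    return set(range(360)) - excluded
```

For a layout like FIG. 6, a wall sector and a noise sector would both be passed in, for example `pickup_degrees([(90, 180), (350, 10)])`, and the returned set describes the remaining pickup region.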
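It can be understood that the pickup position described in this embodiment may be a specific direction or a region. The exact definition of the pickup position can be set according to the actual situation and is not limited here.
At least two microphone combinations may be configured in the electronic device, and each microphone combination includes at least one microphone. Each microphone combination may have one or more parameter sets. A parameter set may include one or more of the following parameters: pickup direction, noise suppression parameters, automatic speech level adjustment parameters, adjustable pickup distance, and various thresholds. Once a parameter set is configured, the pickup performance of the corresponding microphone combination in each region is determined as well.
Therefore, after obtaining the pickup position, the electronic device can first obtain the parameter sets of the microphone combinations. Some of these parameter sets satisfy the preset performance indicator condition at the pickup position and some do not. To ensure a good user experience, the electronic device should select, from these parameter sets, one that satisfies the preset performance indicator condition at the pickup position, and take the microphone combination corresponding to that parameter set as the target microphone combination.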
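The pickup performance of each parameter set in each region can be obtained by testing in advance, as shown in Table 1:
Table 1
Figure PCTCN2021084099-appb-000001
Figure PCTCN2021084099-appb-000002
In Table 1, region A, region B, and region C are the positions of the wake-up word sound source, D1 is the volume of the wake-up word sound source, D2 and D3 are the volumes of the noise sources, D4 is the preset number of tests, Q1 to Q9 are the numbers of successful wake-ups under the different conditions, and P1 to P9 are the wake-up rates corresponding to Q1 to Q9.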
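The relationship between the quantities in Table 1 is not spelled out in the text. Assuming the usual definition of wake-up rate as successful wake-ups divided by test attempts, each P would follow from the matching Q and the preset number of tests D4:

```latex
P_i = \frac{Q_i}{D_4}, \qquad i = 1, \dots, 9
```

Under that assumption, a hypothetical run of D4 = 100 tests with Q1 = 96 wake-ups would give P1 = 96%. These numbers are illustrative only and do not come from Table 1.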
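As shown in Table 1, after configuring the parameter sets, a developer can set up different scenarios and test each parameter set in each scenario to obtain the pickup performance indicators of each parameter set in each region.
In addition, while the electronic device is in use, it can update the test data based on usage data through self-learning methods such as machine learning, to obtain more accurate pickup performance indicator data.
Take the scenario shown in FIG. 7 as an example. Suppose that when the pickup position is in region 1, parameter sets A, B, and C satisfy the preset performance indicator condition. The electronic device 701 can then select the microphone combination corresponding to parameter set A, parameter set B, or parameter set C as the target microphone combination. When the pickup position is in region 2, parameter sets A and B satisfy the preset performance indicator condition, and the electronic device 701 can select the microphone combination corresponding to parameter set A or parameter set B as the target microphone combination. When the pickup position is in region 3, only parameter set C satisfies the preset performance indicator condition, and the electronic device 701 can only select the microphone combination corresponding to parameter set C as the target microphone combination.
The pickup performance indicators used in the preset performance indicator condition can be set according to the actual situation and may include one or more of wake-up rate, false wake-up rate, automatic speech recognition (ASR) accuracy, and similar indicators. For example, the preset performance indicator condition may be set to a wake-up rate greater than 95% and an ASR accuracy greater than 95%.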
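Filtering microphone combinations against a performance condition can be sketched as a lookup in a pre-measured table. The data layout below (a dictionary from combination to per-region metrics) and the names `RegionMetrics` and `candidates_for` are assumptions for illustration. The thresholds follow the 95% example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionMetrics:
    wake_rate: float      # fraction of successful wake-ups in the region
    asr_accuracy: float   # ASR accuracy in the region


def candidates_for(region, table, min_wake=0.95, min_asr=0.95):
    """Return the combinations whose metrics in `region` exceed both thresholds.

    `table` maps a combination (tuple of microphone ids) to a dictionary
    of {region label: RegionMetrics}; untested regions are simply absent.
    """
    result = []
    for combo, per_region in table.items():
        metrics = per_region.get(region)
        if metrics is None:
            continue  # no test data for this region, so it cannot qualify
        if metrics.wake_rate > min_wake and metrics.asr_accuracy > min_asr:
            result.append(combo)
    return result
```

Treating a missing region as "does not qualify" is a conservative design choice made in this sketch. The source only states that performance per region is obtained from prior testing.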
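In addition, the microphones in the target microphone combination should be undamaged. The theoretical pickup performance of some microphone combinations at the pickup position may satisfy the preset performance indicator condition, but if such a combination contains a damaged microphone, its actual pickup performance at the pickup position may not meet the requirement, and it should not be treated as a target microphone combination. For example, suppose the four microphone combinations [microphone 1, microphone 5], [microphone 2, microphone 4], [microphone 1, microphone 2, microphone 4, microphone 5], and [microphone 1, microphone 3, microphone 4, microphone 5] satisfy the preset performance indicator condition, but microphone 2 is damaged. The target microphone combination should then be selected from [microphone 1, microphone 5] and [microphone 1, microphone 3, microphone 4, microphone 5].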
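The exclusion of combinations that contain a damaged microphone is a simple set test. The sketch below uses the example from the paragraph above. The function name `drop_damaged` is hypothetical, and how damage is detected is outside its scope.

```python
def drop_damaged(combos, damaged):
    """Keep only the combinations that contain no damaged microphone."""
    damaged = set(damaged)
    return [combo for combo in combos if not damaged & set(combo)]


combos = [(1, 5), (2, 4), (1, 2, 4, 5), (1, 3, 4, 5)]
usable = drop_damaged(combos, damaged={2})
# usable == [(1, 5), (1, 3, 4, 5)]
```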
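As described above, the parameter sets of one or more microphone combinations may satisfy the preset performance indicator condition at the pickup position.
If the parameter set of only one microphone combination satisfies the preset performance indicator condition at the pickup position, the electronic device can directly select that microphone combination as the target microphone combination.
If the parameter sets of several microphone combinations satisfy the preset performance indicator condition at the pickup position, the electronic device can take the microphone combinations corresponding to those parameter sets as candidate microphone combinations and then select the target microphone combination from the candidates according to a preset decision strategy.
The preset decision strategy can be set according to the actual situation. In some possible implementations, the number of microphones, CPU usage, and pickup performance can be used as decision indicators, each with a priority. The candidates are filtered with the decision indicators in descending order of priority to determine the target microphone combination.
For example, suppose there are three candidate microphone combinations: [microphone 1, microphone 5], [microphone 2, microphone 4], and [microphone 1, microphone 2, microphone 4, microphone 5].
First, the more microphones a combination has, the more power the microphone array consumes, so the electronic device can determine the candidate combinations with the fewest microphones as first microphone combinations. [microphone 1, microphone 5] and [microphone 2, microphone 4] each contain 2 microphones, whereas [microphone 1, microphone 2, microphone 4, microphone 5] contains 4. The first microphone combinations are therefore [microphone 1, microphone 5] and [microphone 2, microphone 4].
Because there is more than one first microphone combination, the target microphone combination cannot be determined directly, and the electronic device can obtain the CPU usage of each first microphone combination. CPU usage is the ratio of the CPU time consumed by a process during a period to the length of that period. It should be understood that although the first microphone combinations have the same number of microphones and use the same microphone model to process the collected voice signals, their CPU usage may differ because their microphones are placed at different positions. The first microphone combination with the lowest CPU usage can then be determined as the second microphone combination. Suppose [microphone 1, microphone 5] and [microphone 2, microphone 4] have the same CPU usage. Both are then second microphone combinations.
Because there is more than one second microphone combination, the target microphone combination still cannot be determined directly, and the electronic device can use a preset pickup performance indicator to determine the second microphone combination with the best pickup performance as the target microphone combination. Suppose the preset pickup performance indicator is the wake-up rate, the wake-up rate of [microphone 1, microphone 5] at the pickup position is 95%, and the wake-up rate of [microphone 2, microphone 4] at the pickup position is 96%. [microphone 2, microphone 4] is then selected as the target microphone combination.
The CPU usage of the different microphone combinations and the pickup performance of each combination in each region can be obtained from pre-shipment test data.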
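Take the application scenario shown in FIG. 8 as an example. In this scenario, the electronic device 801 takes the sound source position of the detected human voice signal as the pickup position.
The electronic device 801 obtains the parameter sets of the microphone combinations and takes the microphone combinations whose parameter sets satisfy the preset performance indicator condition at the pickup position as candidate microphone combinations. After filtering, the electronic device 801 determines [microphone 1, microphone 5], [microphone 2, microphone 4], and [microphone 1, microphone 2, microphone 4, microphone 5] (the combinations marked with dashed boxes in FIG. 8) as the candidate microphone combinations.
After the candidates are determined, the target microphone combination is selected according to the preset decision strategy. In this strategy, the number of microphones, CPU usage, and pickup performance are the decision factors, with the priority: number of microphones > CPU usage > pickup performance.
During filtering, the combination with the fewest microphones is preferred as the target microphone combination. When several candidates share the smallest number of microphones, the CPU usage of each of them is obtained and the candidate with the lowest CPU usage is selected. When several candidates share the lowest CPU usage, the candidate with the best pickup performance in the pickup direction is selected.
According to this strategy, [microphone 1, microphone 5] and [microphone 2, microphone 4] each use 2 microphones, and [microphone 1, microphone 2, microphone 4, microphone 5] uses 4. Following the fewest-microphones principle, [microphone 1, microphone 2, microphone 4, microphone 5] is excluded first.
Next, the CPU usage of [microphone 1, microphone 5] and [microphone 2, microphone 4] is obtained. Suppose the two combinations have the same CPU usage. Their pickup performance in the pickup direction is then compared.
Suppose the wake-up rate is used as the evaluation indicator. The wake-up rate of [microphone 1, microphone 5] in the pickup direction is lower than that of [microphone 2, microphone 4], so [microphone 2, microphone 4] is selected as the target microphone combination.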
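The three-stage decision strategy (fewest microphones, then lowest CPU usage, then best pickup performance) can be sketched as successive filters. The `Candidate` type and `select_target` function are hypothetical names, the wake-up rate stands in for "pickup performance", and the CPU figures used in the usage note are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    mics: tuple        # microphone ids in the combination
    cpu_usage: float   # measured CPU usage for this combination
    wake_rate: float   # wake-up rate at the pickup position


def select_target(candidates):
    """Pick a target combination: fewest mics > lowest CPU > best wake rate."""
    if not candidates:
        raise ValueError("no candidate microphone combinations")
    # Stage 1: keep the combinations with the fewest microphones.
    fewest = min(len(c.mics) for c in candidates)
    first = [c for c in candidates if len(c.mics) == fewest]
    if len(first) == 1:
        return first[0]
    # Stage 2: among those, keep the combinations with the lowest CPU usage.
    lowest_cpu = min(c.cpu_usage for c in first)
    second = [c for c in first if c.cpu_usage == lowest_cpu]
    if len(second) == 1:
        return second[0]
    # Stage 3: among those, take the best pickup performance.
    return max(second, key=lambda c: c.wake_rate)
```

With `Candidate((1, 5), 0.10, 0.95)`, `Candidate((2, 4), 0.10, 0.96)`, and `Candidate((1, 2, 4, 5), 0.20, 0.99)`, the function returns the `(2, 4)` candidate, matching the worked example: the 4-microphone combination is dropped in stage 1 even though its wake-up rate is highest.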
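S103. The electronic device performs a pickup operation using the microphones of the target microphone combination in the microphone array.
After the target microphone combination is determined, the electronic device can use it to perform the pickup operation. The electronic device uses the microphones in the target microphone combination for pickup and either turns off the microphones outside the target microphone combination or puts them into a sleep state. Only some of the microphones are used for pickup, and a suitable microphone model is selected to process the collected voice signals. This reduces the amount of voice signal the electronic device must process, lowers algorithm complexity, saves computing power, reduces power consumption, and improves the performance of the electronic device.
For example, if [microphone 2, microphone 4] is selected as the target microphone combination, microphones 2 and 4 in the array are enabled, and microphones 1, 3, 5, and 6 are turned off or put into the sleep state. A 2-microphone model is loaded and used to process the voice signals collected by microphones 2 and 4.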
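Step S103 amounts to computing a per-microphone state and choosing the processing model by microphone count. The sketch below only plans the states and does not drive any hardware. The function name `plan_array_state`, the state labels, and the model naming are assumptions for illustration.

```python
def plan_array_state(all_mics, target, use_sleep=False):
    """Return ({mic id: state}, model name) for a target combination.

    Microphones in `target` are switched on; all others are either turned
    off or put to sleep, depending on `use_sleep`.
    """
    target = set(target)
    inactive_state = "sleep" if use_sleep else "off"
    states = {mic: ("on" if mic in target else inactive_state) for mic in all_mics}
    model = f"{len(target)}-mic model"
    return states, model


states, model = plan_array_state(range(1, 7), (2, 4))
# states: mics 2 and 4 are "on", mics 1, 3, 5, 6 are "off"; model: "2-mic model"
```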
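In the microphone array control method of this embodiment, the pickup position where pickup is needed is identified first, and candidate microphone combinations are selected for that pickup position. The target microphone combination is then selected from the candidates according to the preset decision strategy, and the microphone array is controlled according to the target microphone combination. While working, the array uses only some of its microphones, which allows the pickup beam region of the array to be adjusted so that the voice signal at the pickup position is collected selectively and the number of working microphones is reduced. The fewer microphones are working, the lower the power consumption of the array, so reducing the number of working microphones effectively lowers the array's power consumption and extends the battery life of the electronic device. The method of this embodiment therefore addresses the high power consumption of the microphone array and the short battery life of the electronic device in existing microphone array applications.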
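In addition to trigger conditions for entering the low-power working state, trigger conditions for exiting it can also be set.
In some possible implementations, if the electronic device entered the low-power working state because an application of a specified type was invoked, an instruction from the user or another electronic device to exit that application can serve as the trigger condition for exiting the low-power working state. For example, the user issues the application invocation instruction "小艺小艺,打开成语接龙" ("Xiaoyi Xiaoyi, open Idiom Solitaire"), and the electronic device starts the Idiom Solitaire application and enters the low-power working state. When the user issues the application exit instruction "小艺小艺,关闭成语接龙" ("Xiaoyi Xiaoyi, close Idiom Solitaire"), the electronic device closes the application and, according to that instruction, exits the low-power working state and enters the normal working state or the sleep state.
In other possible implementations, if the electronic device entered the low-power working state because the user interacted with it several times from the same region within a unit of time, the trigger condition for exiting can be that no human voice signal from the user is received for a preset duration. For example, the unit of time is set to 3 minutes, the preset count threshold to 10, and the preset duration to 3 minutes. A user sits on a sofa and plays an interactive game with the electronic device. When the number of interactions within 3 minutes reaches 10, the electronic device enters the low-power working state. When the user leaves or, for some other reason, does not interact with the electronic device for 3 minutes, the electronic device exits the low-power working state and enters the normal working state or the sleep state.
The preset duration can be set according to the actual situation. For example, it may be set to 1 minute, 3 minutes, or 5 minutes.
In other possible implementations, if the electronic device entered the low-power working state because its battery level was too low, the trigger condition for exiting can be that the battery level is greater than or equal to the preset battery threshold. For example, the preset battery threshold is set to 20%. When woken up, the electronic device detects that 15% of its battery remains, which is below the threshold of 20%, and enters the low-power working state. After the user connects the electronic device to a power supply, the battery level gradually recovers. When it reaches 20% or more, the electronic device exits the low-power working state and enters the normal working state.
In other possible implementations, if the electronic device entered the low-power working state because of environment information, the trigger condition for exiting can be a change in the corresponding environment information. For example, if the electronic device entered the low-power working state because it was against a wall, it can exit the low-power working state and enter the normal working state when its position changes and it is no longer against a wall. If the electronic device entered the low-power working state because of a strong noise source, it can exit the low-power working state and enter the normal working state when the strong noise source disappears.
Furthermore, when the environment information changes, the electronic device may change the target microphone combination used in the low-power working state instead of exiting that state. For example, if the electronic device entered the low-power working state because it was against a wall, and after its position changes it is still against a wall but the direction of the wall has changed, the electronic device changes the target microphone combination accordingly. If the electronic device entered the low-power working state because of a strong noise source and the position of that noise source changes, the electronic device can change the target microphone combination according to the new position of the noise source. When the target microphone combination must be changed in response to a change in the environment information, it can be selected in the same way as when entering the low-power working state, as described above.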
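The pairing of each entry reason with its own exit condition can be sketched as a single dispatch function. The enum `Reason`, the function `should_exit`, and its keyword parameters are hypothetical. The default values follow the examples in the text (3 minutes of silence, 20% battery).

```python
from enum import Enum, auto


class Reason(Enum):
    APP = auto()          # a preset application type was invoked
    INTERACTION = auto()  # repeated interaction from one region
    LOW_BATTERY = auto()  # battery below threshold at wake-up
    ENVIRONMENT = auto()  # wall or noise source detected


def should_exit(reason, *, app_closed=False, idle_s=0.0, idle_limit_s=180.0,
                battery=0.0, battery_threshold=0.20, environment_changed=False):
    """Return True if the exit condition matching the entry reason holds."""
    if reason is Reason.APP:
        return app_closed
    if reason is Reason.INTERACTION:
        return idle_s >= idle_limit_s
    if reason is Reason.LOW_BATTERY:
        return battery >= battery_threshold
    if reason is Reason.ENVIRONMENT:
        # A changed environment may instead call for re-selecting the
        # target combination; that branch is not modelled here.
        return environment_changed
    raise ValueError(f"unknown reason: {reason!r}")
```

For instance, `should_exit(Reason.LOW_BATTERY, battery=0.15)` is `False`, while `should_exit(Reason.LOW_BATTERY, battery=0.20)` is `True`, because the text exits at "greater than or equal to" the threshold.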
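It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution. The order in which the processes are executed should be determined by their functions and internal logic, and the numbering shall not limit the implementation of the embodiments of this application in any way.
Referring to FIG. 9, an embodiment of this application provides a microphone array control apparatus. For ease of description, only the parts related to this application are shown. As shown in FIG. 9, the microphone array control apparatus includes:
a signal acquisition module 901, configured to obtain a voice signal collected by a microphone array;
a mode switching module 902, configured to, if the voice signal satisfies a preset switching condition, obtain the parameter set corresponding to each microphone combination in the microphone array, and take as the target microphone combination the microphone combination whose parameter set satisfies a preset performance indicator condition at the pickup position corresponding to the voice signal;
a target application module 903, configured to perform a pickup operation using the microphones of the target microphone combination in the microphone array.
Further, the preset switching condition is that the voice signal triggers an application of a preset application type.
Further, the preset switching condition is that, within a preset duration, the increase in the interaction count corresponding to the sound source position of the voice signal is greater than or equal to a preset count threshold.
Further, the preset switching condition is that, when the electronic device is woken up by the voice signal, the remaining battery level of the electronic device is below a preset battery threshold.
Further, the preset switching condition is that the spatial information corresponding to the voice signal indicates that an occluded region exists.
Further, the mode switching module 902 includes:
a noise source submodule, configured to, if the voice signal contains noise, determine the sound source position of the noise as a non-pickup region and determine the region outside the non-pickup region as the pickup position;
a target combination submodule, configured to take as the target microphone combination the microphone combination whose parameter set satisfies the preset performance indicator condition at the pickup position.
Further, the apparatus also includes:
a disabling module, configured to turn off the microphones outside the target microphone combination or put them into a sleep state.
Further, each microphone combination includes at least one microphone;
the mode switching module 902 includes:
a candidate combination submodule, configured to obtain the parameter set corresponding to each microphone combination in the microphone array, and take as candidate microphone combinations the microphone combinations whose parameter sets satisfy the preset performance indicator condition at the pickup position corresponding to the voice signal;
a first combination submodule, configured to, if the number of candidate microphone combinations is greater than 1, take the candidate microphone combination with the fewest microphones as the first microphone combination;
a first target submodule, configured to, if the number of first microphone combinations is 1, determine the first microphone combination as the target microphone combination.
Further, the mode switching module 902 also includes:
a second combination submodule, configured to, if the number of first microphone combinations is greater than 1, take the first microphone combination with the lowest central processing unit usage as the second microphone combination;
a second target submodule, configured to, if the number of second microphone combinations is 1, determine the second microphone combination as the target microphone combination.
Further, the mode switching module 902 also includes:
a third target submodule, configured to, if the number of second microphone combinations is greater than 1, determine the second microphone combination with the best pickup performance at the pickup position as the target microphone combination.
It should be noted that the information exchange between the foregoing apparatuses/units and their execution processes are based on the same concept as the method embodiments of this application. For their specific functions and technical effects, refer to the method embodiments. Details are not repeated here.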
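Referring to FIG. 10, an embodiment of this application also provides an electronic device. As shown in FIG. 10, the electronic device 100 of this embodiment includes a processor 1000, a memory 1001, a computer program 1002 stored in the memory 1001 and executable on the processor 1000, and a microphone array 1003. When the processor 1000 executes the computer program 1002, it implements the steps of the foregoing microphone array control method embodiments, for example steps S101 to S103 shown in FIG. 1. Alternatively, when the processor 1000 executes the computer program 1002, it implements the functions of the modules/units in the foregoing apparatus embodiments, for example the functions of modules 901 to 903 shown in FIG. 9.
For example, the computer program 1002 may be divided into one or more modules/units, which are stored in the memory 1001 and executed by the processor 1000 to carry out this application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer program 1002 in the electronic device 100. For example, the computer program 1002 may be divided into a signal acquisition module, a mode switching module, and a target application module, with the following functions:
a signal acquisition module, configured to obtain a voice signal collected by a microphone array;
a mode switching module, configured to, if the voice signal satisfies a preset switching condition, obtain the parameter set corresponding to each microphone combination in the microphone array, and take as the target microphone combination the microphone combination whose parameter set satisfies a preset performance indicator condition at the pickup position corresponding to the voice signal;
a target application module, configured to perform a pickup operation using the microphones of the target microphone combination in the microphone array.
The electronic device 100 may be a desktop computer, a notebook computer, a palmtop computer, a smart speaker, or another electronic device equipped with a microphone array. The electronic device may include, but is not limited to, the processor 1000 and the memory 1001. Those skilled in the art will understand that FIG. 10 is merely an example of the electronic device 100 and does not limit it. The electronic device 100 may include more or fewer components than shown, combine certain components, or use different components. For example, it may also include input/output devices, network access devices, and buses.
The processor 1000 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 1001 may be an internal storage unit of the electronic device 100, such as its hard disk or internal memory. The memory 1001 may also be an external storage device of the electronic device 100, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 100. Further, the memory 1001 may include both an internal storage unit and an external storage device of the electronic device 100. The memory 1001 is used to store the computer program and other programs and data required by the electronic device. The memory 1001 may also be used to temporarily store data that has been output or is about to be output.
The microphones in the microphone array 1003 may be electrodynamic microphones, condenser microphones, crystal microphones, carbon microphones, dynamic microphones, or other types of microphones.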
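Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules described above is only an example. In practice, the functions may be assigned to different functional units or modules as required. That is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing them from one another and do not limit the scope of protection of this application. For the specific working processes of the units and modules in the foregoing system, refer to the corresponding processes in the foregoing method embodiments. Details are not repeated here.
In the foregoing embodiments, each embodiment is described with its own emphasis. For parts that are not detailed or recorded in one embodiment, refer to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. For example, the apparatus/electronic device embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units. That is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected as needed to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the foregoing method embodiments of this application may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the foregoing method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content included in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The foregoing is only a specific implementation of this application, but the scope of protection of this application is not limited to it. Any change or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed in this application shall fall within the scope of protection of this application. The scope of protection of this application shall therefore be subject to the scope of protection of the claims.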

Claims (20)

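  1. A microphone array control method, characterized by comprising:
    an electronic device obtaining a voice signal collected by a microphone array;
    if the voice signal satisfies a preset switching condition, the electronic device obtaining the parameter set corresponding to each microphone combination in the microphone array, and taking, as a target microphone combination, the microphone combination whose parameter set satisfies a preset performance indicator condition at the pickup position corresponding to the voice signal;
    the electronic device performing a pickup operation using the microphones of the target microphone combination in the microphone array.
  2. The microphone array control method according to claim 1, characterized in that the preset switching condition is that the voice signal triggers an application of a preset application type.
  3. The microphone array control method according to claim 1, characterized in that the preset switching condition is that, within a preset duration, the increase in the interaction count corresponding to the sound source position of the voice signal is greater than or equal to a preset count threshold.
  4. The microphone array control method according to claim 1, characterized in that the preset switching condition is that, when the electronic device is woken up by the voice signal, the remaining battery level of the electronic device is below a preset battery threshold.
  5. The microphone array control method according to claim 1, characterized in that the preset switching condition is that the spatial information corresponding to the voice signal indicates that an occluded region exists.
  6. The microphone array control method according to claim 1, characterized in that the method further comprises:
    the electronic device turning off the microphones outside the target microphone combination or putting the microphones outside the target microphone combination into a sleep state.
  7. The microphone array control method according to any one of claims 1 to 6, characterized in that each microphone combination comprises at least one microphone;
    the electronic device obtaining the parameter set corresponding to each microphone combination in the microphone array, and taking, as the target microphone combination, the microphone combination whose parameter set satisfies the preset performance indicator condition at the pickup position corresponding to the voice signal, comprises:
    the electronic device obtaining the parameter set corresponding to each microphone combination in the microphone array, and taking, as candidate microphone combinations, the microphone combinations whose parameter sets satisfy the preset performance indicator condition at the pickup position corresponding to the voice signal;
    if the number of candidate microphone combinations is greater than 1, the electronic device taking the candidate microphone combination with the fewest microphones as a first microphone combination;
    if the number of first microphone combinations is 1, the electronic device determining the first microphone combination as the target microphone combination.
  8. The microphone array control method according to claim 7, characterized in that the method further comprises:
    if the number of first microphone combinations is greater than 1, the electronic device taking the first microphone combination with the lowest central processing unit usage as a second microphone combination;
    if the number of second microphone combinations is 1, the electronic device determining the second microphone combination as the target microphone combination.
  9. The microphone array control method according to claim 8, characterized in that the method further comprises:
    if the number of second microphone combinations is greater than 1, the electronic device determining the second microphone combination with the best pickup performance at the pickup position as the target microphone combination.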
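  10. A microphone array control apparatus, characterized by comprising:
    a signal acquisition module, configured to obtain a voice signal collected by a microphone array;
    a mode switching module, configured to, if the voice signal satisfies a preset switching condition, obtain the parameter set corresponding to each microphone combination in the microphone array, and take, as a target microphone combination, the microphone combination whose parameter set satisfies a preset performance indicator condition at the pickup position corresponding to the voice signal;
    a target application module, configured to perform a pickup operation using the microphones of the target microphone combination in the microphone array.
  11. The microphone array control apparatus according to claim 10, characterized in that the preset switching condition is that the voice signal triggers an application of a preset application type.
  12. The microphone array control apparatus according to claim 10, characterized in that the preset switching condition is that, within a preset duration, the increase in the interaction count corresponding to the sound source position of the voice signal is greater than or equal to a preset count threshold.
  13. The microphone array control apparatus according to claim 10, characterized in that the preset switching condition is that, when an electronic device is woken up by the voice signal, the remaining battery level of the electronic device is below a preset battery threshold.
  14. The microphone array control apparatus according to claim 10, characterized in that the preset switching condition is that the spatial information corresponding to the voice signal indicates that an occluded region exists.
  15. The microphone array control apparatus according to claim 10, characterized in that the apparatus further comprises:
    a disabling module, configured to turn off the microphones outside the target microphone combination or put the microphones outside the target microphone combination into a sleep state.
  16. The microphone array control apparatus according to any one of claims 10 to 15, characterized in that each microphone combination comprises at least one microphone;
    the mode switching module comprises:
    a candidate combination submodule, configured to obtain the parameter set corresponding to each microphone combination in the microphone array, and take, as candidate microphone combinations, the microphone combinations whose parameter sets satisfy the preset performance indicator condition at the pickup position corresponding to the voice signal;
    a first combination submodule, configured to, if the number of candidate microphone combinations is greater than 1, take the candidate microphone combination with the fewest microphones as a first microphone combination;
    a first target submodule, configured to, if the number of first microphone combinations is 1, determine the first microphone combination as the target microphone combination.
  17. The microphone array control apparatus according to claim 16, characterized in that the mode switching module further comprises:
    a second combination submodule, configured to, if the number of first microphone combinations is greater than 1, take the first microphone combination with the lowest central processing unit usage as a second microphone combination;
    a second target submodule, configured to, if the number of second microphone combinations is 1, determine the second microphone combination as the target microphone combination.
  18. The microphone array control apparatus according to claim 17, characterized in that the mode switching module further comprises:
    a third target submodule, configured to, if the number of second microphone combinations is greater than 1, determine the second microphone combination with the best pickup performance at the pickup position as the target microphone combination.
  19. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 9.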
PCT/CN2021/084099 2020-04-08 2021-03-30 麦克风阵列控制方法、装置、电子设备及计算机存储介质 WO2021204027A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010270470.5A CN113497995B (zh) 2020-04-08 2020-04-08 麦克风阵列控制方法、装置、电子设备及计算机存储介质
CN202010270470.5 2020-04-08

Publications (1)

Publication Number Publication Date
WO2021204027A1 true WO2021204027A1 (zh) 2021-10-14

Family

ID=77994769

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084099 WO2021204027A1 (zh) 2020-04-08 2021-03-30 麦克风阵列控制方法、装置、电子设备及计算机存储介质

Country Status (2)

Country Link
CN (1) CN113497995B (zh)
WO (1) WO2021204027A1 (zh)

Also Published As

Publication number Publication date
CN113497995A (zh) 2021-10-12
CN113497995B (zh) 2023-04-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21784478

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21784478

Country of ref document: EP

Kind code of ref document: A1