CN114384466A - Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium


Publication number
CN114384466A
Authority
CN
China
Prior art keywords
pickup
directions
sound source
sound
source direction
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111659858.5A
Other languages
Chinese (zh)
Inventor
吴俊�
李智勇
陈孝良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Shengzhi Wulian Technology Co ltd
Original Assignee
Shandong Shengzhi Wulian Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Shandong Shengzhi Wulian Technology Co ltd
Priority to CN202111659858.5A
Publication of CN114384466A

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S3/00 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
    • G01S3/80 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic or infrasonic waves
    • G01S3/802 Systems for determining direction or deviation from predetermined direction
    • G01S3/808 Systems for determining direction or deviation from predetermined direction using transducers spaced apart and measuring phase or time difference between signals therefrom, i.e. path-difference systems
    • G01S3/8083 Systems for determining direction or deviation from predetermined direction using transducers spaced apart and measuring phase or time difference between signals therefrom, i.e. path-difference systems determining direction of source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The application relates to a sound source direction determining method, a sound source direction determining device, electronic equipment and a storage medium, and belongs to the technical field of audio processing. The method includes: picking up M voice signals based on M sound pickup directions, where each voice signal corresponds to one sound pickup direction, the M voice signals are used to wake up a terminal, and M is an integer greater than 1; determining M wake-up parameters and an initial sound source direction based on the M voice signals, where the M wake-up parameters represent the degree to which each of the M voice signals contributes to waking up the terminal; and adjusting the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain a target sound source direction. The method can improve the accuracy of the determined target sound source direction.

Description

Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of audio processing technologies, and in particular, to a method and an apparatus for determining a sound source direction, an electronic device, and a storage medium.
Background
At present, a terminal with a voice recognition function can recognize a voice control instruction in a user's voice signal and then execute the operation corresponding to that instruction. To save power, the terminal stays in a dormant state until it receives a wake-up instruction; it then wakes up, collects the voice signal, and recognizes it. To capture the voice signal more clearly, the terminal may locate the direction of the sound source from the voice signal of the wake-up instruction and then collect the voice signal based on that direction.
In the related art, when a user wakes up a terminal with a voice signal, the terminal picks up voice signals in a plurality of directions, and these voice signals are used to wake up the terminal. The terminal locates the sound source based on the voice signals in the plurality of directions, performs beam forming on the voice signals based on the direction obtained by the positioning, and then recognizes the beam-formed voice signal.
However, in the above method, when the ambient noise is loud, the direction obtained by the positioning is inaccurate.
Disclosure of Invention
The embodiments of the application provide a sound source direction determining method and device, electronic equipment, and a storage medium, which can improve the accuracy of the determined sound source direction. The technical solution is as follows:
according to an aspect of an embodiment of the present application, there is provided a sound source direction determining method, including:
picking up M voice signals based on M sound pickup directions, where each voice signal corresponds to one sound pickup direction, the M voice signals are used to wake up a terminal, and M is an integer greater than 1;
determining M wake-up parameters and an initial sound source direction based on the M voice signals, where the M wake-up parameters represent the degree to which each of the M voice signals contributes to waking up the terminal;
and adjusting the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain a target sound source direction.
In a possible implementation manner, the adjusting the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain a target sound source direction includes:
selecting N sound pickup directions from the M sound pickup directions based on the M wake-up parameters and the M sound pickup directions, where the N wake-up parameters of the N sound pickup directions are the first N of the M wake-up parameters sorted in descending order, and N is an integer smaller than M and greater than 1;
and adjusting the initial sound source direction based on the N wake-up parameters and the N sound pickup directions to determine the target sound source direction.
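The selection step above can be sketched as a sort by wake-up parameter. This is a minimal illustration, not the applicant's implementation: the function name `select_top_n`, the use of degrees, and the sample values are assumptions. The check accepts N equal to M because a later implementation in this description considers that case.

```python
from typing import List, Tuple


def select_top_n(
    directions_deg: List[float], wake_params: List[float], n: int
) -> List[Tuple[float, float]]:
    """Return the n (direction, wake-up parameter) pairs with the largest parameters.

    Hypothetical helper: directions are given in degrees, one per voice signal.
    """
    if len(directions_deg) != len(wake_params):
        raise ValueError("each pickup direction needs exactly one wake-up parameter")
    if not 1 < n <= len(directions_deg):
        raise ValueError("n must be greater than 1 and at most M")
    # Sort pairs by wake-up parameter, largest first, and keep the first n.
    ranked = sorted(zip(directions_deg, wake_params), key=lambda p: p[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    dirs = [0, 60, 120, 180, 240, 300]          # M = 6 pickup directions
    params = [0.1, 0.7, 0.9, 0.3, 0.05, 0.2]    # made-up wake-up parameters
    print(select_top_n(dirs, params, 3))
```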
In another possible implementation manner, the adjusting the initial sound source direction based on the N wake-up parameters and the N sound pickup directions to determine the target sound source direction includes:
determining a positional relationship of the N sound pickup directions based on the N sound pickup directions;
if the initial sound source direction and the position relation meet a preset condition, determining an adjustment angle based on the N wake-up parameters and the N pickup directions, and adjusting the initial sound source direction based on the adjustment angle to obtain the target sound source direction;
and if the initial sound source direction and the position relation do not meet the preset condition, determining that the target sound source direction is the pickup direction corresponding to the largest wake-up parameter.
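The two branches above can be sketched as a small dispatch function. This is an illustration under stated assumptions: the name `target_direction` is hypothetical, angles are in degrees, and the preset-condition check and the adjustment angle are passed in as already-computed inputs because this passage does not define them.

```python
from typing import Sequence


def target_direction(
    initial_deg: float,
    directions_deg: Sequence[float],
    wake_params: Sequence[float],
    condition_met: bool,
    adjustment_deg: float = 0.0,
) -> float:
    """Pick the target sound source direction from the two branches.

    condition_met and adjustment_deg are assumed to be computed elsewhere.
    """
    if condition_met:
        # Branch 1: rotate the initial direction by the adjustment angle.
        return (initial_deg + adjustment_deg) % 360.0
    # Branch 2: fall back to the pickup direction with the largest wake-up parameter.
    strongest = max(range(len(wake_params)), key=lambda i: wake_params[i])
    return float(directions_deg[strongest])
```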
In another possible implementation manner, N is greater than 2 and smaller than M, and an included angle between any two adjacent pickup directions in the M pickup directions is a preset angle; the determining an adjustment angle based on the N wake-up parameters and the N pickup directions if the initial sound source direction and the position relation meet a preset condition includes:
the position relation is that exactly one pair of adjacent pickup directions among the N pickup directions has an included angle greater than the preset angle, and the pickup direction corresponding to the largest wake-up parameter is not a boundary pickup direction of a first pickup range, where the first pickup range is the pickup range formed by the N pickup directions and the boundary pickup directions are the two adjacent pickup directions in the first pickup range whose included angle is greater than the preset angle; in this case, if the initial sound source direction is within the first pickup range, determining a first weight and a first included angle of the pickup direction having the smallest included angle with the initial sound source direction;
and determining the adjustment angle based on the first weight and the first included angle.
In another possible implementation manner, N is greater than 2 and smaller than M, and an included angle between any two adjacent pickup directions in the M pickup directions is a preset angle; the determining an adjustment angle based on the N wake-up parameters and the N pickup directions if the initial sound source direction and the position relation meet a preset condition includes:
the position relation is that exactly one pair of adjacent pickup directions among the N pickup directions has an included angle greater than the preset angle, and the pickup direction corresponding to the largest wake-up parameter is one of the boundary pickup directions of a first pickup range, where the first pickup range is the pickup range formed by the N pickup directions and the boundary pickup directions are the two adjacent pickup directions in the first pickup range whose included angle is greater than the preset angle; in this case, if the initial sound source direction is within a second pickup range, determining second weights of the other pickup directions, where the second pickup range is the pickup range formed by the pickup directions other than the one corresponding to the largest wake-up parameter;
and determining the adjustment angle based on the second weights.
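The notions of "boundary pickup directions" and "first pickup range" used in the two implementations above can be illustrated with a short sketch. It assumes that pickup directions are angles in degrees on a full circle, and it reads the first pickup range as the arc that remains once the single over-wide gap is excluded. The helper names are hypothetical and do not come from the application.

```python
from typing import List, Sequence, Tuple


def circular_gaps(selected_deg: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Return (start, end, gap) for each pair of neighbouring directions on the circle."""
    s = sorted(d % 360.0 for d in selected_deg)
    n = len(s)
    return [(s[i], s[(i + 1) % n], (s[(i + 1) % n] - s[i]) % 360.0) for i in range(n)]


def boundary_pairs(
    selected_deg: Sequence[float], preset_deg: float, tol: float = 1e-6
) -> List[Tuple[float, float]]:
    """Neighbouring pairs whose included angle exceeds the preset angle."""
    return [(a, b) for a, b, gap in circular_gaps(selected_deg) if gap > preset_deg + tol]


def in_pickup_range(initial_deg: float, boundary: Tuple[float, float]) -> bool:
    """True if the initial direction is NOT strictly inside the over-wide gap."""
    a, b = boundary
    gap = (b - a) % 360.0
    offset = (initial_deg - a) % 360.0
    return not (0.0 < offset < gap)


if __name__ == "__main__":
    selected = [60, 120, 180]            # N = 3 directions chosen from a 60-degree grid
    pairs = boundary_pairs(selected, 60)
    print(pairs)                         # the single over-wide gap runs from 180 to 60
    print(in_pickup_range(100, pairs[0]), in_pickup_range(300, pairs[0]))
```

With exactly one pair returned by `boundary_pairs`, checking whether the direction with the largest wake-up parameter is one of that pair distinguishes the two implementations described above.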
In another possible implementation manner, N is equal to 2; the determining an adjustment angle based on the N wake-up parameters and the N pickup directions if the initial sound source direction and the position relation meet a preset condition includes:
the position relation is that the N sound pickup directions are adjacent, and the sound pickup direction corresponding to the largest wake-up parameter is either of the N sound pickup directions; in this case, if the initial sound source direction is within a first sound pickup range, determining a third weight and a second included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, where the first sound pickup range is the sound pickup range formed by the N sound pickup directions;
and determining the adjustment angle based on the third weight and the second included angle.
In another possible implementation manner, N is equal to M, and the method further includes:
if the initial sound source direction is within a first sound pickup range, determining a fourth weight and a third included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, where the first sound pickup range is the sound pickup range formed by the N sound pickup directions;
and determining the adjustment angle based on the fourth weight and the third included angle.
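Several implementations above derive the adjustment angle from a weight and an included angle, but this passage states neither how the weight is computed nor how the two are combined. The sketch below therefore takes the weight as an input and assumes, purely for illustration, that the adjustment angle is the product of the weight and the signed included angle toward the nearest pickup direction. The function names are hypothetical.

```python
def signed_angle(from_deg: float, to_deg: float) -> float:
    """Smallest signed rotation in degrees from from_deg to to_deg, in [-180, 180)."""
    return (to_deg - from_deg + 180.0) % 360.0 - 180.0


def adjust_direction(initial_deg: float, nearest_deg: float, weight: float) -> float:
    """Rotate the initial direction toward the nearest pickup direction.

    ASSUMPTION: adjustment angle = weight * included angle. The source only says
    the adjustment angle is determined "based on" the weight and the included angle.
    """
    included = signed_angle(initial_deg, nearest_deg)
    adjustment = weight * included
    return (initial_deg + adjustment) % 360.0


if __name__ == "__main__":
    # Initial estimate at 100 degrees, nearest pickup direction at 120 degrees.
    print(adjust_direction(100.0, 120.0, 0.5))
```

Under this assumption a weight of 0 leaves the initial direction unchanged and a weight of 1 snaps it to the pickup direction; the signed angle keeps the rotation correct across the 0/360 wrap.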
In another possible implementation manner, an included angle between any two adjacent sound pickup directions in the M sound pickup directions is a preset angle; the determining an adjustment angle based on the N wake-up parameters and the N pickup directions if the initial sound source direction and the position relation meet a preset condition includes:
the position relation is that the N pickup directions include at least a first pair of adjacent pickup directions, whose included angle is the preset angle, and a second pair of adjacent pickup directions, whose included angle is greater than the preset angle, and the pickup direction corresponding to the largest wake-up parameter is either one of the first pair of adjacent pickup directions; in this case, if the initial sound source direction is within a third pickup range, determining a fifth weight and a fourth included angle of the pickup direction that is adjacent to the pickup direction corresponding to the largest wake-up parameter, where the third pickup range is the pickup range formed by the first pair of adjacent pickup directions;
and determining the adjustment angle based on the fifth weight and the fourth included angle.
In another possible implementation manner, an included angle between any two adjacent sound pickup directions in the M sound pickup directions is a preset angle; the method further includes:
the position relation is that the N pickup directions include at least a first pair of adjacent pickup directions, whose included angle is the preset angle, and a second pair of adjacent pickup directions, whose included angle is greater than the preset angle, and the pickup direction corresponding to the largest wake-up parameter is not one of the first pair of adjacent pickup directions; in this case, if the initial sound source direction is within a fourth pickup range, determining the sum of the wake-up parameters of the first pair of adjacent pickup directions, where the fourth pickup range is the pickup range formed by the first pair of adjacent pickup directions;
and if the sum of the wake-up parameters is not less than the product of the largest wake-up parameter and a preset coefficient, determining that the target sound source direction is the initial sound source direction.
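The comparison in the last step above is simple enough to state directly in code. The sketch assumes the wake-up parameters are plain numbers; the function name and the sample values, including the coefficient, are hypothetical, since the passage does not give a value for the preset coefficient.

```python
from typing import Sequence


def keep_initial_direction(
    pair_params: Sequence[float], max_param: float, preset_coeff: float
) -> bool:
    """True if the adjacent pair's combined wake-up parameters are not less than
    the largest wake-up parameter scaled by the preset coefficient."""
    return sum(pair_params) >= max_param * preset_coeff


if __name__ == "__main__":
    # Made-up scores: the adjacent pair scores 40 and 35, the strongest direction 90.
    print(keep_initial_direction([40, 35], 90, 0.8))   # 75 >= 72.0
    print(keep_initial_direction([40, 35], 90, 1.0))   # 75 >= 90.0
```

When the function returns True, the initial sound source direction is kept as the target direction; the passage does not say what happens otherwise.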
In another possible implementation manner, an included angle between any two adjacent sound pickup directions in the M sound pickup directions is a preset angle; the method further includes:
if the position relation is that at least one pair of adjacent pickup directions among the N pickup directions has an included angle equal to the preset angle, executing the step of determining an adjustment angle based on the N wake-up parameters and the N pickup directions if the initial sound source direction and the position relation meet the preset condition.
In another possible implementation manner, an included angle between any two adjacent sound pickup directions in the M sound pickup directions is a preset angle; the method further includes:
if the position relation is that no pair of adjacent pickup directions among the N pickup directions has an included angle equal to the preset angle, determining that the target sound source direction is the initial sound source direction.
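The two implementations above hinge on whether any two of the N selected directions are still neighbours on the original grid of M directions. A minimal sketch of that check follows, assuming directions in degrees on a full circle; the name `has_adjacent_pair` and the tolerance are assumptions.

```python
from typing import Sequence


def has_adjacent_pair(
    selected_deg: Sequence[float], preset_deg: float, tol: float = 1e-6
) -> bool:
    """True if at least one pair of neighbouring selected directions is exactly
    one preset angle apart (within tol)."""
    s = sorted(d % 360.0 for d in selected_deg)
    n = len(s)
    return any(
        abs((s[(i + 1) % n] - s[i]) % 360.0 - preset_deg) <= tol for i in range(n)
    )


if __name__ == "__main__":
    print(has_adjacent_pair([0, 60, 180], 60))    # 0 and 60 are neighbours
    print(has_adjacent_pair([0, 120, 240], 60))   # every gap is 120 degrees
```

If the check fails, the selected directions are scattered, and the text above keeps the initial sound source direction unchanged.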
In another possible implementation manner, the method further includes:
if the pickup direction having the smallest included angle with the initial sound source direction is not the pickup direction corresponding to the largest wake-up parameter, executing the step of adjusting the initial sound source direction based on the M wake-up parameters and the M pickup directions to obtain a target sound source direction.
In another possible implementation manner, the method further includes:
if the pickup direction having the smallest included angle with the initial sound source direction is the pickup direction corresponding to the largest wake-up parameter, determining that the target sound source direction is the initial sound source direction.
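These last two implementations act as a gate: the adjustment runs only when the located direction and the wake-up parameters disagree. The sketch below shows one way to express that gate, assuming directions in degrees; `angular_distance` and `needs_adjustment` are hypothetical names.

```python
from typing import Sequence


def angular_distance(a_deg: float, b_deg: float) -> float:
    """Unsigned included angle between two directions, in [0, 180]."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def needs_adjustment(
    initial_deg: float, directions_deg: Sequence[float], wake_params: Sequence[float]
) -> bool:
    """True if the pickup direction nearest the initial estimate is not the one
    with the largest wake-up parameter."""
    idx = range(len(directions_deg))
    nearest = min(idx, key=lambda i: angular_distance(initial_deg, directions_deg[i]))
    strongest = max(idx, key=lambda i: wake_params[i])
    return nearest != strongest


if __name__ == "__main__":
    dirs = [0, 60, 120, 180, 240, 300]
    params = [0.1, 0.7, 0.9, 0.3, 0.05, 0.2]   # made-up values; 120 degrees is strongest
    print(needs_adjustment(70.0, dirs, params))    # nearest is 60, so adjust
    print(needs_adjustment(110.0, dirs, params))   # nearest is 120, keep initial
```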
According to an aspect of an embodiment of the present application, there is provided a sound source direction determination apparatus including:
the pickup module is used for picking up M voice signals based on M pickup directions, each voice signal corresponds to one pickup direction, the M voice signals are used for awakening the terminal, and M is an integer greater than 1;
a first determining module, configured to determine, based on the M voice signals, M wake-up parameters and an initial sound source direction, where the M wake-up parameters are used to represent contribution degrees of the M voice signals to wake up the terminal;
and the adjusting module is used for adjusting the initial sound source direction based on the M awakening parameters and the M pickup directions to obtain a target sound source direction.
In one possible implementation manner, the adjusting module includes:
a selecting unit, configured to select N sound pickup directions from the M sound pickup directions based on the M wake-up parameters and the M sound pickup directions, where N wake-up parameters of the N sound pickup directions are first N wake-up parameters of the M wake-up parameters arranged from large to small, and N is an integer smaller than M and larger than 1;
and the adjusting unit is used for adjusting the initial sound source direction and determining the target sound source direction based on the N awakening parameters and the N pickup directions.
In another possible implementation manner, the adjusting unit includes:
a first determining subunit configured to determine, based on the N sound pickup directions, a positional relationship of the N sound pickup directions;
a second determining subunit, configured to determine, if the initial sound source direction and the position relationship satisfy a preset condition, an adjustment angle based on the N wake-up parameters and the N pickup directions; adjusting the initial sound source direction based on the adjustment angle to obtain the target sound source direction;
and a third determining subunit, configured to determine, if the initial sound source direction and the position relationship do not satisfy the preset condition, that the target sound source direction is the pickup direction corresponding to the largest wake-up parameter.
In another possible implementation manner, the second determining subunit is configured to: when the position relationship is that exactly one pair of adjacent pickup directions among the N pickup directions has an included angle greater than the preset angle, and the pickup direction corresponding to the largest wake-up parameter is not a boundary pickup direction of a first pickup range, determine, if the initial sound source direction is within the first pickup range, a first weight and a first included angle of the pickup direction having the smallest included angle with the initial sound source direction; and determine the adjustment angle based on the first weight and the first included angle; where the first pickup range is the pickup range formed by the N pickup directions, and the boundary pickup directions are the two adjacent pickup directions in the first pickup range whose included angle is greater than the preset angle.
In another possible implementation manner, the second determining subunit is configured to: when the position relationship is that exactly one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is one of the boundary sound pickup directions of a first sound pickup range, determine, if the initial sound source direction is within a second sound pickup range, second weights of the other sound pickup directions; and determine the adjustment angle based on the second weights; where the first sound pickup range is the sound pickup range formed by the N sound pickup directions, the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is greater than the preset angle, and the second sound pickup range is the sound pickup range formed by the sound pickup directions other than the one corresponding to the largest wake-up parameter.
In another possible implementation manner, N is equal to 2, and the second determining subunit is configured to: when the position relationship is that the N sound pickup directions are adjacent, and the sound pickup direction corresponding to the largest wake-up parameter is either of the N sound pickup directions, determine, if the initial sound source direction is within a first sound pickup range, a third weight and a second included angle of the sound pickup direction having the smallest included angle with the initial sound source direction; and determine the adjustment angle based on the third weight and the second included angle; where the first sound pickup range is the sound pickup range formed by the N sound pickup directions.
In another possible implementation manner, the apparatus further includes:
a second determining module, configured to: if the initial sound source direction is within a first sound pickup range, determine a fourth weight and a third included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, where the first sound pickup range is the sound pickup range formed by the N sound pickup directions; and determine the adjustment angle based on the fourth weight and the third included angle.
In another possible implementation manner, the second determining subunit is configured to: when the position relationship is that the N pickup directions include at least a first pair of adjacent pickup directions, whose included angle is the preset angle, and a second pair of adjacent pickup directions, whose included angle is greater than the preset angle, and the pickup direction corresponding to the largest wake-up parameter is either one of the first pair of adjacent pickup directions, determine, if the initial sound source direction is within a third pickup range, a fifth weight and a fourth included angle of the pickup direction that is adjacent to the pickup direction corresponding to the largest wake-up parameter; and determine the adjustment angle based on the fifth weight and the fourth included angle; where the third pickup range is the pickup range formed by the first pair of adjacent pickup directions.
In another possible implementation manner, the apparatus further includes:
a third determining module, configured to: when the position relationship is that the N pickup directions include at least a first pair of adjacent pickup directions, whose included angle is the preset angle, and a second pair of adjacent pickup directions, whose included angle is greater than the preset angle, and the pickup direction corresponding to the largest wake-up parameter is not one of the first pair of adjacent pickup directions, determine, if the initial sound source direction is within a fourth pickup range, the sum of the wake-up parameters of the first pair of adjacent pickup directions, where the fourth pickup range is the pickup range formed by the first pair of adjacent pickup directions; and if the sum of the wake-up parameters is not less than the product of the largest wake-up parameter and a preset coefficient, determine that the target sound source direction is the initial sound source direction.
In another possible implementation manner, the apparatus further includes:
a fourth determining module, configured to: if the position relationship is that at least one pair of adjacent pickup directions among the N pickup directions has an included angle equal to the preset angle, perform the step of determining an adjustment angle based on the N wake-up parameters and the N pickup directions if the initial sound source direction and the position relationship satisfy the preset condition.
In another possible implementation manner, the apparatus further includes:
a fifth determining module, configured to determine that the target sound source direction is the initial sound source direction if the position relationship is that no pair of adjacent sound pickup directions among the N sound pickup directions has an included angle equal to the preset angle.
In another possible implementation manner, the apparatus further includes:
a sixth determining module, configured to: if the sound pickup direction having the smallest included angle with the initial sound source direction is not the sound pickup direction corresponding to the largest wake-up parameter, adjust the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain the target sound source direction.
In another possible implementation manner, the apparatus further includes:
a seventh determining module, configured to determine that the target sound source direction is the initial sound source direction if the sound pickup direction having the smallest included angle with the initial sound source direction is the sound pickup direction corresponding to the largest wake-up parameter.
According to an aspect of embodiments of the present application, there is provided an electronic device comprising one or more processors and one or more memories having stored therein at least one program code, the at least one program code being loaded and executed by the one or more processors to implement the sound source direction determining method according to any one of the above possible implementations.
According to an aspect of the embodiments of the present application, there is provided a storage medium having at least one program code stored therein, the at least one program code being loaded and executed by a processor to implement the sound source direction determining method according to any one of the above possible implementations.
According to an aspect of embodiments of the present application, there is provided a computer program product, the computer program product including computer program code stored in a computer-readable storage medium, the computer program code being read from the computer-readable storage medium by a processor of an electronic device, the processor executing the computer program code to cause the electronic device to execute a sound source direction determination method according to any one of the above possible implementations.
In the embodiments of the application, voice signals are picked up in multiple directions, a wake-up parameter is determined for each voice signal, and the located sound source direction is adjusted using these wake-up parameters. Because a wake-up parameter represents the degree to which the corresponding voice signal contributes to waking up the terminal, adjusting the located direction with the wake-up parameters can improve the accuracy of the target sound source direction.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic illustration of an implementation environment provided by an exemplary embodiment of the present application;
FIG. 2 is a flow chart of a sound source direction determination method provided by an exemplary embodiment of the present application;
FIG. 3 is a flow chart of a sound source direction determination method provided by an exemplary embodiment of the present application;
FIG. 4 is a schematic illustration of M pickup directions for a case provided by an exemplary embodiment of the present application;
FIG. 5 is a flow chart of a sound source direction determination method provided by an exemplary embodiment of the present application;
FIG. 6 is a schematic structural diagram of a sound source direction determining apparatus according to an exemplary embodiment of the present application;
FIG. 7 is a schematic structural diagram of a terminal according to an exemplary embodiment of the present application;
FIG. 8 is a schematic structural diagram of a server according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application clearer, embodiments of the present application are described in further detail below with reference to the accompanying drawings. The embodiments described are only some of the embodiments of the present application, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present application.
It will be understood that the terms "first," "second," and the like used herein may describe various concepts, but these concepts are not limited by the terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, a first sound pickup range may be referred to as a second sound pickup range, and similarly, the second sound pickup range may be referred to as the first sound pickup range, without departing from the scope of the application.
As used herein, "at least one" includes one, two, or more; "a plurality" includes two or more; "each" refers to every one of the corresponding plurality; and "any" refers to any one of the plurality. For example, if the plurality of wake-up parameters includes 3 wake-up parameters, "each" refers to every one of the 3 wake-up parameters, and "any" refers to any one of the 3 wake-up parameters, which may be the first, the second, or the third.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application. As shown in FIG. 1, the implementation environment includes a terminal 101 and a server 102. The terminal 101 and the server 102 are connected by a wireless or wired network.
Optionally, the terminal 101 is a terminal of any type, such as a smartphone, a tablet computer, a smart wearable device, or a smart home device; the smart home device is, for example, a smart speaker, a smart television, a smart refrigerator, a smart air conditioner, a smart robot, a smart lamp, or a smart lock. The server 102 is a single server, a server cluster composed of a plurality of servers, or a cloud computing service center.
The terminal 101 has installed thereon an application served by the server 102, through which the terminal 101 can implement functions such as data transmission and message interaction. Optionally, the application is an application in the operating system of the terminal 101 or an application provided by a third party. For example, the application is a voice assistant with a positioning function; the voice assistant can also have other functions, such as a recognition function or a function of executing voice commands.
In some embodiments, in a scenario where the user controls the terminal 101 by voice to perform a target operation, the user utters a voice signal; the terminal 101 locates the direction of the user's voice signal, picks up the voice signal according to that direction, and transmits the picked-up voice signal to the server 102. The server 102 receives and recognizes the voice signal and transmits the recognition result to the terminal 101, and the terminal 101 executes the target operation corresponding to the recognition result. In other embodiments, the server 102 may also locate the direction of the user's voice signal.
The sound source direction determining method provided by the embodiments of the present application can be applied to any scenario in which a terminal is controlled by voice to perform a target operation.
In the first scenario, the terminal is a smart home device, and the smart home device is controlled through a voice signal.
For example, the smart home device is a smart television, a user wants to change a channel of the smart television or adjust the volume of the smart television, the user can control the smart television through a voice signal, the smart television locates the direction of the voice signal, and picks up the voice signal according to the direction, so that the voice signal is recognized, and corresponding operation is executed.
The second scenario is a scenario in which the terminal is a mobile phone and is controlled by a voice signal.
For example, while driving a car, it is inconvenient for the user to operate a mobile phone by hand. If the user wants to use the mobile phone for navigation, the user can control it through a voice signal: the mobile phone locates the direction of the voice signal, picks up the voice signal according to that direction, recognizes the voice signal, and then opens the navigation software to navigate.
It should be noted that the embodiments of the present application describe sound source direction determination by a mobile phone or a smart home device merely as examples; the scenarios to which the sound source direction determining method of the present application applies are not limited thereto.
FIG. 2 is a flowchart of a sound source direction determining method according to an embodiment of the present application. The method is executed by the terminal and includes the following steps:
Step 201: The terminal picks up M voice signals based on M sound pickup directions. Each voice signal corresponds to one sound pickup direction, the M voice signals are used for waking up the terminal, and M is an integer greater than 1.
Step 202: The terminal determines M wake-up parameters and an initial sound source direction based on the M voice signals, where the M wake-up parameters represent the degree to which each of the M voice signals contributes to waking up the terminal.
Step 203: The terminal adjusts the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain the target sound source direction.
In the embodiments of the present application, the terminal picks up voice signals in multiple directions and determines the wake-up parameter corresponding to each voice signal. Because a wake-up parameter represents the degree to which the corresponding voice signal contributes to waking up the terminal, adjusting the located sound source direction by means of the wake-up parameters can improve the accuracy of the determined target sound source direction.
FIG. 3 is a flowchart of a sound source direction determining method according to an embodiment of the present application. The method is executed by the terminal and includes the following steps:
Step 301: The terminal picks up M voice signals based on M sound pickup directions.
Each voice signal corresponds to one sound pickup direction, the M voice signals are used for waking up the terminal, and M is an integer greater than 1; for example, M is 2, 3, 4, 5, or 6. An included angle exists between any two adjacent sound pickup directions among the M sound pickup directions, and these included angles can be the same or different. For example, M is 4 and the 4 sound pickup directions are the 90-degree, 180-degree, 270-degree, and 360-degree sound pickup directions; in this case the included angle between any two adjacent sound pickup directions is the same, namely 90 degrees. As another example, M is 6 and the 6 sound pickup directions are the 45-degree, 90-degree, 135-degree, 180-degree, 270-degree, and 360-degree sound pickup directions; in this case the included angles between adjacent sound pickup directions differ: some are 45 degrees and some are 90 degrees. The embodiments of the present application do not specifically limit the included angle between any two adjacent sound pickup directions.
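The two examples above can be checked with a minimal sketch. The function name adjacent_included_angles is hypothetical, and the sketch assumes that directions are given in degrees, that the 360-degree direction coincides with 0 degrees, and that adjacency is taken around the full circle.

```python
def adjacent_included_angles(directions_deg):
    """Return the sorted pickup directions and the angle between each adjacent pair."""
    # Map every direction into [0, 360) so that 360 degrees coincides with 0 degrees.
    ordered = sorted({d % 360 for d in directions_deg})
    angles = []
    for i, current in enumerate(ordered):
        following = ordered[(i + 1) % len(ordered)]  # wrap around the circle
        angles.append((following - current) % 360)
    return ordered, angles


four = adjacent_included_angles([90, 180, 270, 360])
six = adjacent_included_angles([45, 90, 135, 180, 270, 360])
print(four[1])  # [90, 90, 90, 90]
print(six[1])   # [45, 45, 45, 45, 90, 90]
```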
In a possible implementation, the M sound pickup directions are set in the terminal in advance by the user; that is, the M sound pickup directions are preset in the terminal. Correspondingly, the terminal determines the M sound pickup directions as follows: the terminal directly determines the M sound pickup directions based on the preset sound pickup directions.
For example, if the user sets the directions in which the terminal picks up voice signals to 90 degrees, 180 degrees, 270 degrees, and 360 degrees in advance, the terminal directly takes these four sound pickup directions as the M sound pickup directions. As another example, if the user sets the sound pickup directions of the terminal to 45 degrees, 90 degrees, 135 degrees, 180 degrees, 270 degrees, and 360 degrees (0 degrees) in advance, the terminal directly takes these six sound pickup directions as the M sound pickup directions, which is not specifically limited herein.
In the embodiments of the present application, the terminal directly determines the M sound pickup directions based on the preset sound pickup directions, which is simple to perform and has low computational cost.
In another possible implementation, the terminal determines the M sound pickup directions based on the historical number of pickups in each sound pickup direction. Correspondingly, the terminal determines the M sound pickup directions as follows: the terminal acquires the historical number of pickups and determines the M sound pickup directions based on it. The historical number of pickups is positively correlated with the number of sound pickup directions: the larger the historical number of pickups, the more sound pickup directions there are (that is, the larger M is); the smaller the historical number of pickups, the fewer sound pickup directions there are (that is, the smaller M is).
For example, if the number of pickups is large in the sound pickup range of 0 degrees to 180 degrees and small in the sound pickup range of 180 degrees to 360 degrees, the terminal determines more sound pickup directions within the range of 0 degrees to 180 degrees and fewer within the range of 180 degrees to 360 degrees. The M sound pickup directions determined by the terminal can then be, for example, the 45-degree, 90-degree, 135-degree, 180-degree, 270-degree, and 360-degree sound pickup directions, which is not specifically limited herein.
In the embodiments of the present application, the terminal sets the M sound pickup directions based on the historical number of pickups, which matches how the terminal is actually used, that is, the user's usage habits. In a sound pickup range where the terminal has picked up the user's voice signal many times, more directions are set for pickup to avoid missing signals; in a sound pickup range where the terminal has picked up the user's voice signal only a few times, fewer directions are set to reduce computational cost.
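One possible way to realize the history-based selection is sketched below. The application does not specify the allocation rule; the function name directions_from_history, the comparison against the mean count, and the 45-degree and 90-degree step sizes are all assumptions chosen only so that the sketch reproduces the example above.

```python
def directions_from_history(sector_counts, dense_step=45, sparse_step=90):
    """Pick pickup directions per sector from historical pickup counts.

    sector_counts maps (start_deg, end_deg) to the historical number of
    pickups observed in that range (hypothetical input format).
    """
    mean_count = sum(sector_counts.values()) / len(sector_counts)
    directions = []
    for (start, end), count in sorted(sector_counts.items()):
        # Assumed rule: a sector used at least as often as average gets the
        # finer step, so it receives more pickup directions.
        step = dense_step if count >= mean_count else sparse_step
        angle = start + step
        while angle <= end:
            directions.append(angle)
            angle += step
    return directions


history = {(0, 180): 120, (180, 360): 15}
print(directions_from_history(history))  # [45, 90, 135, 180, 270, 360]
```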
In another possible implementation manner, the terminal can determine M sound pickup directions based on the type of the terminal. Correspondingly, the step that the terminal determines M sound pickup directions is as follows: the terminal acquires the type of the terminal and determines M sound pickup directions based on the type. The type of the terminal may be a mobile phone, a smart television, or a smart refrigerator, and is not limited in this respect. For example, the terminal is a mobile phone, the user can be located in any direction of the mobile phone, and the terminal needs to pick up a voice signal in a direction of 360 degrees around the terminal, so the M sound pickup directions determined by the terminal can be a sound pickup direction of 90 degrees, a sound pickup direction of 180 degrees, a sound pickup direction of 270 degrees, and a sound pickup direction of 360 degrees, and these four directions can cover the direction of 360 degrees of the terminal. For example, the type of the terminal is a smart television, and at this time, the user is usually located in front of, to the left of, or to the right of the terminal, and the terminal needs to pick up a voice signal in a range formed by these three directions, so the M sound pickup directions determined by the terminal may be a sound pickup direction of 0 degree, a sound pickup direction of 45 degrees, a sound pickup direction of 90 degrees, a sound pickup direction of 135 degrees, and a sound pickup direction of 180 degrees, and these five sound pickup directions can cover the front, to the left of, and to the right of the terminal, and do not need to cover the 360-degree direction of the terminal.
In the embodiments of the present application, the terminal determines the sound pickup directions based on its type, so that the sound pickup directions better match how that type of terminal is used; no sound pickup direction is set in a range where no voice signal is expected, which reduces computational cost.
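The type-based selection can be pictured as a simple lookup, as in the sketch below. The table name, the type keys, and the fallback for unlisted types are hypothetical; only the two direction sets come from the examples above.

```python
# Direction sets taken from the mobile phone and smart television examples.
PICKUP_DIRECTIONS_BY_TYPE = {
    "mobile_phone": [90, 180, 270, 360],   # user may be anywhere around the phone
    "smart_tv": [0, 45, 90, 135, 180],     # user is in front, to the left, or to the right
}

# Assumption: unknown terminal types fall back to full-circle coverage.
DEFAULT_DIRECTIONS = [90, 180, 270, 360]


def pickup_directions_for(terminal_type):
    """Return a copy of the pickup directions configured for a terminal type."""
    return list(PICKUP_DIRECTIONS_BY_TYPE.get(terminal_type, DEFAULT_DIRECTIONS))


print(pickup_directions_for("smart_tv"))  # [0, 45, 90, 135, 180]
```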
The following examples assume that the included angle between any two adjacent sound pickup directions among the M sound pickup directions is a preset angle. For example, referring to FIG. 4, M is 8, the 8 sound pickup directions are the sound pickup directions x0, x1, x2, x3, x4, x5, x6, and x7, and the included angle between any two adjacent sound pickup directions among the 8 is 45 degrees.
Step 302: The terminal determines M wake-up parameters and an initial sound source direction based on the M voice signals.
The terminal inputs the M voice signals into a wake-up model, and the wake-up model outputs a wake-up parameter for each voice signal based on the degree to which that voice signal contributes to waking up the terminal; one voice signal corresponds to one wake-up parameter. When multiple voice signals can wake up the terminal, the magnitude of a wake-up parameter is positively correlated with the quality of the corresponding voice signal: the better the quality of the voice signal, the larger its wake-up parameter; the worse the quality, the smaller its wake-up parameter.
The terminal obtains the initial sound source direction from the M voice signals through a direction of arrival (DOA) algorithm. However, the initial sound source direction obtained by a conventional DOA algorithm is not necessarily accurate, so the terminal can verify it against the sound pickup direction corresponding to the largest of the M wake-up parameters. Correspondingly, the terminal verifies the initial sound source direction as follows: the terminal acquires the sound pickup direction corresponding to the largest wake-up parameter and verifies the initial sound source direction based on the positional relationship between that sound pickup direction and the initial sound source direction. Two cases can be distinguished:
In the first case, if the sound pickup direction having the smallest included angle with the initial sound source direction is not the sound pickup direction corresponding to the largest wake-up parameter, the terminal performs the step of adjusting the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain the target sound source direction. The largest wake-up parameter indicates that the corresponding voice signal contributes most to waking up the terminal; since the obtained initial sound source direction nevertheless differs considerably from the sound pickup direction corresponding to the largest wake-up parameter, the initial sound source direction obtained through the conventional algorithm is inaccurate and needs to be adjusted to obtain a more accurate sound source direction.
In the second case, if the sound pickup direction having the smallest included angle with the initial sound source direction is the sound pickup direction corresponding to the largest wake-up parameter, the terminal takes the initial sound source direction as the target sound source direction and then performs the step of recognizing the target voice signal based on the target sound source direction. Here the obtained initial sound source direction differs little from the sound pickup direction corresponding to the largest wake-up parameter, which indicates that the initial sound source direction obtained through the conventional algorithm is accurate and can be used as the sound source direction; in the subsequent process, voice signals are collected and recognized based on the initial sound source direction.
In the embodiments of the present application, the terminal decides whether the initial sound source direction needs to be adjusted by determining the positional relationship between the initial sound source direction and the sound pickup direction corresponding to the largest wake-up parameter. In the first case, the initial sound source direction is relatively inaccurate and needs to be adjusted so that an accurate target sound source direction is subsequently obtained.
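The verification described above can be sketched as follows. The names angular_distance and needs_adjustment are hypothetical, directions are assumed to be in degrees with wrap-around at 360, and ties are resolved by taking the first matching direction, which the application does not specify.

```python
def angular_distance(a_deg, b_deg):
    """Smallest included angle between two directions, in degrees."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)


def needs_adjustment(initial_direction, pickup_directions, wake_params):
    """Return True when the first case applies, False for the second case."""
    indices = range(len(pickup_directions))
    # Pickup direction with the smallest included angle to the initial direction.
    nearest = min(indices, key=lambda i: angular_distance(initial_direction, pickup_directions[i]))
    # Pickup direction whose voice signal has the largest wake-up parameter.
    strongest = max(indices, key=lambda i: wake_params[i])
    return nearest != strongest


directions = [90, 180, 270, 360]
params = [4, 6, 5, 2]
print(needs_adjustment(170, directions, params))  # False: nearest is 180 degrees
print(needs_adjustment(100, directions, params))  # True: nearest is 90 degrees
```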
It should be noted that if the terminal is not woken up, the terminal returns to step 301 and continues to pick up voice signals.
Step 303: The terminal selects N sound pickup directions from the M sound pickup directions based on the M wake-up parameters and the M sound pickup directions.
The N wake-up parameters of the N sound pickup directions are the first N of the M wake-up parameters sorted in descending order, and N is an integer smaller than M and greater than 1. For example, M is 4 and N is 3; that is, from the four sound pickup directions the terminal selects those corresponding to the first 3 of the 4 wake-up parameters sorted in descending order. If the wake-up parameter of the 90-degree sound pickup direction is 4, that of the 180-degree sound pickup direction is 6, that of the 270-degree sound pickup direction is 5, and that of the 360-degree sound pickup direction is 2, the terminal selects three sound pickup directions based on the four wake-up parameters: the 90-degree, 180-degree, and 270-degree sound pickup directions.
As another example, M is 6 and N is 3; that is, from the six sound pickup directions the terminal selects those corresponding to the first 3 of the 6 wake-up parameters sorted in descending order. If the wake-up parameter of the 45-degree sound pickup direction is 4, that of the 90-degree sound pickup direction is 4, that of the 135-degree sound pickup direction is 5, that of the 180-degree sound pickup direction is 6, that of the 270-degree sound pickup direction is 5, and that of the 360-degree sound pickup direction is 2, the terminal selects three sound pickup directions based on the six wake-up parameters: the 135-degree, 180-degree, and 270-degree sound pickup directions.
Other combinations are also possible, for example M is 8 and N is 3, M is 8 and N is 4, or M is 7 and N is 4; the embodiments of the present application do not specifically limit the values of M and N.
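Step 303 amounts to a top-N selection by wake-up parameter, which can be sketched as below. The function name select_top_n is hypothetical. The sketch also accepts N equal to M, since a later case discusses that situation, and it breaks ties by original order, which is an assumption.

```python
def select_top_n(pickup_directions, wake_params, n):
    """Return the n pickup directions with the largest wake-up parameters."""
    if not 1 < n <= len(pickup_directions):
        raise ValueError("n must be greater than 1 and at most M")
    # sorted() is stable, so equal wake-up parameters keep their input order.
    ranked = sorted(zip(pickup_directions, wake_params), key=lambda pair: pair[1], reverse=True)
    return [direction for direction, _ in ranked[:n]]


# M = 4, N = 3 example from the text.
print(sorted(select_top_n([90, 180, 270, 360], [4, 6, 5, 2], 3)))  # [90, 180, 270]
# M = 6, N = 3 example from the text.
print(sorted(select_top_n([45, 90, 135, 180, 270, 360], [4, 4, 5, 6, 5, 2], 3)))  # [135, 180, 270]
```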
Step 304: The terminal determines the positional relationship of the N sound pickup directions based on the N sound pickup directions.
Continuing the above example, the terminal selects the 90-degree, 180-degree, and 270-degree sound pickup directions as the N sound pickup directions. These three sound pickup directions are adjacent to one another in sequence, and the wake-up parameter of the 180-degree sound pickup direction, 6, is larger than those of the other two directions; that is, the sound pickup direction corresponding to the largest wake-up parameter is located at the center of the N sound pickup directions.
Based on the adjacency among the N sound pickup directions and on where the sound pickup direction corresponding to the largest wake-up parameter lies among them, the positional relationship can be divided into the following cases:
In a first case, N is greater than 2 and less than M, and the positional relationship is that exactly one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is not a boundary sound pickup direction of a first sound pickup range. The first sound pickup range is the sound pickup range formed by the N sound pickup directions, and the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is greater than the preset angle.
For example, with continued reference to FIG. 4, assume that N is 3, the N sound pickup directions are x1, x2, and x3, the corresponding wake-up parameters are s1, s2, and s3, and s2 is the largest. The included angle between x1 and x2 is the preset angle, the included angle between x2 and x3 is the preset angle, and the included angle between x1 and x3 is greater than the preset angle; that is, exactly one pair of adjacent sound pickup directions among the 3 has an included angle greater than the preset angle. The sound pickup directions x1 and x3 are therefore the boundary sound pickup directions, and the sound pickup direction corresponding to the largest wake-up parameter is x2, which is not a boundary sound pickup direction. The 3 sound pickup directions thus satisfy the positional relationship in this case.
In a second case, N is greater than 2 and less than M, and the positional relationship is that exactly one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is one of the boundary sound pickup directions of the first sound pickup range. The first sound pickup range is the sound pickup range formed by the N sound pickup directions, and the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is greater than the preset angle.
For example, with continued reference to FIG. 4, assume that N is 3, the N sound pickup directions are x1, x2, and x3, the corresponding wake-up parameters are s1, s2, and s3, and s1 is the largest. The included angle between x1 and x2 is the preset angle, the included angle between x2 and x3 is the preset angle, and the included angle between x1 and x3 is greater than the preset angle; that is, exactly one pair of adjacent sound pickup directions among the 3 has an included angle greater than the preset angle. The sound pickup directions x1 and x3 are therefore the boundary sound pickup directions, and the sound pickup direction corresponding to the largest wake-up parameter is x1, which is a boundary sound pickup direction. The 3 sound pickup directions thus satisfy the positional relationship in this case.
In a third case, N is equal to 2, and the positional relationship is that the N sound pickup directions are adjacent and the sound pickup direction corresponding to the largest wake-up parameter is either one of the N sound pickup directions. For example, with continued reference to FIG. 4, the N sound pickup directions are x1 and x2, the corresponding wake-up parameters are s1 and s2, and s1 > s2; the two sound pickup directions satisfy the positional relationship in this case.
In a fourth case, N is equal to M, which indicates that all sound pickup directions are selected. In this case the positional relationship of the N sound pickup directions does not need to be determined; the initial sound source direction only needs to be adjusted using the sound pickup direction closest to it.
In a fifth case, the positional relationship is that the N sound pickup directions include at least a first pair of two adjacent sound pickup directions and a second pair of two adjacent sound pickup directions, the included angle between the first pair is the preset angle, the included angle between the second pair is greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is either one of the first pair of adjacent sound pickup directions.
For example, with continued reference to FIG. 4, the N sound pickup directions may be x1, x2, and x5. The included angle between x1 and x2 is the preset angle, so x1 and x2 form a first pair of two adjacent sound pickup directions. The included angle between the adjacent sound pickup directions x1 and x5 is greater than the preset angle, so x1 and x5 form a second pair of two adjacent sound pickup directions; likewise, the included angle between the adjacent sound pickup directions x2 and x5 is greater than the preset angle, so x2 and x5 also form a second pair. The 3 sound pickup directions therefore satisfy the positional relationship in this case, where the sound pickup direction corresponding to the largest wake-up parameter is x1 or x2, that is, either one of the first pair of adjacent sound pickup directions.
In a sixth case, the positional relationship is that the N sound pickup directions include at least a first pair of two adjacent sound pickup directions and a second pair of two adjacent sound pickup directions, the included angle between the first pair is the preset angle, the included angle between the second pair is greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter does not belong to the first pair of adjacent sound pickup directions. For example, with continued reference to FIG. 4, the N sound pickup directions may be x1, x2, and x5; these 3 sound pickup directions satisfy the positional relationship in this case for the same reasons as in the fifth case, which are not repeated here. The sound pickup direction corresponding to the largest wake-up parameter is x5, which does not belong to the first pair of adjacent sound pickup directions.
In a seventh case, the positional relationship is that no two adjacent sound pickup directions among the N sound pickup directions have an included angle equal to the preset angle. For example, with continued reference to FIG. 4, the N sound pickup directions may be x1, x5, and x7; no two adjacent sound pickup directions among these 3 have an included angle equal to the preset angle, so the positional relationship in this case is satisfied.
The positional relationship may also include other cases, and is not particularly limited herein.
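The seven cases listed above can be told apart with a small classifier, sketched below under several assumptions: the function name classify_positional_relationship is hypothetical, directions are integers in degrees, adjacency is taken around the full circle, and the sound pickup direction xk of FIG. 4 is assumed to lie at 45 × k degrees (FIG. 4 itself is not reproduced here). The returned integer is the case number used in the text.

```python
def classify_positional_relationship(selected, strongest, preset_angle, total_directions):
    """Classify N selected pickup directions into cases 1 to 7 of step 304."""
    ordered = sorted(d % 360 for d in selected)
    n = len(ordered)
    if n == total_directions:
        return 4  # all directions selected, no relationship needs to be determined
    # Circularly adjacent pairs among the selected directions and their angles.
    pairs = [(ordered[i], ordered[(i + 1) % n]) for i in range(n)]
    angles = [(b - a) % 360 for a, b in pairs]
    at_preset = [p for p, g in zip(pairs, angles) if g == preset_angle]
    wider = [p for p, g in zip(pairs, angles) if g > preset_angle]
    if not at_preset:
        return 7  # no adjacent pair is separated by the preset angle
    if n == 2:
        return 3  # two adjacent directions
    strongest %= 360
    if len(wider) == 1:
        # The single wider pair marks the two boundary pickup directions.
        return 2 if strongest in wider[0] else 1
    # At least one preset-angle pair plus several wider pairs.
    return 5 if any(strongest in p for p in at_preset) else 6


x = [45 * k for k in range(8)]  # assumed layout of x0..x7
print(classify_positional_relationship([x[1], x[2], x[3]], x[2], 45, 8))  # 1
print(classify_positional_relationship([x[1], x[2], x[5]], x[5], 45, 8))  # 6
```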
It should be noted that the method for determining the target sound source direction differs depending on the positional relationship among the N sound pickup directions. If the positional relationship is that no two adjacent sound pickup directions among the N sound pickup directions have an included angle equal to the preset angle, the N sound pickup directions corresponding to the largest wake-up parameters are not adjacent, and the terminal cannot determine the approximate range of the sound source from the wake-up parameters and therefore cannot adjust the initial sound source direction based on them. In this case, the terminal directly takes the initial sound source direction as the target sound source direction and then performs the step of recognizing the target voice signal based on the target sound source direction. If the positional relationship is that at least one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle equal to the preset angle, then, if the initial sound source direction and the positional relationship satisfy a preset condition, the terminal performs the step of determining an adjustment angle based on the N wake-up parameters and the N sound pickup directions, that is, the terminal performs step 305.
Step 305: If the initial sound source direction and the positional relationship satisfy the preset condition, the terminal determines an adjustment angle based on the N wake-up parameters and the N sound pickup directions.
The preset condition includes a condition on the initial sound source direction and a condition on the positional relationship. It is the condition under which the terminal determines an adjustment angle and then adjusts the initial sound source direction. Different preset conditions lead the terminal to determine the adjustment angle in different ways and to obtain different adjustment angles. The preset conditions can be divided into the following cases:
For ease of understanding, in each case a denotes the initial sound source direction. Where no specific value of N is given, N is taken as 3, the 3 sound pickup directions are a1, a2, and a3, the corresponding wake-up parameters are s1, s2, and s3, and s1 is taken to be the largest.
In a first case, the positional relationship is that exactly one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is not a boundary sound pickup direction of the first sound pickup range. The first sound pickup range is the sound pickup range formed by the N sound pickup directions, and the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is greater than the preset angle. If the initial sound source direction is located within the first sound pickup range, the terminal determines a first weight and a first included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, and determines the adjustment angle based on the first weight and the first included angle.
For example, the positional relationship is that exactly one pair of adjacent sound pickup directions among a1, a2, and a3 has an included angle greater than the preset angle, and the sound pickup direction a1 is located between a2 and a3; that is, a1 is not a boundary sound pickup direction of the first sound pickup range. Here the first sound pickup range extends from min(a2, a3) - θ/2 to max(a2, a3) + θ/2, where θ is the included angle between sound pickup directions, and the initial sound source direction lies within this range. When the initial sound source direction and the positional relationship satisfy these conditions, the terminal determines the first weight and the first included angle of the sound pickup direction having the smallest included angle with the initial sound source direction. If that sound pickup direction is a2, the first weight is s2/s1, the first included angle is a2 - a, and the adjustment angle is determined as (a2 - a) × s2/s1 based on the first weight and the first included angle; if that sound pickup direction is a3, the first weight is s3/s1, the first included angle is a3 - a, and the adjustment angle is determined as (a3 - a) × s3/s1.
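The first-case computation can be illustrated with a short sketch. The function names and the sample values (a = 100, a1 = 135, a2 = 90, a3 = 180, s1 = 6, s2 = 4, s3 = 5, θ = 45) are hypothetical; the sketch assumes plain, non-wrapping angles in degrees and treats the range limits as inclusive, which the text does not state.

```python
def in_first_pickup_range(a, a2, a3, theta):
    """Check min(a2, a3) - theta/2 <= a <= max(a2, a3) + theta/2."""
    return min(a2, a3) - theta / 2 <= a <= max(a2, a3) + theta / 2


def first_case_adjustment(a, others, s_max):
    """Adjustment angle (a_k - a) * s_k / s_max for the nearest non-strongest direction.

    others is a list of (direction, wake_up_parameter) pairs for a2 and a3;
    s_max is the largest wake-up parameter s1.
    """
    a_k, s_k = min(others, key=lambda pair: abs(pair[0] - a))
    first_weight = s_k / s_max
    first_angle = a_k - a
    return first_angle * first_weight


a, theta = 100, 45
others = [(90, 4), (180, 5)]  # (a2, s2) and (a3, s3); a1 = 135 with s1 = 6
if in_first_pickup_range(a, 90, 180, theta):
    print(round(first_case_adjustment(a, others, 6), 3))  # -6.667
```

How the adjustment angle is then applied to the initial sound source direction is not covered by this sketch.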
In the second case, the positional relationship is that exactly one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is one of the boundary sound pickup directions of the first sound pickup range. The first sound pickup range is the sound pickup range formed by the N sound pickup directions, and the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is greater than the preset angle. If the initial sound source direction is within a second sound pickup range, which is the sound pickup range formed by the sound pickup directions other than the one corresponding to the largest wake-up parameter, the terminal determines a second weight of the other sound pickup directions and determines the adjustment angle based on the second weight.
For example, the positional relationship is that exactly one pair of adjacent sound pickup directions among the sound pickup directions a1, a2, and a3 has an included angle greater than the preset angle, and the sound pickup direction a1 is one of the boundary sound pickup directions of the first sound pickup range. When a1 is the largest, the second sound pickup range is from a1 - θ/2 to max(a2, a3) + θ/2, and the initial sound source direction lies between min(a2, a3) - θ/2 and a1 + θ/2; when a1 is the smallest, the second sound pickup range is from a1 - θ/2 to max(a2, a3) + θ/2, and the initial sound source direction lies between a1 - θ/2 and max(a2, a3) + θ/2. When the initial sound source direction and the positional relationship satisfy the above conditions, the terminal determines that the second weight of the other sound pickup directions is (s2+s3)/(2 × s1), and determines the adjustment angle as (s2+s3)/(2 × s1) based on the second weight.
In the third case, N is 2, the positional relationship is that the N sound pickup directions are adjacent, and the sound pickup direction corresponding to the largest wake-up parameter is either of the N sound pickup directions. If the initial sound source direction is within the first sound pickup range, which is the sound pickup range formed by the N sound pickup directions, the terminal determines a third weight and a second included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, and determines the adjustment angle based on the third weight and the second included angle.
For example, the positional relationship is that the two sound pickup directions a1 and a2 are adjacent, the sound pickup direction corresponding to the largest wake-up parameter is a1, and the first sound pickup range is the sound pickup range formed by a1 and a2. The terminal determines the third weight and the second included angle of the sound pickup direction having the smallest included angle with the initial sound source direction. If that sound pickup direction is a2, the third weight is s2/s1, the second included angle is a2-a, and the adjustment angle is (a2-a) × s2/s1.
In the fourth case, N is equal to M, which indicates that all the sound pickup directions are selected. In this case, the positional relationship of the N sound pickup directions does not need to be determined, and the initial sound source direction only needs to be adjusted using the sound pickup direction closest to it. If the initial sound source direction is within the first sound pickup range, which is the sound pickup range formed by the N sound pickup directions, the terminal determines a fourth weight and a third included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, and determines the adjustment angle based on the fourth weight and the third included angle.
For example, the first sound pickup range is the sound pickup range formed by the N sound pickup directions, the sound pickup direction corresponding to the largest wake-up parameter is a1, and the initial sound source direction is within the first sound pickup range. The terminal determines the fourth weight and the third included angle of the sound pickup direction having the smallest included angle with the initial sound source direction. If that sound pickup direction is a3, the fourth weight is s3/s1, the third included angle is a3-a, and the adjustment angle is (a3-a) × s3/s1.
In the fifth case, the positional relationship is that at least a first pair of adjacent sound pickup directions and a second pair of adjacent sound pickup directions exist among the N sound pickup directions, where the included angle between the first pair is the preset angle and the included angle between the second pair is greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is either direction of the first pair. If the initial sound source direction is within a third sound pickup range, which is the sound pickup range formed by the first pair of adjacent sound pickup directions, the terminal determines a second included angle and a third weight of the sound pickup direction adjacent to the sound pickup direction corresponding to the largest wake-up parameter, and determines the adjustment angle based on the third weight and the second included angle.
For example, the included angle between a1 and a2 is the preset angle, and the included angles between a1 and a3 and between a2 and a3 are both greater than the preset angle. If the initial sound source direction is within the sound pickup range formed by a1 and a2, the second included angle of the sound pickup direction adjacent to the one corresponding to the largest wake-up parameter is a2-a, the third weight is s2/s1, and the adjustment angle is (a2-a) × s2/s1. Alternatively, a1 is adjacent to a3 and a2 is adjacent to neither; if the initial sound source direction is within the sound pickup range formed by a1 and a3, the second included angle is a3-a, the third weight is s3/s1, and the adjustment angle is (a3-a) × s3/s1.
In the sixth case, the positional relationship is that at least a first pair of adjacent sound pickup directions and a second pair of adjacent sound pickup directions exist among the N sound pickup directions, where the included angle between the first pair is the preset angle and the included angle between the second pair is greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is not a direction of the first pair. If the initial sound source direction is within a fourth sound pickup range, which is the sound pickup range formed by the first pair of adjacent sound pickup directions, the terminal determines the sum of the wake-up parameters of the adjacent sound pickup directions. If the sum of the wake-up parameters is not less than the product of the largest wake-up parameter and a preset coefficient, the terminal determines that the target sound source direction is the initial sound source direction, and then step 307 is executed.
For example, if the included angle between a2 and a3 is the preset angle, and the included angles between a1 and a2 and between a1 and a3 are both greater than the preset angle, the fourth sound pickup range is from min(a2, a3) - θ/2 to max(a2, a3) + θ/2, and the initial sound source direction lies within this range. The sum of the wake-up parameters of the adjacent sound pickup directions is s2+s3. If this sum is not less than the product of the largest wake-up parameter and the preset coefficient, the target sound source direction is determined to be the initial sound source direction. For example, with a preset coefficient of 1.5, if s2+s3 ≥ 1.5 × s1, the terminal determines the target sound source direction to be a.
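The sixth-case check reduces to a single comparison, sketched below in Python. The function name `keep_initial_direction` and the sample wake-up parameters are hypothetical; only the rule (sum not less than coefficient times the largest parameter) and the example coefficient 1.5 come from the text.

```python
def keep_initial_direction(adjacent_params, s_max, coefficient=1.5):
    """True when the adjacent wake-up parameters sum to at least coefficient * s_max."""
    return sum(adjacent_params) >= coefficient * s_max


# Hypothetical wake-up parameters: s1 is the largest, s2 and s3 belong to the adjacent pair.
s1 = 1.0
print(keep_initial_direction([0.75, 0.875], s1))  # 1.625 >= 1.5
print(keep_initial_direction([0.5, 0.5], s1))     # 1.0 < 1.5
```

When the check passes, the target sound source direction is simply the initial direction a, so no adjustment angle is applied.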
In the embodiment of the application, the terminal determines the adjustment angle in different ways depending on which preset condition the initial sound source direction and the positional relationship satisfy, which improves the accuracy of the adjustment angle. Adjusting the initial sound source direction by this adjustment angle therefore improves the accuracy of the target sound source direction, so that the target voice signal acquired based on that direction is of better quality and can be recognized more accurately.
It should be noted that, if the initial sound source direction and the positional relationship do not satisfy the preset condition, the terminal determines that the target sound source direction is the sound pickup direction corresponding to the largest wake-up parameter. In this situation, adjacent sound pickup directions exist among the N sound pickup directions corresponding to the largest wake-up parameters, so the terminal can determine the approximate range of the sound source based on the wake-up parameters. If the initial sound source direction does not satisfy the preset condition, that is, the initial sound source direction differs greatly from the N sound pickup directions, the initial sound source direction is inaccurate, and the terminal directly determines that the target sound source direction is the sound pickup direction corresponding to the largest wake-up parameter.
It should be noted that the sound pickup ranges in the above examples do not include the range from a1 - θ/2 to a1 + θ/2. If the initial sound source direction lies within a1 - θ/2 to a1 + θ/2, the sound pickup direction having the smallest included angle with the initial sound source direction is the sound pickup direction corresponding to the largest wake-up parameter. Because the largest wake-up parameter indicates that the corresponding voice signal contributes most to waking up the terminal, and the obtained initial sound source direction differs little from the sound pickup direction corresponding to the largest wake-up parameter, the initial sound source direction is relatively accurate, and the terminal directly determines the target sound source direction to be the initial sound source direction.
Step 306: the terminal adjusts the initial sound source direction based on the adjustment angle to obtain the target sound source direction.
The terminal adds the adjustment angle to the initial sound source direction to obtain the target sound source direction. For example, following the examples in the above step: in the first case, when the adjustment angle is (a2-a) × s2/s1, the target sound source direction is a_final = a + (a2-a) × s2/s1, and when the adjustment angle is (a3-a) × s3/s1, the target sound source direction is a_final = a + (a3-a) × s3/s1; in the second case, the adjustment angle is (s2+s3)/(2 × s1), and the target sound source direction is a_final = a + (s2+s3)/(2 × s1); in the third case, when the adjustment angle is (a2-a) × s2/s1, the target sound source direction is a_final = a + (a2-a) × s2/s1; in the fourth case, when the adjustment angle is (a3-a) × s3/s1, the target sound source direction is a_final = a + (a3-a) × s3/s1; in the fifth case, when the adjustment angle is (a2-a) × s2/s1, the target sound source direction is a_final = a + (a2-a) × s2/s1, and when the adjustment angle is (a3-a) × s3/s1, the target sound source direction is a_final = a + (a3-a) × s3/s1; in the sixth case, when the initial sound source direction and the positional relationship satisfy the preset condition, the terminal directly determines that the target sound source direction is the initial sound source direction, that is, a_final = a, and the adjustment angle is 0.
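As a worked illustration of step 306, the Python sketch below applies three of the adjustment angles listed above to a common initial direction. The helper name `target_direction` and every numeric value are assumptions made for the example; the formulas themselves are the ones stated in the text.

```python
def target_direction(a, adjustment):
    """Step 306: a_final = a + adjustment angle."""
    return a + adjustment


# Hypothetical inputs: initial direction a, pickup direction a2, wake-up parameters s1 >= s2 >= s3.
a, a2 = 70.0, 60.0
s1, s2, s3 = 1.0, 0.5, 0.25

case1 = target_direction(a, (a2 - a) * s2 / s1)    # first case, nearest direction a2
case2 = target_direction(a, (s2 + s3) / (2 * s1))  # second case
case6 = target_direction(a, 0.0)                   # sixth case, adjustment angle is 0
print(case1, case2, case6)
```

Under these assumed values the first case moves the estimate from 70 to 65 degrees, the second case moves it by 0.375 degrees, and the sixth case leaves it unchanged.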
After determining the target sound source direction, the terminal can also recognize the voice signal from that direction, that is, the terminal recognizes the target voice signal based on the target sound source direction, where the target voice signal is the voice signal collected in the target sound source direction. The terminal first collects the voice signal in the target sound source direction and then recognizes the collected voice signal. Accordingly, the process in which the terminal recognizes the target voice signal based on the target sound source direction can be implemented through the following two steps:
(1) The terminal acquires a target voice signal based on the target sound source direction.
Because the surrounding environment contains many sound signals (for example, the user's voice signal and environmental noise), and the terminal only needs to recognize the control instruction in the user's voice signal and then perform the corresponding operation, the terminal only needs to collect the user's voice signal before recognition. To avoid interference from environmental noise, when collecting the target voice signal the terminal suppresses noise from other directions and enhances the target voice signal in the target sound source direction. For example, the terminal may employ a beamforming method to enhance the voice signal in that direction, thereby improving the voice recognition effect.
(2) The terminal recognizes the target voice signal.
The terminal performs voice recognition on the target voice signal and recognizes the voice control instruction from the target voice signal.
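Step (1) mentions beamforming without naming a specific method. One common choice is delay-and-sum beamforming, sketched below in Python purely as an illustration; the application does not state that this variant is used. The function name `delay_and_sum`, the integer per-microphone sample delays (assumed to be already derived from the target sound source direction), and the toy signals are all assumptions.

```python
def delay_and_sum(channels, delays):
    """Advance each channel by its integer sample delay, then average the aligned samples."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(n)
    ]


# Toy signals: the same pulse reaches the second microphone one sample later.
mic0 = [0.0, 1.0, 0.0, 0.0]
mic1 = [0.0, 0.0, 1.0, 0.0]

steered = delay_and_sum([mic0, mic1], [0, 1])    # delays matched to the source direction
unsteered = delay_and_sum([mic0, mic1], [0, 0])  # no alignment
print(steered)
print(unsteered)
```

When the delays match the source direction the pulse adds coherently and keeps its full amplitude, whereas without alignment it is smeared across two samples at half amplitude, which is the enhancement effect the step relies on.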
In one possible implementation mode, the terminal identifies a voice control instruction in a target voice signal through a server; the process of recognizing the voice signal by the terminal may be: the terminal sends a target voice signal to the server, the server receives the target voice signal, a voice control instruction is identified from the target voice signal, the voice control instruction is sent to the terminal, and the terminal receives the voice control instruction.
The server stores a plurality of voice control instructions. Accordingly, the step in which the server recognizes the voice control instruction from the target voice signal may be: the server determines the voice control instruction of the target voice signal from the plurality of voice control instructions.
In the embodiment of the application, the terminal sends the target voice signal to the server, and the server identifies the target voice signal.
In another possible implementation manner, a plurality of voice control instructions are stored in the terminal, and the terminal may determine the voice control instruction of the target voice signal directly from the plurality of voice control instructions. Accordingly, the process of recognizing the voice signal by the terminal may be: the terminal performs voice recognition on the target voice signal and determines the voice control instruction matching the target voice signal from the plurality of locally stored voice control instructions. In the embodiment of the application, the terminal recognizes the target voice signal locally, the operation is simple, the target voice signal does not need to be sent to the server, and the voice recognition efficiency can be improved.
In the embodiment of the application, the terminal picks up voice signals in multiple directions and adjusts the located sound source direction using the wake-up parameters corresponding to the voice signals. Because a wake-up parameter represents the degree to which the corresponding voice signal contributes to waking up the terminal, adjusting the located sound source direction with the wake-up parameters improves the accuracy of the target sound source direction, so that the target voice signal acquired based on the target sound source direction is of better quality and can be recognized more accurately.
Fig. 5 is a flowchart of a sound source direction determining method according to an embodiment of the present application. The embodiment of the application is executed by a server, and the method comprises the following steps:
Step 501: the terminal picks up M voice signals based on M pickup directions.
Step 501 is the same as step 301, and is not described herein again.
Step 502: the terminal sends M voice signals to the server.
Step 503: the server receives the M voice signals.
Step 504: the server determines M wake-up parameters and an initial sound source direction based on the M voice signals.
Step 505: the server selects N sound pickup directions from the M sound pickup directions based on the M wake-up parameters.
Step 506: the server determines the positional relationship of the N sound pickup directions based on the N sound pickup directions.
Step 507: if the initial sound source direction and the positional relationship satisfy the preset condition, the server determines an adjustment angle based on the N wake-up parameters and the N sound pickup directions.
Step 508: the server adjusts the initial sound source direction based on the adjustment angle to obtain the target sound source direction.
Steps 504 to 508 are the same as steps 302 to 306, and are not described herein again.
Step 509: the server sends the target sound source direction to the terminal.
Step 510: the terminal receives a target sound source direction.
In the embodiment of the application, the server obtains voice signals picked up in multiple directions and adjusts the located sound source direction using the wake-up parameters corresponding to the voice signals. Because a wake-up parameter represents the degree to which the corresponding voice signal contributes to waking up the terminal, adjusting the located sound source direction with the wake-up parameters improves the accuracy of the target sound source direction.
Fig. 6 is a schematic structural diagram of a sound source direction determining apparatus provided in an embodiment of the present application, and referring to fig. 6, the apparatus includes:
a pickup module 601, configured to pick up M voice signals based on M pickup directions, where each voice signal corresponds to one pickup direction, and the M voice signals are used to wake up a terminal, where M is an integer greater than 1;
a first determining module 602, configured to determine M wake-up parameters and an initial sound source direction based on M voice signals, where the M wake-up parameters are used to represent contribution degrees of the M voice signals for waking up a terminal;
an adjusting module 603, configured to adjust the initial sound source direction based on the M wake-up parameters and the M sound pickup directions, to obtain a target sound source direction.
In one possible implementation, the adjusting module includes:
a selection unit, configured to select N sound pickup directions from the M sound pickup directions based on the M wake-up parameters and the M sound pickup directions, where the N wake-up parameters of the N sound pickup directions are the first N of the M wake-up parameters sorted from largest to smallest, and N is an integer smaller than M and larger than 1;
an adjusting unit, configured to adjust the initial sound source direction based on the N wake-up parameters and the N sound pickup directions to determine the target sound source direction.
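The selection performed by the selection unit amounts to a top-N pick over the wake-up parameters, which might look like the following Python sketch. The function name `select_top_n` and the sample directions and parameters are hypothetical.

```python
def select_top_n(directions, params, n):
    """Return the n pickup directions with the largest wake-up parameters, largest first."""
    order = sorted(range(len(params)), key=lambda i: params[i], reverse=True)[:n]
    return [directions[i] for i in order], [params[i] for i in order]


# Hypothetical M = 6 pickup directions (degrees) and their wake-up parameters.
directions = [0, 60, 120, 180, 240, 300]
params = [0.1, 0.9, 0.5, 0.2, 0.05, 0.3]

top_dirs, top_params = select_top_n(directions, params, 3)
print(top_dirs, top_params)
```

Returning the parameters in descending order keeps the direction with the largest wake-up parameter at index 0, which matches the a1/s1 naming used in the examples of the method embodiment.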
In another possible implementation manner, the adjusting unit includes:
a first determining subunit, configured to determine, based on the N sound pickup directions, a positional relationship of the N sound pickup directions;
a second determining subunit, configured to determine an adjustment angle based on the N wake-up parameters and the N sound pickup directions if the initial sound source direction and the positional relationship satisfy a preset condition, and to adjust the initial sound source direction based on the adjustment angle to obtain the target sound source direction;
a third determining subunit, configured to determine that the target sound source direction is the sound pickup direction corresponding to the largest wake-up parameter if the initial sound source direction and the positional relationship do not satisfy the preset condition.
In another possible implementation manner, the second determining subunit is configured to: when the positional relationship is that exactly one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is not a boundary sound pickup direction of a first sound pickup range, determine, if the initial sound source direction is within the first sound pickup range, a first weight and a first included angle of the sound pickup direction having the smallest included angle with the initial sound source direction; and determine the adjustment angle based on the first weight and the first included angle. The first sound pickup range is the sound pickup range formed by the N sound pickup directions, and the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is greater than the preset angle.
In another possible implementation manner, the second determining subunit is configured to: when the positional relationship is that exactly one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is one of the boundary sound pickup directions of a first sound pickup range, determine, if the initial sound source direction is within a second sound pickup range, a second weight of the other sound pickup directions; and determine the adjustment angle based on the second weight. The first sound pickup range is the sound pickup range formed by the N sound pickup directions, the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is greater than the preset angle, and the second sound pickup range is the sound pickup range formed by the sound pickup directions other than the one corresponding to the largest wake-up parameter.
In another possible implementation manner, the second determining subunit is configured to: when the positional relationship is that the N sound pickup directions are adjacent, and the sound pickup direction corresponding to the largest wake-up parameter is any one of the N sound pickup directions, determine, if the initial sound source direction is within a first sound pickup range, which is the sound pickup range formed by the N sound pickup directions, a third weight and a second included angle of the sound pickup direction having the smallest included angle with the initial sound source direction; and determine the adjustment angle based on the third weight and the second included angle.
In another possible implementation manner, the apparatus further includes:
a second determining module, configured to: if the initial sound source direction is within a first sound pickup range, which is the sound pickup range formed by the N sound pickup directions, determine a fourth weight and a third included angle of the sound pickup direction having the smallest included angle with the initial sound source direction; and determine the adjustment angle based on the fourth weight and the third included angle.
In another possible implementation manner, the second determining subunit is configured to: when the positional relationship is that at least a first pair of adjacent sound pickup directions and a second pair of adjacent sound pickup directions exist among the N sound pickup directions, where the included angle between the first pair is the preset angle and the included angle between the second pair is greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is either direction of the first pair, determine, if the initial sound source direction is within a third sound pickup range, which is the sound pickup range formed by the first pair of adjacent sound pickup directions, a fifth weight and a fourth included angle of the sound pickup direction adjacent to the sound pickup direction corresponding to the largest wake-up parameter; and determine the adjustment angle based on the fifth weight and the fourth included angle.
In another possible implementation manner, the apparatus further includes:
a third determining module, configured to: when the positional relationship is that at least a first pair of adjacent sound pickup directions and a second pair of adjacent sound pickup directions exist among the N sound pickup directions, where the included angle between the first pair is the preset angle and the included angle between the second pair is greater than the preset angle, and the sound pickup direction corresponding to the largest wake-up parameter is not a direction of the first pair, determine, if the initial sound source direction is within a fourth sound pickup range, which is the sound pickup range formed by the first pair of adjacent sound pickup directions, the sum of the wake-up parameters of the adjacent sound pickup directions; and if the sum of the wake-up parameters is not less than the product of the largest wake-up parameter and a preset coefficient, determine that the target sound source direction is the initial sound source direction.
In another possible implementation manner, the apparatus further includes:
a fourth determining module, configured to: if the positional relationship is that a pair of adjacent sound pickup directions whose included angle is the preset angle exists among the N sound pickup directions, and the initial sound source direction and the positional relationship satisfy the preset condition, determine the adjustment angle based on the N wake-up parameters and the N sound pickup directions.
In another possible implementation manner, the apparatus further includes:
a fifth determining module, configured to determine that the target sound source direction is the initial sound source direction if the positional relationship is that no pair of adjacent sound pickup directions whose included angle is the preset angle exists among the N sound pickup directions.
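The adjacency test that separates the fourth and fifth determining modules could be sketched as follows in Python. This is only one possible reading: the function name `has_adjacent_pair` is hypothetical, angles are assumed to be in degrees on a linear scale without wrap-around, and a small tolerance is assumed for comparing floating-point angles.

```python
def has_adjacent_pair(directions, preset_angle, tol=1e-9):
    """True if two neighbouring pickup directions are exactly preset_angle apart."""
    ordered = sorted(directions)
    return any(
        abs((b - a) - preset_angle) <= tol
        for a, b in zip(ordered, ordered[1:])
    )


# Hypothetical selections of N = 3 pickup directions with a preset angle of 30 degrees.
print(has_adjacent_pair([60.0, 90.0, 180.0], 30.0))  # 60 and 90 are adjacent
print(has_adjacent_pair([0.0, 90.0, 180.0], 30.0))   # no adjacent pair
```

When the function returns False, the fifth determining module applies and the target sound source direction is the initial sound source direction; otherwise the adjustment-angle logic of the fourth determining module is used.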
In another possible implementation manner, the apparatus further includes:
a sixth determining module, configured to adjust the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain the target sound source direction if the sound pickup direction having the smallest included angle with the initial sound source direction is not the sound pickup direction corresponding to the largest wake-up parameter.
In another possible implementation manner, the apparatus further includes:
a seventh determining module, configured to determine that the target sound source direction is the initial sound source direction if the sound pickup direction having the smallest included angle with the initial sound source direction is the sound pickup direction corresponding to the largest wake-up parameter.
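The choice between the sixth and seventh determining modules can be illustrated with a short Python sketch. The function name `needs_adjustment` and the sample values are hypothetical, and angles are assumed to be in degrees on a linear scale without wrap-around.

```python
def needs_adjustment(a, directions, params):
    """True if the pickup direction nearest to a is not the one with the largest wake-up parameter."""
    nearest = min(range(len(directions)), key=lambda i: abs(directions[i] - a))
    strongest = max(range(len(params)), key=lambda i: params[i])
    return nearest != strongest


# Hypothetical pickup directions (degrees) and wake-up parameters.
directions = [90.0, 60.0, 120.0]
params = [1.0, 0.5, 0.25]

print(needs_adjustment(85.0, directions, params))  # nearest is 90, which is also the strongest
print(needs_adjustment(70.0, directions, params))  # nearest is 60, but the strongest is 90
```

A False result corresponds to the seventh determining module (keep the initial sound source direction), and a True result corresponds to the sixth determining module (adjust the initial sound source direction).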
In the embodiment of the application, voice signals are picked up in multiple directions, and the located sound source direction is adjusted using the wake-up parameters corresponding to the voice signals. Because a wake-up parameter represents the degree to which the corresponding voice signal contributes to waking up the terminal, adjusting the located sound source direction with the wake-up parameters improves the accuracy of the target sound source direction.
Optionally, the electronic device is provided as a terminal. Fig. 7 shows a block diagram of a terminal 700 according to an exemplary embodiment of the present application. The terminal 700 may be: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. Terminal 700 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, terminal 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. Memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 702 is used to store at least one instruction for execution by processor 701 to implement the sound source direction determination methods provided by method embodiments herein.
In some embodiments, the terminal 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Each peripheral may be connected to the peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral includes at least one of: radio frequency circuit 704, display screen 705, camera assembly 706, audio circuit 707, positioning assembly 708, and power source 709.
The peripheral interface 703 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 701 and the memory 702. In some embodiments, processor 701, memory 702, and peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this application.
The radio frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 704 communicates with communication networks and other communication devices via electromagnetic signals. It converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the World Wide Web, metropolitan area networks, intranets, successive generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, it can also capture touch signals on or above its surface. The touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the terminal 700; in other embodiments, there may be at least two display screens 705, disposed on different surfaces of the terminal 700 or arranged in a folding design; in still other embodiments, the display screen 705 may be a flexible display screen disposed on a curved surface or a folding surface of the terminal 700. Furthermore, the display screen 705 may be given a non-rectangular, irregular shape, that is, an irregularly shaped screen. The display screen 705 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 706 is used to capture images or video. Optionally, the camera assembly 706 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuitry 707 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 701 for processing or inputting the electric signals to the radio frequency circuit 704 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 700. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuitry 707 may also include a headphone jack.
The positioning component 708 is used to determine the current geographic location of the terminal 700 for navigation or LBS (Location Based Service). The positioning component 708 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 709 is used to supply power to the various components of the terminal 700. The power supply 709 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 709 includes a rechargeable battery, the rechargeable battery may support wired charging or wireless charging. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast-charging technology.
In some embodiments, terminal 700 also includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyro sensor 712, pressure sensor 713, fingerprint sensor 714, optical sensor 715, and proximity sensor 716.
The acceleration sensor 711 can detect the magnitude of acceleration in three coordinate axes of a coordinate system established with the terminal 700. For example, the acceleration sensor 711 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 701 may control the touch screen 705 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 may also be used for acquisition of motion data of a game or a user.
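As an illustration of the landscape/portrait decision described above, the following minimal Python sketch compares the gravity components along two device axes. The function name `choose_orientation`, the axis convention (x along the short edge of the screen, y along the long edge), and the tie-breaking rule are assumptions made for illustration; this application does not specify them.

```python
def choose_orientation(gx: float, gy: float) -> str:
    """Pick a UI orientation from gravity components (hypothetical helper).

    gx: gravity component along the short edge of the screen (assumed axis).
    gy: gravity component along the long edge of the screen (assumed axis).
    """
    # When gravity acts mostly along the long edge, the device is held upright.
    if abs(gy) >= abs(gx):
        return "portrait"
    return "landscape"


if __name__ == "__main__":
    print(choose_orientation(0.3, 9.7))   # device held upright
    print(choose_orientation(9.6, 0.5))   # device turned on its side
```

A real terminal would typically also filter the sensor signal and apply hysteresis so that the view does not flip on small movements; that is omitted here.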
The gyro sensor 712 may detect the body orientation and rotation angle of the terminal 700, and may cooperate with the acceleration sensor 711 to capture the user's 3D motion of the terminal 700. Based on the data collected by the gyro sensor 712, the processor 701 may implement the following functions: motion sensing (such as changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 713 may be disposed on a side frame of the terminal 700 and/or in a lower layer of the touch display 705. When the pressure sensor 713 is disposed on a side frame of the terminal 700, it can detect the user's grip signal on the terminal 700, and the processor 701 performs left/right hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed in a lower layer of the touch display 705, the processor 701 controls an operable control on the UI according to the user's pressure operation on the touch display 705. The operable control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 714 is used for collecting a fingerprint of a user, and the processor 701 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 identifies the identity of the user according to the collected fingerprint. When the user identity is identified as a trusted identity, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking a screen, viewing encrypted information, downloading software, paying, changing settings, and the like. The fingerprint sensor 714 may be disposed on the front, back, or side of the terminal 700. When a physical button or a vendor Logo is provided on the terminal 700, the fingerprint sensor 714 may be integrated with the physical button or the vendor Logo.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the touch display 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 705 is increased; when the ambient light intensity is low, the display brightness of the touch display 705 is turned down. In another embodiment, processor 701 may also dynamically adjust the shooting parameters of camera assembly 706 based on the ambient light intensity collected by optical sensor 715.
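The brightness rule above (brighter surroundings, brighter display) can be sketched as a clamped linear mapping. This is only one possible mapping under stated assumptions: the function name `display_brightness`, the lux unit, the 0.1 to 1.0 brightness scale, and the 1000 lux full-scale value are all hypothetical and do not come from this application.

```python
def display_brightness(ambient_lux: float,
                       min_level: float = 0.1,
                       max_level: float = 1.0,
                       full_scale_lux: float = 1000.0) -> float:
    """Map ambient light intensity to a display brightness level (sketch).

    All parameter names and default values are illustrative assumptions.
    """
    # Clamp the normalized light intensity to the range [0, 1].
    ratio = min(max(ambient_lux / full_scale_lux, 0.0), 1.0)
    # Higher ambient light gives a higher brightness level.
    return min_level + (max_level - min_level) * ratio


if __name__ == "__main__":
    for lux in (0.0, 200.0, 800.0, 5000.0):
        print(lux, round(display_brightness(lux), 3))
```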
The proximity sensor 716, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 700. The proximity sensor 716 is used to measure the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the touch display 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually increases, the processor 701 controls the touch display 705 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 7 is not intended to be limiting of terminal 700 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
Optionally, the electronic device is provided as a server. Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application. The server 800 may vary considerably depending on its configuration or performance, and may include one or more processors (CPUs) 801 and one or more memories 802, where the memory 802 stores at least one computer program, and the at least one computer program is loaded and executed by the processor 801 to implement the methods provided by the foregoing method embodiments. The server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and may include other components for implementing the functions of the device, which are not described here.
In an exemplary embodiment, there is also provided a computer-readable storage medium storing at least one instruction, the at least one instruction being loaded and executed by a terminal to implement the sound source direction determining method in the above-described embodiments. The computer readable storage medium may be a memory. For example, the computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which comprises computer program code, which, when executed by a processor, causes a computer to implement the sound source direction determination method in the above-described embodiments.
In an exemplary embodiment, a computer program according to an embodiment of the present application may be deployed to be executed on one computer device, on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network. The multiple computer devices distributed across multiple sites and interconnected by a communication network may constitute a blockchain system.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (17)

1. A method for determining a direction of a sound source, the method comprising:
picking up M voice signals based on M sound pickup directions, wherein each voice signal corresponds to one sound pickup direction, the M voice signals are used for waking up a terminal, and M is an integer greater than 2;
determining M wake-up parameters and an initial sound source direction based on the M voice signals, wherein the M wake-up parameters represent the degree to which the M voice signals contribute to waking up the terminal;
and adjusting the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain a target sound source direction.
2. The method of claim 1, wherein the adjusting the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain a target sound source direction comprises:
selecting N sound pickup directions from the M sound pickup directions based on the M wake-up parameters and the M sound pickup directions, wherein the N wake-up parameters of the N sound pickup directions are the first N of the M wake-up parameters when arranged in descending order, and N is an integer less than or equal to M and greater than 1;
and adjusting the initial sound source direction based on the N wake-up parameters and the N sound pickup directions to determine the target sound source direction.
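The selection step of claim 2 can be illustrated with a short Python sketch that keeps the N sound pickup directions whose wake-up parameters are largest. The function name `select_top_directions`, the representation of directions as angles in degrees, and the example values are assumptions for illustration only; how the wake-up parameters themselves are computed is not covered here.

```python
from typing import List, Sequence, Tuple


def select_top_directions(pickup_directions: Sequence[float],
                          wake_params: Sequence[float],
                          n: int) -> Tuple[List[float], List[float]]:
    """Return the N pickup directions with the largest wake-up parameters.

    Directions are assumed to be angles in degrees (hypothetical convention).
    """
    m = len(pickup_directions)
    if len(wake_params) != m:
        raise ValueError("one wake-up parameter is required per direction")
    if not (1 < n <= m):
        raise ValueError("N must be greater than 1 and at most M")
    # Sort (parameter, direction) pairs from the largest parameter down.
    ranked = sorted(zip(wake_params, pickup_directions),
                    key=lambda pair: pair[0], reverse=True)
    top = ranked[:n]
    return [d for _, d in top], [w for w, _ in top]


if __name__ == "__main__":
    directions = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
    params = [0.05, 0.40, 0.30, 0.10, 0.10, 0.05]
    print(select_top_directions(directions, params, 2))
```

Python's `sorted` is stable, so directions with equal wake-up parameters keep their original relative order; the claims do not say how ties should be broken.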
3. The method of claim 2, wherein the adjusting the initial sound source direction based on the N wake-up parameters and the N sound pickup directions to determine the target sound source direction comprises:
determining a positional relationship of the N sound pickup directions based on the N sound pickup directions;
if the initial sound source direction and the positional relationship meet a preset condition, determining an adjustment angle based on the N wake-up parameters and the N sound pickup directions, and adjusting the initial sound source direction based on the adjustment angle to obtain the target sound source direction;
and if the initial sound source direction and the positional relationship do not meet the preset condition, determining that the target sound source direction is the sound pickup direction corresponding to the maximum wake-up parameter.
4. The method of claim 3, wherein N is greater than 2 and less than M, and the included angle between any two adjacent sound pickup directions of the M sound pickup directions is a preset angle; and the determining an adjustment angle based on the N wake-up parameters and the N sound pickup directions if the initial sound source direction and the positional relationship meet a preset condition comprises:
if the positional relationship is that only one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle larger than the preset angle, the sound pickup direction corresponding to the maximum wake-up parameter is not a boundary sound pickup direction of a first sound pickup range, and the initial sound source direction is within the first sound pickup range, determining a first weight and a first included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, wherein the first sound pickup range is the sound pickup range formed by the N sound pickup directions, and the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is larger than the preset angle;
and determining the adjustment angle based on the first weight and the first included angle.
5. The method of claim 3, wherein N is greater than 2 and less than M, and the included angle between any two adjacent sound pickup directions of the M sound pickup directions is a preset angle; and the determining an adjustment angle based on the N wake-up parameters and the N sound pickup directions if the initial sound source direction and the positional relationship meet a preset condition comprises:
if the positional relationship is that only one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle larger than the preset angle, the sound pickup direction corresponding to the maximum wake-up parameter is one of the boundary sound pickup directions of a first sound pickup range, and the initial sound source direction is within a second sound pickup range, determining second weights of the other sound pickup directions, wherein the first sound pickup range is the sound pickup range formed by the N sound pickup directions, the boundary sound pickup directions are the two adjacent sound pickup directions in the first sound pickup range whose included angle is larger than the preset angle, and the second sound pickup range is the sound pickup range formed by the other sound pickup directions, that is, the sound pickup directions other than the one corresponding to the maximum wake-up parameter;
and determining the adjustment angle based on the second weights.
6. The method of claim 3, wherein N is equal to 2; and the determining an adjustment angle based on the N wake-up parameters and the N sound pickup directions if the initial sound source direction and the positional relationship meet a preset condition comprises:
if the positional relationship is that the N sound pickup directions are adjacent, the sound pickup direction corresponding to the maximum wake-up parameter is either of the N sound pickup directions, and the initial sound source direction is within a first sound pickup range, determining a third weight and a second included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, wherein the first sound pickup range is the sound pickup range formed by the N sound pickup directions;
and determining the adjustment angle based on the third weight and the second included angle.
7. The method of claim 2, wherein N is equal to M, and the method further comprises:
if the initial sound source direction is within a first sound pickup range, determining a fourth weight and a third included angle of the sound pickup direction having the smallest included angle with the initial sound source direction, wherein the first sound pickup range is the sound pickup range formed by the N sound pickup directions;
and determining the adjustment angle based on the fourth weight and the third included angle.
8. The method according to claim 3, wherein the included angle between any two adjacent sound pickup directions of the M sound pickup directions is a preset angle; and the determining an adjustment angle based on the N wake-up parameters and the N sound pickup directions if the initial sound source direction and the positional relationship meet a preset condition comprises:
if the positional relationship is that the N sound pickup directions include at least a first pair of adjacent sound pickup directions whose included angle is the preset angle and a second pair of adjacent sound pickup directions whose included angle is larger than the preset angle, the sound pickup direction corresponding to the maximum wake-up parameter is either direction of the first pair, and the initial sound source direction is within a third sound pickup range, determining a fifth weight and a fourth included angle of the sound pickup direction adjacent to the sound pickup direction corresponding to the maximum wake-up parameter, wherein the third sound pickup range is the sound pickup range formed by the first pair of adjacent sound pickup directions;
and determining the adjustment angle based on the fifth weight and the fourth included angle.
9. The method according to claim 3, wherein the included angle between any two adjacent sound pickup directions of the M sound pickup directions is a preset angle, and the method further comprises:
if the positional relationship is that the N sound pickup directions include at least a first pair of adjacent sound pickup directions whose included angle is the preset angle and a second pair of adjacent sound pickup directions whose included angle is larger than the preset angle, the sound pickup direction corresponding to the maximum wake-up parameter is not in the first pair, and the initial sound source direction is within a fourth sound pickup range, determining the sum of the wake-up parameters of the adjacent sound pickup directions, wherein the fourth sound pickup range is the sound pickup range formed by the first pair of adjacent sound pickup directions;
and if the sum of the wake-up parameters is not less than the product of the maximum wake-up parameter and a preset coefficient, determining that the target sound source direction is the initial sound source direction.
10. The method according to claim 3, wherein the included angle between any two adjacent sound pickup directions of the M sound pickup directions is a preset angle, and the method further comprises:
if the positional relationship is that at least one pair of adjacent sound pickup directions among the N sound pickup directions has an included angle equal to the preset angle, performing the step of determining an adjustment angle based on the N wake-up parameters and the N sound pickup directions if the initial sound source direction and the positional relationship meet the preset condition.
11. The method according to claim 3, wherein the included angle between any two adjacent sound pickup directions of the M sound pickup directions is a preset angle, and the method further comprises:
if the positional relationship is that no pair of adjacent sound pickup directions among the N sound pickup directions has an included angle equal to the preset angle, determining that the target sound source direction is the initial sound source direction.
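A minimal sketch of how the positional relationship referred to in claims 10 and 11 might be checked: it tests whether any two neighbouring directions among the N selected directions are separated by the preset angle. It assumes that directions are angles in degrees arranged around a full circle, so that the last and first directions are also neighbours; the function name `has_pair_at_preset_angle` and the tolerance `tol` are hypothetical and not taken from this application.

```python
from typing import Sequence


def has_pair_at_preset_angle(selected_directions: Sequence[float],
                             preset_angle: float,
                             tol: float = 1e-6) -> bool:
    """Check whether two neighbouring selected directions are exactly one
    preset angle apart (sketch; angles in degrees on a full circle)."""
    dirs = sorted(d % 360.0 for d in selected_directions)
    count = len(dirs)
    # Gap from each direction to the next one, wrapping around the circle.
    gaps = [(dirs[(i + 1) % count] - dirs[i]) % 360.0 for i in range(count)]
    return any(abs(gap - preset_angle) <= tol for gap in gaps)


if __name__ == "__main__":
    print(has_pair_at_preset_angle([60.0, 120.0], 60.0))   # neighbours in the array
    print(has_pair_at_preset_angle([0.0, 180.0], 60.0))    # opposite directions
```

Under this reading, a `True` result corresponds to proceeding to the adjustment-angle step, and a `False` result corresponds to keeping the initial sound source direction as the target.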
12. The method of claim 1, further comprising:
if the sound pickup direction having the smallest included angle with the initial sound source direction is not the sound pickup direction corresponding to the maximum wake-up parameter, performing the step of adjusting the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain the target sound source direction.
13. The method of claim 1, further comprising:
and if the pickup direction with the minimum included angle with the initial sound source direction is the pickup direction corresponding to the maximum awakening parameter, determining that the target sound source direction is the initial sound source direction.
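The decision in claims 12 and 13 (adjust only when the pickup direction nearest to the initial sound source direction is not the one with the maximum wake-up parameter) can be sketched as follows. Directions are assumed to be angles in degrees on a full circle; the helper names `angular_distance` and `needs_adjustment` and the example values are illustrative assumptions, and ties are resolved by taking the first match, which the claims do not specify.

```python
from typing import Sequence


def angular_distance(a: float, b: float) -> float:
    """Smallest included angle between two directions, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def needs_adjustment(initial_direction: float,
                     pickup_directions: Sequence[float],
                     wake_params: Sequence[float]) -> bool:
    """Return True when the initial direction should be adjusted (sketch)."""
    indices = range(len(pickup_directions))
    # Pickup direction closest to the initial sound source direction.
    nearest = min(indices, key=lambda i: angular_distance(
        initial_direction, pickup_directions[i]))
    # Pickup direction with the maximum wake-up parameter.
    strongest = max(indices, key=lambda i: wake_params[i])
    return nearest != strongest


if __name__ == "__main__":
    directions = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
    params = [0.05, 0.40, 0.30, 0.10, 0.10, 0.05]
    print(needs_adjustment(70.0, directions, params))    # nearest is 60: keep
    print(needs_adjustment(100.0, directions, params))   # nearest is 120: adjust
```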
14. An apparatus for determining a direction of a sound source, the apparatus comprising:
a pickup module, configured to pick up M voice signals based on M sound pickup directions, wherein each voice signal corresponds to one sound pickup direction, the M voice signals are used for waking up a terminal, and M is an integer greater than 1;
a first determining module, configured to determine M wake-up parameters and an initial sound source direction based on the M voice signals, wherein the M wake-up parameters represent the degree to which the M voice signals contribute to waking up the terminal;
and an adjusting module, configured to adjust the initial sound source direction based on the M wake-up parameters and the M sound pickup directions to obtain a target sound source direction.
15. An electronic device, characterized in that the electronic device comprises one or more processors and one or more memories having at least one program code stored therein, which is loaded and executed by the one or more processors to implement the sound source direction determining method according to any one of claims 1 to 13.
16. A computer-readable storage medium, characterized in that at least one program code is stored in the storage medium, which is loaded and executed by a processor to implement the sound source direction determining method according to any one of claims 1 to 13.
17. A computer program product, characterized in that the computer program product comprises computer program code, which is stored in a computer readable storage medium, from which a processor of an electronic device reads the computer program code, the processor executing the computer program code, causing the electronic device to execute the sound source direction determination method according to any one of claims 1 to 13.
CN202111659858.5A 2021-12-30 2021-12-30 Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium Pending CN114384466A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111659858.5A CN114384466A (en) 2021-12-30 2021-12-30 Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111659858.5A CN114384466A (en) 2021-12-30 2021-12-30 Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114384466A true CN114384466A (en) 2022-04-22

Family

ID=81199971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111659858.5A Pending CN114384466A (en) 2021-12-30 2021-12-30 Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114384466A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116913277A (en) * 2023-09-06 2023-10-20 北京惠朗时代科技有限公司 Voice interaction service system based on artificial intelligence

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116913277A (en) * 2023-09-06 2023-10-20 北京惠朗时代科技有限公司 Voice interaction service system based on artificial intelligence
CN116913277B (en) * 2023-09-06 2023-11-21 北京惠朗时代科技有限公司 Voice interaction service system based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN110764730B (en) Method and device for playing audio data
CN109558837B (en) Face key point detection method, device and storage medium
CN112907725B (en) Image generation, training of image processing model and image processing method and device
CN110134744B (en) Method, device and system for updating geomagnetic information
US20220164159A1 (en) Method for playing audio, terminal and computer-readable storage medium
CN110933452B (en) Method and device for displaying lovely face gift and storage medium
CN111028144B (en) Video face changing method and device and storage medium
CN110705614A (en) Model training method and device, electronic equipment and storage medium
CN110956580A (en) Image face changing method and device, computer equipment and storage medium
CN114384466A (en) Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium
CN110152309B (en) Voice communication method, device, electronic equipment and storage medium
CN109688064B (en) Data transmission method and device, electronic equipment and storage medium
CN115035187A (en) Sound source direction determining method, device, terminal, storage medium and product
CN112184802B (en) Calibration frame adjusting method, device and storage medium
CN112243083B (en) Snapshot method and device and computer storage medium
CN112329909B (en) Method, apparatus and storage medium for generating neural network model
CN111402873B (en) Voice signal processing method, device, equipment and storage medium
CN113843814A (en) Control system, method, device and storage medium for mechanical arm equipment
CN111488895B (en) Countermeasure data generation method, device, equipment and storage medium
CN112132472A (en) Resource management method and device, electronic equipment and computer readable storage medium
CN110992954A (en) Method, device, equipment and storage medium for voice recognition
CN111179628A (en) Positioning method and device for automatic driving vehicle, electronic equipment and storage medium
CN112990424A (en) Method and device for training neural network model
CN112990421A (en) Method, device and storage medium for optimizing operation process of deep learning network
CN113052408B (en) Method and device for community aggregation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination