CN110488225B - Voice direction indicating method and device, readable storage medium and mobile terminal


Info

Publication number
CN110488225B
Authority
CN
China
Prior art keywords
sound
data
audio
target
audio data
Prior art date
Legal status: Active
Application number
CN201910985754.XA
Other languages
Chinese (zh)
Other versions
CN110488225A (en)
Inventor
郑斌
徐晖
沈思博
Current Assignee
Shenzhen Grey Shark Technology Co ltd
Original Assignee
Nanjing Thunder Shark Information Technology Co Ltd
Application filed by Nanjing Thunder Shark Information Technology Co Ltd
Priority to CN201910985754.XA
Publication of CN110488225A
Application granted
Publication of CN110488225B


Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S1/00Beacons or beacon systems transmitting signals having a characteristic or characteristics capable of being detected by non-directional receivers and defining directions, positions, or position lines fixed relatively to the beacon transmitters; Receivers co-operating therewith
    • G01S1/72Beacons or beacon systems transmitting signals having a characteristic or characteristics capable of being detected by non-directional receivers and defining directions, positions, or position lines fixed relatively to the beacon transmitters; Receivers co-operating therewith using ultrasonic, sonic or infrasonic waves
    • G01S1/76Systems for determining direction or position line
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
    • G01S5/20Position of source determined by a plurality of spaced direction-finders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Stereophonic System (AREA)

Abstract

A sound direction indicating method and apparatus, a readable storage medium and a mobile terminal are provided. The method is applied to a mobile terminal of which at least one side edge is provided with a plurality of groups of lamp strips; with the user in the interface as the center, the directions from which the user receives sound are divided into a plurality of angle ranges, and the lamp strips correspond to the angle ranges. The indicating method comprises the following steps: acquiring two-channel audio data from the system and performing feature extraction on it to obtain audio feature data; judging, through a neural network model, whether the extracted audio feature data is data of a target sound; if so, calculating the sound source direction of the target sound from the audio feature data; and determining the angle range to which the sound source direction belongs, and lighting the target lamp strip corresponding to that range. The embodiment of the invention provides the user with the direction of the sound source through a visual reminder, serving as a preliminary judgment and warning and thereby improving the user experience.

Description

Voice direction indicating method and device, readable storage medium and mobile terminal
Technical Field
The present invention relates to the field of electronic technologies, and in particular, to a method and an apparatus for indicating a sound direction, a readable storage medium, and a mobile terminal.
Background
"Listening to sound to discern position" is one of the basic skills that decide victory or defeat for FPS (first-person shooter) game players: before an enemy player comes into view, the player must judge the enemy's position from gunshots, footsteps, vehicle sounds and the like in order to make correct decisions and seize the initiative. This skill derives from a real acoustic principle: because the pinna structures and positions of the left and right ears differ, the information each ear receives from a given sound source is different; in particular, there is a slight difference in the arrival time of the sound waves, and the human brain estimates the sound direction by evaluating this binaural difference. For this reason, game engines usually provide at least two-channel audio to simulate real-world sound information, allowing the player to "hear" position.
However, when the sound output device or environment is limited, or the player's hearing is limited, it is difficult to discern position from sound alone. This is especially true in mobile games: since mobile phone speakers usually cannot reproduce a binaural effect, players often have to rely on the speaker alone, and may even turn the game volume down to keep the environment quiet. In such cases the player cannot discern position by ear within the game, and the game experience suffers.
Some mobile games do provide a degree of UI prompting as assistance in such situations; for example, "peace X ying" marks the location of gunshots on its mini-map. However, such prompts are defined by each game's developer, differ greatly from game to game or are absent entirely, and are not conspicuous enough for the player to distinguish direction quickly.
Disclosure of Invention
In view of the above, it is necessary to provide a sound direction indicating method and apparatus, a readable storage medium, and a mobile terminal, directed at the problem in the prior art that it is often difficult to discern the direction of a sound source by ear in a game.
A sound direction indicating method is applied to a mobile terminal, at least one side edge of which is provided with a plurality of groups of lamp strips; with the position of the user's head in the mobile terminal interface as the center, the directions from which the user receives sound are divided into a plurality of angle ranges, and the plurality of groups of lamp strips correspond to the plurality of angle ranges; the indicating method comprises the following steps:
acquiring audio data of double channels in a system, and performing feature extraction on the audio data to obtain audio feature data;
judging whether the extracted audio characteristic data is the data of the target sound or not through a neural network model;
if yes, calculating the sound source direction of the target sound according to the audio characteristic data;
and determining the angle range to which the sound source direction belongs, and controlling the target lamp band corresponding to the angle range to be lightened.
Further, the method for indicating a sound direction, wherein the step of determining whether the extracted audio feature data is data of a target sound through a neural network model includes:
sequentially averaging the data of every three sampling points of the audio feature data on each channel to obtain a new piece of audio data;
Carrying out average calculation on left and right sound channels of the new audio data at each sampling point, and mixing the audio data into single-channel audio data;
and calculating the single-channel audio data through a neural network model to determine whether the single-channel audio data is the target sound.
Further, the method for indicating a sound direction, wherein the step of determining whether the extracted audio feature data is data of a target sound through a neural network model further includes:
calculating the type of the target sound through the neural network model;
the step of controlling the target lamp strip corresponding to the angle range to be lightened further comprises the following steps:
and controlling the target lamp belt to display according to the lamp belt special effect corresponding to the type of the target sound, wherein the lamp belt special effect comprises at least one of lamp light color, lamp light flickering according to preset frequency and gradual brightness change.
Further, in the sound direction indicating method, the step of calculating the sound source direction of the target sound according to the audio feature data includes:
and calculating the audio characteristic data through a GCC-PHAT algorithm to obtain the time difference of the target sound reaching the user, and calculating the sound source direction of the target sound according to the time difference.
Further, the method for indicating a sound direction, wherein the step of acquiring the audio data of the two channels in the system includes:
acquiring a two-channel audio data stream recorded in a system, and continuously adding the audio data stream into a cache block;
and acquiring the audio fragment data in the cache block as audio data.
Further, the method for indicating a sound direction, wherein the step of extracting features of the audio data further includes:
and carrying out audio amplitude modulation on the audio data according to the current volume of the system and a preset target volume so as to enable the volume of the audio data to be the target volume.
Further, in the sound direction indicating method, at least one side of the mobile terminal is uniformly provided with three groups of lamp strips; a polar coordinate system is established with the position of the user's head as the pole and the direction straight ahead of the user's head as the polar axis, and the directions from which the user receives sound are divided into three angle ranges, respectively: -120° to -30°; -30° to 0°, 0° to 30°, 120° to 180°, and -180° to -120°; and 30° to 120°.
Further, in the method for indicating a sound bearing, the step of calculating a sound source direction of the target sound according to the audio feature data further includes:
calculating the distance between the sound source of the target sound and a user according to the audio characteristic data;
the step of controlling the target lamp strip corresponding to the angle range to be lightened further comprises the following steps:
and controlling the brightness of the target lamp belt to be the brightness corresponding to the distance or controlling the target lamp belt to flicker according to the frequency corresponding to the distance.
An embodiment of the present invention further provides an indicating device for a sound direction, which is applied to a mobile terminal, wherein at least one side edge of the mobile terminal is provided with a plurality of groups of lamp strips, a direction in which a user receives sound is divided into a plurality of angle ranges by taking a user head portion in an interface of the mobile terminal as a center, the plurality of groups of lamp strips and the plurality of angle ranges are in a corresponding relationship, and the indicating device includes:
the extraction module is used for acquiring the audio data of the two channels in the system and extracting the characteristics of the audio data to obtain audio characteristic data;
the judgment module is used for judging whether the extracted audio characteristic data is the data of the target sound or not through the neural network model;
the calculating module is used for calculating the sound source direction of the target sound according to the audio characteristic data;
and the control module is used for determining the angle range to which the sound source direction belongs and controlling the target lamp band corresponding to the angle range to be lightened.
An embodiment of the present invention further provides a readable storage medium, on which a program is stored, where the program, when executed by a processor, implements any of the methods described above.
An embodiment of the present invention further provides a mobile terminal, which includes a memory, a processor, and a program stored in the memory and executable on the processor, and when the processor executes the program, the method is implemented as any one of the above methods.
In the embodiment of the invention, the approximate direction of the sound source is indicated by the lamp strips on the side of the mobile terminal. When the sound output device or environment is limited, or the player finds it difficult to discern position by ear, the approximate direction is provided to the user as a visual reminder, serving as a preliminary judgment and warning and thereby improving the game experience.
Drawings
FIG. 1 is a flow chart of a method of indicating a bearing of a sound according to a first embodiment of the present invention;
fig. 2 is a schematic view of lamp strips arranged on two sides of a mobile terminal and a corresponding angle range in a first embodiment of the present invention;
FIG. 3 is a flow chart of a method of indicating the bearing of a sound according to a second embodiment of the present invention;
FIG. 4 is a diagram illustrating a structure of a cache set according to a second embodiment of the present invention;
fig. 5 is a block diagram showing a configuration of a device for indicating the direction of sound in a third embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
These and other aspects of embodiments of the invention will be apparent with reference to the following description and attached drawings. In the description and drawings, particular embodiments of the invention have been disclosed in detail as being indicative of some of the ways in which the principles of the embodiments of the invention may be practiced, but it is understood that the scope of the embodiments of the invention is not limited correspondingly. On the contrary, the embodiments of the invention include all changes, modifications and equivalents coming within the spirit and terms of the claims appended hereto.
Referring to fig. 1, a sound direction indicating method according to a first embodiment of the present invention is applied to a mobile terminal such as a mobile phone, a tablet computer, or a personal digital assistant. At least one side edge of the mobile terminal is provided with a plurality of groups of lamp strips; with the position of the user's head in the mobile terminal interface as the center, the directions from which the user receives sound are divided into a plurality of angle ranges, and the lamp strips correspond to the angle ranges. The user here refers to the virtual character corresponding to the user in an application program (such as a game) running on the mobile terminal. The indicating method includes steps S11 to S14.
Step S11: acquire the two-channel audio data in the system, and perform feature extraction on the audio data to obtain audio feature data.
A mobile terminal usually provides two-channel audio data, and the audio data played in the system can be acquired through an audio capture interface of the mobile terminal. The acquired audio data contains all sounds played by the terminal system: during a game it is the game audio; if music playback is running at the same time, the recorded sound is a mix of the two, and the music then interferes with the judgment. In this embodiment, feature extraction is performed on the acquired audio data; a commonly used audio feature extraction algorithm is the MFCC (mel-frequency cepstral coefficient) method, which extracts the identifying components of the audio signal and discards other, interfering information. Applying the MFCC method to the audio data yields the audio feature data.
Step S12, determining whether the extracted audio feature data is the data of the target sound through the neural network model.
The audio feature data is processed by the neural network model, and whether it belongs to a target sound is judged from the result. A target sound is a sound in the game, for example a footstep, a gunshot, or a vehicle sound.
The discrimination of the audio data can be based on an existing neural network model, for example a Deep Neural Network (DNN) model. By the position of its layers, the layers inside a DNN can be divided into three kinds: an input layer, hidden layers, and an output layer.
In a specific implementation, a DNN model is first constructed and trained; the model can be trained on audio data using the TensorFlow framework to determine its parameters, and the trained DNN model is used to discriminate whether the sound in the audio data is a target sound or a non-target sound. The acquired audio data is then input into the trained DNN model, which outputs the category of the audio data, i.e., target sound or non-target sound. The audio feature vectors can be calculated from the audio data using the MFCC (mel-frequency cepstral coefficient) method and fed to the trained DNN model.
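As a hedged sketch of this discrimination step: the patent trains its DNN with TensorFlow, but the forward pass that turns a feature vector into a target/non-target decision can be illustrated with a minimal hand-weighted network. All layer sizes and weights below are illustrative, not the patent's trained parameters.

```python
import math

def dnn_forward(features, weights):
    """Forward pass of a small fully connected network:
    input -> hidden layer(s) with ReLU -> one output neuron with sigmoid.
    `weights` is a list of (matrix, bias) pairs, one per layer."""
    x = features
    for i, (W, b) in enumerate(weights):
        x = [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]
        if i < len(weights) - 1:              # hidden layers use ReLU
            x = [max(0.0, v) for v in x]
    return 1.0 / (1.0 + math.exp(-x[0]))      # sigmoid: P(target sound)

# Illustrative hand-set weights; a real model is trained (here, with TensorFlow).
weights = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # 3 inputs -> 2 hidden
    ([[1.2, -0.7]], [-0.1]),                              # 2 hidden -> 1 output
]
p = dnn_forward([0.4, 0.1, 0.9], weights)     # a toy 3-value feature vector
is_target = p > 0.5
```

A production model would have far larger layers and learned weights, but the layer structure (input, hidden, output) matches the DNN description above.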
Step S13: when the audio feature data is judged to be data of the target sound, calculate the sound source direction of the target sound from the audio feature data.
The audio feature data can be processed with the GCC-PHAT algorithm (Generalized Cross-Correlation with Phase Transform) to obtain the time difference with which the target sound reaches the user, and the sound source direction of the target sound is then calculated from this time difference. The specific calculation steps are as follows:
and step S1, converting the audio characteristic data of the left and right channels into complex form data, and respectively applying fast Fourier transform to obtain two groups of frequency domain data.
Step S2, calculating the complex conjugate of one set of frequency domain data, multiplying the complex conjugate by another set of frequency domain data, and then applying inverse fourier transform to obtain the cross-correlation function.
And step S3, calculating the peak value of the cross-correlation function, wherein the coordinate position of the peak value is the phase difference of the two channels.
Step S4: divide the phase difference by the sampling frequency to obtain the TDOA (time difference of arrival) τ, and then calculate the sound source angle α from it with the formula:

sin α = τ × (c / b);

where the time difference of arrival τ = Δn / f, α is the angle of incidence of the sound source relative to the observer, b is the horizontal distance between the left and right ears, c is the speed of sound, Δn is the phase difference between the left and right channels (in samples), and f is the audio sampling frequency.
Here, b is taken as 20 cm, c as 343 m/s, and f can be 16000 Hz.
Thus, once the phase difference Δn between the left- and right-channel data has been obtained with the GCC-PHAT algorithm, the sound source angle α can be calculated.
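Steps S1 through S4 can be sketched with NumPy as follows. The test signal, buffer length and 4-sample inter-channel delay are illustrative; b = 20 cm, c = 343 m/s and f = 16000 Hz follow the values given above.

```python
import numpy as np

def gcc_phat_tdoa(left, right, fs):
    """GCC-PHAT: FFT both channels (S1), multiply one spectrum by the
    conjugate of the other with phase-transform weighting and inverse-FFT
    to get the cross-correlation (S2), then read off the peak (S3/S4)."""
    n = len(left) + len(right)                  # zero-pad to avoid circular wrap
    spec = np.fft.rfft(left, n=n) * np.conj(np.fft.rfft(right, n=n))
    spec /= np.abs(spec) + 1e-12                # PHAT: keep phase information only
    cc = np.fft.irfft(spec, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    delta_n = np.argmax(np.abs(cc)) - max_shift  # phase difference in samples
    return delta_n / fs                          # tau = delta_n / f

def source_angle_deg(tau, b=0.20, c=343.0):
    """sin(alpha) = tau * c / b, with the argument clipped to [-1, 1]."""
    return float(np.degrees(np.arcsin(np.clip(tau * c / b, -1.0, 1.0))))

fs = 16000
rng = np.random.default_rng(0)
sig = rng.standard_normal(1024)                 # broadband test signal
delay = 4                                       # left channel lags by 4 samples
left = np.concatenate((np.zeros(delay), sig))
right = np.concatenate((sig, np.zeros(delay)))
tau = gcc_phat_tdoa(left, right, fs)
alpha = source_angle_deg(tau)
```

The PHAT weighting discards magnitude and keeps only phase, which is what makes the correlation peak sharp for broadband sounds such as gunshots and footsteps.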
Step S14: determine the angle range to which the sound source direction belongs, and control the target lamp strip corresponding to that angle range to light up.
At least one side of the mobile terminal is provided with a plurality of groups of lamp strips; each group contains at least one lamp bead, which lights up when current flows through it. In this embodiment, multiple groups of lamp strips may be arranged on both sides of the mobile terminal, at opposite positions and with identical effects. For example, as shown in fig. 2, three groups of lamp strips are evenly arranged on each side: on one side they are denoted L1, L2 and L3 from left to right, and on the other side N1, N2 and N3 from left to right. A polar coordinate system is established with the position of the user's head in the mobile terminal interface as the pole and the direction straight ahead of the head as the polar axis, and the directions from which the user receives sound are divided into three angle ranges A, B1-B4 and C, where A covers -120° to -30°; B1-B4 cover -30° to 0°, 0° to 30°, 120° to 180°, and -180° to -120°; and C covers 30° to 120°. The correspondence between the lamp strips on the two sides of the mobile terminal and the three angle ranges is shown in Table 1.
Table 1
Angle range A (-120° to -30°): lamp strips L3 and N3
Angle ranges B1-B4 (-30° to 0°, 0° to 30°, 120° to 180°, -180° to -120°): lamp strips L1 and N1
Angle range C (30° to 120°): lamp strips L2 and N2
When the angle range to which the sound source direction belongs has been determined, the lamp strip corresponding to that angle range is lit. For example, if the sound source lies in the 25° direction of the polar coordinate system, lamp strips L1 and N1 are lit, and the user can intuitively judge the approximate direction of the sound source from the lit strips.
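The lookup from source angle to lamp-strip pair can be written directly from the ranges above. Note that the pairing of range A with L3/N3 is an inference here, since the text only gives worked lit-strip examples for ranges B (25°) and C (60°); the function name is illustrative.

```python
def strips_for_angle(angle_deg):
    """Map a source angle (degrees; polar axis = straight ahead of the
    user's head, range (-180, 180]) to the pair of lamp strips to light."""
    if -120 <= angle_deg <= -30:
        return ("L3", "N3")        # range A (pairing inferred)
    if 30 <= angle_deg <= 120:
        return ("L2", "N2")        # range C
    return ("L1", "N1")            # ranges B1-B4: ahead of or behind the user

pair = strips_for_angle(25)        # the worked example above
```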
It should be noted that the number of lamp strips and the division of angle ranges in this embodiment are only examples, and the invention is not limited thereto. In other embodiments, other arrangements are possible: for example, four or five lamp strips may be provided on each side, with four or five corresponding angle ranges, and the angle range corresponding to each group of strips may be set as actually needed. The lamp strips may also be provided on one side only.
In this embodiment, the lamp strips on the sides of the mobile terminal indicate the approximate direction of the sound source. When the sound output device or environment is limited, or the player finds it difficult to discern position by ear, the approximate direction is provided to the user as a visual reminder for preliminary judgment, thereby improving the game experience.
Referring to fig. 3, a method for indicating a sound direction according to a second embodiment of the present invention includes steps S21 to S29.
Step S21: acquire the two-channel audio data stream recorded in the system, and continuously append the stream to the buffer blocks.
Step S22: take the audio fragment data in the buffer block as the audio data, and extract the audio feature data from it.
Step S23: sequentially average the data of every three sampling points on each channel to obtain a new piece of audio data.
For identification and classification, the waveform data is first framed, i.e., cut into segments with a buffer window. Naive framing weakens the signal at both frame ends, so part of the data is retained as the beginning of the next frame. To this end, a buffer group as shown in fig. 4 is maintained: the data in the overlap region is copied into the next frame, and when a buffer fills up, its data is passed to the next calculation stage and the buffer is emptied.
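A minimal sketch of such an overlapping buffer group follows; the frame length and overlap below are small illustrative values, whereas the buffer described next holds 500 ms of audio.

```python
def frame_stream(chunks, frame_len, overlap):
    """Append incoming stream chunks to a buffer; whenever the buffer
    fills to frame_len samples, emit the frame and start the next one
    with the last `overlap` samples copied over, so that the signal at
    the frame edges is not lost to windowing."""
    buf = []
    for chunk in chunks:
        for sample in chunk:
            buf.append(sample)
            if len(buf) == frame_len:
                yield list(buf)
                buf = buf[-overlap:]   # overlap region seeds the next frame

stream = [list(range(10)), list(range(10, 20))]   # two incoming chunks
frames = list(frame_stream(stream, frame_len=8, overlap=2))
```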
In this embodiment, the mobile terminal records two-channel audio stream data at a sampling frequency of 48000 Hz. Using the audio buffer group, the audio data stream is continuously appended to the buffer blocks; when a buffer has accumulated 96000 bytes, i.e. 500 ms of two-channel audio data, the buffered data is resampled to 16000 Hz. In a specific implementation, every three sampling points on each channel of the audio fragment are averaged into one, yielding a new piece of audio resampled from 48000 Hz to 16000 Hz.
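The 48 kHz to 16 kHz step above is just a 3:1 block average on each channel; a minimal sketch:

```python
def downsample_3to1(samples):
    """Average every three consecutive samples of one channel, turning
    48000 samples/s into 16000 samples/s as described above. Any trailing
    samples that do not fill a group of three are dropped."""
    usable = len(samples) - len(samples) % 3
    return [sum(samples[i:i + 3]) / 3 for i in range(0, usable, 3)]

out = downsample_3to1([0.3, 0.6, 0.9, 1.0, 1.0, 1.0])  # 6 in -> 2 out
```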
Step S24: average the left and right channels at each sampling point of the new audio data, mixing it down into single-channel audio data.
The audio data acquired by the mobile terminal is two-channel; for identification and classification it must be mixed down to a single channel. Averaging the left-channel and right-channel values at each sampling point mixes the two channels into single-channel audio data.
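The mixdown is simply a per-sample average of the two channels; a minimal sketch:

```python
def mix_to_mono(left, right):
    """Average the left and right value at each sampling point to mix
    two-channel audio into single-channel audio."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]

mono = mix_to_mono([1.0, 0.0, -0.5], [0.0, 1.0, -0.5])
```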
Step S25: process the single-channel audio data with a neural network model to determine whether it is data of the target sound.
Step S26: when the single-channel audio data is data of the target sound, calculate the type of the target sound with the neural network model.
The discrimination of the audio data can be based on an existing neural network model, for example a DNN (Deep Neural Network) model. In a specific implementation, a DNN model is constructed and trained with various types of audio data; the model can be trained using the TensorFlow framework to determine its parameters, and the trained model discriminates the sound type of the audio data. The recognizable sound types are determined by the training data: for example, the DNN model can be trained with pre-collected audio of gunshots, footsteps, vehicle sounds, etc. from the game. The single-channel audio mixed down by the mobile terminal is then input into the trained DNN model, which outputs the type of the audio data.
In a specific implementation, the neural network model outputs the classification of the target sound when processing the single-channel audio data, and outputs 0 when the single-channel data contains no target sound.
Step S27, calculating the sound source direction of the target sound according to the audio feature data.
The audio feature data can be processed with the GCC-PHAT algorithm to determine the time difference with which the target sound reaches the user, and the sound source direction of the target sound is calculated from this time difference.
Step S28, determining an angle range to which the sound source direction belongs.
Step S29: light the target lamp strip corresponding to the angle range, displaying the lamp-strip effect corresponding to the type of the target sound.
In this embodiment, three groups of lamp strips may be arranged on each of the two side edges of the mobile terminal, and with the user in the mobile terminal interface as the center, the directions from which the user receives sound are divided into three angle ranges; the three lamp strips on each side correspond one-to-one to the three angle ranges. Specifically, the correspondence between the lamp strips on the two sides and the three angle ranges may be as shown in Table 1 of the above embodiment. When the angle range to which the sound source direction belongs has been determined, the lamp strip corresponding to that range is lit. For example, if the sound source lies in the 60° direction of the polar coordinate system, lamp strips L2 and N2 are lit, and the user can intuitively judge the approximate direction of the sound source from the lit strips. If target sounds come from several directions at once, the lamp strips corresponding to those directions are lit simultaneously.
Further, each sound type corresponds to a lamp-strip effect, which includes at least one of light color, blinking at a preset frequency, and gradual change of brightness. The user can judge the sound type from the effect of the target lamp strip: for example, vehicle sounds in the game may correspond to red light, footsteps to yellow light blinking once per second, and gunshots to light whose brightness gradually increases.
That is, when the target sound is a vehicle sound, the target lamp strip is lit in red; when the target sound is a footstep sound, the target lamp strip is lit in yellow and blinks at the frequency corresponding to that sound type; when the target sound is a gunshot, the target lamp strip is lit with gradually increasing brightness.
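Putting the examples above into code form (the colors, blink frequency and brighten flag are the illustrative values from the text, not fixed by the invention; the field names are assumptions):

```python
def strip_effect(sound_type):
    """Return the lamp-strip effect for a recognized sound type, or None
    for an unrecognized one. blink_hz == 0 means a steady light."""
    effects = {
        "car":      {"color": "red",    "blink_hz": 0, "brighten": False},
        "footstep": {"color": "yellow", "blink_hz": 1, "brighten": False},
        "gunshot":  {"color": None,     "blink_hz": 0, "brighten": True},
    }
    return effects.get(sound_type)
```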
It can be understood that the lamp-strip effects listed above are only examples; in a specific implementation the effect for each type may be designed and combined according to the actual situation, and other effects may be added.
Further, the duration of the light strip effect may be set, for example to 2 s, meaning the target light strip stays lit for 2 s. An indication of that length is enough to attract the user's attention without keeping the strip lit continuously, avoiding unnecessary battery drain.
Further, in other embodiments of the present invention, before the step of extracting the audio feature data from the audio data, audio amplitude modulation may be performed on the acquired two-channel audio data according to the current system volume and a preset target volume.
The audio data recorded from the system has usually already been scaled by the system volume setting, so the data obtained at different system volumes differ, which is inconvenient for audio recognition. The acquired audio data therefore needs to be amplitude-modulated so that its volume is always constant, with the following specific implementation steps:
step S31, monitoring and saving the parameter value of the current system volume;
step S32, taking the exponential of the current volume parameter value and of the preset target volume parameter value respectively, and taking the ratio of the two exponentials as the amplitude ratio of the target volume to the current volume;
step S33, multiplying each audio sample of the audio data by the amplitude ratio to obtain amplitude-modulated audio data.
In a specific implementation, the system presets a target volume parameter value Vt, and monitors and stores the current volume parameter value of the mobile phone system through an interface, denoted Vn. Because perceived loudness is exponential in the volume parameter, the audio amplitude is also exponential in the volume parameter; taking the exponential of the target volume and of the current volume respectively, the ratio of the two exponentials is the amplitude ratio of the target volume to the current volume:
scale = exp(Vt) / exp(Vn)
After the system audio data is obtained, each audio sample is multiplied by scale, which converts the data to the target volume; data captured at different volume levels is thus always amplitude-modulated to the same target volume.
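Steps S31–S33 can be sketched as follows. The function name and the plain-list sample representation are illustrative; a real implementation would operate on PCM buffers obtained from the system.

```python
import math

def normalize_volume(samples, vn, vt):
    """Amplitude-modulate raw audio samples recorded at current volume
    parameter vn so they match the amplitude at target volume vt.
    Since amplitude is exponential in the volume parameter, the
    amplitude ratio is scale = exp(vt) / exp(vn)."""
    scale = math.exp(vt) / math.exp(vn)
    return [s * scale for s in samples]
```

With vn equal to vt the scale is 1 and the samples pass through unchanged; a louder target volume scales every sample up by the same exponential ratio.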
As another implementation of the embodiment of the present invention, the distance between the sound source of the target sound and the user may further be indicated by the brightness of the target light strip. In game development, the distance between a sound source and the user is often simulated by attenuating the game audio volume with distance, so the user can judge the distance from the intensity of the received sound; in the mobile terminal, that intensity can be represented by the voltage amplitude of the left- and right-channel audio signals. That is, the mobile terminal obtains the voltage amplitude from the audio feature data, calculates the distance between the sound source and the user from that amplitude, and sets the brightness of the target light strip to the brightness corresponding to that distance. The correspondence between distance values (or distance ranges) and brightness is prestored in the mobile terminal system and queried when needed.
In other embodiments of the present invention, the distance between the sound source and the user can also be indicated in other ways, for example by the flashing frequency of the light: a closer sound corresponds to a higher flashing frequency, and a farther sound to a lower one.
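A possible linear mapping from distance to flashing frequency is sketched below. The distance bounds and frequency limits are illustrative assumptions, not values from the patent, which leaves the exact correspondence to the prestored table.

```python
def distance_to_blink_hz(distance, d_near=5.0, d_far=50.0,
                         f_max=4.0, f_min=0.5):
    """Map sound-source distance to a light-flash frequency in Hz:
    closer sources flash faster, farther sources slower, clamped at
    the near/far bounds. All parameter defaults are illustrative."""
    if distance <= d_near:
        return f_max
    if distance >= d_far:
        return f_min
    t = (distance - d_near) / (d_far - d_near)   # 0 at near, 1 at far
    return f_max + t * (f_min - f_max)           # linear interpolation
```

The same shape of mapping works for brightness instead of flash frequency; only the output units change.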
Please refer to fig. 5, which shows a sound direction indicating device according to a third embodiment of the present invention, applied to a mobile terminal. At least one side edge of the mobile terminal is provided with a plurality of groups of light strips, the directions from which a user receives sound are divided into a plurality of angular ranges centered on the user in the interface, and the groups of light strips correspond to the angular ranges. The indicating device includes:
the extraction module 10, configured to acquire two-channel audio data from the system and perform feature extraction on it to obtain audio feature data;
the judging module 20, configured to judge, through a neural network model, whether the extracted audio feature data is data of a target sound;
the calculating module 30, configured to calculate the sound source direction of the target sound according to the audio feature data;
and the control module 40, configured to determine the angular range to which the sound source direction belongs and control the target light strip corresponding to that range to be lit.
Further, in the above sound direction indicating device, the judging module 20 includes:
the sampling module, configured to average every three successive sampling points of the audio feature data on each channel in turn, obtaining a segment of new audio data;
the mean value calculation module, configured to average the left and right channels at each sampling point of the new audio data, thereby mixing it down to mono audio data;
and the judgment submodule, configured to process the mono audio data through a neural network model and determine whether it is the target sound.
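The sampling and mean value calculation performed by these modules can be sketched as follows. Plain Python lists are used for clarity; real code would operate on fixed-point PCM buffers.

```python
def mix_to_mono(left, right):
    """Average every three successive samples on each channel, then
    average the left and right channels at each remaining point,
    yielding mono data at one third of the original sample rate."""
    def down3(channel):
        n = len(channel) - len(channel) % 3           # drop the tail
        return [sum(channel[i:i + 3]) / 3.0 for i in range(0, n, 3)]
    l3, r3 = down3(left), down3(right)
    return [(a + b) / 2.0 for a, b in zip(l3, r3)]
```

The 3:1 downsampling reduces the amount of data fed to the neural network model while the channel average removes the inter-channel level difference that is only needed later, for direction finding.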
Further, in the sound direction indicating device, the calculating module 30 is further configured to calculate the type of the target sound through the neural network model;
and the control module 40 is further configured to control the target light strip to display the light strip effect corresponding to the type of the target sound, where the effect includes at least one of a light color, flashing at a preset frequency, and a gradual change in brightness.
The sound direction indicating device provided by the embodiment of the present invention has the same implementation principle and technical effect as the foregoing method embodiment; for brevity, where the device embodiment is silent, reference may be made to the corresponding content in the method embodiment.
The present invention also proposes a readable storage medium on which a computer program is stored which, when executed by a processor, implements the sound direction indicating method described above.
An embodiment of the present invention also provides a mobile terminal comprising a memory, a processor, and a program stored on the memory and executable on the processor, wherein the processor implements the above method when executing the program.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above-mentioned embodiments express only several embodiments of the present invention, and their description is specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A sound direction indicating method applied to a mobile terminal, wherein at least one side edge of the mobile terminal is provided with a plurality of groups of light strips, the directions from which a user receives sound are divided into a plurality of angular ranges centered on the position of the user's head in the interface of the mobile terminal, and the groups of light strips correspond to the angular ranges, the method comprising the following steps:
acquiring two-channel audio data from a system, performing feature extraction on the audio data to obtain audio feature data, and performing mean value calculation on the extracted two-channel audio feature data so as to mix it into mono audio data;
judging, through a neural network model, whether the mono audio data is data of a target sound;
if yes, calculating the sound source direction of the target sound according to the audio feature data;
and determining the angular range to which the sound source direction belongs, and controlling the target light strip corresponding to that angular range to be lit.
2. The sound direction indicating method of claim 1, wherein the step of performing mean value calculation on the extracted two-channel audio feature data so as to mix it into mono audio data comprises:
averaging every three successive sampling points of the audio feature data on each channel in turn to obtain a segment of new audio data;
and averaging the left and right channels at each sampling point of the new audio data, thereby mixing it into mono audio data.
3. The sound direction indicating method of claim 1, wherein the step of judging, through a neural network model, whether the mono audio data is data of a target sound further comprises:
calculating the type of the target sound through the neural network model;
and the step of controlling the target light strip corresponding to the angular range to be lit further comprises:
controlling the target light strip to display the light strip effect corresponding to the type of the target sound, wherein the light strip effect comprises at least one of a light color, flashing at a preset frequency, and a gradual change in brightness.
4. The sound direction indicating method of claim 1, wherein the step of calculating the sound source direction of the target sound according to the audio feature data comprises:
processing the audio feature data through the GCC-PHAT algorithm to obtain the time difference of arrival of the target sound at the user, and calculating the sound source direction of the target sound from the time difference.
5. The sound direction indicating method of claim 1, wherein before the step of performing feature extraction on the audio data, the method further comprises:
performing audio amplitude modulation on the audio data according to the current system volume and a preset target volume, so that the volume of the audio data becomes the target volume.
6. The sound direction indicating method of claim 1, wherein three groups of light strips are uniformly disposed on at least one side edge of the mobile terminal, a polar coordinate system is established with the position of the user's head as the pole and the direction straight ahead of the user's head as the polar axis, and the directions from which the user receives sound are divided into three angular ranges, namely: -30° to -120°; 30° to 120°; and the combined range of -30° to 0°, 0° to 30°, 120° to 180°, and -120° to -180°.
7. The sound direction indicating method of claim 1, wherein the step of calculating the sound source direction of the target sound according to the audio feature data further comprises:
calculating the distance between the sound source of the target sound and the user according to the audio feature data;
and the step of controlling the target light strip corresponding to the angular range to be lit further comprises:
controlling the brightness of the target light strip to be the brightness corresponding to the distance, or controlling the target light strip to flash at the frequency corresponding to the distance.
8. A sound direction indicating device applied to a mobile terminal, wherein at least one side edge of the mobile terminal is provided with a plurality of groups of light strips, the directions from which a user receives sound are divided into a plurality of angular ranges centered on the position of the user's head in the interface of the mobile terminal, and the groups of light strips correspond to the angular ranges, the device comprising:
the extraction module, configured to acquire two-channel audio data from the system, perform feature extraction on the audio data to obtain audio feature data, and perform mean value calculation on the extracted two-channel audio feature data so as to mix it into mono audio data;
the judging module, configured to judge, through a neural network model, whether the mono audio data is data of a target sound;
the calculating module, configured to calculate the sound source direction of the target sound according to the audio feature data;
and the control module, configured to determine the angular range to which the sound source direction belongs and control the target light strip corresponding to that angular range to be lit.
9. A readable storage medium on which a program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1 to 7.
10. A mobile terminal comprising a memory, a processor, and a program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the program.
CN201910985754.XA 2019-10-17 2019-10-17 Voice direction indicating method and device, readable storage medium and mobile terminal Active CN110488225B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910985754.XA CN110488225B (en) 2019-10-17 2019-10-17 Voice direction indicating method and device, readable storage medium and mobile terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910985754.XA CN110488225B (en) 2019-10-17 2019-10-17 Voice direction indicating method and device, readable storage medium and mobile terminal

Publications (2)

Publication Number Publication Date
CN110488225A CN110488225A (en) 2019-11-22
CN110488225B true CN110488225B (en) 2020-02-07

Family

ID=68544717

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910985754.XA Active CN110488225B (en) 2019-10-17 2019-10-17 Voice direction indicating method and device, readable storage medium and mobile terminal

Country Status (1)

Country Link
CN (1) CN110488225B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110972053B (en) * 2019-11-25 2021-06-25 腾讯音乐娱乐科技(深圳)有限公司 Method and related apparatus for constructing a listening scene
CN111929645B (en) * 2020-09-23 2021-01-26 深圳市友杰智新科技有限公司 Method and device for positioning sound source of specific human voice and computer equipment
CN112415467B (en) * 2020-11-06 2022-10-25 中国海洋大学 Single-vector subsurface buoy target positioning implementation method based on neural network
CN114355289B (en) * 2022-03-19 2022-06-10 深圳市烽火宏声科技有限公司 Sound source positioning method, sound source positioning device, storage medium and computer equipment
CN117953908A (en) * 2022-10-18 2024-04-30 抖音视界有限公司 Audio processing method and device and terminal equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10939201B2 (en) * 2013-02-22 2021-03-02 Texas Instruments Incorporated Robust estimation of sound source localization
CN105378826B (en) * 2013-05-31 2019-06-11 诺基亚技术有限公司 Audio scene device
CN107231586A (en) * 2016-03-24 2017-10-03 徐超 Sound is listened to distinguish the method and device of position
CN107450883B (en) * 2017-07-19 2019-01-29 维沃移动通信有限公司 A kind of audio data processing method, device and mobile terminal
CN109960484B (en) * 2017-12-26 2021-08-24 腾讯科技(深圳)有限公司 Audio volume acquisition method and device, storage medium and terminal
CN109788130A (en) * 2018-12-27 2019-05-21 努比亚技术有限公司 Terminal and its orientation based reminding method and computer readable storage medium

Also Published As

Publication number Publication date
CN110488225A (en) 2019-11-22

Similar Documents

Publication Publication Date Title
CN110488225B (en) Voice direction indicating method and device, readable storage medium and mobile terminal
KR102694487B1 (en) Systems and methods supporting selective listening
US10085108B2 (en) Method for visualizing the directional sound activity of a multichannel audio signal
CN102438189B (en) Dual-channel acoustic signal-based sound source localization method
WO2017136573A1 (en) Augmented reality headphone environment rendering
US20130251168A1 (en) Ambient information notification apparatus
US20180064375A1 (en) Auditory stimulus for auditory rehabilitation
CN106211502A (en) A kind of method and system of audio frequency control light
US20140350923A1 (en) Method and device for detecting noise bursts in speech signals
Wenzel et al. Perception of spatial sound
Macpherson et al. Vertical-plane sound localization probed with ripple-spectrum noise
Akeroyd et al. Melody recognition using three types of dichotic-pitch stimulus
CN106954140A (en) A kind of reminding method of barrier, device and barrier prompt system
US9232337B2 (en) Method for visualizing the directional sound activity of a multichannel audio signal
Georgiou et al. Auralization of a car pass-by inside an urban canyon using measured impulse responses
US6215879B1 (en) Method for introducing harmonics into an audio stream for improving three dimensional audio positioning
WO2018079850A1 (en) Signal processing device, signal processing method, and program
US20150086023A1 (en) Audio control apparatus and method
US6768798B1 (en) Method of customizing HRTF to improve the audio experience through a series of test sounds
Pereira et al. CPX based synthesis for binaural auralization of vehicle rolling noise to an arbitrary positioned stander-by receiver
May et al. Preserving auditory situation awareness in headphone-distracted persons
KR20050048686A (en) Method for simulating a movement by means of an acoustic reproduction device and sound reproduction arrangement therefor
Ziemer et al. Psychoacoustics
Samardzic et al. The Analysis of the Reduction in Vehicle Speech Intelligibility for Normal Hearing and Hearing Impaired Individuals in a Simulated Driving Environment with Contributions from the Ordered and Masking Noise Source
US20240181201A1 (en) Methods and devices for hearing training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230320

Address after: 518055 1501, Building 1, Chongwen Park, Nanshan Zhiyuan, No. 3370, Liuxian Avenue, Fuguang Community, Taoyuan Street, Nanshan District, Shenzhen, Guangdong Province

Patentee after: Shenzhen Grey Shark Technology Co.,Ltd.

Address before: Room 601, Block A, Chuangzhi Building, No. 17, Xinghuo Road, Jiangbei New District, Nanjing, Jiangsu, 210000

Patentee before: Nanjing Thunder Shark Information Technology Co.,Ltd.