CN112860067B - Magic mirror adjusting method, system and storage medium based on microphone array - Google Patents

Magic mirror adjusting method, system and storage medium based on microphone array

Info

Publication number
CN112860067B
Authority
CN
China
Prior art keywords
information
microphone array
voice
microphone
intensity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110169552.5A
Other languages
Chinese (zh)
Other versions
CN112860067A (en
Inventor
赵满平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Kinstone Digital & Technology Develop Co ltd
Original Assignee
Shenzhen Kinstone Digital & Technology Develop Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Kinstone Digital & Technology Develop Co ltd filed Critical Shenzhen Kinstone Digital & Technology Develop Co ltd
Priority to CN202110169552.5A priority Critical patent/CN112860067B/en
Publication of CN112860067A publication Critical patent/CN112860067A/en
Application granted granted Critical
Publication of CN112860067B publication Critical patent/CN112860067B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/165Management of the audio stream, e.g. setting of volume, audio stream path

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The application relates to a magic mirror adjusting method, a magic mirror adjusting system and a storage medium based on a microphone array, which belong to the field of signal processing, wherein the method comprises the following steps: acquiring a use request sent by a user through a microphone array, wherein the use request is in a voice form, and the microphone array comprises at least three common microphones; generating voice information corresponding to the common microphones one by one, wherein the voice information comprises voice intensity; generating a sub-microphone array according to the order of the voice intensity from large to small; according to a preset calculation model, calculating and generating preliminary sound source position information corresponding to each sub-microphone array; integrating the primary sound source position information to generate user position information; comparing the user position information with the current preset target position information to generate position comparison information; and generating movement instruction information according to the position comparison information and feeding the movement instruction information back to the magic mirror terminal. The application has the effect of being convenient for users to use.

Description

Magic mirror adjusting method, system and storage medium based on microphone array
Technical Field
The application relates to the field of signal processing, in particular to a magic mirror adjusting method, a magic mirror adjusting system and a storage medium based on a microphone array.
Background
A microphone array is, literally, an array made up of multiple microphones: a system of several acoustic sensors used to sample and process the spatial characteristics of a sound field. Once the microphones are arranged as required, suitable algorithms can address many acoustic problems, such as sound source localization, dereverberation, speech enhancement and blind source separation. Microphone arrays are therefore used in many settings.
With economic and social development, magic mirrors are increasingly common in markets of all kinds. A magic mirror is a product that combines entertainment with commercial promotion: people can film themselves in front of it to generate short videos and animations, or the magic mirror can virtually change their outfits so as to match suitable clothing and similar items.
The related art described above has the following drawback: people can interact with a magic mirror only if it can capture the user's image at a specified position, but the camera that captures the image is generally fixed in place, so the user must keep adjusting position and posture to find a suitable spot; otherwise the magic mirror cannot be triggered.
Disclosure of Invention
To spare users from having to keep adjusting their position and posture to find a suitable spot, the application provides a magic mirror adjusting method, a magic mirror adjusting system and a storage medium based on a microphone array.
In a first aspect, the present application provides a method for adjusting a magic mirror based on a microphone array, which adopts the following technical scheme:
A magic mirror adjusting method based on a microphone array comprises the following steps:
acquiring a use request sent by a user through the microphone array, wherein the use request is in voice form and the microphone array comprises at least three common microphones;
generating voice information in one-to-one correspondence with the common microphones, wherein the voice information comprises a voice intensity;
generating sub-microphone arrays in descending order of voice intensity;
calculating, according to a preset calculation model, preliminary sound source position information corresponding to each sub-microphone array;
integrating the preliminary sound source position information to generate user position information;
comparing the user position information with the current preset target position information to generate position comparison information;
and generating movement instruction information according to the position comparison information and feeding the movement instruction information back to the magic mirror terminal.
With this technical scheme, each time the user issues a use request in voice form, the microphone array is divided into several sub-arrays according to the voice intensity in the voice information, each sub-array determines the user's position by microphone sound source localization, and the magic mirror is then moved to the corresponding position. This provides the positioning basis for the camera on the magic mirror to capture the portrait and perform operations such as outfit changing, so the magic mirror can be triggered even when the user stands where the camera cannot capture them. The user no longer needs to keep adjusting position and posture to suit the magic mirror, which improves the user experience.
Optionally, before acquiring, through the microphone array, the use request sent by the user, the method further includes:
acquiring a selection request sent by the user, wherein the selection request is in voice form and comprises a selection keyword;
and acquiring, according to the selection keyword, the preset target position information corresponding to the selection keyword.
With this technical scheme, before the magic mirror locates the user, the user can also send a selection request to the magic mirror through the microphone array to choose among different magic mirror functions or shooting templates, which gives the user more freedom of choice.
Optionally, the preset calculation model is a TDOA location algorithm model.
With this technical scheme, the TDOA positioning algorithm model is a sound source localization algorithm that suits real-time use and has low computational complexity. The calculation is simple and the response fast, so the sound source position is obtained quickly, and applying the algorithm to several sub-arrays at once can improve positioning accuracy.
Optionally, generating the sub-microphone arrays in descending order of voice intensity specifically includes:
sorting the voice information in descending order of voice intensity to generate a voice information list;
taking the voice information from the voice information list in order and adding the corresponding voice intensity to an intensity sum preset to zero;
judging whether the intensity sum is greater than or equal to a preset dividing intensity threshold;
if the intensity sum is smaller than the preset dividing intensity threshold, continuing to add voice intensities to the intensity sum; if the intensity sum is greater than or equal to the preset dividing intensity threshold, forming a sub-microphone array from the common microphones whose voice intensities are contained in the intensity sum, and generating an array intensity bound to the sub-microphone array according to the current intensity sum.
With this technical scheme, the microphone array is divided into several sub-microphone arrays according to a preset intensity threshold. For voice information with lower voice intensity, the accuracy of the calculated sound source position is compensated by including more microphones, so that each sub-microphone array is divided as reasonably as possible and the user position calculated from the sub-microphone arrays is highly accurate.
Optionally, integrating the preliminary sound source position information to generate the user position information specifically includes:
acquiring the sub-microphone array corresponding to each piece of preliminary sound source position information;
calculating an average intensity from the number of common microphones in each sub-microphone array and the corresponding array intensity;
generating, according to the average intensities, weighting coefficients in one-to-one correspondence with the sub-microphone arrays, wherein the weighting coefficients sum to 1;
and generating the user position information according to the weighting coefficients and the preliminary sound source position information corresponding to the sub-microphone arrays.
With this technical scheme, voice information with lower voice intensity comes from microphones farther from the sound source and is therefore more likely to deviate when acquired, so it is given less weight when all the preliminary sound source position information is integrated into the user position information, which improves calculation accuracy.
Optionally, acquiring the sub-microphone array corresponding to each piece of preliminary sound source position information specifically includes:
calculating the sound source position differences between a piece of preliminary sound source position information and the other pieces of preliminary sound source position information;
judging whether the sound source position differences meet a preset difference abnormality condition;
if the sound source position differences meet the preset difference abnormality condition, defining the preliminary sound source position information as an abnormal sound source position and defining the corresponding common microphones as problem microphones.
With this technical scheme, preliminary sound source position information with a large deviation is not used to calculate the user position, which safeguards the accuracy of the user position, and the corresponding microphones are distinguished from the common microphones for easier handling.
Optionally, after integrating the preliminary sound source position information to generate the user position information, the method further includes:
generating by simulation, according to the user position information and the positions of the problem microphones, simulated voice information in one-to-one correspondence with the problem microphones;
generating a test sound difference time according to the simulated voice information and the corresponding voice information;
judging whether the test sound difference time is greater than a preset sound difference threshold;
if the test sound difference time is greater than the preset sound difference threshold, defining the current problem microphone as an abnormal microphone and reporting the abnormal microphone to an administrator; if the test sound difference time is less than or equal to the preset sound difference threshold, redefining the current problem microphone as a common microphone.
With this technical scheme, the voice information that each problem microphone should have produced is back-calculated by simulation from the finally generated user position and checked for plausibility, which narrows down the set of microphones that are actually faulty; the abnormal microphones so identified are reported to the administrator for timely maintenance.
In a second aspect, the present application provides a magic mirror adjustment system based on a microphone array, which adopts the following technical scheme:
A magic mirror adjustment system based on a microphone array, comprising:
The information acquisition module is used for acquiring a use request sent by a user through the microphone array, wherein the use request is in voice form and the microphone array comprises at least three common microphones; and for generating voice information in one-to-one correspondence with the common microphones, wherein the voice information comprises a voice intensity;
The position generation module is used for generating sub-microphone arrays in descending order of voice intensity; calculating, according to a preset calculation model, preliminary sound source position information corresponding to each sub-microphone array; and integrating the preliminary sound source position information to generate user position information;
The comparison and movement module is used for comparing the user position information with the current preset target position information to generate position comparison information; and for generating movement instruction information according to the position comparison information and feeding the movement instruction information back to the magic mirror terminal.
With this technical scheme, the microphone array is divided into sub-microphone arrays that each localize the sound source, so the user's position can be determined accurately and an adjustment instruction sent to the magic mirror, which then moves automatically to a suitable position. The user does not need to keep adjusting position, which improves the user experience.
In a third aspect, the present application provides an intelligent terminal, which adopts the following technical scheme:
A smart terminal comprising a memory and a processor, the memory having stored thereon a computer program capable of being loaded by the processor and executing the method according to the first aspect.
With this technical scheme, the sub-microphone arrays localize the user accurately and the magic mirror is moved according to the user position obtained, so the user does not need to keep adjusting position, which improves the user experience.
In a fourth aspect, the present application provides a computer readable storage medium, which adopts the following technical scheme:
A computer-readable storage medium storing a computer program that can be loaded by a processor to execute the method according to the first aspect.
With this technical scheme, the sound source can be localized accurately and the position of the magic mirror adjusted accordingly, making it easy for the user to trigger the functions of the magic mirror.
In summary, the present application includes at least one of the following beneficial technical effects:
1. The sound source is localized by the sub-microphone arrays and the results are integrated to obtain the user's position, according to which the position of the magic mirror is adjusted; the user can thus start the magic mirror from a position the camera cannot capture, without having to adjust position, which improves the user experience;
2. Preliminary sound source position information with excessive deviation is not used; the microphones that produced the large deviation are further checked by back-calculation from the user position, and the microphones finally judged abnormal are reported to an administrator, who can maintain them in time.
Drawings
Fig. 1 is a flow chart of a magic mirror adjusting method based on a microphone array according to an embodiment of the application.
Fig. 2 is a flow chart illustrating a method for generating a sub-microphone array according to an embodiment of the present application.
Fig. 3 is a flowchart illustrating a process for integrating preliminary sound source location information to generate user location information according to an embodiment of the present application.
Fig. 4 is a flow chart for defining abnormal sound source positions and problem microphones according to an embodiment of the present application.
Fig. 5 is a flow chart illustrating a method for generating an anomalous microphone in accordance with an embodiment of the application.
Fig. 6 is a block diagram of a magic mirror adjustment system based on a microphone array according to an embodiment of the application.
Description of reference numerals: 1. information acquisition module; 2. position generation module; 3. comparison and movement module; 4. abnormality confirmation module.
Detailed Description
The application is described in further detail below with reference to fig. 1-6.
The embodiment of the application discloses a magic mirror adjusting method based on a microphone array, applied to a magic mirror that can be adjusted in the horizontal and vertical directions and that is also fitted with the microphone array. In this embodiment, the microphone array is a 4×2 array consisting of eight common microphones. The magic mirror processes the information acquired by the microphone array and, according to the result of that processing, is adjusted in the horizontal and vertical directions.
Referring to fig. 1, the magic mirror adjustment method based on the microphone array includes:
S100: Acquire a selection request sent by the user.
The selection request includes a selection keyword and is in voice form. When the user is at some distance from the magic mirror, the user can simply speak a selection request containing a selection keyword such as "shoot" or "video".
S200: Generate the template corresponding to the keyword.
The template includes preset target position information. In the preset library, each keyword corresponds to one template and each template has a different shooting theme; different shooting themes may require the user to stand at different positions relative to the magic mirror and therefore have different target position information.
S300: Acquire a use request sent by the user.
The use request includes a preset use keyword and is likewise in voice form. Specifically, surrounding sound is captured by the microphone array of eight common microphones, and a use request is acquired when the use keyword is recognized by speech recognition.
S400: Generate voice information according to the use request.
The voice information is in one-to-one correspondence with the common microphones in the microphone array and includes a voice intensity and a sound arrival time. The voice intensity is the peak value of the use request as received by the corresponding microphone; the sound arrival time is the actual time at which the use request reaches the corresponding microphone, accurate to the microsecond.
S500: Generate sub-microphone arrays in descending order of voice intensity.
Every common microphone in the microphone array is allocated to a sub-microphone array, and some common microphones with higher voice intensity may be allocated to more than one sub-microphone array. Specifically, with reference to fig. 2, S500 includes the following substeps.
S501: Generate a voice information list in descending order of voice intensity.
Specifically, all the voice information is sorted in descending order of voice intensity to generate a voice information list. The voice information list is a circular list joined end to end: the higher the voice intensity, the earlier the voice information appears in the list, and the voice information with the highest voice intensity follows the voice information with the lowest voice intensity.
S502: Accumulate the voice intensities according to the voice information list to generate an intensity sum.
Specifically, following the order of the voice information in the voice information list, the voice intensity of each piece of voice information is added in turn to an intensity sum preset to 0. The system also keeps an accumulated information count preset to 0. Each time a voice intensity is added to the intensity sum, the accumulated information count is increased by 1 and the current voice information is moved to a temporary voice information list. When the accumulated information count reaches N, where N is greater than or equal to 3, S503 is executed.
S503: Judge whether the intensity sum is greater than or equal to a preset dividing intensity threshold.
Specifically, if the intensity sum is smaller than the preset dividing intensity threshold, the method returns to S502, adds the next voice intensity to the current intensity sum, and moves the corresponding voice information to the temporary voice information list. If the intensity sum is greater than or equal to the preset dividing intensity threshold, the common microphones corresponding to the voice information in the temporary voice information list form a sub-microphone array, and an array intensity is generated whose value equals the current intensity sum. The intensity sum, the accumulated information count and the temporary voice information list are then cleared, and generation of the next sub-microphone array begins.
After all the common microphones have been allocated, the process goes to S600.
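The grouping procedure in S501 to S503 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the application: the names `VoiceInfo`, `SubArray` and `split_into_subarrays` are hypothetical, the stop rule (finish once every microphone belongs to at least one sub-array) is one reading of "after all the common microphones have been allocated", and the guard that closes a group once it holds every microphone is added here only so that the loop terminates when the threshold cannot be reached.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VoiceInfo:
    mic_id: str        # array number of the common microphone, e.g. "12"
    intensity: float   # peak voice intensity measured at this microphone


@dataclass
class SubArray:
    mic_ids: List[str]       # microphones forming this sub-microphone array
    array_intensity: float   # intensity sum at the moment the group closed


def split_into_subarrays(infos: List[VoiceInfo], threshold: float,
                         min_size: int = 3) -> List[SubArray]:
    """Group microphones into sub-arrays, strongest first (S501 to S503)."""
    if len(infos) < min_size:
        raise ValueError("need at least min_size microphones")
    # S501: descending order; indexing modulo n makes the list circular.
    ordered = sorted(infos, key=lambda v: v.intensity, reverse=True)
    n = len(ordered)
    subarrays: List[SubArray] = []
    assigned = set()             # microphones already in some sub-array
    temp: List[VoiceInfo] = []   # temporary voice information list
    total = 0.0                  # intensity sum, preset to 0
    i = 0
    while len(assigned) < n or temp:
        info = ordered[i % n]    # wraps to the strongest after the weakest
        i += 1
        temp.append(info)        # S502: accumulate and move to the temp list
        total += info.intensity
        # S503: only checked once at least min_size entries are accumulated.
        reached = len(temp) >= min_size and total >= threshold
        if reached or len(temp) == n:   # second test is the assumed guard
            subarrays.append(SubArray([v.mic_id for v in temp], total))
            assigned.update(v.mic_id for v in temp)
            temp, total = [], 0.0
    return subarrays
```

With intensities 10, 9, 8, 7, 3, 2, 2, 1 and a threshold of 20, this sketch yields two sub-arrays: the three strongest microphones (sum 27), then the five remaining microphones plus the strongest one again (sum 25), which illustrates how a loud microphone can end up in more than one sub-array.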
S600: Generate preliminary sound source position information for each sub-microphone array according to a preset calculation model.
The preset calculation model is a TDOA positioning algorithm model. In addition, each common microphone is preset with an array number; array numbers correspond one-to-one with the common microphones and reflect the position of the corresponding common microphone in the array. For example, array number 12 denotes the microphone in the first column, second row of the array, and array number 41 denotes the microphone in the fourth column, first row.
The TDOA (time difference of arrival) positioning algorithm model is a localization technique based on differences in sound arrival time. For each sub-microphone array generated in S500, it calculates the sound arrival time differences from the sound arrival times at the common microphones in that sub-array and, combined with the spatial positions of those microphones obtained from their array numbers, determines the position of the sound source, i.e. the preliminary sound source position information; preliminary sound source position information corresponds one-to-one with the sub-microphone arrays. The resulting preliminary sound source position information is expressed as three-dimensional coordinates in a rectangular coordinate system whose origin is the center of the magic mirror, whose xy plane is the plane of the magic mirror, and whose z axis is perpendicular to that plane.
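The text does not say which TDOA solver is used, so the following Python sketch only illustrates one common possibility: a linearized least-squares solve relative to a reference microphone. Everything here is an assumption for illustration: the function names `locate_tdoa` and `solve_3x3`, the speed of sound of 343 m/s, microphone coordinates given in meters in the mirror plane (z = 0), and a source in front of the mirror (z ≥ 0). Three microphones yield only two time differences, which this linear solve cannot handle, so the sketch asks for at least four microphones that are not all on one line.

```python
import math
from typing import List, Sequence, Tuple

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not given in the text


def solve_3x3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate microphone geometry")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def locate_tdoa(mics: Sequence[Tuple[float, float]],
                arrivals: Sequence[float]) -> Tuple[float, float, float]:
    """Estimate (x, y, z) of the source from arrival times in seconds."""
    if len(mics) < 4 or len(mics) != len(arrivals):
        raise ValueError("need at least four microphones with arrival times")
    (x0, y0), t0 = mics[0], arrivals[0]   # reference microphone
    rows, rhs = [], []
    for (xi, yi), ti in zip(mics[1:], arrivals[1:]):
        d = SPEED_OF_SOUND * (ti - t0)    # range difference r_i - r_0
        # From r_i^2 - r_0^2: 2(m_i - m_0).p + 2 d r_0 = |m_i|^2 - |m_0|^2 - d^2
        rows.append((2.0 * (xi - x0), 2.0 * (yi - y0), 2.0 * d))
        rhs.append(xi * xi + yi * yi - x0 * x0 - y0 * y0 - d * d)
    # Least squares via the normal equations; unknowns are x, y and r_0.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(3)]
    x, y, r0 = solve_3x3(ata, atb)
    # All microphones lie in z = 0, so z follows from the reference range.
    z_squared = r0 * r0 - (x - x0) ** 2 - (y - y0) ** 2
    return (x, y, math.sqrt(max(z_squared, 0.0)))
```

Because only time differences enter the equations, the unknown moment at which the user spoke cancels out, which is the property that makes TDOA usable with the per-microphone arrival times recorded in S400.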
S700: Integrate the preliminary sound source position information to generate user position information.
Specifically, with reference to fig. 3, S700 includes the following substeps.
S701: Calculate the average intensity of each sub-microphone array.
Specifically, the sub-microphone arrays that produced the preliminary sound source position information in S600 are obtained, the number of common microphones in each sub-microphone array is counted, and the average intensity is generated by dividing the array intensity of the sub-microphone array by the number of its common microphones; average intensities correspond one-to-one with the sub-microphone arrays.
S702: Generate the weighting coefficients from the average intensities of the sub-microphone arrays.
Specifically, the values of the average intensities are normalized to generate the weighting coefficients of the corresponding sub-microphone arrays, so that all the weighting coefficients sum to 1. For example, if there are three sub-microphone arrays whose average intensities are A, B and C respectively, normalization divides the average intensity of each sub-microphone array by the sum of all the average intensities, so the weighting coefficient of the first sub-microphone array is $\frac{A}{A+B+C}$, that of the second sub-microphone array is $\frac{B}{A+B+C}$, and that of the third sub-microphone array is $\frac{C}{A+B+C}$.
S703: Integrate the preliminary sound source position information into user position information according to the weighting coefficients.
Specifically, the three-dimensional coordinate values in each piece of preliminary sound source position information are multiplied by the weighting coefficient of the corresponding sub-microphone array, and the products are added component-wise to obtain the final user position information. For example, continuing the example in S702, if the preliminary sound source position information of the first, second and third sub-microphone arrays is $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$ and $(x_3, y_3, z_3)$ respectively, then with the corresponding weighting coefficients the final user position information is $\left(\frac{Ax_1+Bx_2+Cx_3}{A+B+C},\ \frac{Ay_1+By_2+Cy_3}{A+B+C},\ \frac{Az_1+Bz_2+Cz_3}{A+B+C}\right)$.
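Steps S701 to S703 amount to an intensity-weighted average of the preliminary positions. The following Python sketch illustrates this; the function name `fuse_positions` and its argument layout are hypothetical and chosen only for this example.

```python
from typing import Sequence, Tuple

Position = Tuple[float, float, float]


def fuse_positions(positions: Sequence[Position],
                   array_intensities: Sequence[float],
                   mic_counts: Sequence[int]) -> Position:
    """Weighted average of preliminary positions (S701 to S703)."""
    if not (len(positions) == len(array_intensities) == len(mic_counts) > 0):
        raise ValueError("inputs must be non-empty and of equal length")
    # S701: average intensity = array intensity / number of microphones.
    averages = [s / n for s, n in zip(array_intensities, mic_counts)]
    # S702: normalize so that the weighting coefficients sum to 1.
    total = sum(averages)
    weights = [a / total for a in averages]
    # S703: multiply each coordinate by its weight and add component-wise.
    x, y, z = (sum(w * p[axis] for w, p in zip(weights, positions))
               for axis in range(3))
    return (x, y, z)
```

For instance, array intensities 12, 12 and 3 over 3, 4 and 3 microphones give average intensities 4, 3 and 1 and therefore weighting coefficients 0.5, 0.375 and 0.125, so the loudest sub-array contributes half of the final position.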
S800: Compare the user position information with the target position information to generate position comparison information.
The target position information is in two-dimensional form. Specifically, the user position information is projected perpendicularly onto the magic mirror, i.e. only its x and y coordinates are used; these coordinates and the target position information acquired in S200 are subtracted component-wise to obtain the position comparison information, which comprises a lateral movement distance and a vertical movement distance.
S900: Generate a movement instruction according to the position comparison information and feed the movement instruction back to the magic mirror terminal.
Specifically, a movement instruction is generated from the lateral movement distance and the vertical movement distance in the position comparison information and sent to the magic mirror terminal, so that the magic mirror adjusts itself by those distances and can capture a suitable portrait of the user directly, without the user having to move.
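The comparison in S800 and the instruction in S900 can be illustrated with a short Python sketch. The names `MoveInstruction` and `compare_positions` are hypothetical, and the sign convention (user coordinate minus target coordinate) is an assumption, since the text only says the coordinates are subtracted component-wise.

```python
from typing import NamedTuple, Tuple


class MoveInstruction(NamedTuple):
    lateral: float    # movement distance along the mirror's x axis
    vertical: float   # movement distance along the mirror's y axis


def compare_positions(user_position: Tuple[float, float, float],
                      target_xy: Tuple[float, float]) -> MoveInstruction:
    """Project the user onto the mirror plane and subtract the target."""
    ux, uy, _ = user_position   # dropping z is the perpendicular projection
    tx, ty = target_xy
    # Assumed convention: positive values mean the user is beyond the target.
    return MoveInstruction(lateral=ux - tx, vertical=uy - ty)
```

A mirror controller would then translate the two distances into motor commands; that part depends on the hardware and is not described in the text.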
To facilitate timely maintenance of the microphone array, with reference to fig. 4, before the average intensity of the sub-microphone arrays is calculated in S701, the method further includes:
S11: Calculate the sound source position differences between each preliminary sound source position and the other preliminary sound source positions.
Specifically, all the preliminary sound source positions are obtained; each preliminary sound source position in turn is taken as the minuend, and the other preliminary sound source positions are subtracted from it component-wise to obtain several sound source position differences, each comprising three values: an x difference, a y difference and a z difference.
S12: Judge whether the sound source position differences of each preliminary sound source position meet a preset difference abnormality condition.
The preset difference abnormality condition is that, for more than half of the sound source position differences of a preliminary sound source position, any one of the x, y and z differences is greater than a preset difference threshold. Specifically, all the sound source position differences of a given preliminary sound source position are compared one by one with the preset difference threshold; if a sound source position difference is greater than or equal to the preset difference threshold, an accumulated comparison count preset to 0 is increased by 1, and if it is smaller than the preset difference threshold, comparison continues with the next sound source position difference.
If the accumulated comparison count of a preliminary sound source position is greater than half the number of common microphones in the sub-microphone array corresponding to that preliminary sound source position, the preset difference abnormality condition is judged to be met and the process jumps to S13; if the accumulated comparison count is less than or equal to half the number of the corresponding common microphones, the condition is judged not to be met and the process stays in S12 to judge the sound source position differences of the next preliminary sound source position.
S13: define the abnormal sound source position and the problem microphones.
Specifically, the current preliminary sound source position is defined as an abnormal sound source position, so that it is not used when the average intensity is calculated in S701, and all the common microphones in the sub-microphone array corresponding to that preliminary sound source position are defined as problem microphones.
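Steps S11 to S13 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `find_abnormal_positions` is hypothetical, differences are compared by absolute value (the text does not say how negative differences are handled), and the count is compared against half the number of microphones in the corresponding sub-array, as in the step-by-step description of S12.

```python
def find_abnormal_positions(positions, mic_counts, diff_threshold):
    """Return indices of preliminary positions judged abnormal.

    positions:  list of (x, y, z) preliminary sound source positions
    mic_counts: number of common microphones in the sub-array that
                produced each position (same order as positions)
    """
    abnormal = []
    for i, p in enumerate(positions):
        count = 0  # cumulative comparison count, preset to 0
        for j, q in enumerate(positions):
            if i == j:
                continue
            # S11: per-axis difference (absolute value is an assumption)
            diff = [abs(a - b) for a, b in zip(p, q)]
            # S12: count the difference if any axis reaches the threshold
            if any(d >= diff_threshold for d in diff):
                count += 1
        # S13: flag the position when the count exceeds half the mic count
        if count > mic_counts[i] / 2:
            abnormal.append(i)
    return abnormal
```

The microphones of every sub-array whose index is returned would then be marked as problem microphones, and the flagged positions would be excluded from the averaging in S701.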
To further identify the problematic microphones, with reference to fig. 5, after the preliminary sound source position information is integrated to generate the user position information in S700, the method further includes:
S21: generate, by simulation, simulated voice information in one-to-one correspondence with the problem microphones according to the user position information and the problem microphones.
Specifically, the propagation of the voice from the user position to each problem microphone and to one common microphone is simulated, generating simulated voice information in one-to-one correspondence with each problem microphone and with that common microphone. The simulated voice information includes a simulated sound arrival time, and the difference between the simulated sound arrival time of the problem microphone and that of the common microphone is taken as the simulated sound difference time.
S22: generate the test sound difference time according to the simulated voice information and the voice information of the problem microphone.
Specifically, the difference between the sound arrival time in the voice information of the problem microphone and the sound arrival time in the voice information of the same common microphone is calculated, generating a sound difference time in one-to-one correspondence with the problem microphone. The difference between this sound difference time and the simulated sound difference time is then the test sound difference time of that problem microphone. After the test sound difference times of all the problem microphones have been generated, the process goes to S23.
S23: determine whether the test sound difference time of the current problem microphone is greater than a preset sound difference threshold.
Specifically, the test sound difference times of the problem microphones are acquired in sequence, and the problem microphone whose time has just been acquired is defined as the current problem microphone. If the test sound difference time of the current problem microphone is greater than the preset sound difference threshold, the current problem microphone is defined as an abnormal microphone, its array number is fed back to the mobile phone terminal of an administrator, and the process moves on to the next problem microphone; if the test sound difference time of the current problem microphone is less than or equal to the preset sound difference threshold, the current problem microphone is redefined as a common microphone, and the process moves on to the next problem microphone.
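The check in S21 to S23 might be sketched as below. Several details are assumptions made only for illustration: the simulation is reduced to straight-line propagation at a fixed speed of sound (343 m/s), the test sound difference time is taken as an absolute value, and the names `simulated_arrival` and `classify_problem_mics` as well as the dictionary layout are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not given in the text


def simulated_arrival(user_pos, mic_pos):
    """S21: simulated arrival time for direct-path propagation."""
    return math.dist(user_pos, mic_pos) / SPEED_OF_SOUND


def classify_problem_mics(user_pos, ref_mic, problem_mics, threshold):
    """Return {mic_id: 'abnormal' | 'common'} for each problem microphone.

    ref_mic and each problem microphone are dicts with a 'pos' tuple and
    a measured 'arrival' time in seconds (hypothetical layout).
    """
    result = {}
    sim_ref = simulated_arrival(user_pos, ref_mic["pos"])
    for mic_id, mic in problem_mics.items():
        # Simulated sound difference time against the reference microphone
        sim_diff = simulated_arrival(user_pos, mic["pos"]) - sim_ref
        # S22: measured sound difference time against the same reference
        measured_diff = mic["arrival"] - ref_mic["arrival"]
        test_diff = abs(measured_diff - sim_diff)
        # S23: compare with the preset sound difference threshold
        result[mic_id] = "abnormal" if test_diff > threshold else "common"
    return result
```

Microphones classified as abnormal would be reported to the administrator; the others return to the pool of common microphones.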
The implementation principle is as follows: a use request in voice form sent by a user is acquired through the microphone array, voice information in one-to-one correspondence with the common microphones is generated, and all the common microphones are divided into sub-microphone arrays. Preliminary sound source position information is calculated for each sub-microphone array according to the TDOA positioning algorithm model. A weighting coefficient is assigned to each piece of preliminary sound source position information based on the average intensity of its sub-microphone array, and the user position information is finally generated based on the weighting coefficients. The user position information is compared with the preset target position information, and a movement instruction is generated and fed back to the magic mirror terminal so that the magic mirror moves to a suitable position, sparing the user from repeatedly adjusting the magic mirror to find that position.
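The weighting step in this summary can be illustrated with a short sketch. It assumes that each weighting coefficient is the sub-array's average intensity divided by the sum of all average intensities, which is one simple way to make the coefficients sum to 1; the text does not specify the exact mapping. The function name `fuse_positions` and the dictionary keys are hypothetical.

```python
def fuse_positions(sub_arrays):
    """Combine preliminary positions into one user position.

    sub_arrays: list of dicts with keys
        'position'        -> (x, y, z) preliminary sound source position
        'array_intensity' -> summed voice intensity of the sub-array
        'mic_count'       -> number of common microphones in the sub-array
    Returns (user_position, weights).
    """
    # Average intensity per microphone for each sub-array
    averages = [s["array_intensity"] / s["mic_count"] for s in sub_arrays]
    total = sum(averages)
    # Assumed mapping: weights proportional to average intensity, sum to 1
    weights = [a / total for a in averages]
    user_position = tuple(
        sum(w * s["position"][axis] for w, s in zip(weights, sub_arrays))
        for axis in range(3)
    )
    return user_position, weights
```

With this choice, a sub-array that heard the user more loudly per microphone contributes more to the final position estimate.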
Based on the above method, the embodiment of the application also discloses a magic mirror adjusting system based on a microphone array. Referring to fig. 6, the system includes an information acquisition module 1, a position generation module 2, a comparison movement module 3, and an anomaly confirmation module 4.
The information acquisition module 1 is configured to acquire a use request and a selection request sent by a user through the microphone array, and determine target location information according to the selection request.
The position generation module 2 is configured to generate sub-microphone arrays in descending order of voice intensity, calculate the preliminary sound source position information of each sub-microphone array according to the TDOA positioning algorithm model, and integrate the preliminary sound source position information according to the average intensity of the corresponding sub-microphone arrays to generate the user position information.
The comparison movement module 3 is configured to compare the calculated user position information with the acquired target position information to generate position comparison information, and to generate movement instruction information based on the position comparison information and feed it back to the magic mirror terminal.
The anomaly confirmation module 4 is configured to define the microphones in a sub-microphone array whose calculated position deviates significantly as problem microphones, then generate simulated voice information based on the finally calculated user position information and test the problem microphones.
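The sub-microphone array generation performed by the position generation module can be sketched along the lines of the accumulation procedure recited in claim 4. The function name `split_into_sub_arrays` is hypothetical, and two points are assumptions: the intensity sum is reset to zero after each sub-array is formed, and microphones left over below the threshold are returned separately, since the text does not say how they are handled.

```python
def split_into_sub_arrays(intensities, threshold):
    """Group microphones into sub-arrays by accumulated voice intensity.

    intensities: dict mapping microphone id -> voice intensity
    threshold:   preset dividing intensity threshold
    Returns (sub_arrays, leftover_mic_ids).
    """
    # Voice information list, sorted from strongest to weakest
    ordered = sorted(intensities.items(), key=lambda kv: kv[1], reverse=True)
    sub_arrays = []
    current, total = [], 0.0  # intensity sum preset to zero
    for mic_id, intensity in ordered:
        current.append(mic_id)
        total += intensity
        if total >= threshold:
            # Form a sub-array and bind the current sum as its array intensity
            sub_arrays.append({"mics": current, "array_intensity": total})
            current, total = [], 0.0  # assumed reset for the next group
    return sub_arrays, current
```

For instance, with intensities 9, 7, 5, 4 and 2 and a threshold of 15, the two loudest microphones form one sub-array with array intensity 16, and the remaining three stay ungrouped.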
The embodiment of the application also discloses an intelligent terminal comprising a memory and a processor, wherein the memory stores a computer program that can be loaded by the processor to execute the above magic mirror adjusting method based on a microphone array.
The embodiment of the application also discloses a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the magic mirror adjusting method based on a microphone array described above. The computer-readable storage medium includes, for example, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the scope of the application. It will be apparent that the described embodiments are merely some, but not all, embodiments of the application. Based on these embodiments, all other embodiments that may be obtained by one of ordinary skill in the art without inventive effort are within the scope of the application.

Claims (10)

1. A magic mirror adjusting method based on a microphone array, which is characterized in that the method is based on a magic mirror provided with the microphone array; the method comprises the following steps:
acquiring a use request sent by a user through the microphone array, wherein the use request is in a voice form, and the microphone array comprises at least three common microphones;
generating voice information corresponding to the common microphones one by one, wherein the voice information comprises voice intensity;
generating sub-microphone arrays in descending order of the voice intensity;
calculating and generating preliminary sound source position information corresponding to each sub-microphone array according to a preset calculation model;
integrating the preliminary sound source position information to generate user position information;
comparing the user position information with the current preset target position information to generate position comparison information;
and generating movement instruction information according to the position comparison information and feeding back the movement instruction information to the magic mirror.
2. The method for adjusting a magic mirror based on a microphone array according to claim 1, wherein before acquiring the use request sent by the user through the microphone array, the method further comprises:
acquiring a selection request sent by the user, wherein the selection request is in a voice form and comprises a selection keyword;
and acquiring the preset target position information corresponding to the selection keyword according to the selection keyword.
3. The method for adjusting a magic mirror based on a microphone array according to claim 1, wherein the preset calculation model is a TDOA location algorithm model.
4. The method for adjusting a magic mirror based on a microphone array according to claim 1, wherein generating the sub-microphone arrays in descending order of the voice intensity specifically comprises:
sorting the voice information in order of voice intensity from strong to weak to generate a voice information list;
sequentially taking voice information from the voice information list and adding its voice intensity to an intensity sum preset to zero; judging whether the intensity sum is greater than or equal to a preset dividing intensity threshold;
if the intensity sum is smaller than the preset dividing intensity threshold, continuing to add the next voice intensity to the intensity sum; if the intensity sum is greater than or equal to the preset dividing intensity threshold, forming a sub-microphone array from the common microphones corresponding to the voice intensities contained in the intensity sum, and generating an array intensity bound to the sub-microphone array according to the current intensity sum.
5. The method for adjusting a magic mirror based on a microphone array according to claim 1, wherein the integrating the preliminary sound source location information to generate user location information specifically includes:
acquiring the sub-microphone arrays corresponding to the preliminary sound source position information;
calculating and generating an average intensity according to the number of common microphones of each sub-microphone array and the corresponding array intensity;
generating weighting coefficients corresponding to the sub-microphone arrays one by one according to the average intensity, wherein the sum of the weighting coefficients is 1; and generating user position information according to the weighting coefficient and the preliminary sound source position information corresponding to the sub-microphone array.
6. The method for adjusting a magic mirror based on a microphone array according to claim 5, wherein the obtaining the sub-microphone arrays corresponding to the preliminary sound source position information specifically includes:
calculating a sound source position difference value between the preliminary sound source position information and other preliminary sound source position information;
judging whether the sound source position difference value meets a preset difference value abnormality condition;
if the sound source position difference value meets the preset difference value abnormality condition, defining the preliminary sound source position information as an abnormal sound source position, and defining the corresponding common microphones as problem microphones.
7. The method for adjusting a magic mirror based on a microphone array according to claim 6, further comprising, after the integrating the preliminary sound source position information to generate user position information:
generating, by simulation, simulated voice information corresponding to the problem microphones one by one according to the user position information and the positions of the problem microphones;
generating a test sound difference time according to the simulated voice information and the corresponding voice information;
judging whether the test sound difference time is greater than a preset sound difference threshold;
if the test sound difference time is greater than the preset sound difference threshold, defining the current problem microphone as an abnormal microphone, and feeding back the abnormal microphone to an administrator; and if the test sound difference time is smaller than or equal to the preset sound difference threshold, redefining the current problem microphone as a common microphone.
8. A magic mirror adjusting system based on a microphone array, characterized by comprising:
an information acquisition module (1), configured to acquire a use request sent by a user through the microphone array, wherein the use request is in a voice form, and the microphone array comprises at least three common microphones; and to generate voice information corresponding to the common microphones one by one, wherein the voice information comprises voice intensity;
a position generation module (2), configured to generate sub-microphone arrays in descending order of the voice intensity; to calculate and generate preliminary sound source position information corresponding to each sub-microphone array according to a preset calculation model; and to integrate the preliminary sound source position information to generate user position information;
a comparison movement module (3), configured to compare the user position information with the current preset target position information to generate position comparison information; and to generate movement instruction information according to the position comparison information and feed the movement instruction information back to the magic mirror.
9. An intelligent terminal comprising a memory and a processor, the memory having stored thereon a computer program capable of being loaded by the processor and performing the method according to any of claims 1 to 7.
10. A computer readable storage medium, characterized in that a computer program is stored which can be loaded by a processor and which performs the method according to any one of claims 1 to 7.
CN202110169552.5A 2021-02-07 2021-02-07 Magic mirror adjusting method, system and storage medium based on microphone array Active CN112860067B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110169552.5A CN112860067B (en) 2021-02-07 2021-02-07 Magic mirror adjusting method, system and storage medium based on microphone array

Publications (2)

Publication Number Publication Date
CN112860067A CN112860067A (en) 2021-05-28
CN112860067B CN112860067B (en) 2024-04-19

Family

ID=75989037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110169552.5A Active CN112860067B (en) 2021-02-07 2021-02-07 Magic mirror adjusting method, system and storage medium based on microphone array

Country Status (1)

Country Link
CN (1) CN112860067B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201399A (en) * 2007-12-18 2008-06-18 北京中星微电子有限公司 Sound localization method and system
WO2010109708A1 (en) * 2009-03-25 2010-09-30 株式会社東芝 Pickup signal processing apparatus, method, and program
CN108717178A (en) * 2018-04-12 2018-10-30 福州瑞芯微电子股份有限公司 A kind of sound localization method and device based on neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10864423B2 (en) * 2016-11-10 2020-12-15 National Taiwan University Augmented learning system for tai-chi chuan with head-mounted display

Also Published As

Publication number Publication date
CN112860067A (en) 2021-05-28

Similar Documents

Publication Publication Date Title
JP4450508B2 (en) Audio source positioning
CN109640224B (en) Pickup method and device
US10388268B2 (en) Apparatus and method for processing volumetric audio
US10582117B1 (en) Automatic camera control in a video conference system
CN113692750A (en) Sound transfer function personalization using sound scene analysis and beamforming
JP6977448B2 (en) Device control device, device control program, device control method, dialogue device, and communication system
CN111863020B (en) Voice signal processing method, device, equipment and storage medium
KR20210035725A (en) Methods and systems for storing mixed audio signal and reproducing directional audio
CN114333873A (en) Audio signal processing method and audio signal processing device
CN107450882B (en) Method and device for adjusting sound loudness and storage medium
CN110188179B (en) Voice directional recognition interaction method, device, equipment and medium
CN109361969B (en) Audio equipment and volume adjusting method, device, equipment and medium thereof
CN112860067B (en) Magic mirror adjusting method, system and storage medium based on microphone array
US20210294424A1 (en) Auto-framing through speech and video localizations
US20230254639A1 (en) Sound Pickup Method and Apparatus
CN111932619A (en) Microphone tracking system and method combining image recognition and voice positioning
CN109688512B (en) Pickup method and device
JP6879144B2 (en) Device control device, device control program, device control method, dialogue device, and communication system
CN113301294B (en) Call control method and device and intelligent terminal
CN113223552B (en) Speech enhancement method, device, apparatus, storage medium, and program
CN114422743A (en) Video stream display method, device, computer equipment and storage medium
CN114420144A (en) Audio signal processing method and audio signal processing device
CN110730378A (en) Information processing method and system
US20220337945A1 (en) Selective sound modification for video communication
US20240177335A1 (en) Data processing method, electronic apparatus, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant