CN110782904A - User account switching method of intelligent voice equipment - Google Patents

Publication number
CN110782904A
CN110782904A CN201911083446.4A CN201911083446A CN110782904A CN 110782904 A CN110782904 A CN 110782904A CN 201911083446 A CN201911083446 A CN 201911083446A CN 110782904 A CN110782904 A CN 110782904A
Authority
CN
China
Prior art keywords
word
switching
neural network
intelligent voice
user account
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911083446.4A
Other languages
Chinese (zh)
Inventor
张成亮
徐庭锐
刘洋廷
郝放
简红美
高玉东
毕可骅
王飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-11-07
Filing date
2019-11-07
Publication date
2020-02-11
Application filed by Sichuan Changhong Electric Co Ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/18 Artificial neural networks; Connectionist approaches
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces

Abstract

The invention relates to the field of intelligent voice equipment and discloses a user account switching method for intelligent voice equipment, which addresses the problem that user account switching on such equipment is not fast enough. First, wake-up word audio signals spoken by different users are collected and converted into digital signals, the digital signals are input into a recurrent neural network (RNN), and the wake-up word feature vectors output by the RNN are clustered. When a user switches accounts by speaking the wake-up word, the equipment collects the wake-up word audio signal, converts it into a digital signal, inputs the digital signal into the RNN, and calculates the distance between the feature vector of the current wake-up word output by the RNN and each cluster center vector. If the distance to the nearest cluster center vector does not exceed a threshold, the account identified by that cluster center vector is taken as the account of the current user, and the account is switched accordingly. The method is suitable for switching user accounts on intelligent voice equipment.

Description

User account switching method of intelligent voice equipment
Technical Field
The invention relates to the field of intelligent voice equipment, in particular to a user account switching method of intelligent voice equipment.
Background
With the rapid development of the Internet of Things and artificial intelligence, and in particular the progress of speech recognition technology, devices are becoming increasingly intelligent, and users can operate and control them by voice alone. When one device, such as a household television or a smart speaker, serves several users, the device needs to store the preference settings and related content of each user, so a fast and effective method for switching user accounts is needed.
At present, intelligent voice equipment mainly manages accounts through a companion app, which handles registration, login, and similar operations and thereby switches the user role. In an era of voice interaction, this way of switching is neither direct nor fast enough, and it harms the user experience.
In addition, mainstream intelligent voice equipment is activated by a wake-up word: the user must first speak the wake-up word to activate the voice interaction function of the equipment. The wake-up word is therefore the most natural point at which to handle user switching.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a user account switching method for intelligent voice equipment that addresses the problem that user account switching on such equipment is not fast enough.
To solve this problem, the invention adopts the following technical solution: a user account switching method for intelligent voice equipment, comprising the following steps:
Step 1: collect wake-up word audio signals spoken by different users, and convert the audio signals into fixed-length digital signals;
Step 2: input all of the collected digital signals into a recurrent neural network (RNN), which outputs a feature vector for each wake-up word, and cluster all of the feature vectors with a clustering algorithm, each cluster center vector serving as an account identifier;
Step 3: when a user switches accounts by speaking the wake-up word, the equipment collects the wake-up word audio signal and converts it into a fixed-length digital signal;
Step 4: input the digital signal from step 3 into the same RNN as in step 2, which outputs the feature vector of the current wake-up word; calculate the distance between this feature vector and each cluster center vector; if the distance to the nearest cluster center vector does not exceed a threshold, take the account identified by the nearest cluster center vector as the account of the current user, thereby switching the account.
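The decision in step 4 can be written compactly. The symbols below are introduced here for illustration and do not appear in the source: f is the feature vector of the current wake-up word, c_1 ... c_K are the cluster center vectors, and τ is the threshold. The Euclidean norm is assumed because the embodiment names Euclidean distance.

```latex
k^{*} = \arg\min_{1 \le k \le K} \lVert f - c_k \rVert_2 ,
\qquad
\text{account} =
\begin{cases}
c_{k^{*}} & \text{if } \lVert f - c_{k^{*}} \rVert_2 \le \tau \\
\text{no match} & \text{otherwise}
\end{cases}
```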
Furthermore, to make it easier to add and remove user accounts, the method may also comprise the following step:
if the distance between the feature vector of the current wake-up word and the nearest cluster center vector exceeds the threshold, take the feature vector as a new cluster center;
and if, after the number of consecutively collected wake-up word audio signals exceeds a set sample increment, the number of samples belonging to a cluster center is still below a set sample size threshold, remove that cluster center and discard the corresponding sample data.
Further, for reasonable clustering, the sample increment may be 100 and the sample size threshold may be 28.
Further, the digital signal may be a 128-bit binary number.
Further, the RNN is an LSTM neural network.
Further, as a reasonable setting, the threshold may be 0.6.
The beneficial effect of the invention is that the user can quickly switch user accounts simply by speaking the corresponding wake-up word, which improves the user experience.
Drawings
Fig. 1 is a flowchart of switching user accounts according to an embodiment.
Detailed Description
To overcome the disadvantage that user account switching on intelligent voice equipment is not fast enough, a user account switching method for intelligent voice equipment is provided. As shown in Fig. 1, the method comprises the following steps:
Step 1: collect wake-up word audio signals spoken by different users, and convert the audio signals into fixed-length digital signals;
Step 2: input all of the collected digital signals into a recurrent neural network (RNN), which outputs a feature vector for each wake-up word, and cluster all of the feature vectors with a clustering algorithm, each cluster center vector serving as an account identifier;
Step 3: when a user switches accounts by speaking the wake-up word, the equipment collects the wake-up word audio signal and converts it into a fixed-length digital signal;
Step 4: input the digital signal from step 3 into the same RNN as in step 2, which outputs the feature vector of the current wake-up word; calculate the distance between this feature vector and each cluster center vector; if the distance to the nearest cluster center vector does not exceed a threshold, take the account identified by the nearest cluster center vector as the account of the current user, thereby switching the account.
Step 5: if the distance between the feature vector of the current wake-up word and the nearest cluster center vector exceeds the threshold, take the feature vector as a new cluster center;
and if, after the number of consecutively collected wake-up word audio signals exceeds a set sample increment, the number of samples belonging to a cluster center is still below a set sample size threshold, remove that cluster center and discard the corresponding sample data.
The present invention will be specifically described below by way of examples.
This embodiment provides a user account switching method for intelligent voice equipment, which mainly comprises model training, model use, and iterative training.
Model training comes first. Wake-up word audio signals spoken by different users are collected and converted into 128-bit binary digital signals through sampling, quantization, and coding. All of the collected digital signals are then input into an RNN, which here is a standard single-layer LSTM network with 128 hidden nodes. Training uses the Adam optimization algorithm with a momentum of 0.5 and an initial learning rate of 0.0002, which is halved every 50 iterations. Each training step takes a batch of 32 digital signals. The loss function is the distance between the feature vector output by the LSTM network and the cluster center vector, and training is considered converged when the number of iterations reaches 600 or the loss falls below 0.6.
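The training schedule described above can be sketched as two small helpers. This is an illustration only: the function names are hypothetical, and reading the decay description as "the learning rate is halved every 50 iterations" is an interpretation of the translated text. The LSTM itself and the optimizer are not shown.

```python
def learning_rate(iteration, initial_lr=0.0002, decay_every=50, factor=0.5):
    """Step decay: multiply the rate by `factor` once per `decay_every` iterations."""
    return initial_lr * factor ** (iteration // decay_every)


def should_stop(iteration, loss, max_iterations=600, loss_target=0.6):
    """Stop when the iteration budget is used up or the loss falls below the target."""
    return iteration >= max_iterations or loss < loss_target
```

With these defaults the rate stays at 0.0002 for iterations 0 to 49 and drops to 0.0001 at iteration 50.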
All of the feature vectors output by the RNN are clustered with the K-means clustering algorithm, and each cluster center vector is used as an account identifier.
The initial number of categories K for the K-means algorithm is 2. The algorithm is considered converged when the mean squared error is less than 0.8 or when no sample point changes its category assignment. The distance between a feature vector and a cluster center vector is the Euclidean distance.
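A minimal K-means sketch using only the Python standard library illustrates the clustering step with the stated settings (K starting at 2, Euclidean distance, and the two convergence conditions). The function name, the seeding from the first k points, and the `max_rounds` safety limit are assumptions made for this example; the source does not say how the centers are initialized.

```python
import math


def kmeans(points, k=2, max_rounds=100, mse_target=0.8):
    """Plain K-means over a list of equal-length vectors, using Euclidean distance."""
    # Assumption: the first k points seed the centers.
    centers = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_rounds):
        # Assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Move each center to the mean of its members (an empty cluster keeps its center).
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        mse = sum(
            math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, new_labels)
        ) / len(points)
        # Converged when assignments are unchanged or the mean squared error is small enough.
        converged = new_labels == labels or mse < mse_target
        labels = new_labels
        if converged:
            break
    return centers, labels
```

Each returned center would then play the role of an account identifier; `labels[i]` is the cluster index assigned to `points[i]`.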
Next comes the model use stage. After a user speaks the wake-up word, the equipment collects the wake-up word audio signal and converts it into a 128-bit binary digital signal through sampling, quantization, and coding. The digital signal is input into the RNN, and the distance between the output feature vector and each cluster center vector is calculated. If the distance to the nearest cluster center vector does not exceed the threshold of 0.6, the account identified by that cluster center vector is taken as the account of the current user, and the account is switched.
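The lookup at the heart of the model use stage can be sketched as a nearest-center search with a distance cutoff. The function name and the convention of returning `None` when no center is close enough are assumptions for illustration; the feature vector is taken as already produced by the network.

```python
import math


def match_account(feature, centers, threshold=0.6):
    """Return the index of the nearest cluster center, or None if it is farther than the threshold."""
    if not centers:
        return None
    nearest = min(range(len(centers)), key=lambda i: math.dist(feature, centers[i]))
    if math.dist(feature, centers[nearest]) <= threshold:
        return nearest
    return None
```

A caller would switch to the account associated with the returned index and treat `None` as an unrecognized speaker.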
Finally, the model is trained iteratively. If the distance between the feature vector of a user's wake-up word and the nearest cluster center vector exceeds the threshold of 0.6, the feature vector is taken as a new cluster center. If, after the number of consecutively collected wake-up word audio signals exceeds the set sample increment of 100, the number of samples belonging to a cluster center is still below the set sample size threshold of 28, that cluster center is removed and the corresponding sample data is discarded.
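One way to read the add-and-remove rule is as a counting window: a center is created whenever a sample is too far from every existing center, and once more than the sample increment has been observed, centers with too few samples are dropped. The sketch below follows that reading. The class name, the per-window counters, and the reset after pruning are assumptions; the source does not specify how samples are counted between pruning passes.

```python
import math


class AccountClusters:
    """Tracks cluster centers and how many wake-word samples each one has matched."""

    def __init__(self, threshold=0.6, sample_increment=100, min_samples=28):
        self.threshold = threshold
        self.sample_increment = sample_increment
        self.min_samples = min_samples
        self.centers = []  # one feature vector per account
        self.counts = []   # samples matched to each center in the current window
        self.seen = 0      # samples observed in the current window

    def observe(self, feature):
        """Match a feature vector to a center, creating a new center when none is close enough."""
        self.seen += 1
        best = None
        if self.centers:
            best = min(range(len(self.centers)),
                       key=lambda i: math.dist(feature, self.centers[i]))
            if math.dist(feature, self.centers[best]) > self.threshold:
                best = None
        if best is None:
            self.centers.append(list(feature))
            self.counts.append(1)
            best = len(self.centers) - 1
        else:
            self.counts[best] += 1
        account = self.centers[best]
        if self.seen > self.sample_increment:
            self._prune()
        return account

    def _prune(self):
        """Drop centers whose sample count is below the minimum, then start a new window."""
        keep = [i for i, n in enumerate(self.counts) if n >= self.min_samples]
        self.centers = [self.centers[i] for i in keep]
        self.counts = [0 for _ in keep]
        self.seen = 0
```

With the values given in the text, `AccountClusters()` uses a threshold of 0.6, a sample increment of 100, and a sample size threshold of 28.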

Claims (6)

1. A user account switching method for intelligent voice equipment, characterized by comprising the following steps:
Step 1: collect wake-up word audio signals spoken by different users, and convert the audio signals into fixed-length digital signals;
Step 2: input all of the collected digital signals into a recurrent neural network (RNN), which outputs a feature vector for each wake-up word, and cluster all of the feature vectors with a clustering algorithm, each cluster center vector serving as an account identifier;
Step 3: when a user switches accounts by speaking the wake-up word, the equipment collects the wake-up word audio signal and converts it into a fixed-length digital signal;
Step 4: input the digital signal from step 3 into the same RNN as in step 2, which outputs the feature vector of the current wake-up word; calculate the distance between this feature vector and each cluster center vector; if the distance to the nearest cluster center vector does not exceed a threshold, take the account identified by the nearest cluster center vector as the account of the current user, thereby switching the account.
2. The user account switching method for intelligent voice equipment according to claim 1, further comprising step 5:
if the distance between the feature vector of the current wake-up word and the nearest cluster center vector exceeds the threshold, take the feature vector as a new cluster center;
and if, after the number of consecutively collected wake-up word audio signals exceeds a set sample increment, the number of samples belonging to a cluster center is still below a set sample size threshold, remove that cluster center and discard the corresponding sample data.
3. The user account switching method for intelligent voice equipment according to claim 2, wherein the sample increment is 100 and the sample size threshold is 28.
4. The user account switching method for intelligent voice equipment according to claim 1, wherein the digital signal is a 128-bit binary number.
5. The user account switching method for intelligent voice equipment according to claim 1, wherein the RNN is an LSTM neural network.
6. The user account switching method for intelligent voice equipment according to claim 1, wherein the threshold is 0.6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911083446.4A CN110782904A (en) 2019-11-07 2019-11-07 User account switching method of intelligent voice equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911083446.4A CN110782904A (en) 2019-11-07 2019-11-07 User account switching method of intelligent voice equipment

Publications (1)

Publication Number Publication Date
CN110782904A (en) 2020-02-11

Family

ID=69389576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911083446.4A Pending CN110782904A (en) 2019-11-07 2019-11-07 User account switching method of intelligent voice equipment

Country Status (1)

Country Link
CN (1) CN110782904A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113452583A (en) * 2020-03-25 2021-09-28 阿里巴巴集团控股有限公司 Account switching method and system, storage medium and processing equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105139390A (en) * 2015-08-14 2015-12-09 四川大学 Image processing method for detecting pulmonary tuberculosis focus in chest X-ray DR film
CN105845140A (en) * 2016-03-23 2016-08-10 广州势必可赢网络科技有限公司 Speaker confirmation method and speaker confirmation device used in short voice condition
CN105897686A (en) * 2015-12-21 2016-08-24 乐视致新电子科技(天津)有限公司 Smart television user account speech management method and smart television
US20160350610A1 (en) * 2014-03-18 2016-12-01 Samsung Electronics Co., Ltd. User recognition method and device
CN106204451A (en) * 2016-07-08 2016-12-07 西安电子科技大学 The Image Super-resolution Reconstruction method embedded based on the fixing neighborhood of constraint
CN107492382A (en) * 2016-06-13 2017-12-19 阿里巴巴集团控股有限公司 Voiceprint extracting method and device based on neutral net
CN107809667A (en) * 2017-10-26 2018-03-16 深圳创维-Rgb电子有限公司 Television voice exchange method, interactive voice control device and storage medium
CN109063754A (en) * 2018-07-18 2018-12-21 武汉大学 A kind of remote sensing image multiple features combining classification method based on OpenStreetMap

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张军英: "Modern Methods and Techniques of Speaker Recognition" (《说话人识别的现代方法与技术》), 31 October 1994 *
赵力: "Speech Signal Processing" (《语音信号处理》), 30 June 2009 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200211