CN110738988A

CN110738988A - Shower voice recognition system and method based on the Viterbi algorithm

Info

Publication number: CN110738988A
Application number: CN201911018314.3A
Authority: CN
Inventors: 吴淼; 唐刚
Original assignee: Shanghai Maritime University
Current assignee: Shanghai Maritime University
Priority date: 2019-10-24
Filing date: 2019-10-24
Publication date: 2020-01-31

Abstract

The invention discloses a shower voice recognition system and method based on the Viterbi algorithm.

Description

Shower voice recognition system and method based on the Viterbi algorithm

Technical Field

The invention relates to the technical field of voice recognition, and in particular to a shower voice recognition system and method based on the Viterbi algorithm.

Background

With rising living standards and improving housing conditions, many households have installed bathing facilities, and the shower head is the most common bathing device. At present, most shower products require manual adjustment of the water temperature and the water flow. As a result, able-bodied users cannot free their hands, disabled users cannot conveniently adjust the shower themselves, and the user experience is poor.

In the prior art, patent CN201710369098 provides a device that automatically adjusts the amount of water sprayed by an intelligent shower head according to the position of the human body; although the water amount is determined automatically to a certain degree, the temperature is still adjusted manually. Patent CN201820904890 provides a shower head capable of measuring the temperature; although the temperature no longer needs to be sensed by hand, individuals differ in how hot or cold a given water temperature feels, so a temperature suitable for the user cannot be obtained accurately without manual adjustment.

Therefore, the technical problem to be solved by the invention is to realize intelligent control of the water temperature and of the on/off function of the shower head, so as to free both hands and assist blind users.

Disclosure of Invention

The invention aims to provide a shower voice recognition system and method based on the Viterbi algorithm, to realize intelligent control of the water temperature and of the on/off function of the shower head, so as to free both hands and assist blind users.

To achieve this purpose, the invention provides a shower voice recognition method based on the Viterbi algorithm, which comprises the following steps:

step 1: collecting audio data sent by a user;

step 2: identifying the collected audio data as characters based on a Viterbi algorithm;

step 3: converting the recognized characters into actions, including switching the shower head on and off and controlling the water temperature.

In the shower voice recognition method based on the Viterbi algorithm, step 2 comprises the following steps:

step 2.1: carrying out noise reduction pretreatment on the acquired audio data;

step 2.2: performing feature extraction on the noise-reduced audio data to extract one or more groups of characteristic parameters capable of describing the audio data;

step 2.3: decoding the extracted characteristic parameters based on the Viterbi algorithm to obtain an optimal character recognition result.

In the shower voice recognition method based on the Viterbi algorithm, step 2.1 comprises the following steps:

step 2.1.1: carrying out mute cutting of the head end and the tail end of the collected audio data;

step 2.1.2: performing framing processing on the cut audio data;

step 2.1.3: extracting the useful voice signal from the noise background of the framed audio data, so as to suppress and reduce noise interference.
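By way of non-limiting illustration, the head and tail silence removal of step 2.1.1 might be sketched as follows. The patent does not specify how silence is detected; the amplitude threshold and the function name `trim_silence` are assumptions made for this example only.

```python
def trim_silence(samples, threshold=0.02):
    """Drop leading and trailing samples whose absolute amplitude is below threshold.

    The threshold value is a hypothetical choice; the source does not give one.
    """
    start = 0
    end = len(samples)
    # Advance past quiet samples at the head.
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # Retreat past quiet samples at the tail.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


audio = [0.0, 0.01, -0.005, 0.4, -0.3, 0.01, 0.25, 0.0, 0.0]
print(trim_silence(audio))  # [0.4, -0.3, 0.01, 0.25]
```

Quiet samples inside the utterance (the 0.01 above) are kept, since only the two ends are trimmed.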

In the shower voice recognition method based on the Viterbi algorithm, in step 2.1.2, framing is performed with a moving window function, and adjacent frames overlap.
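As a minimal sketch of overlapping framing with a moving window, the following example splits a signal into frames and applies a Hamming window to each. The window type, the 400-sample frame length and the 160-sample hop (25 ms and 10 ms at an assumed 16 kHz sampling rate) are illustrative assumptions; the patent states only that a moving window function is used and that frames overlap.

```python
import math


def frame_signal(samples, frame_len=400, hop=160):
    """Split samples into overlapping frames and apply a Hamming window to each.

    frame_len and hop are hypothetical defaults; frame_len must be greater than 1.
    """
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Because hop < frame_len, consecutive frames share frame_len - hop samples.
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames


one_second = [1.0] * 16000          # one second of audio at the assumed 16 kHz
frames = frame_signal(one_second)
print(len(frames))                  # 98
```

Trailing samples that do not fill a whole frame are dropped in this sketch; padding them instead would be an equally reasonable choice.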

In the shower voice recognition method based on the Viterbi algorithm, in step 2.2, the characteristic parameters include average energy, zero-crossing count, linear prediction cepstral coefficients and mel-frequency cepstral coefficients.
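Two of the listed parameters, average energy and zero-crossing count, can be computed per frame in a few lines. The sketch below is an illustration under assumed definitions (mean squared amplitude, and sign changes between adjacent samples); the cepstral coefficients (LPCC, MFCC) are not covered here, and the function names are hypothetical.

```python
def average_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossings(frame):
    """Number of sign changes between adjacent samples in one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


frame = [0.5, -0.5, 0.5, -0.5]
print(average_energy(frame))   # 0.25
print(zero_crossings(frame))   # 3
```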

In the shower voice recognition method based on the Viterbi algorithm, in step 2.3, during decoding, a Weighted Finite-State Transducer (WFST) search space is constructed from an acoustic model, a pronunciation dictionary and a language model, and the optimal path with the maximum matching probability is searched for in the WFST search space based on the Viterbi algorithm, so as to obtain an optimal character recognition result.

In the shower voice recognition method based on the Viterbi algorithm, the training method adopted for the acoustic model is a dynamic time warping method, a vector quantization method, a hidden Markov model method, an artificial neural network method, a support vector machine method or a wavelet transform method.

The invention also provides a shower voice recognition system based on the Viterbi algorithm, which comprises:

the data acquisition module is used for acquiring audio data sent by a user;

the voice recognition module is used for recognizing the collected audio data into characters based on a Viterbi algorithm;

and the control module is used for converting the recognized characters into actions including switching on and off the shower head and controlling water temperature.

In the shower voice recognition system based on the Viterbi algorithm, the voice recognition module comprises:

the information preprocessing module is used for carrying out noise reduction preprocessing on the acquired audio data;

the feature extraction module is used for performing feature extraction on the noise-reduced audio data so as to extract one or more groups of characteristic parameters capable of describing the audio data;

the model training module is used for obtaining an acoustic model, a language model and a preset pronunciation dictionary through training;

and the pattern matching module is used for decoding the extracted characteristic parameters based on a Viterbi algorithm, and in the decoding process, a WFST search space is constructed by utilizing an acoustic model, a pronunciation dictionary and a language model, and an optimal path with the maximum matching probability is searched in the WFST search space to obtain an optimal character recognition result.

In the shower voice recognition system based on the Viterbi algorithm, the information preprocessing module comprises:

the silence removal module is used for performing silence removal of the head end and the tail end of the acquired audio data;

the framing processing module is used for framing the cut audio data;

and the noise reduction processing module is used for extracting useful voice signals from a noise background for the audio data after the framing processing so as to inhibit and reduce noise interference.

Compared with the prior art, the invention has the following beneficial effects:

Based on the Viterbi algorithm, the invention applies voice recognition to the shower head, realizes automatic water temperature control and on/off switching of the shower head, improves the user experience, and is highly practical.

Drawings

FIG. 1 is a flow chart of the shower voice recognition method based on the Viterbi algorithm according to the present invention;

FIG. 2 is a flow chart of step 2 of the shower voice recognition method based on the Viterbi algorithm according to the present invention;

FIG. 3 is a flow chart of step 2.1 of the shower voice recognition method based on the Viterbi algorithm according to the present invention;

FIG. 4 is a schematic structural diagram of the shower voice recognition system based on the Viterbi algorithm according to the present invention;

FIG. 5 is a schematic structural diagram of the voice recognition module in the shower voice recognition system based on the Viterbi algorithm according to the present invention;

FIG. 6 is a schematic structural diagram of the information preprocessing module in the shower voice recognition system based on the Viterbi algorithm according to the present invention.

Detailed Description

The invention is further described with reference to the following embodiments, which are provided for illustration only and are not intended to limit the scope of the invention.

As shown in FIG. 1, the present invention provides a shower voice recognition method based on the Viterbi algorithm, which includes the following steps:

step 1: audio data (including user speech and background noise) emitted by the user is collected.

Step 2: recognizing the collected audio data as characters based on the Viterbi algorithm. In the Viterbi algorithm, starting from the initial state, the maximum probability over all paths reaching each state is recorded at every step; the search then advances to the next step using these maximum probabilities as the reference, until the globally optimal path is found.
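The recursion described above can be illustrated on a toy hidden Markov model. The following is a minimal sketch, not the decoder of the invention: the two states `A` and `B`, the observation symbols `x`, `y`, `z`, and all probabilities are hypothetical values chosen only to make the example checkable by hand.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        column = {}
        for s in states:
            # Keep only the maximum-probability way of reaching s at this step.
            prev, score = max(
                ((p, best[t - 1][p][0] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score * emit_p[s][obs[t]], prev)
        best.append(column)
    # Backtrack from the best final state to recover the globally optimal path.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path, prob


states = ("A", "B")
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.5, "y": 0.4, "z": 0.1}, "B": {"x": 0.1, "y": 0.3, "z": 0.6}}

path, prob = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
print(path)  # ['A', 'A', 'B']
```

A practical decoder would typically work with log probabilities to avoid numerical underflow on long utterances; plain probabilities are used here for readability.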

Further, as shown in FIG. 2, step 2 includes:

Step 2.1: performing noise reduction preprocessing on the collected audio data.

Further, as shown in FIG. 3, step 2.1 includes the following steps, which simplify the processing of the voice signal:

Step 2.1.1: removing silence from the head and tail of the collected audio data, to reduce interference with subsequent steps.

Step 2.1.2: framing the trimmed audio data, that is, cutting the sound into short segments, each of which is called a frame. Framing is implemented with a moving window function rather than by simple cutting, and adjacent frames overlap.

Step 2.1.3: extracting the useful voice signal from the noise background of the framed audio data, so as to suppress and reduce noise interference and achieve voice enhancement.

Step 2.2: performing feature extraction on the noise-reduced audio data to extract one or more groups of parameters capable of describing the features of the audio data.

Further, in step 2.2, the characteristic parameters include average energy, zero-crossing count, Linear Prediction Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC). These parameters are convenient to calculate and can be computed with efficient algorithms, which supports real-time speech recognition.

Further, in step 2.3, during decoding, a WFST search space is constructed from an acoustic model (trained from speech training data and noise data), a pronunciation dictionary, and a language model (trained from text training data), and the optimal path with the highest matching probability is found in the WFST search space based on the Viterbi algorithm, to obtain an optimal character recognition result.
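To make the idea of searching a WFST for the best path concrete, the following sketch runs a Viterbi-style search over a tiny hand-written transducer. It is a simplified illustration under several assumptions: the arcs, labels and weights are hypothetical, weights are treated as costs (for example negative log probabilities) so the best path is the cheapest one, there are no epsilon-input arcs, and the graph is written by hand rather than composed from acoustic, dictionary and language models.

```python
def best_path(arcs, start, finals, inputs):
    """Find the cheapest path that consumes `inputs` and ends in a final state.

    arcs: list of (src, input_label, output_label, weight, dst); an empty
    output_label means the arc emits nothing.
    Returns (total_cost, output_labels) or None if no path accepts the input.
    """
    # hyps maps state -> (cost, outputs) of the cheapest path reaching it so far.
    hyps = {start: (0.0, [])}
    for symbol in inputs:
        new_hyps = {}
        for state, (cost, outputs) in hyps.items():
            for src, ilabel, olabel, weight, dst in arcs:
                if src == state and ilabel == symbol:
                    cand = cost + weight
                    # Viterbi step: keep only the best hypothesis per state.
                    if dst not in new_hyps or cand < new_hyps[dst][0]:
                        new_hyps[dst] = (cand, outputs + ([olabel] if olabel else []))
        hyps = new_hyps
    finished = [(c, o) for s, (c, o) in hyps.items() if s in finals]
    return min(finished, key=lambda item: item[0]) if finished else None


# Hypothetical transducer mapping sound symbols to the words "on" and "off".
arcs = [
    (0, "o", "on", 0.5, 1),
    (1, "n", "", 0.2, 3),
    (0, "o", "off", 0.7, 2),
    (2, "f", "", 0.1, 3),
]

result = best_path(arcs, start=0, finals={3}, inputs=["o", "n"])
print(result[1])  # ['on']
```

After the first symbol both word hypotheses are alive; the second symbol eliminates the one whose arc does not match, which is the pruning effect the search relies on.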

Further, the main training methods for the acoustic model include Dynamic Time Warping (DTW), Vector Quantization (VQ), Hidden Markov Models (HMM), Artificial Neural Networks (ANN), Support Vector Machines (SVM), Wavelet Transform (WT), and the like.
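Of the listed methods, dynamic time warping is the simplest to show in a few lines. The sketch below computes a DTW distance between two one-dimensional sequences, such as a stored template and an incoming feature track. It is an illustration only: the absolute-difference local cost and the function name `dtw_distance` are assumptions, and real systems compare multi-dimensional feature vectors.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allow a match, or stretching either sequence by one step.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


template = [1, 2, 3]
slower_utterance = [1, 2, 2, 3]     # same shape, spoken more slowly
print(dtw_distance(template, slower_utterance))  # 0.0
```

The zero distance shows why DTW suits speech: the same word spoken at a different speed still aligns with its template.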

As shown in FIG. 4, the present invention further provides a shower voice recognition system based on the Viterbi algorithm, which includes:

The data acquisition module 1 is configured to collect audio data emitted by the user; specifically, a microphone may be used for collection.

The voice recognition module 2 is configured to recognize the collected audio data as characters based on the Viterbi algorithm.

The control module 3 is configured to convert the recognized characters into actions, including switching the shower head on and off and controlling the water temperature.
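As a hedged sketch of how recognized text might be turned into actions, the following example maps a few phrases to action tuples. The patent does not specify a command vocabulary or an actuator interface; the English phrases, the action names and the function `parse_command` are all hypothetical.

```python
import re


def parse_command(text):
    """Map recognized text to a hypothetical (action, argument) pair."""
    text = text.lower().strip()
    # A spoken temperature such as "38 degrees" becomes a set-temperature action.
    match = re.search(r"(\d+)\s*degrees", text)
    if match:
        return ("set_temperature", int(match.group(1)))
    if "turn on" in text:
        return ("shower_on", None)
    if "turn off" in text:
        return ("shower_off", None)
    return ("unknown", None)


print(parse_command("Turn on the shower"))           # ('shower_on', None)
print(parse_command("Set the water to 38 degrees"))  # ('set_temperature', 38)
```

Returning an explicit `unknown` action lets the controller ignore unrecognized speech instead of acting on a misheard command.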

Further, as shown in FIG. 5, the voice recognition module 2 includes:

The information preprocessing module 21 is configured to perform noise reduction preprocessing on the collected audio data.

The feature extraction module 22 is configured to perform feature extraction on the noise-reduced audio data to extract one or more groups of parameters capable of describing the features of the audio data.

The model training module 23 is configured to obtain an acoustic model, a language model and a preset pronunciation dictionary through training.

The pattern matching module 24 is configured to decode the extracted characteristic parameters based on the Viterbi algorithm; during decoding, a WFST search space is constructed from the acoustic model, the pronunciation dictionary and the language model, and the optimal path with the maximum matching probability is searched for in the WFST search space to obtain an optimal character recognition result.

Further, as shown in FIG. 6, the information preprocessing module 21 includes: a silence removal module 211 configured to remove silence from the head and tail of the collected audio data; a framing processing module 212 configured to frame the trimmed audio data; and a noise reduction processing module 213 configured to extract the useful voice signal from the noise background of the framed audio data, so as to suppress and reduce noise interference.

In conclusion, the invention applies voice recognition to the shower head based on the Viterbi algorithm, so that the water temperature and the on/off state of the shower head can be controlled automatically, which improves the user experience and is highly practical.

While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.

Claims

1. A shower voice recognition method based on the Viterbi algorithm, characterized by comprising the following steps:

step 1: collecting audio data sent by a user;

2. The shower voice recognition method based on the Viterbi algorithm according to claim 1, wherein step 2 comprises:

step 2.1: carrying out noise reduction pretreatment on the acquired audio data;

3. The shower voice recognition method based on the Viterbi algorithm according to claim 2, characterized in that step 2.1 comprises the following steps:

step 2.1.2: performing framing processing on the cut audio data;

4. The method of claim 2 in which in step 2.1.2 the framing is performed using a moving window function with overlap between frames.

5. The shower voice recognition method based on the Viterbi algorithm according to claim 2, characterized in that in step 2.2, the characteristic parameters include average energy, zero-crossing count, linear prediction cepstral coefficients and mel-frequency cepstral coefficients.

6. The shower voice recognition method based on the Viterbi algorithm according to claim 2, wherein in step 2.3, during decoding, a WFST search space is constructed from the acoustic model, the pronunciation dictionary and the language model, and the optimal path with the maximum matching probability is found in the WFST search space based on the Viterbi algorithm, so as to obtain the optimal character recognition result.

7. The method as claimed in claim 6, wherein the acoustic model is trained by dynamic time warping, vector quantization, hidden Markov model, artificial neural network, support vector machine, or wavelet transform.

8. A shower voice recognition system based on the Viterbi algorithm, characterized by comprising:

the data acquisition module is used for acquiring audio data sent by a user;

9. The shower voice recognition system based on the Viterbi algorithm according to claim 8, wherein the voice recognition module comprises:

10. The shower voice recognition system based on the Viterbi algorithm according to claim 9, wherein the information preprocessing module comprises:

the framing processing module is used for framing the cut audio data;