CN113192489A - Paint spraying robot voice recognition method based on multi-scale enhancement BiLSTM model - Google Patents
- Publication number
- CN113192489A CN113192489A CN202110531117.2A CN202110531117A CN113192489A CN 113192489 A CN113192489 A CN 113192489A CN 202110531117 A CN202110531117 A CN 202110531117A CN 113192489 A CN113192489 A CN 113192489A
- Authority
- CN
- China
- Prior art keywords
- layer
- signal
- frequency spectrum
- spraying
- voice recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000005507 spraying Methods 0.000 title claims abstract description 55
- 238000000034 method Methods 0.000 title claims abstract description 18
- 239000003973 paint Substances 0.000 title claims abstract description 13
- 238000001228 spectrum Methods 0.000 claims abstract description 27
- 238000012545 processing Methods 0.000 claims abstract description 11
- 230000005236 sound signal Effects 0.000 claims abstract description 11
- 238000005065 mining Methods 0.000 claims abstract description 4
- 238000012549 training Methods 0.000 claims abstract description 4
- 238000004364 calculation method Methods 0.000 claims description 15
- 230000004913 activation Effects 0.000 claims description 6
- 150000001875 compounds Chemical class 0.000 claims description 6
- 238000010422 painting Methods 0.000 claims description 6
- 238000007781 pre-processing Methods 0.000 claims description 6
- 238000012935 Averaging Methods 0.000 claims description 5
- 230000008569 process Effects 0.000 claims description 5
- 238000013461 design Methods 0.000 claims description 4
- 238000000605 extraction Methods 0.000 claims description 4
- 230000002708 enhancing effect Effects 0.000 claims description 3
- 230000007613 environmental effect Effects 0.000 claims description 3
- 238000001914 filtration Methods 0.000 claims description 3
- 230000004927 fusion Effects 0.000 claims description 3
- 238000010276 construction Methods 0.000 description 6
- 230000006870 function Effects 0.000 description 5
- 238000010586 diagram Methods 0.000 description 3
- 238000013527 convolutional neural network Methods 0.000 description 2
- 238000005034 decoration Methods 0.000 description 2
- 238000003745 diagnosis Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000012271 agricultural production Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000002457 bidirectional effect Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 229910001385 heavy metal Inorganic materials 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 239000000941 radioactive substance Substances 0.000 description 1
- 230000006403 short-term memory Effects 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- 230000002588 toxic effect Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Manipulator (AREA)
Abstract
A paint spraying robot voice recognition method based on a multi-scale enhanced BiLSTM model. 1) Collect common spraying voice instructions with a signal acquisition system, using an NI-9234 data acquisition card; 2) add white Gaussian noise to the acquired audio signal 100 times to generate noisy signals, compute the corresponding Mel spectrum sequences, and then compute the average of the 100 Mel spectrum sequences; 3) extract features from the averaged Mel spectrum sequence with a multi-scale convolution filter bank, and further mine the extracted features with a BiLSTM model to obtain the corresponding outputs; 4) splice the outputs of the BiLSTM model together, feed them through a fully connected layer and a Softmax layer, and finally realize speech recognition with a CTC algorithm; 5) embed the model trained in steps 1-4 into a spraying robot so that it can carry out the corresponding spraying tasks intelligently. The model of the invention realizes an intelligent voice recognition function for the spraying robot and has high practical application value.
Description
Technical Field
The invention relates to the field of intelligent spraying robots, and in particular to a voice recognition method for a spraying robot based on a multi-scale enhanced BiLSTM model.
Background
With the current rapid development of the domestic construction industry, the closely related decoration industry also has great market prospects. However, most work in the decoration industry is still finished manually; for example, wall surfaces are sprayed by workers holding spraying machines by hand, so the spraying quality varies and construction quality and efficiency are difficult to guarantee.
Manual spraying imposes high labor intensity on workers, the spraying distance and speed are hard to control, and errors in spraying thickness easily cause rework or even failure to meet quality requirements. Paint contains heavy metals, radioactive substances, toxic solvents and the like; it must be atomized during spraying, the atomized paint is easily inhaled into the lungs of on-site workers, and the harsh construction environment greatly harms sprayers' health. To address these problems, this patent proposes a voice recognition method for a painting robot based on a multi-scale enhanced BiLSTM model. It helps the robot automate the spraying of house walls, replacing disordered manual spraying with adaptive spraying operations, which improves the working environment, reduces workers' labor intensity, greatly raises spraying efficiency, and ensures construction quality.
A related domestic patent, "An intelligent spraying robot system and spraying method thereof" (201910960106.9), realizes intelligent spraying by designing a scanning and modeling unit, an offline programming unit, a drive control unit, a robot body and a thickness detection unit, effectively reducing quality problems caused by errors in the spraying trajectory and spraying process parameters. Another, "Building outer wall spraying method based on an intelligent spraying robot" (202011419313.2), enables a robot body to spray a building's outer wall automatically along a wavy trajectory by controlling a retraction assembly; the controller replenishes paint automatically according to the amount remaining in the paint box, so little manual intervention is required, construction cost is low, and there is no risk to personnel. In both patents, however, the spraying robot merely executes preset tasks and has no adaptability. In practice, a spraying robot needs to respond to changing conditions rather than execute tasks mechanically, so giving the spraying robot a voice recognition function, through which it can complete the corresponding spraying tasks adaptively, has important practical significance.
Disclosure of Invention
To solve these problems, the invention provides a paint spraying robot voice recognition method based on a multi-scale enhanced BiLSTM model, built on a convolutional neural network (CNN) and a bidirectional long short-term memory network (BiLSTM). First, considering the influence of the noise contained in collected signals on recognition accuracy, this patent proposes an ensemble denoising algorithm: repeated ensemble averaging largely removes the noise and enhances the features of the voice signal. Second, since the features of a voice signal are not easy to mine, a multi-scale convolution filter bank is designed; four convolution kernels of different effective lengths mine the features of the signal at multiple scales, which greatly helps the model extract features from the voice signal and improves its recognition accuracy. Finally, a BiLSTM model further extracts features from the voice signal, and a fully connected layer, a Softmax layer and a CTC algorithm are added to the model to realize voice recognition. To this end, the invention provides a paint spraying robot voice recognition method based on a multi-scale enhanced BiLSTM model, comprising the following specific steps:
Step 1, instruction signal acquisition: collect common spraying voice instructions with a signal acquisition system, using an NI-9234 data acquisition card;
Step 2, ensemble denoising preprocessing: add white Gaussian noise to the acquired audio signal 100 times to generate noisy signals, compute the corresponding Mel spectrum sequences, and then compute the average of the 100 Mel spectrum sequences;
Step 3, multi-scale feature extraction: extract features from the averaged Mel spectrum sequence with a multi-scale convolution filter bank, and further mine the extracted features with a BiLSTM model to obtain the corresponding outputs;
Step 4, feature fusion and recognition: splice the outputs of the BiLSTM model together, feed them through a fully connected layer and a Softmax layer, and finally realize voice recognition with a CTC algorithm;
Step 5, application to the spraying robot: embed the model trained in steps 1-4 into a spraying robot so that it can carry out the corresponding spraying tasks intelligently.
Further, the ensemble denoising preprocessing of the audio signal in step 2 can be expressed as follows:
Assume the collected audio signal is x(t), consisting of a valid signal c(t) and an environmental noise signal n(t), i.e. x(t) = c(t) + n(t). White Gaussian noise g_i(t) is added to x(t) 100 times, generating noisy signals s_i(t) = x(t) + g_i(t), i = 1, ..., 100. The Mel spectrum sequence of each s_i(t) is computed, and finally the average sequence Ms_ave of the 100 Mel spectrum sequences is obtained:
Ms_ave = (1/100) · Σ_{i=1}^{100} Ms(s_i(t))
where Ms(·) denotes the computation of the Mel spectrum sequence. Since white Gaussian noise has zero mean, the average (1/100) · Σ_{i=1}^{100} g_i(t) is close to 0 when the number of noise additions is large enough, so the environmental noise signal n(t) in the collected signal can be filtered out, greatly enhancing the features of the valid signal c(t).
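The ensemble denoising step above can be sketched in a few lines of numpy. This is a minimal illustration, not the patent's implementation: a plain magnitude spectrogram stands in for the Mel spectrum Ms(·) (in practice a Mel filterbank would be applied to each frame), and the frame length, noise level and toy signal are arbitrary assumptions.

```python
import numpy as np

def frame_spectrogram(x, frame_len=256, hop=128):
    # Magnitude spectrogram as a stand-in for the Mel spectrum Ms(.);
    # a Mel filterbank applied to each frame would give the true Ms(.).
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))

def ensemble_denoised_spectrum(x, n_rounds=100, noise_std=0.05, seed=0):
    # Add white Gaussian noise n_rounds times (s_i = x + g_i) and
    # average the resulting spectra to obtain Ms_ave.
    rng = np.random.default_rng(seed)
    acc = np.zeros_like(frame_spectrogram(x))
    for _ in range(n_rounds):
        s = x + rng.normal(0.0, noise_std, size=x.shape)  # noisy copy s_i(t)
        acc += frame_spectrogram(s)
    return acc / n_rounds  # Ms_ave

# toy "voice" signal: a 1 kHz tone sampled at 8 kHz
t = np.arange(8000) / 8000.0
x = np.sin(2 * np.pi * 1000 * t)
ms_ave = ensemble_denoised_spectrum(x)
```

Averaging over many noisy copies keeps the stable spectral structure of the signal while the zero-mean perturbations average down, which is the effect the patent relies on.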
Further, in step 3, the specific steps for extracting multi-scale features from the average Mel spectrum sequence Ms_ave obtained in step 2 are as follows:
Step 3.1: design four one-dimensional convolution kernels of different scales to filter Ms_ave; the lengths of the four convolution kernels are each defined relative to L, where L is the length of Ms_ave;
Step 3.2: perform further processing with a BiLSTM model; the specific steps are as follows:
Step 3.2.1: build a BiLSTM network with a forward propagation layer and a backward propagation layer;
Step 3.2.2: use the forward propagation layer to process Ms_ave and obtain the forward hidden state h_t^f at time t, computed as:
h_t^f = H(W_xh^f · x_t + W_hh^f · h_{t-1}^f + b_h^f)
where H denotes the activation function of the hidden layer (the sigmoid activation function is selected in this patent), x_t is the input data (Ms_ave), W_xh^f is the connection weight matrix between the forward input layer and the hidden layer, W_hh^f is the connection weight matrix between forward hidden layers, h_{t-1}^f is the forward hidden state at time t-1, and b_h^f is the bias of the forward hidden layer.
Step 3.2.3: use the backward propagation layer to process Ms_ave and obtain the backward hidden state h_t^b at time t, computed as:
h_t^b = H(W_xh^b · x_t + W_hh^b · h_{t+1}^b + b_h^b)
where W_xh^b is the connection weight matrix between the backward input layer and the hidden layer, W_hh^b is the connection weight matrix between backward hidden layers, h_{t+1}^b is the backward hidden state at time t+1 (the backward layer processes the sequence in reverse), and b_h^b is the bias of the backward hidden layer.
Step 3.2.4: compute the output vector y_t of the output layer as:
y_t = W_hy^f · h_t^f + W_hy^b · h_t^b + b_0
where W_hy^f is the connection weight matrix between the forward hidden layer and the output layer, W_hy^b is the connection weight matrix between the backward hidden layer and the output layer, and b_0 is the bias of the output layer.
Step 3.2.5: splice the outputs of the multiple BiLSTM models together, input the spliced result into a fully connected layer, and then process it with a Softmax layer;
Step 3.2.6: decode the output of the Softmax layer with a CTC algorithm to realize voice recognition.
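The multi-scale filtering of step 3.1 can be sketched as follows. This is a hypothetical numpy illustration: the patent text does not state the four kernel lengths, so the fractions L/4, L/8, L/16 and L/32 are placeholders, and the random kernel weights stand in for the learned filters of the actual model.

```python
import numpy as np

def multiscale_filter(ms_ave, scale_fracs=(4, 8, 16, 32), seed=0):
    # Filter the averaged Mel sequence with four 1-D kernels whose
    # lengths are fractions of L = len(ms_ave); the fractions are
    # illustrative assumptions, not the patent's values.
    rng = np.random.default_rng(seed)
    L = len(ms_ave)
    outputs = []
    for d in scale_fracs:
        k = max(1, L // d)
        kernel = rng.normal(size=k) / np.sqrt(k)  # placeholder for learned weights
        # "same" padding keeps each feature sequence aligned with ms_ave
        outputs.append(np.convolve(ms_ave, kernel, mode="same"))
    return outputs  # four feature sequences, one per scale

feats = multiscale_filter(np.sin(np.linspace(0, 10, 128)))
```

Each of the four feature sequences would then be fed into its own BiLSTM branch, as step 3.2 describes.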
The paint spraying robot voice recognition method based on a multi-scale enhanced BiLSTM model disclosed by the invention has the following beneficial technical effects:
1. Considering the influence of the noise contained in collected signals on recognition accuracy, the invention proposes an ensemble denoising algorithm; repeated ensemble averaging largely removes the noise and enhances the features of the voice signal, improving the robustness of the network model;
2. Since the features of a voice signal are not easy to mine, the invention designs a multi-scale convolution filter bank; four convolution kernels of different effective lengths mine the features of the signal at multiple scales, which greatly helps the model extract features from the voice signal and improves its recognition accuracy;
3. The invention adopts a BiLSTM model to further extract features from the voice signal, adds a fully connected layer, a Softmax layer and a CTC algorithm to the model, and finally realizes voice recognition with the newly designed model.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a network structure diagram of the multi-scale enhanced BiLSTM model according to the present invention.
Detailed Description
The invention is described in further detail below with reference to the following detailed description and accompanying drawings:
the invention provides a voice recognition method of a painting robot based on a multi-scale enhanced BilSTM model, and aims to help the painting robot to intelligently recognize voice so as to complete a corresponding painting task. FIG. 1 is a flow chart of the present invention, and the steps of the present invention will be described in detail in conjunction with the flow chart.
Step 1, instruction signal acquisition: collect common spraying voice instructions with a signal acquisition system, using an NI-9234 data acquisition card.
Step 2, ensemble denoising preprocessing: add white Gaussian noise to the acquired audio signal 100 times to generate noisy signals, compute the corresponding Mel spectrum sequences, and then compute the average of the 100 Mel spectrum sequences.
The ensemble denoising preprocessing of the audio signal in step 2 can be expressed as follows:
Assume the collected audio signal is x(t), consisting of a valid signal c(t) and an environmental noise signal n(t), i.e. x(t) = c(t) + n(t). White Gaussian noise g_i(t) is added to x(t) 100 times, generating noisy signals s_i(t) = x(t) + g_i(t), i = 1, ..., 100. The Mel spectrum sequence of each s_i(t) is computed, and finally the average sequence Ms_ave of the 100 Mel spectrum sequences is obtained:
Ms_ave = (1/100) · Σ_{i=1}^{100} Ms(s_i(t))
where Ms(·) denotes the computation of the Mel spectrum sequence. Since white Gaussian noise has zero mean, the average (1/100) · Σ_{i=1}^{100} g_i(t) is close to 0 when the number of noise additions is large enough, so the environmental noise signal n(t) in the collected signal can be filtered out, greatly enhancing the features of the valid signal c(t).
Step 3, multi-scale feature extraction: extracting features of the average Mel frequency spectrum sequence by using a multi-scale convolution filter, and further mining the extracted features by using a BilSTM model to obtain corresponding output;
step 3 Merr spectrum averaging sequence Ms obtained in step 2aveThe specific steps for extracting the multi-scale features are as follows:
step 3.1, design four one-dimensional convolution checks Ms of different scalesaveFiltering is carried out, the lengths of the four convolution kernels are respectivelyWherein L is MsaveLength of (d);
step 3.2, carrying out further processing by using a BilSTM model, wherein the specific steps can be expressed as follows:
step 3.2.1, building a BilSTM network with a forward propagation layer and a backward propagation layer;
step 3.2.2, forward propagation layer pair Ms is utilizedavePerforming calculation to obtain forward hidden state at t momentThe calculation expression is as follows:
in the formula, H represents an activation function of a hidden layer, and a sigmoid activation function, x, is selected in the patenttFor inputting data (Ms)ave),Representing the connection weight coefficient between the forward input layer and the hidden layer,represents the connection weight coefficient between the forward hidden layers,representing the forward hidden layer state at time t-1,representing the bias coefficients of the forward hidden layer.
Step 3.2.3, Ms is paired using the backward propagation layeraveCalculating to obtain a backward hidden state at time tThe calculation expression is as follows:
in the formula (I), the compound is shown in the specification,represents the connection weight coefficient between the backward input layer and the hidden layer,represents the connection weight coefficient between the backward hidden layers,indicating the backward hidden layer state at time t-1,representing the bias coefficients of the backward hidden layer.
Step 3.2.4, calculating output vector y of output layertThe calculation expression is:
in the formula (I), the compound is shown in the specification,for the connection weight coefficient between the forward hidden layer and the output layer,is a connection weight coefficient between the backward hidden layer and the output layer, b0Is the bias coefficient of the output layer.
Step 3.2.5, splicing the outputs of a plurality of BilSTM models together, inputting the spliced outputs into a full connection layer, and then processing the spliced outputs by a Softmax layer;
and 3.2.6, decoding the output of the Softmax layer by utilizing a CTC algorithm to realize voice recognition.
Step 4, feature fusion and recognition: splice the outputs of the BiLSTM model together, feed them through a fully connected layer and a Softmax layer, and finally realize voice recognition with a CTC algorithm.
Step 5, application to the spraying robot: embed the model trained in steps 1-4 into a spraying robot so that it can carry out the corresponding spraying tasks intelligently.
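The forward, backward and output computations of steps 3.2.2-3.2.4 can be sketched in numpy. This follows the simplified recurrence given above with H = sigmoid; a full BiLSTM would add input, forget and output gates and a cell state, and the random weights here stand in for trained parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def birnn_outputs(xs, d_in, d_h, seed=0):
    # Simplified bidirectional recurrent layer per the patent's equations:
    #   h_t^f = H(Wf_xh x_t + Wf_hh h_{t-1}^f + bf)
    #   h_t^b = H(Wb_xh x_t + Wb_hh h_{t+1}^b + bb)
    #   y_t   = Wf_hy h_t^f + Wb_hy h_t^b + b0
    rng = np.random.default_rng(seed)
    Wf_xh, Wf_hh = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))
    Wb_xh, Wb_hh = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))
    Wf_hy, Wb_hy = rng.normal(size=(d_h,)), rng.normal(size=(d_h,))
    bf, bb, b0 = np.zeros(d_h), np.zeros(d_h), 0.0

    T = len(xs)
    hf = np.zeros((T, d_h))  # forward hidden states
    hb = np.zeros((T, d_h))  # backward hidden states
    for t in range(T):            # forward pass: depends on h_{t-1}^f
        prev = hf[t - 1] if t > 0 else np.zeros(d_h)
        hf[t] = sigmoid(Wf_xh @ xs[t] + Wf_hh @ prev + bf)
    for t in reversed(range(T)):  # backward pass: depends on h_{t+1}^b
        nxt = hb[t + 1] if t < T - 1 else np.zeros(d_h)
        hb[t] = sigmoid(Wb_xh @ xs[t] + Wb_hh @ nxt + bb)
    # combine both directions into the output vector y_t
    return np.array([Wf_hy @ hf[t] + Wb_hy @ hb[t] + b0 for t in range(T)])

ys = birnn_outputs(np.random.default_rng(1).normal(size=(6, 3)), d_in=3, d_h=4)
```

Because each y_t sees hidden states from both directions, it depends on the whole input sequence, which is the advantage over a unidirectional LSTM noted in the description of FIG. 2.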
FIG. 2 shows the network structure of the multi-scale enhanced BiLSTM model proposed by the invention. As the structure diagram shows, 100 groups of white Gaussian noise are added to the collected voice signal, the Mel spectrum sequences of the noise-added signals are computed, and the resulting sequences are arithmetically averaged to obtain the average Mel spectrum sequence; that is, ensemble noise addition filters out the noise interference in the original voice signal and enhances the features of the valid signal. Four convolution filter banks of different scales are then designed, and their outputs are each fed into a BiLSTM model, so that the features of the original signal are learned at multiple scales. The outputs of the BiLSTM models are then spliced together, and intelligent voice recognition is finally realized through a fully connected layer, a Softmax layer and a CTC decoding algorithm. In addition, as the structure of the BiLSTM model shows, a BiLSTM consists of forward and backward LSTM models; through the connection between the forward and backward hidden layers, the features contained in the sound signal can be mined more accurately than with a unidirectional LSTM.
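The CTC decoding stage mentioned above can be illustrated with the simplest rule, greedy (best-path) decoding: take the most probable label per frame, collapse consecutive repeats, then remove blanks. The patent does not specify which CTC decoder is used, and the label inventory and frame posteriors below are toy values.

```python
import numpy as np

BLANK = 0  # index of the CTC blank symbol (assumed convention)

def ctc_greedy_decode(probs):
    # probs: (T, n_labels) per-frame posteriors from the Softmax layer.
    # Greedy CTC: argmax per frame, collapse repeats, drop blanks.
    best = np.argmax(probs, axis=1)
    out, prev = [], None
    for label in best:
        if label != prev and label != BLANK:
            out.append(int(label))
        prev = label
    return out

# toy posteriors over {blank, 1, 2} for 6 frames -> path [1,1,blank,2,2,blank]
probs = np.array([
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
    [0.9, 0.05, 0.05],
])
decoded = ctc_greedy_decode(probs)  # -> [1, 2]
```

In a deployed recognizer, beam-search CTC decoding (optionally with a language model over the command vocabulary) would typically replace this greedy rule for better accuracy.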
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, but any modifications or equivalent variations made according to the technical spirit of the present invention are within the scope of the present invention as claimed.
Claims (3)
1. A paint spraying robot voice recognition method based on a multi-scale enhanced BiLSTM model, comprising the following specific steps:
Step 1, instruction signal acquisition: collect common spraying voice instructions with a signal acquisition system, using an NI-9234 data acquisition card;
Step 2, ensemble denoising preprocessing: add white Gaussian noise to the acquired audio signal 100 times to generate noisy signals, compute the corresponding Mel spectrum sequences, and then compute the average of the 100 Mel spectrum sequences;
Step 3, multi-scale feature extraction: extract features from the averaged Mel spectrum sequence with a multi-scale convolution filter bank, and further mine the extracted features with a BiLSTM model to obtain the corresponding outputs;
Step 4, feature fusion and recognition: splice the outputs of the BiLSTM model together, feed them through a fully connected layer and a Softmax layer, and finally realize voice recognition with a CTC algorithm;
Step 5, application to the spraying robot: embed the model trained in steps 1-4 into a spraying robot so that it can carry out the corresponding spraying tasks intelligently.
2. The paint spraying robot voice recognition method based on the multi-scale enhanced BiLSTM model according to claim 1, characterized in that the ensemble denoising preprocessing of the audio signal in step 2 can be expressed as follows:
Assume the collected audio signal is x(t), consisting of a valid signal c(t) and an environmental noise signal n(t), i.e. x(t) = c(t) + n(t). White Gaussian noise g_i(t) is added to x(t) 100 times, generating noisy signals s_i(t) = x(t) + g_i(t), i = 1, ..., 100. The Mel spectrum sequence of each s_i(t) is computed, and finally the average sequence Ms_ave of the 100 Mel spectrum sequences is obtained:
Ms_ave = (1/100) · Σ_{i=1}^{100} Ms(s_i(t))
where Ms(·) denotes the computation of the Mel spectrum sequence. Since white Gaussian noise has zero mean, the average (1/100) · Σ_{i=1}^{100} g_i(t) is close to 0 when the number of noise additions is large enough, so the environmental noise signal n(t) in the collected signal can be filtered out, greatly enhancing the features of the valid signal c(t).
3. The paint spraying robot voice recognition method based on the multi-scale enhanced BiLSTM model according to claim 1, characterized in that the specific steps in step 3 for extracting multi-scale features from the average Mel spectrum sequence Ms_ave obtained in step 2 are as follows:
Step 3.1: design four one-dimensional convolution kernels of different scales to filter Ms_ave; the lengths of the four convolution kernels are each defined relative to L, where L is the length of Ms_ave;
Step 3.2: perform further processing with a BiLSTM model; the specific steps are as follows:
Step 3.2.1: build a BiLSTM network with a forward propagation layer and a backward propagation layer;
Step 3.2.2: use the forward propagation layer to process Ms_ave and obtain the forward hidden state h_t^f at time t, computed as:
h_t^f = H(W_xh^f · x_t + W_hh^f · h_{t-1}^f + b_h^f)
where H denotes the activation function of the hidden layer (the sigmoid activation function is selected in this patent), x_t is the input data (Ms_ave), W_xh^f is the connection weight matrix between the forward input layer and the hidden layer, W_hh^f is the connection weight matrix between forward hidden layers, h_{t-1}^f is the forward hidden state at time t-1, and b_h^f is the bias of the forward hidden layer;
Step 3.2.3: use the backward propagation layer to process Ms_ave and obtain the backward hidden state h_t^b at time t, computed as:
h_t^b = H(W_xh^b · x_t + W_hh^b · h_{t+1}^b + b_h^b)
where W_xh^b is the connection weight matrix between the backward input layer and the hidden layer, W_hh^b is the connection weight matrix between backward hidden layers, h_{t+1}^b is the backward hidden state at time t+1, and b_h^b is the bias of the backward hidden layer;
Step 3.2.4: compute the output vector y_t of the output layer as:
y_t = W_hy^f · h_t^f + W_hy^b · h_t^b + b_0
where W_hy^f is the connection weight matrix between the forward hidden layer and the output layer, W_hy^b is the connection weight matrix between the backward hidden layer and the output layer, and b_0 is the bias of the output layer;
Step 3.2.5: splice the outputs of the multiple BiLSTM models together, input the spliced result into a fully connected layer, and then process it with a Softmax layer;
Step 3.2.6: decode the output of the Softmax layer with a CTC algorithm to realize voice recognition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110531117.2A CN113192489A (en) | 2021-05-16 | 2021-05-16 | Paint spraying robot voice recognition method based on multi-scale enhancement BiLSTM model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110531117.2A CN113192489A (en) | 2021-05-16 | 2021-05-16 | Paint spraying robot voice recognition method based on multi-scale enhancement BiLSTM model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113192489A true CN113192489A (en) | 2021-07-30 |
Family
ID=76981842
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110531117.2A Pending CN113192489A (en) | 2021-05-16 | 2021-05-16 | Paint spraying robot voice recognition method based on multi-scale enhancement BiLSTM model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113192489A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102542151A (en) * | 2011-11-30 | 2012-07-04 | 重庆大学 | Rotary machine axis track purification method based on ensemble empirical mode decomposition |
CN103226649A (en) * | 2013-03-25 | 2013-07-31 | 西安交通大学 | Ensemble noise-reconstructed EMD (empirical mode decomposition) method for early and compound faults of machinery |
CN103839197A (en) * | 2014-03-19 | 2014-06-04 | 国家电网公司 | Method for judging abnormal electricity consumption behaviors of users based on EEMD method |
CN110211574A (en) * | 2019-06-03 | 2019-09-06 | 哈尔滨工业大学 | Speech recognition modeling method for building up based on bottleneck characteristic and multiple dimensioned bull attention mechanism |
CN111653275A (en) * | 2020-04-02 | 2020-09-11 | 武汉大学 | Method and device for constructing voice recognition model based on LSTM-CTC tail convolution and voice recognition method |
CN112593680A (en) * | 2020-12-07 | 2021-04-02 | 李朝阳 | Building outer wall spraying method based on intelligent spraying robot |
CN112642619A (en) * | 2019-10-10 | 2021-04-13 | 中国科学院重庆绿色智能技术研究院 | Intelligent spraying robot system and spraying method thereof |
- 2021-05-16: Application CN202110531117.2A filed in China (CN), published as CN113192489A, status Pending
Non-Patent Citations (1)
Title |
---|
范勇 (Fan Yong): "Research on fault diagnosis of automaton drive mechanisms based on improved EMD and SOM neural network" * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109664300B (en) | Robot multi-style calligraphy copying method based on force sense learning | |
CN110543878B (en) | Pointer instrument reading identification method based on neural network | |
CN111192237B (en) | Deep learning-based glue spreading detection system and method | |
CN109858406B (en) | Key frame extraction method based on joint point information | |
CN105931218A (en) | Intelligent sorting method of modular mechanical arm | |
Liu et al. | Recognition methods for coal and coal gangue based on deep learning | |
CN101441776B (en) | Demonstration-driven three-dimensional human body motion editing method based on acceleration sensors | |
CN113538486B (en) | Method for improving identification and positioning accuracy of automobile sheet metal workpiece | |
CN106514667A (en) | Human-computer cooperation system based on Kinect skeletal tracking and uncalibrated visual servo | |
CN111261183A (en) | Method and device for denoising voice | |
CN106406518A (en) | Gesture control device and gesture recognition method | |
CN105389570A (en) | Face angle determination method and system | |
CN112507859B (en) | Visual tracking method for mobile robot | |
CN114851209B (en) | Industrial robot working path planning optimization method and system based on vision | |
CN113192489A (en) | Paint spraying robot voice recognition method based on multi-scale enhancement BiLSTM model | |
CN109087646B (en) | Method for introducing artificial intelligence ultra-deep learning for speech and image recognition | |
CN111681649B (en) | Speech recognition method, interaction system, and performance management system comprising the system | |
CN103530857B (en) | Image denoising method based on multi-scale Kalman filtering | |
TW202001871A (en) | Voice actuated industrial machine control system | |
CN112965487A (en) | Mobile robot trajectory tracking control method based on policy iteration | |
Luo et al. | Robot artist performs cartoon style facial portrait painting | |
CN110472691A (en) | Target locating module training method, device, robot and storage medium | |
CN114715363A (en) | Navigation method and system for submarine stratum space drilling robot and electronic equipment | |
Binder et al. | Utilizing an enterprise architecture framework for model-based industrial systems engineering | |
CN112507940A (en) | Skeleton action recognition method based on a difference-guided representation learning network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 2021-07-30 ||