CN114792517A - Voice recognition method and device for intelligent water cup - Google Patents
Voice recognition method and device for intelligent water cup
- Publication number
- CN114792517A (application number CN202210322946.4A)
- Authority
- CN
- China
- Prior art keywords
- voice
- water cup
- information
- intelligent water
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The invention discloses a voice recognition method for an intelligent water cup, and relates to the technical field of voice recognition. The method comprises the following steps: firstly, reading the voice information; secondly, processing the voice information with an adaptive high-pass filtering method; thirdly, framing: dividing the voice signal into N short voice segments, with adjacent frames partially overlapping; fourthly, windowing: applying a Hamming window to the framed data; fifthly, Fourier transform: performing a Fourier transform on the data, taking the modulus and then the square of the resulting matrix to obtain the energy spectral density, and summing the energy spectral densities to obtain the energy sum matrix of each frame; sixthly, triangular band-pass filtering; seventhly, discrete cosine transform: substituting the logarithmic energy into a discrete cosine transform to obtain the L-order Mel-scale cepstrum parameters; finally, the voice feature vectors are trained through an SVM algorithm and then used for recognition. The technical scheme of the invention solves the problem of low recognition accuracy of existing intelligent water cups during voice interaction, and can be used for voice signal recognition in intelligent voice equipment.
Description
Technical Field
The invention discloses a voice recognition method and device for an intelligent water cup, and particularly relates to the technical field of voice recognition.
Background
Water is the most abundant substance in the human body, accounting for about 60%-70% of adult body weight; most of the blood is water, and organs such as the muscles, liver, lungs and brain also contain large amounts of water. Water is not only an important nutrient for maintaining human health, but also participates in the chemical reactions, substance conversion and energy exchange of various substances in the body. Without water, nutrients cannot be absorbed, oxygen cannot be transported, waste cannot be discharged, and metabolism becomes impossible. Replenishing water in time is therefore very important for the human body.
Various cups exist on the market at present, not only traditional drinking cups but also many intelligent cups incorporating technological elements. For example, Chinese patent CN110742469A discloses an intelligent voice water cup: an intelligent voice-interaction cup that can monitor the stored liquid, the liquid temperature, the position of the cup and so on in real time, and remind the user to drink water in time. However, the accuracy of existing voice interaction is generally not high enough, and false triggering can occur, causing the cup to heat while empty, which poses certain safety hazards.
The existing intelligent water cup is not intelligent enough and not convenient enough to use. Voice interaction control generally has insufficient recognition accuracy, and in an intelligent water cup a recognition error may lead to empty (dry) heating, creating a safety problem.
Disclosure of Invention
The invention aims to provide a voice recognition method for an intelligent water cup, and solves the problem that the existing intelligent water cup has low recognition accuracy during voice interaction.
In order to achieve the above purpose, one technical solution of the present invention is as follows: a voice recognition method for an intelligent water cup comprises the following steps:
s1, calculating MFCC feature vectors of a section of voice information, wherein the calculation method of the MFCC feature vectors comprises the following steps:
firstly, reading voice information: inputting voice information in wav format;
secondly, preprocessing data: an adaptive high-pass filtering method is provided for processing the voice information:
y(n)=x(n-1)+10·log10(x(n)-x(n-1)+1)
wherein x(n) is the input signal, y(n) is the output signal, n is the time index, x(n) is the amplitude of the sound waveform at the current time, and x(n-1) is the amplitude of the waveform at the previous time;
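A minimal NumPy sketch of this filter (two assumptions of the sketch: the argument of the logarithm, x(n) − x(n−1) + 1, is taken to stay positive, and y(0) is set to x(0) since x(−1) does not exist):

```python
import numpy as np

def adaptive_highpass(x):
    """y(n) = x(n-1) + 10*log10(x(n) - x(n-1) + 1), per the text."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    y[0] = x[0]                       # no x(-1); sketch keeps x(0) as-is
    diff = x[1:] - x[:-1] + 1.0       # assumed positive so the log is defined
    y[1:] = x[:-1] + 10.0 * np.log10(diff)
    return y

print(adaptive_highpass([0.0, 0.0, 1.0, 1.0]))
```

A flat signal passes through unchanged (log10(1) = 0), while an upward jump adds a logarithmic boost, which is how the method emphasizes the high-frequency content.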
thirdly, framing: dividing the voice signal into N short voice segments, with partial overlap between adjacent frames during segmentation;
fourthly, windowing: windowing the framed data using a Hamming window, whose formula is:
W(n)=(1-a)-a·cos(2·π·n/N)
wherein a = 0.46, n is the sample index within the frame, and N is the frame length;
fourier transform: performing fast Fourier transform on the windowed data, performing modulus taking and then square taking on the obtained matrix to obtain energy spectrum density, and adding the energy spectrum density of each frame to obtain an energy sum matrix of each frame;
sixthly, triangular band-pass filtering: using a Mel-frequency filter bank, where the Mel frequency is expressed as:
f_mel = 2595·log10(1+f_Hz/700)
wherein f_Hz is the frequency in Hz, F(m) is the frequency value corresponding to the center of the m-th triangular band-pass filter, and f_mel is the Mel frequency;
the frequency response function of the triangular band-pass filter is:
H_m(k) = 0 for k < F(m-1); H_m(k) = (k-F(m-1))/(F(m)-F(m-1)) for F(m-1) ≤ k ≤ F(m); H_m(k) = (F(m+1)-k)/(F(m+1)-F(m)) for F(m) < k ≤ F(m+1); H_m(k) = 0 for k > F(m+1);
seventhly, discrete cosine transform: substituting the logarithmic energy into the discrete cosine transform to obtain the L-order Mel-scale cepstrum parameters, finally generating the voice feature vector of the voice information;
and S2, training the voice feature vectors through an SVM algorithm, and then recognizing.
Further, the training process of step S2 includes the following steps:
s2.1, preparing two sections of different voice information, wherein the two sections of voice information comprise instruction information and interference information, respectively obtaining an instruction information characteristic vector and an interference information characteristic vector by calculating the instruction information and the interference information, and sending the instruction information characteristic vector and the interference information characteristic vector into an SVM classifier for training;
s2.2, training satisfies the following classification function:
min_{w,b,ζ} (1/2)·||w||² + C·Σ_{i=1..l} ζ_i
wherein y_i(w·x_i+b) ≥ 1-ζ_i, ζ_i ≥ 0, i = 1, 2, ..., l; w is the hyperplane parameter, b is the hyperplane bias, ζ_i are the slack variables, and C is the penalty coefficient.
According to another technical scheme, the intelligent water cup voice recognition method is applied to an intelligent water cup.
Compared with the prior art, the beneficial effects of this scheme are as follows:
1. the scheme adopts an adaptive data preprocessing method to accurately extract high-frequency voice information. In general, speech differs according to the speaking habits of each speaker, and the articulation of words is often not the same, so accurately extracting the high-frequency components of the voice signal plays a key role in the correct recognition of subsequent speech. The scheme provides an adaptive high-pass filtering method that can accurately extract the high-frequency segments of the voice, facilitating subsequent voice recognition;
2. the scheme achieves more accurate detection and can improve the recognition accuracy of a specific instruction in a specific scene, for example, the heating command. The voice interaction content of an intelligent water cup is limited, and generally not many words need to be recognized; for this characteristic, a voice recognition method is designed that accurately recognizes the heating command while other voice signals are not recognized or acted upon. This idea greatly simplifies the recognition task and improves the recognition accuracy;
3. the scheme uses an SVM to classify the voice features, making the calculation more accurate. The SVM is fundamentally a binary classification method that separates feature vectors with a hyperplane, and this scene is exactly a binary classification scene;
4. the scheme is convenient for users, providing an especially convenient way to use this intelligent voice cup.
Drawings
FIG. 1 is a schematic view showing the structure of a smart cup according to embodiment 2;
fig. 2 is a flowchart of the speech recognition method for the intelligent cup in embodiment 2.
Detailed Description
The present invention will be described in further detail below by way of specific embodiments:
reference numerals in the drawings of the specification include: cup body 1, control chip 2, lithium cell 3, wireless charging coil 4.
Example 1
A voice recognition method for an intelligent water cup comprises the following steps:
s1, calculating MFCC feature vectors of a section of voice information, wherein the calculation method of the MFCC feature vectors comprises the following steps:
reading voice information: inputting the voice information in the wav format, and converting the voice information into the wav format if the voice information is not in the wav format.
Preprocessing data: in order to highlight the high frequency part of the data and weaken the low frequency part of the data, an adaptive high-pass filtering method is provided for processing the voice information:
y(n)=x(n-1)+10·log10(x(n)-x(n-1)+1)
where x(n) is the input signal, y(n) is the output signal, n is the time index, x(n) is the amplitude of the sound waveform at the current time, and x(n-1) is the amplitude of the waveform at the previous time.
Frame division: for an original speech signal, a direct Fourier transform would yield the frequency information of the whole signal while losing the time-domain information. To avoid this, a framing method is adopted: a section of the voice signal is divided into N short segments, and, to preserve continuity between frames, adjacent frames partially overlap during segmentation. The frame length is typically set to 25 ms and the frame shift to 10 ms.
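The framing step can be sketched as follows (a minimal version assuming a 16 kHz sampling rate, which the patent does not specify; the 25 ms/10 ms values follow the text):

```python
import numpy as np

def frame_signal(x, sr=16000, frame_ms=25, shift_ms=10):
    """Split a 1-D signal into overlapping frames: 25 ms frame length,
    10 ms frame shift, as stated in the text. Trailing samples that do
    not fill a whole frame are dropped in this sketch."""
    x = np.asarray(x)
    frame_len = int(sr * frame_ms / 1000)    # 400 samples at 16 kHz
    shift = int(sr * shift_ms / 1000)        # 160 samples at 16 kHz
    n_frames = 1 + (len(x) - frame_len) // shift
    idx = shift * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    return x[idx]

frames = frame_signal(np.zeros(16000))       # 1 s of audio
print(frames.shape)
```

With a 10 ms shift against a 25 ms length, each frame overlaps its neighbor by 15 ms, which preserves the continuity between frames that the text calls for.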
Fourthly, windowing: windowing is required after the framing of the signal so that continuity between frames increases. Windowing the framed data by using a Hamming window formula:
W(n)=(1-a)-a·cos(2·π·n/N)
where a is 0.46 and n is a small segment of speech data.
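The window above with a = 0.46 is the standard Hamming window; a minimal sketch (note that NumPy's built-in `np.hamming` uses N−1 in the denominator, so its endpoints differ slightly from this formula):

```python
import numpy as np

def hamming_window(N, a=0.46):
    """W(n) = (1 - a) - a*cos(2*pi*n/N) for n = 0..N-1, per the text."""
    n = np.arange(N)
    return (1 - a) - a * np.cos(2 * np.pi * n / N)

win = hamming_window(400)      # one 25 ms frame at 16 kHz
print(win[0], win[200])        # edge vs. centre of the window
```

The window tapers each frame toward its edges (0.08 at the ends, 1.0 at the centre), which reduces the discontinuities introduced by cutting the signal into frames.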
Fifthly, Fourier transform: a fast Fourier transform is performed on the windowed data; taking the modulus and then the square of the resulting matrix gives the energy spectral density, and the energy spectral densities are summed to obtain the energy sum matrix of each frame.
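The FFT and energy steps, sketched for a 2-D array of windowed frames (the 512-point FFT size is an assumed value, not fixed by the text):

```python
import numpy as np

def power_spectrum(frames, nfft=512):
    """Fast Fourier transform of each frame, modulus then square to get
    the energy spectral density, and the per-frame energy sum."""
    spec = np.fft.rfft(frames, n=nfft)   # one-sided FFT; frames zero-padded
    psd = np.abs(spec) ** 2              # modulus, then square
    frame_energy = psd.sum(axis=1)       # energy sum of each frame
    return psd, frame_energy

psd, energy = power_spectrum(np.ones((2, 400)))
print(psd.shape, energy.shape)
```

The one-sided `rfft` keeps nfft/2 + 1 bins per frame, which is all that is needed for a real-valued signal.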
Sixthly, triangular band-pass filtering: the human ear can still distinguish speech normally in a noisy and chaotic environment, in the process, the cochlea plays a very important role. The cochlea filters the received voice signals on a logarithmic frequency scale, the linear scale is generally below 1000Hz, and both scales above 1000Hz are logarithmic scales, so that the human ear has higher sensitivity to low-frequency signals. Based on this, a Mel frequency filter bank is employed, which has human cochlear perception characteristics. The Mel frequency is expressed as:
f_mel = 2595·log10(1+f_Hz/700)
wherein f_Hz is the frequency in Hz, F(m) is the frequency value corresponding to the center of the m-th triangular band-pass filter, and f_mel is the Mel frequency;
the frequency response function of the triangular band-pass filter is:
H_m(k) = 0 for k < F(m-1); H_m(k) = (k-F(m-1))/(F(m)-F(m-1)) for F(m-1) ≤ k ≤ F(m); H_m(k) = (F(m+1)-k)/(F(m+1)-F(m)) for F(m) < k ≤ F(m+1); H_m(k) = 0 for k > F(m+1).
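The Mel mapping and the filter centers F(m) can be sketched as follows (the 8 kHz upper edge and 26 filters are assumed values, not fixed by the text):

```python
import numpy as np

def hz_to_mel(f_hz):
    """f_mel = 2595*log10(1 + f_Hz/700), per the text."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(f_mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(f_mel, dtype=float) / 2595.0) - 1.0)

def filter_centers(n_filters=26, f_low=0.0, f_high=8000.0):
    """Triangular-filter centers F(m): equally spaced on the Mel scale
    between f_low and f_high; the two boundary points are dropped."""
    mels = np.linspace(hz_to_mel(f_low), hz_to_mel(f_high), n_filters + 2)
    return mel_to_hz(mels)[1:-1]

centers = filter_centers()
print(centers[:3])
```

Because the centers are spaced evenly on the Mel scale, they are dense at low frequencies and sparse at high frequencies, mirroring the cochlear behavior described above.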
Seventhly, discrete cosine transform: the logarithmic energy is substituted into the discrete cosine transform to obtain the L-order Mel-scale cepstrum parameters.
Through the above steps, the voice feature vector of the voice information is finally generated.
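The DCT step can be sketched with SciPy (keeping L = 13 coefficients is a conventional choice; the patent leaves L unspecified):

```python
import numpy as np
from scipy.fftpack import dct

def mfcc_from_log_energies(log_energies, L=13):
    """Type-II DCT along the filter axis; the first L coefficients form
    the L-order Mel-scale cepstrum (MFCC) feature vector per frame."""
    return dct(log_energies, type=2, axis=1, norm="ortho")[:, :L]

# Dummy input: the log of an all-ones filter-bank output is all zeros,
# so the resulting coefficients are all zero too.
feats = mfcc_from_log_energies(np.log(np.ones((98, 26))))
print(feats.shape)
```

The DCT decorrelates the log filter-bank energies, which is why a short vector of leading coefficients suffices as the frame's feature.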
S2, training and optimizing the voice feature vectors through an SVM algorithm, wherein the training process comprises the following steps:
s2.1, preparing two sections of different voice information, wherein the two sections of voice information comprise instruction information and interference information, respectively obtaining an instruction information characteristic vector and an interference information characteristic vector by calculating the instruction information and the interference information, and sending the instruction information characteristic vector and the interference information characteristic vector into an SVM classifier for training;
s2.2, training satisfies the following classification function:
min_{w,b,ζ} (1/2)·||w||² + C·Σ_{i=1..l} ζ_i
wherein y_i(w·x_i+b) ≥ 1-ζ_i, ζ_i ≥ 0, i = 1, 2, ..., l; w is the hyperplane parameter, b is the hyperplane bias, ζ_i are the slack variables, and C is the penalty coefficient.
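A minimal training sketch with scikit-learn's `SVC` (the linear kernel, C = 1.0, and the synthetic clusters are illustrative choices; real inputs would be the instruction/interference MFCC vectors):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for 13-dimensional MFCC feature vectors, one well-separated
# cluster per class; real data would come from steps 1-7 above.
instruction_feats = rng.standard_normal((50, 13)) + 3.0   # command class
interference_feats = rng.standard_normal((50, 13)) - 3.0  # any other speech

X = np.vstack([instruction_feats, interference_feats])
y = np.array([1] * 50 + [0] * 50)     # 1 = instruction, 0 = interference

# Soft-margin SVM: C is the penalty coefficient from the classification
# function above.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)
print(clf.predict(instruction_feats[:1]))
```

A larger C penalizes slack more heavily (a harder margin); a smaller C tolerates more misclassified training points.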
Example 2
As shown in fig. 1 and 2, the voice recognition method of embodiment 1 is applied to an intelligent water cup. The intelligent water cup of this embodiment comprises a wireless charging coil 4, a lithium battery 3, a control chip 2 and a water cup body 1. The wireless charging coil 4 is connected with the lithium battery 3 through the control chip 2, which controls the charging of the lithium battery 3 and its discharge heating. A miniature microphone is integrated on the control chip 2 to collect voice control instructions, and a PTC heating sheet is integrated in the cup body 1 to heat the inner wall of the cup and thus the drinking water.
The scene applied to the intelligent water cup mainly judges whether the voice information contains "heating", a typical binary classification scene. The SVM algorithm adopted by this scheme is very widely used for classification, particularly binary classification, and is well suited to the binary judgment of voice signals in this scene. SVM training finds an optimal hyperplane that separates the two kinds of voice data. During SVM training, the instruction information is "heating", and the interference information is any other content, such as "warming".
The working process of the embodiment:
1. the user says "heating" toward the cup;
2. at the moment, the miniature microphone on the intelligent water cup acquires voice information;
3. extracting MFCC characteristic vectors from the collected voice information by using the method of the embodiment;
4. classifying the MFCC feature vectors by using a pre-trained SVM model, and outputting classification information;
5. when the "heating" voice information is detected, the PTC heating sheet of the water cup is energized to heat.
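The working process above can be sketched as glue code; every name here (`capture_audio`, `extract_mfcc`, `svm_model`, `heater_on`) is a hypothetical stand-in for the components of embodiments 1 and 2, not API defined by the patent:

```python
def recognition_step(capture_audio, extract_mfcc, svm_model, heater_on):
    """One pass of the cup's control loop, steps 2-5 of the text."""
    samples = capture_audio()               # 2. microphone acquires voice
    feats = extract_mfcc(samples)           # 3. MFCC feature vector
    label = svm_model.predict([feats])[0]   # 4. pre-trained SVM classifies
    if label == 1:                          # 5. "heating" detected:
        heater_on()                         #    energize the PTC sheet
    return label

# Dummy wiring to show the control flow only:
class AlwaysHeating:
    def predict(self, X):
        return [1]

events = []
label = recognition_step(lambda: b"\x00" * 800,
                         lambda s: [0.0] * 13,
                         AlwaysHeating(),
                         lambda: events.append("heater on"))
print(label, events)
```

Because the classifier only distinguishes "heating" from everything else, any non-command speech falls into the interference class and leaves the heater untouched.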
Heating is performed with a PTC heating sheet rather than with a material such as nichrome wire. Because a PTC heating sheet cannot exceed its maximum heating temperature, it provides good temperature protection, reducing safety hazards while still satisfying the heating requirement.
The above are merely examples of the present invention; common general knowledge of known specific structures and/or characteristics of the schemes is not described here in detail. It should be noted that, for those skilled in the art, many variations and modifications can be made without departing from the structure of the present invention, and these should also be regarded as within the scope of protection of the present invention without affecting the effect of implementing the invention or the applicability of the patent. The scope of protection of this application shall be defined by the claims, with the description and embodiments serving to interpret their content.
Claims (3)
1. A voice recognition method for an intelligent water cup, characterized by comprising the following steps:
s1, calculating MFCC feature vectors of a section of voice information, wherein the calculation method of the MFCC feature vectors comprises the following steps:
reading voice information: inputting voice information in wav format;
preprocessing data: an adaptive high-pass filtering method is provided for processing voice information:
y(n)=x(n-1)+10·log10(x(n)-x(n-1)+1)
wherein x(n) is the input signal, y(n) is the output signal, n is the time index, x(n) is the amplitude of the sound waveform at the current time, and x(n-1) is the amplitude of the waveform at the previous time;
thirdly, framing: dividing a voice signal into N small voice signals, and enabling frames to have a part of mutual overlap when segmentation is carried out;
fourthly, windowing: windowing the framed data by using a Hamming window formula:
W(n)=(1-a)-a·cos(2·π·n/N)
wherein a = 0.46, n is the sample index within the frame, and N is the frame length;
fourier transform: performing fast Fourier transform on the windowed data, performing modulus taking and then square taking on the obtained matrix to obtain energy spectrum density, and adding the energy spectrum density of each frame to obtain an energy sum matrix of each frame;
sixthly, triangular band-pass filtering: using a Mel-frequency filter bank, the Mel frequency is expressed as:
f_mel = 2595·log10(1+f_Hz/700)
wherein f_Hz is the frequency in Hz, F(m) is the frequency value corresponding to the center of the m-th triangular band-pass filter, and f_mel is the Mel frequency;
the frequency response function of the triangular band-pass filter is:
H_m(k) = 0 for k < F(m-1); H_m(k) = (k-F(m-1))/(F(m)-F(m-1)) for F(m-1) ≤ k ≤ F(m); H_m(k) = (F(m+1)-k)/(F(m+1)-F(m)) for F(m) < k ≤ F(m+1); H_m(k) = 0 for k > F(m+1);
seventhly, discrete cosine transform: introducing logarithmic energy into discrete cosine transform, solving an L-order Mel-scale Cepstrum parameter, and finally generating a voice feature vector of voice information;
and S2, training the voice feature vectors through an SVM algorithm, and then recognizing.
2. The intelligent water cup voice recognition method according to claim 1, wherein: the training process of step S2 includes the following steps:
s2.1, preparing two sections of different voice information, wherein the two sections of voice information comprise instruction information and interference information, respectively obtaining an instruction information characteristic vector and an interference information characteristic vector by calculating the instruction information and the interference information, and sending the instruction information characteristic vector and the interference information characteristic vector into an SVM classifier for training;
s2.2, training satisfies the following classification function:
min_{w,b,ζ} (1/2)·||w||² + C·Σ_{i=1..l} ζ_i
wherein y_i(w·x_i+b) ≥ 1-ζ_i, ζ_i ≥ 0, i = 1, 2, ..., l; w is the hyperplane parameter, b is the hyperplane bias, ζ_i are the slack variables, and C is the penalty coefficient.
3. Application of the intelligent water cup voice recognition method according to claim 1 or 2 to an intelligent water cup.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210322946.4A CN114792517A (en) | 2022-03-30 | 2022-03-30 | Voice recognition method and device for intelligent water cup |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210322946.4A CN114792517A (en) | 2022-03-30 | 2022-03-30 | Voice recognition method and device for intelligent water cup |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114792517A true CN114792517A (en) | 2022-07-26 |
Family
ID=82462312
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210322946.4A Withdrawn CN114792517A (en) | 2022-03-30 | 2022-03-30 | Voice recognition method and device for intelligent water cup |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114792517A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118016106A (en) * | 2024-04-08 | 2024-05-10 | 山东第一医科大学附属省立医院(山东省立医院) | Elderly emotion health analysis and support system |
- 2022-03-30: CN application CN202210322946.4A filed (publication CN114792517A); status: not active, withdrawn
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11322155B2 (en) | Method and apparatus for establishing voiceprint model, computer device, and storage medium | |
WO2019023877A1 (en) | Specific sound recognition method and device, and storage medium | |
CN106847292A (en) | Method for recognizing sound-groove and device | |
CN108922541A (en) | Multidimensional characteristic parameter method for recognizing sound-groove based on DTW and GMM model | |
CN106571135A (en) | Ear voice feature extraction method and system | |
CN103280220A (en) | Real-time recognition method for baby cry | |
Fook et al. | Comparison of speech parameterization techniques for the classification of speech disfluencies | |
Osmani et al. | Machine learning approach for infant cry interpretation | |
CN101620853A (en) | Speech-emotion recognition method based on improved fuzzy vector quantization | |
CN106264839A (en) | Intelligent snore stopping pillow | |
CN112397074A (en) | Voiceprint recognition method based on MFCC (Mel frequency cepstrum coefficient) and vector element learning | |
CN110942784A (en) | Snore classification system based on support vector machine | |
Bhagatpatil et al. | An automatic infant’s cry detection using linear frequency cepstrum coefficients (LFCC) | |
CN114792517A (en) | Voice recognition method and device for intelligent water cup | |
Eray et al. | An application of speech recognition with support vector machines | |
CN112017658A (en) | Operation control system based on intelligent human-computer interaction | |
CN115346561A (en) | Method and system for estimating and predicting depression mood based on voice characteristics | |
Kamaruddin et al. | Features extraction for speech emotion | |
Vesperini et al. | Snore sounds excitation localization by using scattering transform and deep neural networks | |
CN108564967A (en) | Mel energy vocal print feature extracting methods towards crying detecting system | |
CN111862991A (en) | Method and system for identifying baby crying | |
Pan et al. | The Methods of Realizing Baby Crying Recognition and Intelligent Monitoring Based on DNN-GMM-HMM | |
CN115641839A (en) | Intelligent voice recognition method and system | |
Mini et al. | Feature vector selection of fusion of MFCC and SMRT coefficients for SVM classifier based speech recognition system | |
Hanifa et al. | Comparative analysis on different cepstral features for speaker identification recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20220726 ||