CN107527617A - Monitoring method, apparatus and system based on voice recognition - Google Patents
- Publication number
- CN107527617A (application number CN201710944193.XA)
- Authority
- CN
- China
- Prior art keywords
- sound
- feature
- voice
- signal
- monitoring
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
Abstract
The invention provides a monitoring method, apparatus and system based on voice recognition. The method comprises the following steps: S1: collect several specific sounds in advance and perform sound model training to obtain trained sound models; S2: collect on-site sound and perform feature extraction corresponding to the several specific sounds on the collected sound; S3: match and classify the extracted features against the sound models to obtain a classification result for the on-site sound; S4: judge from the classification result whether an alarm is needed. The invention compensates for the shortcomings of traditional video surveillance: sound in cooperation with video enables better real-time monitoring of complex environments, improves the efficiency of crime prevention and crackdown, and ensures that the monitoring system responds to dangerous events proactively and promptly.
Description
Technical field
The present invention relates to signal processing, speech recognition and pattern recognition technology, and more particularly to a monitoring method, apparatus and system based on voice recognition.
Background technology
Traditional video surveillance is widely used in public places and is relatively effective: it has prevented some illegal and criminal activity. However, video monitoring has two shortcomings. First, through the negligence of monitoring personnel, dangerous events captured on the monitored picture can be missed. Second, because the video picture is two-dimensional, the picture is easily blocked by obstructions. Although the surveillance video of the scene can be collected after a case occurs and helps investigation and evidence collection, missing the optimal rescue window can lead to the deterioration of the case. Traditional video monitoring systems therefore struggle to discover violent incidents or terrorist attacks in a timely and effective manner.

Secondly, sound monitoring cannot classify sounds by amplitude or other features alone; classification must combine the actual conditions of the monitored scene with the different features of the sounds, so that sound monitoring can truly be applied in daily life.

Designing a novel intelligent monitoring system that breaks through the obstacles of traditional monitoring is therefore urgent. Adding three types of sound monitoring as an aid on the basis of video monitoring can greatly improve monitoring efficiency and reduce the occurrence of tragedies, which is of great significance to real life.
The content of the invention
The object of the present invention is to provide a monitoring method, apparatus and system based on voice recognition, to solve the problems that existing video monitoring is single in function and relatively low in monitoring efficiency.
To achieve the above object, the invention provides a monitoring method based on voice recognition, comprising the following steps:
S1: collect several specific sounds in advance and perform sound model training to obtain trained sound models;
S2: collect on-site sound and perform feature extraction corresponding to the several specific sounds on the collected sound;
S3: match and classify the extracted features against the sound models to obtain a classification result for the on-site sound;
S4: judge from the classification result whether an alarm is needed.
Preferably, the specific sounds include non-speech abnormal sounds, speech carrying emotion, and speech containing sensitive words. Correspondingly, the features extracted in step S2 are, respectively: non-speech sound features for abnormal sound monitoring; crowd speech emotion features for crowd mood monitoring; and the features needed for speech-to-text conversion for monitoring crowd speech containing sensitive vocabulary.
Preferably, when extracting non-speech sound features, an abnormal sound feature extraction method based on D-ESMD is used, which specifically comprises the following steps:
1. determine the number K of t-distributed random noises;
2. collect the on-site sound signal s, and add a t-distributed random noise to the sound signal s to obtain a noisy signal S_i, where i is the index of the noisy signal;
3. decompose the noisy signal S_i using ESMD with the symmetric-midpoint interpolation method to obtain a modal component;
4. calculate the permutation entropy H of the modal component, and determine a threshold through field testing;
5. if the permutation entropy H is greater than the threshold, the modal component is a useful signal modal component and the method proceeds to step 6; otherwise the modal component is noise;
6. take the residual as the input signal and repeat steps 3-5 until the modal component of order n obtained by the decomposition is noise, where n is a positive integer;
7. if i < K, set i = i + 1 and repeat steps 2-6 until i = K; obtain all modal components and take their ensemble mean as the final modal components of the decomposed signal;
8. calculate the energy ratio of each order of modal component relative to the original sound signal s, combine the ratios into a feature vector and normalize it, giving the feature vector of the original signal.
Preferably, when extracting crowd speech emotion features, a feature extraction method based on speech emotion recognition is used; specifically, the feature vector is expressed using the feature set used in the international speech emotion challenge.
Preferably, when extracting the features needed for speech-to-text conversion, a speech feature extraction method based on Gammatone filters is used, specifically comprising the following steps:
1. denote the collected on-site sound signal as x(n) and pre-emphasize it; with pre-emphasis coefficient α, the pre-emphasized signal is y(n) = x(n) - α·x(n-1), where n is the sample index of the on-site sound signal;
2. frame the pre-emphasized signal y(n), with a frame length of N sampling points, where N is a positive integer power of 2;
3. apply a Hamming window to the pre-emphasized signal y(n); the windowed signal is S(n) = y(n)·w(n), where w(n) is the Hamming window;
4. perform a fast Fourier transform on the windowed signal S(n) to obtain the frequency-domain signal X(k) = fft(S(n), N);
5. take the squared modulus of X(k) to obtain the energy spectrum, then filter it with the Gammatone filter bank, obtaining the signal H(k) = fft(h(n), N);
6. log-compress the output of each Gammatone filter;
7. apply a discrete cosine transform to the log-compressed signal to obtain the GFLCC (Gammatone Frequency Log Cepstrum Coefficient) features;
8. apply raised half-sine cepstral liftering to the features obtained by the discrete cosine transform to obtain the final features.
Preferably, the non-speech abnormal sounds include one or more of gunshots, explosions, impacts and screams in the monitored scene; the speech carrying emotion includes speech carrying one of the emotions happy, normal, calm, lively, angry and furious; the sensitive-word speech includes speech in which dangerous vocabulary such as 'help', 'murder' or 'attack' appears.
Preferably, when the classification result is a non-speech abnormal sound, step S4 judges that the corresponding on-site event is one or more of a shooting, a collision, an explosion or some other dangerous event, and issues an alarm;
when the classification result is speech carrying emotion, step S4 issues an alarm when angry or furious features appear in the corresponding crowd emotion;
when the classification result is sensitive-word speech, step S4 issues an alarm according to the recognized sensitive word.
Preferably, step S1 specifically includes: learning the feature values extracted from the several specific sounds using the fuzzy least squares support vector machine algorithm, and establishing the sound models and their classes. Step S3 then further includes matching and classifying the features of the sound signal collected on site against the corresponding sound models. In step S4, the output judged from the classification result is either a result requiring an alarm or a result not requiring an alarm.
The present invention also provides a monitoring apparatus based on voice recognition, comprising:
a sound pickup, for collecting sound signals;
a model training module, for collecting several specific sounds in advance and performing sound model training to obtain trained sound models;
a feature extraction module, for performing feature extraction corresponding to the several specific sounds on the sound signal collected on site;
a matching and classification module, for matching and classifying the features extracted by the feature extraction module against the sound models to obtain a classification result for the on-site sound;
an alarm module, for judging from the classification result whether an alarm is needed.
The present invention also provides a monitoring system based on voice recognition, comprising one or more monitoring apparatuses based on voice recognition as described above.
The invention has the following beneficial effects: it effectively compensates for the shortcomings of traditional video surveillance, since sound in cooperation with video enables better real-time monitoring of complex environments. The technical scheme, in cooperation with video monitoring, improves to a certain extent the efficiency of crime prevention and crackdown, and ensures that the monitoring system responds to dangerous events proactively and promptly.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the monitoring method based on voice recognition of a preferred embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the monitoring apparatus based on voice recognition of a preferred embodiment of the present invention;
Fig. 3 is an architecture diagram of the sound feature extraction module of a preferred embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the model building module of a preferred embodiment of the present invention;
Fig. 5 is an architecture diagram of the model building module of a preferred embodiment of the present invention;
Fig. 6 is an architecture diagram of the matching and classification module of a preferred embodiment of the present invention.
Embodiment
The technical schemes in the embodiments of the present invention are described and discussed clearly and completely below with reference to the accompanying drawings. Obviously, what is described here is only a part of the examples of the present invention, not all of them; based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.
To facilitate understanding of the embodiments of the present invention, specific embodiments are further explained below with reference to the accompanying drawings, and none of the embodiments limits the present invention.
As shown in Fig. 1, the monitoring method based on voice recognition provided by this embodiment comprises the following steps:
S1: collect several specific sounds in advance and perform sound model training to obtain trained sound models;
S2: collect on-site sound and perform feature extraction corresponding to the several specific sounds on the collected sound;
S3: match and classify the extracted features against the sound models to obtain a classification result for the on-site sound;
S4: judge from the classification result whether an alarm is needed.
The specific sounds include non-speech abnormal sounds, speech carrying emotion, and speech containing sensitive words. Correspondingly, the features extracted in step S2 are, respectively: non-speech sound features for abnormal sound monitoring; crowd speech emotion features for crowd mood monitoring; and the features needed for speech-to-text conversion for monitoring crowd speech containing sensitive vocabulary.
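As a rough illustration only, the S1-S4 flow can be sketched as a monitoring loop. Everything below is hypothetical scaffolding: the feature extractor and classifier are stubs standing in for the D-ESMD, emotion and Gammatone front ends and the fuzzy least squares SVM detailed later in this description.

```python
# Hypothetical sketch of the S1-S4 monitoring loop described above.
# Names, features and model values are illustrative placeholders.

def extract_features(sound, kind):
    # kind is one of "abnormal", "emotion", "sensitive"; a real system
    # would dispatch to the three extractors of Fig. 3.
    return [float(len(sound)), float(sum(sound))]

def classify(features, models):
    # S3: match features against trained models; here a nearest-mean stub.
    return min(models, key=lambda name: abs(models[name][0] - features[0]))

def monitor_step(sound, models):
    # S2-S4 for one captured sound clip: three independent channels.
    results = {}
    for kind in ("abnormal", "emotion", "sensitive"):
        feats = extract_features(sound, kind)
        results[kind] = classify(feats, models)
    # S4: alarm if any of the three independent results is dangerous.
    alarm = any(label.startswith("danger") for label in results.values())
    return results, alarm

models = {"danger_gunshot": [8.0], "safe_footsteps": [3.0]}  # toy "S1" output
results, alarm = monitor_step([1, 1, 1, 1, 1, 1, 1, 1], models)
```

The point of the stub is the shape of the loop: the three channels classify independently, and the alarm decision of S4 only aggregates their results.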
The method of the invention is further described below with reference to Figs. 2-6:
With reference to Figs. 2-5, the sound monitoring method of this embodiment mainly includes the following flow: sound collection, sound feature extraction, model building, matching and classification of models against on-site sound features, and alarm. Monitoring covers three types of sound, namely: abnormal sound monitoring, crowd speech emotion monitoring, and monitoring by converting crowd speech to text.
Abnormal sound monitoring monitors sounds that should not occur in the monitored scene, such as gunshots, explosions, impacts and screams; the corresponding extracted features are non-speech sound features.

Crowd speech emotion monitoring monitors the emotion in crowd speech in the monitored scene. Emotions include those possessed by humans such as happy, normal, calm, lively, angry and furious; an alarm is issued for dangerous emotions such as anger and fury. The corresponding extracted features are crowd speech emotion features.

Crowd speech-to-text monitoring converts crowd speech in the monitored scene into text and then monitors the text. If dangerous vocabulary such as 'help', 'murder' or 'attack' appears, the corresponding extracted features are the sensitive-word features needed for speech-to-text conversion, and the monitoring system raises an alarm.
Therefore, the non-speech abnormal sounds include one or more of gunshots, explosions, impacts and screams in the monitored scene; the speech carrying emotion includes speech with one of the emotions happy, normal, calm, lively, angry and furious; the sensitive-word speech includes speech in which dangerous vocabulary such as 'help', 'murder' or 'attack' appears. In step S1, when collecting sounds as training sounds, the monitored sounds must be produced artificially. For example, when collecting abnormal sounds as training sounds, gunshots, explosions and the like must be produced artificially, and the sounds occurring under otherwise safe conditions must also be produced, recorded, feature-extracted and used for model training. For speech emotion monitoring, speech carrying each kind of emotion must be produced at the site, recorded, feature-extracted and used for model training. For speech-to-text monitoring, speech carrying alarm vocabulary (such as 'help' and 'attack') must be produced at the site, and speech without alarm vocabulary, entered according to the characteristics of the corresponding site, must also be recorded, feature-extracted and used for model training.
Fig. 3 is the architecture diagram of the sound feature extraction module of the present invention.
Based on the three monitoring types, when the present invention builds models, the extraction of sound features is divided into three classes:
for abnormal sound monitoring: abnormal sound feature extraction based on D-ESMD;
for crowd mood monitoring: feature extraction based on speech emotion recognition;
for speech-to-text monitoring: speech feature extraction based on Gammatone filters.
Fig. 4 is the schematic structural diagram of the model building module of the present invention:
First, according to the conditions of the monitored scene, sounds are manually selected as training sounds;
the sound pickup collects the training sounds and transmits the sound signals to the feature extraction module;
the feature extraction module performs feature extraction on the training sounds and transfers the feature values to the training module;
the training module trains on the feature values using the fuzzy least squares support vector machine algorithm and outputs the three types of trained models for the matching and classification module to call.
Specifically, in step S1, when extracting non-speech sound features, the abnormal sound feature extraction method based on D-ESMD is used, which specifically comprises the following steps:
1. determine the number K of t-distributed random noises;
2. collect the on-site sound signal s, and add a t-distributed random noise to the sound signal s to obtain a noisy signal S_i, where i is the index of the noisy signal, taking any of the values 1, 2, 3, ...;
3. decompose the noisy signal S_i using ESMD with the symmetric-midpoint interpolation method to obtain a modal component;
4. calculate the permutation entropy H of the modal component, and determine a threshold through field testing;
5. if the permutation entropy H is greater than the threshold, the modal component is a useful signal modal component and the method proceeds to step 6; otherwise the modal component is noise;
6. take the residual as the input signal and repeat steps 3-5 until the modal component of order n obtained by the decomposition is noise, where n is a positive integer;
7. if i < K, set i = i + 1 and repeat steps 2-6 until i = K; obtain all modal components and take their ensemble mean as the final modal components of the decomposed signal;
8. calculate the energy ratio of each order of modal component relative to the original sound signal s, combine the ratios into a feature vector and normalize it, giving the feature vector of the original signal.
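The eight steps above can be sketched as follows. ESMD itself is not reimplemented here: `decompose` is a stand-in (a crude moving-average split) used only so the ensemble structure is visible — K t-distributed noise additions, a permutation entropy test per mode, the ensemble mean, and the normalized energy-ratio feature vector. Thresholds and noise scale are illustrative.

```python
import numpy as np

def decompose(x):
    # Stand-in for the ESMD decomposition (illustration only): split the
    # signal into a residual "mode" and a smooth trend via moving average.
    trend = np.convolve(x, np.ones(5) / 5.0, mode="same")
    return np.stack([x - trend, trend])  # modes 1 and 2

def permutation_entropy(x, m=3):
    # Step 4: ordinal-pattern (permutation) entropy, natural logarithm.
    counts = {}
    for i in range(len(x) - m + 1):
        key = tuple(np.argsort(x[i:i + m]))
        counts[key] = counts.get(key, 0) + 1
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    return float(-(p * np.log(p)).sum())

def desmd_features(s, K=10, h_threshold=0.1, seed=0):
    # Steps 1-7: add K t-distributed noises, decompose each noisy copy,
    # and average the modal components over the ensemble.
    rng = np.random.default_rng(seed)
    modes = [decompose(s + 0.01 * rng.standard_t(df=5, size=len(s)))
             for _ in range(K)]
    mean_modes = np.mean(modes, axis=0)
    # Step 5: keep modes whose permutation entropy clears the threshold.
    useful = [m for m in mean_modes if permutation_entropy(m) > h_threshold]
    if not useful:
        useful = list(mean_modes)
    # Step 8: per-mode energy ratios, combined and normalized.
    ratios = np.array([np.sum(m ** 2) / np.sum(s ** 2) for m in useful])
    return ratios / np.linalg.norm(ratios)

t = np.linspace(0.0, 1.0, 200)
features = desmd_features(np.sin(2 * np.pi * 5 * t))
```

Swapping `decompose` for a real ESMD implementation leaves the outer ensemble loop unchanged, which is the point of the sketch.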
The principle of adding t-distributed noises to the original sound signal for noise reduction is as follows:

Let the collected sound signal be X(t), the real sound signal x(t), and the noise N(t), so that X(t) = x(t) + N(t);

decomposing X(t) yields modal components M(t) and a decomposition remainder r(t);

each modal component M(t) contains a real signal component m(t) and a noise component c(t), i.e. M(t) = m(t) + c(t);

adding the k-th of the random noises to the original signal gives X_k(t) = X(t) + N_k(t), whose decomposition yields modal components M_k(t) = m(t) + c_k(t);

accumulating the above over k and averaging gives (1/K)·Σ_{k=1..K} M_k(t) = m(t) + (1/K)·Σ_{k=1..K} c_k(t);

as K → ∞, the averaged noise term (1/K)·Σ_{k=1..K} c_k(t) → 0.

These expressions show that adding t-distributed noises and decomposing with ESMD reduces the influence of noise.
The ESMD with symmetric-midpoint interpolation includes:
finding all maximum points x_max and minimum points x_min of the noisy signal S_i;
connecting every pair of adjacent maximum and minimum points and taking their midpoints x_mean = (x_max + x_min)/2;
taking the symmetric midpoints x_m of adjacent midpoints and interpolating through x_m.
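Numerically, the midpoint construction might look as follows. The choice of linear interpolation through the midpoints is an assumption, since the patent does not fix an interpolation scheme here; the code only illustrates the extremum-pair midpoints and the mean curve they define.

```python
import numpy as np

def extrema_midpoints(x):
    # Find interior maxima/minima of the sampled signal and the midpoints
    # of each adjacent extremum pair, as in the sifting step above.
    idx = [i for i in range(1, len(x) - 1)
           if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0]  # slope sign change
    mids = [((idx[j] + idx[j + 1]) / 2.0, (x[idx[j]] + x[idx[j + 1]]) / 2.0)
            for j in range(len(idx) - 1)]
    return idx, mids

def midpoint_curve(x):
    # Linearly interpolate through the midpoints to get the "mean curve"
    # that the decomposition subtracts from the signal.
    _, mids = extrema_midpoints(x)
    t = np.arange(len(x), dtype=float)
    mt = np.array([m[0] for m in mids])
    mv = np.array([m[1] for m in mids])
    return np.interp(t, mt, mv)

x = np.sin(2 * np.pi * np.linspace(0.0, 3.0, 300))
curve = midpoint_curve(x)  # near zero for a symmetric sine
```

For a pure sine the max/min midpoints sit near zero, so the mean curve is close to zero and the first extracted mode is close to the signal itself.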
The calculation of the permutation entropy includes:
performing delay reconstruction on the modal component M to obtain the sequences Y(i) = [M(i), M(i+τ), ..., M(i+(m-1)·τ)], where τ is the time delay and m is the reconstruction dimension;
sorting the m elements of every reconstructed component Y(i) in ascending order, collecting the ordinal patterns of all reconstructed components, and computing the probabilities p_1, p_2, ..., p_i with which each ordinal pattern occurs; the permutation entropy is then H = -Σ_i p_i·ln p_i.
The mode energy is calculated as E_j = Σ_t M_j(t)², and the energy ratio of step 8 is E_j divided by the energy Σ_t s(t)² of the original signal.
In step S1, when extracting crowd speech emotion features, the feature extraction method based on speech emotion recognition is used. Specifically, the feature vector is expressed using the feature set used in the international speech emotion challenge:

With reference to Table 1 below, the feature set used in this embodiment includes 16 low-level descriptors (LLDs), and the statistics of 12 classes of functionals are applied to the 16 low-level descriptors to express the feature vector of crowd speech emotion.

Table 1: the feature set used in the international speech emotion challenge
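By way of illustration, a functional-based utterance vector of this kind can be built as below. The two LLD contours and five statistics here are stand-ins, not the 16 LLDs and 12 functionals of the challenge feature set in Table 1; the point is that applying fixed functionals to frame-level contours yields one fixed-length vector per utterance, regardless of duration.

```python
import numpy as np

def functionals(contour):
    # Apply a set of statistics ("functionals") to one frame-level
    # low-level descriptor contour. The challenge set uses 12 functionals
    # per LLD; five representative ones are shown here.
    c = np.asarray(contour, dtype=float)
    return [c.mean(), c.std(), c.min(), c.max(), float(np.ptp(c))]

def emotion_feature_vector(llds):
    # llds: dict mapping LLD name -> per-frame contour. Concatenating the
    # functionals over all LLDs gives one fixed-length utterance vector.
    vec = []
    for name in sorted(llds):  # fixed order so the vector layout is stable
        vec.extend(functionals(llds[name]))
    return np.array(vec)

frames = {"energy": [0.1, 0.4, 0.9, 0.3], "zcr": [0.2, 0.25, 0.22, 0.31]}
v = emotion_feature_vector(frames)  # length = 2 LLDs x 5 functionals = 10
```

With the full challenge set (16 LLDs, 12 functionals) the same construction yields a 384-dimensional vector per utterance.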
In step S1, when extracting the features needed for speech-to-text conversion, the speech feature extraction method based on Gammatone filters is used, specifically comprising the following steps:
1. denote the collected on-site sound signal as x(n) and pre-emphasize it; with pre-emphasis coefficient α, the pre-emphasized signal is y(n) = x(n) - α·x(n-1), where n is the sample index of the on-site sound signal, taking any of the values 1, 2, 3, ...;
2. frame the pre-emphasized signal y(n), with a frame length of N sampling points; here N is 256, and in other preferred embodiments N can be set to any positive integer power of 2;
3. apply a Hamming window to the pre-emphasized signal y(n); the windowed signal is S(n) = y(n)·w(n), where the Hamming window w(n) is w(n) = 0.54 - 0.46·cos(2πn/(N-1)), 0 ≤ n ≤ N-1;
4. perform a fast Fourier transform on the windowed signal S(n) to obtain the frequency-domain signal X(k) = fft(S(n), N);
5. take the squared modulus of X(k) to obtain the energy spectrum, then filter it with the Gammatone filter bank, whose frequency response is obtained as H(k) = fft(h(n), N); the impulse response of a Gammatone filter is g(t) = t^(n-1)·exp(-2πBt)·cos(2πf_i·t), t ≥ 0, where f_i is the centre frequency and B = 1.019·(24.7 + 0.108·f_i);
6. log-compress the output of each Gammatone filter: s(p) = ln(Σ_k |X(k)|²·|H_p(k)|²), p = 1, ..., P, where P is the number of filters;
7. apply a discrete cosine transform to the log-compressed signal to obtain the GFLCC (Gammatone Frequency Log Cepstrum Coefficient) features, C(m) = Σ_{p=1..P} s(p)·cos(πm(p - 0.5)/P), m = 1, ..., M, where M is the dimension of the GFLCC features;
8. apply raised half-sine cepstral liftering to the features obtained by the discrete cosine transform, C'(i) = C(i)·ω(i), to obtain the final features.
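A compact sketch of steps 1-8 for a single frame follows. The filter count, centre-frequency spacing, α = 0.97 and the lifter length are illustrative assumptions, not values fixed by the patent; the Gammatone bank is realized by transforming the time-domain impulse response, as step 5 describes.

```python
import numpy as np

def gflcc(x, fs=8000, N=256, P=8, M=6, alpha=0.97, lifter=22):
    # Steps 1-3: pre-emphasis, one frame of N samples, Hamming window.
    y = np.append(x[0], x[1:] - alpha * x[:-1])[:N]
    w = 0.54 - 0.46 * np.cos(2 * np.pi * np.arange(N) / (N - 1))
    S = y * w
    # Steps 4-5: FFT energy spectrum and Gammatone responses H_p(k),
    # obtained as the FFT of the time-domain impulse response h_p(n).
    E = np.abs(np.fft.rfft(S, N)) ** 2
    t = np.arange(N) / fs
    centers = np.linspace(100.0, 0.9 * fs / 2, P)  # assumed spacing
    out = np.empty(P)
    for p, fc in enumerate(centers):
        B = 1.019 * (24.7 + 0.108 * fc)
        h = t ** 3 * np.exp(-2 * np.pi * B * t) * np.cos(2 * np.pi * fc * t)
        H = np.abs(np.fft.rfft(h, N)) ** 2
        # Step 6: log-compress each filter's accumulated energy.
        out[p] = np.log(np.dot(E, H) + 1e-12)
    # Step 7: DCT of the log filter energies -> GFLCC.
    m = np.arange(1, M + 1)[:, None]
    C = np.cos(np.pi * m * (np.arange(P) + 0.5) / P) @ out
    # Step 8: raised half-sine cepstral liftering.
    C *= 1.0 + (lifter / 2.0) * np.sin(np.pi * np.arange(1, M + 1) / lifter)
    return C

x = np.sin(2 * np.pi * 440 * np.arange(256) / 8000)  # one 440 Hz frame
feats = gflcc(x)
```

A production front end would process overlapping frames and stack the per-frame GFLCC vectors; only the per-frame computation is shown here.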
Referring to Fig. 5, when performing model training, both the sounds that require an alarm and the normal sounds of the actual monitored area must pass through the feature extraction module and then enter the training module to be trained, producing the trained models.

In actual monitoring, the sound of the monitored site is monitored in real time. To improve the classification of sound types, the fuzzy least squares support vector machine algorithm is employed; this algorithm attributes each sample uniquely to some class. The sounds likely to occur in the scene, or the sounds that need to be monitored, must therefore be produced artificially and collected during model training. For abnormal sound monitoring in the scene, the sounds likely to occur in the monitored site must be collected, such as footsteps, applause, car engines, gunshots, explosions, impacts and screams; among these, the sounds requiring an alarm are those that should not occur, such as gunshots, explosions, impacts and screams. For monitoring crowd emotion in the site, emotionally coloured sounds must be collected, such as speech carrying happy, sad, calm, lively, angry or furious emotion; an alarm is issued for the sounds carrying dangerous emotion. For monitoring the vocabulary in crowd speech in the site, the sounds of vocabulary likely to occur in the site must be collected, such as words about eating, shopping, studying and playing as well as 'help', 'murder' and 'attack'; among these, the sounds requiring an alarm are those of dangerous vocabulary such as 'help', 'murder' and 'attack'. Both the sounds that trigger an alarm and those that do not must be produced artificially. The feature extraction module performs feature extraction on the sounds and transfers the feature values to the training module; the training module trains on the feature values using the fuzzy least squares support vector machine algorithm and outputs the trained models.
Referring to Fig. 6, this embodiment monitors three types of sound, so the feature extraction of sound is of three types. Three types of sound model are established on the basis of the three types of sound feature, namely: the abnormal sound model, the speech emotion model, and the speech-text model. The three kinds of sound features are matched and classified against the three kinds of sound models; the classification algorithm is the fuzzy least squares support vector machine algorithm; three kinds of classification results are output after matching and classification.
When the classification result of step S3 is a non-speech abnormal sound, step S4 judges that the corresponding on-site event is one or more of a shooting, a collision, an explosion or some other dangerous event, and issues an alarm. When the classification result is speech carrying emotion, step S4 issues an alarm when angry or furious features appear in the corresponding crowd emotion. When the classification result is sensitive-word speech, step S4 issues an alarm according to the recognized sensitive word.
In concrete application, when the alarm module obtains the classification results, the three kinds of results are independent of each other. For example, if abnormal sound detection finds a dangerous sound, result one raises an alarm, while results two and three may detect no danger and raise none. Monitoring personnel can act according to the alarm type issued by the alarm device. If an abnormal sound alarm occurs, the event is relatively serious, such as a shooting or an explosion, and monitoring personnel can call the police and an ambulance. Speech emotion alarms are mostly crowd disputes, so monitoring personnel can send colleagues to mediate in advance, or selectively call the police and an ambulance. A speech-to-text alarm indicates events such as an assault or a cry for help, and monitoring personnel can call the police and an ambulance.
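The independence of the three alarm channels can be expressed as a small rule table. The channel names, labels and dangerous sets below are illustrative encodings, not taken verbatim from the patent.

```python
# Hypothetical encoding of the three independent alarm channels: each
# channel classifies on its own, and any dangerous result triggers an alarm.
DANGEROUS = {
    "abnormal": {"gunshot", "explosion", "impact", "scream"},
    "emotion": {"angry", "furious"},
    "keyword": {"help", "murder", "attack"},
}

def decide_alarms(results):
    # results: dict channel -> detected label (or None if nothing detected).
    alarms = {ch: (label in DANGEROUS[ch]) if label else False
              for ch, label in results.items()}
    return alarms, any(alarms.values())

alarms, alarm_now = decide_alarms(
    {"abnormal": "footsteps", "emotion": "angry", "keyword": None})
# Only the emotion channel fires; a safe result on another channel
# does not suppress it.
```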
After step S1 extracts the features, the method further comprises: learning the feature values extracted from the several specific sounds using the fuzzy least squares support vector machine algorithm, and establishing the sound models and their classes. Step S3 then further comprises matching and classifying the features of the sound signal collected on site against the corresponding sound models, likewise using the fuzzy least squares support vector machine to match the models against the features of the on-site sound. In step S4, the output judged from the classification result is either a result requiring an alarm or a result not requiring an alarm.
In the preferred embodiment, since the sound models of this embodiment are built in three classes, the alarms of this embodiment are set as follows:

The first class is the model of non-speech sound features, i.e. monitoring abnormal sounds in the scene. Models are built for the sounds likely to occur in the monitored site, such as footsteps, applause, car engines, gunshots, explosions, impacts and screams; the sounds requiring an alarm are those that should not occur, such as gunshots, explosions, impacts and screams.

The second class is the model of crowd speech emotion features, i.e. monitoring the emotion in crowd speech in the scene. Emotions include those possessed by humans such as happy, normal, calm, lively, angry and furious; an alarm is issued for dangerous emotions such as anger and fury.

The third class is the model of the features needed for crowd speech-to-text conversion, i.e. monitoring the vocabulary in crowd speech in the scene. Models are built from the sound features of vocabulary likely to occur in the monitored site, such as words about eating, shopping, studying and playing as well as 'help', 'murder' and 'attack'; the vocabulary requiring an alarm is that which should not occur, such as 'help', 'murder' and 'attack'.
Here the fuzzy least squares vector machine used has done further improvement on the basis of traditional SVMs, makes
Obtain each sample and be attributed to some classification;
A fuzzy membership s_i is introduced, and the optimisation problem becomes:

min (1/2)‖w‖² + (C/2) Σ_i s_i ξ_i²
s.t. y_i(w·x_i + b) = 1 − ξ_i,

where x_i is an m-dimensional input vector, y_i is the sample class, i is the sample index, w is the normal vector of the hyperplane w·x_i + b = 0, b is the hyperplane bias, C is the penalty parameter, and the slack factor ξ_i represents the distance from x_i to the hyperplane w·x_i + b = 0.
For the i-th class of samples against the j-th class of samples, the optimal decision surface function is:

D_ij(x) = w_ij^T x + b_ij,

the pairwise fuzzy membership function is defined as:

m_ij(x) = min(1, D_ij(x)),

the fuzzy membership function of the i-th class of samples is:

m_i(x) = min_{j≠i} m_ij(x),

and the sample data x is assigned to the class:

arg max_i m_i(x).
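As an illustrative sketch only, and not the patent's formulation, one common way to assign the fuzzy membership s_i weights each training sample by its distance from the class centre, so that outliers contribute less to the penalty term; the function name, the linear decay rule and the small offset `delta` are all assumptions:

```python
import math

def fuzzy_memberships(samples, delta=1e-3):
    # Distance-based fuzzy membership (a common choice in fuzzy SVMs):
    # s_i = 1 - d_i / (r + delta), where d_i is the distance of sample i
    # from the class centre and r is the class radius. Outliers receive
    # small s_i, so their slack is penalised less in the weighted term.
    n = len(samples)
    dim = len(samples[0])
    centre = [sum(x[d] for x in samples) / n for d in range(dim)]
    dists = [math.dist(x, centre) for x in samples]
    r = max(dists)
    return [1.0 - d / (r + delta) for d in dists]
```

With this weighting, a sound sample corrupted by heavy noise pulls the decision surface less strongly than a clean one.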
The present embodiment also provides a monitoring apparatus based on voice recognition. With reference to Figure 2, the apparatus includes:
A sound pickup for collecting sound signals;
A model training module for collecting several specific sounds in advance and performing sound-model training to obtain trained sound models; its hardware comprises a microprocessor, integrated circuits, programmable gate circuits and the like, and it can build models for the three classes of sound features;
A feature extraction module for performing, on the sound signals collected on site, feature extraction corresponding to the several specific sounds; its hardware comprises a microprocessor, integrated circuits, programmable gate circuits and the like, and it can extract the three classes of sound features on demand;
A matching classification module that matches and classifies the features extracted by the feature extraction module against the sound models to obtain classification results for the live sound; its hardware comprises a microprocessor, integrated circuits, programmable gate circuits and the like, and it can perform matching classification with a one-to-one correspondence between model types and features;
An alarm module that judges from the classification results whether an alarm is needed and, if so, raises an alarm prompt. The alarm prompts are, respectively, abnormal-sound alarms, speech-emotion alarms and dangerous-vocabulary alarms. For abnormal sounds, no alarm is raised if sounds consistent with the scene, such as footsteps, applause or car engine noise, are detected, while an alarm is raised if dangerous sounds such as gunshots, explosions, impacts or screams are detected. For speech emotion, no alarm is raised if normal, safe emotions such as happiness or liveliness are detected, while an alarm is raised if dangerous emotions such as annoyance or anger are detected. For speech-to-text, no alarm is raised if normal vocabulary such as eating, shopping, studying or playing is detected, while an alarm is raised if dangerous vocabulary such as cries for help, murder or assault is detected.
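The three alarm decisions above can be sketched as a simple table lookup; the English labels, category names and function name are assumptions made for illustration, not the patent's implementation:

```python
# Danger sets per alarm category, mirroring the three alarm types above.
DANGEROUS = {
    "abnormal_sound": {"gunshot", "explosion", "impact", "scream"},
    "speech_emotion": {"annoyance", "anger"},
    "sensitive_word": {"help", "murder", "assault"},
}

def should_alarm(category, label):
    # Alarm only when the classified label is in its category's danger set;
    # scene-consistent sounds, safe emotions and normal vocabulary pass.
    return label in DANGEROUS.get(category, set())
```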
In addition, the present embodiment also provides a monitoring system based on voice recognition, comprising one or more monitoring apparatuses based on voice recognition as described above, or a monitoring apparatus with multiple sound pickups that obtains multiple signals, whose other modules process the sound signals separately.
The foregoing is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement made by a person skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be defined by the scope of the claims.
Claims (10)
1. A monitoring method based on voice recognition, characterised by comprising the following steps:
S1: collecting several specific sounds in advance and performing sound-model training to obtain trained sound models;
S2: collecting on-site sound and performing, on the collected sound, feature extraction corresponding to the several specific sounds;
S3: matching and classifying the extracted features against the sound models to obtain classification results for the live sound;
S4: judging from the classification results whether an alarm is needed.
2. The monitoring method based on voice recognition according to claim 1, characterised in that the specific sounds include non-speech abnormal sounds, speech with emotion and sensitive-word speech; correspondingly, when features are extracted in step S2, the extracted features are respectively: non-speech sound features for abnormal-sound monitoring; crowd speech-emotion features for crowd-mood monitoring; and the features required for speech-to-text, extracted for monitoring crowd speech carrying sensitive vocabulary.
3. The monitoring method based on voice recognition according to claim 2, characterised in that, when extracting non-speech sound features, the abnormal-sound feature extraction method based on D-ESMD is used, specifically comprising the following steps:
(1) setting the number K of t-distribution random noises;
(2) collecting the on-site sound signal s and adding a t-distribution random noise to the sound signal s to obtain a noisy signal S_i, where i is the index of the noisy signal;
(3) decomposing the noisy signal S_i by ESMD with the symmetric-point interpolation method to obtain a modal component;
(4) calculating the permutation entropy H of the modal component and comparing it with a field-tested threshold;
(5) if the permutation entropy H is greater than the threshold, the modal component is a useful-signal modal component and the method proceeds to step (6); otherwise the modal component is noise;
(6) taking the residual signal as the input signal and repeating (3)-(5) until the n modal components obtained by decomposition are all noise, where n is a positive integer;
(7) if i < K, setting i = i + 1 and repeating (2)-(6) until i = K, obtaining all modal components and computing their overall mean, the overall mean serving as the final modal components of the decomposed signal;
(8) calculating the energy ratio of each modal component relative to the original sound signal s, combining the ratios into a feature vector and normalising it to obtain the feature vector of the original signal.
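The permutation entropy H used in step (4) above can be sketched as follows; this is an illustrative stand-in rather than the patent's implementation, and the normalisation by log(order!) and the parameter names are assumptions:

```python
import math

def permutation_entropy(x, order=3, delay=1):
    # Count ordinal patterns of length `order` in the sequence, then take
    # the Shannon entropy of the pattern distribution, normalised by
    # log(order!) so the result lies in [0, 1]. Noise-like modal
    # components score near 1; structured components score lower.
    counts = {}
    n = len(x) - (order - 1) * delay
    for i in range(n):
        window = tuple(x[i + j * delay] for j in range(order))
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] = counts.get(pattern, 0) + 1
    total = sum(counts.values())
    h = -sum(c / total * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(order))
```

A monotone ramp produces a single ordinal pattern and hence zero entropy, while irregular sequences score higher, which is what the threshold test in step (4) exploits.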
4. The monitoring method based on voice recognition according to claim 2, characterised in that, when extracting crowd speech-emotion features, the feature extraction method based on speech-emotion recognition is used, specifically: the feature set used in the international speech-emotion challenge is used to express the feature vectors.
5. The monitoring method based on voice recognition according to claim 2, characterised in that, when extracting the features required for speech-to-text, the speech feature extraction method based on Gammatone is used, specifically comprising the following steps:
(1) the sound signal collected on site is x(n); pre-emphasis is applied with pre-emphasis coefficient α, giving the pre-emphasised sound signal y(n) = x(n) − α·x(n−1), where n is the sample index of the sound signal collected on site;
(2) the pre-emphasised sound signal y(n) is divided into frames, the frame length being N sampling points, where N is a positive-integer power of 2;
(3) a Hamming window is applied to the pre-emphasised sound signal y(n); the windowed sound signal S(n) is expressed as S(n) = y(n)·w(n), where w(n) is the Hamming window;
(4) a fast Fourier transform is applied to the windowed sound signal S(n) to obtain the frequency-domain signal X(k) = fft(S(n), N);
(5) the modulus of the frequency-domain signal X(k) is squared to obtain the energy spectrum, which is then filtered with a Gammatone filter bank, giving the signal H(k) = fft(h(n), N);
(6) the output of each Gammatone filter is log-compressed;
(7) a discrete cosine transform is applied to the log-compressed signals to obtain the GFLCC;
(8) the features obtained by the discrete cosine transform are lifted with a raised half-sine cepstral lifter to obtain the final features.
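Steps (1)-(5) up to the energy spectrum can be sketched as below; a plain DFT stands in for the FFT of step (4), and the tiny frame length N=8, the coefficient α=0.97 and the function name are illustrative assumptions:

```python
import cmath
import math

def frame_energy_spectrum(x, alpha=0.97, N=8):
    # (1) pre-emphasis: y(n) = x(n) - alpha * x(n-1)
    y = [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]
    # (2) take one frame of N samples (N a power of 2)
    frame = y[:N]
    # (3) Hamming window: w(n) = 0.54 - 0.46*cos(2*pi*n / (N-1))
    w = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
    s = [frame[n] * w[n] for n in range(N)]
    # (4) N-point DFT, a plain stand-in for fft(S(n), N)
    X = [sum(s[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
         for k in range(N)]
    # (5) energy spectrum |X(k)|^2, ready for the Gammatone filter bank
    return [abs(Xk) ** 2 for Xk in X]
```

For a real input frame the energy spectrum is symmetric, so in practice only the first N/2+1 bins would be fed to the filter bank.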
6. The monitoring method based on voice recognition according to claim 2, characterised in that the non-speech abnormal sounds include one or more of gunshots, explosions, impacts and screams in the monitored scene; the speech with emotion includes speech carrying one of the emotions happy, normal, calm, lively, annoyed and angry; and the sensitive-word speech includes speech containing one or more of the dangerous vocabulary items that may occur: cries for help, murder, assault.
7. The monitoring method based on voice recognition according to claim 6, characterised in that, when the classification result is a non-speech abnormal sound, step S4 judges the corresponding live event to be one or more of a shooting event, a collision event, an explosion event or some other dangerous event, and raises an alarm prompt;
when the classification result is speech with emotion, step S4 raises an alarm prompt when annoyed or angry features appear in the corresponding crowd emotion;
when the classification result is sensitive-word speech, step S4 raises an alarm prompt according to the recognised sensitive word.
8. The monitoring method based on voice recognition according to claim 1 or 2, characterised in that step S1 specifically includes: learning the feature values extracted from the several specific sounds using the fuzzy least-squares support vector machine algorithm, and establishing the sound models and classes;
step S3 then further comprises matching and classifying the features of the sound signals collected on site against the sound models in one-to-one correspondence;
and in step S4 the output is judged from the classification results as either a result that requires an alarm or a result that does not.
9. A monitoring apparatus based on voice recognition, characterised by comprising:
a sound pickup for collecting sound signals;
a model training module for collecting several specific sounds in advance and performing sound-model training to obtain trained sound models;
a feature extraction module for performing, on the sound signals collected on site, feature extraction corresponding to the several specific sounds;
a matching classification module for matching and classifying the features extracted by the feature extraction module against the sound models to obtain classification results for the live sound;
and an alarm module for judging from the classification results whether an alarm is needed.
10. A monitoring system based on voice recognition, characterised by comprising one or more monitoring apparatuses based on voice recognition according to claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710944193.XA CN107527617A (en) | 2017-09-30 | 2017-09-30 | Monitoring method, apparatus and system based on voice recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710944193.XA CN107527617A (en) | 2017-09-30 | 2017-09-30 | Monitoring method, apparatus and system based on voice recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107527617A true CN107527617A (en) | 2017-12-29 |
Family
ID=60684025
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710944193.XA Pending CN107527617A (en) | 2017-09-30 | 2017-09-30 | Monitoring method, apparatus and system based on voice recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107527617A (en) |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108769369A (en) * | 2018-04-23 | 2018-11-06 | 维沃移动通信有限公司 | A kind of method for early warning and mobile terminal |
CN108922548A (en) * | 2018-08-20 | 2018-11-30 | 深圳园林股份有限公司 | A kind of bird based on deep learning, frog intelligent monitoring method |
CN108986430A (en) * | 2018-09-13 | 2018-12-11 | 苏州工业职业技术学院 | Net based on speech recognition about vehicle safe early warning method and system |
CN109065069A (en) * | 2018-10-10 | 2018-12-21 | 广州市百果园信息技术有限公司 | A kind of audio-frequency detection, device, equipment and storage medium |
CN109298642A (en) * | 2018-09-20 | 2019-02-01 | 三星电子(中国)研发中心 | The method and device being monitored using intelligent sound box |
CN109410535A (en) * | 2018-11-22 | 2019-03-01 | 维沃移动通信有限公司 | A kind for the treatment of method and apparatus of scene information |
CN109447789A (en) * | 2018-11-01 | 2019-03-08 | 北京得意音通技术有限责任公司 | Method for processing business, device, electronic equipment and storage medium |
CN109493579A (en) * | 2018-12-28 | 2019-03-19 | 赵俊瑞 | A kind of public emergency automatic alarm and monitoring system and method |
CN109616140A (en) * | 2018-12-12 | 2019-04-12 | 浩云科技股份有限公司 | A kind of abnormal sound analysis system |
CN109640112A (en) * | 2019-01-15 | 2019-04-16 | 广州虎牙信息科技有限公司 | Method for processing video frequency, device, equipment and storage medium |
CN109754819A (en) * | 2018-12-29 | 2019-05-14 | 努比亚技术有限公司 | A kind of data processing method, device and storage medium |
CN109948739A (en) * | 2019-04-22 | 2019-06-28 | 桂林电子科技大学 | Ambient sound event acquisition and Transmission system based on support vector machines |
CN110033785A (en) * | 2019-03-27 | 2019-07-19 | 深圳市中电数通智慧安全科技股份有限公司 | A kind of calling for help recognition methods, device, readable storage medium storing program for executing and terminal device |
CN110310646A (en) * | 2019-05-22 | 2019-10-08 | 深圳壹账通智能科技有限公司 | Intelligent alarm method, apparatus, equipment and storage medium |
CN110415152A (en) * | 2019-07-29 | 2019-11-05 | 哈尔滨工业大学 | A kind of safety monitoring system |
CN110531345A (en) * | 2019-08-08 | 2019-12-03 | 中国人民武装警察部队士官学校 | To cheating interference method, system and the terminal device of shot positioning device |
CN110867959A (en) * | 2019-11-13 | 2020-03-06 | 上海迈内能源科技有限公司 | Intelligent monitoring system and monitoring method for electric power equipment based on voice recognition |
CN111223486A (en) * | 2019-12-30 | 2020-06-02 | 上海联影医疗科技有限公司 | Alarm device and method |
CN111599379A (en) * | 2020-05-09 | 2020-08-28 | 北京南师信息技术有限公司 | Conflict early warning method, device, equipment, readable storage medium and triage system |
CN111627430A (en) * | 2020-06-19 | 2020-09-04 | 北京世纪之星应用技术研究中心 | Multi-frequency domain fuzzy recognition alarm method and device for solid sound detection |
CN111951560A (en) * | 2020-08-30 | 2020-11-17 | 北京嘀嘀无限科技发展有限公司 | Service anomaly detection method, method for training service anomaly detection model and method for training acoustic model |
CN112101089A (en) * | 2020-07-27 | 2020-12-18 | 北京建筑大学 | Signal noise reduction method and device, electronic equipment and storage medium |
CN112349296A (en) * | 2020-11-10 | 2021-02-09 | 胡添杰 | Subway platform safety monitoring method based on voice recognition |
CN112493605A (en) * | 2020-11-18 | 2021-03-16 | 西安理工大学 | Intelligent fire fighting helmet for planning path |
CN112534500A (en) * | 2018-07-26 | 2021-03-19 | Med-El电气医疗器械有限公司 | Neural network audio scene classifier for hearing implants |
CN113660142A (en) * | 2021-08-19 | 2021-11-16 | 京东科技信息技术有限公司 | Fault monitoring method and device and related equipment |
CN113761267A (en) * | 2021-08-23 | 2021-12-07 | 珠海格力电器股份有限公司 | Prompt message generation method and device |
CN115132188A (en) * | 2022-09-02 | 2022-09-30 | 珠海翔翼航空技术有限公司 | Early warning method and device based on voice recognition, terminal equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007017853A1 (en) * | 2005-08-08 | 2007-02-15 | Nice Systems Ltd. | Apparatus and methods for the detection of emotions in audio interactions |
CN103730109A (en) * | 2014-01-14 | 2014-04-16 | 重庆大学 | Method for extracting characteristics of abnormal noise in public places |
CN103971702A (en) * | 2013-08-01 | 2014-08-06 | 哈尔滨理工大学 | Sound monitoring method, device and system |
CN106228979A (en) * | 2016-08-16 | 2016-12-14 | 重庆大学 | A kind of abnormal sound in public places feature extraction and recognition methods |
CN106328120A (en) * | 2016-08-17 | 2017-01-11 | 重庆大学 | Public place abnormal sound characteristic extraction method |
CN107086036A (en) * | 2017-04-19 | 2017-08-22 | 杭州派尼澳电子科技有限公司 | A kind of freeway tunnel method for safety monitoring |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007017853A1 (en) * | 2005-08-08 | 2007-02-15 | Nice Systems Ltd. | Apparatus and methods for the detection of emotions in audio interactions |
CN103971702A (en) * | 2013-08-01 | 2014-08-06 | 哈尔滨理工大学 | Sound monitoring method, device and system |
CN103730109A (en) * | 2014-01-14 | 2014-04-16 | 重庆大学 | Method for extracting characteristics of abnormal noise in public places |
CN106228979A (en) * | 2016-08-16 | 2016-12-14 | 重庆大学 | A kind of abnormal sound in public places feature extraction and recognition methods |
CN106328120A (en) * | 2016-08-17 | 2017-01-11 | 重庆大学 | Public place abnormal sound characteristic extraction method |
CN107086036A (en) * | 2017-04-19 | 2017-08-22 | 杭州派尼澳电子科技有限公司 | A kind of freeway tunnel method for safety monitoring |
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108769369A (en) * | 2018-04-23 | 2018-11-06 | 维沃移动通信有限公司 | A kind of method for early warning and mobile terminal |
CN112534500A (en) * | 2018-07-26 | 2021-03-19 | Med-El电气医疗器械有限公司 | Neural network audio scene classifier for hearing implants |
CN108922548A (en) * | 2018-08-20 | 2018-11-30 | 深圳园林股份有限公司 | A kind of bird based on deep learning, frog intelligent monitoring method |
CN108986430A (en) * | 2018-09-13 | 2018-12-11 | 苏州工业职业技术学院 | Net based on speech recognition about vehicle safe early warning method and system |
CN109298642A (en) * | 2018-09-20 | 2019-02-01 | 三星电子(中国)研发中心 | The method and device being monitored using intelligent sound box |
CN109298642B (en) * | 2018-09-20 | 2021-08-27 | 三星电子(中国)研发中心 | Method and device for monitoring by adopting intelligent sound box |
CN109065069A (en) * | 2018-10-10 | 2018-12-21 | 广州市百果园信息技术有限公司 | A kind of audio-frequency detection, device, equipment and storage medium |
US11948595B2 (en) | 2018-10-10 | 2024-04-02 | Bigo Technology Pte. Ltd. | Method for detecting audio, device, and storage medium |
CN109065069B (en) * | 2018-10-10 | 2020-09-04 | 广州市百果园信息技术有限公司 | Audio detection method, device, equipment and storage medium |
CN109447789A (en) * | 2018-11-01 | 2019-03-08 | 北京得意音通技术有限责任公司 | Method for processing business, device, electronic equipment and storage medium |
CN109410535A (en) * | 2018-11-22 | 2019-03-01 | 维沃移动通信有限公司 | A kind for the treatment of method and apparatus of scene information |
CN109616140A (en) * | 2018-12-12 | 2019-04-12 | 浩云科技股份有限公司 | A kind of abnormal sound analysis system |
CN109616140B (en) * | 2018-12-12 | 2022-08-30 | 浩云科技股份有限公司 | Abnormal sound analysis system |
CN109493579A (en) * | 2018-12-28 | 2019-03-19 | 赵俊瑞 | A kind of public emergency automatic alarm and monitoring system and method |
CN109754819A (en) * | 2018-12-29 | 2019-05-14 | 努比亚技术有限公司 | A kind of data processing method, device and storage medium |
CN109640112A (en) * | 2019-01-15 | 2019-04-16 | 广州虎牙信息科技有限公司 | Method for processing video frequency, device, equipment and storage medium |
CN110033785A (en) * | 2019-03-27 | 2019-07-19 | 深圳市中电数通智慧安全科技股份有限公司 | A kind of calling for help recognition methods, device, readable storage medium storing program for executing and terminal device |
CN109948739A (en) * | 2019-04-22 | 2019-06-28 | 桂林电子科技大学 | Ambient sound event acquisition and Transmission system based on support vector machines |
CN110310646A (en) * | 2019-05-22 | 2019-10-08 | 深圳壹账通智能科技有限公司 | Intelligent alarm method, apparatus, equipment and storage medium |
CN110415152A (en) * | 2019-07-29 | 2019-11-05 | 哈尔滨工业大学 | A kind of safety monitoring system |
CN110531345B (en) * | 2019-08-08 | 2021-08-13 | 中国人民武装警察部队士官学校 | Deception jamming method and system for gunshot positioning device and terminal equipment |
CN110531345A (en) * | 2019-08-08 | 2019-12-03 | 中国人民武装警察部队士官学校 | To cheating interference method, system and the terminal device of shot positioning device |
CN110867959A (en) * | 2019-11-13 | 2020-03-06 | 上海迈内能源科技有限公司 | Intelligent monitoring system and monitoring method for electric power equipment based on voice recognition |
CN111223486A (en) * | 2019-12-30 | 2020-06-02 | 上海联影医疗科技有限公司 | Alarm device and method |
CN111223486B (en) * | 2019-12-30 | 2023-02-24 | 上海联影医疗科技股份有限公司 | Alarm device and method |
CN111599379B (en) * | 2020-05-09 | 2023-09-29 | 北京南师信息技术有限公司 | Conflict early warning method, device, equipment, readable storage medium and triage system |
CN111599379A (en) * | 2020-05-09 | 2020-08-28 | 北京南师信息技术有限公司 | Conflict early warning method, device, equipment, readable storage medium and triage system |
CN111627430A (en) * | 2020-06-19 | 2020-09-04 | 北京世纪之星应用技术研究中心 | Multi-frequency domain fuzzy recognition alarm method and device for solid sound detection |
CN112101089B (en) * | 2020-07-27 | 2023-10-10 | 北京建筑大学 | Signal noise reduction method and device, electronic equipment and storage medium |
CN112101089A (en) * | 2020-07-27 | 2020-12-18 | 北京建筑大学 | Signal noise reduction method and device, electronic equipment and storage medium |
CN111951560B (en) * | 2020-08-30 | 2022-02-08 | 北京嘀嘀无限科技发展有限公司 | Service anomaly detection method, method for training service anomaly detection model and method for training acoustic model |
CN111951560A (en) * | 2020-08-30 | 2020-11-17 | 北京嘀嘀无限科技发展有限公司 | Service anomaly detection method, method for training service anomaly detection model and method for training acoustic model |
CN112349296A (en) * | 2020-11-10 | 2021-02-09 | 胡添杰 | Subway platform safety monitoring method based on voice recognition |
CN112493605A (en) * | 2020-11-18 | 2021-03-16 | 西安理工大学 | Intelligent fire fighting helmet for planning path |
CN113660142A (en) * | 2021-08-19 | 2021-11-16 | 京东科技信息技术有限公司 | Fault monitoring method and device and related equipment |
CN113761267A (en) * | 2021-08-23 | 2021-12-07 | 珠海格力电器股份有限公司 | Prompt message generation method and device |
CN115132188A (en) * | 2022-09-02 | 2022-09-30 | 珠海翔翼航空技术有限公司 | Early warning method and device based on voice recognition, terminal equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107527617A (en) | Monitoring method, apparatus and system based on voice recognition | |
CN109616140B (en) | Abnormal sound analysis system | |
CN109473120A (en) | A kind of abnormal sound signal recognition method based on convolutional neural networks | |
CN106874833A (en) | A kind of mode identification method of vibration event | |
CN102148032A (en) | Abnormal sound detection method and system for ATM (Automatic Teller Machine) | |
WO2009046359A2 (en) | Detection and classification of running vehicles based on acoustic signatures | |
CN112735473B (en) | Method and system for identifying unmanned aerial vehicle based on voice | |
CN102163427A (en) | Method for detecting audio exceptional event based on environmental model | |
CN105095624A (en) | Method for identifying optical fibre sensing vibration signal | |
CN113566948A (en) | Fault audio recognition and diagnosis method for robot coal pulverizer | |
CN111613240B (en) | Camouflage voice detection method based on attention mechanism and Bi-LSTM | |
CN116778964A (en) | Power transformation equipment fault monitoring system and method based on voiceprint recognition | |
CN112349296A (en) | Subway platform safety monitoring method based on voice recognition | |
CN114352486A (en) | Wind turbine generator blade audio fault detection method based on classification | |
Li et al. | Research on environmental sound classification algorithm based on multi-feature fusion | |
Sigmund et al. | Efficient feature set developed for acoustic gunshot detection in open space | |
CN109389994A (en) | Identification of sound source method and device for intelligent transportation system | |
CN109087666A (en) | The identification device and method that prison is fought | |
Spadini et al. | Sound event recognition in a smart city surveillance context | |
Estrebou et al. | Voice recognition based on probabilistic SOM | |
CN114155878B (en) | Artificial intelligence detection system, method and computer program | |
CN111048089A (en) | Method for improving voice awakening success rate of intelligent wearable device, electronic device and computer readable storage medium | |
CN109243486A (en) | A kind of winged acoustic detection method of cracking down upon evil forces based on machine learning | |
CN113247730B (en) | Elevator passenger screaming detection method and system based on multi-dimensional features | |
CN112960506B (en) | Elevator warning sound detection system based on audio frequency characteristics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20171229