CN107437418A - Vehicle-mounted voice identifies electronic entertainment control system - Google Patents

Publication number
CN107437418A
CN107437418A CN201710632907.3A CN201710632907A CN107437418A CN 107437418 A CN107437418 A CN 107437418A CN 201710632907 A CN201710632907 A CN 201710632907A CN 107437418 A CN107437418 A CN 107437418A
Authority
CN
China
Prior art keywords
natural
sounding
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710632907.3A
Other languages
Chinese (zh)
Inventor
韦玥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yixin Intelligent Technology Co Ltd
Original Assignee
Shenzhen Yixin Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yixin Intelligent Technology Co Ltd filed Critical Shenzhen Yixin Intelligent Technology Co Ltd
Priority to CN201710632907.3A priority Critical patent/CN107437418A/en
Publication of CN107437418A publication Critical patent/CN107437418A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60R VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R16/00 Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for
    • B60R16/02 Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric constitutive elements
    • B60R16/023 Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric constitutive elements for transmission of signals between vehicle parts or subsystems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention provides a vehicle-mounted voice recognition electronic entertainment control system comprising a natural speech input module, a speech processing module, a Bluetooth module, a media player, an air conditioner, an application control program, and a vehicle body control module. The natural speech input module receives voice signals from the occupants. The speech processing module receives the voice signal from the natural speech input module and converts it into an executable control command. The Bluetooth module receives executable control commands and controls Bluetooth devices. The media player receives executable control commands that control its media playback. The air conditioner receives executable control commands and adjusts the temperature, the air volume, the internal/external air circulation mode, and the blowing mode. The invention provides a natural-language speech recognition scheme for in-vehicle electronic entertainment systems, making it easier for users to interact with in-vehicle electronic devices.

Description

Vehicle-mounted voice recognition electronic entertainment control system
Technical field
The present invention relates to the field of intelligent vehicle control, and in particular to a vehicle-mounted voice recognition electronic entertainment system.
Background art
In recent years China's automobile industry has developed rapidly. Many control buttons have been concentrated on the multifunction steering wheel, which is more convenient, but the numerous buttons are scattered, cumbersome to operate, and limited in function, and no major breakthrough has been made in voice systems. Traditional voice systems are mainly limited to voice prompts, voice navigation, and the like. The functional modules that perform these functions are mutually independent, each with its own dedicated speech chip, which largely wastes speech chips and leaves the functional modules without effective coordination by a control module.
Meanwhile, in the prior art most in-vehicle electronic entertainment control systems do not support speech recognition, and those that do can recognize only fixed, command-style utterances. For example, a command such as "turn on radio" is hard-coded in the control system, and the user must say it word for word to trigger the action, which greatly degrades the user experience.
Summary of the invention
In view of the above problems, the present invention aims to provide a vehicle-mounted voice recognition electronic entertainment system.
The purpose of the present invention is achieved by the following technical solution:
A vehicle-mounted voice recognition electronic entertainment control system comprises a natural speech input module, a speech processing module, a Bluetooth module, a media player, an air conditioner, an application control program, and a vehicle body control module. The natural speech input module receives voice signals from the occupants. The speech processing module receives the voice signal from the natural speech input module and converts it into an executable control command. The Bluetooth module receives executable control commands and controls Bluetooth devices. The media player receives executable control commands that control its media playback. The air conditioner receives executable control commands and adjusts the temperature, the air volume, the internal/external air circulation mode, and the blowing mode.
The vehicle-mounted voice recognition electronic entertainment control system further comprises a navigator, which receives executable control commands to set a destination, plan a navigation route, select a route, and change the destination.
The application control program receives executable control commands and runs the corresponding application program.
The vehicle body control module receives executable control commands and controls the in-vehicle facilities.
The beneficial effects of the present invention are as follows. The vehicle-mounted voice recognition electronic entertainment control system compensates for the unfriendliness and poor usability of command-style speech recognition in in-vehicle electronic equipment by providing a natural-language speech recognition scheme for in-vehicle electronic entertainment systems, so that users can interact with in-vehicle electronic devices in everyday spoken language. It also overcomes the defect of other existing schemes that require an internet connection: natural-language speech recognition can be used without a network connection.
Brief description of the drawings
The invention is further described with reference to the accompanying drawings, but the embodiments in the drawings do not limit the invention in any way. A person of ordinary skill in the art can obtain other drawings from the following drawings without creative effort.
Fig. 1 is a block diagram of the vehicle-mounted voice recognition electronic entertainment system of the present invention;
Fig. 2 is a block diagram of the speech processing module of the present invention.
Reference numerals:
Natural speech input module 1, speech processing module 2, Bluetooth module 3, media player 4, air conditioner 5, application control program 6, vehicle body control module 7, navigator 8, natural speech detection unit 20, natural speech enhancement unit 21, feature extraction unit 22, and natural speech recognition unit 23
Detailed description of the embodiments
The invention is further described with reference to the following application scenario.
Referring to Fig. 1, a vehicle-mounted voice recognition electronic entertainment control system comprises a natural speech input module 1, a speech processing module 2, a Bluetooth module 3, a media player 4, an air conditioner 5, an application control program 6, and a vehicle body control module 7. The natural speech input module receives voice signals from the occupants. The speech processing module receives the voice signal from the natural speech input module and converts it into an executable control command. The Bluetooth module receives executable control commands and controls Bluetooth devices. The media player receives executable control commands that control its media playback. The air conditioner receives executable control commands and adjusts the temperature, the air volume, the internal/external air circulation mode, and the blowing mode.
Further, the vehicle-mounted voice recognition electronic entertainment control system also comprises a navigator 8, which receives executable control commands to set a destination, plan a navigation route, select a route, and change the destination.
Further, the application control program receives executable control commands and runs the corresponding application program.
Further, the vehicle body control module receives executable control commands and controls the in-vehicle facilities.
Preferably, the vehicle-mounted voice recognition electronic entertainment control system further comprises a prompting module, which receives executable control commands and issues voice prompts to the occupants.
Preferably, referring to Fig. 2, the speech processing module comprises a natural speech detection unit 20, a natural speech enhancement unit 21, a feature extraction unit 22, and a natural speech recognition unit 23. The natural speech detection unit detects and extracts the effective natural speech information portion of the received voice signal. The natural speech enhancement unit enhances the natural speech information portion to obtain the natural speech information portion to be recognized. The feature extraction unit extracts instruction feature parameters from the natural speech information portion to be recognized. The natural speech recognition unit performs recognition according to the instruction feature parameters to obtain the corresponding control command.
The above embodiment of the present invention provides a natural-language speech recognition scheme for in-vehicle electronic entertainment systems, compensates for the unfriendliness and poor usability of command-style speech recognition in in-vehicle electronic equipment, and makes it easier for users to interact with in-vehicle electronic devices.
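The way executable control commands fan out from the speech processing module to the other modules can be illustrated with a minimal sketch. The text does not specify any command format or interface, so everything here is hypothetical: the `ControlCommand` fields, the `CommandRouter` class, and the module keys `"air_conditioner"` and `"media_player"` are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ControlCommand:
    """An executable control command (hypothetical structure, not from the patent)."""
    target: str      # hypothetical module key, e.g. "air_conditioner"
    action: str      # hypothetical action name, e.g. "set_temperature"
    value: str = ""  # optional argument of the action


class CommandRouter:
    """Routes each command to the handler registered for its target module."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[ControlCommand], str]] = {}

    def register(self, target: str, handler: Callable[[ControlCommand], str]) -> None:
        self._handlers[target] = handler

    def dispatch(self, command: ControlCommand) -> str:
        handler = self._handlers.get(command.target)
        if handler is None:
            return f"no module registered for '{command.target}'"
        return handler(command)


# Two of the modules named in the text, modelled as simple string-returning handlers.
router = CommandRouter()
router.register("air_conditioner", lambda c: f"air conditioner: {c.action} {c.value}".strip())
router.register("media_player", lambda c: f"media player: {c.action} {c.value}".strip())
```

In this sketch a recognized utterance would end in a call such as `router.dispatch(ControlCommand("air_conditioner", "set_temperature", "22"))`; how the recognition unit maps speech to such a command is outside what the sketch covers.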
Preferably, the natural speech detection unit 20 detects and extracts the effective natural speech information in the received voice signal as follows:
(1) divide the received voice signal into frames with 50% overlap between adjacent frames and apply a Hamming window to obtain each frame of the voice signal;
Preferably, the frame length used for framing in this unit is U = 30 ms;
(2) obtain the logarithmic energy feature of each frame of the voice signal, using the function:
$$D(m) = \log_{10}\left(\sum_{n=1}^{U} |r_m(n)|^2 + c\right) - \log_{10} c$$
In the formula, D(m) denotes the logarithmic energy feature of the m-th frame of the voice signal, $\sum_{n=1}^{U} |r_m(n)|^2$ denotes the short-time energy of the m-th frame, $|r_m(n)|^2$ denotes the energy value of the m-th frame at each instant, U denotes the length of the Hamming window, and c denotes the preset logarithmic energy factor;
Preferably, c=105
(3) apply the short-time Fourier transform to each frame of the voice signal to obtain the energy spectrum K(f_n), where f_n denotes a frequency component;
(4) obtain the spectral entropy feature of each frame of the voice signal, using the function:
$$T(m) = -\sum_{n=1}^{N} p_g(n, m) \log p_g(n, m)$$
where
$$p_g(n, m) = \frac{K_m(f_n)}{\sum_{\gamma=1}^{N} K_m(f_\gamma)}, \quad n = 1, 2, \ldots, N$$
In the formula, T(m) denotes the spectral entropy feature of the m-th frame of the voice signal, p_g(n, m) denotes the probability density of frequency component f_n in the m-th frame, K_m(f_n) denotes the energy intensity of the energy spectrum of the m-th frame at frequency component f_n, and N denotes the window length of the short-time Fourier transform, which equals the Hamming window length, i.e. N = U;
(5) obtain the dynamic feature of each frame of the voice signal, using the custom function:
$$DT(m) = \left(\frac{1}{\left[(D(m) - \Lambda_D) \cdot (T(m) - \Lambda_T)\right]}\right)^{\omega}$$
In the formula, DT(m) denotes the dynamic feature of the m-th frame of the voice signal, D(m) denotes the logarithmic energy feature of the m-th frame, T(m) denotes the spectral entropy feature of the m-th frame, Λ_D and Λ_T denote the average logarithmic energy and the average spectral entropy of the preceding 10 frames of the voice signal, respectively, and ω denotes a preset factor, ω ∈ [1, 2];
(6) compare the dynamic feature of each frame of the voice signal with a preset threshold; frames whose dynamic feature exceeds the threshold are retained and marked as the natural speech information portion for further processing, and the remainder is marked as the silent portion.
In this preferred embodiment, the vehicle-mounted voice recognition electronic entertainment control system uses the above method to perform speech detection on the received natural speech signal. Describing the signal with both the logarithmic energy feature and the spectral entropy feature distinguishes the natural speech information portion from the silent portion more accurately, works well in particular when road noise is present, and helps the in-vehicle electronic entertainment control system recognize control commands accurately.
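Steps (1) to (4) above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the patent's implementation: frame lengths are given in samples (the text gives 30 ms but no sampling rate), the transform is a naive discrete Fourier transform chosen only to keep the block dependency-free, a silent all-zero frame is assigned entropy 0 by convention, and all function names are hypothetical. The dynamic feature of step (5) and the thresholding of step (6) are left out because the text does not say how a negative product is handled when it is raised to a fractional power ω.

```python
import cmath
import math
from typing import List


def hamming(length: int) -> List[float]:
    """Hamming window coefficients for a window of the given length."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def split_frames(signal: List[float], frame_len: int) -> List[List[float]]:
    """Step (1): frames with 50% overlap, each multiplied by a Hamming window."""
    hop = max(1, frame_len // 2)
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames


def log_energy(frame: List[float], c: float) -> float:
    """Step (2): D(m) = log10(sum |r_m(n)|^2 + c) - log10(c)."""
    energy = sum(sample * sample for sample in frame)
    return math.log10(energy + c) - math.log10(c)


def energy_spectrum(frame: List[float]) -> List[float]:
    """Step (3): squared DFT magnitude of one frame (naive O(N^2) transform)."""
    n_len = len(frame)
    spectrum = []
    for k in range(n_len):
        acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_len) for n in range(n_len))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame: List[float]) -> float:
    """Step (4): T(m) = -sum p log p, with p the normalised energy spectrum."""
    spectrum = energy_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        return 0.0  # all-zero frame: entropy set to 0 by convention (assumption)
    probs = [value / total for value in spectrum]
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

A frame whose energy is spread evenly over all frequency bins gives the largest entropy, while a frame dominated by a single bin gives an entropy near zero, which is why the text pairs this feature with the logarithmic energy.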
Preferably, the natural speech enhancement unit 21 enhances the natural speech information portion to obtain the natural speech information portion to be recognized as follows:
(1) apply the fast Fourier transform to the natural speech information portion to obtain its amplitude spectrum C(f);
(2) perform speech enhancement on the natural speech information portion, using the custom function:
where
In the formula, C′(f) denotes the amplitude spectrum of the natural speech information portion after speech enhancement, C(f) denotes the amplitude spectrum of the natural speech information portion, |C(f)|² denotes the power spectrum of the natural speech information portion, and δ and μ denote adjustment factors that tune the gain. The estimate of the current-frame noise power spectrum is obtained from the noise power spectrum of the silent portion preceding the natural speech information portion; A′(f) denotes the estimate of the noise power spectrum of the previous frame, A(f) denotes the noise power spectrum obtained for the current frame, and ω_p denotes the weight of the current-frame noise power spectrum. Note that the noise power spectrum is updated only in the silent portion and is not updated in the natural speech information portion;
(3) apply the inverse fast Fourier transform to the output of the custom filter to obtain the natural speech information portion to be recognized.
In this preferred embodiment, the vehicle-mounted voice recognition electronic entertainment control system uses the above method to obtain the required noise power spectrum estimate from the silent portion of the captured natural speech signal itself and then enhances the natural speech information portion. This improves the adaptability of the speech enhancement, effectively raises the signal-to-noise ratio of the natural speech information portion, and provides a basis for the subsequent recognition of control commands by the electronic entertainment control system.
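The enhancement formula itself is not reproduced in this text, so the following sketch does not implement the patent's custom filter. It only illustrates the two ideas the paragraph describes: a noise power spectrum that is updated on silent (unvoiced) frames only, and a gain applied to the amplitude spectrum of speech frames. Two assumptions are made and should be read as such: the noise update is taken to be a weighted average of the current and previous estimates, and the gain is a generic power spectral subtraction with a spectral floor, in which `delta` and `mu` play the role of over-subtraction factor and floor. All names and default values are hypothetical.

```python
import math
from typing import List


def update_noise_estimate(previous: List[float], current: List[float], weight: float) -> List[float]:
    """Assumed recursive average: weight * current + (1 - weight) * previous."""
    return [weight * cur + (1.0 - weight) * prev for prev, cur in zip(previous, current)]


def enhance_magnitude(magnitude: List[float], noise_power: List[float],
                      delta: float = 2.0, mu: float = 0.01) -> List[float]:
    """Generic power spectral subtraction with a spectral floor (stand-in filter)."""
    enhanced = []
    for mag, noise in zip(magnitude, noise_power):
        power = mag * mag
        cleaned = power - delta * noise  # over-subtract the noise power estimate
        floor = mu * power               # never go below a small fraction of the input power
        enhanced.append(math.sqrt(max(cleaned, floor)))
    return enhanced


class NoiseTracker:
    """Holds the noise power spectrum; it changes only on frames marked silent."""

    def __init__(self, initial: List[float], weight: float = 0.2) -> None:
        self.estimate = list(initial)
        self.weight = weight

    def process(self, magnitude: List[float], is_speech: bool) -> List[float]:
        if not is_speech:
            # Silent frame: refresh the noise estimate and pass the frame through.
            frame_power = [m * m for m in magnitude]
            self.estimate = update_noise_estimate(self.estimate, frame_power, self.weight)
            return list(magnitude)
        # Speech frame: enhance with the frozen noise estimate.
        return enhance_magnitude(magnitude, self.estimate)
```

The forward and inverse Fourier transforms of steps (1) and (3) are omitted; `process` expects the amplitude spectrum of one frame and returns the modified amplitude spectrum.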
Preferably, the feature extraction unit 22 extracts instruction feature parameters from the natural speech information portion to be recognized as follows:
(1) divide the natural speech information portion to be recognized into frames and apply a Hamming window;
Preferably, the frame length used for framing in this unit is N = 30 ms, with an overlap of 10 ms between adjacent frames;
(2) select each frame of the natural speech information portion in turn and apply the fast Fourier transform to obtain the spectrum R(f);
(3) convert the spectrum R(f) to the mel-frequency scale, R(f′), and obtain the characteristic energy spectrum E_b(x) of the natural speech signal using the following custom filter bank, specifically:
where
In the formula, E_b(x) denotes the characteristic energy spectrum output by the x-th filter in the filter bank, x = 1, 2, ..., X, X denotes the number of filters in the filter bank, R(f′) denotes the spectrum obtained after conversion to the mel-frequency scale, and f′ denotes the mel frequency. A centroid parameter is defined for the x-th filter in the filter bank, V_x(f) denotes the x-th filter in the filter bank, and j_x, h_x, and k_x denote the upper limit, centre, and lower limit of the x-th filter, respectively, where h_x = j_{x-1} = k_{x+1};
Preferably, the number of filters in the filter bank is X = 13;
The mel frequency is a nonlinear frequency scale based on the human ear's perceptual judgment of equal pitch intervals; the relation between the mel frequency f′ and the frequency f in hertz is:
(4) take the logarithm of the characteristic energy spectrum E_b(x) and then apply the discrete cosine transform; the first X coefficients after the discrete cosine transform are taken as the X-dimensional speech feature parameters of this frame of the natural speech information portion;
(5) repeat steps (2) to (4) until the feature parameters of every frame of the natural speech information portion to be recognized have been obtained.
In this preferred embodiment, the feature extraction unit 22 uses the above method to extract speech features. Introducing into the feature extraction function a centroid parameter for each filter, corresponding to a different frequency band, lets the feature parameters reflect the frequency characteristics of the natural speech information portion itself accurately and improves the robustness of feature extraction. It also improves the stability of the vehicle-mounted voice recognition electronic entertainment control system, particularly in an environment with road noise.
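Steps (3) and (4) follow the general shape of mel-frequency cepstral analysis, which can be sketched as below. The patent's custom filter bank with its centroid parameter and its hertz-to-mel relation are not reproduced in this text, so this sketch swaps in two standard choices and names them as such: the widely used conversion f′ = 2595·log10(1 + f/700) and a plain triangular filter bank. The function names, the input convention (a one-sided power spectrum with bins spaced evenly from 0 Hz to half the sampling rate), and the log guard value are assumptions for illustration.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Common hertz-to-mel conversion (assumed; the patent's formula is not shown)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_filterbank(num_filters: int, num_bins: int, sample_rate: float) -> List[List[float]]:
    """Standard triangular mel filters over bins spanning 0 .. sample_rate / 2."""
    top_mel = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(top_mel * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_hz = [b * (sample_rate / 2.0) / (num_bins - 1) for b in range(num_bins)]
    bank = []
    for x in range(1, num_filters + 1):
        low, centre, high = edges_hz[x - 1], edges_hz[x], edges_hz[x + 1]
        weights = []
        for f in bin_hz:
            if low < f <= centre:
                weights.append((f - low) / (centre - low))    # rising edge
            elif centre < f < high:
                weights.append((high - f) / (high - centre))  # falling edge
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def cepstral_features(power_spectrum: List[float], sample_rate: float,
                      num_filters: int = 13) -> List[float]:
    """Filter-bank energies, then logarithm, then DCT-II; returns num_filters values."""
    bank = triangular_filterbank(num_filters, len(power_spectrum), sample_rate)
    energies = [sum(w * p for w, p in zip(weights, power_spectrum)) for weights in bank]
    log_energies = [math.log(max(e, 1e-12)) for e in energies]  # guard against log(0)
    count = len(log_energies)
    return [
        sum(log_energies[x] * math.cos(math.pi * k * (x + 0.5) / count) for x in range(count))
        for k in range(count)
    ]
```

Because each filter's centre coincides with the upper edge of the previous filter and the lower edge of the next, the sketch matches the relation h_x = j_{x-1} = k_{x+1} stated for the filter limits, even though the filter shape itself is a stand-in.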
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit its scope of protection. Although the invention has been explained in detail with reference to preferred embodiments, a person of ordinary skill in the art should understand that the technical solution of the invention can be modified or replaced with equivalents without departing from its essence and scope.

Claims (8)

1. A vehicle-mounted voice recognition electronic entertainment control system, characterized in that it comprises a natural speech input module, a speech processing module, a Bluetooth module, a media player, an air conditioner, an application control program, and a vehicle body control module; the natural speech input module receives voice signals from the occupants; the speech processing module receives the voice signal from the natural speech input module and converts it into an executable control command; the Bluetooth module receives executable control commands and controls Bluetooth devices; the media player receives executable control commands that control its media playback; the air conditioner receives executable control commands and adjusts the temperature, the air volume, the internal/external air circulation mode, and the blowing mode.
2. The vehicle-mounted voice recognition electronic entertainment control system according to claim 1, characterized in that the system further comprises a navigator, which receives executable control commands to set a destination, plan a navigation route, select a route, and change the destination.
3. The vehicle-mounted voice recognition electronic entertainment control system according to claim 1, characterized in that the application control program receives executable control commands and runs the corresponding application program.
4. The vehicle-mounted voice recognition electronic entertainment control system according to claim 1, characterized in that the vehicle body control module receives executable control commands and controls the in-vehicle facilities.
5. The vehicle-mounted voice recognition electronic entertainment control system according to claim 1, characterized in that the system further comprises a prompting module, which receives executable control commands and issues voice prompts to the occupants.
6. The vehicle-mounted voice recognition electronic entertainment control system according to claim 1, characterized in that the speech processing module comprises a natural speech detection unit, a natural speech enhancement unit, a feature extraction unit, and a natural speech recognition unit; the natural speech detection unit detects and extracts the effective natural speech information portion of the received voice signal; the natural speech enhancement unit enhances the natural speech information portion to obtain the natural speech information portion to be recognized; the feature extraction unit extracts instruction feature parameters from the natural speech information portion to be recognized; the natural speech recognition unit performs recognition according to the instruction feature parameters to obtain the corresponding control command.
7. The vehicle-mounted voice recognition electronic entertainment control system according to claim 6, characterized in that the natural speech detection unit detects and extracts the effective natural speech information in the received voice signal as follows:
(1) divide the received voice signal into frames with 50% overlap between adjacent frames and apply a Hamming window to obtain each frame of the voice signal;
(2) obtain the logarithmic energy feature of each frame of the voice signal, using the function:
$$D(m) = \log_{10}\left(\sum_{n=1}^{U} |r_m(n)|^2 + c\right) - \log_{10} c$$
In the formula, D(m) denotes the logarithmic energy feature of the m-th frame of the voice signal, $\sum_{n=1}^{U} |r_m(n)|^2$ denotes the short-time energy of the m-th frame, $|r_m(n)|^2$ denotes the energy value of the m-th frame at each instant, U denotes the length of the Hamming window, and c denotes the preset logarithmic energy factor;
(3) apply the short-time Fourier transform to each frame of the voice signal to obtain the energy spectrum K(f_n), where f_n denotes a frequency component;
(4) obtain the spectral entropy feature of each frame of the voice signal, using the function:
$$T(m) = -\sum_{n=1}^{N} p_g(n, m) \log p_g(n, m)$$
where
$$p_g(n, m) = \frac{K_m(f_n)}{\sum_{\gamma=1}^{N} K_m(f_\gamma)}, \quad n = 1, 2, \ldots, N$$
In the formula, T(m) denotes the spectral entropy feature of the m-th frame of the voice signal, p_g(n, m) denotes the probability density of frequency component f_n in the m-th frame, K_m(f_n) denotes the energy intensity of the energy spectrum of the m-th frame at frequency component f_n, and N denotes the window length of the short-time Fourier transform, which equals the Hamming window length, i.e. N = U;
(5) obtain the dynamic feature of each frame of the voice signal, using the custom function:
$$DT(m) = \left(\frac{1}{\left[(D(m) - \Lambda_D) \cdot (T(m) - \Lambda_T)\right]}\right)^{\omega}$$
In the formula, DT(m) denotes the dynamic feature of the m-th frame of the voice signal, D(m) denotes the logarithmic energy feature of the m-th frame, T(m) denotes the spectral entropy feature of the m-th frame, Λ_D and Λ_T denote the average logarithmic energy and the average spectral entropy of the preceding 10 frames of the voice signal, respectively, and ω denotes a preset factor, ω ∈ [1, 2];
(6) compare the dynamic feature of each frame of the voice signal with a preset threshold; frames whose dynamic feature exceeds the threshold are retained and marked as the natural speech information portion for further processing, and the remainder is marked as the silent portion.
8. The vehicle-mounted voice recognition electronic entertainment control system according to claim 6, characterized in that the feature extraction unit extracts instruction feature parameters from the natural speech information portion to be recognized, obtained by the natural speech enhancement unit, as follows:
(1) divide the natural speech information portion to be recognized into frames and apply a Hamming window;
(2) select each frame of the natural speech information portion in turn and apply the fast Fourier transform to obtain the spectrum R(f);
(3) convert the spectrum R(f) to the mel-frequency scale, R(f′), and obtain the characteristic energy spectrum E_b(x) of the natural speech signal using the following custom filter bank, specifically:
where
In the formula, E_b(x) denotes the characteristic energy spectrum output by the x-th filter in the filter bank, x = 1, 2, ..., X, X denotes the number of filters in the filter bank, R(f′) denotes the spectrum obtained after conversion to the mel-frequency scale, and f′ denotes the mel frequency. A centroid parameter is defined for the x-th filter in the filter bank, V_x(f) denotes the x-th filter in the filter bank, and j_x, h_x, and k_x denote the upper limit, centre, and lower limit of the x-th filter, respectively, where h_x = j_{x-1} = k_{x+1};
(4) take the logarithm of the characteristic energy spectrum E_b(x) and then apply the discrete cosine transform; the first X coefficients after the discrete cosine transform are taken as the X-dimensional speech feature parameters of this frame of the natural speech information portion;
(5) repeat steps (2) to (4) until the feature parameters of every frame of the natural speech information portion to be recognized have been obtained.
CN201710632907.3A 2017-07-28 2017-07-28 Vehicle-mounted voice identifies electronic entertainment control system Pending CN107437418A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710632907.3A CN107437418A (en) 2017-07-28 2017-07-28 Vehicle-mounted voice identifies electronic entertainment control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710632907.3A CN107437418A (en) 2017-07-28 2017-07-28 Vehicle-mounted voice identifies electronic entertainment control system

Publications (1)

Publication Number Publication Date
CN107437418A true CN107437418A (en) 2017-12-05

Family

ID=60459896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710632907.3A Pending CN107437418A (en) 2017-07-28 2017-07-28 Vehicle-mounted voice identifies electronic entertainment control system

Country Status (1)

Country Link
CN (1) CN107437418A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108545040A (en) * 2018-03-15 2018-09-18 王强源 A kind of automobile of automatic running
CN109319351A (en) * 2018-11-28 2019-02-12 广州市煌子辉贸易有限公司 A kind of intelligent garbage bin with sound identifying function

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
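Zhang Lin: "Research on MFCC-Based Robust Speech Recognition in Noisy Environments", China Excellent Master's Theses Full-text Database, Information Science and Technology *
Yang Yong: "Research on Speech Enhancement Algorithms Based on Statistical Models and Bayesian Estimation", China Excellent Master's Theses Full-text Database, Information Science and Technology *
Zhai Zhenhui: "Research on Single-Channel Speech Enhancement Algorithms Based on Noise Magnitude Spectrum Estimation", China Excellent Master's Theses Full-text Database, Information Science and Technology *
Zhao Huan et al.: "A New Speech Endpoint Detection Method Based on Log Energy Spectral Entropy", Journal of Hunan University (Natural Sciences) *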

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171205