CN103267568B - Voice online detection method for automobile electronic control unit

Info

Publication number: CN103267568B
Application number: CN201310206492.5A
Authority: CN
Inventors: 许雲淞; 高会军; 魏树银; 于金泳; 孙光辉; 李正超; 王海鹏
Original assignee: Harbin Institute of Technology
Current assignee: Ningbo Intelligent Equipment Research Institute Co., Ltd.
Priority date: 2013-05-29
Filing date: 2013-05-29
Publication date: 2015-06-10
Anticipated expiration: 2033-05-29
Also published as: CN103267568A

Abstract

The invention discloses an online sound detection method for an automobile electronic control unit. It addresses two problems of conventional sound detection for such units: robustness is difficult to maintain in a detection workshop, and embedded detection struggles to achieve accuracy and speed at the same time. The method comprises the following steps: sound data that has undergone hardware low-pass filtering and pre-emphasis is acquired; the sound data is subjected to preliminary framing and endpoint detection to determine the durations of the sound segments and silent segments; feature values are extracted to obtain a four-dimensional feature vector describing the sound data; this feature vector is matched against the standard signal feature vectors in a feature sound signal sample library, and a matching evaluation index is calculated to yield qualified or unqualified information; the matching evaluation index is then judged to be qualified or unqualified to obtain the detection result, which completes one detection. The method can be widely applied to online sound detection for automobile electronic control units.

Description

An online sound detection method for a vehicle electronic control unit

Technical field

The present invention relates to an online sound detection method.

Background art

Traditionally, sound detection for vehicle electronic control units is performed manually in the workshop: an experienced worker listens to the sound and judges whether it meets the requirements. This increases labor cost on the one hand and limits detection accuracy and reliability on the other.

Computer-aided detection methods and devices appeared later, but these methods need to collect large amounts of data and run on a host computer, which hinders portable detection and limits detection speed. Moreover, background noise in the detection workshop is loud and random, which places higher demands on the performance of the detection method, and such methods have difficulty maintaining stable performance in this environment.

As for the embedded detection approaches that have appeared, the devices are portable and compact, but this places higher demands on the corresponding method, and accuracy and speed are difficult to satisfy at the same time.

Summary of the invention

The present invention aims to solve the problems that conventional sound detection methods for vehicle electronic control units have difficulty maintaining robustness in the detection workshop and that embedded detection has difficulty achieving both accuracy and speed. It accordingly provides an online sound detection method for a vehicle electronic control unit.

An online sound detection method for a vehicle electronic control unit comprises the following steps:

Step 1: initialize the system so that it operates in the detection state;

Step 2: acquire sound data that has passed through hardware low-pass filtering and pre-emphasis; the sampling rate is 8 kHz and the acquisition time is 4 s;

Step 3: perform preliminary framing on the acquired sound data;

Step 4: perform endpoint detection on the sound data to determine the sound-segment duration and the silent-segment duration; the endpoint detection comprises a coarse detection process and a fine detection process;

Step 5: extract the feature values of the sound data to obtain a four-dimensional feature vector describing the sound data; the features comprise the duration of a sound segment, the time interval between a sound segment and the next sound segment, the characteristic frequency of the sound segment, and the instantaneous power at the characteristic frequency; the four-dimensional feature vector comprises the difference between the end frame and the start frame of a sound segment, the difference between the start frame of a sound segment and the end frame of the preceding sound segment, the characteristic frequency of the sound segment, and the instantaneous power at the characteristic frequency;

Step 6: match the four-dimensional feature vector obtained in Step 5 against the standard signal feature vectors in the feature sound sample library, and calculate the matching evaluation index to obtain qualified or unqualified information;

Step 7: judge whether the matching evaluation index is qualified or unqualified information; if qualified, send the qualified information and the ID of the corresponding feature sound to the workshop inspection center; if unqualified, send the unqualified information to the workshop inspection center and prompt that the product be rejected;

Step 8: obtain the detection result, completing one detection.

The present invention maintains robustness in the detection workshop and allows embedded detection to achieve both accuracy and speed. The method ensures computational accuracy while also improving computation speed. For endpoint detection of the sound data, it introduces a method that combines coarse detection with fine detection.

Brief description of the drawings

Fig. 1 is a flowchart of the online sound detection method for a vehicle electronic control unit according to the present invention;

Fig. 2 is a schematic diagram of the signal collected over 4 s as described in Embodiment 1; (a) shows the case where the first segment is a sound segment, and (b) shows the case where the first segment is a silent segment.

Embodiment

Embodiment 1: this embodiment is described with reference to Figs. 1 and 2. An online sound detection method for a vehicle electronic control unit comprises the following steps:

Step 1: initialize the system so that it operates in the detection state;

Step 8: obtain the detection result, completing one detection.

The detailed steps of the present invention are:

Step 1: initialize the system so that it operates in the detection state;

In Step 3, the preliminary framing of the acquired sound data proceeds as follows:

Step 31: write the sound data acquired in Step 2 into SRAM;

Step 32: divide the sound data from Step 31 into frames of a predetermined frame length; the predetermined frame length is 128 samples, corresponding to a duration of 16 ms.
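The framing in Steps 31 and 32 can be illustrated with a minimal Python sketch. It assumes consecutive, non-overlapping frames and drops any incomplete trailing frame; the source states neither point, and the function names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000   # sampling rate from Step 2
FRAME_LEN = 128         # frame length from Step 32


def split_into_frames(samples, frame_len=FRAME_LEN):
    """Split a sample sequence into consecutive non-overlapping frames.

    Trailing samples that do not fill a whole frame are dropped
    (an assumption; the source does not say how a partial frame is handled).
    """
    n_frames = len(samples) // frame_len
    return [samples[k * frame_len:(k + 1) * frame_len] for k in range(n_frames)]


def frame_duration_ms(frame_len=FRAME_LEN, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one frame in milliseconds."""
    return 1000.0 * frame_len / sample_rate_hz


# A 4 s recording at 8 kHz holds 32000 samples, i.e. 250 full frames.
recording = [0] * (4 * SAMPLE_RATE_HZ)
frames = split_into_frames(recording)
```

With these values, one frame spans 128 / 8000 s = 16 ms, which agrees with the duration given in Step 32.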

The coarse detection process is:

Step 4 A1: compute the short-time energy E(f) of each frame of sound data obtained in Step 3:

E(f) = \sum_{i=0}^{N-1} X_f^2(i)

where f is the frame number, X_f(i) is the amplitude of the i-th data point in the f-th frame of sound data, and N is the total number of data points in the f-th frame;
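As an illustration of Step 4 A1, the short-time energy is simply the sum of squared amplitudes within a frame. The sketch below uses hypothetical function names and toy four-sample frames in place of the 128-sample frames of the method.

```python
def short_time_energy(frame):
    """E(f): sum of squared sample amplitudes over one frame."""
    return sum(x * x for x in frame)


def energy_per_frame(frames):
    """Short-time energy of every frame, indexed by frame number f."""
    return [short_time_energy(frame) for frame in frames]


# Two toy frames of four samples each (real frames hold 128 samples).
toy_frames = [[0, 0, 0, 0], [1, -2, 2, -1]]
energies = energy_per_frame(toy_frames)
```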

Step 4 A2: design the optimal edge detection filter that transforms the short-time energy E(f);

The design proceeds as follows: the exponential terms and the sine and cosine terms of the filter are computed offline, and the resulting discrete values are stored in ROM; during filtering they are retrieved in order by table lookup;

The filter h(x) is:

f(x) = e^{Ax} [K_1 \sin(Ax) + K_2 \cos(Ax)] + e^{-Ax} [K_3 \sin(Ax) + K_4 \cos(Ax)] + K_5 + K_6 e^{Sx}

h(x) = \begin{cases} -f(-x), & 0 \leq x \leq W \\ f(x), & -W \leq x < 0 \end{cases}

where A, S, and K = [K_1, K_2, K_3, K_4, K_5, K_6] are the filter coefficients and W is the filter order; the values chosen are W = 5, S = 1.4, A = 0.573, K = [1.6, 1.5, -0.08, -0.04, -0.9, -0.56];
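The offline table computation of Step 4 A2 can be sketched as follows. This is only an illustration: the source stores the exponential and sine/cosine terms in ROM, whereas the sketch stores the final values of h(x) at the integer points -W..W in a dictionary named `H_TABLE` (a hypothetical name).

```python
import math

W = 5
S = 1.4
A = 0.573
K = [1.6, 1.5, -0.08, -0.04, -0.9, -0.56]  # K1..K6


def f(x):
    """The auxiliary function f(x) from the filter definition."""
    k1, k2, k3, k4, k5, k6 = K
    return (math.exp(A * x) * (k1 * math.sin(A * x) + k2 * math.cos(A * x))
            + math.exp(-A * x) * (k3 * math.sin(A * x) + k4 * math.cos(A * x))
            + k5 + k6 * math.exp(S * x))


def h(x):
    """Piecewise filter: -f(-x) for 0 <= x <= W, f(x) for -W <= x < 0."""
    if 0 <= x <= W:
        return -f(-x)
    if -W <= x < 0:
        return f(x)
    raise ValueError("x outside [-W, W]")


# Precomputed table, standing in for the values stored in ROM.
H_TABLE = {x: h(x) for x in range(-W, W + 1)}
```

By construction h(-x) = -h(x) for x ≠ 0, and with the stated coefficients K_2 + K_4 + K_5 + K_6 = 0, so h(0) vanishes. The filter therefore gives zero output for a constant energy level and responds only to changes in energy.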

Step 4 A3: from the short-time energy E(f) obtained in Step 4 A1 and the optimal edge detection filter h(x) obtained in Step 4 A2, compute the filter output F(f);

The filter output F(f) is:

F(f) = \sum_{i=-W}^{W} E(f+i) \, h(i)

where f is the frame number;
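The filter output of Step 4 A3 is a sliding weighted sum over 2W + 1 neighbouring frames. The sketch below assumes that frames within W of either end of the recording receive an output of 0.0, since the source does not describe boundary handling. It uses a small hypothetical antisymmetric kernel instead of the patent's coefficients so that the result is easy to check by hand.

```python
def filter_output(energy, h_table, w):
    """F(f) = sum over i in [-w, w] of E(f + i) * h(i).

    Frames closer than w to either end are given 0.0 because E(f + i)
    would fall outside the recording (an assumption; the source does not
    describe boundary handling).
    """
    m = len(energy)
    out = [0.0] * m
    for f in range(w, m - w):
        out[f] = sum(energy[f + i] * h_table[i] for i in range(-w, w + 1))
    return out


# Toy antisymmetric kernel (hypothetical, not the patent's coefficients).
toy_w = 2
toy_h = {-2: -1.0, -1: -1.0, 0: 0.0, 1: 1.0, 2: 1.0}

# Energy that steps from 0 up to 5 at frame 4 and back down at frame 8.
toy_energy = [0, 0, 0, 0, 5, 5, 5, 5, 0, 0, 0, 0]
toy_F = filter_output(toy_energy, toy_h, toy_w)
```

In this toy case the output is positive around the rising edge of the energy and negative around the falling edge, which is the behaviour that thresholding the filter output relies on.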

Step 4 A4: compute the energy thresholds from the filter output F(f) obtained in Step 4 A3 and store them in ROM; the energy thresholds comprise the improved short-time energy lower limit TH for sound segments and the improved short-time energy upper limit TL for silent segments;

The energy thresholds are computed from the filter output F(f) obtained in Step 4 A3 as follows:

Step 4 A4-1: traverse the filter output F(f) of every frame to obtain its maximum value F_max and minimum value F_min;

Step 4 A4-2: compute the initial value F(0) of the threshold iteration:

F(0) = \frac{1}{2} (F_{\max} + F_{\min})

Step 4 A4-2: compute the threshold F(k) at the k-th iteration:

F(k) = \frac{1}{2} \left[ \sum_{i=1}^{m(k-1)} \frac{F_i}{m(k-1)} + \sum_{j=1}^{n(k-1)} \frac{F_j}{n(k-1)} \right]

where m(k-1) and n(k-1) are the numbers of frames whose filter output is greater than and less than F(k-1), respectively, when the (k-1)-th iteration completes, and F_i and F_j are the filter outputs of those two groups of frames; the iteration ends when F(k) - F(k-1) < 1, yielding F(k);

Step 4 A4-3: from the F(k) obtained in Step 4 A4-2, set TH = F(k) + 10 and TL = F(k) - 3, and store them in ROM;
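The threshold iteration of Step 4 A4 can be sketched as below. Several details are assumptions rather than statements from the source: the stopping test uses the absolute difference |F(k) - F(k-1)| < 1, an empty group falls back to the current level, and a cap on the number of iterations is added for safety. The function name is hypothetical.

```python
def adaptive_thresholds(F, tol=1.0, max_iter=100):
    """Iterate the threshold F(k) and derive TH and TL from it."""
    level = 0.5 * (max(F) + min(F))          # F(0)
    for _ in range(max_iter):                # max_iter is a safety cap (assumption)
        above = [v for v in F if v > level]
        below = [v for v in F if v < level]
        # Fall back to the current level if one group is empty (assumption).
        mean_above = sum(above) / len(above) if above else level
        mean_below = sum(below) / len(below) if below else level
        new_level = 0.5 * (mean_above + mean_below)
        converged = abs(new_level - level) < tol
        level = new_level
        if converged:
            break
    return level + 10, level - 3, level      # TH, TL, F(k)


TH, TL, Fk = adaptive_thresholds([0, 0, 0, 0, 40, 40])
```

For this toy input the iteration settles at F(k) = 20 after one pass, giving TH = 30 and TL = 17.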

Step 4 A5: determine the two coarse-detection start points f_1 and f_6 of the sound data, with f_1 < f_6, and the two coarse-detection end points f_3 and f_8, with f_3 < f_8;

The start frames f_1 and f_6 of the sound data are determined as follows: traverse the filter output F(f) obtained in Step 4 A3, that is, for 3 ≤ f ≤ M-3, where M is the total number of frames, increment f in steps of 1 and apply the following operations to F(f):

When, for the first time, the filter outputs of three consecutive frames f, f+1, f+2 are all greater than TH and the filter outputs of frames f-3, f-2, f-1 are all less than TH, take frame f, the lowest-numbered of the three frames, as the first coarse-detection start point f_1 of the sound data;

When, for the second time, the filter outputs of three consecutive frames f, f+1, f+2 are all greater than TH and the filter outputs of frames f-3, f-2, f-1 are all less than TH, take frame f, the lowest-numbered of the three frames, as the second coarse-detection start point f_6 of the sound data;

When, for the first time, the filter outputs of three consecutive frames f, f+1, f+2 are all less than TL and the filter outputs of frames f-3, f-2, f-1 are all greater than TL, record the current frame number f_2 = f; then, when the filter outputs of three consecutive frames f, f+1, f+2 are all greater than TL and the filter outputs of frames f-3, f-2, f-1 are all less than TL, check whether the difference between the current frame number f_3 = f and f_2 is greater than 10; if so, take f_3 as the first coarse-detection end point of the sound data; otherwise continue until an f_3 satisfying the condition is obtained;

When, for the second time, the filter outputs of three consecutive frames f, f+1, f+2 are all less than TL and the filter outputs of frames f-3, f-2, f-1 are all greater than TL, record the current frame number f_7 = f; then, when the filter outputs of three consecutive frames f, f+1, f+2 are all greater than TL and the filter outputs of frames f-3, f-2, f-1 are all less than TL, check whether the difference between the current frame number f_8 = f and f_7 is greater than 10; if so, take f_8 as the second coarse-detection end point of the sound data; otherwise continue until an f_8 satisfying the condition is obtained.
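The six-frame pattern used throughout Step 4 A5 (three frames on one side of a threshold followed by three frames on the other side) can be sketched with one helper. The function name and the 0-based frame indexing are assumptions. The sketch covers only the pattern search, not the additional rule that f_3 - f_2 and f_8 - f_7 must exceed 10.

```python
def crossings(values, threshold, rising=True):
    """Frames f where values[f..f+2] lie on one side of the threshold
    and values[f-3..f-1] lie on the other side.

    rising=True : the three frames from f are above, the three before are below.
    rising=False: the three frames from f are below, the three before are above.
    """
    m = len(values)
    found = []
    for f in range(3, m - 2):                # 3 <= f <= m-3 keeps f-3 and f+2 in range
        after = values[f:f + 3]
        before = values[f - 3:f]
        if rising:
            ok = (all(v > threshold for v in after)
                  and all(v < threshold for v in before))
        else:
            ok = (all(v < threshold for v in after)
                  and all(v > threshold for v in before))
        if ok:
            found.append(f)
    return found


# Toy filter output with two bursts; the threshold 10 stands in for TH.
toy_F = [0, 0, 0, 0, 20, 20, 20, 20, 0, 0, 0, 0, 20, 20, 20, 20, 0, 0, 0]
starts = crossings(toy_F, 10)                # first and second match play the role of f_1, f_6
falls = crossings(toy_F, 10, rising=False)
```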

Coarse detection does not use the short-time energy directly. Instead, an optimal edge detection filter is derived from edge detection in image processing and applied to the input short-time energy. The filter output determines the improved short-time energy lower limit for sound segments and the improved short-time energy upper limit for silent segments, and these limits serve as the criteria for preliminarily determining the coarse-detection end points of the sound signal.

When determining the improved short-time energy lower limit for sound segments and the improved short-time energy upper limit for silent segments, the lower limit is not simply fixed at TH = 10, nor the upper limit at TL = -3. Instead, the short-time energy is transformed by the optimal edge detection filter, and an iteration formula designed on the filter output yields the offset F(k) of the current sound signal, from which TH = F(k) + 10 and TL = F(k) - 3, as in Step 4 A4-3. The two limits are therefore not fixed but adapt to the actual sound signal input, which improves the robustness and accuracy of endpoint detection.

The fine detection process is:

Step 4 B1: set the zero-crossing discrimination threshold thr and the improved zero-crossing rate formula:

Z_f = \frac{1}{2} \sum_{i=0}^{N-1} \left| \operatorname{sgn}[X_f(i) - thr] - \operatorname{sgn}[X_f(i-1) + thr] \right|

where f is the frame number, X_f(i) is the amplitude of the i-th data point in the f-th frame of sound data, N is the total number of data points in the f-th frame, Z_f is the zero-crossing rate of the f-th frame, and thr = 0.8;

Step 4 B2: compute the zero-crossing rate of the sound data framed in Step 3;

Step 4 B3: for 3 ≤ f ≤ M-3, where M is the total number of frames and f is the frame number, increment f in steps of 1 and perform the following operations:

When, for the first time, the zero-crossing rates Z_f of three consecutive frames f, f+1, f+2 are all greater than the zero-crossing rate threshold Z_thr = 10 and the zero-crossing rates of frames f-3, f-2, f-1 are all less than Z_thr = 10, record the frame number f_4 = f and take f_4 as the first fine-detection start point of the sound data;

When, for the second time, the zero-crossing rates Z_f of three consecutive frames f, f+1, f+2 are all greater than the zero-crossing rate threshold Z_thr = 10 and the zero-crossing rates of frames f-3, f-2, f-1 are all less than Z_thr = 10, record the frame number f_9 = f and take f_9 as the second fine-detection start point of the sound data; f_4 < f_9.

When, for the first time, the zero-crossing rates Z_f of three consecutive frames f, f+1, f+2 are all less than the zero-crossing rate threshold Z_thr = 10 and the zero-crossing rates of frames f-3, f-2, f-1 are all greater than Z_thr = 10, record the frame number f_5 = f and take f_5 as the first fine-detection end point of the sound data;

When, for the second time, the zero-crossing rates Z_f of three consecutive frames f, f+1, f+2 are all less than the zero-crossing rate threshold Z_thr = 10 and the zero-crossing rates of frames f-3, f-2, f-1 are all greater than Z_thr = 10, record the frame number f_10 = f and take f_10 as the second fine-detection end point of the sound data; f_5 < f_10;

Step 4 B4: for the data collected over the 4 s described in Step 2, obtain four end points in ascending order of frame number, denoted f_A, f_B, f_C, and f_D, and determine the position of f_A;

If f_A is the end point of a sound segment, then

f_A = \frac{1}{2} (f_5 - f_3), \quad f_B = \frac{1}{2} (f_4 - f_1), \quad f_C = \frac{1}{2} (f_{10} - f_8),

and the sound-segment duration is L_y = 8 × (f_C - f_B), in milliseconds; the silent-segment duration is L_w = 8 × (f_B - f_A), in milliseconds;

If f_A is the start point of a sound segment, then

f_A = \frac{1}{2} (f_4 - f_1), \quad f_B = \frac{1}{2} (f_5 - f_3), \quad f_C = \frac{1}{2} (f_9 - f_6),

and the sound-segment duration is L_y = 8 × (f_B - f_A), in milliseconds; the silent-segment duration is L_w = 8 × (f_C - f_B), in milliseconds.
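The two cases of Step 4 B4 differ only in which interval is the sound segment. A minimal sketch, taking f_A, f_B, and f_C as already computed and using a hypothetical function name and flag, might look like this:

```python
def segment_durations_ms(f_a, f_b, f_c, first_endpoint_is_end):
    """Return (L_y, L_w): sound-segment and silent-segment durations in ms.

    first_endpoint_is_end=True : f_A ends a sound segment, so f_A..f_B is
                                 silent and f_B..f_C is sound.
    first_endpoint_is_end=False: f_A starts a sound segment, so f_A..f_B is
                                 sound and f_B..f_C is silent.
    """
    if first_endpoint_is_end:
        return 8 * (f_c - f_b), 8 * (f_b - f_a)
    return 8 * (f_b - f_a), 8 * (f_c - f_b)
```

For example, with the illustrative values f_A = 10, f_B = 40, f_C = 60 and f_A ending a sound segment, the sketch gives L_y = 160 ms and L_w = 240 ms.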

Fine detection uses the improved zero-crossing rate, so that endpoint detection effectively filters out noise interference. The final end points incorporate contributions from both energy and frequency; deriving them from these two aspects makes them more objective and accurate.

When the zero-crossing rate is computed, the zero-crossing discrimination threshold thr is introduced to improve the detection formula, which takes the form

Z_f = \frac{1}{2} \sum_{i=0}^{N-1} \left| \operatorname{sgn}[X_f(i) - thr] - \operatorname{sgn}[X_f(i-1) + thr] \right| .

With this threshold, noise whose oscillation amplitude does not exceed thr does not affect the zero-crossing count. This eliminates false zero crossings, filters out small-amplitude noise, and improves the precision of endpoint detection.

In Step 5, the feature values of the sound data are extracted to obtain the four-dimensional feature vector describing the sound data as follows:

Step 51: using the sound-segment duration formula obtained in Step 4 B4, convert the sound-segment duration into the end frame number of the sound segment minus the start frame number of the same segment;

Step 52: using the silent-segment duration formula obtained in Step 4 B4, convert the time interval between a sound segment and the next sound segment into the start frame number of the sound segment minus the end frame number of the preceding sound segment;

Step 53: apply a fast Fourier transform (FFT) to the sound data stored in SRAM in Step 31 to obtain the characteristic frequency of the sound segment;

Step 54: from the sound data stored in SRAM in Step 31 and the characteristic frequency described in Step 53, compute the corresponding instantaneous power.
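Steps 53 and 54 can be illustrated with a small spectral sketch. The source does not define how the characteristic frequency is selected or how the instantaneous power is normalized, so the sketch assumes the characteristic frequency is the strongest non-DC spectral bin and the power is the squared magnitude of that bin. A direct DFT is used in place of an FFT to keep the example free of dependencies.

```python
import cmath
import math


def dominant_frequency_and_power(samples, sample_rate_hz=8000):
    """Return (frequency in Hz, squared magnitude) of the strongest DFT bin.

    A direct O(N^2) DFT is used to stay dependency-free; an FFT gives the
    same bins faster. The DC bin is skipped.
    """
    n = len(samples)
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n, best_power


# A 1 kHz unit-amplitude tone sampled at 8 kHz, one 128-sample frame.
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(128)]
freq_hz, power = dominant_frequency_and_power(tone)
```

With 128 samples at 8 kHz the bin spacing is 62.5 Hz, so the 1 kHz tone falls exactly on bin 16.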

When choosing the features used for sound signal matching, the method does not rely only on the "characteristic frequency of the sound segment" and the "instantaneous power corresponding to the characteristic frequency"; it also matches the sound-segment duration and the silent-segment duration. This avoids the case where the sound frequency meets the requirement but the sounding duration does not, and so improves detection accuracy. In the two cases, the added quantities, given by L_y = 8 × (f_C - f_B), L_w = 8 × (f_B - f_A) and by L_y = 8 × (f_B - f_A), L_w = 8 × (f_C - f_B) respectively, are replaced by "the difference between the end frame and the start frame of a sound segment" and "the difference between the start frame of a sound segment and the end frame of the preceding sound segment". This turns multiplication into subtraction and increases the computation speed of the method.

The feature sound data sample library described in Step 6 is built as follows:

Step 6 A1: initialize the system so that it operates in the training state;

Step 6 A2: assign an ID to each kind of feature sound;

Step 6 A3: under normal workshop conditions, acquire the feature sound data corresponding to each ID three times;

Step 6 A4: perform Steps 3 to 5 to obtain three four-dimensional feature vectors of the feature sound data;

Step 6 A5: compute the arithmetic mean of the three four-dimensional feature vectors vec1, vec2, and vec3 of a single feature sound, and take the resulting mean feature vector as the feature vector of that feature sound;

Step 6 A6: write the feature vector obtained in Step 6 A5 into external FLASH memory to form the feature signal sample library.
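The averaging in Step 6 A5 and the keyed storage in Step 6 A6 can be sketched as below. The numeric values and the ID are made up for illustration, and a dictionary stands in for the external FLASH memory.

```python
def mean_feature_vector(vectors):
    """Component-wise arithmetic mean of equally long feature vectors."""
    count = len(vectors)
    return [sum(components) / count for components in zip(*vectors)]


# Hypothetical training captures for one feature sound:
# [sound frames, gap frames, characteristic frequency (Hz), power].
vec1 = [30, 20, 1000.0, 4.0]
vec2 = [32, 18, 1010.0, 5.0]
vec3 = [31, 22, 990.0, 6.0]

# Sample library keyed by sound ID (the ID value 1 is made up).
sample_library = {1: mean_feature_vector([vec1, vec2, vec3])}
```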

In Step 6, the four-dimensional feature vector obtained in Step 5 is matched against the standard signal feature vectors in the feature sound data sample library, and the matching evaluation index is calculated to obtain qualified or unqualified information, as follows:

Step 6 B1: compare the four-dimensional feature vector obtained in Step 5 one by one with the standard signal feature vectors in the feature sound data sample library, and compute the relative error of each of its four components: the relative error of the difference between the end frame and the start frame of the sound segment, the relative error of the difference between the start frame of the sound segment and the end frame of the preceding sound segment, the relative error of the characteristic frequency of the sound segment, and the relative error of the instantaneous power at the characteristic frequency;

Step 6 B2: compute the sum of three of the relative errors described in Step 6 B1: the relative error of the difference between the end frame and the start frame of the sound segment, the relative error of the difference between the start frame of the sound segment and the end frame of the preceding sound segment, and the relative error of the characteristic frequency of the sound segment;

Step 6 B3: evaluate the sum described in Step 6 B2 against every standard feature vector in the sample library, and find the standard signal feature vector for which the sum is smallest;

Step 6 B4: evaluate the matching index, namely whether, for the feature vector found in Step 6 B3, the relative error of the difference between the end frame and the start frame of the sound segment is less than 0.1, the relative error of the difference between the start frame of the sound segment and the end frame of the preceding sound segment is less than 0.1, the relative error of the characteristic frequency of the sound segment is less than 0.05, and the relative error of the instantaneous power at the characteristic frequency is less than 0.15;

If all conditions are met, qualified information is obtained; otherwise, unqualified information is obtained.
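Steps 6 B1 to 6 B4 amount to a nearest-template search followed by per-component limit checks. The sketch below assumes the relative error is |measured - reference| / |reference| with nonzero references and a non-empty library; the function names and library contents are illustrative.

```python
LIMITS = (0.1, 0.1, 0.05, 0.15)  # per-component relative-error limits from Step 6 B4


def relative_errors(measured, reference):
    """|measured - reference| / |reference| per component (references assumed nonzero)."""
    return [abs(m - r) / abs(r) for m, r in zip(measured, reference)]


def match(measured, library):
    """Return (best_id, qualified) for a measured four-dimensional vector.

    The library is assumed to be non-empty.
    """
    best_id, best_errors, best_score = None, None, float("inf")
    for sound_id, reference in library.items():
        errors = relative_errors(measured, reference)
        score = sum(errors[:3])              # Step 6 B2 sums the first three errors
        if score < best_score:
            best_id, best_errors, best_score = sound_id, errors, score
    qualified = all(e < limit for e, limit in zip(best_errors, LIMITS))
    return best_id, qualified


library = {1: [31.0, 20.0, 1000.0, 5.0], 2: [60.0, 40.0, 2000.0, 8.0]}
good = match([30.0, 20.0, 1010.0, 5.5], library)
bad = match([30.0, 20.0, 1100.0, 5.0], library)
```

Both measurements are closest to the template with ID 1, but the second has a 10 % frequency error, which exceeds the 0.05 limit, so it is reported as unqualified.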

By building a standard sound signal sample library and using matching, a single detection in practice can cover all types of sound signals of a given vehicle electronic control unit, and the sound signals may be input in any order. Switching between different sound types requires no manual intervention: the system autonomously identifies which sound the input signal is, judges whether it is qualified, and sends the ID of the qualified sound type to the inspection center. This contrasts with an approach in which one detection can check only one type of sound of a given vehicle electronic control unit and manual switching is needed before the next type can be checked.

Step 8: obtain the detection result, completing one detection.

Claims

1. An online sound detection method for a vehicle electronic control unit, characterized in that it comprises the following steps:

Step 1: initialize the system so that it operates in the detection state;

Step 6: match the four-dimensional feature vector obtained in Step 5 against the standard signal feature vectors in the feature sound data sample library, and calculate the matching evaluation index to obtain qualified or unqualified information;

Step 7: judge whether the matching evaluation index is qualified or unqualified information; if qualified, send the qualified information and the ID of the corresponding feature sound data to the workshop inspection center; if unqualified, send the unqualified information to the workshop inspection center and prompt that the product be rejected;

Step 8: obtain the detection result, completing one detection.

2. The online sound detection method for a vehicle electronic control unit according to claim 1, characterized in that, in Step 3, the preliminary framing of the acquired sound data proceeds as follows:

Step 31: write the sound data acquired in Step 2 into SRAM;

3. The online sound detection method for a vehicle electronic control unit according to claim 2, characterized in that the coarse detection process described in Step 4 is:

The filter h(x) is:

The filter output F(f) is:

where f is the frame number;

Step 4 A4-2: compute the initial value F(0) of the threshold iteration:

Step 4 A4-2: compute the threshold F(k) at the k-th iteration;

4. The online sound detection method for a vehicle electronic control unit according to claim 3, characterized in that the fine detection process described in Step 4 is:

Step 4 B2: compute the zero-crossing rate of the sound data framed in Step 3;

When, for the second time, the zero-crossing rates Z_f of three consecutive frames f, f+1, f+2 are all greater than the zero-crossing rate threshold Z_thr = 10 and the zero-crossing rates of frames f-3, f-2, f-1 are all less than Z_thr = 10, record the frame number f_9 = f and take f_9 as the second fine-detection start point of the sound data; f_4 < f_9;

If f_A is the end point of a sound segment, then the sound-segment duration is L_y = 8 × (f_C - f_B), in milliseconds, and the silent-segment duration is L_w = 8 × (f_B - f_A), in milliseconds;

If f_A is the start point of a sound segment, then the sound-segment duration is L_y = 8 × (f_B - f_A), in milliseconds, and the silent-segment duration is L_w = 8 × (f_C - f_B), in milliseconds.

5. The online sound detection method for a vehicle electronic control unit according to claim 4, characterized in that, in Step 5, the feature values of the sound data are extracted to obtain the four-dimensional feature vector describing the sound data as follows:

6. The online sound detection method for a vehicle electronic control unit according to claim 1 or 4, characterized in that the feature sound data sample library described in Step 6 is built as follows:

Step 6 A2: assign an ID to each kind of feature sound;

7. The online sound detection method for a vehicle electronic control unit according to claim 6, characterized in that, in Step 6, the four-dimensional feature vector obtained in Step 5 is matched against the standard signal feature vectors in the feature sound data sample library, and the matching evaluation index is calculated to obtain qualified or unqualified information, as follows:

Step 6 B1: compare the four-dimensional feature vector obtained in Step 5 one by one with the standard signal feature vectors in the feature sound data sample library, and compute the relative error of each of its four components: the relative error of the difference between the end frame and the start frame of the sound segment, the relative error of the difference between the start frame of the sound segment and the end frame of the preceding sound segment, the relative error of the characteristic frequency of the sound segment, and the relative error of the instantaneous power at the characteristic frequency;

8. The online sound detection method for a vehicle electronic control unit according to claim 1, characterized in that the coarse detection process described in Step 4 is:

The filter h(x) is:

The filter output F(f) is:

where f is the frame number;

Step 4 A4-2: compute the initial value F(0) of the threshold iteration:

Step 4 A4-2: compute the threshold F(k) at the k-th iteration;