CN109147759A

CN109147759A - A shortwave speech signal diversity combining reception method based on a scoring algorithm

Info

Publication number: CN109147759A
Application number: CN201811172837.9A
Authority: CN
Inventors: 崔亚笛; 董彬虹; 张存林; 曹蕾; 赵宇轩; 李千饶
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2018-10-09
Filing date: 2018-10-09
Publication date: 2019-01-04

Abstract

The invention discloses a shortwave speech signal diversity combining reception method based on a scoring algorithm. It mainly processes enhanced single-sideband (SSB) analog speech signals received over multiple channels, and relates to shortwave speech signal diversity combining reception. To address the shortcomings of existing combining methods, the invention proposes a diversity combining reception method based on a scoring algorithm: speech segments are combined by weighting according to their scores, and silent segments are combined with equal gain. Compared with existing combining methods, the intelligibility of the speech signal is greatly improved. Using score-based weighted combining in speech segments and equal gain combining in silent segments compensates for the speech distortion and noise amplification that equal gain combining may cause, while the weighting based on the scoring results improves the quality of the speech segment signal.

Description

A shortwave speech signal diversity combining reception method based on a scoring algorithm

Technical field

The present invention mainly processes enhanced single-sideband (SSB) analog speech signals received over multiple channels, and relates to a shortwave speech signal diversity combining reception method.

Background technique

Speech endpoint detection means separating speech segments from non-speech segments in a signal that contains speech, including identifying the start and end points within the signal. Effective endpoint detection not only excludes noise interference from silent segments and improves the real-time performance of the system, but also reduces processing time, which substantially improves subsequent recognition performance. The present invention uses an endpoint detection method based on the energy-to-entropy ratio, which gives more accurate detection results.
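The energy-to-entropy idea can be illustrated with a minimal sketch. The patent does not give its formula, so the version below is an assumption: it uses one common formulation, sqrt(1 + energy / spectral entropy) per frame, and a simple threshold placed a fixed fraction above the minimum ratio. The function names, frame sizes, and `threshold_scale` are all hypothetical.

```python
import numpy as np

def energy_entropy_ratio(signal, frame_len=256, hop=128, eps=1e-12):
    """Per-frame energy-to-spectral-entropy ratio (assumed formulation)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    ratios = np.empty(n_frames)
    for k in range(n_frames):
        frame = signal[k * hop : k * hop + frame_len] * window
        energy = np.sum(frame ** 2)
        # Normalise the power spectrum so it can be treated as a probability mass.
        power = np.abs(np.fft.rfft(frame)) ** 2
        prob = power / (np.sum(power) + eps)
        entropy = -np.sum(prob * np.log(prob + eps))
        # Speech frames tend to have high energy and low spectral entropy.
        ratios[k] = np.sqrt(1.0 + energy / (entropy + eps))
    return ratios

def detect_speech_frames(signal, threshold_scale=0.1, **frame_args):
    """Boolean mask of frames judged to be speech (hypothetical threshold rule)."""
    ratios = energy_entropy_ratio(signal, **frame_args)
    floor = ratios.min()
    threshold = floor + threshold_scale * (ratios.max() - floor)
    return ratios > threshold
```

Consecutive `True` frames in the returned mask would then be grouped into speech segments, and the remaining frames into silent segments.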

Scoring is a no-reference objective speech assessment scheme, suitable for predicting speech quality when no independent reference signal is available. For this reason the method was proposed for assessing the speech quality of an unknown far-end source in telephony, and for live network monitoring and assessment. The scoring can predict speech quality as perceived by a listener; it is not restricted to end-to-end measurement and can be used at any point in the transmission chain. Note that the algorithm is not a comprehensive assessment of transmission quality: it measures only the effect of one-way speech distortion and noise on speech quality. Its results can be examined through listening tests, which rate the received quality on an absolute rating scale. Because the algorithm simulates human quality perception at the receiving side, degradations introduced by the receiving terminal and other actual monitoring equipment are not taken into account. And because the algorithm predicts a listening score, impairments that affect talking quality and conversational quality are not considered; that is, loudness loss, sidetone, delay, echo, and other impairments related to two-way interaction are not reflected in the score. A very high score is therefore possible without the overall quality being the best.

Equal gain combining, also called phase equalization, applies no amplitude weighting to the branches and only corrects the phase of each branch so that they add in phase. Although this reduces the number of parameters to estimate, combining performance is still affected by estimation error and by imbalance between the branches. When branch quality differs greatly, the weak branch is amplified by the same factor as the others before it takes part in combining, which introduces more noise, so the combined output may show a combining loss rather than a combining gain.
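A small worked example shows how equal gain combining can turn into a loss. The model here is an assumption made for illustration: the branches are already co-phased, branch i carries the same signal with amplitude a_i, and the branch noises are independent with powers n_i, so the output SNR is (sum of a_i)^2 / (sum of n_i). The function names are hypothetical.

```python
import math

def branch_snr_db(amplitude, noise_power):
    """SNR of a single branch in dB."""
    return 10 * math.log10(amplitude ** 2 / noise_power)

def egc_output_snr_db(amplitudes, noise_powers):
    """Output SNR in dB after equal gain combining of co-phased branches.

    Signal amplitudes add coherently; independent noise powers add.
    """
    signal_power = sum(amplitudes) ** 2
    noise_power = sum(noise_powers)
    return 10 * math.log10(signal_power / noise_power)

# Balanced branches: combining gives a gain over either branch.
balanced = egc_output_snr_db([1.0, 1.0], [0.1, 0.1])
# One weak branch (same noise, one tenth of the amplitude): combining loses.
unbalanced = egc_output_snr_db([1.0, 0.1], [0.1, 0.1])
best_branch = branch_snr_db(1.0, 0.1)
```

With these numbers the best single branch sits at 10 dB, the balanced case comes out at about 13.0 dB, and the unbalanced case at about 7.8 dB, which is below the best branch alone.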

Summary of the invention

To address the shortcomings of existing combining methods, the present invention proposes a diversity combining reception method based on a scoring algorithm. Speech segments are combined by weighting according to their scores, and silent segments are combined with equal gain. Compared with existing combining methods, the intelligibility of the speech signal is greatly improved.

The technical scheme is as follows:

When combining multi-channel speech signals, the speech segments are scored with the scoring method according to the endpoint detection results of the speech signals, and weighted combining is performed according to the scoring results; equal gain combining is applied to the silent segments.

The technical solution of the present invention is a shortwave speech signal diversity combining reception method based on a scoring algorithm, the method comprising:

Step 1: determine the positions of the speech segments and silent segments according to the endpoint detection results of the speech signals;

Step 2: score the current speech segment of each signal: $w_i = P(x, \mathrm{length})$, $i = 1, 2, \dots, L$

where length is the length of the signal x to be scored, $w_i$ is the scoring result, i denotes the i-th signal, L is the total number of speech signals, and $P(\cdot)$ is the scoring function. The scoring algorithm is implemented as follows. First, the eight key parameters required by the algorithm are computed from the input signal: pitch period, kurtosis of the linear prediction coefficients, signal-to-noise ratio, mechanical (robotic) sound parameter, number of sharp declines, duration of speech interruptions, mute length, and estimated local signal-to-noise ratio. Second, the signal is matched to one of six distortion types according to these eight key parameters. Each distortion type has 12 characteristic parameters, and the characteristic parameters of different distortion types are not exactly the same; the 12 characteristic parameters are weighted to obtain an intermediate value. Finally, this intermediate value is weighted together with another 11 characteristic parameters to obtain the scoring result;

Step 3: apply equal gain combining to the silent segment preceding the current speech segment of each signal:

$$y(t) = \sum_{i=1}^{L} w_i x_i(t)$$

Here $w_i = 1$, $y(t)$ denotes the speech signal output after weighted combining, and $x_i(t)$ denotes the enhanced single-channel speech signal;

Step 4: apply weighted combining to the current speech segment of each signal, using the scoring result $w_i$ as the weighting coefficient;

Step 5: increment the speech segment index; if the index does not exceed the number of speech segments in the first signal, return to Step 2 and continue;

Step 6: if the index exceeds the number of speech segments in the first signal, apply equal gain combining to the final silent segment;

Step 7: perform endpoint detection again on the combined speech, and normalize the combined speech signal.
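The steps above can be sketched as a single loop over the speech segments. This is a minimal illustration under stated assumptions, not the patented implementation: the branches are assumed to be time-aligned arrays of equal length, `speech_segments` is assumed to come from endpoint detection on the first signal, `score_fn` is a hypothetical stand-in for the scoring function P(x, length), and peak normalization is one possible reading of the normalization in Step 7. The repeated endpoint detection of Step 7 is left out.

```python
import numpy as np

def combine_by_segments(branches, speech_segments, score_fn):
    """Score-weighted combining in speech segments, equal gain elsewhere.

    branches        : array-like of shape (L, N), time-aligned enhanced signals
    speech_segments : sorted, non-overlapping (start, end) sample index pairs
    score_fn        : hypothetical scorer, score_fn(x, length) -> float
    """
    branches = np.asarray(branches, dtype=float)
    n_samples = branches.shape[1]
    output = np.zeros(n_samples)
    cursor = 0  # first sample not yet combined
    for start, end in speech_segments:  # Step 5: advance the segment index
        # Step 3: equal gain combining (w_i = 1) of the preceding silent segment.
        output[cursor:start] = branches[:, cursor:start].sum(axis=0)
        # Step 2: score the current speech segment of every branch.
        weights = np.array([score_fn(x[start:end], end - start) for x in branches])
        # Step 4: weighted combining with the scores as coefficients.
        output[start:end] = weights @ branches[:, start:end]
        cursor = end
    # Step 6: equal gain combining of the final silent segment.
    output[cursor:] = branches[:, cursor:].sum(axis=0)
    # Step 7 (partial): peak normalization, an assumed choice of normalization.
    peak = np.max(np.abs(output))
    return output / peak if peak > 0 else output
```

The scores are used as raw weights, as the text describes; whether they should also be normalized to sum to one is not stated in the source and is left open here.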

The beneficial effects of the present invention are as follows: by using score-based weighted combining in speech segments and equal gain combining in silent segments, the invention compensates for the speech distortion and noise amplification that equal gain combining may cause, and the weighted combining based on the scoring results improves the quality of the speech segment signal.

Detailed description of the invention

Fig. 1 is the flow chart of the scoring method of the present invention;

Fig. 2 is the block diagram of diversity combining based on the scoring method;

Fig. 3 compares performance before and after diversity combining of simulated speech signals;

Fig. 4 compares performance before and after diversity combining of real speech signals.

Specific embodiment

The technical solution of the present invention is described in detail below with reference to the accompanying drawings and embodiments. This should not be construed as limiting the scope of the above subject matter of the invention to the following embodiments; all techniques realized on the basis of the content of the present invention fall within its scope.

Fig. 1 shows the flow chart of the scoring algorithm of the present invention.

The scoring method divides speech distortion into six classes, and determines the distortion type from the values of the eight key parameters according to preset priorities.

The distortion type with the highest priority is background noise, which is determined from the signal-to-noise ratio of the signal. Background noise can severely degrade speech quality; the Mean Opinion Score (MOS) of most speech containing background noise generally lies in the range 1 to 3. Interruption distortion means that the signal contains mutes or interruptions, that is, the signal level changes abruptly. Multiplicative noise distortion refers to noise in the speech signal that is correlated with the signal envelope; this distortion occurs only in active speech portions. Mechanical (robotic) sounding speech is closely related to the pitch of the voice. The distortion type with the lowest priority is the overall unnaturalness of the speech. Because the output quality of speech codecs depends on the speaker's gender, the scoring method splits this distortion type into a male-voice case and a female-voice case based on the fundamental frequency.

Because the human ear perceives different types of speech distortion differently, the scoring method sets different values for the parameters of the result mapping model depending on the specific distortion type. Each distortion type involves 12 different speech features. According to the distortion type of the speech under test, the scoring method forms a linear combination of the 12 features using the corresponding perceptual weights to obtain an intermediate evaluation value, and then combines this intermediate value with 11 further characteristic parameters to obtain the final result.
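The structure of the scoring stage, priority-based classification followed by two weighted combinations, can be sketched as follows. Everything numeric here is hypothetical: the source gives neither thresholds nor weights, so the per-class flags are taken as already computed from the eight key parameters, the 160 Hz pitch split is an illustrative placeholder, and the `model` dictionary with its keys is an invented container for the per-class weights.

```python
import numpy as np

# Classes checked in priority order, as described in the text; the
# lowest-priority class (overall unnaturalness) is the fallback.
DISTORTION_PRIORITY = [
    "background_noise",
    "interruption",
    "multiplicative_noise",
    "robotic_voice",
]

def classify_distortion(flags, pitch_hz, pitch_split_hz=160.0):
    """Return the first flagged class in priority order.

    flags          : dict of class name -> bool (hypothetical threshold tests)
    pitch_split_hz : hypothetical fundamental-frequency boundary between
                     the male-voice and female-voice unnaturalness cases
    """
    for name in DISTORTION_PRIORITY:
        if flags.get(name, False):
            return name
    return "unnatural_male" if pitch_hz < pitch_split_hz else "unnatural_female"

def score(features12, features11, model):
    """Two-stage linear mapping: 12 features -> intermediate -> final score."""
    # Stage 1: class-specific perceptual weights over the 12 features.
    intermediate = float(np.dot(model["w12"], features12)) + model["b12"]
    # Stage 2: combine the intermediate value with 11 further features.
    return (model["w0"] * intermediate
            + float(np.dot(model["w11"], features11))
            + model["b"])
```

In use, one `model` would be selected per distortion class returned by `classify_distortion`, which is how the mapping parameters come to differ between distortion types.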

Fig. 2 shows the block diagram of diversity combining based on the scoring method.

The principle of diversity combining is to combine copies of the same signal carried on two or more independent paths, using one of several strategies, in order to increase the instantaneous and average signal-to-noise ratio of the received signal and improve system performance. It rests on a physical phenomenon: when the signal on one path is in a deep fade, the probability that the other independent paths are in a deep fade at the same time is very low. One or more of the signals can therefore be selected and combined, which raises the output signal-to-noise ratio at the receiver.
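The deep-fade argument can be made concrete with a one-line calculation. The sketch assumes the branches fade independently and that each is in a deep fade with the same probability; both the function name and the example probability are illustrative choices, not values from the source.

```python
def prob_all_branches_faded(p_single, n_branches):
    """Probability that every branch is in a deep fade at once.

    Assumes independent branches, each faded with probability p_single.
    """
    return p_single ** n_branches

# Example: a 10% chance of a deep fade per branch.
one_branch = prob_all_branches_faded(0.1, 1)
three_branches = prob_all_branches_faded(0.1, 3)
```

Under these assumptions, going from one branch to three reduces the chance of a simultaneous deep fade from one in ten to about one in a thousand, which is why at least one usable branch is almost always available to the combiner.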

Diversity combining has two aspects. The first is distributed transmission, which lets the receiver obtain multiple statistically independent fading signals that carry the same information. The second is centralized combining, in which the receiver combines the statistically independent fading signals it has received according to some strategy in order to reduce the effect of fading. The most important condition for obtaining diversity gain is therefore that the signals be uncorrelated with one another.

Diversity techniques give the receiver several mutually independent branch signals, but how the receiver should combine them to improve the output signal-to-noise ratio is the problem that combining methods address. Fig. 2 shows the block diagram of an L-branch combining structure, where $w_i$ ($i = 1, 2, \dots, L$) is the weighting coefficient of the i-th receiving branch. If the signal received on the i-th branch is $x_i(t)$, the combined output signal $y(t)$ can be expressed as

$$y(t) = \sum_{i=1}^{L} w_i x_i(t)$$

Choosing different weighting coefficients $w_i$ yields different combining strategies.
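How the choice of weights selects the strategy can be shown with a short sketch. It assumes the combining rule is the weighted sum y(t) = sum over i of w_i x_i(t), which is what the definitions of w_i, x_i(t) and y(t) imply. The helper names are hypothetical, and selection combining is included as a standard strategy for comparison rather than as part of the patented method.

```python
import numpy as np

def combine(branches, weights):
    """Linear combiner: y(t) = sum_i w_i * x_i(t)."""
    branches = np.asarray(branches, dtype=float)  # shape (L, N)
    weights = np.asarray(weights, dtype=float)    # shape (L,)
    return weights @ branches

def equal_gain_weights(n_branches):
    """Equal gain combining: every branch gets weight 1."""
    return np.ones(n_branches)

def selection_weights(scores):
    """Selection combining: keep only the best-scoring branch."""
    weights = np.zeros(len(scores))
    weights[int(np.argmax(scores))] = 1.0
    return weights

def score_weights(scores):
    """Score-weighted combining: the scoring results are the weights."""
    return np.asarray(scores, dtype=float)
```

The same `combine` call serves all three cases, so switching between equal gain in silent segments and score weighting in speech segments only changes the weight vector passed in.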

The performance before and after diversity combining was compared for simulated and real signals using MATLAB; the results are shown in Fig. 3 and Fig. 4 respectively. Fig. 3 shows the difference in scoring results before and after diversity combining for 12 segments of simulated signals. As the figure shows, the scores of the speech segments after combining are clearly higher than those of the segments before combining, an improvement of about 1 point. Fig. 4 shows the difference in scoring results before and after diversity combining for 12 segments of real signals. As the figure shows, the scores of the speech segments after combining are slightly higher than those before combining, an improvement of about 0.5 to 1 point. The present invention therefore improves speech quality on top of speech enhancement and achieves a satisfactory result.

Claims

1. A shortwave speech signal diversity combining reception method based on a scoring algorithm, the method comprising:

Step 1: determine the positions of the speech segments and silent segments according to the endpoint detection results of the speech signals;

Step 2: score the current speech segment of each signal: $w_i = P(x, \mathrm{length})$, $i = 1, 2, \dots, L$

where length is the length of the signal x to be scored, $w_i$ is the scoring result, i denotes the i-th signal, L is the total number of speech signals, and $P(\cdot)$ is the scoring function. The scoring algorithm is implemented as follows. First, the eight key parameters required by the algorithm are computed from the input signal: pitch period, kurtosis of the linear prediction coefficients, signal-to-noise ratio, mechanical (robotic) sound parameter, number of sharp declines, duration of speech interruptions, mute length, and estimated local signal-to-noise ratio. Second, the signal is matched to one of six distortion types according to these eight key parameters. Each distortion type has 12 characteristic parameters, and the characteristic parameters of different distortion types are not exactly the same; the 12 characteristic parameters are weighted to obtain an intermediate value. Finally, this intermediate value is weighted together with another 11 characteristic parameters to obtain the scoring result;

Step 3: apply equal gain combining to the silent segment preceding the current speech segment of each signal:

$$y(t) = \sum_{i=1}^{L} w_i x_i(t)$$

Here $w_i = 1$, $y(t)$ denotes the speech signal output after weighted combining, and $x_i(t)$ denotes the enhanced single-channel speech signal;

Step 4: apply weighted combining to the current speech segment of each signal, using the scoring result $w_i$ as the weighting coefficient;

Step 5: increment the speech segment index; if the index does not exceed the number of speech segments in the first signal, return to Step 2 and continue;

Step 6: if the index exceeds the number of speech segments in the first signal, apply equal gain combining to the final silent segment;

Step 7: perform endpoint detection again on the combined speech, and normalize the combined speech signal.