CN112634871B - Lie detection method and system based on voice and radar dual sensors - Google Patents

Lie detection method and system based on voice and radar dual sensors

Info

Publication number
CN112634871B
CN112634871B · CN202011492568.1A
Authority
CN
China
Prior art keywords
feature set
respiratory
voice
heartbeat
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011492568.1A
Other languages
Chinese (zh)
Other versions
CN112634871A
Inventor
洪弘
李新
李彧晟
孙理
顾陈
朱晓华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN202011492568.1A
Publication of CN112634871A
Application granted
Publication of CN112634871B
Legal status: Active

Classifications

    • G10L15/02: Feature extraction for speech recognition; selection of recognition unit
    • G10L15/063: Training of speech recognition systems (G10L2015/0631: creating reference templates; clustering)
    • G10L15/08: Speech classification or search
    • G10L21/0208: Speech enhancement; noise filtering
    • G10L25/24: Speech or voice analysis in which the extracted parameters are the cepstrum
    • A61B5/0205: Simultaneously evaluating both cardiovascular conditions and other body conditions, e.g. heart and respiratory condition
    • A61B5/024: Detecting, measuring or recording pulse rate or heart rate
    • A61B5/08: Detecting, measuring or recording devices for evaluating the respiratory organs
    • A61B5/164: Lie detection
    • A61B5/4803: Speech analysis specially adapted for diagnostic purposes
    • A61B5/7203: Signal processing for noise prevention, reduction or removal
    • A61B5/7235: Details of waveform analysis
    • A61B5/7264, A61B5/7267: Classification of physiological signals or data, e.g. using neural networks or statistical classifiers, involving training the classification device
    • G06F18/253: Fusion techniques of extracted features
    • Y02A90/10: Information and communication technologies supporting adaptation to climate change


Abstract

The invention discloses a lie detection method and system based on voice and radar dual sensors. First, a microphone and a radar synchronously acquire a voice signal and a radar signal; second, the voice signal and the radar signal are each preprocessed; then a voice feature set, a respiratory feature set and a heartbeat feature set are extracted according to the psychological and physiological characteristics of lying; next, the three feature sets are fused using a feature fusion technique; finally, a machine learning classifier performs lie detection and identification. The invention effectively combines the complementary advantages of the two non-contact sensors, performs reliably, and detects lies accurately. It causes no discomfort to the subject, overcomes the limitations of voice alone, effectively improves lie detection accuracy, and offers high reliability and wide applicability.

Description

Lie detection method and system based on voice and radar dual sensors
Technical Field
The invention belongs to the field of lie detection, and particularly relates to a lie detection method and system based on voice and radar dual sensors.
Background
Lying is a special behavior in which, in certain situations, a person conceals the true state of affairs and wins the other party's trust through false descriptions. Lie detection technology is an interdisciplinary technology that fuses psychology, physiology, linguistics, cognitive science, statistics, sensor technology, pattern recognition, criminal investigation and other subjects; it is of great significance for understanding human behavior and, in particular, for assisting criminal investigation and judicial cases. Unaided human judgment of lies amounts to subjective guessing and has low accuracy, so a scientific and reliable system and method are needed to guide lie detection.
In the past, lie detection mainly used multi-channel physiological instruments to detect physiological parameters and their abnormal changes during lying. This method has a certain reliability, because humans involuntarily produce physiological reactions while lying, such as an accelerated heartbeat and suppressed breathing. However, it requires the subject to wear multiple sensors, which can cause discomfort and even a strong sense of oppression that may affect the measurement result. At present, lie detection is also commonly performed by means of voice analysis, which examines the acoustic or semantic features of lies from the perspective of speech or semantics. Although this achieves non-contact measurement, it is greatly limited because speaking habits and expression content differ greatly from person to person.
Disclosure of Invention
The invention aims to provide a lie detection method and system based on voice and radar dual sensors that offer high reliability and wide applicability, acquire voice, respiration and heartbeat signals without contact, and fuse the three signals to detect lies.
The technical solution realizing the purpose of the invention is as follows: a lie detection method based on voice and radar dual sensors, the method comprising the following steps:
Step 1, synchronously acquiring a voice signal and a radar signal by using a microphone and a continuous wave radar;
step 2, noise reduction, sound event detection and pre-emphasis preprocessing are carried out on the voice signals acquired in the step 1;
step 3, extracting fundamental frequency, voicing probability, short-time zero-crossing rate, frame root-mean-square energy and mel cepstral coefficient features from the preprocessed voice signal obtained in step 2, and applying 6 statistical parameters (maximum, minimum, mean, standard deviation, skewness and kurtosis) to these 5 features to obtain a voice feature set X;
step 4, demodulating and filtering the radar signals acquired in the step 1 to obtain respiratory signals and heartbeat signals;
step 5, carrying out time domain, frequency domain and nonlinear feature extraction on the respiratory signal and the heartbeat signal obtained in the step 4 respectively to obtain a respiratory feature set R and a heartbeat feature set H;
step 6, fusing the respiratory feature set R obtained in the step 5 with the heartbeat feature set H to obtain a physiological feature set Y, and then performing feature fusion on the physiological feature set Y and the voice feature set X obtained in the step 3 to obtain a fusion feature set Z;
and 7, training a classifier by using the fusion feature set Z obtained in the step 6, and performing lie detection classification on the voice sample.
Further, in step 2, the noise reduction of the voice signal acquired in step 1 specifically comprises:
step 2-1, recording a noise sample of the noise whose decibel level exceeds a preset threshold, i.e. the main noise;
step 2-2, generating a noise-profile file from the noise sample obtained in step 2-1 using the SoX audio processing program, and performing primary noise reduction on the voice signal according to the noise profile to remove the main noise;
step 2-3, performing secondary noise reduction on the once-denoised voice signal obtained in step 2-2 using improved spectral subtraction, removing noise types other than the recorded noise sample to obtain a clean voice signal.
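As an illustration of this two-stage scheme, a minimal sketch follows; the SoX `noiseprof`/`noisered` invocation is that program's standard noise-reduction workflow, while the second stage is a generic over-subtraction (Berouti-style) spectral subtraction, since the patent does not spell out its 'improved spectral subtraction'. File names, frame sizes and subtraction parameters are illustrative assumptions.

```python
import subprocess
import numpy as np
from scipy.signal import stft, istft

def sox_denoise(noisy_wav, noise_wav, out_wav, amount=0.21):
    """Stage 1: build a noise profile from the recorded noise sample
    and subtract it with SoX to remove the main noise."""
    subprocess.run(["sox", noise_wav, "-n", "noiseprof", "noise.prof"], check=True)
    subprocess.run(["sox", noisy_wav, out_wav, "noisered", "noise.prof",
                    str(amount)], check=True)

def spectral_subtract(x, fs, noise_frames=10, alpha=4.0, beta=0.01):
    """Stage 2: over-subtraction spectral subtraction for residual noise;
    the first `noise_frames` STFT frames are assumed to be speech-free."""
    f, t, Z = stft(x, fs=fs, nperseg=512)
    mag, phase = np.abs(Z), np.angle(Z)
    noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
    clean = np.maximum(mag - alpha * noise_mag, beta * mag)  # spectral floor
    _, x_hat = istft(clean * np.exp(1j * phase), fs=fs, nperseg=512)
    return x_hat
```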
Further, in the step 5, time domain, frequency domain and nonlinear feature extraction are performed on the respiratory signal and the heartbeat signal obtained in the step 4 respectively to obtain a respiratory feature set R and a heartbeat feature set H, which specifically include:
step 5-1, carrying out time domain, frequency domain and nonlinear feature extraction on the respiratory signal to obtain a respiratory feature set R;
A. time-domain features: extracting the respiratory amplitude mean, respiratory amplitude standard deviation, respiratory average amplitude difference and respiratory normalized average amplitude difference as the time-domain features of the respiratory signal; wherein,
(1) The respiratory amplitude mean $\mu_x$ reflects the average respiratory amplitude during lie detection:
$$\mu_x = \frac{1}{N}\sum_{n=1}^{N} X(n)$$
where $X(n)$ is the $n$th sample of the respiration sequence, $N$ is the total number of samples, and $1 \le n \le N$;
(2) The respiratory amplitude standard deviation $\sigma_x$ reflects the overall variation of respiration during lie detection:
$$\sigma_x = \sqrt{\frac{1}{N}\sum_{n=1}^{N} \left(X(n)-\mu_x\right)^2}$$
(3) The respiratory average amplitude difference $\Delta_x$ reflects the short-time variation of the respiratory amplitude during lie detection:
$$\Delta_x = \frac{1}{N-1}\sum_{n=2}^{N} \left|X(n)-X(n-1)\right|$$
(4) The respiratory normalized average amplitude difference $\Delta_{rx}$ reflects the influence of the short-time variation of the respiratory amplitude on the overall variation during lie detection:
$$\Delta_{rx} = \frac{\Delta_x}{\sigma_x}$$
B. frequency-domain features: extracting the mean power-spectrum amplitude of the respiratory low frequency band $F_L$, respiratory middle frequency band $F_M$ and respiratory high frequency band $F_H$ as the frequency-domain features of the respiratory signal; wherein $F_L < p_1$ Hz, $p_1$ Hz $\le F_M < p_2$ Hz, $F_H > p_2$ Hz;
C. nonlinear features: extracting the respiratory detrended fluctuation scaling exponent and the respiratory sample entropy as the nonlinear features of the respiratory signal;
(1) Respiratory detrended fluctuation scaling exponent
The respiratory detrended fluctuation scaling exponent reflects the non-stationary characteristics of the respiratory signal during lie detection, and is computed as follows:
1) Given the respiration sequence $X(n)$, calculate its mean $\mu_x$;
2) Calculate the cumulative deviation series $y(n)$ of the respiration sequence:
$$y(n) = \sum_{k=1}^{n}\left(X(k)-\mu_x\right)$$
3) Divide $y(n)$ into $a$ non-overlapping windows of length $b$;
4) Fit a local trend $y_b(n)$ to each window by least squares, remove the local trend of each window to obtain a detrended sequence, and calculate its root mean square $F(b)$:
$$F(b) = \sqrt{\frac{1}{ab}\sum_{n=1}^{ab}\left(y(n)-y_b(n)\right)^2}$$
5) Change the window length $b$ and repeat the above steps until the required amount of data is obtained;
6) From the quantities calculated above, plot the curve with $\log b$ as abscissa and $\log F(b)$ as ordinate; the slope of this curve is the detrended fluctuation scaling exponent of the respiration sequence;
(2) Respiratory sample entropy
The respiratory sample entropy evaluates the complexity of the respiratory signal during lie detection, and is computed as follows:
1) Denote the respiration time series as $X(n)$, $1 \le n \le N$; with window length $m$, divide it into $s = N-m+1$ respiration subsequences:
$$X_m(t) = \left(X(t), X(t+1), \ldots, X(t+m-1)\right),\quad 1 \le t \le N-m+1$$
where $X_m(t)$ is the $t$th respiration subsequence;
2) Define the distance between subsequences $X_m(i)$ and $X_m(j)$ as the maximum absolute difference of their corresponding elements, and calculate the distance $d_{ij}$ between each respiration subsequence and all other respiration subsequences:
$$d_{ij} = \max_{k=0,\ldots,m-1}\left(\left|X_m(i+k)-X_m(j+k)\right|\right)$$
where $1 \le i \le N-m$, $1 \le j \le N-m$, and $i \ne j$;
3) Calculate the respiratory amplitude standard deviation $\sigma_x$ and define the threshold $F = r\sigma_x$, where $r$ is a constant taken between 0.1 and 0.25; record the ratio of the number of distances $d_{ij} \le F$ calculated in 2) to $s$ as $B_m(i)$, and calculate the mean $\phi_m$ of $B_m(i)$ over all subsequences:
$$\phi_m = \frac{1}{s}\sum_{i=1}^{s} B_m(i)$$
4) Change the window length to $m+1$ and repeat steps 1) to 3) to obtain $\phi_{m+1}$;
5) Calculate the respiratory sample entropy:
$$\mathrm{SampEn} = \ln\phi_m - \ln\phi_{m+1}$$
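By way of illustration, a compact sketch of these two nonlinear features is given below; it assumes the sequence is longer than the largest analysis window, the window scales are illustrative, and the sample-entropy routine is a simplified vectorized variant of the steps above (the same routines apply to the heartbeat signal).

```python
import numpy as np

def dfa_exponent(x, scales=(4, 8, 16, 32, 64)):
    """Detrended fluctuation scaling exponent: slope of log F(b) vs log b."""
    x = np.asarray(x, float)
    y = np.cumsum(x - x.mean())                    # cumulative deviation series
    F = []
    for b in scales:
        n_win = len(y) // b                        # non-overlapping windows
        segs = y[:n_win * b].reshape(n_win, b)
        t = np.arange(b)
        res = [seg - np.polyval(np.polyfit(t, seg, 1), t) for seg in segs]
        F.append(np.sqrt(np.mean(np.square(res)))) # RMS of detrended series
    slope, _ = np.polyfit(np.log(scales), np.log(F), 1)
    return slope

def sample_entropy(x, m=2, r=0.2):
    """SampEn = ln(phi_m) - ln(phi_{m+1}) with tolerance F = r * std(x)."""
    x = np.asarray(x, float)
    tol = r * x.std()
    def phi(m):
        emb = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
        d = np.max(np.abs(emb[:, None] - emb[None, :]), axis=2)  # Chebyshev distance
        np.fill_diagonal(d, np.inf)                # exclude self-matches (i != j)
        return np.mean(np.sum(d <= tol, axis=1) / (len(emb) - 1))
    return np.log(phi(m)) - np.log(phi(m + 1))
```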
step 5-2, performing time-domain, frequency-domain and nonlinear feature extraction on the heartbeat signal to obtain a heartbeat feature set H;
A. time-domain features: extracting the heartbeat amplitude mean, heartbeat amplitude standard deviation, heartbeat average amplitude difference and heartbeat normalized average amplitude difference as the time-domain features of the heartbeat signal, calculated in the same way as in the respiratory feature extraction part;
B. frequency-domain features: extracting the mean power-spectrum amplitude of the heartbeat low frequency band $F_L'$, heartbeat middle frequency band $F_M'$ and heartbeat high frequency band $F_H'$ as the frequency-domain features of the heartbeat signal; wherein $F_L' < p_3$ Hz, $p_3$ Hz $\le F_M' < p_4$ Hz, $F_H' > p_4$ Hz, with $p_3 > p_1$ and $p_4 > p_2$;
C. nonlinear features: extracting the heartbeat detrended fluctuation scaling exponent and the heartbeat sample entropy as the nonlinear features of the heartbeat signal, calculated in the same way as in the respiratory feature extraction part.
Further, the specific process of step 6 includes:
step 6-1, serially fusing the respiratory feature set R and the heartbeat feature set H to obtain the physiological feature set Y:
$$Y = [R\;H]$$
Step 6-2, calculating the vector mean of each class and of all feature data for the voice feature set X and the physiological feature set Y; the process is described below for the voice feature set X, and Y is treated in the same way:
$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$$
where $\bar{x}_i$ is the vector mean of the $i$th class of samples of the voice feature set X, $x_{ij}$ is the $j$th sample of the $i$th class, and $n_i$ is the number of samples of the $i$th class;
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} n_i \bar{x}_i$$
where $\bar{x}$ is the vector mean of all sample data of the voice feature set X, $k$ is the number of classes, and $n$ is the total number of samples;
step 6-3, calculating projection matrices for the voice feature set X and the physiological feature set Y so as to minimize the between-class correlation within each feature set; the calculation is described below for X, and Y is treated in the same way:
First, calculate the between-class scatter matrix $S_{bx}$ of the voice feature set X:
$$S_{bx} = \sum_{i=1}^{k} n_i(\bar{x}_i-\bar{x})(\bar{x}_i-\bar{x})^T = \varphi_{bx}\varphi_{bx}^T$$
where
$$\varphi_{bx} = \left[\sqrt{n_1}(\bar{x}_1-\bar{x}),\;\sqrt{n_2}(\bar{x}_2-\bar{x}),\;\ldots,\;\sqrt{n_k}(\bar{x}_k-\bar{x})\right]$$
When the between-class correlation is minimal, $\varphi_{bx}^T\varphi_{bx}$ is a diagonal matrix; moreover, since $\varphi_{bx}^T\varphi_{bx}$ is symmetric positive semidefinite, the following transformation exists:
$$P^T\left(\varphi_{bx}^T\varphi_{bx}\right)P = \hat{\Lambda}$$
where $P$ is the orthogonal eigenvector matrix and $\hat{\Lambda}$ is the diagonal matrix of non-negative real eigenvalues sorted in descending order;
Let $Q$ consist of the $r$ eigenvectors of $P$ corresponding to the $r$ largest non-zero eigenvalues, so that
$$Q^T\left(\varphi_{bx}^T\varphi_{bx}\right)Q = \Lambda$$
Then the projection matrix $W_{bx}$ of X is obtained as
$$W_{bx} = \varphi_{bx}\,Q\,\Lambda^{-1/2}$$
which satisfies $W_{bx}^T S_{bx} W_{bx} = I$; the projection matrix $W_{by}$ of Y is obtained in the same way;
Step 6-4, projecting X and Y with the projection matrices calculated in step 6-3 to obtain the projected voice feature set $X_p$ and physiological feature set $Y_p$:
$$X_p = W_{bx}^T X$$
$$Y_p = W_{by}^T Y$$
Step 6-5, diagonalizing the between-set covariance matrix of the projected feature sets using singular value decomposition (SVD) to obtain the voice feature set transformation matrix $W_x$ and the physiological feature set transformation matrix $W_y$, specifically:
$$S_{xy} = X_p Y_p^T = U B V^T$$
where $B$ is a diagonal matrix with non-zero diagonal elements and $U$, $V$ are obtained through the SVD; letting $W_{cx} = U B^{-1/2}$ and $W_{cy} = V B^{-1/2}$, it follows that
$$W_{cx}^T S_{xy} W_{cy} = \left(U B^{-1/2}\right)^T \left(U B V^T\right) \left(V B^{-1/2}\right) = I$$
where $I$ is the identity matrix;
Thus the voice feature set transformation matrix $W_x$ and the physiological feature set transformation matrix $W_y$ are obtained:
$$W_x = W_{cx}^T W_{bx}^T$$
$$W_y = W_{cy}^T W_{by}^T$$
Step 6-6, calculating the transformed voice feature set $X_{dca}$ and physiological feature set $Y_{dca}$ from the transformation matrices calculated in step 6-5:
$$X_{dca} = W_x X$$
$$Y_{dca} = W_y Y$$
Step 6-7, serially fusing the voice feature set $X_{dca}$ and the physiological feature set $Y_{dca}$ calculated in step 6-6 to obtain the fused feature set Z:
$$Z = [X_{dca}\;Y_{dca}]$$
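A minimal sketch of steps 6-1 to 6-7 follows, written against feature-by-sample matrices; it follows the standard discriminant correlation analysis construction that the formulas above describe, with the function names and the choice of r as illustrative assumptions.

```python
import numpy as np

def _between_class_projection(X, labels, r):
    """Projection W_b with W_b^T S_b W_b = I, keeping r directions
    (steps 6-2/6-3); r must not exceed the number of non-zero
    eigenvalues of phi^T phi (at most k - 1 for k classes)."""
    labels = np.asarray(labels)
    xbar = X.mean(axis=1, keepdims=True)                  # overall vector mean
    phi = np.hstack([np.sqrt((labels == c).sum()) *
                     (X[:, labels == c].mean(axis=1, keepdims=True) - xbar)
                     for c in np.unique(labels)])         # S_b = phi @ phi.T
    lam, Q = np.linalg.eigh(phi.T @ phi)                  # small k x k problem
    keep = np.argsort(lam)[::-1][:r]
    return (phi @ Q[:, keep]) / np.sqrt(lam[keep])        # W_b = phi Q Lambda^(-1/2)

def dca_fuse(X, Y, labels, r):
    """DCA fusion of two feature sets (features x samples) into Z."""
    Xp = _between_class_projection(X, labels, r).T @ X    # step 6-4
    Yp = _between_class_projection(Y, labels, r).T @ Y
    U, s, Vt = np.linalg.svd(Xp @ Yp.T)                   # step 6-5: S_xy = U B V^T
    Xd = (U / np.sqrt(s)).T @ Xp                          # W_cx^T X_p
    Yd = (Vt.T / np.sqrt(s)).T @ Yp                       # W_cy^T Y_p
    return np.vstack([Xd, Yd])                            # steps 6-6/6-7: Z
```

Here `Y` would itself be the serial fusion of the respiratory and heartbeat feature sets from step 6-1; for the two-class lie/truth problem, r = 1.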
a lie detection system based on a voice and radar dual sensor comprises a voice acquisition module, a radar acquisition module, a voice preprocessing module, a radar preprocessing module, a voice feature extraction module, a physiological feature extraction module, a feature fusion module and a classification module;
the voice acquisition module is used for acquiring voice signals by utilizing a microphone;
The radar acquisition module is used for acquiring radar signals by using a continuous wave radar;
the voice preprocessing module is used for carrying out noise reduction, sound event detection and pre-emphasis preprocessing on the collected voice signals;
the radar preprocessing module is used for demodulating and filtering the acquired radar signals to obtain respiratory signals and heartbeat signals;
the voice feature extraction module is used for extracting fundamental frequency, voicing probability, short-time zero-crossing rate, frame root-mean-square energy and mel cepstral coefficient features from the preprocessed voice signal, and applying 6 statistical parameters (maximum, minimum, mean, standard deviation, skewness and kurtosis) to these 5 features to obtain the voice feature set X;
the physiological characteristic extraction module is used for extracting time domain, frequency domain and nonlinear characteristics of the respiratory signal and the heartbeat signal respectively to obtain a respiratory characteristic set R and a heartbeat characteristic set H;
the feature fusion module is used for fusing the respiratory feature set R and the heartbeat feature set H to obtain a physiological feature set Y, and then carrying out feature fusion on the physiological feature set Y and the voice feature set X to obtain a fusion feature set Z;
the classifying module is used for training the classifier by utilizing the fusion feature set Z and performing lie detection classification on the voice sample.
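The patent states only that a machine learning classifier is trained on the fused feature set Z; the sketch below therefore uses a support vector machine with 5-fold cross-validation purely as an assumed instantiation of the classification module.

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_lie_detector(Z, labels):
    """Z: samples x fused DCA features; labels: 1 = lie, 0 = truth."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    acc = cross_val_score(clf, Z, labels, cv=5).mean()   # 5-fold CV accuracy
    clf.fit(Z, labels)                                   # final model on all data
    return clf, acc
```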
Compared with the prior art, the invention has the following remarkable advantages: 1) the continuous-wave radar measures respiration and heartbeat without contact, effectively reducing the subject's discomfort and physiological and psychological sense of oppression, and reducing the influence on the measurement result; 2) combining the voice signal with the respiration and heartbeat signals overcomes the strong individual variability, easy disguise and strong limitations of the voice signal alone; 3) combining the SoX noise reduction program with improved spectral subtraction achieves a better noise reduction effect; 4) fusing the voice features and physiological features with the DCA algorithm minimizes the between-class correlation, maximizes the correlation between the feature sets, and improves lie detection accuracy.
The invention is described in further detail below with reference to the accompanying drawings.
Drawings
Figure 1 is a block diagram of a lie detection system based on dual voice and radar sensors of the present invention.
Fig. 2 is a schematic diagram of an original noisy speech waveform in one embodiment.
FIG. 3 is a schematic diagram of a speech waveform of an original noisy speech after SOX processing in one embodiment.
FIG. 4 is a schematic diagram of speech waveforms of an embodiment after an improved spectral subtraction of original noisy speech.
FIG. 5 is a schematic diagram of speech waveforms of an original noisy speech subjected to SOX+ modified spectral subtraction processing in an embodiment.
Fig. 6 is a comparison of lie detection results of different feature sets in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In addition, where this disclosure uses descriptions such as "first" and "second", these are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated; thus a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. The technical solutions of the embodiments may be combined with one another, provided that the combination can be realized by those skilled in the art; when technical solutions contradict each other or cannot be realized, the combination should be considered absent and outside the scope of protection claimed in the present invention.
The invention provides a lie detection method based on a voice and radar dual sensor, which comprises the following steps:
step 1, synchronously acquiring a voice signal and a radar signal by using a microphone and a continuous wave radar;
step 2, noise reduction, sound event detection and pre-emphasis preprocessing are carried out on the voice signals acquired in the step 1; the specific process comprises the following steps:
step 2-1, recording a 2 s noise sample of the noise whose decibel level exceeds a preset threshold, i.e. the main noise;
step 2-2, generating a noise-profile file from the noise sample obtained in step 2-1 using the SoX audio processing program, and performing primary noise reduction on the voice signal according to the noise profile to remove the main noise;
step 2-3, performing secondary noise reduction on the once-denoised voice signal obtained in step 2-2 using improved spectral subtraction, removing noise types other than the recorded noise sample to obtain a clean voice signal.
Step 3, for the preprocessed voice signal obtained in step 2, extracting fundamental frequency, voicing probability, short-time zero-crossing rate, frame root-mean-square energy and mel cepstral coefficient (orders 0-12) features, targeting the changes in intonation, pauses, speaking rate and similar characteristics that may occur when a person lies, and applying 6 statistical parameters (maximum, minimum, mean, standard deviation, skewness and kurtosis) to these 5 features to obtain a voice feature set X;
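A sketch of this extraction using the librosa library is given below; the sampling rate, pitch-search range and frame parameters are illustrative assumptions rather than values from the patent.

```python
import numpy as np
import librosa
from scipy.stats import skew, kurtosis

def voice_feature_set(wav_path):
    """F0, voicing probability, short-time ZCR, frame RMS energy and
    MFCCs 0-12, each summarized by max, min, mean, std, skewness
    and kurtosis (17 frame tracks x 6 statistics = 102 features)."""
    y, sr = librosa.load(wav_path, sr=16000)
    f0, _, vprob = librosa.pyin(y, fmin=50, fmax=400, sr=sr)   # fundamental frequency
    zcr = librosa.feature.zero_crossing_rate(y)[0]
    rms = librosa.feature.rms(y=y)[0]
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)         # coefficients 0-12
    tracks = [np.nan_to_num(f0), np.nan_to_num(vprob), zcr, rms, *mfcc]
    stats = lambda v: [v.max(), v.min(), v.mean(), v.std(), skew(v), kurtosis(v)]
    return np.concatenate([stats(t) for t in tracks])
```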
Step 4, demodulating and filtering the radar signals acquired in the step 1 to obtain respiratory signals and heartbeat signals;
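The patent does not specify the demodulation method; the sketch below assumes the common arctangent demodulation of a continuous-wave radar's I/Q channels, followed by Butterworth band-pass filtering with edges chosen to bracket the respiratory and heartbeat bands used later in this embodiment.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def radar_vital_signs(i_ch, q_ch, fs):
    """Arctangent demodulation of CW-radar I/Q channels followed by
    band-pass filtering into respiration and heartbeat components."""
    phase = np.unwrap(np.arctan2(q_ch, i_ch))       # phase ~ chest displacement
    def bandpass(lo, hi):
        b, a = butter(4, [lo, hi], btype="band", fs=fs)
        return filtfilt(b, a, phase)
    respiration = bandpass(0.1, 0.5)                # covers the < 0.3 Hz resp. bands
    heartbeat = bandpass(0.8, 2.0)                  # covers the 0.8-1.2 Hz region
    return respiration, heartbeat
```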
Step 5, according to the characteristics that a person's breathing may be 'suppressed' and the heartbeat may accelerate when lying, performing time-domain, frequency-domain and nonlinear feature extraction on the respiration signal and heartbeat signal obtained in step 4, respectively, to obtain a respiratory feature set R and a heartbeat feature set H, specifically:
step 5-1, performing time-domain, frequency-domain and nonlinear feature extraction on the respiration signal to obtain the respiratory feature set R, and step 5-2, performing the same extraction on the heartbeat signal to obtain the heartbeat feature set H; the features and their calculation are the same as described in steps 5-1 and 5-2 above. In this embodiment the respiratory frequency bands are taken as $F_L < 0.2$ Hz, $0.2$ Hz $\le F_M \le 0.3$ Hz and $F_H > 0.3$ Hz, and the heartbeat frequency bands as $F_L' < 0.8$ Hz, $0.8$ Hz $\le F_M' \le 1.2$ Hz and $F_H' > 1.2$ Hz, with the mean power-spectrum amplitude of each band serving as the frequency-domain features.
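The mean power-spectrum amplitudes of these bands can be computed from a Welch power spectral density estimate, as sketched below; the function name and FFT parameters are illustrative.

```python
import numpy as np
from scipy.signal import welch

def band_power_features(x, fs, bands):
    """Mean power-spectrum amplitude in each (lo, hi) band."""
    f, pxx = welch(x, fs=fs, nperseg=min(len(x), 1024))
    return [pxx[(f >= lo) & (f < hi)].mean() for lo, hi in bands]

# respiratory bands of this embodiment:
# band_power_features(respiration, fs, [(0.0, 0.2), (0.2, 0.3), (0.3, fs / 2)])
# heartbeat bands:
# band_power_features(heartbeat, fs, [(0.0, 0.8), (0.8, 1.2), (1.2, fs / 2)])
```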
Step 6, carrying out serial fusion on the respiratory feature set R obtained in the step 5 and the heartbeat feature set H to obtain a physiological feature set Y, and then carrying out feature fusion on the physiological feature set Y and the voice feature set X obtained in the step 3 by using a DCA algorithm to obtain a fusion feature set Z; the specific process comprises the following steps:
steps 6-1 to 6-7 are carried out exactly as described in the disclosure above: the respiratory and heartbeat feature sets are serially fused into $Y = [R\;H]$; the class-wise and overall vector means of X and Y are computed; the projection matrices $W_{bx}$ and $W_{by}$ minimizing the between-class correlation are obtained; the projected sets $X_p = W_{bx}^T X$ and $Y_p = W_{by}^T Y$ are decorrelated between sets by SVD to obtain the transformation matrices $W_x$ and $W_y$; and the transformed sets $X_{dca} = W_x X$ and $Y_{dca} = W_y Y$ are serially fused into $Z = [X_{dca}\;Y_{dca}]$.
and 7, training a classifier by using the fusion feature set Z obtained in the step 6, and performing lie detection classification on the voice sample.
Referring to fig. 1, the invention provides a lie detection system based on a voice and radar dual sensor, which comprises a voice acquisition module, a radar acquisition module, a voice preprocessing module, a radar preprocessing module, a voice feature extraction module, a physiological feature extraction module, a feature fusion module and a classification module;
the voice acquisition module is used for acquiring voice signals by utilizing a microphone;
the radar acquisition module is used for acquiring radar signals by using a continuous wave radar;
the voice preprocessing module is used for carrying out noise reduction, sound event detection and pre-emphasis preprocessing on the collected voice signals; the module includes the following components:
The main noise acquisition unit is used for recording a noise sample of the noise whose decibel level exceeds a preset threshold, i.e. the main noise;
the primary noise reduction unit is used for generating a noise-profile file from the noise sample using the SoX audio processing program, and performing primary noise reduction on the voice signal according to the noise profile to remove the main noise;
the secondary noise reduction unit is used for performing secondary noise reduction on the once-denoised voice signal using improved spectral subtraction, removing noise types other than the recorded noise sample to obtain a clean voice signal.
The radar preprocessing module is used for demodulating and filtering the acquired radar signals to obtain respiratory signals and heartbeat signals;
the voice feature extraction module is used for extracting fundamental frequency, voicing probability, short-time zero-crossing rate, frame root-mean-square energy and mel cepstral coefficient features from the preprocessed voice signal, and applying 6 statistical parameters (maximum, minimum, mean, standard deviation, skewness and kurtosis) to these 5 features to obtain the voice feature set X;
the physiological characteristic extraction module is used for extracting time domain, frequency domain and nonlinear characteristics of the respiratory signal and the heartbeat signal respectively to obtain a respiratory characteristic set R and a heartbeat characteristic set H; the module includes the following components:
The respiratory feature set acquisition unit is used for performing time-domain, frequency-domain and nonlinear feature extraction on the respiration signal to obtain the respiratory feature set R;
the heartbeat feature set acquisition unit is used for performing time-domain, frequency-domain and nonlinear feature extraction on the heartbeat signal to obtain the heartbeat feature set H;
the extracted features and their calculation are the same as in steps 5-1 and 5-2 of the method described above.
The feature fusion module is used for fusing the respiratory feature set R and the heartbeat feature set H to obtain the physiological feature set Y, and then performing feature fusion on the physiological feature set Y and the voice feature set X to obtain the fused feature set Z; the module comprises a first feature fusion unit (serial fusion of R and H into $Y = [R\;H]$), a first calculation unit (class-wise and overall vector means of X and Y), a second calculation unit (projection matrices $W_{bx}$ and $W_{by}$ minimizing the between-class correlation), a projection unit ($X_p = W_{bx}^T X$, $Y_p = W_{by}^T Y$), a singular value decomposition unit (diagonalizing the between-set covariance matrix to obtain $W_x$ and $W_y$), a third calculation unit ($X_{dca} = W_x X$, $Y_{dca} = W_y Y$) and a second feature fusion unit (serial fusion into $Z = [X_{dca}\;Y_{dca}]$); the calculations are the same as in steps 6-1 to 6-7 of the method described above.
the classifying module is used for training the classifier by utilizing the fusion feature set Z and performing lie detection classification on the voice sample.
For specific limitations of the lie detection system based on voice and radar dual sensors, reference may be made to the limitations of the lie detection method based on voice and radar dual sensors above, which are not repeated here. The various modules in the above lie detection system may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware, be independent of the processor in the computer device, or be stored as software in the memory of the computer device, so that the processor can call and execute the operations corresponding to the modules.
The present invention will be described in further detail with reference to examples.
Examples
The invention discloses a lie detection method based on voice and radar dual sensors, which comprises the following steps:
step 1, synchronously acquiring a voice signal and a radar signal by using a microphone and a continuous wave radar;
step 2, noise reduction, sound event detection and pre-emphasis preprocessing are carried out on the voice signals acquired in the step 1;
Step 3, for the preprocessed voice signal obtained in step 2, extracting fundamental frequency, voicing probability, short-time zero-crossing rate, frame root-mean-square energy and mel cepstral coefficient features, targeting the changes in intonation, pauses, speaking rate and similar characteristics that may occur when a person lies, and applying 6 statistical parameters (maximum, minimum, mean, standard deviation, skewness and kurtosis) to these 5 features to obtain a voice feature set X;
step 4, demodulating and filtering the radar signals acquired in the step 1 to obtain respiratory signals and heartbeat signals;
Step 5, according to the characteristics that a person's breathing may be 'suppressed' and the heartbeat may accelerate when lying, performing time-domain, frequency-domain and nonlinear feature extraction on the respiration signal and heartbeat signal obtained in step 4, respectively, to obtain a respiratory feature set R and a heartbeat feature set H;
step 6, carrying out serial fusion on the respiratory feature set R obtained in the step 5 and the heartbeat feature set H to obtain a physiological feature set Y, and then carrying out feature fusion on the physiological feature set Y and the voice feature set X obtained in the step 3 by using a DCA algorithm to obtain a fusion feature set Z;
and 7, training a classifier by using the fusion feature set Z obtained in the step 6, and performing lie detection classification on the voice sample.
With reference to figs. 2, 3, 4 and 5, the SoX audio processing program has a good noise reduction effect on the main noise component but cannot remove the noise completely, while the improved spectral subtraction has a strong noise reduction capability but may misjudge effective speech segments. Performing primary noise reduction with the SoX audio processing program followed by secondary noise reduction with improved spectral subtraction achieves a good noise reduction effect without losing effective speech segments.
With reference to fig. 6, owing to the limitations of voice, the highest lie detection classification accuracy on the voice feature set alone is 59.1%; the highest accuracy on the physiological feature set is 67.6%, which illustrates the validity of the physiological signals and the extracted physiological feature set for lie detection; the fused feature set obtained by fusing the voice feature set and the physiological feature set with serial fusion and the DCA algorithm achieves a higher lie detection classification accuracy than either feature set alone, with a highest accuracy of 70.2%.
In conclusion, the lie detection method and system based on voice and radar dual sensors provided by the invention cause no discomfort to the subject, overcome the limitations of voice, effectively improve lie detection accuracy, and offer high reliability and wide applicability.
The foregoing has outlined and described the basic principles, main features and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above; the above embodiments and descriptions merely illustrate the principles of the invention, and various changes and modifications may be made without departing from its spirit and scope. The scope of the invention is defined by the appended claims and their equivalents.

Claims (7)

1. A lie detection method based on a voice and radar dual sensor, the method comprising the steps of:
step 1, synchronously acquiring a voice signal and a radar signal by using a microphone and a continuous wave radar;
step 2, noise reduction, sound event detection and pre-emphasis preprocessing are carried out on the voice signals acquired in the step 1;
step 3, extracting the fundamental frequency, voicing probability, short-time zero-crossing rate, frame root-mean-square energy and mel cepstrum coefficient features from the preprocessed voice signal obtained in step 2, and applying 6 statistical parameters (maximum, minimum, mean, standard deviation, skewness and kurtosis) to the 5 features to obtain a voice feature set X;
Step 4, demodulating and filtering the radar signals acquired in the step 1 to obtain respiratory signals and heartbeat signals;
step 5, carrying out time domain, frequency domain and nonlinear feature extraction on the respiratory signal and the heartbeat signal obtained in the step 4 respectively to obtain a respiratory feature set R and a heartbeat feature set H;
wherein step 5 specifically comprises the following steps:
step 5-1, carrying out time domain, frequency domain and nonlinear feature extraction on the respiratory signal to obtain a respiratory feature set R;
A. time domain features: extracting a breath amplitude mean value, a breath amplitude standard deviation, a breath average amplitude difference and a breath normalized average amplitude difference as time domain features of a breath signal; wherein,
(1) Respiratory amplitude mean $\mu_x$, reflecting the average respiration amplitude during lie detection:
$\mu_x = \frac{1}{N}\sum_{n=1}^{N} X(n)$
wherein $X(n)$ is the nth point of the respiration sequence, N is the total length of the respiration sequence, and $1 \le n \le N$;
(2) Respiratory amplitude standard deviation $\sigma_x$, reflecting the overall variation of respiration during lie detection:
$\sigma_x = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(X(n) - \mu_x\right)^2}$
(3) Respiratory average amplitude difference $\Delta_x$, reflecting the short-time variation of the respiratory amplitude during lie detection:
$\Delta_x = \frac{1}{N-1}\sum_{n=1}^{N-1}\left|X(n+1) - X(n)\right|$
(4) Respiratory normalized average amplitude difference $\Delta_{rx}$, reflecting the influence of the short-time variation of the respiratory amplitude on the overall variation during lie detection:
$\Delta_{rx} = \Delta_x / \sigma_x$
B. Frequency-domain features: extracting the mean power-spectrum amplitude of the respiratory low band $F_L$, respiratory middle band $F_M$ and respiratory high band $F_H$ as the frequency-domain features of the respiratory signal; wherein $F_L < p_1$ Hz, $p_1\,\mathrm{Hz} \le F_M < p_2\,\mathrm{Hz}$, and $F_H > p_2$ Hz;
C. Nonlinear features: extracting the respiratory detrended fluctuation scale index and the respiratory sample entropy as the nonlinear features of the respiratory signal;
(1) Respiratory detrended fluctuation scale index
The respiratory detrended fluctuation scale index reflects the non-stationary character of the respiratory signal during lie detection, and is calculated as follows:
1) Let the respiration sequence be $X(n)$ and calculate its mean $\mu_x$;
2) Calculate the cumulative difference $y(n)$ of the respiration sequence:
$y(n) = \sum_{k=1}^{n}\left(X(k) - \mu_x\right)$
3) Divide $y(n)$ into $a$ non-overlapping windows of length $b$;
4) Fit a local trend $y_b(n)$ to each window interval by the least-squares method, remove the local trend of each interval to obtain a new detrended sequence, and calculate its root mean square $F(b)$:
$F(b) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left[y(n) - y_b(n)\right]^2}$
5) Change the window length $b$ and repeat the above steps until the required amount of data is obtained;
6) From the quantities calculated above, plot the curve with $\log b$ as abscissa and $\log F(b)$ as ordinate; the slope of this curve is the respiratory detrended fluctuation scale index of the respiration sequence;
(2) Respiratory sample entropy
The respiratory sample entropy evaluates the complexity of the respiratory signal during lie detection, and is calculated as follows (a numerical sketch of the frequency-domain and nonlinear features is given after step 5-1):
1) Denote the respiration time series as $X(n)$, $1 \le n \le N$; with m as the window length, divide it into $s = N - m + 1$ respiration subsequences:
$X_m(t) = \left(X(t), X(t+1), \ldots, X(t+m-1)\right),\quad 1 \le t \le N - m + 1$
wherein $X_m(t)$ is the t-th respiration subsequence;
2) Define the distance between sequences $X_m(i)$ and $X_m(j)$ as the maximum absolute difference of their corresponding elements, and calculate the distance $d_{ij}$ between each respiration subsequence and all the other respiration subsequences:
$d_{ij} = \max_{k=0,\ldots,m-1}\left(\left|X_m(i+k) - X_m(j+k)\right|\right)$
wherein $1 \le i \le N - m$, $1 \le j \le N - m$, and $i \ne j$;
3) Calculate the respiratory amplitude standard deviation $\sigma_x$ and define the threshold $F = r \times \sigma_x$, where the constant r is taken between 0.1 and 0.25; record the ratio of the number of distances $d_{ij} \le F$ calculated in 2) to s as $B_i^m(F)$, and calculate the mean $\phi_m(t)$ of $B_i^m(F)$:
$\phi_m(t) = \frac{1}{N-m}\sum_{i=1}^{N-m} B_i^m(F)$
4) Change the window length to m+1 and repeat steps 1) to 3) to obtain $\phi_{m+1}(t)$;
5) Calculate the respiratory sample entropy $\mathrm{SampEn}(t)$:
$\mathrm{SampEn}(t) = \ln\left[\phi_m(t)\right] - \ln\left[\phi_{m+1}(t)\right]$
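The sketch below gives a minimal numerical version of the step 5-1 features, assuming the respiration sequence is a 1-D numpy array; the scale list, the tolerance r=0.2, the template length m=2 and the band edges standing in for the unspecified p1/p2 thresholds are illustrative choices.

```python
import numpy as np
from scipy.signal import welch

def dfa_scale_index(x, scales=(4, 8, 16, 32, 64)):
    """Item C(1): slope of log F(b) versus log b (detrended fluctuation scale index)."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                  # cumulative difference y(n)
    fluct = []
    for b in scales:
        f2 = []
        for w in range(len(y) // b):             # non-overlapping windows of length b
            seg = y[w * b:(w + 1) * b]
            n = np.arange(b)
            trend = np.polyval(np.polyfit(n, seg, 1), n)  # least-squares local trend
            f2.append(np.mean((seg - trend) ** 2))
        fluct.append(np.sqrt(np.mean(f2)))       # root mean square F(b)
    slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return slope

def sample_entropy(x, m=2, r=0.2):
    """Item C(2): SampEn = ln(phi_m) - ln(phi_{m+1}) with threshold F = r * sigma_x."""
    x = np.asarray(x, dtype=float)
    F = r * np.std(x)
    def phi(m):
        templates = np.array([x[t:t + m] for t in range(len(x) - m)])
        count = 0
        for i in range(len(templates)):
            d = np.max(np.abs(templates - templates[i]), axis=1)  # max elementwise distance
            count += np.sum(d <= F) - 1          # exclude the self-match (i == j)
        return count / (len(templates) * (len(templates) - 1))
    return np.log(phi(m)) - np.log(phi(m + 1))

def band_power_means(x, fs, bands=((0.0, 0.3), (0.3, 0.6), (0.6, 2.0))):
    """Item B: mean power-spectrum amplitude in the low/middle/high bands."""
    f, pxx = welch(x, fs=fs, nperseg=min(len(x), 256))
    return [pxx[(f >= lo) & (f < hi)].mean() for lo, hi in bands]
```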
step 5-2, performing time-domain, frequency-domain and nonlinear feature extraction on the heartbeat signal to obtain a heartbeat feature set H;
A. time domain features: extracting a heartbeat amplitude mean value, a heartbeat amplitude standard deviation, a heartbeat average amplitude difference and a heartbeat normalized average amplitude difference as time domain features of a heartbeat signal, wherein the specific calculation mode is the same as that of the breathing feature extraction part;
B. Frequency-domain features: extracting the mean power-spectrum amplitude of the heartbeat low band $F_L'$, heartbeat middle band $F_M'$ and heartbeat high band $F_H'$ as the frequency-domain features of the heartbeat signal; wherein $F_L' < p_3$ Hz, $p_3\,\mathrm{Hz} \le F_M' < p_4\,\mathrm{Hz}$, $F_H' > p_4$ Hz, $p_3 > p_1$, and $p_4 > p_2$;
C. Nonlinear features: extracting the heartbeat detrended fluctuation scale index and the heartbeat sample entropy as the nonlinear features of the heartbeat signal, calculated in the same way as in the respiratory feature extraction part;
step 6, fusing the respiratory feature set R obtained in the step 5 with the heartbeat feature set H to obtain a physiological feature set Y, and then performing feature fusion on the physiological feature set Y and the voice feature set X obtained in the step 3 to obtain a fusion feature set Z;
step 7, training a classifier by using the fusion feature set Z obtained in step 6, and performing lie detection classification on the voice sample.
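A sketch of the classifier training in step 7 follows; the patent does not fix a classifier type, so the support-vector machine, the feature standardization and the 70/30 split below are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_lie_detector(Z, labels):
    """Z: (n_samples, n_fused_features) fusion feature set; labels: 0 = truth, 1 = lie."""
    Z_tr, Z_te, y_tr, y_te = train_test_split(
        Z, labels, test_size=0.3, stratify=labels, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(Z_tr, y_tr)
    print("held-out lie-detection accuracy:", clf.score(Z_te, y_te))
    return clf
```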
2. The lie detection method based on the voice and radar dual sensor according to claim 1, wherein the noise reduction performed in step 2 on the voice signal acquired in step 1 comprises the following steps:
step 2-1, recording a noise sample of the noise whose decibel level exceeds a preset threshold, i.e., the main noise;
step 2-2, generating a noise sample configuration file by using the SOX audio processing program for the noise sample obtained in the step 2-1, and performing primary noise reduction on the voice signal according to the noise sample configuration file to remove main noise;
And 2-3, performing secondary noise reduction on the voice signal obtained in the step 2-2 after primary noise reduction by using improved spectral subtraction, and removing other types of noise except noise samples to obtain a pure voice signal.
3. The lie detection method based on the voice and radar dual sensor according to claim 1, wherein in step 6 the feature fusion of the physiological feature set Y with the voice feature set X obtained in step 3 to obtain the fusion feature set Z is specifically implemented using the DCA algorithm.
4. A voice and radar dual sensor based lie detection method according to claim 3, characterized in that step 6 comprises the following steps:
step 6-1, adopting a characteristic fusion mode of serial fusion for the respiratory characteristic set R and the heartbeat characteristic set H to obtain a physiological characteristic set Y:
Y=[RH]
step 6-2, calculating the vector mean of each class and of all the feature data for the voice feature set X and the physiological feature set Y; the process is described below for X, and Y is treated in the same way:
$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$
wherein $\bar{x}_i$ is the vector mean of the ith class of samples of the voice feature set X, $x_{ij}$ is the jth sample of the ith class of the voice feature set X, and $n_i$ is the number of samples in the ith class of the voice feature set X;
$\bar{x} = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}$
wherein $\bar{x}$ is the vector mean of all the sample data of the voice feature set X, k is the number of classes, and n is the total number of samples;
step 6-3, calculating the projection matrices of the voice feature set X and the physiological feature set Y respectively so as to minimize the inter-class correlation of the feature sets X and Y; the projection-matrix calculation is described below for X, and Y is treated in the same way:
First, the inter-class scatter matrix $S_{bx}$ of the voice feature set X is calculated:
$S_{bx} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T = \phi_{bx}\phi_{bx}^T$
wherein $\phi_{bx} = \left[\sqrt{n_1}(\bar{x}_1 - \bar{x}), \ldots, \sqrt{n_k}(\bar{x}_k - \bar{x})\right]$;
When the inter-class correlation is minimal, $\phi_{bx}^T\phi_{bx}$ is a diagonal matrix; moreover, since $\phi_{bx}^T\phi_{bx}$ is symmetric positive semi-definite, the following transformation exists:
$P^T(\phi_{bx}^T\phi_{bx})P = \hat{\Lambda}$
where P is the orthogonal eigenvector matrix and $\hat{\Lambda}$ is the diagonal matrix of non-negative real eigenvalues sorted in descending order;
Let Q consist of the r eigenvectors in the matrix P corresponding to the r largest non-zero eigenvalues, so that:
$Q^T(\phi_{bx}^T\phi_{bx})Q = \Lambda$
Then the projection matrix $W_{bx}$ of X can be obtained:
$W_{bx} = \phi_{bx} Q \Lambda^{-1/2}$
The projection matrix $W_{by}$ of Y is obtained in the same way.
step 6-4, projecting X and Y according to the projection matrices calculated in step 6-3 to obtain the projected voice feature set $X_p$ and physiological feature set $Y_p$:
$X_p = W_{bx}^T X$
$Y_p = W_{by}^T Y$
step 6-5, diagonalizing the between-set covariance matrix of the projected feature sets by singular value decomposition (SVD) to obtain the voice feature set transformation matrix $W_x$ and the physiological feature set transformation matrix $W_y$, specifically as follows:
$S_{xy} = X_p Y_p^T = U B V^T$
wherein B is a diagonal matrix with non-zero diagonal elements, and U, V are obtained through the SVD;
Letting $W_{cx} = U B^{-1/2}$ and $W_{cy} = V B^{-1/2}$, it follows that:
$W_{cx}^T S_{xy} W_{cy} = I$
wherein I is the identity matrix;
Thus the voice feature set transformation matrix $W_x$ and the physiological feature set transformation matrix $W_y$ are obtained:
$W_x = W_{cx}^T W_{bx}^T$
$W_y = W_{cy}^T W_{by}^T$
step 6-6, calculating the transformed voice feature set $X_{dca}$ and physiological feature set $Y_{dca}$ according to the transformation matrices calculated in step 6-5:
$X_{dca} = W_x X$
$Y_{dca} = W_y Y$
step 6-7, serially fusing the voice feature set $X_{dca}$ and the physiological feature set $Y_{dca}$ calculated in step 6-6 to obtain the fusion feature set Z:
$Z = [X_{dca}\ Y_{dca}]$.
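The following is a sketch of the DCA-based fusion of steps 6-1 to 6-7, assuming feature matrices are stored columns-as-samples (d features x n samples) and that r is the number of retained eigen-directions; function and variable names are illustrative.

```python
import numpy as np

def dca_projection(X, labels, r):
    """Between-class projection W_b (steps 6-2 and 6-3); X has shape (d, n)."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    xbar = X.mean(axis=1)
    # phi_b columns: sqrt(n_i) * (class mean - overall mean)
    phi = np.column_stack([np.sqrt(np.sum(labels == c)) *
                           (X[:, labels == c].mean(axis=1) - xbar)
                           for c in classes])
    lam, Q = np.linalg.eigh(phi.T @ phi)         # eigen-decompose phi^T phi (ascending order)
    idx = np.argsort(lam)[::-1][:r]              # keep the r largest non-zero eigenvalues
    return phi @ Q[:, idx] @ np.diag(lam[idx] ** -0.5)   # W_b = phi Q Lambda^{-1/2}

def dca_fuse(X, Y, labels, r=1):
    Xp = dca_projection(X, labels, r).T @ X      # step 6-4 projections
    Yp = dca_projection(Y, labels, r).T @ Y
    U, s, Vt = np.linalg.svd(Xp @ Yp.T)          # step 6-5: S_xy = U B V^T
    Xdca = (U @ np.diag(s ** -0.5)).T @ Xp       # X_dca = W_cx^T X_p
    Ydca = (Vt.T @ np.diag(s ** -0.5)).T @ Yp    # Y_dca = W_cy^T Y_p
    return np.vstack([Xdca, Ydca])               # step 6-7 serial fusion -> Z
```

For the two-class truth/lie task at most one between-class direction has a non-zero eigenvalue, hence r=1 by default; the fused Z can then be fed to the classifier sketch shown after claim 1.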
5. A lie detection system based on voice and radar dual sensors, characterized by comprising a voice acquisition module, a radar acquisition module, a voice preprocessing module, a radar preprocessing module, a voice feature extraction module, a physiological feature extraction module, a feature fusion module and a classification module;
the voice acquisition module is used for acquiring voice signals by utilizing a microphone;
the radar acquisition module is used for acquiring radar signals by using a continuous wave radar;
the voice preprocessing module is used for carrying out noise reduction, sound event detection and pre-emphasis preprocessing on the collected voice signals;
the radar preprocessing module is used for demodulating and filtering the acquired radar signals to obtain respiratory signals and heartbeat signals (a demodulation-and-filtering sketch is given after the module list of this claim);
the voice feature extraction module is used for extracting the fundamental frequency, voicing probability, short-time zero-crossing rate, frame root-mean-square energy and mel cepstrum coefficient features from the preprocessed voice signal, and applying 6 statistical parameters (maximum, minimum, mean, standard deviation, skewness and kurtosis) to the 5 features to obtain a voice feature set X;
The physiological characteristic extraction module is used for extracting time domain, frequency domain and nonlinear characteristics of the respiratory signal and the heartbeat signal respectively to obtain a respiratory characteristic set R and a heartbeat characteristic set H;
wherein the physiological feature extraction module comprises the following units:
the respiratory feature set acquisition unit is used for carrying out time domain, frequency domain and nonlinear feature extraction on the respiratory signals to obtain a respiratory feature set R;
A. time domain features: extracting a breath amplitude mean value, a breath amplitude standard deviation, a breath average amplitude difference and a breath normalized average amplitude difference as time domain features of a breath signal; wherein,
(1) Respiratory amplitude mean $\mu_x$, reflecting the average respiration amplitude during lie detection:
$\mu_x = \frac{1}{N}\sum_{n=1}^{N} X(n)$
wherein $X(n)$ is the nth point of the respiration sequence, N is the total length of the respiration sequence, and $1 \le n \le N$;
(2) Respiratory amplitude standard deviation $\sigma_x$, reflecting the overall variation of respiration during lie detection:
$\sigma_x = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(X(n) - \mu_x\right)^2}$
(3) Respiratory average amplitude difference $\Delta_x$, reflecting the short-time variation of the respiratory amplitude during lie detection:
$\Delta_x = \frac{1}{N-1}\sum_{n=1}^{N-1}\left|X(n+1) - X(n)\right|$
(4) Respiratory normalized average amplitude difference $\Delta_{rx}$, reflecting the influence of the short-time variation of the respiratory amplitude on the overall variation during lie detection:
$\Delta_{rx} = \Delta_x / \sigma_x$
B. Frequency-domain features: extracting the mean power-spectrum amplitude of the respiratory low band $F_L$, respiratory middle band $F_M$ and respiratory high band $F_H$ as the frequency-domain features of the respiratory signal; wherein $F_L < p_1$ Hz, $p_1\,\mathrm{Hz} \le F_M < p_2\,\mathrm{Hz}$, and $F_H > p_2$ Hz;
C. Nonlinear features: extracting the respiratory detrended fluctuation scale index and the respiratory sample entropy as the nonlinear features of the respiratory signal;
(1) Respiratory detrended fluctuation scale index
The respiratory detrended fluctuation scale index reflects the non-stationary character of the respiratory signal during lie detection, and is calculated as follows:
1) Let the respiration sequence be $X(n)$ and calculate its mean $\mu_x$;
2) Calculate the cumulative difference $y(n)$ of the respiration sequence:
$y(n) = \sum_{k=1}^{n}\left(X(k) - \mu_x\right)$
3) Divide $y(n)$ into $a$ non-overlapping windows of length $b$;
4) Fit a local trend $y_b(n)$ to each window interval by the least-squares method, remove the local trend of each interval to obtain a new detrended sequence, and calculate its root mean square $F(b)$:
$F(b) = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left[y(n) - y_b(n)\right]^2}$
5) Change the window length $b$ and repeat the above steps until the required amount of data is obtained;
6) From the quantities calculated above, plot the curve with $\log b$ as abscissa and $\log F(b)$ as ordinate; the slope of this curve is the respiratory detrended fluctuation scale index of the respiration sequence;
(2) Respiratory sample entropy
The respiratory sample entropy evaluates the complexity of the respiratory signal during lie detection, and is calculated as follows:
1) Denote the respiration time series as $X(n)$, $1 \le n \le N$; with m as the window length, divide it into $s = N - m + 1$ respiration subsequences:
$X_m(t) = \left(X(t), X(t+1), \ldots, X(t+m-1)\right),\quad 1 \le t \le N - m + 1$
wherein $X_m(t)$ is the t-th respiration subsequence;
2) Define the distance between sequences $X_m(i)$ and $X_m(j)$ as the maximum absolute difference of their corresponding elements, and calculate the distance $d_{ij}$ between each respiration subsequence and all the other respiration subsequences:
$d_{ij} = \max_{k=0,\ldots,m-1}\left(\left|X_m(i+k) - X_m(j+k)\right|\right)$
wherein $1 \le i \le N - m$, $1 \le j \le N - m$, and $i \ne j$;
3) Calculate the respiratory amplitude standard deviation $\sigma_x$ and define the threshold $F = r \times \sigma_x$, where the constant r is taken between 0.1 and 0.25; record the ratio of the number of distances $d_{ij} \le F$ calculated in 2) to s as $B_i^m(F)$, and calculate the mean $\phi_m(t)$ of $B_i^m(F)$:
$\phi_m(t) = \frac{1}{N-m}\sum_{i=1}^{N-m} B_i^m(F)$
4) Change the window length to m+1 and repeat steps 1) to 3) to obtain $\phi_{m+1}(t)$;
5) Calculate the respiratory sample entropy $\mathrm{SampEn}(t)$:
$\mathrm{SampEn}(t) = \ln\left[\phi_m(t)\right] - \ln\left[\phi_{m+1}(t)\right]$
the heartbeat feature set acquisition unit is used for extracting time-domain, frequency-domain and nonlinear features of the heartbeat signal to obtain a heartbeat feature set H;
A. time domain features: extracting a heartbeat amplitude mean value, a heartbeat amplitude standard deviation, a heartbeat average amplitude difference and a heartbeat normalized average amplitude difference as time domain features of a heartbeat signal, wherein the specific calculation mode is the same as that of the breathing feature extraction part;
B. Frequency-domain features: extracting the mean power-spectrum amplitude of the heartbeat low band $F_L'$, heartbeat middle band $F_M'$ and heartbeat high band $F_H'$ as the frequency-domain features of the heartbeat signal; wherein $F_L' < p_3$ Hz, $p_3\,\mathrm{Hz} \le F_M' < p_4\,\mathrm{Hz}$, $F_H' > p_4$ Hz, $p_3 > p_1$, and $p_4 > p_2$;
C. Nonlinear features: extracting the heartbeat detrended fluctuation scale index and the heartbeat sample entropy as the nonlinear features of the heartbeat signal, calculated in the same way as in the respiratory feature extraction part;
the feature fusion module is used for fusing the respiratory feature set R and the heartbeat feature set H to obtain a physiological feature set Y, and then carrying out feature fusion on the physiological feature set Y and the voice feature set X to obtain a fusion feature set Z;
the classifying module is used for training the classifier by utilizing the fusion feature set Z and performing lie detection classification on the voice sample.
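As referenced in the radar preprocessing module above, the sketch below illustrates one way to obtain the respiratory and heartbeat signals from a continuous-wave radar, assuming I/Q channels are available; the arctangent demodulation and the 0.1-0.5 Hz / 0.8-2.0 Hz band edges are common choices for vital-sign radar, not values fixed by the patent.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def demodulate_iq(i_ch, q_ch):
    """Arctangent demodulation of the radar I/Q channels to chest displacement."""
    return np.unwrap(np.arctan2(q_ch - q_ch.mean(), i_ch - i_ch.mean()))

def bandpass(x, fs, lo, hi, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def split_vital_signs(i_ch, q_ch, fs):
    disp = demodulate_iq(i_ch, q_ch)
    respiration = bandpass(disp, fs, 0.1, 0.5)   # breathing band
    heartbeat = bandpass(disp, fs, 0.8, 2.0)     # heartbeat band
    return respiration, heartbeat
```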
6. The lie detection system based on dual voice and radar sensors of claim 5, wherein the voice preprocessing module comprises, in order:
the main noise acquisition unit is used for recording a noise sample of the noise whose decibel level exceeds a preset threshold, i.e., the main noise;
the primary noise reduction unit is used for generating a noise sample configuration file from the noise sample using the SoX audio processing program, and performing primary noise reduction on the voice signal according to the noise sample configuration file to remove the main noise;
and the secondary noise reduction unit is used for performing secondary noise reduction on the voice signal after primary noise reduction by using the improved spectral subtraction, removing the types of noise other than the noise sample to obtain a clean voice signal.
7. The lie detection system based on the voice and radar dual sensors according to claim 6, wherein the feature fusion module comprises the following units:
the first feature fusion unit is used for adopting a feature fusion mode of serial fusion for the respiratory feature set R and the heartbeat feature set H to obtain a physiological feature set Y:
Y=[RH]
the first calculation unit is used for calculating the vector mean of each class and of all the feature data for the voice feature set X and the physiological feature set Y; the process is described below for X, and Y is treated in the same way:
$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$
wherein $\bar{x}_i$ is the vector mean of the ith class of samples of the voice feature set X, $x_{ij}$ is the jth sample of the ith class of the voice feature set X, and $n_i$ is the number of samples in the ith class of the voice feature set X;
$\bar{x} = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}$
wherein $\bar{x}$ is the vector mean of all the sample data of the voice feature set X, k is the number of classes, and n is the total number of samples;
the second calculation unit is configured to calculate projection matrices of the speech signal feature set X and the physiological feature set Y respectively so as to minimize an inter-class correlation between the feature sets X and Y, and the following description is made with a projection matrix calculation procedure of X, and Y is available in the same way, specifically as follows:
First, an inter-class dispersion matrix S of a speech signal feature set X is calculated bx
Wherein,
phi when the correlation between classes is minimum bx T φ bx Is a diagonal matrix, also because of phi bx T φ bx Symmetrical semi-normal, there is the following transformation:
where P is the orthogonal eigenvector matrix,diagonal matrices ordered in descending order for non-negative real eigenvalues;
let Q be composed of r eigenvectors corresponding to r maximum non-zero eigenvalues in matrix P, corresponding to:
Q Tbx T φ bx )Q=A
then a projection matrix W of X can be obtained bx
Projection matrix W of Y is obtained in the same way by
the projection unit is used for projecting X and Y according to the projection matrices to obtain the projected voice feature set $X_p$ and physiological feature set $Y_p$:
$X_p = W_{bx}^T X$
$Y_p = W_{by}^T Y$
the singular value decomposition (SVD) unit is used for diagonalizing the between-set covariance matrix of the projected feature sets by SVD to obtain the voice feature set transformation matrix $W_x$ and the physiological feature set transformation matrix $W_y$, specifically as follows:
$S_{xy} = X_p Y_p^T = U B V^T$
wherein B is a diagonal matrix with non-zero diagonal elements, and U, V are obtained through the SVD;
Letting $W_{cx} = U B^{-1/2}$ and $W_{cy} = V B^{-1/2}$, it follows that:
$W_{cx}^T S_{xy} W_{cy} = I$
wherein I is the identity matrix;
Thus the voice feature set transformation matrix $W_x$ and the physiological feature set transformation matrix $W_y$ are obtained:
$W_x = W_{cx}^T W_{bx}^T$
$W_y = W_{cy}^T W_{by}^T$
the third calculation unit is used for calculating the transformed voice feature set $X_{dca}$ and physiological feature set $Y_{dca}$ according to the transformation matrices:
$X_{dca} = W_x X$
$Y_{dca} = W_y Y$
and the second feature fusion unit is used for serially fusing the voice feature set $X_{dca}$ and the physiological feature set $Y_{dca}$ to obtain the fusion feature set Z:
$Z = [X_{dca}\ Y_{dca}]$.
CN202011492568.1A 2020-12-17 2020-12-17 Lie detection method and system based on voice and radar dual sensors Active CN112634871B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011492568.1A CN112634871B (en) 2020-12-17 2020-12-17 Lie detection method and system based on voice and radar dual sensors

Publications (2)

Publication Number Publication Date
CN112634871A CN112634871A (en) 2021-04-09
CN112634871B (en) 2024-02-20

Family

ID=75316662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011492568.1A Active CN112634871B (en) 2020-12-17 2020-12-17 Lie detection method and system based on voice and radar dual sensors

Country Status (1)

Country Link
CN (1) CN112634871B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114052694B (en) * 2021-10-26 2023-09-05 珠海脉动时代健康科技有限公司 Radar-based heart rate analysis method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100152600A1 (en) * 2008-04-03 2010-06-17 Kai Sensors, Inc. Non-contact physiologic motion sensors and methods for use

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107293302A (en) * 2017-06-27 2017-10-24 苏州大学 A kind of sparse spectrum signature extracting method being used in voice lie detection system
CN110811647A (en) * 2019-11-14 2020-02-21 清华大学 Multi-channel hidden lie detection method based on ballistocardiogram signal
CN111195132A (en) * 2020-01-10 2020-05-26 高兴华 Non-contact lie detection and emotion recognition method, device and system
CN112017671A (en) * 2020-10-14 2020-12-01 杭州艺兴科技有限公司 Multi-feature-based interview content credibility evaluation method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant