CN110033778B - Real-time identification and correction system for lie state

Real-time identification and correction system for lie state

Info

Publication number
CN110033778B
CN110033778B (application CN201910376615.7A)
Authority
CN
China
Prior art keywords
module
voice
recognition module
communication connection
correction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910376615.7A
Other languages
Chinese (zh)
Other versions
CN110033778A (en)
Inventor
周燕 (Zhou Yan)
陈杰 (Chen Jie)
史斌斌 (Shi Binbin)
张旭 (Zhang Xu)
胡志峰 (Hu Zhifeng)
曾子涵 (Zeng Zihan)
Current Assignee
Suzhou Vocational University
Original Assignee
Suzhou Vocational University
Priority date
Filing date
Publication date
Application filed by Suzhou Vocational University filed Critical Suzhou Vocational University
Priority to CN201910376615.7A
Publication of CN110033778A
Application granted
Publication of CN110033778B
Status: Active


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00: Speaker identification or verification
    • G10L 17/26: Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices

Abstract

The invention relates to a real-time lie state recognition and correction system comprising a voice input module, a sample classification module, a semantic recognition module, a lie state recognition module, a voice dictionary, an identification module, a correction module and an output module. The voice input module is used for acquiring voice signals; the sample classification module is in communication connection with the voice input module; the semantic recognition module is in communication connection with the sample classification module; the lie state recognition module is in communication connection with the sample classification module; the voice dictionary is in communication connection between the semantic recognition module and the lie state recognition module; the identification module is in communication connection with both the semantic recognition module and the lie state recognition module; the correction module is in communication connection with the identification module and the voice dictionary; and the output module is in communication connection with the correction module. The system can recognize and mark the credibility level of speech, correct low-credibility speech and generate new questions; it is highly effective and the information it acquires is accurate.

Description

Real-time identification and correction system for lie state
Technical Field
The invention relates to voice and lie state detection, in particular to a lie state real-time recognition and correction system.
Background
Lies often appear in daily life. A well-intentioned lie may help a person avoid greater trouble or danger, but a malicious lie can bring great harm to others or to society. Lying is a typical psychological state that affects a person's behavior, facial expressions, physiological responses and language expression. For people with strong psychological composure this influence is extremely subtle and difficult for ordinary observers to perceive, so their lies cannot be quickly and effectively uncovered. With the development of science and technology, part of this information can now be collected accurately by equipment, laying a foundation for machine recognition of the lie state.
Since the 1990s, lie detection equipment in China has developed sufficiently to assist in solving more than 2,000 difficult cases. The Ministry of Public Security therefore listed it as a key project of the Eighth Five-Year Plan, and during the Ninth Five-Year Plan period it was listed as a national science and technology project. However, existing lie detectors still collect information on site and rely on manual, off-line judgment, so their timeliness is poor, the judgment cycle is long, and opportunities are easily missed; when low-credibility information appears, it is inconvenient to question the subject again about it; and the judgment process demands high professional skill of the personnel, which prevents wide adoption.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a real-time lie state recognition and correction system that can recognize and mark the credibility level of speech, correct speech with a low credibility level, generate targeted new questions, and acquire information effectively and accurately.
In order to solve the technical problem, the invention provides a real-time lie state recognition and correction system comprising a voice input module, a sample classification module, a semantic recognition module, a lie state recognition module, a voice dictionary, an identification module, a correction module and an output module. The voice input module is used for acquiring voice signals. The sample classification module is in communication connection with the voice input module and is used for distinguishing the voice signals of the questioner and the questioned person and, after noise reduction, transmitting the voice signals of the questioned person to the semantic recognition module and the lie state recognition module. The semantic recognition module is in communication connection with the sample classification module and is used for recognizing the meaning of the voice signal and splitting whole sentences into characters and words. The lie state recognition module is in communication connection with the sample classification module and is used for recognizing the credibility of the voice signal and grading the credibility of each character, word, sentence and situation. The voice dictionary is in communication connection between the semantic recognition module and the lie state recognition module and is used for storing an acoustic model and a language model. The identification module is in communication connection with both the semantic recognition module and the lie state recognition module, and couples the words output by the semantic recognition module with the credibility levels output by the lie state recognition module in time sequence to obtain the credibility level of each character and word. The correction module is in communication connection with the identification module and the voice dictionary and is used for performing antonym replacement on verbs and adjectives with a low credibility level output by the identification module and for modifying whole sentences and situations with a low credibility level. The output module is in communication connection with the correction module and is used for outputting the voice signal of the correction module in real time.
Preferably, the semantic recognition module comprises a voice processing unit, an acoustic matching unit and a language processing unit which are sequentially in communication connection; the voice processing unit is used for extracting the acoustic characteristics of the characters and words of the voice signal and giving each character and word a unique label; the acoustic matching unit is used for matching the acoustic features extracted by the voice processing unit with the acoustic features in the voice dictionary to obtain the component and meaning of each character and word; the language processing unit is used for correcting the components and meanings of the characters and words according to the grammatical structure, the semantic structure and the language model in the voice dictionary.
Preferably, the components are sentence components including subject, predicate, object, attributive, adverbial, complement and head word; the meaning is the sense that the character or word is intended to express.
Preferably, the lie state recognition module comprises a preprocessing unit, a sparse representation unit, a deep learning unit and a dynamic sparse Bayesian time-sequence modeling unit which are sequentially in communication connection; the preprocessing unit is used for sequentially performing framing, windowing, pre-emphasis, endpoint detection and PCA (principal component analysis) dimensionality reduction on the voice signals; the sparse representation unit is used for decomposing the preprocessed voice signal into an overcomplete dictionary and a corresponding sparse coefficient matrix; the deep learning unit is a machine learning model constructed from multiple hidden layers and is used for extracting the deep structure information of the voice signal; the dynamic sparse Bayesian time-sequence modeling unit is used for extracting the overall and detailed characteristic information of the voice signal.
Preferably, the overcomplete dictionary is a feature subspace of the feature set of the voice signal, and the sparse coefficient matrix is a characteristic parameter describing the projection of the intrinsic structure of the voice signal onto that feature subspace.
Preferably, the sparse representation method of the sparse representation unit may be based on a fast K-SVD sparse representation algorithm.
Preferably, the depth structure information includes a language feature, a physiological feature, an acoustic feature, and a psycho-perceptual feature.
A real-time lie state recognition and correction system comprises a voice input module, a sample classification module, a semantic recognition module, a lie state recognition module, a voice dictionary, an identification module, a correction module, an output module, a manual correction module, a large database and a question generation module. The voice input module is used for acquiring voice signals. The sample classification module is in communication connection with the voice input module and is used for distinguishing the voice signals of the questioner and the questioned person, transmitting the voice signals of the questioned person, after noise reduction, to the semantic recognition module and the lie state recognition module, and transmitting all input voice signals to the large database. The semantic recognition module is in communication connection with the sample classification module and is used for recognizing the meaning of the voice signal and splitting whole sentences into characters and words. The lie state recognition module is in communication connection with the sample classification module and is used for recognizing the credibility of the voice signal and grading the credibility of each character, word, sentence and situation. The voice dictionary is in communication connection between the semantic recognition module and the lie state recognition module and is used for storing an acoustic model and a language model. The identification module is in communication connection with both the semantic recognition module and the lie state recognition module, and couples the words output by the semantic recognition module with the credibility levels output by the lie state recognition module in time sequence to obtain the credibility level of each character and word. The correction module is in communication connection with the identification module and the voice dictionary and is used for performing antonym replacement on verbs and adjectives with a low credibility level output by the identification module and for modifying whole sentences and situations with a low credibility level. The output module is in communication connection with the correction module and is used for outputting the voice signals of the correction module and the question generation module in real time. The large database is in communication connection with the sample classification module and the lie state recognition module and is used for storing the voice signals generated during questioning and the information about the questioned person acquired from outside. The manual correction module is in communication connection between the correction module and the large database and is used for manually modifying the credibility level of the voice signals of the questioned person. The question generation module is in communication connection between the large database and the output module and is used for extracting characters, words, sentences and situations with a low credibility level in combination with the large database, generating corresponding questions, and outputting them to the questioner through the output module.
Preferably, the questioned person information includes, but is not limited to, personal basic information, activity time and scope information, credit information and social relationship information.
Preferably, the large database comprises a primary database and a correction database, the primary database is used for storing the voice information generated in the conversation process, and the correction database is used for storing the corrected voice information of the questioned person and keeping the credibility level of all the voice information.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention can detect the lie state of every word spoken by the questioned person, mark the credibility level of each character, word, sentence and situation, and modify those with low credibility so that they become more truthful; after modification, information with a higher credibility level can be output in real time, so the questioner can more easily know which information is correct and which is wrong, improving questioning efficiency and the authenticity of the acquired information;
2. The invention can combine all relevant information about the questioned person, extract the characters, words, sentences and situations with a low credibility level, generate corresponding new questions, and output them to the questioner through the output module. This creates a loop inside and outside the system that gradually raises the credibility of all information; after several cycles, all required true information can be obtained. Information is acquired quickly and accurately, a great deal of time is saved, and interference from false information is avoided.
Drawings
FIG. 1 is a schematic structural view of example 1;
FIG. 2 is a schematic structural view of example 2.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1:
Referring to FIG. 1, this example discloses a real-time lie state recognition and correction system, which includes a voice input module, a sample classification module, a semantic recognition module, a lie state recognition module, a voice dictionary, an identification module, a correction module, and an output module.
A voice input module:
the voice input module is a microphone. The voice input module is used for collecting voice signals.
A sample classification module:
The sample classification module is in communication connection with the voice input module. It is used for distinguishing the voice signals of the questioner and the questioned person while reducing noise. The voice signals of the questioned person extracted by the sample classification module are transmitted simultaneously to the semantic recognition module and the lie state recognition module.
Specifically, the sample classification module may distinguish the voice signal samples by way of manual keying, timing and/or timbre.
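As an illustration of timbre-based classification, the sketch below assigns a speech segment to the questioner or the questioned person by comparing its spectral centroid against a stored profile of the questioner's voice. This is a minimal sketch under assumed names (`classify_speaker`, `questioner_profile`); it is not the patented method, which may also use manual keying or timing.

```python
import numpy as np

def classify_speaker(segment, questioner_profile, threshold=0.5):
    """Assign a speech segment to the questioner or the questioned
    person by comparing a crude timbre feature (spectral centroid)
    against a stored profile of the questioner's voice.
    All names and the threshold are illustrative assumptions."""
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment))  # normalized frequencies
    centroid = float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12))
    # Relative distance from the questioner's known centroid
    distance = abs(centroid - questioner_profile) / (questioner_profile + 1e-12)
    return "questioner" if distance < threshold else "questioned"
```

In practice the profile would be estimated from enrollment speech of the questioner; here it is a single scalar for clarity.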
A semantic recognition module:
The semantic recognition module is in communication connection with the sample classification module. It is used for recognizing the meaning of the voice signal and splitting whole sentences into characters and words.
Specifically, the semantic recognition module includes a voice processing unit, an acoustic matching unit and a language processing unit, which are sequentially connected in a communication manner.
The voice processing unit is used for extracting acoustic characteristics of characters and words of a whole sentence of the voice signal and endowing each character and word with a unique label.
The acoustic matching unit is used for matching the acoustic features extracted by the voice processing unit with the acoustic features in the voice dictionary to obtain the component and meaning of each character and word.
The language processing unit is used for correcting the components and the meanings of the words and the phrases according to the grammar structure, the semantic structure and the language model in the voice dictionary.
The above components are sentence components, including subject, predicate, object, attributive, adverbial, complement and head word; the meaning is the sense that the character or word is intended to express.
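A minimal sketch of the acoustic matching unit's lookup step, assuming a toy `VOICE_DICTIONARY` that maps acoustic feature vectors to words and their sentence components; the real voice dictionary stores full acoustic and language models rather than a hard-coded table.

```python
import math

# Hypothetical miniature "voice dictionary": each entry maps an
# acoustic feature vector to a word and its sentence component.
VOICE_DICTIONARY = {
    (0.1, 0.8): ("I", "subject"),
    (0.7, 0.2): ("went", "predicate"),
    (0.4, 0.5): ("home", "object"),
}

def match_acoustic(feature):
    """Return the dictionary word whose stored feature vector is
    nearest (Euclidean distance) to the extracted feature, a
    stand-in for the acoustic matching unit described above."""
    best = min(VOICE_DICTIONARY, key=lambda ref: math.dist(feature, ref))
    return VOICE_DICTIONARY[best]
```

A real matcher would score sequences of frames against acoustic models (e.g. with dynamic programming), not single vectors.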
Lie state recognition module:
The lie state recognition module is in communication connection with the sample classification module. It is used for recognizing the credibility of the voice signals and grading them. When the lie state recognition module recognizes and grades credibility, external sensors can be used to collect the facial expression, emotional change and physiological change information of the questioned person, further improving grading accuracy.
Specifically, the lie state recognition module comprises a preprocessing unit, a sparse representation unit, a deep learning unit and a dynamic sparse Bayesian time-sequence modeling unit which are sequentially in communication connection.
The preprocessing unit is used for performing framing, windowing, pre-emphasis, endpoint detection and PCA dimension reduction processing on the voice signals in sequence.
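The first steps of this preprocessing chain can be sketched as follows. Endpoint detection and PCA reduction are omitted for brevity, and the frame length, hop size and pre-emphasis coefficient are illustrative values, not ones specified by the patent.

```python
import numpy as np

def preprocess(signal, frame_len=256, hop=128, alpha=0.97):
    """Sketch of the preprocessing chain: pre-emphasis, framing
    and Hamming windowing.  Parameter values are illustrative."""
    # Pre-emphasis boosts high frequencies: y[n] = x[n] - alpha * x[n-1]
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Split into overlapping frames of frame_len samples
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    # Apply a Hamming window to each frame to reduce spectral leakage
    return frames * np.hamming(frame_len)
```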
The sparse representation unit is used for decomposing the preprocessed voice signal into an overcomplete dictionary and a corresponding sparse coefficient matrix. The overcomplete dictionary is a feature subspace of the feature set of the voice signal; the sparse coefficient matrix is a characteristic parameter describing the projection of the intrinsic structure of the voice signal onto that subspace. Sparse representation of the raw voice data has a clear structure, high efficiency and low processing cost, which facilitates subsequent processing. The sparse representation method of the unit may be based on a fast K-SVD sparse representation algorithm.
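The sparse-coding half of such a scheme can be illustrated with Orthogonal Matching Pursuit (OMP), which K-SVD commonly uses to compute the sparse coefficients of each signal over the current dictionary; the dictionary-update half of K-SVD is not shown, so this is a sketch of one building block rather than the full algorithm.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal Matching Pursuit: approximate x as a k-sparse
    combination of the columns (atoms) of dictionary D."""
    residual = x.astype(float)
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit of x on the selected atoms
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef
```

With an identity dictionary OMP simply recovers the largest entries of the signal, which makes its behavior easy to check.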
The deep learning unit is a machine learning model constructed from multiple hidden layers and is used for extracting the deep structure information of the voice signal. The deep structure information includes linguistic features, physiological features, acoustic features and psycho-perceptual features. The extraction method of the deep learning unit is as follows: the good multi-dimensional linear decomposition capability of sparse representation and the deep-structure advantage of multilayer nonlinear mapping are used to approximate a complex function; the input data of the deep learning network are mapped step by step through an unsupervised sparse auto-encoding algorithm; and the deep structure information of the voice signal is extracted. The deep learning unit improves the accuracy and timeliness of voice data analysis. The machine learning model may be a sparse stacked auto-encoder (SSAE) network, whose design and network structure are simple and easy to operate.
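As a hedged illustration of the stacked-autoencoder idea, the sketch below runs only the forward (encoding) pass of a small SSAE-style network with random weights; the layer sizes and names are assumptions, and the unsupervised training with a sparsity penalty is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ssae_encode(x, weights, biases):
    """Forward (encoding) pass of a stacked autoencoder: each hidden
    layer maps the previous layer's activations through a sigmoid,
    yielding progressively deeper feature representations."""
    h = x
    for W, b in zip(weights, biases):
        h = sigmoid(W @ h + b)
    return h

rng = np.random.default_rng(0)
# Illustrative layer sizes: 64-d input -> 32 -> 16 deep features
sizes = [64, 32, 16]
weights = [rng.standard_normal((m, n)) * 0.1
           for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
features = ssae_encode(rng.standard_normal(64), weights, biases)
```

In a trained SSAE each layer's weights come from greedy layer-wise pretraining; here random weights serve only to show the data flow.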
The dynamic sparse Bayesian time-sequence modeling unit is used for extracting the overall and detailed characteristic information of the voice signal. Its extraction method is as follows: the change of the lying psychological state over time is reflected dynamically through a sparse Bayesian network topology, and from this the association and time-sequence relations among the features corresponding to each variable in the network are calculated. By extracting features on different time scales, the unit can express the lie state in depth, and its detection accuracy is high.
A voice dictionary:
The voice dictionary is in communication connection between the semantic recognition module and the lie state recognition module. It is used to store the acoustic model and the language model, providing a data foundation.
An identification module:
The identification module is in communication connection with both the semantic recognition module and the lie state recognition module. It couples the words output by the semantic recognition module with the credibility levels output by the lie state recognition module in time sequence, thereby obtaining the credibility level of each character and word.
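The time-sequence coupling can be sketched as interval alignment: each recognized word is labelled with the credibility interval that overlaps it most. The tuple format `(start, end, value)` and the function name are illustrative assumptions, not the patent's data format.

```python
def couple(words, levels):
    """Couple the word stream from the semantic recognition module
    with the credibility-level stream from the lie state recognition
    module by overlapping time stamps.  Both inputs are lists of
    (start, end, value) tuples; the format is illustrative."""
    labelled = []
    for w_start, w_end, word in words:
        # Pick the credibility interval that overlaps the word most
        best = max(levels,
                   key=lambda lv: min(w_end, lv[1]) - max(w_start, lv[0]))
        labelled.append((word, best[2]))
    return labelled
```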
A correction module:
The correction module is in communication connection with both the identification module and the voice dictionary. It performs antonym replacement on the verbs and adjectives with a low credibility level output by the identification module, modifies whole sentences and situations with a low credibility level, and obtains the corresponding voice signals from the voice dictionary.
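A minimal sketch of the antonym-replacement step, assuming a hypothetical hard-coded antonym table in place of the voice dictionary and a numeric credibility scale where lower means less credible.

```python
# Hypothetical antonym table; a real system would draw antonyms
# from the voice dictionary rather than a hard-coded map.
ANTONYMS = {"took": "returned", "large": "small", "never": "always"}

def correct(tagged_words, threshold=2):
    """Replace words whose credibility level falls below a threshold
    with their antonyms, as the correction module does for verbs and
    adjectives.  tagged_words: list of (word, level) pairs."""
    corrected = []
    for word, level in tagged_words:
        if level < threshold and word in ANTONYMS:
            corrected.append(ANTONYMS[word])
        else:
            corrected.append(word)
    return " ".join(corrected)
```

For example, a low-credibility denial such as "never took" would come out as its assertion "always returned".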
An output module:
the output module is in communication connection with the correction module. The output module is used for outputting the voice signal of the correction module in real time. The output signal of the output module may be a sound signal or a text signal.
With the above structure, the system can accurately recognize and mark the lie state of the questioned person's voice signal and output the corrected result to the questioner, so that the questioner learns in real time how truthful the questioned person's words are and can continue questioning in a more targeted manner, thereby acquiring true information with high accuracy and timeliness.
Example 2:
Referring to FIG. 2, this example discloses a real-time lie state recognition and correction system, which includes a voice input module, a sample classification module, a semantic recognition module, a lie state recognition module, a voice dictionary, an identification module, a correction module, an output module, a manual correction module, a large database and a question generation module.
A voice input module:
the voice input module is a microphone. The voice input module is used for collecting voice signals.
A sample classification module:
the sample classification module is in communication connection with the voice input module. The sample classification module is used for distinguishing the voice signals of the questioner and the questioned person and simultaneously reducing noise. The sample classification module can convey the extracted voice signals of the questioned person to the semantic recognition module and the lie state recognition module, and convey all the input voice signals to the large database.
Specifically, the sample classification module may distinguish the voice signal samples by way of manual keying, timing and/or timbre.
A semantic recognition module:
The semantic recognition module is in communication connection with the sample classification module. It is used for recognizing the meaning of the voice signal and splitting whole sentences into characters and words.
Specifically, the semantic recognition module includes a voice processing unit, an acoustic matching unit and a language processing unit, which are sequentially connected in a communication manner.
The voice processing unit is used for extracting acoustic characteristics of characters and words of a whole sentence of the voice signal and endowing each character and word with a unique label.
The acoustic matching unit is used for matching the acoustic features extracted by the voice processing unit with the acoustic features in the voice dictionary to obtain the component and meaning of each character and word.
The language processing unit is used for correcting the components and the meanings of the words and the phrases according to the grammar structure, the semantic structure and the language model in the voice dictionary.
The above components are sentence components, including subject, predicate, object, attributive, adverbial, complement and head word; the meaning is the sense that the character or word is intended to express.
Lie state recognition module:
The lie state recognition module is in communication connection with the sample classification module. It is used for recognizing the credibility of the voice signals and grading them. When the lie state recognition module recognizes and grades credibility, external sensors can be used to collect the facial expression, emotional change and physiological change information of the questioned person, further improving grading accuracy.
Specifically, the lie state recognition module comprises a preprocessing unit, a sparse representation unit, a deep learning unit and a dynamic sparse Bayesian time-sequence modeling unit which are sequentially in communication connection.
The preprocessing unit is used for performing framing, windowing, pre-emphasis, endpoint detection and PCA dimension reduction processing on the voice signals in sequence.
The sparse representation unit is used for decomposing the preprocessed voice signal into an overcomplete dictionary and a corresponding sparse coefficient matrix. The overcomplete dictionary is a feature subspace of the feature set of the voice signal; the sparse coefficient matrix is a characteristic parameter describing the projection of the intrinsic structure of the voice signal onto that subspace. Sparse representation of the raw voice data has a clear structure, high efficiency and low processing cost, which facilitates subsequent processing. The sparse representation method of the unit may be based on a fast K-SVD sparse representation algorithm.
The deep learning unit is a machine learning model constructed from multiple hidden layers and is used for extracting the deep structure information of the voice signal. The deep structure information includes linguistic features, physiological features, acoustic features and psycho-perceptual features. The extraction method of the deep learning unit is as follows: the good multi-dimensional linear decomposition capability of sparse representation and the deep-structure advantage of multilayer nonlinear mapping are used to approximate a complex function; the input data of the deep learning network are mapped step by step through an unsupervised sparse auto-encoding algorithm; and the deep structure information of the voice signal is extracted. The deep learning unit improves the accuracy and timeliness of voice data analysis. The machine learning model may be a sparse stacked auto-encoder (SSAE) network, whose design and network structure are simple and easy to operate.
The dynamic sparse Bayesian time-sequence modeling unit is used for extracting the overall and detailed characteristic information of the voice signal. Its extraction method is as follows: the change of the lying psychological state over time is reflected dynamically through a sparse Bayesian network topology, and from this the association and time-sequence relations among the features corresponding to each variable in the network are calculated. By extracting features on different time scales, the unit can express the lie state in depth, and its detection accuracy is high.
A voice dictionary:
The voice dictionary is in communication connection between the semantic recognition module and the lie state recognition module. It is used to store the acoustic model and the language model, providing a data foundation.
An identification module:
The identification module is in communication connection with both the semantic recognition module and the lie state recognition module. It couples the words output by the semantic recognition module with the credibility levels output by the lie state recognition module in time sequence, thereby obtaining the credibility level of each character and word.
A correction module:
The correction module is in communication connection with both the identification module and the voice dictionary. It performs antonym replacement on the verbs and adjectives with a low credibility level output by the identification module, modifies whole sentences and situations with a low credibility level, and obtains the corresponding voice signals from the voice dictionary.
An output module:
The output module is in communication connection with the correction module. It is used for outputting the voice signals of the correction module and the question generation module in real time. The output signal may be a sound signal or a text signal.
A large database:
the big databases are respectively connected to the sample classification module and the lie state identification module in a communication mode. The large database is used for storing voice signals generated in the questioning process and questioned person information acquired from the outside. Questioned person information includes, but is not limited to, personal basic information, activity time and scope information, credit information, and social relationship information. The big database can provide a data base for semantic recognition and lie states.
The large database comprises a raw database and a correction database. The raw database is used for storing the voice information generated during the conversation. The correction database is used for storing the corrected voice information of the questioned person and preserving the credibility levels of all the voice information.
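The two-part database can be sketched with a raw table for utterances as spoken and a correction table keeping corrected text plus each utterance's credibility level. All table and column names here are assumptions for illustration:

```python
import sqlite3

# In-memory stand-in for the raw database + correction database pairing.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE raw_speech (id INTEGER PRIMARY KEY, speaker TEXT, text TEXT);
    CREATE TABLE corrected_speech (
        raw_id INTEGER REFERENCES raw_speech(id),
        text TEXT,
        confidence REAL);  -- credibility level preserved per utterance
""")
db.execute("INSERT INTO raw_speech VALUES (1, 'subject', 'I went out')")
db.execute("INSERT INTO corrected_speech VALUES (1, 'I stayed home', 0.2)")

# Join raw and corrected records to compare the spoken vs. corrected text.
row = db.execute(
    "SELECT r.text, c.text, c.confidence FROM raw_speech r "
    "JOIN corrected_speech c ON c.raw_id = r.id").fetchone()
```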
A manual correction module:
the manual correction module is in communication connection between the correction module and the large database. The manual correction module is used for manually modifying the credibility level of the voice signal of the questioned person. When the questioner is sure that the questioned person is answering truthfully, the credibility level can be raised manually to speed up the question-and-answer process.
A question generation module:
the question generation module is in communication connection between the large database and the output module. The question generation module is used for extracting characters, words, sentences and situations with a low credibility level in combination with the large database, generating corresponding new questions, and outputting them to the questioner through the output module. The question generation module facilitates obtaining information with a higher credibility level.
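A minimal sketch of follow-up question generation from low-credibility items; the question template and threshold are assumptions (a real system would also consult the large database for context):

```python
def generate_followups(scored_items, threshold=0.5):
    """Turn each (item, credibility) pair below the threshold into a
    new question for the questioner to ask."""
    return [f"Can you tell me more about '{item}'?"
            for item, conf in scored_items if conf < threshold]

items = [("home", 0.3), ("work", 0.9), ("last night", 0.4)]
questions = generate_followups(items)  # only low-credibility items recur
```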
This optimized structure forms a loop through the system: with each cycle the credibility of the information gradually increases, and after several cycles all of the required truthful information is obtained. Information acquisition is fast and accurate, a large amount of time is saved, and interference from erroneous information is avoided.
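The cyclic question-until-credible flow can be sketched as a simple loop; the callback, threshold, and round limit are illustrative assumptions:

```python
def question_until_credible(items, ask, threshold=0.8, max_rounds=5):
    """Keep re-questioning low-credibility items, updating each item's
    credibility with the answer from `ask`, until everything clears the
    threshold or the round limit is reached."""
    items = dict(items)
    for _ in range(max_rounds):
        doubtful = [k for k, v in items.items() if v < threshold]
        if not doubtful:
            break  # all information is now credible
        for k in doubtful:
            items[k] = max(items[k], ask(k))
    return items

# Toy answer model: each re-ask yields a progressively more credible answer.
asked = {"count": 0}
def mock_answer(item):
    asked["count"] += 1
    return 0.3 * asked["count"]

result = question_until_credible([("alibi", 0.2)], mock_answer)
```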
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It should be noted that the above-mentioned embodiments are not intended to limit the scope of the present invention, and all equivalent modifications and substitutions based on the above-mentioned technical solutions are within the scope of the present invention as defined in the claims.

Claims (8)

1. A real-time lie state recognition and correction system is characterized by comprising a voice input module, a sample classification module, a semantic recognition module, a lie state recognition module, a voice dictionary, an identification module, a correction module and an output module;
the voice input module is used for acquiring voice signals;
the sample classification module is in communication connection with the voice input module and is used for distinguishing voice signals of a questioner and a questioned person and transmitting the voice signals of the questioned person to the semantic recognition module and the lie state recognition module after noise reduction;
the semantic recognition module is in communication connection with the sample classification module and is used for recognizing the semantic meaning of the voice signal and splitting the whole sentence into characters and words;
the lie state recognition module is in communication connection with the sample classification module and is used for recognizing the credibility of the voice signal and grading the credibility of each character, word, sentence and situation;
the voice dictionary is in communication connection between the semantic recognition module and the lie state recognition module and is used for storing an acoustic model and a language model;
the identification module is respectively in communication connection with the semantic recognition module and the lie state recognition module, and couples, in time sequence, the characters and words output by the semantic recognition module with the credibility levels output by the lie state recognition module to obtain the credibility level of each character and word;
the correction module is respectively in communication connection with the identification module and the voice dictionary, and is used for performing antonym replacement on verbs and adjectives with a low credibility level output by the identification module and modifying whole sentences and situations with a low credibility level;
the output module is in communication connection with the correction module and is used for outputting the voice signal of the correction module in real time; the semantic recognition module comprises a voice processing unit, an acoustic matching unit and a language processing unit which are sequentially in communication connection;
the voice processing unit is used for extracting acoustic characteristics of characters and words of the voice signal and endowing each character and word with a unique label;
the acoustic matching unit is used for matching the acoustic features extracted by the voice processing unit with the acoustic features in the voice dictionary to obtain the components and the meanings of each word and each word;
the language processing unit is used for correcting the components and the meanings of the words and the phrases according to the grammar structure, the semantic structure and the language model in the voice dictionary;
the components are sentence components, comprising subjects, predicates, objects, attributives, adverbials, complements and head words; the meanings are the intended senses of the characters and words.
2. The lie state real-time identification and correction system according to claim 1, wherein the lie state identification module comprises a preprocessing unit, a sparse representation unit, a deep learning unit and a dynamic sparse bayesian timing modeling unit which are sequentially connected in communication;
the preprocessing unit is used for sequentially performing framing, windowing, pre-emphasis, endpoint detection and PCA (principal component analysis) dimensionality reduction on the voice signals;
the sparse representation unit is used for decomposing the preprocessed voice signal into an over-complete dictionary and a corresponding sparse coefficient matrix;
the deep learning unit is a machine learning model constructed by multiple hidden layers and used for extracting deep structure information of the voice signal;
the dynamic sparse Bayesian time sequence modeling unit is used for extracting overall characteristic information and detail characteristic information of the voice signal.
3. The lie state real-time recognition correction system according to claim 2, wherein the overcomplete sparse dictionary is a feature subspace of a feature set of a speech signal; the sparse coefficient matrix is a characteristic parameter of a projection intrinsic structure of the voice signal on the characteristic subspace.
4. The lie state real-time identification correction system according to claim 3, wherein the sparse representation method of the sparse representation unit is based on a fast K-SVD sparse representation algorithm.
5. The lie state real-time recognition correction system according to claim 4, wherein the depth structure information includes a language feature, a physiological feature, an acoustic feature and a psycho-perceptual feature.
6. A real-time lie state recognition and correction system is characterized by comprising a voice input module, a sample classification module, a semantic recognition module, a lie state recognition module, a voice dictionary, an identification module, a correction module, an output module, a manual correction module, a large database and a question generation module;
the voice input module is used for acquiring voice signals;
the sample classification module is in communication connection with the voice input module and is used for distinguishing voice signals of a questioner and a questioned person, transmitting the voice signals of the questioned person to the semantic recognition module and the lie state recognition module after noise reduction, and transmitting all input voice signals to the large database;
the semantic recognition module is in communication connection with the sample classification module and is used for recognizing the semantic meaning of the voice signal and splitting the whole sentence into characters and words;
the lie state recognition module is in communication connection with the sample classification module and is used for recognizing the credibility of the voice signal and grading the credibility of each character, word, sentence and situation;
the voice dictionary is in communication connection between the semantic recognition module and the lie state recognition module and is used for storing an acoustic model and a language model;
the identification module is respectively in communication connection with the semantic recognition module and the lie state recognition module, and couples, in time sequence, the characters and words output by the semantic recognition module with the credibility levels output by the lie state recognition module to obtain the credibility level of each character and word;
the correction module is respectively in communication connection with the identification module and the voice dictionary, and is used for performing antonym replacement on verbs and adjectives with a low credibility level output by the identification module and modifying whole sentences and situations with a low credibility level;
the output module is in communication connection with the correction module and is used for outputting the voice signals of the correction module and the question generation module in real time;
the large database is respectively in communication connection with the sample classification module and the lie state recognition module, and is used for storing the voice signals generated in the questioning process and the questioned person information acquired from outside;
the manual correction module is in communication connection between the correction module and the large database, and is used for manually modifying the credibility level of the voice signal of the questioned person;
the question generation module is in communication connection between the large database and the output module, and is used for extracting characters, words, sentences and situations with a low credibility level in combination with the large database, generating corresponding new questions, and outputting them to the questioner through the output module.
7. The lie state real-time identification correction system according to claim 6, wherein the questioned person information includes personal basic information, activity time and scope information, credit information and social relationship information.
8. The lie state real-time recognition correction system according to claim 7, wherein the large database comprises a raw database and a correction database, the raw database is used for storing the voice information generated during the conversation, the correction database is used for storing the corrected voice information of the questioned person, and the credibility level of all the voice information is preserved.
CN201910376615.7A 2019-05-07 2019-05-07 Real-time identification and correction system for lie state Active CN110033778B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910376615.7A CN110033778B (en) 2019-05-07 2019-05-07 Real-time identification and correction system for lie state

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910376615.7A CN110033778B (en) 2019-05-07 2019-05-07 Real-time identification and correction system for lie state

Publications (2)

Publication Number Publication Date
CN110033778A CN110033778A (en) 2019-07-19
CN110033778B true CN110033778B (en) 2021-07-23

Family

ID=67241449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910376615.7A Active CN110033778B (en) 2019-05-07 2019-05-07 Real-time identification and correction system for lie state

Country Status (1)

Country Link
CN (1) CN110033778B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2596351B (en) * 2020-06-26 2022-10-19 Lexiqal Ltd Forensic interaction understanding: A system and method for determining the nature of spoken interactions using acoustic and linguistic markers

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101999902A (en) * 2009-09-03 2011-04-06 上海天岸电子科技有限公司 Voiceprint lie detector and voiceprint lie detecting method
CN205493847U (en) * 2016-03-04 2016-08-24 北京同方神火联合科技发展有限公司 Pronunciation analytic system
CN105976820A (en) * 2016-06-14 2016-09-28 上海质良智能化设备有限公司 Voice emotion analysis system
CN107293302A (en) * 2017-06-27 2017-10-24 苏州大学 A kind of sparse spectrum signature extracting method being used in voice lie detection system
CN107680602A (en) * 2017-08-24 2018-02-09 平安科技(深圳)有限公司 Voice fraud recognition methods, device, terminal device and storage medium
WO2018135303A1 (en) * 2017-01-18 2018-07-26 ソニー株式会社 Information processing device, information processing method, and program
CN108520753A (en) * 2018-02-26 2018-09-11 南京工程学院 Voice lie detection method based on the two-way length of convolution memory network in short-term

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8972266B2 (en) * 2002-11-12 2015-03-03 David Bezar User intent analysis extent of speaker intent analysis system
US7571101B2 (en) * 2006-05-25 2009-08-04 Charles Humble Quantifying psychological stress levels using voice patterns
CN102104676A (en) * 2009-12-21 2011-06-22 深圳富泰宏精密工业有限公司 Wireless communication device with lie detection function and lie detection method thereof
KR101233628B1 (en) * 2010-12-14 2013-02-14 유비벨록스(주) Voice conversion method and terminal device having the same
US20180160959A1 (en) * 2016-12-12 2018-06-14 Timothy James Wilde Modular electronic lie and emotion detection systems, methods, and devices
CN107452405B (en) * 2017-08-16 2021-04-09 北京易真学思教育科技有限公司 Method and device for evaluating data according to voice content

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Lie detection method based on sparse representation of speech; Zhou Yan; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2018-04-30; I136-1 *
Research status and prospects of speech lie detection technology; Zhao Li; Journal of Data Acquisition and Processing; 2017-02-28; 246-257 *
Research on detection and analysis of deceptive speech; Pan Xinyu; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2017-03-31; I136-18 *

Also Published As

Publication number Publication date
CN110033778A (en) 2019-07-19

Similar Documents

Publication Publication Date Title
CN110164447B (en) Spoken language scoring method and device
CN112750465A (en) Cloud language ability evaluation system and wearable recording terminal
CN103971675A (en) Automatic voice recognizing method and system
CN110175229B (en) Method and system for on-line training based on natural language
CN111445898B (en) Language identification method and device, electronic equipment and storage medium
CN112466316A (en) Zero-sample voice conversion system based on generation countermeasure network
CN111554293A (en) Method, device and medium for filtering noise in voice recognition and conversation robot
CN112015874A (en) Student mental health accompany conversation system
CN114328817A (en) Text processing method and device
Chakraborty et al. Knowledge-based framework for intelligent emotion recognition in spontaneous speech
CN116127015A (en) NLP large model analysis system based on artificial intelligence self-adaption
JP6605105B1 (en) Sentence symbol insertion apparatus and method
CN110033778B (en) Real-time identification and correction system for lie state
Abhishek et al. Aiding the visually impaired using artificial intelligence and speech recognition technology
Ramadani et al. A new technology on translating Indonesian spoken language into Indonesian sign language system.
WO2022134025A1 (en) Offline speech recognition method and apparatus, electronic device and readable storage medium
Boakye et al. Any questions? Automatic question detection in meetings
CN110853669B (en) Audio identification method, device and equipment
CN113724693B (en) Voice judging method and device, electronic equipment and storage medium
CN117198338B (en) Interphone voiceprint recognition method and system based on artificial intelligence
CN112562856B (en) Method and system for searching health knowledge through voice
CN112735475B (en) Method and system for searching disease knowledge through voice
CN116915894A (en) Incoming call identity recognition method and device, electronic equipment and readable storage medium
CN111916106B (en) Method for improving pronunciation quality in English teaching
CN116168824A (en) Multi-modal mental disorder assessment method, computer device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant