TWM630105U - System for detecting a target lies - Google Patents

System for detecting a target lies Download PDF

Info

Publication number
TWM630105U
TWM630105U (application TW110215746U)
Authority
TW
Taiwan
Prior art keywords
feature
micro-expression
unit
dimensional
Prior art date
Application number
TW110215746U
Other languages
Chinese (zh)
Inventor
黃子源
蔡祈岩
蔡宜倫
郭峻成
Original Assignee
星展(台灣)商業銀行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 星展(台灣)商業銀行股份有限公司
Priority to TW110215746U priority Critical patent/TWM630105U/en
Publication of TWM630105U publication Critical patent/TWM630105U/en

Abstract

According to one embodiment of the present utility model, a system for detecting whether a target person lies is disclosed. The system comprises a storage device, a connection device, and a processing device. The storage device provides a data storage space. The processing device is electrically connected to the storage device through the connection device to access the data storage space, and comprises a feature extraction module, a micro-expression recognition module, a voice emotion recognition module, and a deep learning model. The feature extraction module performs low-level analysis on image data to extract multi-dimensional features. The micro-expression recognition module performs high-level analysis on the multi-dimensional features to identify at least one micro-expression action. The voice emotion recognition module generates a spectrogram from a voice signal and analyzes the spectrogram to obtain an emotion feature. The deep learning model generates a probability that the target lies according to the multi-dimensional features, the micro-expression action, and the emotion feature.

Description

Lie detection system

The present utility model relates to a lie detection system that identifies whether a target person is lying, and more particularly to a lie detection system that assists in vetting financial transactions or preventing computer crime.

Since the polygraph was introduced more than a hundred years ago, it has continued to rely on sensing devices that measure changes in a subject's heartbeat, blood pressure, skin conductivity, and the like to judge whether the subject is lying. Even with this wealth of signals, the accuracy of polygraphs has long been questioned. Moreover, as the forms of financial transactions change rapidly, polygraphs that require subjects to be tested in person are ill suited to vetting financial transactions or preventing computer crime. There is therefore an urgent need for a novel lie detection technology that is easy to use and judges accurately.

One purpose of the present utility model is to provide a lie detection system that combines the analysis results for image data and for a voice signal to generate a probability that a target person is lying, thereby checking whether a financial transaction is abnormal or even preventing unlawful computer crime. Preferably, highly accurate artificial-intelligence techniques can be introduced, such as analysis with a deep learning model, without deploying sensing devices for physiological signals such as heartbeat, blood pressure, or skin conductivity, making the system easy to use.

According to another aspect of the present utility model, a lie detection system for identifying whether a target person is lying is provided, comprising a storage device, a connection device, and a processing device. The storage device provides a data storage space. The processing device is electrically connected to the storage device through the connection device to use the data storage space, and comprises a feature extraction module, a micro-expression recognition module, a voice emotion recognition module, and a deep learning model. The feature extraction module performs low-level analysis on image data to extract multi-dimensional features; the micro-expression recognition module performs high-level analysis on the multi-dimensional features to identify at least one micro-expression action; the voice emotion recognition module generates spectrogram data from a voice signal and analyzes the spectrogram data to obtain an emotion feature; and the deep learning model generates a probability that the target person is lying according to the multi-dimensional features, the micro-expression action, and the emotion feature.

1: Lie detection system
10: Processing device
15: Connection device
20: Storage device
30: Feature extraction module
31: CNN unit
40: Micro-expression recognition module
41: Heuristic filtering unit
42: GCN unit
50: Voice emotion recognition module
51: Spectrum conversion unit
52: CNN unit
60: Deep learning model
61: Feature latent space
62: Multi-modal recognition module
63: Regression classification set
S100, S200, S300, S400: Steps

FIG. 1 shows a system architecture diagram of a lie detection system according to an embodiment of the present utility model.

FIG. 2 is a flowchart of a lie detection method according to an embodiment of the present utility model.

FIG. 3 shows a functional block diagram of a lie detection system according to another embodiment of the present utility model.

FIG. 4 shows a functional block diagram of a deep learning model according to another embodiment of the present utility model.

To further illustrate the embodiments and their advantages, the present utility model provides the following description in conjunction with the drawings. The drawings form part of this disclosure; they mainly serve to illustrate the embodiments and, together with the related description in the specification, to explain their operating principles. With reference to these materials, a person of ordinary skill in the art should be able to understand other possible implementations and the advantages of the present utility model. Elements in the figures are not drawn to scale, and similar reference numerals generally denote similar elements. As used here, "embodiment", "example", and "this embodiment" do not refer only to a single embodiment; they also cover examples implemented by combining features in different ways without departing from the spirit and scope of the present utility model. The terms used here merely describe specific embodiments that illustrate the principles of the utility model and should not limit it. Thus, "in" can include "within" and "on"; "a" and "the" can include the singular or the plural; "by" can mean "from"; and "if" can mean "when" or "once", depending on context. In addition, "and/or" can include any possible combination of the associated elements.

This specification discloses several examples of a lie detection method and system that can identify whether a target person is lying. They combine the analysis results for image data and for a voice signal to generate a probability that a target person is lying, thereby checking whether a financial transaction is abnormal or even preventing unlawful computer crime. Preferably, highly accurate artificial-intelligence techniques can be introduced, such as analysis with a deep learning model, without deploying sensing devices for physiological signals such as heartbeat, blood pressure, or skin conductivity, so that the system is easy to use. FIG. 1 shows a lie detection system according to an embodiment of the present utility model. The lie detection system 1 includes a processing device 10, a connection device 15, and a storage device 20. The processing device 10 is electrically connected through the connection device 15 to the storage device 20 and controls its operation, using the data storage space of the storage device 20 and accessing the data stored there. Through a communication link device (not shown), the storage device 20 can form a communication link with an external database (not shown) or with a video recording device such as a camera, and thereby receive a video of a target person, for example one capturing the target person's face together with an audio recording, shot while the target person conducts a financial transaction. Before or after being input to the storage device 20, the video can be processed and split into image data and a voice signal. In this example, the processing device 10 can be a processor such as a central processing unit (CPU) or a graphics processing unit (GPU), preferably one capable of parallel matrix operations; the connection device 15 can be a channel that carries signals and data, such as a bus or a motherboard; the storage device 20 can be a memory, a hard disk, or a database; and the communication link device can be a wireless or a wired communication link device. Note, however, that the present utility model is not limited to these choices. In this embodiment, the lie detection system 1 is implemented with a central processing unit, the storage device 20 with a memory, and the communication link device with a wireless communication link device; in other embodiments, the lie detection system can be implemented with other components.
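
The disclosure leaves the splitting of the recorded clip into image data and a voice signal unspecified. The following is a minimal sketch of one way it might be done, assuming OpenCV and a locally installed ffmpeg; the sampling rate, channel layout, and output path are illustrative choices, not part of the disclosure.

```python
import subprocess

import cv2  # OpenCV, used here to read the clip frame by frame


def split_video(path: str, audio_out: str = "voice.wav"):
    """Split a recorded clip into per-frame image data (DI) and a voice track (DV)."""
    frames = []
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)  # one BGR image per frame
    cap.release()

    # Extract a mono 16 kHz audio track with ffmpeg (assumed to be installed).
    subprocess.run(
        ["ffmpeg", "-y", "-i", path, "-vn", "-ac", "1", "-ar", "16000", audio_out],
        check=True,
    )
    return frames, audio_out
```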

In this embodiment, the processing device 10 is configured to execute the lie detection method shown in FIG. 2, which mainly comprises four steps S100-S400: the image data and the voice signal are analyzed separately, and the probability that the target person is lying is then judged comprehensively. First, in step S100, the processing device 10 performs low-level analysis on the image data to extract multi-dimensional features. Next, in step S200, the processing device 10 performs high-level analysis on the multi-dimensional features to identify at least one micro-expression action. Next, in step S300, the processing device 10 generates spectrogram data from a voice signal and analyzes the spectrogram data to obtain an emotion feature. Finally, in step S400, the processing device 10 generates the probability that the target person is lying according to the multi-dimensional features, the micro-expression action, and the emotion feature. Note that in other embodiments any of these steps may be refined into sub-steps, and other steps may be added before, between, and/or after any two steps; the method is not limited to the above.
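
To make the flow of steps S100-S400 concrete, here is a sketch of how the four steps compose. The callables stand for the modules sketched in the following paragraphs; their names and signatures are illustrative, not taken from the disclosure.

```python
from typing import Callable, Sequence

import numpy as np


def detect_lie(
    frames: Sequence[np.ndarray],  # per-frame image data DI
    voice_wav: str,                # path to the voice signal DV
    extract: Callable,             # S100: feature extraction module
    micro: Callable,               # S200: micro-expression recognition module
    emotion: Callable,             # S300: voice emotion recognition module
    fuse: Callable,                # S400: deep learning model
) -> float:
    """Compose steps S100-S400 and return the probability that the target lies."""
    feats = [extract(f) for f in frames]  # S100: low-level analysis per frame
    actions = micro(feats)                # S200: high-level analysis of the features
    emo = emotion(voice_wav)              # S300: spectrogram -> emotion feature
    return fuse(feats, actions, emo)      # S400: probability in [0, 1]
```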

Please refer to FIG. 1 together with FIG. 3, which shows a functional block diagram of the lie detection system of an embodiment; each functional block can be implemented in the lie detection system 1 in hardware or in software. The lie detection system 1 of this embodiment can execute a lie detection method. It includes a processing device 10, a connection device 15, and a storage device 20, and the processing device 10 includes a feature extraction module 30, a micro-expression recognition module 40, a voice emotion recognition module 50, and a deep learning model 60. The feature extraction module 30 includes a convolutional neural network (CNN) unit 31 or an encoder; the CNN unit 31 is taken as the example here. It can be downloaded and/or configured from a host, such as a remote server, in an initial step, or this initial step can be omitted, depending on the requirements of the application. For example, when the processing device 10 runs on an endpoint host or a handheld device, the CNN unit 31 can be downloaded and/or configured from the remote server on demand; when the processing device 10 runs on a server that accepts transaction requests from an endpoint host or a bank host, its CNN unit 31 can immediately and directly help judge whether the target person is lying.

The first step in analyzing the image data is that the processing device 10 inputs the image data DI, taken from the video of the target person's face and voice, into the feature extraction module 30, where the CNN unit 31 performs low-level analysis to extract multi-dimensional features. Preferably, the image data DI here is one frame of the video, and frames can be input to the CNN unit 31 one after another. The multi-dimensional features can be any number of feature points, or a feature vector produced by an encoder; the example here uses 68 main facial feature points, so each frame yields a 68x2 array of floating-point values. Preferably, the low-level analysis of the image data DI extracts meaningful data categories from the complex facial pixels, which helps the processing device 10 obtain a classification function whose correlations are easy to analyze. The multi-dimensional features are then output to the micro-expression recognition module 40 and the deep learning model 60.
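
The disclosure specifies only "a CNN unit or an encoder" for this step. As a stand-in, the sketch below uses dlib's pre-trained 68-point facial landmark predictor, which yields exactly the 68x2 floating-point array per frame described above; the model file path is an assumption.

```python
import dlib  # stand-in for the CNN unit 31; the disclosure does not name a library
import numpy as np

detector = dlib.get_frontal_face_detector()
# Pre-trained 68-point model, assumed to be downloaded separately.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")


def extract_features(frame: np.ndarray) -> np.ndarray:
    """Return the 68x2 float array of main facial feature points for one frame."""
    faces = detector(frame, 1)
    if not faces:
        return np.zeros((68, 2), dtype=np.float32)  # no face found in this frame
    shape = predictor(frame, faces[0])
    return np.array(
        [[shape.part(i).x, shape.part(i).y] for i in range(68)], dtype=np.float32
    )
```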

The micro-expression recognition module 40 detects whether predetermined micro-expression actions are present; these may include, without limitation, frowning, pursing the lips, blinking, raising the corners of the mouth, and glancing upward. The micro-expression recognition module 40 can include a heuristic filtering unit 41 together with either several graph convolutional network (GCN) units 42 or several CNN units; the GCN units 42 are taken as the example here, and their number is not limited to what the figure shows. In detail, when the heuristic filtering unit 41 receives the multi-dimensional features, it can analyze them against several preset micro-expression recognition rules and filter out the micro-expression features or micro-expression pictures that best match those rules. For example, whether an eye is open or closed can be judged from the Euclidean distance between the feature points of the upper and lower eyelids. This approach removes the need to spend time collecting a model-training dataset, and it can also be used to label a dataset quickly for training a model. The filtered micro-expression features or micro-expression pictures can then be input to the GCN units 42 or CNN units, which further identify the micro-expression actions. Preferably, the micro-expression actions form an n x 1 integer array, where n is the number of micro-expression actions. The micro-expression actions can then be output to the deep learning model 60.
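
The eye open/closed rule mentioned above can be sketched as follows, assuming the 68-point landmark convention used earlier; the landmark indices and the threshold are illustrative, since the disclosure only says the judgment uses the Euclidean distance between upper- and lower-eyelid feature points.

```python
import numpy as np

# In the common 68-point convention, indices 36-41 trace one eye:
# 36/39 are the corners, 37/38 the upper lid, 40/41 the lower lid.
EYE = [36, 37, 38, 39, 40, 41]


def eye_is_closed(landmarks: np.ndarray, threshold: float = 0.2) -> bool:
    """Heuristic blink rule: eyelid gap relative to eye width (threshold illustrative)."""
    p = landmarks[EYE]
    vertical = (np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])) / 2.0
    horizontal = np.linalg.norm(p[0] - p[3])
    return vertical / horizontal < threshold  # small aspect ratio => eye closed
```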

The voice signal DV is analyzed first by the voice emotion recognition module 50, which includes a spectrum conversion unit 51 and a CNN unit 52. The spectrum conversion unit 51 converts the received voice signal DV into spectrogram data, in this example a Mel-scale spectrogram. The spectrogram data is then analyzed by the CNN unit 52 using transfer learning to obtain an emotion feature. Specifically, the CNN unit 52 can include multiple convolutional layers, and the emotion feature is the output of the penultimate convolutional layer; in this example the emotion feature is a 32x32 two-dimensional array of floating-point values. The emotion feature is then output from the voice emotion recognition module 50 to the deep learning model 60.
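
A sketch of the spectrum conversion unit 51 and CNN unit 52, assuming librosa for the Mel-scale spectrogram and PyTorch for the CNN. The layer sizes are chosen only so that the layer before the classifier emits the 32x32 feature map described above; the actual pretrained network used for transfer learning is not disclosed.

```python
import librosa
import numpy as np
import torch
import torch.nn as nn


def to_mel_spectrogram(wav_path: str) -> torch.Tensor:
    """Spectrum conversion unit 51: voice signal -> Mel-scale spectrogram."""
    y, sr = librosa.load(wav_path, sr=16000)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Shape (1, 1, n_mels, time): one batch entry, one channel.
    return torch.tensor(mel_db, dtype=torch.float32).unsqueeze(0).unsqueeze(0)


class EmotionCNN(nn.Module):
    """Toy CNN unit 52 whose penultimate conv layer supplies the emotion feature."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(64),
            nn.Conv2d(16, 1, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(32),
        )
        # Final conv layer, used only when training the emotion classifier.
        self.classifier = nn.Conv2d(1, 8, 3, padding=1)

    def emotion_feature(self, mel: torch.Tensor) -> torch.Tensor:
        return self.backbone(mel).squeeze()  # 32x32 map, as in the description
```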

As described above, the deep learning model 60 receives the multi-dimensional features, the micro-expression actions, and the emotion feature, so it can combine the analysis results for the image data and the voice signal to judge the probability that the target person is lying comprehensively. The multi-dimensional features, micro-expression actions, and emotion feature are first integrated into one integrated feature by a feature latent space 61; in this example they are compressed and concatenated into one dimension, which from the figures above works out to 136 + n + 1024 = 1160 + n floating-point values per frame. If the original image data contains t frames, a t x (1160 + n) array is input to the deep learning model 60. The deep learning model includes the aforementioned feature latent space 61, a multi-modal recognition module 62, and a regression classification set (regression model) 63; the multi-modal recognition module 62 is connected between the feature latent space 61 and the regression classification set 63. The multi-modal recognition module 62 can include either a long short-term memory (LSTM) convolutional neural network or a recurrent neural network (RNN); its functional block diagram is shown in FIG. 4. The feature latent space 61 contains several hidden features for comparison. The multi-modal recognition module 62 judges the probability that the integrated feature matches the hidden features in the latent space 61; here this probability is represented as multi-dimensional data, for example 512-dimensional data. After the regression classification set 63 receives the output of the multi-modal recognition module 62, it performs regression classification on the probabilities that the integrated features match the hidden features, finally obtaining the probability that the target person is lying, which serves as the judgment result. This probability is represented here as one-dimensional data between 0 and 1. Note that when the deep learning model is trained in advance on sufficient training data, the accuracy of its lie judgment improves.
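
A sketch of the deep learning model 60, assuming PyTorch: concatenation plays the role of the feature latent space 61, an LSTM plays the multi-modal recognition module 62, and a sigmoid regression head plays the regression classification set 63. The micro-expression count n is an illustrative assumption; the hidden width of 512 matches the intermediate dimensionality given above.

```python
import torch
import torch.nn as nn

N_MICRO = 5  # n, the number of tracked micro-expression actions (illustrative)
PER_FRAME = 68 * 2 + N_MICRO + 32 * 32  # 136 + n + 1024 = 1160 + n values per frame


class LieDetector(nn.Module):
    """Latent-space concatenation -> LSTM multi-modal module -> regression head."""

    def __init__(self, hidden: int = 512):
        super().__init__()
        self.lstm = nn.LSTM(PER_FRAME, hidden, batch_first=True)
        self.regressor = nn.Linear(hidden, 1)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, t, 1160 + n), one concatenated feature vector per frame.
        _, (h, _) = self.lstm(frames)
        # One-dimensional output squashed to [0, 1]: the lie probability.
        return torch.sigmoid(self.regressor(h[-1])).squeeze(-1)
```

A t-frame clip thus enters as a (1, t, 1160 + n) tensor and leaves as a single probability between 0 and 1.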

Afterwards, the processing device 10 can be configured to issue an alert when, according to the judgment result, the target person is judged to be lying. Preferably, the processing device 10 can establish different alert paths for abnormal remittance requests depending on the application: when the processing device 10 runs on an endpoint host or a handheld device, the alert message is sent back to the endpoint host that issued the transaction request and/or to the corresponding bank host that executes the remittance; when the processing device 10 runs on a host, the alert message is sent back to the endpoint host or bank host that issued the transaction request.
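
The two alert paths can be sketched as a simple routing rule; the deployment labels, recipient names, and the 0.5 threshold are assumptions for illustration, since the disclosure only names the hosts involved.

```python
def route_alert(lie_probability: float, deployment: str, threshold: float = 0.5) -> list[str]:
    """Return the alert recipients for the deployment paths described above."""
    if lie_probability < threshold:
        return []  # no alert: the target person is not judged to be lying
    if deployment == "endpoint":  # processing device on an endpoint host / handheld device
        return ["endpoint_host", "bank_host"]
    return ["requesting_host"]  # server deployment: host that issued the transaction request
```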

The above describes several different embodiments of the present utility model, whose features can be implemented singly or in various combinations. The disclosed embodiments are therefore specific examples that illustrate the principles of the present utility model, and the utility model should not be limited to them. Further, the preceding description and its accompanying drawings serve only to demonstrate the utility model and do not restrict it. Variations or combinations of other elements are possible without departing from the spirit and scope of the present utility model.

Claims (10)

1. A lie detection system for identifying whether a target person is lying, comprising:
a storage device, providing a data storage space;
a connection device; and
a processing device, electrically connected to the storage device through the connection device to use the data storage space, comprising:
a feature extraction module, which performs low-level analysis on image data to extract multi-dimensional features;
a micro-expression recognition module, which performs high-level analysis on the multi-dimensional features to identify at least one micro-expression action;
a voice emotion recognition module, which generates spectrogram data from a voice signal and analyzes the spectrogram data to obtain an emotion feature; and
a deep learning model, which generates a probability that the target person is lying according to the multi-dimensional features, the at least one micro-expression action, and the emotion feature.

2. The lie detection system of claim 1, wherein the feature extraction module includes a CNN unit or an encoder, and the CNN unit or the encoder receives the image data and performs low-level analysis on it to extract the multi-dimensional features.

3. The lie detection system of claim 1, wherein the multi-dimensional features are any one of a plurality of main feature points of a facial image and feature vectors.

4. The lie detection system of claim 1, wherein the micro-expression recognition module includes a heuristic filtering unit and either a GCN unit or a CNN unit; the heuristic filtering unit receives and analyzes the multi-dimensional features to filter out the micro-expression features or micro-expression pictures that best match a plurality of micro-expression recognition rules, and the micro-expression features or micro-expression pictures are input to the GCN unit or the CNN unit, which identifies the micro-expression action.

5. The lie detection system of claim 1, wherein the micro-expression action includes frowning, pursing the lips, blinking, raising the corners of the mouth, and glancing upward.

6. The lie detection system of claim 1, wherein the voice emotion recognition module includes a spectrum conversion unit and a CNN unit; the spectrum conversion unit converts the received voice signal into the spectrogram data, which is then analyzed by the CNN unit using transfer learning to obtain the emotion feature.

7. The lie detection system of claim 6, wherein the CNN unit includes multiple convolutional layers, and the emotion feature is the output of the penultimate convolutional layer.

8. The lie detection system of claim 1, wherein:
the deep learning model receives the multi-dimensional features, the at least one micro-expression action, and the emotion feature, and includes a feature latent space, a multi-modal recognition module, and a regression classification set, the multi-modal recognition module being connected between the feature latent space and the regression classification set, the feature latent space including a plurality of hidden features and integrating the multi-dimensional features, the at least one micro-expression action, and the emotion feature into an integrated feature;
the multi-modal recognition module judges the probability that the integrated feature matches the hidden features; and
the regression classification set performs regression classification on the probability that the integrated feature matches the hidden features to obtain the probability that the target person is lying.

9. The lie detection system of claim 8, wherein the probability that the integrated feature matches the hidden features includes multi-dimensional data.

10. The lie detection system of claim 8, wherein the probability that the target person is lying is one-dimensional data between 0 and 1.
TW110215746U 2021-12-30 2021-12-30 System for detecting a target lies TWM630105U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110215746U TWM630105U (en) 2021-12-30 2021-12-30 System for detecting a target lies

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW110215746U TWM630105U (en) 2021-12-30 2021-12-30 System for detecting a target lies

Publications (1)

Publication Number Publication Date
TWM630105U true TWM630105U (en) 2022-08-01

Family

ID=83783246

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110215746U TWM630105U (en) 2021-12-30 2021-12-30 System for detecting a target lies

Country Status (1)

Country Link
TW (1) TWM630105U (en)

Similar Documents

Publication Publication Date Title
JP6619847B2 (en) Identity authentication method, terminal device, and computer-readable storage medium
CN104919396B (en) Shaken hands in head mounted display using body
US10902743B2 (en) Gesture recognition and communication
US20190188903A1 (en) Method and apparatus for providing virtual companion to a user
EP3693966B1 (en) System and method for continuous privacy-preserved audio collection
TWI643155B (en) Cognitive training system
CN112669928B (en) Structured information construction method and device, computer equipment and storage medium
WO2020019591A1 (en) Method and device used for generating information
WO2020006964A1 (en) Image detection method and device
CN107924392A (en) Annotation based on posture
CN111368811B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
WO2022188697A1 (en) Biological feature extraction method and apparatus, device, medium, and program product
CN112016367A (en) Emotion recognition system and method and electronic equipment
WO2021179719A1 (en) Face detection method, apparatus, medium, and electronic device
CN113243918A (en) Risk detection method and device based on multi-mode hidden information test
Nemati et al. Coughbuddy: Multi-modal cough event detection using earbuds platform
US11798675B2 (en) Generating and searching data structures that facilitate measurement-informed treatment recommendation
Gupta StrokeSave: a novel, high-performance mobile application for stroke diagnosis using deep learning and computer vision
CN110188602A (en) Face identification method and device in video
WO2020071086A1 (en) Information processing device, control method, and program
TWM630105U (en) System for detecting a target lies
CN116959733A (en) Medical data analysis method, device, equipment and storage medium
Mohammadi et al. Deep-RSI: Deep learning for radiographs source identification
KR102337008B1 (en) Method for sensing pain of newborn baby using convolution neural network
TW202326709A (en) Method and system for detecting a target lies