TWI822646B - Identity verification device and method based on dynamic lip image

Info

Publication number: TWI822646B
Application number: TW112124001A
Authority: TW
Inventors: 吳秉虔
Original assignee: 華南商業銀行股份有限公司
Priority date: 2019-08-07
Filing date: 2019-08-07
Publication date: 2023-11-11
Also published as: TW202345068A

Abstract

An identity verification device based on dynamic images includes a video recorder, a database and a processor. The video recorder is configured to capture a first lip dynamic image including a first voice message according to a control command, wherein the first voice message is related to a set of personal information to be verified. The database is configured to store a second lip dynamic image including a second voice message and a set of preset information. The processor is connected to the video recorder and the database. The processor is configured to generate the control command according to a request command, and the processor further determines whether a similarity between the first lip dynamic image and the second lip dynamic image is greater than a threshold. The processor generates a successful verification message when the similarity is greater than the threshold.

Description

Identity verification device and method based on lip dynamic image

The present invention relates to an identity verification device and method, and in particular to an identity verification device and method that use lip dynamic images and voice analysis.

With the rapid development of biometric technology, many industries (for example, banking) now commonly use fingerprint recognition or facial recognition to identify customers. As to the former, only newer devices support fingerprint recognition; older devices cannot provide it. As to the latter, different individuals may differ little from one another, and the same individual may fail to be recognized accurately after a change in outward appearance.

For industries that place particular weight on confirming customer identity, an error in a financial transaction caused by flawed identification, such as the theft of a customer's identity, may not only cause the customer monetary loss but also damage the goodwill of the enterprise. Such industries therefore need a user identity recognition device whose recognition method balances accuracy and convenience, so that users can carry out transactions or business safely and efficiently.

In view of this, the present invention proposes an identity verification device and method based on dynamic images. By combining voice recognition with lip dynamic image analysis, the user's identity is confirmed through double authentication, which can greatly improve both the accuracy of user identification and the convenience of use.

According to an embodiment of the present invention, an identity verification device based on dynamic images includes a video recorder, a database and a processor. The video recorder is configured to capture, according to a control command, a first lip dynamic image including a first voice message, wherein the first voice message is associated with personal information to be verified. The database is configured to store a second lip dynamic image including a second voice message, together with preset personal information. The processor is connected to the video recorder and the database, generates the control command according to a request command, and determines whether the personal information to be verified matches the preset personal information. When it does, the processor further determines whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than a verification threshold, and generates a verification success message when the similarity is greater than the verification threshold.

According to an embodiment of the present invention, an identity verification method based on dynamic images includes: generating, by a processor, a control command according to a request command associated with preset personal information; capturing, by a video recorder according to the control command, a first lip dynamic image including a first voice message, wherein the first voice message is associated with personal information to be verified; determining, by the processor, whether the personal information to be verified matches the preset personal information; when the personal information to be verified matches the preset personal information, determining, by the processor, whether a similarity between the first lip dynamic image and a second lip dynamic image in a database is greater than a verification threshold, wherein the second lip dynamic image includes a second voice message; and when the similarity is greater than the verification threshold, generating, by the processor, a verification success message.

In summary, in the dynamic-image-based identity verification device and method of the present invention, the semantic content of the voice message is first analyzed and compared with the preset personal information as a first verification. Once the first verification is passed, the captured lip dynamic image is compared for similarity with the preset lip dynamic image as a second verification, and a verification success message is generated when both verifications are passed. Confirming the user's identity through this double voice and dynamic-image authentication can significantly improve the accuracy of user identification while also making the verification process more convenient for the user.

The above description of the present disclosure and the following description of the embodiments are intended to demonstrate and explain the spirit and principles of the present invention, and to provide further explanation of the scope of the patent claims of the present invention.

The detailed features and advantages of the present invention are described in detail in the embodiments below. The content is sufficient to enable any person skilled in the relevant art to understand the technical content of the present invention and to implement it accordingly, and from the content disclosed in this specification, the claims and the drawings, any person skilled in the relevant art can readily understand the related objectives and advantages of the present invention. The following embodiments further illustrate aspects of the present invention in detail, but do not limit the scope of the present invention in any respect.

Please refer to FIG. 1, which is a functional block diagram of an identity verification device based on dynamic images according to an embodiment of the present invention. As shown in FIG. 1, the identity verification device 1 includes a video recorder 10, a database 11 and a processor 12, wherein the processor 12 is connected to the video recorder 10 and to the database 11. The video recorder 10 is configured to capture, according to a control command, a first lip dynamic image including a first voice message. The control command is generated by the processor 12 according to a request command. In practice, the request command may come from a transaction request made by a user after logging in to an online banking account on a device, or from a transaction request made by a user after inserting a card at an automated teller machine to access a bank account. The request command may be, for example, a request for a financial transaction such as a transfer, a remittance, or a deposit or withdrawal, but the present invention is not limited thereto. The database 11 stores a second lip dynamic image including a second voice message, as well as preset personal information; both are obtained in advance and stored in the database 11 as the basis for comparison. Although not shown in the figure, the database 11 resides in a memory or storage medium of the identity verification device 1. In practice, the second lip dynamic image with the second voice message and the preset personal information stored in the database 11 are set in advance according to the data of the owner of the logged-in bank account (including lip dynamic image data and text data).

The first voice message captured by the video recorder 10 is associated with the personal information to be verified, and the processor 12 can determine whether this personal information matches the preset personal information in the database 11. Specifically, the video recorder 10 transmits the captured first lip dynamic image with the first voice message to the processor 12. Through an external speech-to-text server (for example, a Google Cloud speech-to-text server) or a built-in functional chip, the processor 12 can parse the first voice message into a corresponding text message that serves as the personal information to be verified (for example, the user's ID card number A123456789).

Next, the processor 12 determines whether the personal information to be verified matches the preset personal information. If the two match, the processor 12 further determines whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold. When the similarity is greater than the verification threshold, the processor 12 generates a verification success message. In other words, in this embodiment the processor 12 first determines whether the text message in the personal information to be verified is identical to the text message in the preset personal information (for example, the ID card number A123456789). Only if so does the processor 12 go on to analyze how similar the user's current lip-shape changes are to the preset lip-shape changes, and finally determine on that basis whether verification succeeds. In practice, the processor 12 may be implemented as, for example, a processor, a microprocessor, a controller, or another chip with computing capability.
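
The two-stage order described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name `verify_identity` and the idea of passing the lip comparison as a callable are assumptions made here so that the second stage visibly runs only after the text comparison passes.

```python
from typing import Callable


def verify_identity(
    info_to_verify: str,
    preset_info: str,
    lip_similarity: Callable[[], float],  # hypothetical: computes the lip-image similarity on demand
    verification_threshold: float,
) -> bool:
    """Return True only when both verification stages pass."""
    # Stage 1: the text parsed from the first voice message must be
    # identical to the preset personal information.
    if info_to_verify != preset_info:
        return False
    # Stage 2: evaluated only after stage 1 passes; the similarity must be
    # strictly greater than the verification threshold.
    return lip_similarity() > verification_threshold


if __name__ == "__main__":
    ok = verify_identity("A123456789", "A123456789", lambda: 0.92, 0.80)
    print("verification succeeded" if ok else "verification failed")
```

Deferring the similarity computation keeps the more expensive image analysis off the path of requests that already fail the text comparison.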

In one embodiment, the first voice message corresponds to a first time sequence having a plurality of first unit times, and the second voice message corresponds to a second time sequence having a plurality of second unit times. For details, please further refer to FIG. 2, which is a schematic diagram of the first voice message and the second voice message according to an embodiment of the present invention. As shown in FIG. 2, in an embodiment of the present invention, the processor 12 takes the timing corresponding to the first syllable data d1~d5 of the first voice message DA1 as the first time sequence P1, and takes the timing corresponding to the second syllable data d1'~d5' of the second voice message DA2 as the second time sequence P2. Specifically, the first syllable data d1~d5 of the first voice message DA1 correspond respectively to the first unit times t1~t5 of the first time sequence P1, and the second syllable data d1'~d5' of the second voice message DA2 correspond respectively to the second unit times t1'~t5' of the second time sequence P2. In this embodiment, the first syllable data d1~d5 and the second syllable data d1'~d5' respectively indicate the same characters or words. For example, d1 and d1' indicate the same character or word, d2 and d2' indicate the same character or word, d3 and d3' indicate the same character or word, and so on.
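
One way to picture the correspondence between syllable data and unit times is a small data structure that pairs the N-th first unit time with the N-th second unit time. The sketch below is an assumption for illustration only: the class `UnitTime`, its fields, and the function `pair_unit_times` do not appear in the source, and how the unit-time boundaries are obtained is left open.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UnitTime:
    token: str    # the character or word indicated by the syllable data
    start: float  # start of the unit time, in seconds (hypothetical unit)
    end: float    # end of the unit time, in seconds


def pair_unit_times(
    first: List[UnitTime], second: List[UnitTime]
) -> List[Tuple[int, UnitTime, UnitTime]]:
    """Pair the N-th first unit time with the N-th second unit time (N starts at 1)."""
    if len(first) != len(second):
        raise ValueError("the two time sequences have different lengths")
    pairs = []
    for n, (a, b) in enumerate(zip(first, second), start=1):
        # Paired syllable data are expected to indicate the same character or word.
        if a.token != b.token:
            raise ValueError(f"unit time {n} indicates different tokens")
        pairs.append((n, a, b))
    return pairs


if __name__ == "__main__":
    p1 = [UnitTime("A", 0.0, 0.4), UnitTime("1", 0.4, 0.9)]
    p2 = [UnitTime("A", 0.0, 0.5), UnitTime("1", 0.5, 1.1)]
    for n, a, b in pair_unit_times(p1, p2):
        print(n, a.token, (a.start, a.end), (b.start, b.end))
```

The two sequences need not have equal durations; only the order and the indicated characters are matched here.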

In one embodiment, the processor 12 detects a first feature point of the first lip dynamic image in the N-th of the first unit times, to calculate a first distance change of the first feature point within the N-th first unit time. Likewise, the processor 12 detects a second feature point of the second lip dynamic image in the N-th of the second unit times, to calculate a second distance change of the second feature point within the N-th second unit time, where N is a positive integer. The processor 12 can then determine, from the calculated first and second distance changes, whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold.

A first practical example is used below to describe how the processor 12 determines whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold. Please further refer to FIG. 3A and FIG. 3B. FIG. 3A is a schematic diagram of the first lip dynamic image of an individual user according to an embodiment of the present invention, and FIG. 3B is a schematic diagram of the pre-stored second lip dynamic image according to an embodiment of the present invention. FIG. 3A takes individual user A as an example. When user A speaks (for example, reads out the corresponding ID card number), the video recorder 10 captures the first lip dynamic image of user A; the first lip dynamic image within the first (for example, N is 1) first unit time t1 is shown in FIG. 3A. Within this first unit time t1, the processor 12 detects that the first feature point x1 (a reference point at the center of the upper lip) moves from position L1 to position L2, and accordingly calculates the first distance change D1.

Meanwhile, the preset second lip dynamic image is stored in the database 11. As shown in FIG. 3B, within the first (for example, N is 1) second unit time t2, the processor 12 detects that the second feature point x2 of the second lip dynamic image moves from position L1' to position L2', and accordingly calculates the second distance change D2. The processor 12 then calculates the difference between the first distance change D1 and the second distance change D2 and compares it with a preset value. In this embodiment, when the difference is less than the preset value, the processor 12 determines that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold; the processor 12 then accepts the identity of user A and generates a verification success message. Conversely, when the difference is greater than the preset value, the processor 12 determines that the similarity between the first lip dynamic image and the second lip dynamic image is less than the verification threshold; the processor 12 then deems the identity of user A doubtful and generates a verification failure message.
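
The single-feature-point comparison can be illustrated with a short sketch. Several things here are assumptions rather than statements from the source: positions are modeled as 2-D pixel coordinates, the distance change is taken to be the Euclidean distance between the start and end positions, the difference is taken as an absolute value, and a difference exactly equal to the preset value (a case the text does not address) is treated as a failure.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def distance_change(start: Point, end: Point) -> float:
    """Distance moved by one feature point within one unit time (assumed Euclidean)."""
    return math.dist(start, end)


def single_point_match(
    first_move: Tuple[Point, Point],   # (L1, L2) of the first feature point
    second_move: Tuple[Point, Point],  # (L1', L2') of the second feature point
    preset_value: float,
) -> bool:
    d1 = distance_change(*first_move)   # first distance change D1
    d2 = distance_change(*second_move)  # second distance change D2
    # Pass only when the difference is strictly less than the preset value.
    return abs(d1 - d2) < preset_value


if __name__ == "__main__":
    captured = ((0.0, 0.0), (3.0, 4.0))  # moves 5.0 units
    enrolled = ((1.0, 1.0), (4.0, 5.0))  # also moves 5.0 units
    print(single_point_match(captured, enrolled, preset_value=0.5))  # True
```

Because only the magnitude of the movement is compared, the two recordings do not need to share the same absolute lip position in the frame.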

The foregoing embodiment analyzes only a single feature point of the lip dynamic image within one unit time. For certain words or characters with particular pronunciations, however, a single feature point may not yield a usable distance change, so that the processor 12 cannot accurately determine the similarity of the lip dynamic images. To address this, the identity verification device of the present invention can analyze a plurality of feature points of the lip dynamic image within one unit time. Specifically, in one embodiment, the processor 12 may detect a plurality of first feature points of the first lip dynamic image in the N-th of the first unit times, to calculate a plurality of first distance changes of the first feature points within the N-th first unit time, and may detect a plurality of second feature points of the second lip dynamic image in the N-th of the second unit times, to calculate a plurality of second distance changes of the second feature points within the N-th second unit time, where N is a positive integer. The processor 12 then determines, from the first distance changes and the second distance changes, whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold.

A second practical example is used below to describe how the processor 12 determines whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold. Please further refer to FIG. 4A and FIG. 4B. FIG. 4A is a schematic diagram of the first lip dynamic image of an individual user according to another embodiment of the present invention, and FIG. 4B is a schematic diagram of the pre-stored second lip dynamic image according to another embodiment of the present invention. FIG. 4A takes individual user B as an example. When user B speaks (for example, reads out the corresponding ID card number), the video recorder 10 captures the first lip dynamic image of user B; the first lip dynamic image within the first (for example, N is 1) first unit time t1 is shown in FIG. 4A. Within this first unit time t1, the processor 12 detects the first feature points x1 and x2 (reference points at the center and at the side of the upper lip serve as the feature points x1 and x2, respectively), where the first feature point x1 moves from position L1 to position L2 and the first feature point x2 moves from position L3 to position L4, and accordingly calculates two first distance changes D1 and D2.

Meanwhile, the preset second lip dynamic image is stored in the database 11. As shown in FIG. 4B, within the first (for example, N is 1) second unit time t2, the processor 12 detects the second feature points x3 and x4 of the second lip dynamic image, where the second feature point x3 moves from position L1' to position L2' and the second feature point x4 moves from position L3' to position L4'. The processor 12 accordingly calculates the second distance changes D3 and D4. The processor 12 then calculates a first difference between the first distance change D1 and the second distance change D3 and compares it with a first preset value, and calculates a second difference between the first distance change D2 and the second distance change D4 and compares it with a second preset value.

In this embodiment, when the first and second differences are less than the first and second preset values, respectively, the processor 12 determines that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold; the processor 12 then accepts the identity of user B and generates a verification success message. Conversely, when the first difference is greater than the first preset value or the second difference is greater than the second preset value, the processor 12 determines that the similarity between the first lip dynamic image and the second lip dynamic image is less than the verification threshold; the processor 12 then deems the identity of user B doubtful and generates a verification failure message.
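
The multi-point rule (every difference must stay under its own preset value) can be sketched as below. As before, this is a hedged illustration: the function name `multi_point_match`, the 2-D coordinates, the Euclidean distance, and the treatment of a difference equal to its preset value as a failure are all assumptions not stated in the source.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Move = Tuple[Point, Point]  # (start position, end position) within one unit time


def multi_point_match(
    first_moves: Sequence[Move],     # e.g. x1: L1 -> L2, x2: L3 -> L4
    second_moves: Sequence[Move],    # e.g. x3: L1' -> L2', x4: L3' -> L4'
    preset_values: Sequence[float],  # one preset value per feature-point pair
) -> bool:
    if not (len(first_moves) == len(second_moves) == len(preset_values)):
        raise ValueError("each feature-point pair needs exactly one preset value")
    for (s1, e1), (s2, e2), limit in zip(first_moves, second_moves, preset_values):
        difference = abs(math.dist(s1, e1) - math.dist(s2, e2))
        # A single pair at or over its preset value fails the whole comparison.
        if difference >= limit:
            return False
    return True


if __name__ == "__main__":
    captured = [((0.0, 0.0), (0.0, 2.0)), ((5.0, 0.0), (6.0, 0.0))]  # D1 = 2.0, D2 = 1.0
    enrolled = [((0.0, 1.0), (0.0, 3.1)), ((5.0, 1.0), (6.2, 1.0))]  # D3 = 2.1, D4 = 1.2
    print(multi_point_match(captured, enrolled, [0.5, 0.5]))  # True
```

Giving each feature point its own preset value allows, for instance, a tighter tolerance on a point that moves very little during speech.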

The two foregoing embodiments analyze only one or more feature points of the lip dynamic image within a single unit time. The identity verification device of the present invention can, however, also analyze how the position of a feature point of the lip dynamic image changes over consecutive unit times. Specifically, in one embodiment, the processor 12 sequentially detects a first feature point of the first lip dynamic image in each first unit time, to calculate a first movement trajectory of the first feature point within the first time sequence, and sequentially detects a second feature point of the second lip dynamic image in each second unit time, to calculate a second movement trajectory of the second feature point within the second time sequence. The processor 12 determines, from the first movement trajectory and the second movement trajectory, whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold.

A third practical example is used below to describe how the processor 12 determines whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold. Please further refer to FIG. 5A and FIG. 5B. FIG. 5A is a schematic diagram of the first lip dynamic image of an individual user according to another embodiment of the present invention, and FIG. 5B is a schematic diagram of the pre-stored second lip dynamic image according to another embodiment of the present invention. FIG. 5A takes individual user C as an example. For ease of explanation, this embodiment uses three unit times t1~t3 as the first time sequence and three unit times t1'~t3' as the second time sequence. When user C speaks (for example, reads out the corresponding ID card number), the video recorder 10 captures the first lip dynamic image of user C. The changes of this first lip dynamic image over the consecutive first unit times t1~t3 are shown in FIG. 5A. The processor 12 sequentially detects the first feature point of the first lip dynamic image in each of the first unit times t1~t3, and calculates the first movement trajectory R1 of the first feature point within the first time sequence (that is, the consecutive first unit times t1~t3).

Meanwhile, the second lip dynamic image is stored in the database 11. As shown in FIG. 5B, the processor 12 sequentially detects the second feature point of the second lip dynamic image in each of the second unit times t1'~t3', and calculates the second movement trajectory R2 of the second feature point within the second time sequence (that is, the consecutive second unit times t1'~t3'). The processor 12 then computes the similarity between the first movement trajectory R1 and the second movement trajectory R2, for example with a curve similarity algorithm, and compares it with a preset value. In this embodiment, when the similarity of the movement trajectories reaches the preset value, the processor 12 determines that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold, and therefore accepts the identity of user C and generates a verification success message. Conversely, when the similarity of the movement trajectories does not reach the preset value, the processor 12 determines that the similarity between the first lip dynamic image and the second lip dynamic image is less than the verification threshold. The processor 12 then deems the identity of user C doubtful and generates a verification failure message.
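
The source does not name the curve similarity algorithm. Purely as an assumed stand-in, the sketch below uses the discrete Fréchet distance between the two trajectories and maps it to a similarity score with 1 / (1 + distance); both that choice and the mapping are illustrative, as are the function names.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def discrete_frechet(p: Sequence[Point], q: Sequence[Point]) -> float:
    """Discrete Fréchet distance between two polylines (dynamic programming)."""
    if not p or not q:
        raise ValueError("trajectories must not be empty")
    n, m = len(p), len(q)
    ca: List[List[float]] = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


def trajectory_similarity(r1: Sequence[Point], r2: Sequence[Point]) -> float:
    """Map the distance to (0, 1]; identical trajectories score 1.0 (assumed mapping)."""
    return 1.0 / (1.0 + discrete_frechet(r1, r2))


def trajectories_match(r1: Sequence[Point], r2: Sequence[Point], preset_value: float) -> bool:
    # "Reaches the preset value" is read here as greater than or equal to.
    return trajectory_similarity(r1, r2) >= preset_value


if __name__ == "__main__":
    r1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # feature point positions over t1~t3
    r2 = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # feature point positions over t1'~t3'
    print(trajectory_similarity(r1, r2))  # 0.5
```

In practice the trajectories would likely be normalized first (for example, translated to a common origin) so that a shifted but otherwise identical movement is not penalized; that step is omitted here.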

Please refer to FIG. 6, which is a flowchart of an identity verification method based on dynamic images according to an embodiment of the present invention. As shown in FIG. 6, in step S1, the processor 12 generates a control command according to a request command. In step S2, the video recorder 10 captures, according to the control command, a first lip dynamic image including a first voice message, wherein the first voice message is associated with personal information to be verified. In step S3, the processor 12 determines whether the personal information to be verified matches the preset personal information in the database.

When the personal information to be verified matches the preset personal information, in step S4 the processor 12 determines whether the similarity between the first lip dynamic image and the second lip dynamic image in the database 11 is greater than the verification threshold, wherein the second lip dynamic image includes a second voice message. When the similarity is greater than the verification threshold, in step S5 the processor 12 generates a verification success message.

In one embodiment, the first voice message corresponds to a first time sequence having a plurality of first unit times, and the second voice message corresponds to a second time sequence having a plurality of second unit times, wherein determining whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold includes: detecting, by the processor 12, a first feature point of the first lip dynamic image in the N-th of the first unit times, to calculate a first distance change of the first feature point within the N-th first unit time; detecting, by the processor 12, a second feature point of the second lip dynamic image in the N-th of the second unit times, to calculate a second distance change of the second feature point within the N-th second unit time, where N is a positive integer; and calculating, by the processor 12, the difference between the first distance change and the second distance change.

In one embodiment, the first voice message corresponds to a first time sequence having a plurality of first unit times, and the second voice message corresponds to a second time sequence having a plurality of second unit times. Determining whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold includes: using the operator 12 to detect a plurality of first feature points of the first lip dynamic image at the Nth of the first unit times, so as to calculate a plurality of first distance change amounts of the first feature points within the Nth first unit time; using the operator 12 to detect a plurality of second feature points of the second lip dynamic image at the Nth of the second unit times, so as to calculate a plurality of second distance change amounts of the second feature points within the Nth second unit time, where N is a positive integer; and using the operator 12 to calculate, individually, the differences between the first distance change amounts and the corresponding second distance change amounts.
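
The multi-point variant can be sketched in the same way. The function name `all_points_match` is hypothetical, and the inputs are assumed to be already-computed distance change amounts, one per feature point (such as D1 to D4 for feature points x1 to x4). Comparing each difference against its own preset value follows the wording of the dependent claims.

```python
from typing import Sequence


def all_points_match(
    first_changes: Sequence[float],
    second_changes: Sequence[float],
    preset_values: Sequence[float],
) -> bool:
    """Check every feature point's distance change against its own preset value."""
    if not (len(first_changes) == len(second_changes) == len(preset_values)):
        raise ValueError("each feature point needs two changes and one preset value")
    # Every per-point difference must be smaller than the matching preset value.
    return all(
        abs(first - second) < preset
        for first, second, preset in zip(first_changes, second_changes, preset_values)
    )
```

Requiring every point to pass is the strictest reading; a deployment could in principle tolerate a limited number of failing points, but the source does not describe such a relaxation.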

In one embodiment, the first voice message corresponds to a first time sequence having a plurality of first unit times, and the second voice message corresponds to a second time sequence having a plurality of second unit times. Determining whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold includes: using the operator 12 to sequentially detect the first feature point of the first lip dynamic image in each first unit time, so as to calculate a first movement trajectory of the first feature point within the first time sequence; using the operator 12 to sequentially detect the second feature point of the second lip dynamic image in each second unit time, so as to calculate a second movement trajectory of the second feature point within the second time sequence; and using the operator 12 to compare the first movement trajectory with the second movement trajectory.
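
The text does not state how the two movement trajectories are compared. One possible approach, shown here purely as an assumption, is to represent each trajectory as one position per unit time, shift both so that they start at the origin, and take the mean point-to-point distance. The names `shift_to_origin` and `trajectory_gap` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def shift_to_origin(trajectory: Sequence[Point]) -> List[Point]:
    """Translate a trajectory so that its first position is (0, 0)."""
    x0, y0 = trajectory[0]
    return [(x - x0, y - y0) for x, y in trajectory]


def trajectory_gap(first: Sequence[Point], second: Sequence[Point]) -> float:
    """Mean distance between corresponding positions of two trajectories."""
    if not first or len(first) != len(second):
        raise ValueError("trajectories must be non-empty and of equal length")
    first_shifted = shift_to_origin(first)
    second_shifted = shift_to_origin(second)
    total = sum(math.dist(p, q) for p, q in zip(first_shifted, second_shifted))
    return total / len(first_shifted)
```

A smaller gap indicates more similar trajectories, so a caller would compare the result against a chosen limit. Shifting to a common origin makes the measure insensitive to where the lips appear in the frame, but not to scale or speaking speed.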

Although the present invention is disclosed in the foregoing embodiments, they are not intended to limit the present invention. All changes and modifications made without departing from the spirit and scope of the present invention fall within the scope of patent protection of the present invention. For the scope of protection defined by the present invention, please refer to the appended claims.

1: identity verification device
10: video recorder
11: database
12: operator
DA1: first voice message
DA2: second voice message
d1~d5: first syllable data
d1'~d5': second syllable data
t1~t5, t1'~t5': unit time
P1: first time sequence
P2: second time sequence
L1~L4, L1'~L4': position
x1~x4: feature point
D1~D4: distance change amount
R1: first movement trajectory
R2: second movement trajectory

FIG. 1 is a functional block diagram of an identity verification device based on dynamic images according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a first voice message and a second voice message according to an embodiment of the present invention.
FIG. 3A is a schematic diagram of a first lip dynamic image of an individual user according to an embodiment of the present invention.
FIG. 3B is a schematic diagram of a pre-stored second lip dynamic image according to an embodiment of the present invention.
FIG. 4A is a schematic diagram of a first lip dynamic image of an individual user according to another embodiment of the present invention.
FIG. 4B is a schematic diagram of a pre-stored second lip dynamic image according to another embodiment of the present invention.
FIG. 5A is a schematic diagram of a first lip dynamic image of an individual user according to another embodiment of the present invention.
FIG. 5B is a schematic diagram of a pre-stored second lip dynamic image according to another embodiment of the present invention.
FIG. 6 is a flow chart of an identity verification method based on dynamic images according to an embodiment of the present invention.


Claims

An identity verification device based on lip dynamic images, comprising: a video recorder configured to capture, according to a control instruction, a first lip dynamic image including a first voice message, wherein the first voice message is associated with personal information to be verified; a database configured to store a second lip dynamic image including a second voice message, and preset personal information; and an operator connected to the video recorder and the database, the operator being configured to generate the control instruction according to a request instruction and to determine whether the personal information to be verified matches the preset personal information, wherein when the personal information to be verified matches the preset personal information, the operator further determines whether a similarity between the first lip dynamic image and the second lip dynamic image is greater than a verification threshold, and generates a verification success message when the similarity is greater than the verification threshold; wherein the first voice message corresponds to a first time sequence having a plurality of first unit times and the second voice message corresponds to a second time sequence having a plurality of second unit times; the operator is configured to detect, at the Nth of the first unit times, the center of the upper lip in the first lip dynamic image as a first feature point, so as to calculate a first distance change amount of the first feature point within the Nth first unit time, and to detect, at the Nth of the second unit times, the center of the upper lip in the second lip dynamic image as a second feature point, so as to calculate a second distance change amount of the second feature point within the Nth second unit time, where N is a positive integer; the operator calculates a difference between the first distance change amount and the second distance change amount, and determines that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold when the difference is less than a preset value; and wherein the operator is configured to capture the time sequence corresponding to first syllable data of the first voice message as the first time sequence, and to capture the time sequence corresponding to second syllable data of the second voice message as the second time sequence, the first syllable data and the second syllable data indicating the same single character or word.

The identity verification device based on lip dynamic images as described in claim 1, wherein the operator is further configured to detect, at the Nth of the first unit times, a plurality of first feature points of the first lip dynamic image, so as to calculate a plurality of first distance change amounts of the first feature points within the Nth first unit time; the operator is further configured to detect, at the Nth of the second unit times, a plurality of second feature points of the second lip dynamic image, so as to calculate a plurality of second distance change amounts of the second feature points within the Nth second unit time, where N is a positive integer; and the operator determines, according to the first distance change amounts and the second distance change amounts, whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold.

The identity verification device based on lip dynamic images as described in claim 2, wherein the operator calculates a plurality of first differences between the first distance change amounts and the second distance change amounts respectively, and compares the first differences with a plurality of first preset values respectively; when the first differences are respectively smaller than the first preset values, the operator determines that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold.

The identity verification device based on lip dynamic images as described in claim 1, wherein the operator is further configured to sequentially detect the first feature point of the first lip dynamic image in each first unit time, so as to calculate a first movement trajectory of the first feature point within the first time sequence; the operator is further configured to sequentially detect the second feature point of the second lip dynamic image in each second unit time, so as to calculate a second movement trajectory of the second feature point within the second time sequence; and the operator determines, according to the first movement trajectory and the second movement trajectory, whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold.

An identity verification method based on lip dynamic images, comprising: using an operator to generate a control instruction according to a request instruction; using a video recorder to capture, according to the control instruction, a first lip dynamic image including a first voice message, wherein the first voice message is associated with personal information to be verified; using the operator to determine whether the personal information to be verified matches preset personal information in a database; when the personal information to be verified matches the preset personal information, using the operator to determine whether a similarity between the first lip dynamic image and a second lip dynamic image in the database is greater than a verification threshold, wherein the second lip dynamic image includes a second voice message; and when the similarity is greater than the verification threshold, using the operator to generate a verification success message; wherein the first voice message corresponds to a first time sequence having a plurality of first unit times and the second voice message corresponds to a second time sequence having a plurality of second unit times, and determining whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold comprises: using the operator to detect a first feature point of the first lip dynamic image at the Nth of the first unit times, so as to calculate a first distance change amount of the first feature point within the Nth first unit time; using the operator to detect a second feature point of the second lip dynamic image at the Nth of the second unit times, so as to calculate a second distance change amount of the second feature point within the Nth second unit time, where N is a positive integer; using the operator to calculate a difference between the first distance change amount and the second distance change amount; and using the operator to determine, when the difference is less than a preset value, that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold; wherein the identity verification method further comprises using the operator to capture the time sequence corresponding to first syllable data of the first voice message as the first time sequence, and using the operator to capture the time sequence corresponding to second syllable data of the second voice message as the second time sequence, wherein the first syllable data and the second syllable data indicate the same single character or word.

The identity verification method based on lip dynamic images as described in claim 5, wherein determining whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold further comprises: using the operator to detect a plurality of first feature points of the first lip dynamic image at the Nth of the first unit times, so as to calculate a plurality of first distance change amounts of the first feature points within the Nth first unit time; using the operator to detect a plurality of second feature points of the second lip dynamic image at the Nth of the second unit times, so as to calculate a plurality of second distance change amounts of the second feature points within the Nth second unit time, where N is a positive integer; and using the operator to calculate a plurality of first differences between the first distance change amounts and the second distance change amounts respectively.

The identity verification method based on lip dynamic images as described in claim 6, wherein using the operator to determine, when the difference is less than the preset value, that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold comprises: comparing the first differences with a plurality of first preset values respectively, wherein the operator determines that the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold when the first differences are respectively smaller than the first preset values.

The identity verification method based on lip dynamic images as described in claim 5, wherein determining whether the similarity between the first lip dynamic image and the second lip dynamic image is greater than the verification threshold further comprises: using the operator to sequentially detect the first feature point of the first lip dynamic image in each first unit time, so as to calculate a first movement trajectory of the first feature point within the first time sequence; using the operator to sequentially detect the second feature point of the second lip dynamic image in each second unit time, so as to calculate a second movement trajectory of the second feature point within the second time sequence; and using the operator to compare the first movement trajectory with the second movement trajectory.