TWI734297B - Multi-task object recognition system sharing multi-range features - Google Patents
Abstract
Description
The present invention relates to object recognition technology, and in particular to a multi-task object recognition system that shares multi-range features.
With the rapid development of deep learning, most image recognition tasks today can be accomplished by training neural network models. Practical applications, however, often require several recognition tasks to be integrated. If each recognition task is handled by its own model, the computational complexity grows with the number of tasks. Take face recognition as an example: it is often deployed where access is strictly controlled, so a face recognition task must work together with a fake-face recognition task. If the two tasks use two independent models, the computational complexity doubles.
Recently, related techniques have tried to address the computational cost that multiplies as tasks are added by having a single model handle the different tasks. However, each image recognition task requires a different region range of the object, so forcibly merging the tasks lowers the recognition accuracy of each task. For example, although face recognition and fake-face recognition both take a face image as input, face recognition depends on the fine details of the facial features and needs a smaller image range, whereas fake-face recognition considers the background surrounding the face and needs a larger image range. As a result, it is difficult to choose a single input image range that suits both.
Therefore, how to provide a novel or innovative object (image) recognition technology that solves the problem of integrating multiple recognition tasks has become a major research topic for those skilled in the art.
The present invention provides a novel or innovative multi-task object recognition system sharing multi-range features. For example, it can solve the problem that multiple independent models are difficult to integrate when several tasks are applied to the same object; or it can provide shared network layers that effectively reduce repeated or unnecessary network layers; or it can let multiple recognition tasks share the features extracted from multi-range images to improve recognition accuracy.
The multi-task object recognition system sharing multi-range features of the present invention includes: a multi-range generation module that generates or provides multi-range information of an image of an object; a multi-channel image merging module that, according to the multi-range information generated or provided by the multi-range generation module, samples multiple images of different region ranges from the object and merges the multiple images into one multi-channel image; a shared feature extraction module having shared network layers, which uses a neural network to extract shared features from the multi-channel image merged by the multi-channel image merging module; and a task-specific feature extraction module group having one or more different task-specific feature extraction modules, which use neural networks to extract one or more different task-specific features from the shared features extracted by the shared feature extraction module, so that one or more task-specific models of the task-specific feature extraction module group output recognition results for the image of the object according to the one or more different task-specific features.
To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings. Additional features and advantages of the invention are set forth in part in the following description, and in part will be apparent from the description or may be learned by practice of the invention. The features and advantages of the invention are realized and attained by means of the elements and combinations particularly pointed out in the claims. It should be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not intended to limit the claimed scope of the invention.
1‧‧‧Multi-task object recognition system sharing multi-range features
10‧‧‧Image capture module
20‧‧‧Object detection module
30‧‧‧Multi-range generation module
40‧‧‧Multi-channel image merging module
50‧‧‧Shared feature extraction module
51‧‧‧Shared network layers
52‧‧‧Shared feature map
60‧‧‧Task-specific feature extraction module group
61‧‧‧Task-specific feature extraction module
62‧‧‧Task-specific model
70‧‧‧Cross-task application service module
A‧‧‧Image or image frame
B‧‧‧Position information
C‧‧‧Multi-channel image
C1 to Cn‧‧‧Images
F‧‧‧Feature map
L0‧‧‧0th shared network layer
L1‧‧‧1st shared network layer
Lk‧‧‧k-th shared network layer
m, n‧‧‧Positive integers
Figure 1 is a schematic diagram of the architecture of the multi-task object recognition system sharing multi-range features of the present invention;
Figure 2 is a schematic diagram of an embodiment of the multi-task object recognition system sharing multi-range features of the present invention;
Figure 3 is a schematic diagram of an embodiment of the inputs of the image capture module, the object detection module, the multi-range generation module, and the multi-channel image merging module in Figures 1 and 2 of the present invention; and
Figure 4 is a schematic diagram of an embodiment of the shared network layers of the shared feature extraction module in Figure 2 of the present invention.
The following describes implementations of the present invention by way of specific embodiments. Those skilled in the art can understand other advantages and effects of the invention from the content disclosed in this specification, and the invention can also be implemented or applied through other different, equivalent embodiments.
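Figure 1 is a schematic diagram of the architecture of the multi-task object recognition system 1 sharing multi-range features of the present invention, and Figure 2 is a schematic diagram of an embodiment of the multi-task object recognition system 1 sharing multi-range features of the present invention.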
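As shown in Figures 1 and 2, the multi-task object recognition system 1 sharing multi-range features may include an image capture module 10, an object detection module 20, a multi-range generation module 30, a multi-channel image merging module 40, a shared feature extraction module 50, a task-specific feature extraction module group 60, and a cross-task application service module 70, all of which communicate with one another. The shared feature extraction module 50 may have shared network layers 51 (such as the 0th shared network layer L0 through the k-th shared network layer Lk), and the task-specific feature extraction module group 60 may have one or more different task-specific feature extraction modules 61.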
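The image capture module 10 may capture an image or image frame. The object detection module 20 may detect position information of an object in the image or image frame captured by the image capture module 10. The multi-range generation module 30 may use the position information of the object detected by the object detection module 20 to generate or provide multi-range information of the image of the object. The multi-channel image merging module 40 may, according to the multi-range information generated or provided by the multi-range generation module 30, sample multiple images of different region ranges from the object and merge the multiple images into one multi-channel image. The shared feature extraction module 50 may use a neural network to extract shared features from the multi-channel image merged by the multi-channel image merging module 40. The one or more different task-specific feature extraction modules 61 of the task-specific feature extraction module group 60 may use neural networks to extract one or more different task-specific features from the shared features extracted by the shared feature extraction module 50 having the shared network layers 51, so that one or more task-specific models 62 of the task-specific feature extraction module group 60 output recognition results for the image of the object according to the one or more different task-specific features.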
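For example, two different task-specific models 62 may be a face recognition model and a fake-face recognition model, applied together as an anti-spoofing face recognition access control system, and the cross-task application service module 70 may use the one or more task-specific features extracted by the one or more different task-specific feature extraction modules 61 of the task-specific feature extraction module group 60 to provide, execute, or accomplish cross-task application services. That is, each task in the task-specific feature extraction module group 60 corresponds to one task-specific model 62, and each task-specific model 62 is composed of a neural network and extracts the features it needs from the shared feature map 52 in order to determine the characteristics of the input image. Taking faces as an example, suppose there are two task-specific models 62, a face recognition model and a gender recognition model. After extracting the required features from the shared feature map 52, the two task-specific models 62 can output recognition results for the image of the object. For instance, after an image of the President of the United States is processed by the image capture module 10 through the task-specific feature extraction module group 60, the two task-specific models 62 output two pieces of information, "Trump" and "male", to the cross-task application service module 70.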
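The image capture module 10 may be hardware such as a video camera, a still camera, a surveillance monitor, or a sensor. The image captured by the image capture module 10 may be a two-dimensional image or a three-dimensional image, such as a depth image or an infrared image, and a two-dimensional image may be single-channel or multi-channel.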
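The object detection module 20 may be a hardware object detector or a software object detection program, and it may use different object detectors for different applications. For example, face analysis or recognition applications may use an object detector or object detection program based on MTCNN (Multi-task Cascaded Convolutional Networks), while vehicle detection or recognition applications may use one based on YOLO (You Only Look Once).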
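The multi-range generation module 30 may be a software multi-range generation program, and the multi-channel image merging module 40 may be a software multi-channel image merging program. The way the ranges are expanded and the number of ranges used by the multi-range generation module 30 or the multi-channel image merging module 40 may vary with the application. If the emphasis is on the detail and texture features of the object, the multi-range generation module 30 or the multi-channel image merging module 40 may rely mainly on multiple reduced-range images; conversely, if the emphasis is on the accessories or background around the object, it may rely mainly on multiple expanded-range images (that is, add more expanded-range images). In addition, if the emphasis is on the model size or performance of the shared network layers 51, the multi-range generation module 30 or the multi-channel image merging module 40 may stack fewer images of different ranges; conversely, if the emphasis is on the generality of the shared network layers 51, it may stack more images of different ranges so that more features of the image can be captured.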
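The shared feature extraction module 50 may be a software shared feature extraction program, and the type and number of its shared network layers 51 (neural network layers) may be adjusted for different applications. For example, in some applications three-dimensional (3D) convolution may work better in the shared network layers 51, and when the task-specific feature extraction modules 61 are quite similar to one another, the number of shared network layers 51 can be increased.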
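The task-specific feature extraction modules 61 of the task-specific feature extraction module group 60 may be software task-specific feature extraction programs, and the cross-task application service module 70 may be a software cross-task application service program. The task-specific feature extraction modules 61 need not have the same number of network layers, and there is no limit on that number; each task-specific feature extraction module 61 may use different types and numbers of network layers according to the characteristics of its task.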
The aforementioned neural network may be, for example, a convolutional neural network (CNN), a recurrent neural network (RNN), a deep neural network (DNN), or a long short-term memory (LSTM) neural network, but is not limited to these.
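For example, the multi-range generation module 30 may capture multiple reduced or expanded image ranges, and the multi-channel image merging module 40 may stack the multiple images captured by the multi-range generation module 30 along the channel dimension to form a single input, so that the features contained in the single input can serve multiple recognition tasks. Taking face analysis or recognition as an example, an expanded-range face image can cover the hair and ears, which helps gender determination and face recognition applications. Moreover, the reduced-range and expanded-range face images help capture the texture and the background features of the face image, respectively, which helps liveness (for example, live human body) recognition applications.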
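The shared feature extraction module 50 may have shared network layers 51, which effectively reduce repeated or unnecessary network layers so as to reduce the model size and increase execution speed. In deployment, the task-specific models 62 of the task-specific feature extraction module group 60 can also be swapped according to different requirements to increase flexibility. As a concrete example of swapping, suppose a product is designed with face recognition, gender recognition, and age recognition, but in a certain application scenario the customer does not need age information. The task-specific model 62 for age recognition can then be removed while the task-specific models 62 for face recognition and gender recognition are kept, reducing the overall model size and amount of computation. The present invention therefore lets multiple recognition tasks share the shared network layers 51, which reduces the overall computational complexity, and lets multiple recognition tasks share the features extracted from multi-range images, which improves recognition accuracy.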
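The swapping of task-specific models described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `shared_backbone`, `build_heads`, and `recognize` are hypothetical names, a feature map is reduced to a plain list of numbers, and the heads are toy placeholders standing in for trained neural networks.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a "feature map" is just a list of floats here.
FeatureMap = List[float]
TaskHead = Callable[[FeatureMap], str]


def shared_backbone(image: List[float]) -> FeatureMap:
    """Placeholder for the shared network layers 51; runs once per input."""
    return [2.0 * value for value in image]


def build_heads() -> Dict[str, TaskHead]:
    """Toy placeholders for task-specific models 62, keyed by task name."""
    return {
        "face": lambda f: "id-%d" % int(sum(f)),
        "gender": lambda f: "male" if f[0] > 0 else "female",
        "age": lambda f: str(int(abs(f[-1]))),
    }


def recognize(image: List[float], heads: Dict[str, TaskHead]) -> Dict[str, str]:
    """Compute the shared features once, then run every remaining head on them."""
    features = shared_backbone(image)
    return {name: head(features) for name, head in heads.items()}


heads = build_heads()
heads.pop("age")  # the customer does not need age information
result = recognize([1.0, 2.0, 3.0], heads)
```

Because the heads only consume the shared features, removing one head leaves the backbone and the other heads untouched, which is the flexibility the paragraph above describes.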
The present invention can provide techniques such as multi-range image stacking, cross-task feature sharing, and a swappable feature extraction group. With these techniques, applications involving the same object but multiple recognition tasks are easy to integrate into a single model, which effectively reduces the model size and prediction time and also increases application flexibility. That is, the present invention can use multi-range feature sharing to integrate applications involving the same object but multiple recognition tasks into a single model through the shared network layers 51 of the shared feature extraction module 50, effectively reducing the model size and prediction time.
The present invention can solve the problems of oversized models and poor performance. With multiple independent models, the image range required by each recognition task may differ slightly, so even though the object to be recognized is the same, multiple images of different ranges (such as n images) must be captured and fed separately into multiple (such as n) independent models, which lowers the overall recognition speed. The present invention therefore uses multi-range (multi-channel) stacking and the shared network layers 51 to integrate the features the models need into a single model, so that the model size is reduced.
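The shared feature extraction module 50 serves to reduce the amount of computation and increase prediction speed; the underlying idea is that sharing is what reduces computation and thereby speeds up prediction. For example, suppose independent model X has a computation cost of 50 and independent model Y has a computation cost of 50. If the two models predict separately, their total computation cost is 50 + 50 = 100. If sharing is introduced, the part that model X and model Y can share is split out and called Z, with a computation cost of 20, and the remaining non-shareable parts of X and Y are called X' and Y' respectively. After sharing, the total computation cost becomes Z + X' + Y' = 20 + (50 - 20) + (50 - 20) = 80, which saves 20 compared with the original cost of 100.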
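The arithmetic above generalizes to any number of models that share one common part. The following is a small sketch; `shared_cost` is a hypothetical helper name, and the cost units are the abstract units used in the example.

```python
from typing import Sequence


def shared_cost(independent_costs: Sequence[float], shared_part: float) -> float:
    """Total cost when a common part of cost `shared_part` is computed only once.

    Each independent model is assumed to contain the shared part, so its
    remaining task-specific cost is (cost - shared_part).
    """
    if any(cost < shared_part for cost in independent_costs):
        raise ValueError("shared part cannot cost more than a whole model")
    return shared_part + sum(cost - shared_part for cost in independent_costs)


costs = [50, 50]                    # independent models X and Y
separate = sum(costs)               # 50 + 50
combined = shared_cost(costs, 20)   # Z + X' + Y'
saving = separate - combined
```

With N models sharing a part of cost Z, the saving is (N - 1) * Z, so the benefit grows with both the number of tasks and the size of the shared part.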
The present invention can offer a high degree of application flexibility. That is, even after the features the models need are integrated into a single model, the invention retains the flexibility of independent models: different task-specific feature extraction modules 61 can be swapped in for different applications, and different combinations of tasks can be customized for different applications.
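Figure 3 is a schematic diagram of an embodiment of the inputs of the image capture module 10, the object detection module 20, the multi-range generation module 30, and the multi-channel image merging module 40 in Figures 1 and 2 of the present invention.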
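As shown in Figure 3 and in Figures 1 and 2 above, the key of the present invention lies in generating the inputs of the image capture module 10, the object detection module 20, the multi-range generation module 30, and the multi-channel image merging module 40, and in providing the shared network layers 51 of the shared feature extraction module 50 (such as the 0th shared network layer L0 through the k-th shared network layer Lk), at least two task-specific models 62 of the task-specific feature extraction module group 60, and the cross-task application service module 70 to form the overall architecture of the system. Note that "0th shared network layer L0" means the shared network layers are counted from 0; if they are counted from 1, the 0th shared network layer L0 would instead be called the 1st shared network layer, and so on.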
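When generating the inputs of the image capture module 10, the object detection module 20, the multi-range generation module 30, and the multi-channel image merging module 40, the present invention may include the following procedures P11 to P14, which are explained using a human face as an example.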
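Procedure P11: The image capture module 10 captures an image or image frame A. For example, the image capture module 10 (such as a camera or sensor) may capture an image or image frame A, such as an RGB image (picture) or image frame, where RGB denotes the three primary colors red, green, and blue.
Procedure P12: The object detection module 20 detects position information B of an object in the image or image frame A captured by the image capture module 10. For example, the object detection module 20 may use an object detection algorithm (such as a face detection algorithm) to detect the position information B (such as bounding box coordinates) of the object (such as a face) in the image or image frame A.
Procedure P13: The multi-range generation module 30 uses the position information B of the object detected by the object detection module 20 to generate or provide multi-range information of the image of the object. For example, depending on the application, the multi-range generation module 30 may expand or shrink the boundary of the object (such as a face) from the center or another position of the detected object, and determine the expansion or shrinkage ratios and the number of images of the object to be captured. After the multiple images C1 to Cn of the object have been captured, the multi-range generation module 30 may then scale them to a fixed height and width. Here the fixed size of the object image is taken to be 224x224 (in any unit) as an example, and the images of the object at different ranges (such as RGB images) are assumed to be the multiple images C1 to Cn (n images), where n is a positive integer.
Procedure P14: The multi-channel image merging module 40, according to the multi-range information generated or provided by the multi-range generation module 30, samples multiple images C1 to Cn (n images) of different region ranges from the object and merges the multiple images C1 to Cn into one multi-channel image C. For example, the multi-channel image merging module 40 may merge the multiple images C1 to Cn along the channel dimension into one multi-channel image C. If each image (picture) has dimensions 224x224x3, then after the n images are merged, the multi-channel image has dimensions 224x224x3n.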
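Procedures P13 and P14 can be sketched with NumPy as follows. This is an illustrative sketch only: the function names, the scale factors 0.8, 1.0, and 1.5, the scaling about the box center, the clipping to the frame, and the nearest-neighbor resize are all assumptions chosen for brevity; the description above fixes only the 224x224 target size and the merge along the channel dimension.

```python
import numpy as np


def scale_box(box, factor, height, width):
    """Scale an (x0, y0, x1, y1) box about its center and clip it to the frame."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    half_w = (x1 - x0) * factor / 2.0
    half_h = (y1 - y0) * factor / 2.0
    return (
        int(max(0, round(cx - half_w))),
        int(max(0, round(cy - half_h))),
        int(min(width, round(cx + half_w))),
        int(min(height, round(cy + half_h))),
    )


def resize_nearest(img, size):
    """Nearest-neighbor resize of an H x W x C array to size x size x C."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def multi_range_image(frame, box, factors=(0.8, 1.0, 1.5), size=224):
    """Crop one region per scale factor (P13) and stack them on channels (P14)."""
    h, w = frame.shape[:2]
    crops = []
    for factor in factors:
        x0, y0, x1, y1 = scale_box(box, factor, h, w)
        crops.append(resize_nearest(frame[y0:y1, x0:x1], size))
    return np.concatenate(crops, axis=-1)  # size x size x 3n


rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)  # stand-in for A
merged = multi_range_image(frame, box=(200, 120, 360, 320))       # stand-in for C
```

With n = 3 ranges of an RGB frame, `merged` has 3n = 9 channels, matching the 224x224x3n dimensions stated in procedure P14.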
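Figure 4 is a schematic diagram of an embodiment of the shared network layers 51 of the shared feature extraction module 50 in Figure 2 of the present invention. As shown in Figures 3 and 4, after the inputs of the image capture module 10, the object detection module 20, the multi-range generation module 30, and the multi-channel image merging module 40 are generated in Figure 3, the data passes in sequence through the 0th shared network layer L0, the 1st shared network layer L1, and so on up to the k-th shared network layer Lk of the shared network layers 51 of the shared feature extraction module 50 in Figure 4.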
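As shown in Figure 4, the shared feature extraction module 50 having the shared network layers 51 may use a neural network to extract shared features from the multi-channel image C merged by the multi-channel image merging module 40. For example, the shared network layers 51 of the shared feature extraction module 50 may, through the following procedures P21 and P22, use a neural network (such as a convolutional neural network, CNN) to extract the shared features from the multi-channel image C. The following takes the 0th shared network layer L0 and two-dimensional (2D) convolution as an example; the 1st shared network layer L1 through the k-th shared network layer Lk follow by analogy, and the shared feature extraction module 50 may choose two-dimensional (2D) or three-dimensional (3D) convolution depending on the application.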
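Procedure P21: The shared network layer 51 of the shared feature extraction module 50 (such as the 0th shared network layer L0) may use (for example, one after another) kernels (such as convolution kernels) of different heights and widths to convolve the multi-channel image C merged by the multi-channel image merging module 40. Note that when the shared network layer 51 (such as the 0th shared network layer L0) has 3n input channels, the kernels of its two-dimensional (2D) convolution have a depth of 3n.
Procedure P22: Assuming there are multiple (such as m) kernels of different heights and widths in total, convolving the multi-channel image C in the shared network layer 51 (such as the 0th shared network layer L0) produces a feature map F of depth m, which serves as, or is provided to, the next shared network layer (such as the 1st shared network layer L1), where m is a positive integer.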
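Procedures P21 and P22 can be illustrated with a deliberately naive NumPy convolution. The sketch below is not an efficient or authoritative implementation: `conv2d_same` and `shared_layer0` are hypothetical names, and zero padding with odd kernel sizes is an assumption added here so that kernels of different heights and widths yield outputs of equal spatial size that can be stacked into one depth-m feature map.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Slide one kh x kw x C kernel over an H x W x C image with zero padding.

    Assumes odd kh and kw so the output keeps the H x W spatial size.
    """
    h, w, c = image.shape
    kh, kw, kc = kernel.shape
    if kc != c:
        raise ValueError("kernel depth must equal the number of input channels")
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)))  # zero padding
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw, :] * kernel)
    return out


def shared_layer0(image, kernels):
    """Apply m kernels and stack the m responses into a depth-m feature map."""
    return np.stack([conv2d_same(image, k) for k in kernels], axis=-1)


n = 3                                  # number of stacked RGB ranges
image = np.ones((8, 8, 3 * n))         # small stand-in for multi-channel image C
kernels = [np.ones((size, size, 3 * n)) for size in (1, 3, 5, 3)]  # m = 4
feature_map = shared_layer0(image, kernels)                        # stand-in for F
```

Each kernel spans all 3n input channels and contributes exactly one output channel, so m kernels give a feature map of depth m regardless of n.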
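After producing the feature map F of the shared network layer 51 (such as the 0th shared network layer L0), the shared feature extraction module 50 may, depending on the scenario, build deeper shared network layers (such as the 1st shared network layer L1 through the k-th shared network layer Lk). For example, if the one or more task-specific models 62 of the task-specific feature extraction module group 60 shown in Figure 2 all use ResNet or the ResNet family, such as ResNet18, ResNet34, or ResNet50, the shared feature extraction module 50 may include the subsequent batch normalization layer, ReLU activation function layer, max pooling layer, and so on in the shared network layers 51 (such as the 0th shared network layer L0 through the k-th shared network layer Lk). Thus, with the shared network layers 51 of the shared feature extraction module 50, multiple independent models can be integrated into one model through the shared network layers 51, which reduces the overall size; that is, the single model integrated through sharing is smaller than the sum of the independent models.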
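After computing all of the shared network layers 51 (such as the 0th shared network layer L0 through the k-th shared network layer Lk), the shared feature extraction module 50 may produce the shared feature map 52 and input it to each task-specific model 62 of the task-specific feature extraction module group 60 shown in Figure 2.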
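For example, because the range reduction and expansion performed by the multi-range generation module 30 shown in Figure 3 can effectively cover the fine facial texture features, the features around the head, and the background features of the object (such as a face, head, or background), the face recognition model, fake-face recognition model, gender recognition model, age recognition model, and the like are all task-specific models 62 that are suitable for sharing within the task-specific feature extraction module group 60 shown in Figure 2.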
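As further shown in Figures 1 and 2, after the different task-specific feature extraction modules 61 of the task-specific feature extraction module group 60 output (for example, in parallel) the results of the different task-specific models 62 to the cross-task application service module 70, the cross-task application service module 70 can combine different task-specific models 62 for different applications. For example, the cross-task application service module 70 may first use a first task-specific model 62 (such as the fake-face recognition model) to filter out counterfeit faces, and then use the recognition results of a second task-specific model 62 (such as the face recognition model), a third task-specific model 62 (such as the gender recognition model), and a fourth task-specific model 62 (such as the age recognition model), so as to prevent the system from being misused by malicious parties.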
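The gating described above, where the fake-face result is checked before any other result is used, can be sketched as follows. The function name `access_decision`, the result keys, and the fail-safe treatment of a missing fake-face result are assumptions made for illustration.

```python
from typing import Dict, Optional


def access_decision(results: Dict[str, object]) -> Optional[Dict[str, object]]:
    """Return the remaining recognition results only if the face is judged real.

    `results` holds the parallel outputs of the task-specific models. A missing
    "is_fake" entry is treated as a fake face (fail-safe assumption).
    """
    if results.get("is_fake", True):
        return None  # counterfeit face: discard every other result
    return {task: value for task, value in results.items() if task != "is_fake"}


rejected = access_decision({"is_fake": True, "face": "alice"})
accepted = access_decision(
    {"is_fake": False, "face": "alice", "gender": "female", "age": 30}
)
```

Since all heads run on the same shared feature map, the gate costs no extra forward pass; it only decides whether the already computed results are released to the application.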
In summary, the multi-task object recognition system sharing multi-range features of the present invention may have at least the following features, advantages, or technical effects.
1. The present invention can solve the problem that multiple independent models are difficult to integrate when several tasks are applied to the same object.
2. The shared feature extraction module of the present invention may have shared network layers, which effectively reduce repeated or unnecessary network layers so as to reduce the model size and increase execution speed.
3. The present invention lets multiple recognition tasks share the shared network layers, which reduces the overall computational complexity, and lets multiple recognition tasks share the features extracted from multi-range images, which improves recognition accuracy.
4. The present invention can provide techniques such as multi-range image stacking, cross-task feature sharing, and a swappable feature extraction group. With these techniques, applications involving the same object but multiple recognition tasks are easy to integrate into a single model, which effectively reduces the model size and prediction time and also increases application flexibility.
5. The present invention can use multi-range feature sharing to integrate applications involving the same object but multiple recognition tasks into a single model through the shared network layers of the shared feature extraction module, effectively reducing the model size and prediction time.
6. The present invention uses multi-range (multi-channel) stacking and shared network layers to integrate the features the models need into a single model, so that the model size is reduced, which reduces the amount of computation at prediction time in the shared feature extraction module and increases prediction speed.
7. Even after the features the models need are integrated into a single model, the present invention retains the flexibility of independent models: different task-specific feature extraction modules can be swapped in for different applications, or different combinations of tasks can be customized for different applications.
8. The present invention has a broad range of applications and can be used for recognition or monitoring tasks on various image objects, such as face recognition, liveness recognition, gender and age recognition, human figure recognition, license plate recognition, vehicle recognition, video surveillance, and smart retail. Products to which the present invention may be applied include, for example, face-based attendance products, smart access control products, visitor analysis products, and electronic fence products.
The above embodiments merely illustrate the principles, features, and effects of the present invention and are not intended to limit its implementable scope. Anyone skilled in the art can modify and change the above embodiments without departing from the spirit and scope of the present invention. Any equivalent changes and modifications made using the content disclosed by the present invention shall still be covered by the claims. Therefore, the scope of protection of the present invention shall be as set forth in the claims.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW108145764A TWI734297B (en) | 2019-12-13 | 2019-12-13 | Multi-task object recognition system sharing multi-range features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW108145764A TWI734297B (en) | 2019-12-13 | 2019-12-13 | Multi-task object recognition system sharing multi-range features |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202123069A TW202123069A (en) | 2021-06-16 |
TWI734297B true TWI734297B (en) | 2021-07-21 |
Family
ID=77516602
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW108145764A TWI734297B (en) | 2019-12-13 | 2019-12-13 | Multi-task object recognition system sharing multi-range features |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI734297B (en) |
2019-12-13: TW application TW108145764A, patent TWI734297B, status active