TWI752478B - Image processing method and image processing system - Google Patents

Image processing method and image processing system

Info

Publication number
TWI752478B
TWI752478B (application TW109113973A)
Authority
TW
Taiwan
Prior art keywords
image data
image
training
block group
block
Prior art date
Application number
TW109113973A
Other languages
Chinese (zh)
Other versions
TW202141425A (en)
Inventor
陳冠文
陳柏亨
駱昭旭
黃翊庭
Original Assignee
台達電子工業股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 台達電子工業股份有限公司 filed Critical 台達電子工業股份有限公司
Priority to TW109113973A priority Critical patent/TWI752478B/en
Publication of TW202141425A publication Critical patent/TW202141425A/en
Application granted granted Critical
Publication of TWI752478B publication Critical patent/TWI752478B/en

Links

Images

Landscapes

  • Image Analysis (AREA)

Abstract

An image processing method includes: analyzing a plurality of image data with an image processing device based on an Illumination-invariant Feature Network (IF-Net) to generate corresponding sets of feature vectors, in which the image data include first image data related to at least a first feature of the sets of feature vectors and second image data related to at least a second feature of the sets of feature vectors; selecting, with the image processing device based on IF-Net, a corresponding first training patch group and a corresponding second training patch group from the first image data and the second image data, and performing computations on both training patch groups to generate at least one loss value; and adjusting IF-Net according to the at least one loss value. An image processing system is also disclosed herein.

Description

Image processing method and image processing system

The present disclosure relates to an image processing method and an image processing system, and in particular to an image processing method and an image processing system based on deep-learning feature matching that resists illumination interference.

Feature matching is widely used in many computer-vision fields such as image retrieval and camera localization. Image features must remain invariant and distinctive across changes in scale, orientation, viewing angle, and illumination.

However, in the field of environmental scene similarity comparison, existing feature-matching systems and methods are not optimized for illumination differences and changes in the scene's field of view, so the matching results fall short of expectations.

The present disclosure provides an image processing method that includes the following steps: analyzing, by an image data processing device based on a feature extraction model, a plurality of image data to generate a set of feature vectors corresponding to the image data, wherein the image data include a plurality of first image data related to at least one first feature in the feature vector set and a plurality of second image data related to at least one second feature in the feature vector set; selecting, by the image data processing device based on the feature extraction model, a corresponding first training patch group and a corresponding second training patch group from the first image data and the second image data respectively, and performing computations on the first training patch group and the second training patch group to generate at least one corresponding loss function value; and adjusting the feature extraction model according to the at least one loss function value, so that when the image data processing device analyzes the image data based on the adjusted feature extraction model, the degree of matching between the first image data and the second image data increases.

The present disclosure also provides an image processing system that includes an image capture device and an image data processing device. The image capture device captures a plurality of image data. The image data processing device is coupled to the image capture device, performs, based on a feature extraction model, a comparison operation on a plurality of first image data and a plurality of second image data among the image data, and outputs an image localization result according to the result of the comparison operation. The first image data are related to at least one first feature, the second image data are related to at least one second feature, and the feature extraction model is adjusted according to at least one loss function value generated by performing computations on the first image data and the second image data.

The image processing method and image processing system described above can improve the accuracy of a feature-matching system when the lighting of an outdoor scene varies greatly.

The embodiments below are described in detail with the accompanying drawings, but the specific embodiments described are only used to explain the present invention, not to limit it, and the description of structures and operations is not intended to limit their order of execution. Any device produced by recombining components with equivalent effect falls within the scope of the present disclosure.

Unless otherwise noted, terms used throughout the specification and claims generally carry the ordinary meaning of each term as used in this field, in the content disclosed herein, and in its specific context. Certain terms used to describe the present disclosure are discussed below, or elsewhere in this specification, to provide those skilled in the art with additional guidance.

FIG. 1 is a schematic diagram of an image processing system 100 according to some embodiments. As shown in FIG. 1, the image processing system 100 includes an image capture device 110 and an image data processing device 120. The image data processing device 120 is coupled to the image capture device 110. The image capture device 110 captures a plurality of image data 300, such as various photos or patterns, as described later with reference to FIG. 3, and streams them to the image data processing device 120.

In some embodiments, the image capture device 110 can be implemented by the camera of a smartphone, a standalone camera lens, or software with a screen-capture function.

In some embodiments, the image data processing device 120 can be implemented by a computer system such as a notebook or desktop computer.

In some embodiments, the image data processing device 120 includes a feature extraction model 130 and an instruction library 140. The feature extraction model 130 is preconfigured in the image data processing device 120, and its architecture is built on the IF-Net (Illumination-invariant Feature Network) deep-learning network architecture.

In some embodiments, the feature extraction model 130 uses IF-Net's deep-learning-based convolutional neural network (CNN) to train the feature descriptors it produces, and through this training it learns highly adaptable feature descriptors. In some embodiments, these highly adaptable feature descriptors can be used to reduce feature-matching errors when outdoor illumination varies greatly.
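The patent does not disclose IF-Net's exact layer configuration, so the following is only a minimal sketch of a CNN patch-descriptor network; the depths, channel counts, input size, and 128-dimensional output are assumptions borrowed from typical descriptor networks rather than the patented architecture.

```python
# Minimal sketch of a CNN patch-descriptor network (PyTorch).
# All layer sizes below are assumptions, not the patented IF-Net design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DescriptorNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(128, dim)

    def forward(self, patch):            # patch: (B, 1, 32, 32) grayscale
        x = self.features(patch).flatten(1)
        d = self.proj(x)
        return F.normalize(d, dim=1)     # unit-length feature descriptor
```

A network of this kind maps each image patch to a fixed-length vector, so patch similarity can be measured as a distance between descriptors.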

In some embodiments, the instruction library 140 stores operation instructions that are accessed and executed by a processor (not shown) in the image data processing device 120.

FIG. 2 is a flowchart of the operation of the image processing system 100 according to some embodiments. As shown in FIG. 2, the operation flow of the image processing system 100 includes steps S210, S220, S230, S240, S250, S260, and S270. For clarity, the image processing method 200 shown in FIG. 2 is described with reference to FIG. 1, but is not limited thereto.

In step S210, the image capture device 110 captures the current environment image as image data 300 (shown in FIG. 3) and inputs it to the image data processing device 120. Next, in step S220, the image data processing device 120 loads the feature extraction model 130, and in step S230 it loads an environment scene model.

In step S240, environmental features are extracted from the environment scene model and from the image data 300, and in step S250 the image data processing device 120 performs an environmental-feature similarity comparison on the image data 300. Then, in step S260, the image data processing device 120 performs spatial localization according to the comparison result of step S250, and in step S270 it outputs an image localization result based on that spatial localization.
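The patent does not specify how the similarity comparison of step S250 is carried out; one common approach, shown here purely as an assumption, is nearest-neighbor matching between descriptor sets with a ratio test.

```python
# Illustrative stand-in for the similarity comparison of step S250:
# nearest-neighbor descriptor matching with Lowe's ratio test.
# This specific matching rule is an assumption, not taken from the patent.
import numpy as np

def match_features(query_desc, map_desc, ratio=0.8):
    # Pairwise Euclidean distances: (m, n) matrix of query vs. map descriptors.
    d = np.linalg.norm(query_desc[:, None, :] - map_desc[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(d):
        j1, j2 = np.argsort(row)[:2]        # two closest map descriptors
        if row[j1] < ratio * row[j2]:       # keep only unambiguous matches
            matches.append((i, j1))
    return matches
```

The resulting correspondences are what a downstream spatial-localization step (S260) would consume.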

In some embodiments, in step S240, the image data processing device 120 analyzes the plurality of image data 300 based on the feature extraction model 130 to generate a set of feature vectors corresponding to the image data 300. The image data 300 include a plurality of first image data 310 related to at least one first feature vector in the feature vector set, and a plurality of second image data 320 related to at least one second feature vector in the set, as shown in FIG. 3 and discussed below.

FIG. 3 is a schematic diagram of the classification of image data 300 according to some embodiments. The first image data 310 include image data observed at different distances or viewing angles, and the second image data 320 include image data observed under different brightness or lighting; images observed in darker light or with lower brightness are drawn with hatching in FIG. 3. In other words, in some embodiments, the at least one first feature vector relates to at least one of the image viewing angle and the image observation distance, and the at least one second feature vector relates to at least one of image brightness, image gamma, and image contrast.
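Since the second image data are characterized by brightness, gamma, and contrast, photometric variants of this kind can be produced from a single source image. The sketch below shows one way to do so; the parameter values are arbitrary and the patent does not prescribe how such variants are generated.

```python
# Illustrative brightness / gamma / contrast variants of one image,
# matching the three photometric attributes named for the second image data.
import numpy as np

def photometric_variants(img, brightness=0.3, gamma=1.8, contrast=1.5):
    """img is a float array in [0, 1]; parameter values are arbitrary."""
    darker     = np.clip(img - brightness, 0.0, 1.0)              # brightness shift
    gammaed    = np.clip(img ** gamma, 0.0, 1.0)                  # gamma curve
    contrasted = np.clip((img - 0.5) * contrast + 0.5, 0.0, 1.0)  # contrast stretch
    return darker, gammaed, contrasted
```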

As noted above, in the field of environmental-feature similarity comparison, the feature extraction model 130 of the prior art resists interference from illumination differences and changes in the scene's field of view poorly, so matching results fall short of expectations. The present disclosure therefore provides an image processing method that adjusts the feature extraction model 130 to improve the matching results.

In some embodiments, the image processing method includes selecting, by the image data processing device 120 based on the feature extraction model 130, a corresponding first training patch group and a corresponding second training patch group from the first image data 310 and the second image data 320 respectively, as illustrated in FIGS. 4A and 4B below.

FIGS. 4A and 4B are schematic diagrams of a sample screening mechanism 400 according to some embodiments. Referring to FIG. 4A, in some embodiments the image data processing device 120 selects the first training patch group and the second training patch group from the first image data 310 and the second image data 320 according to the sample screening mechanism 400.

In some embodiments, the first image data 310 and the second image data 320 each contain an anchor patch (anchor), a plurality of positive patches (positive), and a plurality of negative patches (negative). A positive patch has a high matching value with the anchor patch, and therefore a short Euclidean distance to it. A negative patch, by contrast, has a low matching value with the anchor patch, and therefore a long Euclidean distance to it.

In some embodiments, as shown in FIG. 4B, this Euclidean distance is the distance between patches as measured in the output space of the feature descriptor produced by the feature extraction model 130. For example, the feature vectors of a positive patch and of the anchor patch have a short Euclidean distance L1 in descriptor space, while the feature vectors of a negative patch and of the anchor patch have a longer Euclidean distance L2.
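Concretely, the distances L1 and L2 of FIG. 4B are ordinary Euclidean distances between descriptor vectors, as in this brief sketch (the descriptor dimension is an assumption):

```python
# Euclidean distances in descriptor space, mirroring L1 and L2 in FIG. 4B.
# Each argument is a 1-D descriptor, e.g. a 128-D model output (assumed size).
import numpy as np

def triplet_distances(anchor, positive, negative):
    d_pos = np.linalg.norm(anchor - positive)   # expected short (L1)
    d_neg = np.linalg.norm(anchor - negative)   # expected long  (L2)
    return d_pos, d_neg
```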

Accordingly, in different embodiments, the distances between patches as measured by the descriptor produced by the trained feature extraction model 130 will change. For example, the feature vectors of a positive patch and of the anchor patch will have a Euclidean distance L3 shorter than L1 in descriptor space, while the feature vectors of a negative patch and of the anchor patch will have a Euclidean distance L4 longer than L2. In other words, the features extracted by the image data processing device 120 based on the trained feature extraction model 130 will match one another to a higher degree than before.

Therefore, in some embodiments, the step in which the image data processing device 120 selects the first training patch group includes selecting, from the positive patches of the first image data 310, at least one positive patch that matches the anchor patch of the first image data 310 to the lowest degree, and selecting, from the negative patches of the first image data 310, at least one negative patch that matches that anchor patch to the highest degree. In other words, the first training patch group contains the positive patch with the longest Euclidean distance to the anchor patch of the first image data 310 and the negative patch with the shortest Euclidean distance to it.

In other embodiments, the step in which the image data processing device 120 selects the second training patch group includes selecting, from the positive patches of the second image data 320, at least one positive patch that matches the anchor patch of the second image data 320 to the lowest degree, and selecting, from the negative patches of the second image data 320, at least one negative patch that matches that anchor patch to the highest degree. In other words, the second training patch group contains the positive patch with the longest Euclidean distance to the anchor patch of the second image data 320 and the negative patch with the shortest Euclidean distance to it.

Through the steps described in the embodiments above, the Euclidean distance between the anchor patch and the negative patches can be effectively maximized, and the Euclidean distance between the anchor patch and the positive patches shortened, so that the feature extraction model 130 produces more representative feature descriptors. A sketch of this selection follows.
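The hardest-example selection just described maps directly onto an argmax over positive distances and an argmin over negative distances; the array shapes below are assumptions.

```python
# Hardest-in-batch selection, following the screening mechanism 400:
# pick the lowest-matching positive (longest distance to the anchor) and
# the highest-matching negative (shortest distance to the anchor).
import numpy as np

def hardest_training_pair(anchor, positives, negatives):
    """anchor: (d,), positives: (p, d), negatives: (q, d) descriptors."""
    d_pos = np.linalg.norm(positives - anchor, axis=1)
    d_neg = np.linalg.norm(negatives - anchor, axis=1)
    hardest_pos = positives[np.argmax(d_pos)]   # lowest-matching positive
    hardest_neg = negatives[np.argmin(d_neg)]   # highest-matching negative
    return hardest_pos, hardest_neg
```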

In some embodiments, the image data processing device 120 executes the operation instructions in the instruction library 140 to perform computations on the first training patch group and the second training patch group, generating at least one corresponding loss function value. This computation includes applying an outlier loss function to the first training patch group and the second training patch group to produce the at least one loss function value. The published image of the formula is not reproduced in this text; consistent with the definitions below, and assuming a standard triplet-margin arrangement with unit margin, the outlier loss takes the form

$$ L = \frac{1}{n}\sum_{i=1}^{n} \max\bigl(0,\; 1 + w_p\, d_M(a_i, p_i) - w_n\, d_m(a_i, n_i)\bigr) $$

where L is the loss function value; n is the total number of patches; w_p and w_n are weight values; d_M(a_i, p_i) is the Euclidean distance between the anchor patch and a positive patch; and d_m(a_i, n_i) is the Euclidean distance between the anchor patch and a negative patch.

The weight value w_p is the average of the Euclidean distances between the positive patches and the anchor patches within the same batch of image data 300, and the weight value w_n is the average of the Euclidean distances between the negative patches and the anchor patches within the same batch. The published formula image is likewise unavailable; written out from this description:

$$ w_p = \frac{1}{n}\sum_{i=1}^{n} d_M(a_i, p_i), \qquad w_n = \frac{1}{n}\sum_{i=1}^{n} d_m(a_i, n_i) $$
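Putting the two definitions together, a batch implementation of the outlier loss can be sketched as follows; the margin value and the max(0, ·) arrangement are assumptions carried over from the reconstruction above, since the patent's published formula image is not reproduced in this text.

```python
# Sketch of the weighted outlier loss from the definitions above.
# d_pos[i] = d_M(a_i, p_i) and d_neg[i] = d_m(a_i, n_i) for one batch.
# The margin of 1.0 is an assumption, not confirmed by the patent text.
import numpy as np

def outlier_loss(d_pos, d_neg, margin=1.0):
    w_p = d_pos.mean()                      # batch-average positive distance
    w_n = d_neg.mean()                      # batch-average negative distance
    per_sample = np.maximum(0.0, margin + w_p * d_pos - w_n * d_neg)
    return per_sample.mean()
```

Because the weights are batch averages, a single outlier distance moves w_p or w_n only slightly, which is consistent with the noise-damping behavior described next.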

As noted above, if noise is mixed into the group of data being processed in the same batch, and that noise is an outlier relative to the training data, it will degrade training performance. Therefore, in some embodiments, the loss function value above can reduce the influence of noise when training the IF-Net deep network, so that the feature extraction model 130 converges more effectively during training.

In some embodiments, the image processing method includes adjusting the feature extraction model 130 according to the at least one loss function value, so that when the image data processing device 120 analyzes the image data 300 based on the adjusted feature extraction model 130, the degree of matching between the first image data 310 and the second image data 320 increases.

In some embodiments, the step of adjusting the feature extraction model 130 further includes inputting the first image data 310 and the second image data 320 into a shared-weight instance of the feature extraction model 130 to produce their corresponding, separate loss function values, and storing the loss function values corresponding to the first image data and the second image data while updating at least one network parameter of the feature extraction model 130, as illustrated in FIG. 5 below.

FIG. 5 is a flowchart of a grouped shared-parameter step 500 according to some embodiments. As shown in FIG. 5, in some embodiments the image data processing device 120, based on the feature extraction model 130, accesses the operation instructions in the instruction library 140 and operates on the first image data 310 to produce a first loss function value, and on the second image data 320 to produce a second loss function value. The image data processing device 120 stores the first and second loss function values and updates the network parameters of the feature extraction model 130.

In some embodiments, this method of inputting the first image data 310 and the second image data 320 into a shared-weight IF-Net and updating the network parameters in a single pass lets the feature extraction model 130 handle different types of data more effectively.
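One way to realize this single-pass update is sketched below: the same model instance (hence the same weights) processes both data groups, the two losses are combined, and the parameters are updated once. The optimizer choice, the equal weighting of the two losses, and the loss_fn signature are assumptions.

```python
# Sketch of one shared-weight update step (PyTorch), per FIG. 5.
# loss_fn(model, batch) is a hypothetical helper that forms the
# per-group outlier loss; it is not named in the patent.
import torch

def shared_weight_step(model, optimizer, batch_view, batch_light, loss_fn):
    loss_view  = loss_fn(model, batch_view)    # first image data (viewpoint group)
    loss_light = loss_fn(model, batch_light)   # second image data (illumination group)
    total = loss_view + loss_light             # one combined objective
    optimizer.zero_grad()
    total.backward()                           # single backward pass
    optimizer.step()                           # parameters updated once
    return total.item()
```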

In some embodiments, the image data processing device 120 performs, based on the adjusted feature extraction model 130, at least one of the following operations: the comparison operation of step S250 on the first image data 310 and the second image data 320, the spatial localization operation of step S260 according to the result of that comparison, and the output of the image localization result of step S270 according to the result of the spatial localization.

Based on the above, with the adjusted feature extraction model 130 the image data processing device 120 will raise the matching degree of the image data 300 in step S250, and the image localization result of step S270 will also be more accurate. Table 1 below compares matching-degree data produced according to the embodiments of this disclosure.

Table 1
                          HardNet     IF-Net
Verification (Inter)      93.85%      93.83%
Verification (Intra)      93.53%      93.57%
Retrieval                 60.85%      93.57%
Matching                  41.64%      48.65%

Under large changes of lighting or scene, judging accuracy by the proportion of features the feature extraction model 130 extracts from the image data 300, Table 1 clearly shows that the image processing method and system proposed in this disclosure improve matching accuracy.

Although the present disclosure has been described above by way of embodiments, they are not intended to limit it. Anyone with ordinary skill in the art may make various changes and refinements without departing from the spirit and scope of the present disclosure, so its scope of protection is defined by the appended claims.

110: image capture device
120: image data processing device
130: feature extraction model
140: instruction library
200: image processing method
S210, S220, S230, S240, S250, S260, S270: steps
300: image data
310: first image data
320: second image data
400: sample screening mechanism
L1~L4: Euclidean distances
500: grouped shared-parameter step

FIG. 1 is a schematic diagram of an image processing system according to some embodiments.

FIG. 2 is a flowchart of the operation of the image processing system according to some embodiments.

FIG. 3 is a schematic diagram of the classification of image data according to some embodiments.

FIGS. 4A and 4B are schematic diagrams of a sample screening mechanism according to some embodiments.

FIG. 5 is a flowchart of the grouped shared-parameter step according to some embodiments.

200: image processing method
S210, S220, S230, S240, S250, S260, S270: steps

Claims (9)

1. An image processing method, comprising the steps of: analyzing, by an image data processing device based on a feature extraction model (IF-Net), a plurality of image data to generate a set of feature vectors corresponding to the image data, wherein the image data comprise a plurality of first image data related to at least one first feature vector in the feature vector set and a plurality of second image data related to at least one second feature vector in the feature vector set; selecting, by the image data processing device based on the feature extraction model, a corresponding first training patch group and a corresponding second training patch group from the first image data and the second image data respectively, and performing computations on the first training patch group and the second training patch group to generate at least one corresponding loss function value; and adjusting the feature extraction model according to the at least one loss function value, wherein the step of selecting the first training patch group comprises: selecting, from a plurality of positive patches of the first image data, at least one positive patch that matches an anchor patch of the first image data to the lowest degree as the first training patch group; and selecting, from a plurality of negative patches of the first image data, at least one negative patch that matches the anchor patch of the first image data to the highest degree as the first training patch group.

2. The image processing method of claim 1, wherein when the image data processing device analyzes the image data based on the adjusted feature extraction model, the degree of matching between the first image data and the second image data increases.

3. The image processing method of claim 1, further comprising performing, by the image data processing device based on the adjusted feature extraction model, at least one of the following operations: performing a comparison operation on the first image data and the second image data; performing a spatial localization operation according to a result of the comparison operation; and outputting an image localization result according to a result of the spatial localization operation.

4. The image processing method of claim 1, wherein the at least one first feature vector relates to at least one of an image viewing angle and an image observation distance, and the at least one second feature vector relates to at least one of image brightness, image gamma, and image contrast.

5. The image processing method of claim 1, wherein the step of selecting the second training patch group comprises: selecting, from a plurality of positive patches of the second image data, at least one positive patch that matches an anchor patch of the second image data to the lowest degree as the second training patch group; and selecting, from a plurality of negative patches of the second image data, at least one negative patch that matches the anchor patch of the second image data to the highest degree as the second training patch group.

6. The image processing method of claim 1, wherein the step of performing computations on the first training patch group and the second training patch group comprises applying an outlier loss function to the first training patch group and the second training patch group to generate the corresponding at least one loss function value.

7. The image processing method of claim 1, wherein the step of adjusting the feature extraction model further comprises: inputting the first image data and the second image data into the shared-weight feature extraction model to respectively generate corresponding, different loss function values; and storing the different loss function values corresponding to the first image data and the second image data and updating at least one network parameter of the feature extraction model.

8. An image processing system, comprising: an image capture device for capturing a plurality of image data; and an image data processing device coupled to the image capture device, the image data processing device analyzing the image data based on a feature extraction model to generate a set of feature vectors corresponding to the image data, wherein the image data comprise a plurality of first image data related to at least one first feature vector in the feature vector set and a plurality of second image data related to at least one second feature vector in the feature vector set; wherein the image data processing device, based on the feature extraction model, performs: selecting, from a plurality of positive patches of the first image data, at least one positive patch that matches an anchor patch of the first image data to the lowest degree as the first training patch group; selecting, from a plurality of negative patches of the first image data, at least one negative patch that matches the anchor patch of the first image data to the highest degree as the first training patch group; and selecting a corresponding second training patch group from the second image data, and performing computations on the first training patch group and the second training patch group to generate at least one corresponding loss function value; wherein the feature extraction model is adjusted according to the at least one loss function value generated by performing computations on the first image data and the second image data.

9. The image processing system of claim 8, wherein the first feature vector relates to at least one of an image viewing angle and an image observation distance, and the at least one second feature vector relates to at least one of image brightness, image gamma, and image contrast.
TW109113973A 2020-04-27 2020-04-27 Image processing method and image processing system TWI752478B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109113973A TWI752478B (en) 2020-04-27 2020-04-27 Image processing method and image processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW109113973A TWI752478B (en) 2020-04-27 2020-04-27 Image processing method and image processing system

Publications (2)

Publication Number Publication Date
TW202141425A TW202141425A (en) 2021-11-01
TWI752478B true TWI752478B (en) 2022-01-11

Family

ID=80783225

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109113973A TWI752478B (en) 2020-04-27 2020-04-27 Image processing method and image processing system

Country Status (1)

Country Link
TW (1) TWI752478B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011056731A2 (en) * 2009-11-06 2011-05-12 Sony Corporation Real time hand tracking, pose classification, and interface control
TW201843654A (en) * 2017-08-30 2018-12-16 大陸商騰訊科技(深圳)有限公司 Image description generation method, model training method, devices and storage medium
CN109558806A (en) * 2018-11-07 2019-04-02 北京科技大学 The detection method and system of high score Remote Sensing Imagery Change
CN110647865A (en) * 2019-09-30 2020-01-03 腾讯科技(深圳)有限公司 Face gesture recognition method, device, equipment and storage medium

Also Published As

Publication number Publication date
TW202141425A (en) 2021-11-01

Similar Documents

Publication Publication Date Title
WO2020239015A1 (en) Image recognition method and apparatus, image classification method and apparatus, electronic device, and storage medium
KR102385463B1 (en) Facial feature extraction model training method, facial feature extraction method, apparatus, device and storage medium
US10217221B2 (en) Place recognition algorithm
CN108491856B (en) Image scene classification method based on multi-scale feature convolutional neural network
CN113868366B (en) Streaming data-oriented online cross-modal retrieval method and system
CN111160102A (en) Training method of face anti-counterfeiting recognition model, face anti-counterfeiting recognition method and device
CN106599864A (en) Deep face recognition method based on extreme value theory
CN104202448A (en) System and method for solving shooting brightness unevenness of mobile terminal camera
CN111582027B (en) Identity authentication method, identity authentication device, computer equipment and storage medium
CN111460946A (en) Image-based chip information rapid acquisition and identification method
CN103955713B (en) A kind of icon-based programming method and apparatus
TWI752478B (en) Image processing method and image processing system
CN113642593B (en) Image processing method and image processing system
CN106845555A (en) Image matching method and image matching apparatus based on Bayer format
US11195265B2 (en) Server and method for recognizing image using deep learning
CN111291611A (en) Pedestrian re-identification method and device based on Bayesian query expansion
US20210334580A1 (en) Image processing device, image processing method and image processing system
US11238624B2 (en) Image transform method and image transform network
Zhou et al. Test-time domain generalization for face anti-spoofing
KR102101481B1 (en) Apparatus for lenrning portable security image based on artificial intelligence and method for the same
WO2023047162A1 (en) Object sequence recognition method, network training method, apparatuses, device, and medium
WO2019224947A1 (en) Training device, image generation device, training method, image generation method and program
JP6349477B1 (en) Similarity determination program
CN112634143A (en) Image color correction model training method and device and electronic equipment
KR102565318B1 (en) Method and apparatus for acquiring continuous images from blurred image