TWI494900B

TWI494900B - Method of real time image tracking

Info

Publication number: TWI494900B
Application number: TW102139318A
Authority: TW
Inventors: Jiunn Lin Wu; Chia Feng Chang; Ting Yu Tsai
Original assignee: Nat Univ Chung Hsing
Priority date: 2013-10-30
Filing date: 2013-10-30
Publication date: 2015-08-01
Also published as: TW201516966A

Description

Real-time image tracking method

The present invention relates to a real-time image tracking method, and in particular to a real-time image tracking method that remains applicable when the object in the template image and the target object in the original image (the image to be matched) differ in motion, size, or shape.

Image tracking is widely used in military applications and video processing and is an actively developing field. A common approach is the full search method: the template image is overlaid on the original image and compared pixel by pixel, and the template is then moved across the original image. At each position every pixel is compared once, and the pixel differences between the two images are summed to give the similarity score for that position. After the whole original image has been compared, the position with the smallest sum of differences is taken as the matching position. This method is time-consuming, however, and the cost becomes very large when the original image is large.
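The full search described above can be sketched as follows. This is a minimal illustration rather than code from the patent: the name `full_search`, the list-of-rows image representation, and the use of the absolute difference per pixel are all assumptions made for the example.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def full_search(original: Image, template: Image) -> Tuple[int, int]:
    """Return (row, col) of the placement with the smallest summed difference."""
    th, tw = len(template), len(template[0])
    oh, ow = len(original), len(original[0])
    best_pos, best_cost = (0, 0), None
    for r in range(oh - th + 1):
        for c in range(ow - tw + 1):
            # Assumed cost: sum of absolute pixel differences at this placement.
            cost = sum(
                abs(original[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (r, c), cost
    return best_pos


original = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [0, 0, 0, 0],
]
template = [[9, 8], [7, 9]]
print(full_search(original, template))  # (1, 1)
```

Every placement touches every template pixel, so the work grows with the product of the two image areas, which is the cost the passage refers to.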

In image processing, the median threshold bitmap has been used for image alignment. It uses the median pixel value of the template image and of the original image as a threshold, converts both images to bitmaps, and aligns them. Because only bit values need to be compared, alignment is fast. Image alignment, however, does not face the variables that image tracking does, such as an object whose size and motion differ between the template image and the original image. In addition, when one color occupies most of an image, the median is very likely to be that color, especially when the image is small and carries little information. The median threshold bitmap of the original image may then differ from that of the template image and distort the matching result. The median threshold bitmap is therefore not suitable for image tracking.

A new image processing method is therefore needed to achieve fast image tracking.

The object of the present invention is to provide a real-time image tracking method carried out by an image processing device. The method comprises the steps of: (A) inputting a template image and an original image; (B) using an image pyramid to scale the template image and the original image into several image levels of different sizes, the levels being arranged in order of image size; (C) converting the template image and the original image at the smallest image level into threshold bitmaps, to obtain a mean threshold bitmap of each; (D) matching the mean threshold bitmap of the template image against that of the original image by similarity, to find a matching position of the template image in the original image; (E) mapping the matching position obtained in step (D) onto the original image of the next image level, expanding a range around that position, converting the template image and the original image of that level into threshold bitmaps, and then performing the mean threshold bitmap similarity matching only within that range; and (F) repeating step (E) until the image level is the original-size level, and outputting the matching position at that level as the matching result. An object matching the template image can thereby be found in the original image. In addition, an enlarged original image may be added as the last level of the image pyramid and the template matching procedure performed once more; the resulting matched displacement is then divided by two to obtain a more precise matching position.

11, 31‧‧‧Original image

12, 32‧‧‧Template image

13‧‧‧Image processing device

14‧‧‧Schematic of the completed match

33‧‧‧Mean threshold bitmap of the original image

34‧‧‧Mean threshold bitmap of the template image

50‧‧‧Image pyramid

51‧‧‧Smallest image level

511‧‧‧Matching position at the smallest image level

52‧‧‧Next image level

521‧‧‧Matching position at the next image level

53‧‧‧Original-size image level

531‧‧‧Final matching position

101‧‧‧Original pixel in the enlarged image

102‧‧‧First-type interpolated pixel

103‧‧‧Second-type interpolated pixel

S21, S22, S23, S24, S25, S26, S27, S28‧‧‧Steps

S31, S32, S33, S34‧‧‧Steps

S41, S42, S43, S44, S45‧‧‧Steps

S61, S62, S63‧‧‧Steps

S71, S72, S73, S74, S75, S76, S77, S78‧‧‧Steps

S81, S82, S83, S84, S85, S86, S87, S88‧‧‧Steps

S91, S92, S93‧‧‧Steps

Figure 1 is a schematic diagram of image matching according to the present invention.

Figure 2 is a flow chart of the image matching method of the present invention.

Figure 3 is a flow chart of the mean threshold bitmap conversion of the present invention.

Figure 4 is a flow chart of the mean threshold bitmap matching of the present invention.

Figure 5 is a flow chart of the image pyramid of the present invention.

Figure 6 is a flow chart of the present invention combined with the exclusive-OR (XOR) bitmap operation.

Figure 7 is another flow chart of the image matching method of the present invention.

Figure 8 is a flow chart of the method of the present invention for obtaining a more precise matching position.

Figure 9 is a flow chart of another method of the present invention for improving matching accuracy.

Figure 10 is a schematic diagram of the bilinear interpolation of the present invention.

The present invention provides a real-time image tracking method. Referring to Figure 1, tracking means finding, in the original image (11), the object (14) that matches the template image (12), for example the "C" shown in Figure 1. An image can be regarded as a plurality of pixels (121) arranged in a matrix; the template image (12) in Figure 1, for example, is in fact made up of many pixels (121). The method is implemented by an image processing device (13), which may be a computer, a microprocessor, or the like; any device with computing capability can serve as the image processing device of the invention.

Figure 2 is a flow chart of one embodiment of the invention. In step S21, an image processing device obtains a template image and an original image to be matched. In step S22, the image processing device uses an image pyramid to scale the template image and the original image into several images of different sizes, which form several image levels. In step S23, the image levels are arranged in order of image size: the smallest level is placed at the top of the image pyramid as the first level, and the original-size level is placed at the bottom. The levels may differ in size by an integer factor, for example a factor of 2 between adjacent levels, but are not limited to this. Once the levels are arranged, step S24 converts the template image and the original image at the smallest image level into mean threshold bitmaps, one for each image. The mean threshold bitmap is described in more detail below.
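Steps S22 and S23 can be illustrated with a small pyramid builder. This is a sketch under stated assumptions: the text fixes neither the resampling rule nor the number of levels, so averaging 2×2 blocks and the names `downscale_by_two` and `build_pyramid` are choices made only for this example.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def downscale_by_two(image: Image) -> Image:
    """Halve each dimension by averaging 2x2 blocks (an assumed resampling rule)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) // 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image: Image, levels: int) -> List[Image]:
    """Return the levels ordered smallest first, original size last."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downscale_by_two(pyramid[-1]))
    pyramid.reverse()  # the smallest level becomes the first level (pyramid top)
    return pyramid


image = [[r * 8 + c for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, 3)
print([(len(level), len(level[0])) for level in pyramid])  # [(2, 2), (4, 4), (8, 8)]
```

The same function would be applied to both the template image and the original image so that the two pyramids have matching scale factors at each level.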

After the mean threshold bitmaps of the two images are obtained, step S25 matches the mean threshold bitmap of the template image against that of the original image by similarity, to find the matching position of the template image in the original image and the matching center block of that position. The similarity matching is described in more detail below. Step S26 then maps the matching center block obtained in step S25 onto the original image of the next level. Because the original image of the next level is necessarily larger than that of the previous level, the mapped matching center block is used as a center and expanded outward to a range. Step S27 then converts the template image and the original image of the next level into mean threshold bitmaps and performs the similarity matching within that range. The conversion of the original image preferably covers only the image inside the range, although the whole original image, or any chosen region, may be converted instead. Step S28 then checks whether the image is at its original size. If it is not, steps S26 to S27 are repeated to perform the similarity matching at the following level. If it is, the matching position found at that level is output and taken as the location of the template object in the original image, which completes the tracking of the template image in the original image.

Figure 3 is a detailed flow chart of the mean threshold bitmap conversion of step S24 or S27. In step S31, a color conversion is applied to the original image (31) and the template image (32). The color conversion is preferably a grayscale conversion, but is not limited to this; for clarity, grayscale conversion is used as the example below. After the grayscale conversion, step S32 sets a mean value as the threshold for the subsequent bitmap conversion. Each image may use the mean of its own grayscale values (the template image uses the template mean and the original image uses the original-image mean). Preferably, however, both the template image and the original image use the mean grayscale value of the template image as the threshold. The reason is that the means of the original image and the template image sometimes differ greatly (for example, because the background of the original image is too complex), while the original image generally contains the object in the template image, so using the template mean gives the most accurate result. Once the mean is set, steps S33 and S34 compute the mean threshold bitmaps of the template image and the original image using the following condition:

$$B(x) = \begin{cases} 1, & x > \mathrm{Mean} \\ 0, & x \le \mathrm{Mean} \end{cases}$$

where x is the grayscale value of a pixel, Mean is the mean grayscale value of the image, and B(x) is the resulting bit. After this conversion, pixels of the template image and the original image that are greater than the mean (bit 1) are displayed in white, and pixels less than or equal to the mean (bit 0) are displayed in black. The display colors need not be white and black; any two distinguishable colors can be used. The converted original image (33) and template image (34) are then output.
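The conversion of steps S32 to S34 can be sketched as below, using the preferred variant in which both images are thresholded against the template mean. The function names and the tiny sample images are illustrative assumptions; the inputs are taken to be grayscale already.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def mean_gray(image: Image) -> float:
    """Mean grayscale value of all pixels in the image."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def mean_threshold_bitmap(image: Image, mean: float) -> Image:
    """1 where the pixel is greater than the mean, 0 where it is less or equal."""
    return [[1 if p > mean else 0 for p in row] for row in image]


template = [[10, 200], [220, 30]]
original = [[5, 5, 5], [5, 210, 90], [5, 240, 20]]

threshold = mean_gray(template)  # 115.0, computed from the template only
print(mean_threshold_bitmap(template, threshold))  # [[0, 1], [1, 0]]
print(mean_threshold_bitmap(original, threshold))  # [[0, 0, 0], [0, 1, 0], [0, 1, 0]]
```

Had the original image been thresholded against its own mean (about 65 here), the pixel with value 90 would also become 1, which shows how a different background can change the bitmap of the same object.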

Figure 4(a) is a schematic diagram of the similarity matching of the invention. In step S41, the mean threshold bitmap of the template image is overlaid on the mean threshold bitmap of the original image. In step S42, the two bitmaps are compared pixel by pixel at the template's position, over an area the size of the template image, to obtain a comparison result. In step S43, an operation is applied to the comparison; it is preferably an exclusive-OR (XOR) operation, but is not limited to this, and XOR is used as the example below. XOR outputs 1 only when its two inputs are 0 and 1, and outputs 0 otherwise. A pixel that is the same in the template bitmap and the original-image bitmap therefore yields 0, and a pixel that differs yields 1. In step S44, all the 1 values within the template area are summed to give the dissimilarity value of the template image at that position of the original image. The template bitmap is then moved to the next position, the dissimilarity is computed again, and so on. After the template image has been moved over the entire original image, step S45 compares the dissimilarity values of all positions; the position with the smallest dissimilarity (the fewest 1 values) is the matching position of the template image in the original image. Figure 4(b) illustrates the matching; refer also to Figure 3. Diagram 41 shows the template bitmap (34) being compared with the original-image bitmap (33) at a position (411) near the template object. The black and white pixels do not coincide exactly; assuming 60 pixels differ, those 60 are recorded as 1 and the dissimilarity value is 60. Diagram 42 shows the template bitmap (34) compared at another position (412) of the original-image bitmap (33). Because the pixels at position (412) are all black after the bitmap conversion, the dissimilarity is higher than in diagram 41; assuming 100 pixels differ, those 100 are recorded as 1 and the dissimilarity value is 100. Diagram 43 shows the template bitmap (34) compared at the position of the template object (413) in the original-image bitmap (33). The distributions of white and black pixels are closest here, so the dissimilarity is smallest, and this position is set as the matching position.
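Steps S41 to S45 amount to counting XOR mismatches at each placement and keeping the minimum. The sketch below assumes bitmaps stored as rows of 0/1 integers; `dissimilarity` and `best_match` are hypothetical names, and ties are broken by the earliest position, which the text does not specify.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # rows of 0/1 values


def dissimilarity(original: Bitmap, template: Bitmap, r: int, c: int) -> int:
    """Count the pixels that differ at placement (r, c): the sum of XOR results."""
    return sum(
        original[r + i][c + j] ^ template[i][j]
        for i in range(len(template))
        for j in range(len(template[0]))
    )


def best_match(original: Bitmap, template: Bitmap) -> Tuple[Tuple[int, int], int]:
    """Slide the template over every placement and keep the lowest dissimilarity."""
    th, tw = len(template), len(template[0])
    candidates = [
        (dissimilarity(original, template, r, c), (r, c))
        for r in range(len(original) - th + 1)
        for c in range(len(original[0]) - tw + 1)
    ]
    score, position = min(candidates)
    return position, score


original = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
template = [[1, 1], [1, 0]]
print(dissimilarity(original, template, 0, 0))  # 3
print(best_match(original, template))           # ((1, 2), 0)
```

In a packed representation each row could be stored as an integer so that one XOR and a population count cover many pixels at once, which is where the speed advantage over comparing grayscale values comes from.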

Figure 5 is a schematic diagram of the image pyramid of an original image; refer also to Figure 1. After matching at the smallest image level (51), a matching position (511) of the template image in the original image is obtained for that level. Because the image pyramid enlarges the image proportionally from the smallest level upward, the matching position (511) can serve as a reference in the original image of the next image level (52). At the smallest level, however, most of the image information has been lost to downscaling, so matching at that level must cover the whole original image. As the image is enlarged, the object to be matched (521) in the next image level (52) also becomes larger, so the object covered by the matching position (511) is comparatively small, and the matching position may also shift slightly. The matched range (511) must therefore be expanded. The expansion takes the matching position as its center and grows outward. In a preferred embodiment the expansion is 3×3 (9 times the size), but it is not limited to this; the expansion ensures the accuracy of the matching range.
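One way to read the 3×3 expansion is sketched below: the match is mapped to the next level by the scale factor, and the search window grows by one template size on each side, giving three template widths by three template heights. That reading, the top-left coordinate convention, the clipping to the image bounds, and the name `expand_window` are all assumptions made for illustration.

```python
from typing import Tuple


def expand_window(
    match_rc: Tuple[int, int],
    template_hw: Tuple[int, int],
    image_hw: Tuple[int, int],
    scale: int = 2,
) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of the search window in the next level.

    match_rc     top-left corner of the match found in the smaller level
    template_hw  template height and width in the next (larger) level
    image_hw     original-image height and width in the next (larger) level
    bottom and right are exclusive bounds.
    """
    top, left = match_rc[0] * scale, match_rc[1] * scale
    th, tw = template_hw
    ih, iw = image_hw
    # Grow by one template size on every side, then clip to the image.
    return (
        max(0, top - th),
        max(0, left - tw),
        min(ih, top + 2 * th),
        min(iw, left + 2 * tw),
    )


print(expand_window((5, 7), (4, 4), (40, 40)))  # (6, 10, 18, 22)
print(expand_window((0, 1), (4, 4), (40, 40)))  # (0, 0, 8, 10)
```

Matching at the next level then only needs to try placements whose template area lies inside this window instead of scanning the whole level.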

Thus, once the matching position (511) has been found at the smallest image level of the image pyramid, the corresponding matching position (521) at each larger level can be found quickly from it, with matching performed only around that position, until every level has been matched. When the level is the original image size (53), the actual match is complete.

The present invention thereby provides a fast image tracking method.

In addition, some image levels may be scaled down so far that the template image carries too little information. If a template image with insufficient information is searched only within the range mapped from the previous level's matching position, the matching position may be wrong, because the images are incomplete. When the image information at the current level is insufficient, the search may therefore ignore the range from the previous level and instead run the full search method over the whole original image. The image information can be any of various measures, preferably the number of pixels in the template image, but is not limited to this; the pixel count is used as the example below. The image information is preferably deemed sufficient when the template image has more than 512 pixels.
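The choice between the windowed search and the full search at each level can be expressed as a simple predicate on the template's pixel count. The threshold of 512 comes from the text; the function names and the square template sizes in the loop are hypothetical.

```python
MIN_TEMPLATE_PIXELS = 512  # threshold named in the text


def has_enough_information(template_height: int, template_width: int) -> bool:
    """The template is deemed informative when it has more than 512 pixels."""
    return template_height * template_width > MIN_TEMPLATE_PIXELS


def search_strategy(template_height: int, template_width: int) -> str:
    """Pick the search mode for a level from the template size at that level."""
    if has_enough_information(template_height, template_width):
        return "windowed"  # search only around the previous level's match
    return "full"          # fall back to searching the whole original image


for size in (8, 16, 32, 64):
    print(size, search_strategy(size, size))
# 8 full
# 16 full
# 32 windowed
# 64 windowed
```

A 16×32 template has exactly 512 pixels and so still falls back to the full search under the "more than 512" reading used here.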

In addition, the mean threshold bitmap step can be combined with other bitmap operations. With a mean threshold, regions of mixed light and shadow often contain light spots whose pixel values are close to the mean, which can introduce light-spot noise into the mean threshold bitmap. The invention can therefore be combined with an exclusive-OR (XOR) bitmap to filter out this noise and obtain a more accurate mean threshold bitmap. Figure 6 is a flow chart of the combination with the XOR bitmap; refer also to Figures 2 and 3. Step S61 sets a noise threshold range based on the mean of the template image: the range is the mean ± r. Step S62 records as 0 the pixels of the template image and the original image whose values fall within the range (light-spot noise) and records as 1 the pixels whose values fall outside it (not light-spot noise), which gives the XOR bitmap. The value of r depends on the difference between the pixel value of the light-spot noise and the mean; for example, if the noise value is 172 and the mean is 168, r is 4. Step S63 then ANDs the mean threshold bitmap obtained in the flow of Figure 2 with the XOR bitmap. Only pixels that are 1 in both bitmaps are output as 1 after the AND operation, so the light-spot noise (0 in the XOR bitmap) is filtered out. The result of the AND operation is a more accurate mean threshold bitmap, that is, one with the light-spot noise removed.
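Steps S61 to S63 can be sketched as a mask that zeroes pixels near the mean, followed by a bitwise AND with the mean threshold bitmap. The sample uses the mean of 168 and r of 4 from the text; treating the range as inclusive at both ends, and the names `noise_mask` and `filtered_bitmap`, are assumptions.

```python
from typing import List

Image = List[List[int]]  # grayscale image or 0/1 bitmap stored as rows


def noise_mask(image: Image, mean: float, r: float) -> Image:
    """0 for pixels within mean ± r (treated as light-spot noise), 1 elsewhere."""
    return [[0 if abs(p - mean) <= r else 1 for p in row] for row in image]


def filtered_bitmap(image: Image, mean: float, r: float) -> Image:
    """Mean threshold bitmap ANDed with the noise mask."""
    bitmap = [[1 if p > mean else 0 for p in row] for row in image]
    mask = noise_mask(image, mean, r)
    return [
        [b & m for b, m in zip(bitmap_row, mask_row)]
        for bitmap_row, mask_row in zip(bitmap, mask)
    ]


image = [[172, 250], [100, 165]]
print(noise_mask(image, mean=168, r=4))       # [[0, 1], [1, 0]]
print(filtered_bitmap(image, mean=168, r=4))  # [[0, 1], [0, 0]]
```

The pixel with value 172 would be 1 in the plain mean threshold bitmap, but it lies within 4 of the mean and is cleared by the AND, while 250 is kept.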

The present invention thereby provides a fast and accurate image tracking method.

Figure 7 is a flow chart of another embodiment of the invention. It differs from the flow of Figure 2 as follows: in Figure 2 the original image and the template image are first scaled and divided into several levels, and the mean threshold bitmap conversion is performed when each level is reached; in this embodiment the original-size original image and template image are converted to mean threshold bitmaps before scaling, and the converted images are then scaled. The detailed flow is as follows. In step S71, a template image and an original image to be matched are input to an image processing device. In step S72, the template image and the original image are converted into mean threshold bitmaps, one for each image, in the same way as in the preceding embodiment. In step S73, an image pyramid scales the mean threshold bitmaps of the template image and the original image into several images of different sizes, which form several image levels. In step S74, the image levels are arranged in order of image size: the smallest level is placed at the top of the image pyramid as the first level, and the original-size level is placed at the bottom. The levels may differ in size by an integer factor, for example a factor of 2 between adjacent levels, but are not limited to this.

Once the levels are arranged, step S75 matches the mean threshold bitmap of the template image against that of the original image by similarity, to find a matching position of the template image in the original image and its matching center pixel block; the similarity matching is the same as in the preceding embodiment. Step S76 then maps the matching center pixel block obtained in step S75 onto the original image of the current level. Because the original image of the current level is larger than that of the previous level, the mapped block is used as a center and expanded outward to a range, and the similarity matching between the mean threshold bitmaps of the template image and the original image of the current level is performed within that range. Step S77 then checks whether the image of the current level is at its original size. If it is not, the similarity matching proceeds to the next level; if it is, the matching position is output, which completes the matching of the template image in the original image.

Figure 8 shows a further embodiment of the invention. It differs from the embodiment of Figure 7 only in the order of the steps: the image pyramid scaling of step S82 and the ordering of the images by size in step S83 are performed first, step S84 then converts the images of all levels to bitmaps at the same time, and steps S86 to S88 then perform the image matching level by level.

The invention also provides a method for improving matching accuracy, shown in Figure 9. After the original-size level of the image pyramid has been matched as in the embodiments above, step S91 enlarges the original image and the template image of the original-size level by a factor of two using bilinear interpolation. Step S92 takes the enlarged original image as the last level of the image pyramid and matches it against the template image. Once the matching position is obtained, step S93 divides the coordinates of the enlarged matching position by 2, which gives a more precise matching position. The image to be enlarged may be the original-size level image after bit-value conversion or an image that has not yet been converted, in which case the bit-value conversion is performed after enlargement. After the enlarged original image is added to the image pyramid, the matching position from the original-size level can also be mapped onto the enlarged level to form a range, as in the preceding embodiments, with matching performed only within that range to reduce the matching time.

Figure 10 is a schematic diagram of the bilinear interpolation. The image enlarged by a factor of two contains the pixels of the original image (101) and, between them, first-type interpolated pixels (102) and second-type interpolated pixels (103). A first-type interpolated pixel (102) lies between two original pixels (101), and its value is the mean of those two pixel values. A second-type interpolated pixel (103) lies among four original pixels (101), and its value is the mean of those four pixel values. Because the enlarged image has more pixel values, the match between the original image and the template image is more precise. Coordinates in the enlarged image are twice those in the original-size image, so after matching on the enlarged image the coordinates of the matching position are divided by 2 to obtain coordinates in the original-size image. Enlarging by a factor of two is only a preferred embodiment; other factors may also be used.
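The enlargement and the coordinate conversion can be sketched as follows. The layout is an assumption: original pixels are placed at even coordinates, so an h×w image becomes (2h−1)×(2w−1); the text does not state the output size. The names `upscale_2x` and `to_original_coords` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def upscale_2x(image: Image) -> Image:
    """Enlarge so that original pixels sit at even coordinates.

    Pixels between two originals take the mean of the two (first type);
    pixels among four originals take the mean of the four (second type).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for r in range(2 * h - 1):
        for c in range(2 * w - 1):
            r0, c0 = r // 2, c // 2
            # On even rows/columns r1 == r0 and c1 == c0, so the average
            # below collapses to the original pixel or a two-pixel mean.
            r1, c1 = r0 + r % 2, c0 + c % 2
            out[r][c] = (image[r0][c0] + image[r0][c1]
                         + image[r1][c0] + image[r1][c1]) / 4
    return out


def to_original_coords(match_rc: Tuple[int, int]) -> Tuple[float, float]:
    """Divide enlarged-image coordinates by 2 to get original-size coordinates."""
    return (match_rc[0] / 2, match_rc[1] / 2)


print(upscale_2x([[0, 4], [8, 12]]))
# [[0.0, 2.0, 4.0], [4.0, 6.0, 8.0], [8.0, 10.0, 12.0]]
print(to_original_coords((3, 5)))  # (1.5, 2.5)
```

An odd coordinate in the enlarged image maps to a half-pixel position in the original-size image, which is the extra precision the enlarged level provides.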

In all of the above embodiments, the average threshold bitmap may be combined with the exclusive-OR bitmap to obtain a more accurate average threshold bitmap.
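The combination can be illustrated with a short sketch. It assumes that the second bitmap holds 0 at impurity points, taken here to be pixels whose value lies within a tolerance of the threshold, and 1 elsewhere, and that the two bitmaps are combined with a bitwise AND as recited in the claims. The function names, the `tolerance` parameter, and the strict `>` comparison are hypothetical choices for the example.

```python
def average_threshold_bitmap(image, threshold):
    """1 where the pixel value is above the threshold, 0 otherwise (assumed convention)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def impurity_mask(image, threshold, tolerance):
    """0 at impurity points (values within `tolerance` of the threshold), 1 elsewhere."""
    return [[0 if abs(p - threshold) <= tolerance else 1 for p in row]
            for row in image]


def refine_bitmap(bitmap, mask):
    """Bitwise AND of the two bitmaps: impurity points are forced to 0."""
    return [[b & m for b, m in zip(b_row, m_row)]
            for b_row, m_row in zip(bitmap, mask)]


image = [[10, 99, 200],
         [101, 150, 95]]
bitmap = average_threshold_bitmap(image, 100)
mask = impurity_mask(image, 100, 5)
refined = refine_bitmap(bitmap, mask)
```

In this example the pixel with value 101 is just above the threshold and would be set in the plain bitmap, but it is cleared in the refined bitmap because it falls inside the assumed impurity range.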

In all of the above embodiments, a full search may be used for matching at image pyramid levels where the image information is insufficient.
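For reference, the full search described in the background section (slide the template over every position, sum the pixel differences, keep the position with the smallest sum) might look like the sketch below. The use of absolute differences and the function name `full_search` are assumptions for the example.

```python
def full_search(source, template):
    """Exhaustive template matching on grayscale images stored as lists of rows.

    Returns ((x, y), cost) for the top-left position whose summed absolute
    pixel difference against the template is smallest.
    """
    s_h, s_w = len(source), len(source[0])
    t_h, t_w = len(template), len(template[0])
    best_pos, best_cost = None, None
    for y in range(s_h - t_h + 1):
        for x in range(s_w - t_w + 1):
            cost = sum(abs(source[y + j][x + i] - template[j][i])
                       for j in range(t_h) for i in range(t_w))
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (x, y), cost
    return best_pos, best_cost


source = [[0, 0, 0, 0],
          [0, 5, 9, 0],
          [0, 7, 3, 0],
          [0, 0, 0, 0]]
template = [[5, 9],
            [7, 3]]
position, cost = full_search(source, template)
```

Every candidate position compares all template pixels, so the cost grows with the product of the image area and the template area; this is acceptable at the small pyramid levels where the full search is suggested.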

The present invention therefore provides a fast image tracking method. It filters image information with an average threshold, uses an image pyramid to scale the images into several levels, and uses the matching position found at each level to reduce the search range at the next level; it also remains applicable when image information is insufficient. Together these measures greatly reduce image tracking time. In addition, the present invention uses the average of the pixel values of the template image as the bitmap threshold, obtains an accurate bitmap through an exclusive-OR operation, and then performs an exclusive-OR operation between the bitmap of the template image and the bitmap of the original image to find the region of the original image most similar to the template image as the match. Because the pixel values of the template image object do not differ greatly across different motions, angles, and similar conditions, the present invention is applicable when the object in the template image and the object to be matched in the original image are identical, the same object in a different motion, the same object against a different background, the same object at a different angle, or any combination of these situations. Furthermore, the present invention may add an enlarged-image matching step to obtain a more accurate matching position.
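The coarse-to-fine flow summarized above can be sketched end to end as follows. This is a simplified illustration rather than the claimed implementation: the pyramid is built by averaging 2x2 blocks, the threshold at each level is the mean of the template at that level, and the names `halve`, `track`, `levels`, and `margin` are all assumptions introduced for the example.

```python
def halve(image):
    """Downscale by two by averaging 2x2 blocks (an assumed pyramid filter)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1]
              + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]


def to_bitmap(image, threshold):
    """Average threshold bitmap: 1 above the threshold, 0 otherwise."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def mean_value(image):
    values = [p for row in image for p in row]
    return sum(values) / len(values)


def best_match(source_bits, template_bits, x_lo, x_hi, y_lo, y_hi):
    """Return the position in the (clamped) range with the fewest XOR mismatches."""
    t_h, t_w = len(template_bits), len(template_bits[0])
    x_hi = min(x_hi, len(source_bits[0]) - t_w)
    y_hi = min(y_hi, len(source_bits) - t_h)
    best_pos, best_cost = None, None
    for y in range(max(0, y_lo), y_hi + 1):
        for x in range(max(0, x_lo), x_hi + 1):
            cost = sum(source_bits[y + j][x + i] ^ template_bits[j][i]
                       for j in range(t_h) for i in range(t_w))
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (x, y), cost
    return best_pos


def track(source, template, levels=2, margin=1):
    """Match at the smallest level, then refine within a small range per level."""
    sources, templates = [source], [template]
    for _ in range(levels - 1):
        sources.append(halve(sources[-1]))
        templates.append(halve(templates[-1]))
    pos = None
    for src, tpl in zip(reversed(sources), reversed(templates)):
        threshold = mean_value(tpl)  # template mean is used for both images
        src_bits, tpl_bits = to_bitmap(src, threshold), to_bitmap(tpl, threshold)
        if pos is None:  # smallest level: search the whole image
            pos = best_match(src_bits, tpl_bits, 0, len(src[0]), 0, len(src))
        else:            # larger level: search only around the mapped position
            cx, cy = 2 * pos[0], 2 * pos[1]
            pos = best_match(src_bits, tpl_bits,
                             cx - margin, cx + margin, cy - margin, cy + margin)
    return pos


template = [[200, 200, 0, 0],
            [200, 200, 0, 0],
            [0, 0, 200, 200],
            [0, 0, 200, 200]]
source = [[0] * 8 for _ in range(8)]
for j in range(4):
    for i in range(4):
        source[2 + j][4 + i] = template[j][i]
result = track(source, template, levels=2, margin=1)
```

Only the smallest level is searched exhaustively; every larger level tests at most (2 * margin + 1) squared positions, which is the source of the time saving described above.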

The above embodiments are merely examples given for convenience of description. The scope of rights claimed by the present invention shall be as defined in the claims and is not limited to the above embodiments.

Claims

A real-time image tracking method, performed by an image processing device, the method comprising the steps of: (A) obtaining template image information and original image information; (B) using an image pyramid to scale a template image and an original image into a plurality of image levels of different sizes, the image levels being arranged in order of image size; (C) performing average threshold bitmap conversion on the template image and the original image in the smallest image level to obtain average threshold bitmaps of the template image and the original image; (D) performing similarity matching between the average threshold bitmaps of the template image and the original image, thereby finding a matching position of the template image in the original image; (E) mapping the matching position obtained in step (D) to the original image of the next image level, expanding a range from the matching position, performing average threshold bitmap conversion on the template image and the original image of that image level, and then performing similarity matching between the average threshold bitmaps of the template image and the original image only within the range; and (F) repeating step (E) until the image level is the original-size level of the images; whereby the matching position of the last level is output as the matching result.

The real-time image tracking method of claim 1, further comprising, after step (F), a step (G) of enlarging the original image and the template image of the original-size level by an integer factor through bilinear interpolation to serve as the last level of the image pyramid, matching the enlarged original image with the template image, and then dividing the coordinates of the matching position by the enlargement factor, thereby obtaining a more accurate matching position.

The real-time image tracking method of claim 2, wherein the average value used for the average threshold bitmap is the average of the pixel values of the template image.

The real-time image tracking method of claim 2, wherein the similarity matching of step (D) performs an exclusive-OR (XOR) operation between the pixels of the average threshold bitmaps of the template image and the original image.

The real-time image tracking method of claim 2, wherein the range of step (E) is expanded outward with the matching position as its center.

The real-time image tracking method of claim 2, wherein, when the image information of a level is insufficient, matching is performed by a full search.

The real-time image tracking method of claim 2, wherein obtaining the average threshold bitmaps in step (C) or (E) further comprises performing an AND operation between the average threshold bitmaps of the template image and the original image and the exclusive-OR bitmaps of the template image and the original image, respectively, to obtain accurate average threshold bitmaps of the template image and the original image, the accurate average threshold bitmaps replacing the original average threshold bitmaps in the subsequent steps.

The real-time image tracking method of claim 7, wherein the AND operation relies on impurity points in the interlaced regions of the exclusive-OR bitmap being set to 0, so that the impurities are eliminated once the operation is performed, and wherein a pixel is determined to be an impurity point if its value falls within an impurity point range.

The real-time image tracking method of claim 2, wherein the image sizes of the levels of the image pyramid differ by integer multiples.

The real-time image tracking method of claim 1, wherein the object in the template image and the object to be matched in the original image may be identical, the same object in a different motion, the same object against a different background, the same object at a different angle, or a combination of these situations.

A real-time image tracking method, performed by an image processing device, the method comprising the steps of: (A) obtaining template image information and original image information; (B) using an image pyramid to scale a template image and an original image into a plurality of image levels of different sizes, the image levels being arranged in order of image size; (C) performing average threshold bitmap conversion on the template image and the original image in all image levels to obtain average threshold bitmaps of the template image and the original image; (D) performing similarity matching between the average threshold bitmaps of the template image and the original image in the smallest image level, thereby finding a matching position of the template image in the original image; (E) mapping the matching position obtained in step (D) to the original image of the next image level, expanding a range from the matching position, and then performing similarity matching between the average threshold bitmaps of the template image and the original image only within the range; and (F) repeating step (E) until the image level is the original-size level of the images; whereby the matching position of the last level is output as the matching result.

The real-time image tracking method of claim 11, further comprising, after step (F), a step (G) of enlarging the original image and the template image of the original-size level by an integer factor through bilinear interpolation to serve as the last level of the image pyramid, matching the enlarged original image with the template image, and then dividing the coordinates of the matching position by the enlargement factor, thereby obtaining a more accurate matching position.

The real-time image tracking method of claim 12, wherein the average value used for the average threshold bitmap is the average of the pixel values of the template image.

The real-time image tracking method of claim 12, wherein the similarity matching of step (D) performs an exclusive-OR (XOR) operation between the pixels of the average threshold bitmaps of the template image and the original image.

The real-time image tracking method of claim 12, wherein the range of step (E) is expanded outward with the matching position as its center.

The real-time image tracking method of claim 12, wherein, when the image information of a level is insufficient, matching is performed by a full search.

The real-time image tracking method of claim 12, wherein obtaining the average threshold bitmaps in step (C) or (E) further comprises performing an AND operation between the average threshold bitmaps of the template image and the original image and the exclusive-OR bitmaps of the template image and the original image, respectively, to obtain accurate average threshold bitmaps of the template image and the original image, the accurate average threshold bitmaps replacing the original average threshold bitmaps in the subsequent steps.

The real-time image tracking method of claim 17, wherein the AND operation relies on impurity points in the interlaced regions of the exclusive-OR bitmap being set to 0, so that the impurities are eliminated once the operation is performed, and wherein a pixel is determined to be an impurity point if its value falls within an impurity point range.

The real-time image tracking method of claim 12, wherein the image sizes of the levels of the image pyramid differ by integer multiples.

The real-time image tracking method of claim 12, wherein the object in the template image and the object to be matched in the original image may be identical, the same object in a different motion, the same object against a different background, the same object at a different angle, or a combination of these situations.

A real-time image tracking method, performed by an image processing device, the method comprising the steps of: (A) obtaining template image information and original image information; (B) performing average threshold bitmap conversion on a template image and an original image; (C) using an image pyramid to scale the average threshold bitmaps of the template image and the original image into a plurality of image levels of different sizes, the image levels being arranged in order of image size; (D) performing similarity matching between the average threshold bitmaps of the template image and the original image in the smallest image level, thereby finding a matching position of the template image in the original image; (E) mapping the matching position obtained in step (D) to the average threshold bitmap of the original image of the next image level, expanding a range from the matching position, and then performing similarity matching between the average threshold bitmaps of the template image and the original image only within the range; and (G) repeating step (E) until the image level is the original-size level of the images; whereby the matching position of the last level is output as the matching result.

The real-time image tracking method of claim 21, further comprising, after step (G), a step (H) of enlarging the original image and the template image of the original-size level by an integer factor through bilinear interpolation to serve as the last level of the image pyramid, matching the enlarged original image with the template image, and then dividing the coordinates of the matching position by the enlargement factor, thereby obtaining a more accurate matching position.

The real-time image tracking method of claim 22, wherein the average value used for the average threshold bitmap is the average of the pixel values of the template image.

The real-time image tracking method of claim 22, wherein the similarity matching of step (D) performs an exclusive-OR (XOR) operation between the pixels of the average threshold bitmaps of the template image and the original image.

The real-time image tracking method of claim 22, wherein the range of step (E) is expanded outward with the matching position as its center.

The real-time image tracking method of claim 22, wherein, when the image information of a level is insufficient, matching is performed by a full search.

The real-time image tracking method of claim 22, wherein obtaining the average threshold bitmaps in step (C) or (E) further comprises performing an AND operation between the average threshold bitmaps of the template image and the original image and the exclusive-OR bitmaps of the template image and the original image, respectively, to obtain accurate average threshold bitmaps of the template image and the original image, the accurate average threshold bitmaps replacing the original average threshold bitmaps in the subsequent steps.

The real-time image tracking method of claim 27, wherein the AND operation relies on impurity points in the interlaced regions of the exclusive-OR bitmap being set to 0, so that the impurities are eliminated once the operation is performed, and wherein a pixel is determined to be an impurity point if its value falls within an impurity point range.

The real-time image tracking method of claim 22, wherein the image sizes of the levels of the image pyramid differ by integer multiples.

The real-time image tracking method of claim 22, wherein the object in the template image and the object to be matched in the original image may be identical, the same object in a different motion, the same object against a different background, the same object at a different angle, or a combination of these situations.