TWI480809B - Image feature extraction method and device - Google Patents

Image feature extraction method and device

Info

Publication number
TWI480809B
Authority
TW
Taiwan
Prior art keywords
image
sub
difference
sum
original image
Application number
TW098129245A
Other languages
Chinese (zh)
Other versions
TW201108126A (en)
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to TW098129245A priority Critical patent/TWI480809B/en
Publication of TW201108126A publication Critical patent/TW201108126A/en
Application granted granted Critical
Publication of TWI480809B publication Critical patent/TWI480809B/en


Landscapes

  • Image Analysis (AREA)

Description

Image feature extraction method and device

The present application relates to the field of image processing technology, and in particular to an image feature extraction method and apparatus.

Image features cover three aspects: color, texture, and shape. This application is mainly concerned with the shape features of an image. A shape-based image feature extraction algorithm mathematically extracts and expresses those features of an image that reflect the shape of the objects it contains.

Image feature extraction is widely used. In the field of image search, for example, the search engine providing the image search must compare the pictures in a picture database with a received query picture in order to find the pictures in the database that are closest to the query picture. What is actually compared in such a search are the image features of the query picture and those of the pictures in the database, so extracting image features from the pictures in advance is an indispensable step.

For shape-based image feature extraction, the most common prior-art methods are based on the Hough transform. The Hough transform maps points on the image plane to lines on a parameter plane and finally extracts image features through their statistics. Its principle can be described as follows. A straight line can be represented by the equation y = k*x + b, where the parameters k and b are the slope and the intercept, respectively. The parameters of all straight lines passing through a given point (x0, y0) satisfy the equation y0 = k*x0 + b. For each point in the target image plane (x, y) whose brightness satisfies a preset condition, the corresponding straight line b = y - k*x is drawn on the (k, b) plane and every point on that line is assigned the value 1; where n such lines intersect at one point, that point is assigned the number of lines passing through it. Thus, for a straight line on the image plane, the above process yields a family of lines on the (k, b) parameter plane, and the point where these lines intersect has the highest value; that point represents a straight line on the target plane. By finding the highest-valued points on the parameter plane, the straight lines present on the target plane can therefore be detected.

The same holds for multiple straight lines, and so on; the computation for circles and arcs is similar.

To explain the above prior-art principle more clearly, FIG. 1 is used as the target image. For simplicity, FIG. 1 is a picture of 10*10 pixels containing one straight line; taking the lower-left corner of the picture as the origin of coordinates, the line can be expressed as y = 5. Assume the background brightness of the picture is low while the points on this line are bright. The method of detecting this line with the Hough transform is as follows:

S1: examine each point in FIG. 1 according to its coordinates;

S2: when a point on the target image (with coordinates (x0, y0)) is detected whose brightness exceeds a predetermined threshold, mark the line b = y0 - k*x0 on the parameter plane (shown in FIG. 2) and assign each point on the marked line the value 1 (call this the α value);

S3: for each intersection point of the lines marked on the parameter plane, set the α value of that point to the number of lines passing through it. In fact, the α value of an intersection point can equally be expressed as the sum of the α values, at that point, of all the lines passing through it; this amounts to the same thing as setting the α value to the number of lines passing through the point.

After the above processing, for the line y = 5 in the target image of FIG. 1, one point can be identified on the parameter plane of FIG. 2, namely the point k = 0, b = 5, whose α value is the highest. The point (0, 5) on the parameter plane can therefore represent the line y = 5 in the target image. Moreover, the coordinates 0 and 5 of this point are exactly the slope and intercept of y = 5 on the target plane, showing that the point identified on the parameter plane detects the line y = 5 present on the target plane.
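For illustration only (this sketch is not part of the patent), the Hough detection of the line y = 5 described above can be reproduced with a discretised accumulator over integer (k, b) pairs; the integer slope range used here is an arbitrary assumption:

```python
from collections import Counter

# 10*10 binary image with a bright horizontal line y = 5
# (origin at the lower-left corner, as in the text).
bright_points = [(x, 5) for x in range(10)]

# Discretised parameter plane: for every bright point (x0, y0),
# every line through it satisfies b = y0 - k*x0.
accumulator = Counter()
for x0, y0 in bright_points:
    for k in range(-3, 4):          # candidate integer slopes (assumed range)
        accumulator[(k, y0 - k * x0)] += 1

(k_best, b_best), votes = accumulator.most_common(1)[0]
print(k_best, b_best, votes)        # peak at (0, 5) with 10 votes
```

The accumulator cell (k, b) = (0, 5) collects one vote per bright point, while every other cell collects at most one, so the peak recovers the line y = 0*x + 5.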

The above example illustrates how the existing Hough transform method detects one straight line in the target plane. When multiple straight lines exist in the target plane, the same method yields multiple highly valued points on the parameter plane, and these highly valued points represent the multiple lines on the target plane.

For detecting circles, arcs, and other shapes in the target image, the principle is similar to the above process.

In the course of researching and practicing the prior art, the inventors found the following problem: feature extraction with the Hough transform method described above inevitably involves floating-point operations; for example, the slope of a line generally requires floating-point arithmetic, and the feature extraction of more complex circles and arcs involves even more floating-point operations. As those skilled in the art know, floating-point operations place high demands on the computing power of the CPU and other hardware, so under the same hardware configuration the existing Hough method, which involves floating-point operations, computes comparatively slowly.

The purpose of the embodiments of the present application is to provide an image feature extraction method and apparatus that increase the speed of image feature extraction.

To solve the above technical problem, the embodiments of the present application provide an image feature extraction method and apparatus implemented as follows. An image feature extraction method includes: cutting the image of the contained object out of the original image; filling the boundary of the cut-out image with a single background color so that the filled image becomes the smallest enclosing square; proportionally scaling the whole square image to an image of a first predetermined size and dividing the scaled image into non-overlapping sub-image blocks of a second predetermined size; computing the brightness derivatives of adjacent pixels in the horizontal, vertical, +45°, and -45° directions, and taking the number of derivative extreme points in each of the four directions, together with the total number of extreme points on the four boundaries of a sub-image block, as the feature vector of that sub-image block; and taking the feature vectors of all sub-image blocks as the feature vector of the original image.

An image feature extraction apparatus includes: a cut-out unit for cutting the image of the contained object out of the original image; a filling unit for filling the boundary of the cut-out image with a single background color so that the filled image becomes the smallest enclosing square; a normalization processing unit comprising a scaling unit for proportionally scaling the whole square image to an image of a first predetermined size and a dividing unit for dividing the scaled image into non-overlapping sub-image blocks of a second predetermined size; a brightness-derivative calculation and statistics unit for computing the brightness derivatives of adjacent pixels in the horizontal, vertical, +45°, and -45° directions and taking the number of derivative extreme points in each of the four directions, together with the total number of extreme points on the four boundaries of a sub-image block, as the feature vector of that sub-image block; and a synthesis unit for taking the feature vectors of all sub-image blocks as the feature vector of the original image.

As can be seen from the technical solutions provided by the above embodiments, the image of the contained object is cut out of the original image; the boundary of the cut-out image is filled with a single background color so that the filled image becomes the smallest enclosing square; the whole square image is proportionally scaled to an image of a first predetermined size; the scaled image is divided into non-overlapping sub-image blocks of a second predetermined size; the brightness derivatives of adjacent pixels in the horizontal, vertical, +45°, and -45° directions are computed; the number of derivative extreme points in each of the four directions, together with the total number of extreme points on the four boundaries of a sub-image block, is taken as the feature vector of that sub-image block; and the feature vectors of all sub-image blocks are taken as the feature vector of the original image. Through this processing the shape features of the image can be extracted, and because the processing involves only integer operations and no floating-point operations, the processing speed under the same hardware configuration can be greatly improved compared with the prior art.

The embodiments of the present application provide an image feature extraction method and apparatus.

In order to enable those skilled in the art to better understand the technical solutions in the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the scope of protection of the present invention.

As shown in FIG. 3, the flow of an embodiment of the image feature extraction method of the present application is as follows:

S310: cut the image of the contained object out of the original image.

There are several methods for cutting out the image of the object contained in the original image; some are introduced below.

An original image usually contains an object and a background; the background generally occupies the periphery of the original image while the object generally occupies its middle. Moreover, in the original image there is a large gray-level difference between the pixels at the edge of the object and those of the background. This property can be used to cut out the image of the object in the original image, that is, the image of the contained object can be cut out of the original image according to the gray-level difference between the object edge and the background.

A specific example of this step is given below. First consider finding the left and right boundaries of the region in which the object is located, which involves the following steps:

A1: compute the sum of the gray values of all pixels in each column of the original image.

For example, in a 10*10-pixel image each pixel has a gray value. In this step the sum of the gray values of all pixels in a column is computed and taken as the gray value of that column.

To facilitate processing by computer software and hardware, the per-column sums of gray values are usually stored in an array.

A2: compute, from left to right, the difference between the gray values of adjacent columns of the original image, and when the difference exceeds a threshold record the abscissa xa of the right-hand column.

For example, the gray values of the columns stored in the array in A1 are scanned from left to right, and the differences between adjacent values in the array are computed in turn. When a difference exceeding the threshold is found, for instance a difference of 50 between the second and third columns against a preset threshold of 30, the abscissa x3 of the third column, corresponding to the index of the third element of the array, is recorded.

In this way the left boundary indicating the position of the object in the original image is found.

It should be pointed out that when a difference exceeding the threshold is found, the abscissa of the left-hand column may be recorded instead. For instance, when the difference of 50 between the second and third columns exceeds the preset threshold of 30, the abscissa x2 of the second column, corresponding to the index of the second element of the array, may be recorded. This differs from the foregoing by only one column of pixels and does not affect the overall effect of the method.

A3: compute, from right to left, the difference between the sums of the gray values of adjacent columns of the original image, and when the difference exceeds the threshold record the abscissa xb of the left-hand column.

This step is similar to A2: the array is scanned and the differences computed, and the index xb is recorded. In this way the right boundary indicating the position of the object in the original image is found.

The thresholds in A2 and A3 can be set according to empirical values. In general, when the difference exceeds a certain value the boundary between background and object becomes clearly distinguishable, and that value can be set as the threshold.

Next consider finding the upper and lower boundaries of the region in which the object is located; the principle is similar to A1 through A3 and involves the following steps:

B1: compute the sum of the gray values of all pixels in each row of the original image.

B2: compute, from top to bottom, the difference between the gray values of adjacent rows of the original image, and when the difference exceeds the threshold record the ordinate ya of the lower row.

B3: compute, from bottom to top, the difference between the sums of the gray values of adjacent rows of the original image, and when the difference exceeds the threshold record the ordinate yb of the upper row.

In this way the upper and lower boundaries indicating the position of the object in the original image are found, namely ya and yb respectively.

The image within the range (xa, xb, ya, yb) is then the object image cut out of the original image.
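Steps A1 through B3 above can be sketched as follows. This is a hypothetical illustration rather than the patented implementation: the helper name object_bbox, the use of a single threshold for both axes, and the toy image are all assumptions.

```python
import numpy as np

def object_bbox(gray, thresh):
    """Locate the object's bounding box from column/row gray-value sums.
    gray: 2-D array of gray values; thresh: empirical difference threshold."""
    col = gray.sum(axis=0).astype(int)   # A1: per-column sums
    row = gray.sum(axis=1).astype(int)   # B1: per-row sums

    def first_jump(sums):
        # scan forward; return the index just past the first large jump
        for i in range(len(sums) - 1):
            if abs(sums[i + 1] - sums[i]) > thresh:
                return i + 1
        return 0

    xa = first_jump(col)                       # A2: left boundary
    xb = len(col) - 1 - first_jump(col[::-1])  # A3: right boundary
    ya = first_jump(row)                       # B2: upper boundary
    yb = len(row) - 1 - first_jump(row[::-1])  # B3: lower boundary
    return xa, xb, ya, yb

# toy example: a bright 3x3 object on a dark background
img = np.zeros((8, 8), dtype=int)
img[2:5, 3:6] = 200
print(object_bbox(img, thresh=100))   # (3, 5, 2, 4)
```

The four indices bound exactly the bright 3*3 region, mirroring how (xa, xb, ya, yb) delimits the cut-out object image.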

The above approach cuts out the rectangular image in which the object is located, which is the simplest way. Of course, other slightly more complicated approaches exist. For example, on the basis of the above approach, gray-value differences can additionally be computed in the two diagonal directions to find the object's boundaries along the two diagonals, thereby obtaining the octagonal image in which the object is located. With further directions added, 16-sided or 32-sided images of the object can likewise be obtained; the details are not repeated here.

Alternatively, the original image may be divided into several horizontal sub-regions and the left and right boundaries of the object found in each sub-region in the manner described above; correspondingly, the original image may be divided into several vertical sub-regions and the upper and lower boundaries of the object found in each. In this way the polygonal region in which the object is located can be obtained. Of course, approaches of this kind are more complicated.

S320: fill the boundary of the cut-out image with a single background color so that the cut-out image becomes the smallest enclosing square.

The single color may, for example, be the color with RGB (0, 0, 0), although other RGB colors are possible. Generally, filling with the RGB (0, 0, 0) color is fast and introduces no interference, which facilitates the subsequent computation of brightness derivatives.

Filling the boundary with a single color so that the cut-out image becomes the smallest square makes it convenient, in subsequent steps, to divide the image containing the object into sub-image blocks of a predetermined size. FIG. 4 shows a schematic diagram of padding the cut-out image into the smallest square.
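A minimal sketch of the padding of S320, assuming the cut-out is centred in the square (the text does not specify the placement) and that a grayscale value of 0 stands in for the RGB (0, 0, 0) background:

```python
import numpy as np

def pad_to_square(img, fill=0):
    """Pad the cut-out image with a single background colour (here gray 0)
    so that it becomes the smallest enclosing square."""
    h, w = img.shape[:2]
    side = max(h, w)
    top = (side - h) // 2      # centring is an assumption of this sketch
    left = (side - w) // 2
    out = np.full((side, side) + img.shape[2:], fill, dtype=img.dtype)
    out[top:top + h, left:left + w] = img
    return out

patch = np.ones((4, 7), dtype=np.uint8) * 200   # a 4x7 cut-out
square = pad_to_square(patch)
print(square.shape)   # (7, 7)
```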

S330: proportionally scale the whole square image to an image of a first predetermined size, and divide the scaled image into non-overlapping sub-image blocks of a second predetermined size.

The whole square image is proportionally scaled to an image of a first predetermined size; for example, it may be scaled to an image of 64*64 pixels, or equally to one of 128*128 pixels. Proportional scaling means scaling the length and width by the same ratio.

The scaled image is divided into non-overlapping sub-image blocks of a second predetermined size, which may be 16*16-pixel images, 8*8-pixel images, or 32*32-pixel images.

This step normalizes the square image containing the object so as to standardize and simplify the subsequent processing. The first and second predetermined sizes may be set in advance; as long as they are within a reasonable range, there is no qualitative difference between settings.

The following description takes a first predetermined size of 64*64 pixels and a second predetermined size of 16*16 pixels as an example. With a second predetermined size of 16*16 pixels, the image of the first predetermined size of 64*64 pixels containing the object is divided into 4*4 sub-image blocks.
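The division of S330 can be sketched as follows; the actual proportional scaling (e.g. with an image library's resize routine) is omitted, and a pre-scaled 64*64 array stands in for its result:

```python
import numpy as np

def split_blocks(img, block=16):
    """Cut a square image into non-overlapping block x block sub-images,
    listed in row-major scan order."""
    n = img.shape[0] // block
    return [img[r * block:(r + 1) * block, c * block:(c + 1) * block]
            for r in range(n) for c in range(n)]

scaled = np.zeros((64, 64), dtype=np.uint8)  # stand-in for the resized image
tiles = split_blocks(scaled, 16)
print(len(tiles), tiles[0].shape)            # 16 tiles of 16x16 pixels
```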

It should also be pointed out that the scaled image may instead be divided into overlapping sub-image blocks of the second predetermined size. This may make the computation slightly more cumbersome and may increase the dimensionality of the final output feature vector, but it remains a feasible option.

S340: compute the brightness derivatives of adjacent pixels in the horizontal, vertical, +45°, and -45° directions, and take the number of derivative extreme points in each of the four directions, together with the total number of extreme points on the four boundaries of the sub-image block, as the feature vector of that sub-image block.

First, the feature vector introduced here to describe an image block is explained. It can be regarded as a five-element vector M(a, b, c, d, e). When the processing of S340 is performed, this vector is initialized in the computer system executing the processing, so that after initialization the five-element vector of a sub-image block is M(a, b, c, d, e) = M(0, 0, 0, 0, 0).

Next, the brightness derivative is introduced. It is defined as: brightness derivative = brightness difference / pixel spacing. The brightness value can generally be obtained with an algorithm based on the human-eye sensitivity curve, which is well known in the art. One scheme is: brightness L = 116/3*(0.212649*R/255 + 0.715169*G/255 + 0.072182*B/255), where R, G, and B are the color values. A brightness value of 1 generally denotes full brightness and 0 denotes darkness, but in processing the floating-point values from 0 to 1 are usually mapped onto the integer range 1 to 255. The brightness derivative thus expresses the variation of brightness between pixels. For extracting image shape features, the edge or contour of an object is generally found by exploiting the property that the brightness of the object's edge or contour differs markedly from that of the rest of the image content, so that the shape of the object in the image can be described mathematically.
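The brightness formula quoted above can be written out directly; the constants are taken verbatim from the text, and the mapping of the result onto an integer range is left out of this sketch:

```python
def luminance(r, g, b):
    """Brightness from the human-eye sensitivity formula quoted in the text:
    L = 116/3 * (0.212649*R/255 + 0.715169*G/255 + 0.072182*B/255)."""
    return 116.0 / 3.0 * (0.212649 * r / 255
                          + 0.715169 * g / 255
                          + 0.072182 * b / 255)

# the three weights sum to exactly 1, so white maps to 116/3 and black to 0
white = luminance(255, 255, 255)
black = luminance(0, 0, 0)
print(white, black)
```

In practice the float result would then be scaled onto an integer range (the text mentions 1 to 255) so that the later derivative statistics stay in integer arithmetic.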

This property can be described by the extreme values of the brightness derivative. Specifically, the brightness derivative of each pair of adjacent pixels can be computed one by one along a given direction. When, during this computation, the brightness derivative changes sign before and after a certain pixel position, that position is an extreme point of the brightness derivative. In physical terms, the places where extreme points are found are very likely the edges between the object and the other parts of the image, or features in the image indicating that some part of the object differs in shape from the other parts. These features can therefore express the shape characteristics of the object itself and can be used as quantities representing the object's shape features.

For computing the brightness derivatives of adjacent pixels in the horizontal, vertical, +45°, and -45° directions, a specific computation is introduced below:

Compute the brightness derivative of the sub-image block in the horizontal direction. If a brightness-derivative extreme point exists and falls inside the sub-image block, add 1 to b; if multiple extreme points exist, b becomes the number of extreme points. If an extreme point lies on the boundary of the sub-image block, add 1 to a. Note that what is computed here may be the brightness derivative between pixels of adjacent columns.

Compute the brightness derivative of the sub-image block in the vertical direction. If a brightness-derivative extreme point exists and falls inside the sub-image block, add 1 to c; if multiple extreme points exist, c becomes the number of extreme points. If an extreme point lies on the boundary of the sub-image block, add 1 to a. Note that what is computed here may be the brightness derivative between pixels of adjacent rows.

Compute the brightness derivative of the sub-image block in the +45° direction. If a brightness-derivative extreme point exists and falls inside the sub-image block, add 1 to d; if multiple extreme points exist, d becomes the number of extreme points. If an extreme point lies on the boundary of the sub-image block, add 1 to a. Note that what is computed here may be the brightness derivative between adjacent pixels along the +45° direction.

Compute the brightness derivative of the sub-image block in the -45° direction. If a brightness-derivative extreme point exists and falls inside the sub-image block, add 1 to e; if multiple extreme points exist, e becomes the number of extreme points. If an extreme point lies on the boundary of the sub-image block, add 1 to a. Note that what is computed here may be the brightness derivative between adjacent pixels along the -45° direction.

It can be seen that after the above processing, in the five-element vector corresponding to the sub-image block, a describes the cases where the object boundary in the image block lies on the block's four boundaries. Apart from the cases counted in a, that is, apart from extreme points falling on the block boundary, b is the number of shape-characteristic edges of the object inside the block in the horizontal direction, c the number in the vertical direction, d the number in the +45° direction, and e the number in the -45° direction.
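One possible reading of S340 is sketched below. The patent does not fully specify the bookkeeping (e.g. to which pixel an extremum is attributed, or how extrema at the very ends of a scan line are handled), so the block_vector helper and its conventions are assumptions for illustration only:

```python
import numpy as np

def block_vector(lum):
    """Five-element vector M(a, b, c, d, e) for one square sub-image block
    of luminance values: b/c/d/e count derivative sign-change extrema found
    inside the block along the horizontal / vertical / +45 / -45 scan
    directions, and a counts extrema falling on the block's boundary pixels."""
    n = lum.shape[0]
    on_border = lambda r, c: r in (0, n - 1) or c in (0, n - 1)
    m = [0, 0, 0, 0, 0]                      # a, b, c, d, e

    def scan(pixels, slot):
        # pixels: the (row, col) positions along one line in the direction
        vals = [int(lum[r, c]) for r, c in pixels]
        d = [vals[i + 1] - vals[i] for i in range(len(vals) - 1)]
        for i in range(1, len(d)):
            if d[i - 1] * d[i] < 0:          # sign change -> extreme point
                r, c = pixels[i]
                if on_border(r, c):
                    m[0] += 1                # extremum on the block boundary
                else:
                    m[slot] += 1             # extremum inside the block
        
    for r in range(n):                                   # horizontal -> b
        scan([(r, c) for c in range(n)], 1)
    for c in range(n):                                   # vertical -> c
        scan([(r, c) for r in range(n)], 2)
    for s in range(2 * n - 1):                           # +45 diagonals -> d
        scan([(r, s - r) for r in range(n) if 0 <= s - r < n], 3)
    for s in range(-(n - 1), n):                         # -45 diagonals -> e
        scan([(r, r - s) for r in range(n) if 0 <= r - s < n], 4)
    return tuple(m)

# a block with one bright vertical stripe: extrema appear on the
# horizontal and both diagonal scans, but not on the vertical ones
blk = np.zeros((8, 8), dtype=np.uint8)
blk[:, 4] = 255
print(block_vector(blk))   # (2, 6, 0, 6, 6)
```

The vertical stripe produces no vertical-direction extrema (c = 0), six interior extrema in each of the other three directions, and two extrema on the block boundary (a = 2), matching the intuition described above.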

The statistics obtained above can be used to represent well the features of the object's shape in the image.

Of course, the above way of collecting statistics is not the only one; having seen it, those skilled in the art should readily conceive of other feasible ways, for example flexible variations of the statistical treatment of a, b, c, d, and e.

S350: Use the feature vectors of all sub-image blocks as the feature vector of the original image.

In S340 above, a five-element vector is obtained for each sub-image block; this five-element vector expresses the shape features of that sub-image block.

Because S330 divides the image into several mutually non-overlapping sub-image blocks, S350 must use the five-element vectors of all of the divided sub-image blocks to represent the shape features of the whole image.

For example, dividing the 64×64-pixel image produced by S320 into non-overlapping 16×16-pixel blocks yields 4×4 = 16 blocks. With one five-element vector representing the shape features of each sub-image block, the 16 sub-image blocks, arranged in order, give 16 five-element vectors, i.e. an 80-element vector. This 80-element vector can be used to characterize the shape features of the image.
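A minimal sketch of this split-and-concatenate step (S330 plus S350), with `five_vector` as a stand-in for the per-block statistics of S340; the function names here are illustrative, not from the patent:

```python
def split_blocks(img, size):
    """S330: split a 2D pixel array into non-overlapping size x size
    blocks, row-major."""
    h, w = len(img), len(img[0])
    return [[row[x:x + size] for row in img[y:y + size]]
            for y in range(0, h, size)
            for x in range(0, w, size)]

def five_vector(block):
    """Stand-in for S340's (a, b, c, d, e) statistics."""
    return [0, 0, 0, 0, 0]

def image_feature(img, block_size=16):
    """S350: concatenate the per-block vectors into the image feature
    vector; a 64x64 image with 16x16 blocks gives 16 * 5 = 80 elements."""
    feature = []
    for block in split_blocks(img, block_size):
        feature.extend(five_vector(block))
    return feature
```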

In addition, the following normalization step may also be included:

S311: Compare the length and width of the image; if the length is greater than the width, rotate the image 90 degrees clockwise.

The rotation ensures that all images share one basic orientation. A pen, for instance, may appear vertically in one picture and horizontally in another; to compare the shape of the pen across pictures uniformly, it is best to place all pictures in the same orientation.

Alternatively, the image may be rotated 90 degrees counterclockwise.

It should be noted that the object processed in S311 may be the image cut out from the original image in step S310.
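A minimal sketch of S311, assuming "length" means the image height and representing the image as a row-major list of pixel rows (both assumptions ours):

```python
def normalize_orientation(img):
    """Rotate the image 90 degrees clockwise when it is taller than it
    is wide, so that all images end up in the same basic orientation."""
    h, w = len(img), len(img[0])
    if h > w:
        # clockwise 90 deg: new row i is old column i read bottom-to-top
        return [list(row) for row in zip(*img[::-1])]
    return img
```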

In addition, the following normalization step may also be included:

S312: Compare the grey-level sums of the upper half and the lower half of the cut-out image or of the minimum square image; if the sum of the upper half is greater than the sum of the lower half, invert the cut-out image or the minimum square image.

Similar to the rotation above, this step gives the objects in the images a uniform orientation. Suppose one image shows an apple upside down; in most pictures of apples the apple is upright, so it is best to invert the upside-down picture for comparison with the others. Clearly, for an object that is wide at the top and narrow at the bottom, the grey-level sum of the upper half of the image exceeds that of the lower half; conversely, for an object narrow at the top and wide at the bottom, the grey-level sum of the lower half exceeds that of the upper half. The above processing involves only integer arithmetic and no floating-point operations, so under the same hardware configuration the processing speed can be greatly improved over the prior art.
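The up/down normalization of S312 can be sketched in the same spirit, using integer sums only; `normalize_updown` is an illustrative name, and for an odd number of rows the middle row is left out of both halves (our choice):

```python
def normalize_updown(img):
    """S312: flip the image vertically when the grey-level sum of the
    top half exceeds that of the bottom half."""
    h = len(img)
    top = sum(sum(row) for row in img[:h // 2])
    bottom = sum(sum(row) for row in img[h - h // 2:])
    return img[::-1] if top > bottom else img
```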

From the perspective of a real search engine, updating one million product images per day means about 12 pictures must be processed per second, each within 100 ms. That is only the average case; allowing for a 4-hour daily peak of product updates, plus the overhead of disk and network access, the design requires 50 pictures per second, i.e. the features of each picture must be computed within 20 ms. Using the prior-art Hough transform to detect straight lines in a 200×200-pixel image takes about 20 ms on a standard quad-core server, so the Hough line transform alone already exhausts the budget; detecting circles takes even longer.

The block-wise processing of the embodiments of the present application, by contrast, involves no floating-point operations and is therefore faster; moreover, processing the image block by block can take full advantage of today's multi-core processors. For the same 200×200 picture, the whole process using the method of these embodiments can be completed within 10 ms.
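The arithmetic behind the figures quoted above, worked out explicitly:

```python
# 1,000,000 product images per day averages to roughly 12 per second,
# i.e. about 83 ms per image (the text budgets 100 ms). The 50
# images-per-second peak figure is a stated design requirement, not
# derived here; it leaves a 20 ms budget per image.
images_per_day = 1_000_000
avg_per_second = images_per_day / (24 * 60 * 60)  # ~11.6
avg_budget_ms = 1000 / round(avg_per_second)      # ~83 ms
peak_budget_ms = 1000 / 50                        # 20 ms
```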

An embodiment of an image feature extraction apparatus of the present application is described below. FIG. 5 shows a block diagram of this apparatus embodiment. As shown in FIG. 5, the apparatus embodiment comprises: a capture unit 51 for cutting out the image of the contained object from the original image; a filling unit 52 for filling a border around the cut-out image with a single background colour so that the filled image becomes a minimum square; a normalization processing unit 53, comprising a scaling unit 531 for scaling the whole square image proportionally to an image of a first predetermined size and a division unit 532 for dividing the scaled image into mutually non-overlapping sub-image blocks of a second predetermined size; a luminance derivative calculation and statistics unit 54 for computing the luminance derivatives of adjacent pixels in the horizontal, vertical, +45° and −45° directions and using the number of derivative extrema in each of the four directions, together with the total number of extrema on the four boundaries of a sub-image block, as the feature vector of that sub-image block; and a synthesis unit 55 for using the feature vectors of all sub-image blocks as the feature vector of the original image.

Preferably, in the apparatus embodiment, the normalization processing unit 53 may further comprise:

a rotation unit 533 for comparing the length and width of the cut-out image and, if the length is greater than the width, rotating the filled image 90 degrees clockwise, as shown in FIG. 6.

Preferably, in the apparatus embodiment, the normalization processing unit 53 may further comprise:

an inversion unit 534 for comparing the grey-level sums of the upper half and the lower half of the cut-out image and, if the sum of the upper half is greater than the sum of the lower half, inverting the filled image, as shown in FIG. 7.

Of course, the normalization processing unit 53 may also include both the rotation unit 533 and the inversion unit 534, as shown in FIG. 8.

As described in the foregoing method embodiment, in this apparatus embodiment the capture unit 51 cuts the image of the contained object out of the original image; specifically, it may do so according to the large grey-level difference between the object edge and the background.

More specifically, the capture unit 51 cutting out the image of the contained object from the original image according to the large grey-level difference between the object edge and the background includes: summing the grey levels of every column of pixels of the original image; computing, from left to right, the difference between the grey-level sums of adjacent columns and recording the abscissa x<sub>a</sub> of the right column when the difference exceeds a threshold; computing, from right to left, the difference between the grey-level sums of adjacent columns and recording the abscissa x<sub>b</sub> of the left column when the difference exceeds the threshold; summing the grey levels of every row of pixels of the original image; computing, from top to bottom, the difference between the grey-level sums of adjacent rows and recording the ordinate y<sub>a</sub> of the lower row when the difference exceeds the threshold; and computing, from bottom to top, the difference between the grey-level sums of adjacent rows and recording the ordinate y<sub>b</sub> of the upper row when the difference exceeds the threshold. The image within the range (x<sub>a</sub>, x<sub>b</sub>, y<sub>a</sub>, y<sub>b</sub>) is then the object image cut out from the original image.
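The capture unit's cut-out step can be sketched directly from the description above. A sketch only: `crop_box` is an illustrative name, the difference is taken as an absolute value, and the object is assumed to produce a jump on every side.

```python
def crop_box(img, threshold):
    """Bounding box (x_a, x_b, y_a, y_b) of the object, found by
    scanning adjacent column/row grey-level sums inward from each side
    until the difference exceeds the threshold."""
    h, w = len(img), len(img[0])
    col = [sum(img[y][x] for y in range(h)) for x in range(w)]
    row = [sum(r) for r in img]

    def first_jump(sums, pairs):
        for i, j in pairs:
            if abs(sums[j] - sums[i]) > threshold:
                return i, j
        raise ValueError("no object edge found")

    _, x_a = first_jump(col, [(i, i + 1) for i in range(w - 1)])          # left -> right: keep right column
    x_b, _ = first_jump(col, [(i, i + 1) for i in range(w - 2, -1, -1)])  # right -> left: keep left column
    _, y_a = first_jump(row, [(i, i + 1) for i in range(h - 1)])          # top -> bottom: keep lower row
    y_b, _ = first_jump(row, [(i, i + 1) for i in range(h - 2, -1, -1)])  # bottom -> top: keep upper row
    return x_a, x_b, y_a, y_b
```

For a 6×6 black image with a block of bright pixels in rows 2–3 and columns 2–4, the box comes out as (2, 4, 2, 3).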

This apparatus embodiment may be located in a computer system and may be implemented in hardware, software, or a combination of hardware and software; preferably, the apparatus is located in a computer system that implements a search engine function.

As the above embodiments show: cut the image of the contained object out of the original image; fill a border around the cut-out image with a single background colour so that the filled image becomes a minimum square; scale the whole square image proportionally to an image of a first predetermined size; divide the scaled image into mutually non-overlapping sub-image blocks of a second predetermined size; compute the luminance derivatives of adjacent pixels in the horizontal, vertical, +45° and −45° directions; use the number of derivative extrema in each of the four directions, together with the total number of extrema on the four boundaries of a sub-image block, as the feature vector of that sub-image block; and use the feature vectors of all sub-image blocks as the feature vector of the original image. This processing extracts the shape features of the image, and because it involves only integer arithmetic and no floating-point operations, under the same hardware configuration the processing speed can be greatly improved over the prior art.

For convenience of description, the apparatus above is described in terms of units divided by function. Of course, when implementing the present invention, the functions of the units may be realized in one or more pieces of software and/or hardware.

From the description of the above embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software plus the necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disc, and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the embodiments, or parts of the embodiments, of the present invention.

The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the others. In particular, the system embodiment is described relatively briefly because it is essentially similar to the method embodiment; for the relevant parts, refer to the description of the method embodiment.

The present invention is applicable to numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices.

The invention may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and so on that perform particular tasks or implement particular abstract data types. The invention may also be practised in distributed computing environments, in which tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

Although the present invention has been described through embodiments, those of ordinary skill in the art will appreciate that the invention admits many modifications and variations without departing from its spirit, and it is intended that the appended claims cover such modifications and variations without departing from the spirit of the invention.

Reference numerals:

51 ... capture unit
52 ... filling unit
53 ... normalization processing unit
531 ... scaling unit
532 ... division unit
54 ... luminance derivative calculation and statistics unit
55 ... synthesis unit
533 ... rotation unit
534 ... inversion unit

To explain the embodiments of the present application, or the technical solutions of the prior art, more clearly, the drawings needed for the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in the present application; those of ordinary skill in the art can obtain further drawings from them without creative effort.

FIG. 1 is a target image in a prior-art Hough transform; FIG. 2 is the parameter plane in a prior-art Hough transform; FIG. 3 is a flowchart of the method embodiment of the present application; FIG. 4 is a schematic diagram, in an embodiment of the present application, of filling the cut-out image into a minimum square; FIG. 5 is a block diagram of the apparatus embodiment of the present application; FIG. 6 is a block diagram of an apparatus implementation of the present application; FIG. 7 is a block diagram of an apparatus implementation of the present application; FIG. 8 is a block diagram of an apparatus implementation of the present application.

Claims (11)

1. An image feature extraction method, comprising: cutting out the image of the contained object from an original image; filling a border around the cut-out image with a single background colour so that the filled image becomes a minimum square; scaling the whole square image proportionally to an image of a first predetermined size; dividing the scaled image into sub-image blocks of a second predetermined size; for each sub-image block, computing the luminance derivatives of adjacent pixels in the horizontal, vertical, +45° and −45° directions, and using the number of derivative extrema in each of the four directions, together with the total number of extrema lying on the four boundaries of the sub-image block, as the feature vector of that sub-image block; and using the feature vectors of all sub-image blocks as the feature vector of the original image.

2. The method of claim 1, wherein cutting out the image of the contained object from the original image comprises: cutting out the image of the contained object from the original image according to the grey-level difference between the object edge and the background.

3. The method of claim 2, wherein cutting out the image of the contained object from the original image according to the grey-level difference between the object edge and the background comprises: summing the grey levels of every column of pixels of the original image; computing, from left to right, the difference between the grey-level sums of adjacent columns and recording the abscissa x<sub>a</sub> of the right column when the difference exceeds a threshold; computing, from right to left, the difference between the grey-level sums of adjacent columns and recording the abscissa x<sub>b</sub> of the left column when the difference exceeds the threshold; summing the grey levels of every row of pixels of the original image; computing, from top to bottom, the difference between the grey-level sums of adjacent rows and recording the ordinate y<sub>a</sub> of the lower row when the difference exceeds the threshold; and computing, from bottom to top, the difference between the grey-level sums of adjacent rows and recording the ordinate y<sub>b</sub> of the upper row when the difference exceeds the threshold; the image within the range (x<sub>a</sub>, x<sub>b</sub>, y<sub>a</sub>, y<sub>b</sub>) being the image of the contained object cut out from the original image.

4. The method of claim 1, wherein filling a border around the cut-out image with a single background colour comprises: filling the border around the cut-out image with the colour RGB(0, 0, 0) as background.

5. The method of claim 1, wherein dividing the scaled image into sub-image blocks of a second predetermined size comprises: dividing the scaled image into mutually non-overlapping sub-image blocks of the second predetermined size.

6. The method of claim 1, further comprising: comparing the length and width of the cut-out image and, if the length is greater than the width, rotating the cut-out image 90 degrees clockwise.

7. The method of claim 1, further comprising: comparing the grey-level sums of the upper half and the lower half of the cut-out image or of the minimum square image and, if the sum of the upper half is greater than the sum of the lower half, inverting the cut-out image or the minimum square image.

8. An image feature extraction apparatus, comprising: a capture unit for cutting out the image of the contained object from an original image; a filling unit for filling a border around the cut-out image with a single background colour so that the filled image becomes a minimum square; a normalization processing unit comprising a scaling unit for scaling the whole square image proportionally to an image of a first predetermined size and a division unit for dividing the scaled image into sub-image blocks of a second predetermined size; a luminance derivative calculation and statistics unit for computing the luminance derivatives of adjacent pixels in the horizontal, vertical, +45° and −45° directions and using the number of derivative extrema in each of the four directions, together with the total number of extrema on the four boundaries of a sub-image block, as the feature vector of that sub-image block; and a synthesis unit for using the feature vectors of all sub-image blocks as the feature vector of the original image.

9. The apparatus of claim 8, wherein the normalization processing unit further comprises: a rotation unit for comparing the length and width of the cut-out image and, if the length is greater than the width, rotating the filled image 90 degrees clockwise.

10. The apparatus of claim 8, wherein the normalization processing unit further comprises: an inversion unit for comparing the grey-level sums of the upper half and the lower half of the cut-out image and, if the sum of the upper half is greater than the sum of the lower half, inverting the filled image.

11. The apparatus of any one of claims 8 to 10, wherein the apparatus is located in a computer system that implements a search engine function.
TW098129245A 2009-08-31 2009-08-31 Image feature extraction method and device TWI480809B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW098129245A TWI480809B (en) 2009-08-31 2009-08-31 Image feature extraction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW098129245A TWI480809B (en) 2009-08-31 2009-08-31 Image feature extraction method and device

Publications (2)

Publication Number Publication Date
TW201108126A TW201108126A (en) 2011-03-01
TWI480809B true TWI480809B (en) 2015-04-11


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI698841B (en) * 2018-07-27 2020-07-11 香港商阿里巴巴集團服務有限公司 Data processing method and device for merging map areas

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI496111B (en) * 2013-06-19 2015-08-11 Inventec Corp Bent pin inspection method

Patent Citations (8):
- CN1273486A (王海), 1999-04-21 / 2000-11-15: Colour kinescope without shadow mask and its digitally addressed scan circuit
- US6430307B1 (Matsushita Electric Industrial Co., Ltd.), 1996-06-18 / 2002-08-06: Feature extraction system and face image recognition system
- US6483943B1 (International Business Machines Corporation), 1998-10-02 / 2002-11-19: Feature value extraction methods and apparatus for image recognition and storage medium for storing image analysis program
- US7110568B2 (Solystic), 2000-06-27 / 2006-09-19: Segmentation of a postal object digital image by Hough transform
- TW200809268A (Kodak Graphic Comm Canada Co), 2006-06-30 / 2008-02-16: Methods and apparatus for selecting and applying non-contiguous features in a pattern
- US7406212B2 (Motorola, Inc.), 2005-06-02 / 2008-07-29: Method and system for parallel processing of Hough transform computations
- US7440636B2 (Mitsubishi Denki Kabushiki Kaisha), 2002-07-01 / 2008-10-21: Method and apparatus for image processing
- TW200846943A (Inventec Besta Co Ltd), 2007-05-22 / 2008-12-01: Method of searching information by graphic features





Legal Events:
- MM4A: Annulment or lapse of patent due to non-payment of fees