TW201113819A

TW201113819A - Embedded device capable real-time recognizing the unspecific gesture and its recognizing method

Info

Publication number: TW201113819A
Application number: TW98134583A
Authority: TW
Inventors: Shih-Hung Tseng
Original assignee: Tatung Co
Priority date: 2009-10-13
Filing date: 2009-10-13
Publication date: 2011-04-16

Abstract

An embedded device capable of recognizing unspecific gestures in real time, and a recognizing method for it, are disclosed. The method recognizes unspecific gestures in order to operate the functions of the embedded device in real time. The embedded device includes an image capturing unit, an image processing unit, an image recognition unit, and a transmission unit. The image capturing unit captures images. The image processing unit isolates the gesture image from the captured images, applies binary thresholding and a closing operation, and uses the CamShift algorithm to locate the region of interest (ROI) of the palm's skin color. The ROI is then overlaid on the gesture image and skeletonization is performed. The ROI is shifted outward and inward (enlarged and shrunk), and an annular ROI is formed between the enlarged and the shrunken ROI. The image recognition unit overlays the annular ROI on the skeletonized image and uses a mask to locate the finger endpoints within the annular ROI; their count gives the number of fingers, by which the unspecific gesture is recognized in real time. Finally, the transmission unit transmits the control signal of the corresponding gesture to the embedded device, which executes the corresponding action.

Description

[Technical Field] The present invention relates to the technical field of embedded devices, and more particularly to an embedded device that recognizes unspecific gestures, in which different gestures are used to operate the functions of the embedded device.

[Prior Art] With the development of thin and light display technology, television has become a part of modern life. Watching television is no longer an entertainment confined to the living room or bedroom; television sets can now be found in kitchens, bathrooms, cars, and similar environments.
However, when watching television while cooking, bathing, or driving, the user cannot conveniently use the remote control or the buttons on the television itself to adjust functions such as channel and volume. In view of this problem, the present invention provides an embedded device that can recognize unspecific gestures in real time, so that a busy user can still quickly switch channels and adjust the volume.

[Summary of the Invention] The present invention is an embedded device capable of recognizing unspecific gestures in real time, comprising an image capturing unit, an image processing unit, an image recognition unit, and a transmission unit. The image capturing unit captures images. The image processing unit uses background subtraction to filter a gesture image out of the captured images, binarizes and smooths the gesture image, and uses a tracking algorithm based on skin color recognition to define a region of interest covering the skin color of the gesture. It then combines the region of interest with the binarized, smoothed gesture image, thins (skeletonizes) the result, and applies an enlarging offset and a shrinking offset to the region of interest, so that an annular region of interest is formed between the enlarged and the shrunken regions. The image recognition unit combines the annular region of interest with the thinned image and uses a mask to search for the number of vertices within the annular region of interest, which represents the number of fingers. It recognizes the gesture from the number of vertices and outputs the control signal of the function command corresponding to the recognition result. The transmission unit transmits the control signal to the embedded device, which then performs the corresponding action according to the control signal.
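The first two image processing operations named above, background subtraction and binarization, can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists and a threshold of 30; the function names, the data, and the threshold are illustrative assumptions and do not come from the patent. Smoothing is not shown.

```python
def subtract_background(frame, background):
    """Absolute per-pixel difference of two equally sized grayscale images."""
    return [[abs(f - b) for f, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def binarize(image, threshold):
    """Return 1 where a pixel exceeds the threshold (foreground), else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Hypothetical 3x4 grayscale data: a flat background and a frame with a bright "hand".
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
frame = [[10, 12, 10, 10],
         [10, 200, 180, 10],
         [10, 190, 10, 11]]

# The threshold value is an assumption chosen for this toy data.
mask = binarize(subtract_background(frame, background), threshold=30)
for row in mask:
    print(row)
```

Small differences such as sensor noise (the pixels valued 11 and 12) fall below the threshold, so only the pixels that changed substantially remain in the black-and-white mask.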
Thus, after filtering the gesture image out of the captured image by background subtraction, the present invention uses skin color tracking to find the skin color region of interest, and the subsequent gesture recognition is computed only within the annular region of interest derived from it. The whole frame no longer needs to be scanned, so the image data is small and recognition is faster. The gesture is recognized by a thinning algorithm combined with a mask-based vertex search: the number of vertices, that is, the number of fingers, is the recognition result, and the corresponding action is performed according to the control signal for that result. Because there are no fixed gestures, no database is needed, which simplifies the architecture of the device and makes it more user-friendly. The binarized image contains only black and white, which greatly increases the image processing speed and makes real-time recognition achievable. Moreover, because gesture recognition depends on the number of fingers, it is not restricted to the left or the right hand.

The image capturing unit of the present invention may be a camera, and the embedded device may be a commercially available electronic, home appliance, or consumer product.

To define the region of interest of the gesture skin color, the tracking algorithm may first convert the image from RGB color to HSV color, remove the Hue parameter of the HSV color space to counter the influence of the light source on color, and then track the skin color using the S and V parameters.

The enlarging offset of the region of interest may be a non-proportional enlargement.
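The color handling described above (convert RGB to HSV, discard Hue, keep S and V) might look like the following sketch. It uses the standard-library `colorsys` module; the S and V ranges are assumed values picked for the toy data, and the bounding box is a simple stand-in for the tracking algorithm, not an implementation of it.

```python
import colorsys


def skin_mask(rgb_image, s_range=(0.2, 0.7), v_range=(0.35, 1.0)):
    """Mark pixels whose saturation and value fall inside the given ranges.

    Hue is computed by colorsys but deliberately ignored, as the text describes.
    The default ranges are assumptions, not values from the patent.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            _hue, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            keep = s_range[0] <= s <= s_range[1] and v_range[0] <= v <= v_range[1]
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask


def bounding_box(mask):
    """Smallest rectangle (x0, y0, x1, y1) containing all marked pixels, or None."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical 2x2 RGB image: two skin-like tones, one white pixel, one black pixel.
image = [[(224, 172, 105), (255, 255, 255)],
         [(0, 0, 0), (198, 134, 66)]]

mask = skin_mask(image)
roi = bounding_box(mask)
print(mask, roi)
```

White is rejected because its saturation is zero and black because its value is zero, so neither needs a hue test in this sketch.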
When the embedded device is a television, a gesture recognized as one corresponds to a control signal that increments the television channel; a gesture recognized as two corresponds to a control signal that decrements the channel; a gesture recognized as three corresponds to a control signal that increases the volume; a gesture recognized as four corresponds to a control signal that decreases the volume; and a gesture recognized as five corresponds to a control signal that turns the television power on and off.

[Embodiment] Please refer to FIG. 1 and FIG. 2 together. FIG. 1 is an architectural diagram of the embedded device capable of recognizing unspecific gestures in real time according to the present invention, and FIG. 2 is a schematic diagram of gesture recognition according to the present invention. FIG. 1 shows an embedded device 1 capable of recognizing images in real time.

The embedded device 1 includes an image capturing unit 11, an image processing unit 12, an image recognition unit 13, and a transmission unit 14. The image capturing unit 11 is, for example, a camera that captures 30 images per second; a captured image, shown schematically in FIG. 2, contains a gesture image 33. The image processing unit 12 uses background subtraction to filter the gesture image 33 out of the captured image, binarizes and smooths the gesture image 33, converts the gesture image from RGB color to HSV color, and uses a tracking algorithm to track the skin color and define the region of interest 31 of the gesture skin color. It combines the region of interest 31 with the binarized, smoothed gesture image 33, performs thinning, and then applies a shrinking offset 30 and a non-proportional enlarging offset 32 to the region of interest 31 to form an annular region of interest 36. The image recognition unit 13 combines the annular region of interest 36 with the thinned image 34, uses a mask to search for the number of vertices 35 within the annular region of interest 36, that is, the number of fingers, and then recognizes the gesture according to the number of vertices.

The image recognition unit 13 then outputs the control signal corresponding to the gesture. The transmission unit 14 transmits the control signal to the embedded device 1, which performs the corresponding action according to the control signal.
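The annular region of interest 36 can be pictured as the band between a shrunken and an enlarged copy of the region of interest 31. The sketch below assumes rectangular regions given as `(x0, y0, x1, y1)` and pixel offsets of 5 (shrink) and 10 by 20 (non-proportional enlargement); the rectangle representation, names, and numbers are all assumptions for illustration, since the text does not give the shape or the offset values.

```python
def offset_rect(rect, dx, dy):
    """Grow (positive dx, dy) or shrink (negative dx, dy) a rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return (x0 - dx, y0 - dy, x1 + dx, y1 + dy)


def contains(rect, x, y):
    """True if the point lies inside the rectangle, borders included."""
    x0, y0, x1, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1


def in_annulus(roi, x, y, shrink=(5, 5), grow=(10, 20)):
    """True if (x, y) lies between the shrunken and the enlarged ROI.

    Different horizontal and vertical growth makes the enlargement non-proportional.
    The default offsets are assumed values.
    """
    inner = offset_rect(roi, -shrink[0], -shrink[1])
    outer = offset_rect(roi, grow[0], grow[1])
    return contains(outer, x, y) and not contains(inner, x, y)


roi = (40, 40, 100, 120)          # hypothetical palm region
print(in_annulus(roi, 70, 80))    # palm center: inside the inner rectangle
print(in_annulus(roi, 70, 25))    # above the palm, inside the enlarged rectangle
print(in_annulus(roi, 70, 10))    # beyond the enlarged rectangle
```

Restricting later searches to this band is what lets the recognition step skip both the palm interior and the rest of the frame.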
Therefore, according to the technical content described above, image recognition in the present invention does not need to scan the entire frame; only the filtered gesture image needs to be recognized, and the binarized gesture image contains only black and white. The image data is therefore small, which both simplifies the architecture of the device and greatly increases the image processing speed, achieving real-time recognition.

In this embodiment, the embedded device 1 is, for example, a television. FIG. 3(A) to FIG. 3(E) are schematic diagrams of gestures recognized as one to five. As shown in FIG. 3(A), when the gesture is recognized as one, the corresponding control signal increments the television channel. As shown in FIG. 3(B), when the gesture is recognized as two, the corresponding control signal decrements the television channel. As shown in FIG. 3(C), when the gesture is recognized as three, the corresponding control signal increases the television volume. As shown in FIG. 3(D), when the gesture is recognized as four, the corresponding control signal decreases the television volume. As shown in FIG. 3(E), when the gesture is recognized as five, the corresponding control signal turns the television power on and off. FIG. 3(A) to FIG. 3(E) are merely schematic diagrams of recognition results one to five; gestures recognized as one to five are not limited to those shown. Gesture recognition in the present invention depends on the number of fingers. There are no specific gestures, and five fingers is not an upper limit. Therefore no database is needed, and a gesture can be recognized whether it is made with the right hand, the left hand, or both hands together.

FIG. 4 is a flowchart of the unspecific gesture recognizing method of the present invention, which is used by an embedded device 1 to perform image recognition.
The embedded device 1 is, for example, a television, and includes an image capturing unit 11, an image processing unit 12, an image recognition unit 13, and a transmission unit 14. Please also refer to FIG. 5(A) to FIG. 5(J), which illustrate the recognition process using a gesture of five as an example. First, the image capturing unit 11 captures 30 images per second (step 20); a captured image, as shown in FIG. 5(A), contains a gesture image. Next, the image processing unit 12 uses background subtraction to subtract a previously captured background image (FIG. 5(B)) from the captured image (FIG. 5(A)), thereby filtering out the gesture image (FIG. 5(C)) (step 21). The gesture image is then binarized (step 22) to obtain a black-and-white gesture image (FIG. 5(D)) and smoothed (step 23) to obtain a smooth black-and-white gesture image (FIG. 5(E)). At the same time, the image processing unit 12 converts the gesture image from RGB color to HSV color and uses a tracking algorithm (for example, the CamShift algorithm) to track the skin color and define the region of interest 31 of the gesture skin color (step 24), as shown in FIG. 5(F). The image processing unit 12 then combines the region of interest 31 with the binarized, smoothed gesture image 33 to obtain the combined gesture image shown in FIG. 5(G), thins this image to obtain the thin lines of the gesture shown in FIG. 5(H), and applies a shrinking offset 30 and a non-proportional enlarging offset 32 to the region of interest 31 to form an annular region of interest 36 (as shown in FIG. 5(I)). The image recognition unit 13 combines the annular region of interest 36 with the thinned image 34 (as shown in FIG. 5(J)), uses a mask to search for the number of vertices 35 within the annular region of interest 36, that is, the number of fingers, and then recognizes the gesture according to the angular differences between the fingers (step 25).
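The vertex search of step 25 can be approximated as follows. This sketch assumes the thinned image is a binary nested list, that the annular region is the band between two rectangles, and that a vertex is a skeleton pixel with exactly one neighbor in its 3x3 neighborhood. The 3x3 neighborhood test is an assumption about what the mask looks like; the text only says that a mask is used. The angle-based check mentioned in step 25 is not shown.

```python
def count_endpoints(skeleton, inner, outer):
    """Count skeleton endpoints lying inside `outer` but outside `inner`.

    Rectangles are (x0, y0, x1, y1) with borders included. An endpoint is a
    skeleton pixel with exactly one other skeleton pixel among its 8 neighbors.
    """
    height, width = len(skeleton), len(skeleton[0])

    def inside(rect, x, y):
        x0, y0, x1, y1 = rect
        return x0 <= x <= x1 and y0 <= y <= y1

    count = 0
    for y in range(height):
        for x in range(width):
            if not skeleton[y][x]:
                continue
            if not inside(outer, x, y) or inside(inner, x, y):
                continue  # only pixels in the annular band are examined
            window_sum = sum(skeleton[j][i]
                             for j in range(max(0, y - 1), min(height, y + 2))
                             for i in range(max(0, x - 1), min(width, x + 2)))
            if window_sum - 1 == 1:  # subtract the pixel itself
                count += 1
    return count


# Hypothetical 7x7 skeleton: a wrist line at the bottom branching into two fingers.
skeleton = [
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]
inner = (2, 2, 4, 5)   # shrunken region of interest (assumed)
outer = (0, 0, 6, 5)   # enlarged region of interest (assumed)

fingers = count_endpoints(skeleton, inner, outer)
print(fingers)
```

In this toy data the wrist end of the skeleton is also an endpoint, but it lies outside the enlarged rectangle and is therefore not counted, which shows why the search is confined to the annular band.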
If no gesture is recognized, the image capturing unit 11 captures images again (step 20). If a gesture is recognized, the function corresponding to the gesture is determined (step 26), the control signal corresponding to the gesture is output (step 27), and the embedded device 1 performs the corresponding action according to the control signal (step 28).

The above embodiment is given only as an example for convenience of description. The scope of the rights claimed by the present invention shall be as defined in the claims and is not limited to the above embodiment.

[Brief Description of the Drawings]
FIG. 1 is an architectural diagram of the embedded device capable of recognizing unspecific gestures in real time according to the present invention.
FIG. 2 is a schematic diagram of gesture recognition according to the present invention.
FIG. 3(A) to FIG. 3(E) are schematic diagrams of gestures recognized as one to five according to the present invention.
FIG. 4 is a flowchart of the unspecific gesture recognizing method of the present invention.
FIG. 5(A) to FIG. 5(J) are schematic diagrams of the recognition process, using a gesture of five as an example.

[Description of Main Reference Numerals]
1 embedded device
11 image capturing unit
12 image processing unit
13 image recognition unit
14 transmission unit
20 to 28 steps
33 gesture image
34 thinned image
35 vertex
36 annular region of interest
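Steps 25 to 28 of the television embodiment amount to a lookup from finger count to control signal. The sketch below assumes the finger count has already been obtained for each frame; the signal names and the `send` callback are hypothetical placeholders for whatever the transmission unit 14 actually emits.

```python
# Finger count to control signal, following the television embodiment.
# The signal names are illustrative placeholders.
COMMANDS = {
    1: "CHANNEL_UP",
    2: "CHANNEL_DOWN",
    3: "VOLUME_UP",
    4: "VOLUME_DOWN",
    5: "POWER_TOGGLE",
}


def control_signal(finger_count):
    """Return the control signal for a finger count, or None if none is mapped (step 26)."""
    return COMMANDS.get(finger_count)


def run(finger_counts, send):
    """Process one finger count per captured frame and send the mapped signals."""
    for count in finger_counts:
        signal = control_signal(count)
        if signal is None:
            continue  # no gesture recognized: go on to the next captured frame (step 20)
        send(signal)  # output the control signal (step 27)


sent = []
run([0, 1, 3, 7, 5], sent.append)
print(sent)
```

Keeping the mapping in a table means another embedded device could reuse the same recognition pipeline with a different set of commands, including counts above five.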

Claims

1. An embedded device capable of recognizing unspecific gestures in real time, comprising:
an image capturing unit for capturing images;
an image processing unit that uses background subtraction to filter a gesture image out of the captured images, binarizes and smooths the gesture image, uses a tracking algorithm to define a region of interest of the gesture skin color, combines the region of interest with the binarized, smoothed gesture image, performs thinning, and then applies enlarging and shrinking offsets to the region of interest, an annular region of interest being formed between the enlarged and the shrunken regions of interest;
an image recognition unit that combines the annular region of interest with the thinned image, uses a mask to search for the number of vertices in the annular region of interest, that is, the number of fingers, thereby recognizes the gesture, and outputs the control signal corresponding to the gesture; and
a transmission unit that transmits the control signal to the embedded device, the embedded device then performing the corresponding action according to the control signal.

2. The embedded device of claim 1, wherein the image capturing unit comprises a camera.

3. The embedded device of claim 1, wherein the embedded device comprises a commercially available electronic, home appliance, or consumer product.

4. The embedded device of claim 1, wherein the tracking algorithm defines the region of interest of the gesture skin color by first converting the image from RGB color to HSV color and then tracking the skin color to define the region of interest.

5. The embedded device of claim 1, wherein the enlarging offset of the region of interest is a non-proportional enlargement.

6. The embedded device of claim 1, wherein, when the embedded device is a television and the gesture is recognized as one, the corresponding control signal increments the channel of the television.

7. The embedded device of claim 1, wherein, when the embedded device is a television and the gesture is recognized as two, the corresponding control signal decrements the channel of the television.

8. The embedded device of claim 1, wherein, when the embedded device is a television and the gesture is recognized as three, the corresponding control signal increases the volume of the television.

9. The embedded device of claim 1, wherein, when the embedded device is a television and the gesture is recognized as four, the corresponding control signal decreases the volume of the television.

10. The embedded device of claim 1, wherein, when the embedded device is a television and the gesture is recognized as five, the corresponding control signal turns the power of the television on and off.

11. A method by which an embedded device recognizes unspecific gestures in real time, the embedded device including an image capturing unit, an image processing unit, an image recognition unit, and a transmission unit, the method comprising the following steps:
(A) the image capturing unit captures images;
(B) the image processing unit uses background subtraction to filter a gesture image out of the captured images;
(C) the gesture image is binarized and smoothed, and a tracking algorithm is used to define a region of interest of the gesture skin color; and
(D) the region of interest is combined with the binarized, smoothed gesture image, thinning is performed, and enlarging and shrinking offsets are then applied to the region of interest, an annular region of interest being formed between the enlarged and the shrunken regions of interest.

12. The recognizing method of claim 11, further comprising the following steps:
(E) the image recognition unit combines the annular region of interest with the thinned image and uses a mask to search for the number of vertices in the annular region of interest, that is, the number of fingers, thereby recognizing the gesture;
(F) if a gesture is recognized, the function corresponding to the gesture is determined and the corresponding control signal is output; and
(G) the embedded device performs the corresponding action according to the control signal.

13. The recognizing method of claim 11, wherein the image capturing unit comprises a camera.

14. The recognizing method of claim 11, wherein the embedded device comprises a commercially available electronic, home appliance, or consumer product.

15. The recognizing method of claim 11, wherein, in step (C), the tracking algorithm defines the region of interest of the gesture skin color by first converting the image from RGB color to HSV color, removing the Hue parameter of the HSV color space to counter the influence of the light source on color, and then tracking the skin color using the S and V parameters to define the region of interest.

16. The recognizing method of claim 11, wherein the enlarging offset of the region of interest in step (D) is a non-proportional enlargement.

17. The recognizing method of claim 12, wherein, when the embedded device is a television and the gesture is recognized as one, the corresponding control signal increments the channel of the television.

18. The recognizing method of claim 12, wherein, when the embedded device is a television and the gesture is recognized as two, the corresponding control signal decrements the channel of the television.

19. The recognizing method of claim 12, wherein, when the embedded device is a television and the gesture is recognized as three, the corresponding control signal increases the volume of the television.

20. The recognizing method of claim 12, wherein, when the embedded device is a television and the gesture is recognized as four, the corresponding control signal decreases the volume of the television.

21. The recognizing method of claim 12, wherein, when the embedded device is a television and the gesture is recognized as five, the corresponding control signal turns the power of the television on and off.