CN111914808A - Gesture recognition system realized based on FPGA and recognition method thereof - Google Patents

Gesture recognition system realized based on FPGA and recognition method thereof

Info

Publication number
CN111914808A
Authority
CN
China
Prior art keywords
gesture
module
data
recognition
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010834453.XA
Other languages
Chinese (zh)
Other versions
CN111914808B (en)
Inventor
王俊
易金
陈康
林瑞全
欧明敏
邢新华
武义
赵显煜
郑炜
李振嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202010834453.XA priority Critical patent/CN111914808B/en
Publication of CN111914808A publication Critical patent/CN111914808A/en
Application granted granted Critical
Publication of CN111914808B publication Critical patent/CN111914808B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G06V40/113 Recognition of static hand signs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a gesture recognition system implemented on an FPGA and a recognition method thereof, comprising a CMOS camera data acquisition module, an FPGA data processing module, a DDR3 storage module and a VGA display module. The CMOS camera is connected to the FPGA, and the camera driver is implemented inside the FPGA chip. After the video data collected by the camera enters the FPGA chip, it is buffered in the DDR3 storage module under the control of the data read-write control module and is read back out under the same control. According to the recognition results for static and dynamic gestures, the character driving and video overlapping module superimposes the recognized content onto the gesture image in text form in real time and sends the result to the VGA display module for display. Compared with the prior art, the method effectively solves the problems of unstable gesture recognition and poor real-time performance in complex environments such as insufficient illumination and skin-color-like interference.

Description

Gesture recognition system realized based on FPGA and recognition method thereof
Technical Field
The invention relates to the technical field of image processing and the field of man-machine interaction, in particular to a gesture recognition system and a gesture recognition method based on FPGA.
Background
Gesture recognition technology is one of the hot spots in the field of human-computer interaction. With the support of gesture recognition technology, interactive devices such as computers and robots can be controlled more naturally and effectively.
Gesture recognition can be divided into two types: static gesture recognition and dynamic gesture recognition. Most existing solution designs can only be used for static gestures or dynamic gestures. In gesture recognition, there are very few methods that can simultaneously handle static gestures and dynamic gestures. The algorithm provided by the invention can freely switch between static gesture recognition and dynamic gesture recognition while solving the problem of gesture recognition in a complex environment, thereby greatly improving the efficiency of a gesture recognition system.
Complex environments such as insufficient illumination and skin-color-like interference greatly reduce gesture recognition accuracy, and application systems based on computer software suffer from large volume, high power consumption, high cost and poor real-time performance. The FPGA, as a programmable logic device, features parallel processing and pipelined computation, contains rich configurable logic resources and callable IP cores, and is an ideal device for digital image processing.
Disclosure of Invention
In view of the above, the present invention provides a gesture recognition system implemented on an FPGA and a recognition method thereof, which realize both static gesture recognition and dynamic gesture recognition on an FPGA platform using HOG features and an SVM classifier, provide an effective means of man-machine interaction, and effectively solve the problems of unstable recognition and poor real-time performance in complex environments such as insufficient illumination and skin-color-like interference.
The invention is realized by adopting the following scheme: a gesture recognition system realized based on FPGA comprises a CMOS camera data acquisition module, an FPGA data processing module, a DDR3 storage module and a VGA display module; the FPGA data processing module comprises a camera driving module, a data read-write control module, a color space transformation module, a median filtering module, a histogram equalization module, a gesture segmentation module, a static and dynamic gesture judgment module, a feature extraction module, a classification identification module, a dynamic gesture track identification module, a character driving and video superposition module and a VGA driving module; the camera driving module drives the CMOS camera data acquisition module to acquire video data, and transmits the video data back to the data read-write control module, so that the data read-write control module controls the read and write of the acquired video data in the DDR3 storage module; the DDR3 storage module is used for storing video data acquired by the CMOS camera data acquisition module, transmitting the stored video data to the color space conversion module for data format conversion, and filtering through the median filtering module to reduce noise of the video data; the histogram equalization module is used for performing equalization processing on the video data filtered by the median filtering module so as to improve the contrast of the video data; the gesture segmentation module is used for segmenting a gesture area in the video data after equalization processing; the static and dynamic gesture judgment module comprises a dynamic gesture processing module and a static gesture processing module and is used for judging whether the gesture of the current frame is a static gesture or a dynamic gesture, when the specific gesture of the palm is detected, the gesture is considered to be a dynamic gesture, and the video data stream enters the dynamic gesture processing module; when the specific gesture of the palm is not detected, the gesture is considered to be a static gesture, and the video data stream enters a static gesture processing module; the feature extraction module is used for extracting HOG features of the current static gesture area; the classification recognition module uses an SVM classifier to classify and recognize the HOG characteristics of the current gesture area; the dynamic gesture track recognition module tracks the current dynamic gesture and recognizes the current dynamic gesture; the character driving and video overlapping module is used for driving the result of the static gesture recognition or the result of the dynamic gesture recognition into corresponding characters and overlapping the corresponding characters with the original gesture video; the VGA driving module is used for driving the VGA chip and displaying the superposed video in real time.
The invention provides a recognition method of a gesture recognition system based on FPGA, which comprises the following steps:
step S1: the CMOS camera data acquisition module acquires video data to the FPGA data processing module in real time;
step S2: the data read-write control module controls the read and write of the collected video data in the DDR3 storage module, and the DDR3 storage module stores the collected video data;
step S3: the color space transformation module converts the acquired RGB format data into YCbCr format data;
step S4: the median filtering module adopts a 3 × 3 template and uses a fast median filtering algorithm to filter the data converted by the color space conversion module;
step S5: the histogram equalization module instantiates two random-access memory (RAM) blocks to buffer the filtered data; histogram statistics are first performed on the Y component of the format-converted YCbCr data to obtain a histogram, the histogram is clipped and redistributed according to a clipping threshold, and an equalization operation is then performed on the histogram; the threshold value ranges from 0 to 100, and the equalization formula is as follows:
s(k) = ((L-1)/(MN)) × Σ(j=0..k) n_j
where MN is the total number of image pixels, n_j is the number of pixels with gray level j, j ranges from 0 to k, and L is the number of image gray levels; through the above formula, the gray value r(k) of each pixel in the median-filtered image is mapped to s(k), which is the gray value of that pixel after histogram equalization;
step S6: the gesture segmentation module performs skin-color threshold segmentation in the YCbCr color gamut space and eliminates the influence of skin-color-like regions by means of a four-connected-domain labeling algorithm; the position of the gesture centroid is determined through a centroid calculation formula and taken as the gesture center, and the minimum rectangular frame of the gesture area is determined from the position of the gesture center;
step S7: judging whether the gesture is a static gesture or a dynamic gesture; the static and dynamic gesture judgment module starts timing when it detects that the gesture of the current frame is a palm, and if the timed duration reaches a set time of K seconds, with 0 < K < 3, the gesture is judged to be a dynamic gesture, dynamic gesture trajectory recognition is performed, and step S8 is then executed; otherwise the gesture is judged to be a static gesture, HOG feature extraction and SVM classification recognition are performed on the gesture area, and step S8 is then executed;
step S8: the character driving and video overlapping module generates character image data corresponding to the name of each gesture; the SVM classification recognition result or the dynamic gesture trajectory recognition result is input to the character driving and video overlapping module and used to drive the character image data, the driven character image data is superimposed on the recognized gesture area image data, and both are finally displayed on the VGA display module at the same time.
Further, the conversion formula for converting the acquired RGB format data into the YCbCr format data in step S3 is:
Y=0.299R+0.587G+0.114B
Cb=-0.172R-0.339G+0.511B+128
Cr=0.511R-0.428G-0.083B+128
For hardware implementation, this formula is further converted into fixed-point form:
Y=(77R+150G+29B)>>8
Cb=(-43R-85G+128B+32768)>>8
Cr=(128R-107G-21B+32768)>>8
Further, the specific content of the filtering process in step S4 is: first, the three pixels in each row of the 3 × 3 window are sorted; then the minimum of the three maxima, the median of the three medians and the maximum of the three minima are extracted; finally, the median of these three values is taken again, which is the median of the 9 pixels of the window.
Further, the step S5 specifically includes the following steps:
step S51: and (3) performing threshold segmentation based on skin color in the YCbCr color gamut space, wherein the threshold segmentation formula of the skin color is as follows:
(skin-color threshold conditions on the Cb and Cr components of the YCbCr data; the specific threshold values appear as a formula image in the original publication)
step S52: eliminating the influence of similar skin color by adopting a four-connected domain marking algorithm;
step S53: marking the gesture area with a rectangular frame; the position of the gesture centroid is determined through a centroid calculation formula and taken as the gesture center, and the minimum rectangular frame of the gesture area is determined from the position of the gesture center; the centroid calculation formula is as follows:
M00 = ΣxΣy f(x,y)
M01 = ΣxΣy y·f(x,y)
M10 = ΣxΣy x·f(x,y)
Xc = M10/M00, Yc = M01/M00
where M00 is the centroid weight of the gesture area over the whole frame image, M01 is the weight of the gesture-area centroid in the vertical direction, M10 is the weight of the gesture-area centroid in the horizontal direction, Xc is the abscissa of the gesture-area centroid, Yc is the ordinate of the gesture-area centroid, f(x, y) is the pixel value at each point of the gesture area, y is the ordinate of a gesture-area pixel, and x is the abscissa of a gesture-area pixel.
Further, the step S52 specifically includes the following steps:
step S521: judge whether the left neighbor and the upper neighbor in the four-neighborhood of the current input pixel have been marked; if the left neighbor is marked and the upper neighbor is not, mark the current point as belonging to the same region as the left neighbor; if the left neighbor is not marked and the upper neighbor is marked, mark the current point as belonging to the same region as the upper neighbor; if both the left neighbor and the upper neighbor are marked, mark the current point with the smaller of the two marks, so that the two regions are treated as one; if neither the left neighbor nor the upper neighbor is marked, mark the current point as a new region;
step S522: and calculating the area of each marked connected region, namely summing the number of pixel points of each connected region, wherein the connected region with the largest area is the gesture region to be segmented.
Further, the specific content of the HOG feature extraction of the gesture area in step S7 is: a Shift_RAM shift register is used to perform two-line shift storage (line buffering) of the static gesture image, which together with the input data of the current line forms a 3 × 3 pixel array; the gradient amplitude and gradient direction of the input pixel (x, y) are calculated from this pixel array;
The horizontal gradient Gx(x, y) and the vertical gradient Gy(x, y) are computed as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y)
Gy(x,y)=H(x,y+1)-H(x,y-1)
the gradient amplitude G (x, y) and the gradient direction theta (x, y) of the pixel point (x, y) are calculated according to the following formula:
G(x,y) = √(Gx(x,y)² + Gy(x,y)²)
θ(x,y) = arctan(Gy(x,y)/Gx(x,y))
Dividing cells and blocks: the direction values of the pixels range from 0° to 180° and are divided into 9 bin intervals, one interval per 20°; every 8 × 8 pixels form a cell, every 2 × 2 cells form a block, and 15 × 7 blocks cover one image; one cell yields a 9-dimensional feature vector, one block yields a 36-dimensional feature vector, and one image yields a 3780-dimensional feature vector, which is the HOG feature vector extracted from the gesture area.
Further, the specific content of the SVM classification and recognition in step S7 is as follows: the classification recognition module performs classification and recognition with a linear SVM classifier and realizes static gesture multi-classification with a one-versus-one method (OVO SVMs); five types of static gestures are defined, where gesture 1 is marked A, gesture 2 is marked B, gesture 3 is marked C, the fist gesture is marked D and the palm gesture is marked E; for the A, B, C, D, E five types of gestures, the feature vectors corresponding to (A, B), (A, C), (A, D), (A, E), (B, C), (B, D), (B, E), (C, D), (C, E), (D, E) are selected as training sets during training, giving ten groups of training results; during testing, the ten groups of training results are used in turn and the ten classification results are voted on to obtain the final classification result.
Further, the voting process is as follows:
Step 1: A = B = C = D = E = 0; // vote count initialization
Step 2: (A, B) classifier: if A wins, then A = A + 1; else B = B + 1;
Step 3: (A, C) classifier: if A wins, then A = A + 1; else C = C + 1;
Step 4: (A, D) classifier: if A wins, then A = A + 1; else D = D + 1;
Step 5: (A, E) classifier: if A wins, then A = A + 1; else E = E + 1;
Step 6: (B, C) classifier: if B wins, then B = B + 1; else C = C + 1;
Step 7: (B, D) classifier: if B wins, then B = B + 1; else D = D + 1;
Step 8: (B, E) classifier: if B wins, then B = B + 1; else E = E + 1;
Step 9: (C, D) classifier: if C wins, then C = C + 1; else D = D + 1;
Step 10: (C, E) classifier: if C wins, then C = C + 1; else E = E + 1;
Step 11: (D, E) classifier: if D wins, then D = D + 1; else E = E + 1;
Step 12: the result is max(A, B, C, D, E).
Further, the specific content of the dynamic gesture trajectory recognition in step S7 is: when the system enters the dynamic gesture recognition module, the palm gesture is tracked, and one movement of the palm from a start position to an end position constitutes one dynamic operation; the current dynamic operation is regarded as a valid operation only when the pause time of the palm at both the start position and the end position exceeds a system-set threshold time of N seconds, with 0 < N < 5, and the distance between the start position and the end position exceeds a system-set threshold distance M, with 20 < M < 100.
Compared with the prior art, the invention has the following beneficial effects:
the method provided by the invention can effectively detect and identify the gesture in complex environments with insufficient illumination, similar skin color interference and the like. The gesture recognition problem under the complex environment is solved, meanwhile, the static gesture recognition and the dynamic gesture recognition can be freely switched, and the efficiency of a gesture recognition system is greatly improved. The gesture recognition system based on the FPGA is realized by utilizing the characteristics of parallel processing and pipeline computing of an FPGA hardware platform, rich configurable logic resources, a callable IP core and the like, and experimental results show that the method has a good recognition effect.
Drawings
Fig. 1 is a system configuration diagram of an embodiment of the present invention.
Fig. 2 is a flowchart of an identification method according to an embodiment of the present invention.
Fig. 3 is a hardware architecture diagram of an HOG feature extraction module according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the present embodiment provides a gesture recognition system implemented based on an FPGA, which includes a CMOS camera data acquisition module, an FPGA data processing module, a DDR3 storage module, and a VGA display module; the FPGA data processing module comprises a camera driving module, a data read-write control module, a color space transformation module, a median filtering module, a histogram equalization module, a gesture segmentation module, a static and dynamic gesture judgment module, a feature extraction module, a classification identification module, a dynamic gesture track identification module, a character driving and video superposition module and a VGA driving module; the camera driving module drives the CMOS camera data acquisition module to acquire video data, and transmits the video data back to the data read-write control module, so that the data read-write control module controls the read and write of the acquired video data in the DDR3 storage module; the DDR3 storage module is used for storing video data acquired by the CMOS camera data acquisition module, transmitting the stored video data to the color space conversion module for data format conversion, and filtering through the median filtering module to reduce noise of the video data; the histogram equalization module is used for performing equalization processing on the video data filtered by the median filtering module so as to improve the contrast of the video data; the gesture segmentation module is used for segmenting a gesture area in the video data after equalization processing; the static and dynamic gesture judgment module comprises a dynamic gesture processing module and a static gesture processing module and is used for judging whether the gesture of the current frame is a static gesture or a dynamic gesture, when the specific gesture of the palm is detected, the gesture is considered to be a dynamic gesture, and the video data stream enters the dynamic gesture processing module; when the specific gesture of the palm is not detected, the gesture is considered to be a static gesture, and the video data stream enters a static gesture processing module; the feature extraction module is used for extracting HOG features of the current static gesture area; the classification recognition module uses an SVM classifier to classify and recognize the HOG characteristics of the current gesture area; the dynamic gesture track recognition module tracks the current dynamic gesture and recognizes the current dynamic gesture; the character driving and video overlapping module is used for driving the result of the static gesture recognition or the result of the dynamic gesture recognition into corresponding characters and overlapping the corresponding characters with the original gesture video; the VGA driving module is used for driving the VGA chip and displaying the superposed video in real time.
Preferably, as shown in fig. 2, the embodiment provides a recognition method of a gesture recognition system implemented based on an FPGA, including the following steps:
step S1: the CMOS camera data acquisition module acquires video data to the FPGA data processing module in real time;
step S2: the data read-write control module controls the read and write of the collected video data in the DDR3 storage module, and the DDR3 storage module stores the collected video data;
step S3: the color space transformation module converts the acquired RGB format data into YCbCr format data;
step S4: the median filtering module adopts a 3 × 3 template and uses a fast median filtering algorithm to filter the data converted by the color space conversion module;
step S5: the histogram equalization module instantiates two random-access memory (RAM) blocks to buffer the filtered data; histogram statistics are first performed on the Y component of the format-converted YCbCr data to obtain a histogram, the histogram is clipped and redistributed according to a clipping threshold, and an equalization operation is then performed on the histogram; the threshold value ranges from 0 to 100, and the equalization formula is as follows:
s(k) = ((L-1)/(MN)) × Σ(j=0..k) n_j
where MN is the total number of image pixels, n_j is the number of pixels with gray level j, j ranges from 0 to k, and L is the number of image gray levels (e.g. 256 for an 8-bit image).
Through the above formula, the gray value r(k) of each pixel in the median-filtered image is mapped to s(k), which is the gray value of that pixel after histogram equalization;
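As a purely illustrative software sketch (it is not part of the original disclosure, and the clipping threshold and the number of gray levels are assumed example values), the clipped histogram equalization of step S5 can be modelled in Python as follows:

    import numpy as np

    def clipped_histogram_equalization(y_plane, clip_threshold=100, levels=256):
        # y_plane: 2-D uint8 array holding the Y component after median filtering.
        # clip_threshold: clipping threshold (the patent gives a range of 0-100).
        # levels: number of gray levels L (256 for 8-bit data).
        hist, _ = np.histogram(y_plane, bins=levels, range=(0, levels))

        # Clip the histogram and redistribute the clipped counts evenly.
        excess = np.maximum(hist - clip_threshold, 0).sum()
        hist = np.minimum(hist, clip_threshold) + excess // levels

        # s(k) = ((L - 1) / (M*N)) * sum_{j = 0..k} n_j
        mapping = np.round((levels - 1) * np.cumsum(hist) / hist.sum()).astype(np.uint8)
        return mapping[y_plane]        # map r(k) to s(k) for every pixel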
step S6: the gesture segmentation module performs skin-color threshold segmentation in the YCbCr color gamut space and eliminates the influence of skin-color-like regions by means of a four-connected-domain labeling algorithm; the position of the gesture centroid is determined through a centroid calculation formula and taken as the gesture center, and the minimum rectangular frame of the gesture area is determined from the position of the gesture center;
step S7: judging whether the gesture is a static gesture or a dynamic gesture; the static and dynamic gesture judgment module starts timing when it detects that the gesture of the current frame is a palm, and if the timed duration reaches a set time of K seconds, with 0 < K < 3, the gesture is judged to be a dynamic gesture, dynamic gesture trajectory recognition is performed, and step S8 is then executed; otherwise the gesture is judged to be a static gesture, HOG feature extraction and SVM classification recognition are performed on the gesture area, and step S8 is then executed;
the static and dynamic gesture judgment module can be divided into a static gesture processing module and a dynamic gesture processing module, the static and dynamic gesture judgment module judges the current gesture, and if the current gesture is a static gesture, the video data stream enters the static gesture processing module. If the gesture is a dynamic gesture, the video data stream enters a dynamic gesture processing module.
Step S8: the character driving and video overlapping module generates character image data corresponding to the name of each gesture; the SVM classification recognition result or the dynamic gesture trajectory recognition result is input to the character driving and video overlapping module and used to drive the character image data, the driven character image data is superimposed on the recognized gesture area image data, and both are finally displayed on the VGA display module at the same time.
In other words, the character driving and video overlapping module generates the corresponding character image data for the name of each gesture, drives that data according to the gesture recognition result, superimposes the driven character image data on the recognized gesture area image data, and finally displays the result through the VGA interface.
In this embodiment, the conversion formula for converting the collected RGB format data into the YCbCr format data in step S3 is:
Y=0.299R+0.587G+0.114B
Cb=-0.172R-0.339G+0.511B+128
Cr=0.511R-0.428G-0.083B+128
To facilitate hardware implementation, this formula is further converted into:
Y=(77R+150G+29B)>>8
Cb=(-43R-85G+128B+32768)>>8
Cr=(128R-107G-21B+32768)>>8
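A minimal software model of the fixed-point conversion above (illustrative only; the coefficients are exactly the shift-based ones given in the formula) is:

    def rgb_to_ycbcr_fixed_point(r, g, b):
        # Coefficients scaled by 256; the final division is a right shift by 8,
        # which is what makes the formula convenient for FPGA hardware.
        y  = (77 * r + 150 * g + 29 * b) >> 8
        cb = (-43 * r - 85 * g + 128 * b + 32768) >> 8
        cr = (128 * r - 107 * g - 21 * b + 32768) >> 8
        return y, cb, cr

    # A mid-gray pixel maps to Y = 128, Cb = 128, Cr = 128.
    print(rgb_to_ycbcr_fixed_point(128, 128, 128))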
In this embodiment, the specific content of the filtering process performed in step S4 is: first, the three pixels in each row of the 3 × 3 window are sorted; then the minimum of the three maxima, the median of the three medians and the maximum of the three minima are extracted; finally, the median of these three values is taken again, which is the median of the 9 pixels of the window.
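The hardware-friendly median scheme described above can be checked with the following small Python sketch (an illustrative software model only, not the RTL implementation):

    def fast_median_3x3(window):
        # window: three rows of three pixel values each.
        sorted_rows = [sorted(row) for row in window]           # sort each row
        max_of_mins = max(row[0] for row in sorted_rows)        # maximum of the minima
        med_of_meds = sorted(row[1] for row in sorted_rows)[1]  # median of the medians
        min_of_maxs = min(row[2] for row in sorted_rows)        # minimum of the maxima
        return sorted([max_of_mins, med_of_meds, min_of_maxs])[1]

    # Equals the true median of the nine pixels:
    print(fast_median_3x3([[10, 200, 30], [40, 50, 60], [70, 80, 90]]))   # 60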
In this embodiment, the step S5 specifically includes the following steps:
step S51: and (3) performing threshold segmentation based on skin color in the YCbCr color gamut space, wherein the threshold segmentation formula of the skin color is as follows:
(skin-color threshold conditions on the Cb and Cr components of the YCbCr data; the specific threshold values appear as a formula image in the original publication)
step S52: eliminating the influence of similar skin color by adopting a four-connected domain marking algorithm;
step S53: marking the gesture area with a rectangular frame; the position of the gesture centroid is determined through a centroid calculation formula and taken as the gesture center, and the minimum rectangular frame of the gesture area is determined from the position of the gesture center; the centroid calculation formula is as follows:
M00 = ΣxΣy f(x,y)
M01 = ΣxΣy y·f(x,y)
M10 = ΣxΣy x·f(x,y)
Xc = M10/M00, Yc = M01/M00
where M00 is the centroid weight of the gesture area over the whole frame image, M01 is the weight of the gesture-area centroid in the vertical direction, M10 is the weight of the gesture-area centroid in the horizontal direction, Xc is the abscissa of the gesture-area centroid, Yc is the ordinate of the gesture-area centroid, f(x, y) is the pixel value at each point of the gesture area, y is the ordinate of a gesture-area pixel, and x is the abscissa of a gesture-area pixel.
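A software sketch of the centroid and minimum-rectangle computation (illustrative only; f(x, y) is taken here as a binary gesture mask) could be:

    import numpy as np

    def gesture_centroid_and_box(mask):
        # mask: 2-D array that is non-zero inside the segmented gesture area.
        ys, xs = np.nonzero(mask)
        m00 = len(xs)                        # M00: sum of f(x, y) over the frame
        if m00 == 0:
            return None, None                # no gesture pixels in this frame
        m10 = xs.sum()                       # M10: sum of x * f(x, y)
        m01 = ys.sum()                       # M01: sum of y * f(x, y)
        xc, yc = m10 / m00, m01 / m00        # centroid (Xc, Yc)
        box = (xs.min(), ys.min(), xs.max(), ys.max())   # minimum rectangular frame
        return (xc, yc), box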
In this embodiment, the step S52 specifically includes the following steps:
step S521: judge whether the left neighbor and the upper neighbor in the four-neighborhood of the current input pixel have been marked; if the left neighbor is marked and the upper neighbor is not, mark the current point as belonging to the same region as the left neighbor; if the left neighbor is not marked and the upper neighbor is marked, mark the current point as belonging to the same region as the upper neighbor; if both the left neighbor and the upper neighbor are marked, mark the current point with the smaller of the two marks, so that the two regions are treated as one; if neither the left neighbor nor the upper neighbor is marked, mark the current point as a new region;
step S522: and calculating the area of each marked connected region, namely summing the number of pixel points of each connected region, wherein the connected region with the largest area is the gesture region to be segmented.
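Steps S521-S522 can be prototyped in software as follows (illustrative sketch; the label merging here uses a small union-find table, whereas the FPGA design resolves equivalent labels in its own way):

    import numpy as np

    def largest_four_connected_region(binary):
        # binary: 2-D array, non-zero where the skin-color threshold test passed.
        h, w = binary.shape
        labels = np.zeros((h, w), dtype=np.int32)
        parent = [0]                                   # union-find parent table

        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a

        next_label = 1
        for y in range(h):
            for x in range(w):
                if not binary[y, x]:
                    continue
                left = labels[y, x - 1] if x > 0 else 0
                up = labels[y - 1, x] if y > 0 else 0
                if left and up:                        # both neighbours marked
                    small, big = sorted((find(left), find(up)))
                    parent[big] = small                # merge the two regions
                    labels[y, x] = small
                elif left or up:                       # exactly one neighbour marked
                    labels[y, x] = left or up
                else:                                  # neither marked: new region
                    labels[y, x] = next_label
                    parent.append(next_label)
                    next_label += 1

        if next_label == 1:                            # no skin pixels at all
            return np.zeros((h, w), dtype=bool)
        roots = np.array([find(l) for l in range(next_label)])
        labels = roots[labels]                         # resolve merged labels
        counts = np.bincount(labels.ravel())
        counts[0] = 0                                  # ignore the background
        return labels == counts.argmax()               # largest region = gesture area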
In this embodiment, the specific content of the HOG feature extraction of the gesture area in step S7 is: a Shift_RAM shift register is used to perform two-line shift storage (line buffering) of the static gesture image, which together with the input data of the current line forms a 3 × 3 pixel array; the gradient amplitude and gradient direction of the input pixel (x, y) are calculated from this pixel array;
The horizontal gradient Gx(x, y) and the vertical gradient Gy(x, y) are computed as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y)
Gy(x,y)=H(x,y+1)-H(x,y-1)
the gradient amplitude G (x, y) and the gradient direction theta (x, y) of the pixel point (x, y) are calculated according to the following formula:
G(x,y) = √(Gx(x,y)² + Gy(x,y)²)
θ(x,y) = arctan(Gy(x,y)/Gx(x,y))
Dividing cells and blocks: the direction values of the pixels range from 0° to 180° and are divided into 9 bin intervals, one interval per 20°; every 8 × 8 pixels form a cell, every 2 × 2 cells form a block, and 15 × 7 blocks cover one image; one cell yields a 9-dimensional feature vector, one block yields a 36-dimensional feature vector, and one image yields a 3780-dimensional feature vector, which is the HOG feature vector extracted from the gesture area.
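For reference, a software model of the HOG parameters above (illustrative only; a 128 × 64 gesture image is assumed so that 15 × 7 blocks of 2 × 2 cells give the stated 3780 features) might look like this:

    import numpy as np

    def hog_features(gray):
        # gray: 2-D array holding the gesture-area image.
        gray = gray.astype(np.float32)
        gx = np.zeros_like(gray)
        gy = np.zeros_like(gray)
        gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]       # Gx = H(x+1,y) - H(x-1,y)
        gy[1:-1, :] = gray[2:, :] - gray[:-2, :]       # Gy = H(x,y+1) - H(x,y-1)
        mag = np.hypot(gx, gy)                         # gradient amplitude
        ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned gradient direction

        h, w = gray.shape
        cells_y, cells_x = h // 8, w // 8
        cell_hist = np.zeros((cells_y, cells_x, 9))
        for cy in range(cells_y):
            for cx in range(cells_x):
                m = mag[cy*8:(cy+1)*8, cx*8:(cx+1)*8]
                a = ang[cy*8:(cy+1)*8, cx*8:(cx+1)*8]
                bins = np.minimum((a // 20).astype(int), 8)     # 9 bins of 20 degrees
                cell_hist[cy, cx] = np.bincount(bins.ravel(), m.ravel(), minlength=9)

        blocks = []
        for by in range(cells_y - 1):                  # 2 x 2 cell blocks
            for bx in range(cells_x - 1):
                block = cell_hist[by:by+2, bx:bx+2].ravel()     # 36-dimensional
                blocks.append(block / (np.linalg.norm(block) + 1e-6))
        return np.concatenate(blocks)

    print(hog_features(np.random.randint(0, 255, (128, 64))).shape)   # (3780,)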
In this embodiment, the specific content of the SVM classification and recognition in step S7 is as follows: the classification recognition module performs classification and recognition with a linear SVM classifier and realizes static gesture multi-classification with a one-versus-one method (OVO SVMs); five types of static gestures are defined, where gesture 1 is marked A, gesture 2 is marked B, gesture 3 is marked C, the fist gesture is marked D and the palm gesture is marked E;
the gesture 1 is a gesture when the human hand stretches out one finger, the gesture 2 is a gesture when the human hand stretches out two fingers, and the gesture 3 is a gesture when the human hand stretches out three fingers. The fist is the gesture when the human hand stretches out the fist, and the palm is the gesture when the human hand stretches out the palm.
For the A, B, C, D, E five types of gestures, respectively selecting feature vectors corresponding to (A, B), (A, C), (A, D), (A, E), (B, C), (B, D), (B, E), (C, D), (C, E), (D, E) as a training set during training to obtain ten groups of training results; during testing, ten groups of training results are respectively used for testing, and the ten groups of classification results are voted to obtain the final classification result.
In this embodiment, the voting process is as follows:
Step 1: A = B = C = D = E = 0; // vote count initialization
Step 2: (A, B) classifier: if A wins, then A = A + 1; else B = B + 1;
Step 3: (A, C) classifier: if A wins, then A = A + 1; else C = C + 1;
Step 4: (A, D) classifier: if A wins, then A = A + 1; else D = D + 1;
Step 5: (A, E) classifier: if A wins, then A = A + 1; else E = E + 1;
Step 6: (B, C) classifier: if B wins, then B = B + 1; else C = C + 1;
Step 7: (B, D) classifier: if B wins, then B = B + 1; else D = D + 1;
Step 8: (B, E) classifier: if B wins, then B = B + 1; else E = E + 1;
Step 9: (C, D) classifier: if C wins, then C = C + 1; else D = D + 1;
Step 10: (C, E) classifier: if C wins, then C = C + 1; else E = E + 1;
Step 11: (D, E) classifier: if D wins, then D = D + 1; else E = E + 1;
Step 12: the result is max(A, B, C, D, E).
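The step list above is the usual one-versus-one voting rule; a compact software equivalent (illustrative only, with pairwise_winner standing in for the ten trained SVM decision functions) is:

    from itertools import combinations

    def ovo_vote(pairwise_winner, classes=("A", "B", "C", "D", "E")):
        # pairwise_winner(a, b) must return whichever of the two labels wins
        # the (a, b) classifier; it is a placeholder for the trained SVMs.
        votes = dict.fromkeys(classes, 0)              # vote count initialization
        for a, b in combinations(classes, 2):          # (A,B), (A,C), ..., (D,E)
            votes[pairwise_winner(a, b)] += 1
        return max(votes, key=votes.get)               # class with the most votes

    # Dummy classifier that always prefers the later label -> 'E' wins.
    print(ovo_vote(lambda a, b: b))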
In this embodiment, the specific content of the dynamic gesture trajectory recognition in step S7 is: when the system enters the dynamic gesture recognition module, the palm gesture is tracked, and one movement of the palm from a start position to an end position constitutes one dynamic operation; the current dynamic operation is regarded as a valid operation only when the pause time of the palm at both the start position and the end position exceeds a system-set threshold time of N seconds, with 0 < N < 5, and the distance between the start position and the end position exceeds a system-set threshold distance M, with 20 < M < 100.
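The validity rule for a dynamic operation can be expressed, purely for illustration, as the following check; the default values of N and M here are assumed examples within the stated ranges:

    import math

    def is_valid_dynamic_operation(start_pos, end_pos, start_pause_s, end_pause_s,
                                   n_seconds=1.0, m_distance=50.0):
        # The palm must pause at both the start and the end position for more
        # than N seconds (0 < N < 5) and the two positions must be more than
        # M apart (20 < M < 100) for the operation to count as valid.
        distance = math.dist(start_pos, end_pos)
        return (start_pause_s > n_seconds
                and end_pause_s > n_seconds
                and distance > m_distance)

    # Example: a 120-pixel swipe with 1.5 s pauses at both ends is accepted.
    print(is_valid_dynamic_operation((100, 240), (220, 240), 1.5, 1.5))   # True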
Preferably, in this embodiment, the CMOS camera is connected to the FPGA, and the driving of the camera is completed inside the FPGA chip. After video data collected by the camera enters the FPGA chip, the video data is cached in the DDR3 storage module under the action of the data read-write control module, and meanwhile, the video data is read out under the action of the data read-write control module. And the character driving and video overlapping module overlaps the recognized content with the gesture image in a character form in real time according to the recognition results of the static gesture and the dynamic gesture and then sends the overlapped recognized content to the VGA display module for display.
Specifically, in this embodiment, the CMOS camera data acquisition module acquires video data in real time to the inside of the FPGA chip; the FPGA data processing module carries out image preprocessing, gesture segmentation, static and dynamic gesture recognition, driving of a CMOS camera and a VGA and controlling read and write of DDR3 video data on the collected video data. The DDR3 memory module stores captured video data. And the VGA display module displays the gesture image marked by the gesture recognition result. The data read-write control module controls the read and write of the collected video data in the DDR 3; the DDR3 storage module stores video data acquired by the camera acquisition module; the color space transformation module converts the acquired RGB format data into YCbCr format data; the median filtering module is used for filtering the video data to reduce the noise of the video data; the histogram equalization module performs equalization processing on the video data to improve the contrast of the video data; the gesture segmentation module segments a gesture area in the video data; the static and dynamic gesture judgment module judges whether the current frame gesture is a static gesture or a dynamic gesture; the feature extraction module extracts HOG features of the current gesture area; the classification recognition module uses an SVM classifier to classify and recognize the HOG characteristics of the current gesture area; the dynamic gesture track recognition module tracks the current dynamic gesture and recognizes the current dynamic gesture; and the character driving and video overlapping module drives the result of the static gesture recognition or the result of the dynamic gesture recognition into corresponding characters and overlaps the original gesture video. The VGA driving module drives the VGA chip to display the superposed video in real time.
Preferably, in this embodiment, the video data collected is first converted from RGB video format to YCbCr format, then the video data is subjected to median filtering, and then the filtered data is subjected to histogram equalization. After the video image is enhanced, the gesture area is segmented based on the skin color, and the skin color interference such as the human face in the skin color threshold segmentation is removed through connected domain analysis. And then calculating the centroid of the gesture area, and generating a minimum bounding rectangle of the gesture area according to the centroid. And judging whether the current gesture is a static gesture or a dynamic gesture, if so, extracting HOG characteristics of the gesture area, and performing SVM classification and identification. And if the gesture is a dynamic gesture, performing dynamic gesture track recognition. And driving characters according to the static gesture recognition result or the dynamic gesture recognition result, overlapping the characters with the original gesture image, and displaying the characters in a VGA (video graphics array) mode.
Preferably, the hardware architecture of the HOG feature extraction module of this embodiment is shown in fig. 3. A Shift_RAM shift register performs two-line shift storage of the video data stream of the gesture area, which together with the input data of the current line forms a 3 × 3 pixel array; the horizontal and vertical gradients of the input pixels are calculated from this array. Multiplier, square-root and arctangent IP cores are then called to compute the gradient amplitude and gradient direction. At the same time, the gesture area data is buffered into an 8 × 8 array, yielding cells of 8 × 8 pixels. The HOG features of each cell are accumulated over the 9 direction intervals, the features of the 4 cells of a block area are concatenated to form the HOG features of the block, and the HOG features of all blocks are concatenated to form the HOG features of the image. Finally, all HOG features are normalized and output to the SVM for classification.
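The Shift_RAM line-buffer structure of fig. 3 can be mimicked in software with two one-line FIFOs and three 3-stage column registers; the following generator is an illustrative behavioural model, not the RTL design. Each yielded window, centred on pixel (x-1, y-1), can be fed to the gradient computation or to the median filter sketched earlier.

    from collections import deque

    def stream_3x3_windows(image):
        # image: list of rows of pixel values, streamed in row-major order.
        height, width = len(image), len(image[0])
        line1 = deque([0] * width)           # one-line delay (first Shift_RAM)
        line2 = deque([0] * width)           # two-line delay (second Shift_RAM)
        row0 = deque([0, 0, 0], maxlen=3)    # column registers, row y - 2
        row1 = deque([0, 0, 0], maxlen=3)    # column registers, row y - 1
        row2 = deque([0, 0, 0], maxlen=3)    # column registers, row y
        for y in range(height):
            for x in range(width):
                p = image[y][x]
                p1 = line1.popleft()         # pixel one line above the current one
                p2 = line2.popleft()         # pixel two lines above the current one
                line1.append(p)              # current pixel enters line buffer 1
                line2.append(p1)             # buffer-1 output cascades into buffer 2
                row0.append(p2); row1.append(p1); row2.append(p)
                if y >= 2 and x >= 2:        # a full 3 x 3 window is available
                    yield x - 1, y - 1, [list(row0), list(row1), list(row2)]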
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (10)

1. A gesture recognition system implemented based on an FPGA, characterized in that: the system comprises a CMOS camera data acquisition module, an FPGA data processing module, a DDR3 storage module and a VGA display module; the FPGA data processing module comprises a camera driving module, a data read-write control module, a color space transformation module, a median filtering module, a histogram equalization module, a gesture segmentation module, a static and dynamic gesture judgment module, a feature extraction module, a classification identification module, a dynamic gesture track identification module, a character driving and video superposition module and a VGA driving module; the camera driving module drives the CMOS camera data acquisition module to acquire video data, and transmits the video data back to the data read-write control module, so that the data read-write control module controls the read and write of the acquired video data in the DDR3 storage module; the DDR3 storage module is used for storing video data acquired by the CMOS camera data acquisition module, transmitting the stored video data to the color space conversion module for data format conversion, and filtering through the median filtering module to reduce noise of the video data; the histogram equalization module is used for performing equalization processing on the video data filtered by the median filtering module so as to improve the contrast of the video data; the gesture segmentation module is used for segmenting a gesture area in the video data after equalization processing; the static and dynamic gesture judgment module comprises a dynamic gesture processing module and a static gesture processing module and is used for judging whether the gesture of the current frame is a static gesture or a dynamic gesture, when the specific gesture of the palm is detected, the gesture is considered to be a dynamic gesture, and the video data stream enters the dynamic gesture processing module; when the specific gesture of the palm is not detected, the gesture is considered to be a static gesture, and the video data stream enters a static gesture processing module; the feature extraction module is used for extracting HOG features of the current static gesture area; the classification recognition module uses an SVM classifier to classify and recognize the HOG characteristics of the current gesture area; the dynamic gesture track recognition module tracks the current dynamic gesture and recognizes the current dynamic gesture; the character driving and video overlapping module is used for driving the result of the static gesture recognition or the result of the dynamic gesture recognition into corresponding characters and overlapping the corresponding characters with the original gesture video; the VGA driving module is used for driving the VGA chip and displaying the superposed video in real time.
2. An identification method of the gesture recognition system based on the FPGA implementation of claim 1, characterized in that: the method comprises the following steps:
step S1: the CMOS camera data acquisition module acquires video data to the FPGA data processing module in real time;
step S2: the data read-write control module controls the read and write of the collected video data in the DDR3 storage module, and the DDR3 storage module stores the collected video data;
step S3: the color space transformation module converts the acquired RGB format data into YCbCr format data;
step S4: the median filtering module adopts a 3 × 3 template and uses a fast median filtering algorithm to filter the data converted by the color space transformation module;
step S5: the filtered data is instantiated into two Random Access Memories (RAMs) by a histogram equalization module, a Y component in YCbCr data after format data conversion is firstly subjected to histogram statistics to obtain a histogram, the histogram is cut and redistributed according to a cutting threshold value, then equalization operation is carried out on the histogram to improve the contrast of an image, the threshold value range is 0-100, and an equalization formula is as follows:
s(k) = ((L-1)/(MN)) × Σ(j=0..k) n_j
where MN is the total number of image pixels, n_j is the number of pixels with gray level j, j ranges from 0 to k, and L is the number of image gray levels; through the above formula, the gray value r(k) of each pixel in the median-filtered image is mapped to s(k), which is the gray value of that pixel after histogram equalization;
step S6: the gesture segmentation module performs skin-color threshold segmentation in the YCbCr color gamut space and eliminates the influence of skin-color-like regions by means of a four-connected-domain labeling algorithm; the position of the gesture centroid is determined through a centroid calculation formula and taken as the gesture center, and the minimum rectangular frame of the gesture area is determined from the position of the gesture center;
step S7: judging whether the gesture is a static gesture or a dynamic gesture; the static and dynamic gesture judgment module starts timing when it detects that the gesture of the current frame is a palm, and if the timed duration reaches a set time of K seconds, with 0 < K < 3, the gesture is judged to be a dynamic gesture and step S8 is then executed; otherwise the gesture is judged to be a static gesture, HOG feature extraction and SVM classification recognition are performed on the gesture area, and step S8 is then executed;
step S8: the character driving and video overlapping module generates character image data corresponding to the name of each gesture; the SVM classification recognition result or the dynamic gesture trajectory recognition result is input to the character driving and video overlapping module and used to drive the character image data, the driven character image data is superimposed on the recognized gesture area image data, and both are finally displayed on the VGA display module at the same time.
3. The recognition method of the gesture recognition system based on FPGA implementation according to claim 2, characterized in that: in step S3, the conversion formula for converting the acquired RGB format data into YCbCr format data is:
Y=0.299R+0.587G+0.114B
Cb=-0.172R-0.339G+0.511B+128
Cr=0.511R-0.428G-0.083B+128
To facilitate hardware implementation, this formula is further converted into:
Y=(77R+150G+29B)>>8
Cb=(-43R-85G+128B+32768)>>8
Cr=(128R-107G-21B+32768)>>8.
4. The recognition method of the gesture recognition system based on FPGA implementation according to claim 2, characterized in that: the specific content of the filtering process in step S4 is: first, the three pixels in each row of the 3 × 3 window are sorted; then the minimum of the three maxima, the median of the three medians and the maximum of the three minima are extracted; finally, the median of these three values is taken again, which is the median of the 9 pixels of the window.
5. The recognition method of the gesture recognition system based on FPGA implementation according to claim 2, characterized in that: the step S5 specifically includes the following steps:
step S51: and (3) performing threshold segmentation based on skin color in the YCbCr color gamut space, wherein the threshold segmentation formula of the skin color is as follows:
(skin-color threshold conditions on the Cb and Cr components of the YCbCr data; the specific threshold values appear as a formula image in the original publication)
step S52: eliminating the influence of similar skin color by adopting a four-connected domain marking algorithm;
step S53: marking the gesture area with a rectangular frame; the position of the gesture centroid is determined through a centroid calculation formula and taken as the gesture center, and the minimum rectangular frame of the gesture area is determined from the position of the gesture center; the centroid calculation formula is as follows:
M00 = ΣxΣy f(x,y)
M01 = ΣxΣy y·f(x,y)
M10 = ΣxΣy x·f(x,y)
Xc = M10/M00, Yc = M01/M00
where M00 is the centroid weight of the gesture area over the whole frame image, M01 is the weight of the gesture-area centroid in the vertical direction, M10 is the weight of the gesture-area centroid in the horizontal direction, Xc is the abscissa of the gesture-area centroid, Yc is the ordinate of the gesture-area centroid, f(x, y) is the pixel value at each point of the gesture area, y is the ordinate of a gesture-area pixel, and x is the abscissa of a gesture-area pixel.
6. The recognition method of the gesture recognition system based on FPGA implementation according to claim 5, characterized in that: the step S52 specifically includes the following steps:
step S521: judge whether the left neighbor and the upper neighbor in the four-neighborhood of the current input pixel have been marked; if the left neighbor is marked and the upper neighbor is not, mark the current point as belonging to the same region as the left neighbor; if the left neighbor is not marked and the upper neighbor is marked, mark the current point as belonging to the same region as the upper neighbor; if both the left neighbor and the upper neighbor are marked, mark the current point with the smaller of the two marks, so that the two regions are treated as one; if neither the left neighbor nor the upper neighbor is marked, mark the current point as a new region;
step S522: and calculating the area of each marked connected region, namely summing the number of pixel points of each connected region, wherein the connected region with the largest area is the gesture region to be segmented.
7. The recognition method of the gesture recognition system based on FPGA implementation according to claim 2, characterized in that: the specific content of the HOG feature extraction of the gesture area in step S7 is: a Shift_RAM shift register is used to perform two-line shift storage (line buffering) of the static gesture image, which together with the input data of the current line forms a 3 × 3 pixel array; the gradient amplitude and gradient direction of the input pixel (x, y) are calculated from this pixel array;
The horizontal gradient Gx(x, y) and the vertical gradient Gy(x, y) are computed as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y)
Gy(x,y)=H(x,y+1)-H(x,y-1)
the gradient amplitude G (x, y) and the gradient direction theta (x, y) of the pixel point (x, y) are calculated according to the following formula:
G(x,y) = √(Gx(x,y)² + Gy(x,y)²)
θ(x,y) = arctan(Gy(x,y)/Gx(x,y))
Dividing cells and blocks: the direction values of the pixels range from 0° to 180° and are divided into 9 bin intervals, one interval per 20°; every 8 × 8 pixels form a cell, every 2 × 2 cells form a block, and 15 × 7 blocks cover one image; one cell yields a 9-dimensional feature vector, one block yields a 36-dimensional feature vector, and one image yields a 3780-dimensional feature vector, which is the HOG feature vector extracted from the gesture area.
8. The recognition method of the gesture recognition system based on FPGA implementation according to claim 2, characterized in that: the specific content of the SVM classification and recognition in step S7 is as follows: the classification recognition module performs classification and recognition with a linear SVM classifier and realizes static gesture multi-classification with a one-versus-one method; five types of static gestures are defined, where gesture 1 is marked A, gesture 2 is marked B, gesture 3 is marked C, the fist gesture is marked D and the palm gesture is marked E; for the A, B, C, D, E five types of gestures, the feature vectors corresponding to (A, B), (A, C), (A, D), (A, E), (B, C), (B, D), (B, E), (C, D), (C, E), (D, E) are selected as training sets during training, giving ten groups of training results; during testing, the ten groups of training results are used in turn and the ten classification results are voted on to obtain the final classification result.
9. The recognition method of the gesture recognition system based on FPGA implementation according to claim 8, characterized in that: the voting process is as follows:
Step 1: A = B = C = D = E = 0; // initialize the vote counts
Step 2: (A, B)-classifier: if A wins, then A = A + 1; else B = B + 1;
Step 3: (A, C)-classifier: if A wins, then A = A + 1; else C = C + 1;
Step 4: (A, D)-classifier: if A wins, then A = A + 1; else D = D + 1;
Step 5: (A, E)-classifier: if A wins, then A = A + 1; else E = E + 1;
Step 6: (B, C)-classifier: if B wins, then B = B + 1; else C = C + 1;
Step 7: (B, D)-classifier: if B wins, then B = B + 1; else D = D + 1;
Step 8: (B, E)-classifier: if B wins, then B = B + 1; else E = E + 1;
Step 9: (C, D)-classifier: if C wins, then C = C + 1; else D = D + 1;
Step 10: (C, E)-classifier: if C wins, then C = C + 1; else E = E + 1;
Step 11: (D, E)-classifier: if D wins, then D = D + 1; else E = E + 1;
Step 12: the result is max(A, B, C, D, E).
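The twelve voting steps above can be expressed compactly as follows (Python; models and CLASSES come from the pairwise-training sketch after claim 8, and x is one 3780-dimensional HOG vector):

def classify(models, x):
    """One-versus-one voting: each pairwise classifier casts one vote,
    and the gesture label with the most votes is the final result."""
    votes = {c: 0 for c in CLASSES}        # Step 1: initialise all vote counts to 0
    for clf in models.values():            # Steps 2-11: the ten pairwise decisions
        votes[clf.predict([x])[0]] += 1    # the winner of each pair gains one vote
    return max(votes, key=votes.get)       # Step 12: the most-voted label wins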
10. The recognition method of the gesture recognition system based on FPGA implementation according to claim 2, characterized in that: the specific content of performing dynamic gesture trajectory recognition in step S7 is: when the system enters the dynamic gesture recognition module, the palm gesture is tracked, and one dynamic operation is the movement of the palm from a start position to an end position; the current dynamic operation is regarded as a valid operation only when the dwell time of the palm at both the start position and the end position exceeds the system-set threshold time of N seconds and the distance between the start position and the end position exceeds the system-set threshold distance M, where 0 < N < 5 and 20 < M < 100.
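A sketch of the validity test in this claim (Python; the concrete threshold values and the tracked palm coordinates are assumptions standing in for the system's configured parameters and tracking output):

import math

N_SECONDS = 2    # assumed dwell-time threshold, within the claimed range 0 < N < 5
M_PIXELS = 50    # assumed displacement threshold, within the claimed range 20 < M < 100

def is_valid_operation(start_pos, end_pos, start_dwell_s, end_dwell_s):
    """A dynamic operation (palm moved from start_pos to end_pos) counts as valid
    only if the palm pauses long enough at both ends and moves far enough overall."""
    distance = math.hypot(end_pos[0] - start_pos[0], end_pos[1] - start_pos[1])
    return (start_dwell_s > N_SECONDS and
            end_dwell_s > N_SECONDS and
            distance > M_PIXELS)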
CN202010834453.XA 2020-08-19 2020-08-19 Gesture recognition system realized based on FPGA and recognition method thereof Active CN111914808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010834453.XA CN111914808B (en) 2020-08-19 2020-08-19 Gesture recognition system realized based on FPGA and recognition method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010834453.XA CN111914808B (en) 2020-08-19 2020-08-19 Gesture recognition system realized based on FPGA and recognition method thereof

Publications (2)

Publication Number Publication Date
CN111914808A true CN111914808A (en) 2020-11-10
CN111914808B CN111914808B (en) 2022-08-12

Family

ID=73279377

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010834453.XA Active CN111914808B (en) 2020-08-19 2020-08-19 Gesture recognition system realized based on FPGA and recognition method thereof

Country Status (1)

Country Link
CN (1) CN111914808B (en)



Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140294237A1 (en) * 2010-03-01 2014-10-02 Primesense Ltd. Combined color image and depth processing
CN101853071A (en) * 2010-05-13 2010-10-06 重庆大学 Gesture identification method and system based on visual sense
US20120068917A1 (en) * 2010-09-17 2012-03-22 Sony Corporation System and method for dynamic gesture recognition using geometric classification
CN107958218A (en) * 2017-11-22 2018-04-24 南京邮电大学 A kind of real-time gesture knows method for distinguishing
CN110309806A (en) * 2019-07-08 2019-10-08 哈尔滨理工大学 A kind of gesture recognition system and its method based on video image processing
CN110956099A (en) * 2019-11-14 2020-04-03 哈尔滨工程大学 Dynamic gesture instruction identification method
CN111160194A (en) * 2019-12-23 2020-05-15 浙江理工大学 Static gesture image recognition method based on multi-feature fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QIAN CHENGHUI et al.: "Sign language recognition method based on Kinect", Transducer and Microsystem Technologies, no. 06, 10 June 2019 (2019-06-10) *
WU BINFANG et al.: "Gesture recognition based on SVM and Inception-v3", Computer Systems & Applications, no. 05, 15 May 2020 (2020-05-15) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112922490A (en) * 2021-02-09 2021-06-08 哈尔滨理工大学 Intelligent window system based on FPGA and STM32 are united
CN115766976A (en) * 2022-11-09 2023-03-07 深圳市南电信息工程有限公司 Image display system and control method thereof
CN115766976B (en) * 2022-11-09 2023-10-13 深圳市南电信息工程有限公司 Image display system and control method thereof

Also Published As

Publication number Publication date
CN111914808B (en) 2022-08-12

Similar Documents

Publication Publication Date Title
CN102880865B (en) Dynamic gesture recognition method based on complexion and morphological characteristics
Munib et al. American sign language (ASL) recognition based on Hough transform and neural networks
CN102831404B (en) Gesture detecting method and system
US11169614B2 (en) Gesture detection method, gesture processing device, and computer readable storage medium
Rajam et al. Recognition of Tamil sign language alphabet using image processing to aid deaf-dumb people
CN111914808B (en) Gesture recognition system realized based on FPGA and recognition method thereof
CN102508547A (en) Computer-vision-based gesture input method construction method and system
EP3058513B1 (en) Multi-color channel detection for note recognition and management
CN107133562B (en) Gesture recognition method based on extreme learning machine
CN110032932B (en) Human body posture identification method based on video processing and decision tree set threshold
CN106503619B (en) Gesture recognition method based on BP neural network
Meng et al. An extended HOG model: SCHOG for human hand detection
CN109558855B (en) A kind of space gesture recognition methods combined based on palm contour feature with stencil matching method
CN114170672A (en) Classroom student behavior identification method based on computer vision
CN104850232A (en) Method for acquiring remote gesture tracks under camera conditions
CN112199015B (en) Intelligent interaction all-in-one machine and writing method and device thereof
CN110633666A (en) Gesture track recognition method based on finger color patches
Qiu et al. Computer Vision Technology Based on Deep Learning
CN107885324A (en) A kind of man-machine interaction method based on convolutional neural networks
CN113705640A (en) Method for quickly constructing airplane detection data set based on remote sensing image
CN102122345A (en) Finger gesture judging method based on hand movement variation
Kakkoth et al. Visual descriptors based real time hand gesture recognition
CN106326891A (en) Mobile terminal, target detection method and device of mobile terminal
Ding et al. An Asymmetric Parallel Residual Convolutional Neural Network for Pen-Holding Gesture Recognition
Zhu et al. Application of Attention Mechanism-Based Dual-Modality SSD in RGB-D Hand Detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant