CN111857334A - Human body gesture letter recognition method and device, computer equipment and storage medium - Google Patents

Human body gesture letter recognition method and device, computer equipment and storage medium

Info

Publication number
CN111857334A
Authority
CN
China
Prior art keywords
gesture
image
data
information
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010626967.6A
Other languages
Chinese (zh)
Inventor
张卫东
陈斌
张国庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202010626967.6A priority Critical patent/CN111857334A/en
Publication of CN111857334A publication Critical patent/CN111857334A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 - Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/014 - Hand-worn input/output arrangements, e.g. data gloves
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 - Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 - Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346 - Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 - Static hand or arm
    • G06V40/113 - Recognition of static hand signs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

A method, a device, a computer device and a storage medium for recognizing human body gesture letters are provided. Characters are input through a new recognition approach that combines motion information with image information according to the sign language habits of sign language users; the approach conforms to standard sign language actions, is simple to operate and is convenient to use.

Description

Human body gesture letter recognition method and device, computer equipment and storage medium
Technical Field
The invention relates to the fields of image recognition and wearable technology, and in particular to a hand-worn gesture recognition device and method for recognizing human gesture letters.
Background
Intelligent wearable equipment is currently a popular topic in engineering, and wearable devices that provide human-computer interaction are especially sought after. From the release of Google Glass to the first smart watches, wearable devices with all kinds of functions have entered people's lives, yet the ways of inputting characters, and Chinese characters in particular, on wearable devices remain very limited. There are currently three main input modes: voice input, on-screen or physical keyboard input such as touch-screen input, and the increasingly popular gesture input.
Comparing these three input modes: voice input is limited in both its user group and its environment. Groups with language dysfunction, such as deaf-mute users who rely on sign language and cannot speak in daily life, are largely blocked from using it; accuracy is low in noisy environments, so it is strongly constrained by the surroundings; and because sound propagates, user privacy cannot be well protected within a certain range. Traditional physical or on-screen keyboards are constrained by physical size: a physical keyboard is bulky and inconvenient to carry, while an on-screen keyboard is limited by the screen size, and tapping accuracy drops sharply on a small screen. Existing limb-recognition technology is mostly applied to large-scale gesture motion capture and recognition, such as fall alarms and somatosensory game entertainment. At present the only text-input product of this kind is Ring, whose principle is to place a sensor on one finger and input English by continuous stroke writing with that finger; because English has few strokes while Chinese strokes are complex, Chinese input based on continuous handwriting is difficult to realize in practice.
At present, there is no suitable intelligent wearable device aimed at Chinese character input.
Disclosure of Invention
To address the defects in the prior art, the invention provides a method, a device, computer equipment and a storage medium for recognizing human gesture letters. Characters are input through a new recognition approach that combines motion information with image information according to the sign language habits of sign language users; it conforms to standard sign language actions, is simple to operate and is convenient to use.
The purpose of the invention is achieved by the following technical scheme:
The invention relates to a human body gesture letter recognition method, which comprises the following steps:
1. acquire detailed actions by combining a flexible wearable capturing device with an image capturing device;
2. attach the sensors to the hand by mounting micro sensors on a half-bare glove that is worn by the user;
3. capture hand posture details through miniature electronic positioners and gyroscopes installed at each joint of the hand and on the back of the hand;
4. perform image detection of the user's body through a camera placed at a certain distance from the body;
5. make wearing convenient and comfortable through a flexible circuit and micro embedded devices;
6. broadcast and display the recognized sign language through a voice broadcast module and a liquid crystal screen built on the embedded device;
7. process the real-time images transmitted by the camera through an external GPU device;
8. transmit data between the wearable device and the GPU device through a Bluetooth module;
9. inside the processor, first preprocess the collected image with several image processing algorithms to improve operating efficiency;
10. an algorithm in the processor fuses the image with the multi-sensor data to process the hand motion information, and matches the sign language result against a trained nine-axis, sixteen-point hand mapping skeleton contour decision model (a simplified sketch of this overall pipeline follows the list).
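Read together, the ten steps form a single capture, preprocess, fuse and match loop. The sketch below illustrates that loop in Python; the helper names, the Bluetooth read callback, the feedback callback and the confidence gate are illustrative assumptions, not the patent's actual implementation.

```python
# Minimal sketch of the capture -> preprocess -> fuse -> match loop in steps 1-10.
# The sensor-packet reader, the decision-model matcher and the feedback callback
# are assumptions for illustration only.
import cv2

def preprocess(frame):
    """Step 9: denoise and convert to YCrCb before hand segmentation (detailed below)."""
    blurred = cv2.GaussianBlur(frame, (5, 5), 0)
    return cv2.cvtColor(blurred, cv2.COLOR_BGR2YCrCb)

def fuse_and_match(ycrcb_image, sensor_packet):
    """Step 10 placeholder: fuse image keypoints with the nine-axis, sixteen-point
    joint data and match against the trained decision model."""
    return None, 0.0  # (letter, confidence)

def recognition_loop(read_sensor_packet, announce, camera_index=0):
    cap = cv2.VideoCapture(camera_index)             # steps 4 and 7: external camera feeding the GPU host
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        packet = read_sensor_packet()                # step 8: joint angles and gyro data over Bluetooth
        letter, confidence = fuse_and_match(preprocess(frame), packet)
        if letter is not None and confidence > 0.8:  # confidence gate used later in the description
            announce(letter)                         # step 6: voice broadcast and LCD display
    cap.release()
```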
The invention further relates to a human body gesture letter recognition device, which comprises: a power supply unit; an image acquisition device; an external GPU device serving as the core processor, with a built-in GPU server; and a gesture terminal device with built-in gyroscope and electronic positioning sensors, a voice broadcast and display module and a Bluetooth module, wherein: the image acquisition device acquires video data containing the user's torso information and gesture state information and outputs it to the core processor; the GPU server in the core processor extracts the gesture information from the video data, fuses it with the motion data of the key joints transmitted by the wearable part, matches the result against samples in the labeled feature library, and thereby identifies the corresponding gesture letter, which, after final screening, is broadcast and displayed as feedback through the voice and display module.
The recognition preprocesses the collected image with several image algorithms and fuses the gesture state motion information to obtain the direction the gesture points in, thereby recognizing the expressed gesture information; the finally recognized gesture letters are output to the sound playing and display devices for audio and visual feedback.
The external equipment comprises sound playing equipment and/or image display equipment and is used for feeding back gesture recognition information.
The electronic positioning sensor is attached to each joint of the finger through a knitted fabric and used for collecting the motion angle of the finger joint and outputting the integrated motion data to the core GPU processor in real time through the Bluetooth module to provide gesture motion data for the internal algorithm.
The human gesture letter recognition device adopts a flexible circuit and a miniature modular design, which eliminates a great deal of wiring. Each joint uses a miniature module to detect motion information, and the miniature modules are connected with flexible circuits that follow the direction of gesture motion, so there is little constraint when the device is worn. The glove likewise uses a half-bare, patch-style design: knitted fabric is attached only at the joints where sensors must be mounted, so the device imposes no binding force when worn.
The processing of the hand motion information is as follows: sample images of human gesture letters are taken as sample data; the images are first preprocessed with built-in functions of the OPENCV library; an improved OPENPOSE human posture detection algorithm is then fused with the motion data of each motion sensor at the sign language joints to train a nine-axis, sixteen-point hand mapping skeleton contour decision model; from the fused data transmitted in real time, the chain angle and direction of each hand joint are generated to construct a real-time hand model, which is matched against the decision data of the specific sign language postures generated for each sign in the decision model.
The recognition uses the gesture contour, fused with motion information, detected by the improved OPENPOSE posture algorithm. The improved algorithm combines the joint motion data extracted by the gesture wearable part to perform gesture recognition; this motion information includes the angle and acceleration of the 16 selected key points. The improved OPENPOSE recognition algorithm records recognized gestures whose confidence is higher than 0.8 and matches them against the feature-labeled gesture models in the system files through a KNN algorithm; the finally matched feature data determine the gesture, which is then broadcast and displayed. The specific steps are as follows:
Step 1: in the preprocessing stage, the image is processed with functions from the OPENCV library; color model conversion, skin color detection and gesture segmentation are carried out on the sample to distinguish the gesture from the background area, remove non-gesture areas and locate the human gesture region as the target;
Step 2: a video acquisition device collects a large number of sample pictures of the gesture sign-language letters as sample data (currently 500 pictures are required per sample); an improved human posture recognition algorithm for associating limb areas (the Part Affinity Fields algorithm, also called the OPENPOSE algorithm) is used, together with the data from the gesture terminal, to train the samples and generate gesture feature identification files describing the fingers, the palm and their respective orientations, which are then used to recognize the input gestures to be detected;
Step 3: gesture recognition is performed on the system with the trained characteristic gesture models; the recognized characteristic gesture data are screened with a KNN algorithm, and the gesture letter with the highest recognition probability is output as the finally recognized gesture;
the image processing method by utilizing the functions in the OPENCV library comprises the following specific steps:
The algorithm in the OPENCV library is mainly used in the initial working stage of processing images, because the background environment of image acquisition is sometimes complex, and the images need to be processed in a relevant way before being accurately processed by a processor, and the specific steps are as follows:
First, the GPU processor reads the image transmitted by the image acquisition device. Because the background in a real environment is complex, the image needs to be preprocessed: Gaussian filtering, denoising and image enhancement remove some of the environmental interference, and the corresponding built-in functions of the OPENCV library can be called for this. The preprocessed image is represented in the original RGB model (red, green, blue), whose channels are correlated with each other and therefore unsuitable for independent processing. The YCbCr model (one luminance and two chrominance components), by contrast, represents image information in a way that is consistent with human visual perception, with luminance and chrominance independent of each other, so the luminance and chrominance information of a color can be cleanly separated. This step therefore converts the RGB representation of the image into the YCbCr representation:
Y  = 0.299 R + 0.587 G + 0.114 B
Cb = -0.169 R - 0.331 G + 0.500 B + 128
Cr = 0.500 R - 0.419 G - 0.081 B + 128
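As a concrete illustration of this preprocessing and color-space conversion, the sketch below uses OpenCV built-ins (note that OpenCV orders the planes as Y, Cr, Cb); the kernel size and the histogram-equalization step are assumptions, not values given in the patent.

```python
# Sketch of the preprocessing stage: Gaussian filtering, simple luminance
# enhancement and conversion from the camera's BGR frame to YCrCb.
import cv2

def to_ycrcb(bgr_frame):
    denoised = cv2.GaussianBlur(bgr_frame, (5, 5), 0)    # remove sensor/environment noise
    ycrcb = cv2.cvtColor(denoised, cv2.COLOR_BGR2YCrCb)  # same transform as the conversion above
    y, cr, cb = cv2.split(ycrcb)
    y = cv2.equalizeHist(y)                              # mild image enhancement on luminance only
    return cv2.merge((y, cr, cb))
```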
after a preprocessed YCbCr image model is obtained, a target region and an environment background in the image are segmented through a skin color detection method of an ellipse model, and a binary image containing a hand region is extracted; and then, acquiring a threshold value for the acquired binary image containing the gesture area by adopting a maximum between-class variance method, removing the non-hand area, and segmenting the hand target area and the non-hand area.
The skin color detection method with the ellipse model is as follows: skin color detection based on an ellipse model is a skin color segmentation model in the YCbCr color space. Statistics show that when human skin information is mapped into the YCbCr space, the skin color pixels are distributed approximately within an ellipse in the two-dimensional CbCr plane. This property can be used to judge whether an input pixel belongs to human skin, i.e. whether its CbCr coordinates fall inside the ellipse. Because the color and brightness of the skin region are related nonlinearly, and the clusters shrink with the nonlinear transformation of Y, the highlight and shadow parts of the image must be removed and the chrominance components Cb and Cr of the YCbCr color space are nonlinearly transformed to obtain C'b and C'r. The ellipse equation with its specific parameters is:
(x - ecx)^2 / a^2 + (y - ecy)^2 / b^2 = 1
[x, y]^T = R(theta) * [C'b - Cx, C'r - Cy]^T, where R(theta) is the rotation matrix through angle theta.
According to the clustering characteristics of the skin color points in the CbCr subspace, the parameters are generally taken as Cx = 109.38, Cy = 152.02, theta = 2.53, ecx = 1.60, ecy = 2.41, a = 25.39 and b = 14.03.
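A sketch of the ellipse test with the parameters just listed is given below; the nonlinear lighting compensation that produces C'b and C'r is omitted, so this is only an approximation of the full model described here.

```python
# Elliptical CbCr skin test with Cx=109.38, Cy=152.02, theta=2.53, ecx=1.60,
# ecy=2.41, a=25.39, b=14.03. Pixels whose rotated CbCr coordinates fall inside
# the ellipse are marked as skin (255).
import numpy as np

CX, CY, THETA = 109.38, 152.02, 2.53
ECX, ECY, A, B = 1.60, 2.41, 25.39, 14.03

def skin_mask(cb, cr):
    """cb, cr: float arrays holding the chroma planes, same shape."""
    c, s = np.cos(THETA), np.sin(THETA)
    x = c * (cb - CX) + s * (cr - CY)   # rotate the CbCr offsets into the ellipse frame
    y = -s * (cb - CX) + c * (cr - CY)
    inside = (x - ECX) ** 2 / A ** 2 + (y - ECY) ** 2 / B ** 2 <= 1.0
    return inside.astype(np.uint8) * 255
```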
The gesture segmentation is implemented with the maximum between-class variance method, specifically: the threshold selection method proposed by Otsu, which is in common use, can segment the image effectively; the maximum between-class variance is introduced to obtain a threshold that separates gesture regions from non-gesture regions and removes the influence of non-gesture regions on the gesture segmentation. To obtain the segmentation threshold, the connected regions of each image are first selected, and a histogram is built through normalization:
[Equation: normalized histogram of the connected-region sizes; reproduced as an image in the original document.]
Wherein: j is the histogram bin to which the ith connected region is normalized, M is the number of connected regions with a gray value of 255 in the image, and N[i] is the pixel size of the ith connected region.
Meanwhile, a threshold r of the connected-region histogram is obtained with the maximum between-class variance method, segmentation is carried out according to this threshold, the roughly skin-colored non-gesture regions in each direction of the image are removed, and a more accurate gesture segmentation is performed on the binary skin-detection image. The threshold criterion used is:
[Equation: between-class variance criterion used to select the threshold r; reproduced as an image in the original document.]
where v_t is the expected pixel size, w_r is the probability that a connected region's pixel size falls within the range [0, r], and 1 - w_r is the probability that it falls within the range [r, 255]. To remove more non-gesture areas, the segmentation threshold is manually enlarged and set to 1.2r: the pixels of all connected regions whose size falls in [0, 1.2r] are set to 0 (background), and those falling in [1.2r, 255] are set to 255 (gesture regions). In this way some non-gesture areas remaining after skin color detection are removed and a more accurate gesture segmentation is obtained.
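The sketch below shows one way the connected-region screening described above could be implemented with OpenCV; using component areas directly as the histogram values, and the specific OpenCV calls, are simplifying assumptions rather than the patent's exact procedure.

```python
# Connected-region screening: normalize the component sizes of the skin mask,
# pick a threshold r with Otsu's method, enlarge it to 1.2*r, and keep only the
# components above that enlarged threshold as gesture regions.
import cv2
import numpy as np

def refine_gesture_mask(binary_mask):
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary_mask, connectivity=8)
    if n <= 1:                                              # nothing but background
        return binary_mask
    areas = stats[1:, cv2.CC_STAT_AREA].astype(np.float32)  # skip background label 0
    sizes = (255.0 * areas / areas.max()).astype(np.uint8)  # normalized size "histogram"
    r, _ = cv2.threshold(sizes.reshape(-1, 1), 0, 255,
                         cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu threshold r
    keep = np.where(sizes >= 1.2 * r)[0] + 1                   # manually enlarged threshold 1.2*r
    return np.where(np.isin(labels, keep), 255, 0).astype(np.uint8)
```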
The improved human posture recognition algorithm based on associating limb areas, Part Affinity Fields (PAF), associates the limbs between joint points: each pixel in the image carries a 2D vector that encodes the position and direction of the limb. The conventional PAF pipeline is: predict a keypoint heat map and a PAF for the input image, then associate the keypoints with the limbs through bipartite matching, finally obtaining the full poses of all people in the image. The improved PAF posture algorithm used here adds an "angle chain" element as an additional feature for predicting the keypoint heat map. Its main flow is: predict the keypoint heat map and PAF for the input image; splice the per-joint angle information transmitted by the wearable gesture motion capture device, arranged as a nine-axis, sixteen-point chain angle matrix, with the predicted keypoint heat map and PAF; associate them through minimum bipartite matching; and finally obtain, keypoint by keypoint, the hand posture with the bending degree of each joint.
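How the chain angle matrix might be spliced with the network outputs is sketched below; the channels-first array layout, the matrix shape and the broadcasting scheme are assumptions made for illustration, not the actual improved OPENPOSE implementation.

```python
# Conceptual sketch: pack the glove's per-joint readings into a nine-axis,
# sixteen-point chain angle matrix and stack it, broadcast to image size, with
# the predicted keypoint heat maps and PAFs before keypoint association.
import numpy as np

NUM_KEYPOINTS = 16
NUM_AXES = 9

def chain_angle_matrix(joint_packet):
    """joint_packet: mapping from keypoint index to 9 orientation/rotation values."""
    m = np.zeros((NUM_AXES, NUM_KEYPOINTS), dtype=np.float32)
    for k, values in joint_packet.items():
        m[:, k] = values
    return m

def fuse_features(heatmaps, pafs, joint_packet):
    """heatmaps, pafs: channels-first arrays of shape (C, H, W) from the network."""
    h, w = heatmaps.shape[1:]
    angles = chain_angle_matrix(joint_packet).reshape(-1, 1, 1)     # (9*16, 1, 1)
    angle_planes = np.broadcast_to(angles, (angles.shape[0], h, w))  # broadcast to image size
    return np.concatenate([heatmaps, pafs, angle_planes], axis=0)
```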
Step 4: broadcast and display the gesture information for the recognized gesture result output by the core processor.
Technical effects
Compared with the prior art, the invention adopts flexible circuit technology and a modular hardware design in which each miniature detection module is attached to its joint independently, unlike existing wearable devices that are built as a single integrated unit and therefore constrain movement considerably. With the flexible circuit and modular hardware design, the miniature sensors can be attached to an everyday glove, making the device convenient to wear without significant binding force. The invention uses an algorithm branch built into the OPENCV library to locate the target area and concentrates processing on the hand image to improve operating efficiency, and then applies the improved associated-limb-area posture recognition algorithm to the hand image to calculate the skeleton contour of each finger.
Drawings
FIG. 1 is a flow chart of a human gesture letter recognition method of the present invention;
FIG. 2 is a diagram of a human body feature model according to the present invention;
FIG. 3 is a diagram of a training sample according to the present invention;
FIG. 4 is a block diagram of a GPU device;
fig. 5 is an appearance structure diagram of the wearable device;
FIG. 6 is a block schematic diagram of a wearable device;
FIG. 7 is a schematic view of a nine-axis rotational model;
FIG. 8 is a schematic diagram of a key point skeleton outline of the gesture "sixteen points";
fig. 9 is a schematic diagram of modules involved in the overall method.
Detailed Description
As shown in fig. 1 and 9, the present embodiment relates to a human body gesture letter recognition method, apparatus, computer device and storage medium, comprising: a core processor running the Ubuntu 16.04 system and carrying an improved OPENPOSE gesture extraction library; a power supply device for powering all components; a video acquisition device for collecting human gesture information and sending it to the core processor; a voice broadcast device for announcing the gesture information recognized by the core processor; an image display device for displaying the recognized gesture information; and a gesture terminal device equipped with an STM32 microprocessor, several electronic positioning sensors and a Bluetooth module, which relies on modules mounted on a half-bare glove to position the gestures.
As shown in fig. 5 and 6, fig. 5 shows the appearance of the wearable part of the device. The left half of fig. 5 is the back of the hand: U1 is the STM32 core controller with its integrated control chip, and a miniature liquid crystal unit L1, a power supply module P1 and a Bluetooth transmission module T1 are mounted on knitted fabric and attached to the palm area with hook-and-loop fasteners. The right half of fig. 5 is the palm side of the hand: units S0, S1, S2, S3 and S4 are miniature electronic compass positioning sensors (magnetometers) attached to finger sleeves, one sensor per movable finger joint to detect joint motion, and unit M1 is a miniature MPU gyroscope sensor. Each module is connected to the STM32 core controller through flexible flat cables, whose routing follows the direction of finger movement.
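The wire format of the Bluetooth packets is not given in the patent; the sketch below assumes a simple little-endian layout of five finger-sensor angles (S0 to S4) followed by nine values from the M1 nine-axis unit, purely for illustration.

```python
# Hypothetical packet layout for the STM32 -> GPU Bluetooth link: 5 float32
# finger-joint angles (S0..S4) followed by 9 float32 nine-axis IMU readings (M1).
import struct

PACKET_FMT = "<5f9f"                        # little-endian, 14 floats = 56 bytes
PACKET_SIZE = struct.calcsize(PACKET_FMT)

def parse_packet(raw: bytes) -> dict:
    values = struct.unpack(PACKET_FMT, raw[:PACKET_SIZE])
    return {"finger_angles": values[:5],    # S0..S4 joint sensors
            "imu": values[5:]}              # M1 gyroscope/magnetometer readings
```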
As shown in fig. 7, each plane in the nine-axis, sixteen-point model rotates by a fixed angle, and each plane has two additional rotation directions; the sixteen points yield a hand key-point outline map according to the nine rotation modes of the coordinate axes.
As shown in fig. 8, a total of 16 key nodes are identified. Each key joint point carries the motion information (angle, acceleration) of its joint, and each successive joint point adds its motion information relative to the connecting key point, which serves as its origin.
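A minimal data structure for the sixteen key joints and their chained motion information is sketched below; the field names and the way motion accumulates along the chain are assumptions for illustration only.

```python
# Sixteen-point key-joint chain: each joint stores its own angle and acceleration
# plus the index of the connecting key point it takes as its origin; angles are
# accumulated by walking up the chain toward the root.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class KeyJoint:
    angle: float            # joint bend angle
    acceleration: float     # acceleration reported for the joint
    parent: Optional[int]   # index of the connecting key point, None for the root

def accumulated_angles(chain: List[KeyJoint]) -> List[float]:
    out = []
    for joint in chain:
        total, node = joint.angle, joint.parent
        while node is not None:              # add each ancestor's angle
            total += chain[node].angle
            node = chain[node].parent
        out.append(total)
    return out
```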
As shown in fig. 4, the left half is a module diagram of the GPU device and the right half is its structural diagram, comprising: A, the GPU core processor, managed internally by a Linux operating system; B, the visual acquisition module (camera); C, the network interface; D, the sound card; E, the expansion interface; and F, the display module interface. Unit B is connected to unit A to transmit the image information acquired in real time; unit C is connected to unit A for uploading data; unit D is connected to unit A to announce the processed result; unit E connects accessory equipment such as a mouse and keyboard to provide human-computer interaction with unit A; and unit F provides a display interface for unit A, to which an external screen can be connected to show the operating system interface.
The embodiment relates to a human body gesture recognition method of the device, which comprises the following steps:
Step one, wearing the wearable terminal: place the camera 0.5-1 m away from the body and aim it at the hand, and acquire gesture image information through the video acquisition device;
Step two, image preprocessing: eliminate background interference in the video frame pictures obtained from the video through color model conversion, skin color detection and gesture segmentation;
step three, image training: training input sample data to generate an identification file, training the sample by adopting an improved human posture recognition algorithm of an associated limb area and combining data of a gesture terminal, and generating a characteristic identification file for recognizing information of fingers, palms and respective orientations of the fingers and the palms;
step four, gesture target recognition: the trained equipment can be directly used without secondary training, the gesture information to be tested is generated into a feature file to be matched with the improved human posture recognition algorithm gesture model of the associated limb area, and if the matching confidence reaches a certain value (the current gesture is considered to be matched with the feature character if the matching confidence is more than 0.8), the matched feature gesture is recorded;
The improved version of the associated-limb-area human posture recognition algorithm (OPENPOSE algorithm) specifically is: predict a keypoint heat map and a PAF (Part Affinity Field) for the input image; splice the per-joint angle information transmitted by the wearable gesture motion capture device, arranged as a chain angle matrix, with the predicted keypoint heat map and PAF; associate them through minimum bipartite matching; and finally obtain, keypoint by keypoint, the hand posture with the bending degree of each joint.
Step five, outputting gesture information: the gesture data matched within a certain time window are finally screened by a KNN algorithm; if all the data reach the required credibility, the gesture with the highest probability is output as the result, broadcast and displayed. When the credibility of the matched gesture data is low, no final result is output and steps one to five are repeated (step three can be omitted if no new sample library needs to be added). A brief sketch of this screening step follows.
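One way this screening step could look is sketched below; the window handling and the reuse of the 0.8 confidence threshold are assumptions consistent with the description, not a literal transcription of the implementation.

```python
# Screen the matches collected in one time window: report nothing unless every
# match clears the confidence threshold, otherwise return the most frequent letter.
from collections import Counter
from typing import List, Optional, Tuple

def screen_window(matches: List[Tuple[str, float]], min_conf: float = 0.8) -> Optional[str]:
    if not matches or any(conf < min_conf for _, conf in matches):
        return None                               # low credibility: repeat steps one to five
    counts = Counter(letter for letter, _ in matches)
    return counts.most_common(1)[0][0]            # gesture with the highest frequency
```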
As shown in fig. 8, the algorithm extracts the joint information of the image, specifically: the improved associated-limb-area human posture recognition algorithm extracts skeleton information such as the key nodes shown in the figure and fuses the key motion angle data for each joint.
The specific steps of gesture generation in the gesture training database include:
Step 1) After the device is worn, at least 500 images are collected as a group for each Chinese gesture to be included and placed in a training folder named in the format "letter code". After the gesture is fixed, the wrist is rotated to different angles and the gesture is photographed at random, so that the sample data set covers a wider range of poses and better matches the environment of actual use.
Step 2) The Chinese letter gestures and word gestures to be recognized are taken as gesture model files and input into the image recognition system, which first generates label image script files;
Step 3) The system reads the generated label image script files and fuses the motion data of the corresponding gesture model transmitted by the gesture wearable part to generate feature script files, which serve as the feature files of the gesture model for identifying gestures. For example, the feature label file generated by the system for the Chinese letter gesture "A" is as follows:
{"mpii_image":"A0001.jpg","mpii_annorect_idx":0,"hand_pts":[[116.2,521.0,1.0],[117.205529785,526.14074707,1.0],[118.50113525,527.9468383789,1.0],[119.278491,537.7512817],[117.7237793,578.51727295,1.0],[115.59898682,572.066956,1.0],[116.6095581,608.1887207,1.0],[116.97232666,599.9323,1.0],[118.3456665039,606.382693945,1.0],[120.3927,546.265686,1.0],[115.6767334,591.93395996,1.0],[119.278491211,537.75128,1.0],[116.6095660644531,568.5156860351562,1.0],[1174.6466064453125,607.7464599609375,1.0],[117.697826856,546.1656860351562,1.0]],"head_box":[[103.71,136.0],[114.6,290.0]],"head_size":94.0,"hand_box_center":[117.995849609376,564.594360351563],"is_left":0,"is_mpii":1,"firac_finger":[89.6,89.2],"seconac_finger":[89.9,98.9,87.6],"thirdac_finger":[89.7,98.7,92.5],"fourac_finger":[89.7,96.7,98.6],"fivac_finger":[89.7,93.6,91.1]}
Each gesture to be tested generates a corresponding file; the gesture is identified as a given letter once the data in its file match the feature files of that letter (currently each gesture has 500 feature script files). A sketch of such a comparison is given below.
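The sketch below shows one way a probe feature script file could be compared against the stored per-letter files; the choice of fields ("hand_pts" and the five finger-angle lists) and the Euclidean distance are assumptions for illustration.

```python
# Compare a probe feature record against every stored feature file of one letter
# (the description mentions about 500 files per gesture) by Euclidean distance.
import json
from pathlib import Path
import numpy as np

ANGLE_KEYS = ["firac_finger", "seconac_finger", "thirdac_finger",
              "fourac_finger", "fivac_finger"]

def feature_vector(record: dict) -> np.ndarray:
    pts = np.asarray([p[:2] for p in record["hand_pts"]], dtype=np.float32).ravel()
    angles = np.concatenate([np.asarray(record[k], dtype=np.float32) for k in ANGLE_KEYS])
    return np.concatenate([pts, angles])

def distance_to_letter(probe: dict, letter_dir: Path) -> float:
    """Mean distance from the probe to the stored feature files of one letter."""
    p = feature_vector(probe)
    dists = []
    for f in letter_dir.glob("*.json"):
        ref = feature_vector(json.loads(f.read_text()))
        if ref.shape == p.shape:                 # only compare files with the same layout
            dists.append(float(np.linalg.norm(p - ref)))
    return float(np.mean(dists)) if dists else float("inf")
```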
The specific steps by which the KNN algorithm selects the gesture data with the highest credibility are (a direct code transcription follows the list):
1) compute the distance between the gesture to be predicted and every point in the known-category data set;
2) sort the distances in ascending order;
3) select the k points closest to the current point;
4) determine the frequency of the categories to which these first k points belong;
5) return the category with the highest frequency among the first k points as the predicted classification of the current point.
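The five steps translate almost directly into code; the sketch below is a plain numpy version with an illustrative choice of k.

```python
# Literal transcription of the five KNN steps: distances, ascending sort, take the
# k nearest neighbours, count their categories, return the most frequent one.
from collections import Counter
import numpy as np

def knn_classify(probe, samples, labels, k=5):
    dists = np.linalg.norm(samples - probe, axis=1)    # 1) distance to every known point
    order = np.argsort(dists)                          # 2) sort ascending
    nearest = [labels[i] for i in order[:k]]           # 3) k closest points
    counts = Counter(nearest)                          # 4) frequency of each category
    return counts.most_common(1)[0][0]                 # 5) most frequent category wins
```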
In practical experiments, in everyday working and living environments, the user puts on the wearable device, places the camera 0.5 to 1 meter from the body facing the main area of gesture activity, and starts the power and system environments of the wearable device and the GPU device. The user can then sign according to their daily habits; the system recognizes the sign language at millisecond level and broadcasts and displays the feedback in real time. The experimental data obtained are:
The current test covers the Chinese gesture letters; the experimental data obtained by testing each letter 400 times according to daily usage habits are as follows:
test letter Rate of accuracy Test letter Rate of accuracy Test letter Rate of accuracy
A 98.3333% K 95.3333% U 99%
B 99.3333% L 99.3333% V 98%
C 97.6667% M 96%. W 98.3333%
D 98.6667% N 96.6667% X 96.6667%
E 97.6667% O 98.3333% Y 98.5%
F 98.3333% P 99% Z 99.5%
G 99% Q 96.3333% ZH 97.754%
H 95.3333% R 98% CH 98.75%
I 96.3333% S 97.3333% SH 97.25%
J 95.6667% T 96% NG 99.6667%
In conclusion, the method and device preprocess the input video-frame image information and remove non-gesture parts to determine the target detection area, improving the efficiency of the subsequent algorithms. The OPENPOSE algorithm model is improved: the motion information of each key joint point of the gesture is fused using the nine-axis, sixteen-point method; after a gesture to be detected is recognized by the algorithm, a feature file is generated that records all the information of the current gesture (gesture size, motion information (angle and acceleration) of each finger, gesture center point coordinates and gesture contour coordinates) and is matched against the identification feature files of the sample gesture models; the gestures recognized within a certain time are screened with the KNN algorithm, and the gesture with the highest recognition rate is output as the result.
The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention, which is defined by the appended claims and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (11)

1. A human body gesture letter recognition method is characterized by comprising the following steps:
step 1) acquiring detailed actions by combining a flexible wearable capturing device and an image capturing device;
step 2) attaching the sensors to the hand by mounting micro sensors on a half-bare glove that is worn;
step 3) acquiring hand posture details through miniature electronic positioners and gyroscopes installed at each joint of the hand and on the back of the hand;
step 4) performing image detection of the user's body through a camera placed at a certain distance from the body;
step 5) making wearing convenient and comfortable through a flexible circuit and micro embedded devices;
step 6) broadcasting and displaying the recognized sign language through a voice broadcast module and a liquid crystal screen built on the embedded device;
step 7) processing the real-time images transmitted by the camera through an external GPU device;
step 8) transmitting data between the wearable device and the GPU device through a Bluetooth module;
step 9) preprocessing the collected image with several image processing algorithms in the processor to improve operating efficiency;
step 10) fusing, by an algorithm in the processor, the image with the multi-sensor data to process the hand motion information, and matching the sign language result against a trained nine-axis, sixteen-point hand mapping skeleton contour decision model.
2. A human gesture alphabet recognition device for implementing the method of claim 1, comprising: a power supply unit; an image acquisition device; an external GPU device serving as the core processor, with a built-in GPU server; and a gesture terminal device with built-in gyroscope and electronic positioning sensors, a voice broadcast and display module and a Bluetooth module, wherein: the image acquisition device acquires video data containing the user's torso information and gesture state information and outputs it to the core processor; the GPU server in the core processor extracts the gesture information from the video data, fuses it with the motion data of the key joints transmitted by the wearable part, matches the result against samples in the labeled feature library, and thereby identifies the corresponding gesture letter, which, after final screening, is broadcast and displayed as feedback through the voice and display module.
3. The recognition device according to claim 2, wherein the recognition preprocesses the collected image with several image algorithms and fuses the gesture state motion information to obtain the direction the gesture points in, thereby recognizing the expressed gesture information; the finally recognized gesture letters are output to the sound playing and display devices for audio and visual feedback;
the external equipment comprises sound playing equipment and/or image display equipment and is used for feeding back gesture recognition information;
the electronic positioning sensor is attached to each joint of the finger through a knitted fabric and used for collecting the motion angle of the finger joint and outputting the integrated motion data to the core GPU processor in real time through the Bluetooth module to provide gesture motion data for the internal algorithm.
4. The identification device as claimed in claim 2, wherein the human body gesture letter identification device adopts a flexible circuit and a miniature modular design, eliminating a great deal of wiring; each joint detects motion information with a miniature module, the miniature modules are connected with flexible circuits that follow the direction of gesture motion, and the glove uses a half-bare, patch-style design in which knitted fabric is attached only at the joints where sensors must be mounted, so that the glove imposes no binding force when worn.
5. The recognition apparatus according to claim 2, wherein the processing of the hand motion information is: sample images of human gesture letters are taken as sample data; the images are first preprocessed with built-in functions of the OPENCV library; an improved OPENPOSE human posture detection algorithm is fused with the motion data of each motion sensor at the sign language joints to train a nine-axis, sixteen-point hand mapping skeleton contour decision model; from the fused data transmitted in real time, the chain angle and direction of each hand joint are generated to construct a real-time hand model, which is matched against the decision data of the specific sign language postures generated for each sign in the decision model.
6. The recognition device of claim 2, wherein the recognition uses the gesture contour, fused with motion information, detected by an improved OPENPOSE gesture algorithm; the improved algorithm combines the joint motion data extracted by the gesture wearable part to perform gesture recognition, the motion information comprising the angle and acceleration information of the 16 selected key points; the improved OPENPOSE recognition algorithm records recognized gestures whose confidence is higher than 0.8, matches them against the feature-labeled gesture models in the system files through a KNN algorithm, obtains the finally matched feature data, and finally broadcasts and displays the gesture that the recognized data represent.
7. The identification device according to claim 2 or 6, wherein the identification comprises the following specific steps:
step 1: in the preprocessing stage, the image is processed by utilizing functions in an OPENCV library, and color model conversion, skin color detection and gesture segmentation are carried out on the sample by an algorithm so as to distinguish a gesture and a gesture background area, remove a non-gesture area and carry out target positioning on a human gesture area;
step 2: the method comprises the steps that a video acquisition device is used for acquiring a large number of sample pictures represented by gesture sign language letters as sample data, 500 pictures are required as samples at present, an improved human body posture recognition algorithm of a relevant limb area is adopted to combine data of a gesture terminal to train the samples, and gesture feature identification files for recognizing fingers, palms and respective orientations of the fingers and the palms are generated to be used for recognizing input gestures to be detected;
and step 3: performing gesture recognition on the system with the trained characteristic gesture model, screening recognized characteristic gesture data by using a KNN algorithm, and outputting a gesture letter with the highest recognition probability as a gesture determined by final recognition;
and 4, step 4: and broadcasting and displaying gesture information on the recognition gesture result output by the core processor.
8. The recognition device of claim 7, wherein said processing of the image using functions in the OPENCV library comprises: the algorithms in the OPENCV library are mainly used in the initial stage of image processing, because the background environment of image acquisition is sometimes complex and the images need to be preprocessed before they can be processed accurately by the processor; the specific steps are as follows:
first, the GPU processor reads the image transmitted by the image acquisition device and preprocesses it: Gaussian filtering, denoising and image enhancement remove some environmental interference, and the corresponding built-in functions of the OPENCV library can be called for this; the preprocessed image is represented in the original RGB model, whose channels are correlated with each other and unsuitable for independent processing, whereas the YCbCr model represents image information in a way consistent with human visual perception, with luminance and chrominance independent of each other, so that the luminance and chrominance information of a color can be cleanly separated; the RGB representation of the image is therefore converted into the YCbCr representation:
Y  = 0.299 R + 0.587 G + 0.114 B
Cb = -0.169 R - 0.331 G + 0.500 B + 128
Cr = 0.500 R - 0.419 G - 0.081 B + 128
after a preprocessed YCbCr image model is obtained, a target region and an environment background in the image are segmented through a skin color detection method of an ellipse model, and a binary image containing a hand region is extracted; and then, acquiring a threshold value for the acquired binary image containing the gesture area by adopting a maximum between-class variance method, removing the non-hand area, and segmenting the hand target area and the non-hand area.
9. The identification device of claim 7, wherein the skin color detection method with the ellipse model comprises: skin color detection based on an ellipse model is a skin color segmentation model in the YCbCr color space; statistics show that when human skin information is mapped into the YCbCr space, the skin color pixels are distributed approximately within an ellipse in the two-dimensional CbCr plane, and this property can be used to judge whether an input pixel belongs to human skin, i.e. whether its CbCr coordinates fall inside the ellipse; because the color and brightness of the skin region are related nonlinearly and the clusters shrink with the nonlinear transformation of Y, the highlight and shadow parts of the image are removed and the chrominance components Cb and Cr of the YCbCr color space are nonlinearly transformed to obtain C'b and C'r; the ellipse equation with its specific parameters is:
(x - ecx)^2 / a^2 + (y - ecy)^2 / b^2 = 1
[x, y]^T = R(theta) * [C'b - Cx, C'r - Cy]^T, where R(theta) is the rotation matrix through angle theta,
with Cx = 109.38, Cy = 152.02, theta = 2.53, ecx = 1.60, ecy = 2.41, a = 25.39 and b = 14.03.
10. The recognition device of claim 7, wherein the gesture segmentation is implemented with the maximum between-class variance method, specifically: the threshold selection method proposed by Otsu, which is in common use, can segment the image effectively; the maximum between-class variance is introduced to obtain a threshold separating gesture regions from non-gesture regions and to remove the influence of non-gesture regions on the gesture segmentation; to obtain the segmentation threshold, the connected regions of each image are first selected and a histogram is built through normalization:
[Equation: normalized histogram of the connected-region sizes; reproduced as an image in the original document.]
wherein: j is the histogram bin to which the ith connected region is normalized, M is the number of connected regions with a gray value of 255 in the image, and N[i] is the pixel size of the ith connected region; meanwhile, a threshold r of the connected-region histogram is obtained with the maximum between-class variance method, segmentation is carried out according to this threshold, the roughly skin-colored non-gesture regions in each direction of the image are removed, and a more accurate gesture segmentation is performed on the binary skin-detection image; the threshold criterion used is
[Equation: between-class variance criterion used to select the threshold r; reproduced as an image in the original document.]
wherein v_t is the expected pixel size, w_r is the probability that a connected region's pixel size falls within the range [0, r], and 1 - w_r is the probability that it falls within the range [r, 255]; to remove more non-gesture areas, the segmentation threshold is manually enlarged and set to 1.2r: the pixels of all connected regions whose size falls in [0, 1.2r] are set to 0 (background), and those falling in [1.2r, 255] are set to 255 (gesture regions); in this way some non-gesture areas remaining after skin color detection are removed and a more accurate gesture segmentation is obtained.
11. The identification device of claim 7, wherein said improved associated-limb-area human posture recognition algorithm comprises the steps of: predicting a keypoint heat map and a PAF (Part Affinity Field) for the input image; splicing the per-joint angle information transmitted by the wearable gesture motion capture device, arranged as a nine-axis, sixteen-point chain angle matrix, with the predicted keypoint heat map and PAF; associating them through minimum bipartite matching; and finally obtaining, keypoint by keypoint, the hand posture with the bending degree of each joint.
CN202010626967.6A 2020-07-02 2020-07-02 Human body gesture letter recognition method and device, computer equipment and storage medium Pending CN111857334A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010626967.6A CN111857334A (en) 2020-07-02 2020-07-02 Human body gesture letter recognition method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010626967.6A CN111857334A (en) 2020-07-02 2020-07-02 Human body gesture letter recognition method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111857334A true CN111857334A (en) 2020-10-30

Family

ID=72989047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010626967.6A Pending CN111857334A (en) 2020-07-02 2020-07-02 Human body gesture letter recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111857334A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361382A (en) * 2021-05-14 2021-09-07 沈阳工业大学 Hand shape recognition method based on compressed relative contour feature points
CN113611387A (en) * 2021-07-30 2021-11-05 清华大学深圳国际研究生院 Motion quality assessment method based on human body pose estimation and terminal equipment
CN113961071A (en) * 2021-10-11 2022-01-21 维沃移动通信有限公司 Smart watch, interaction method and interaction device
CN115131871A (en) * 2021-03-25 2022-09-30 华为技术有限公司 Gesture recognition system and method and computing device
WO2023123473A1 (en) * 2021-12-31 2023-07-06 华为技术有限公司 Man-machine interaction method and system, and processing device
CN116805272A (en) * 2022-10-29 2023-09-26 武汉行已学教育咨询有限公司 Visual education teaching analysis method, system and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866824A (en) * 2015-05-17 2015-08-26 华南理工大学 Manual alphabet identification method based on Leap Motion
CN105005769A (en) * 2015-07-08 2015-10-28 山东大学 Deep information based sign language recognition method
CN108268125A (en) * 2016-12-31 2018-07-10 广州映博智能科技有限公司 A kind of motion gesture detection and tracking based on computer vision
CN109614922A (en) * 2018-12-07 2019-04-12 南京富士通南大软件技术有限公司 A kind of dynamic static gesture identification method and system
CN110390275A (en) * 2019-07-04 2019-10-29 淮阴工学院 A kind of gesture classification method based on transfer learning
CN110764621A (en) * 2019-11-01 2020-02-07 华东师范大学 Self-powered intelligent touch glove and mute gesture broadcasting system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866824A (en) * 2015-05-17 2015-08-26 华南理工大学 Manual alphabet identification method based on Leap Motion
CN105005769A (en) * 2015-07-08 2015-10-28 山东大学 Deep information based sign language recognition method
CN108268125A (en) * 2016-12-31 2018-07-10 广州映博智能科技有限公司 A kind of motion gesture detection and tracking based on computer vision
CN109614922A (en) * 2018-12-07 2019-04-12 南京富士通南大软件技术有限公司 A kind of dynamic static gesture identification method and system
CN110390275A (en) * 2019-07-04 2019-10-29 淮阴工学院 A kind of gesture classification method based on transfer learning
CN110764621A (en) * 2019-11-01 2020-02-07 华东师范大学 Self-powered intelligent touch glove and mute gesture broadcasting system

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115131871A (en) * 2021-03-25 2022-09-30 华为技术有限公司 Gesture recognition system and method and computing device
CN113361382A (en) * 2021-05-14 2021-09-07 沈阳工业大学 Hand shape recognition method based on compressed relative contour feature points
CN113361382B (en) * 2021-05-14 2024-02-02 沈阳工业大学 Hand shape recognition method based on compressed relative contour feature points
CN113611387A (en) * 2021-07-30 2021-11-05 清华大学深圳国际研究生院 Motion quality assessment method based on human body pose estimation and terminal equipment
CN113961071A (en) * 2021-10-11 2022-01-21 维沃移动通信有限公司 Smart watch, interaction method and interaction device
WO2023123473A1 (en) * 2021-12-31 2023-07-06 华为技术有限公司 Man-machine interaction method and system, and processing device
CN116805272A (en) * 2022-10-29 2023-09-26 武汉行已学教育咨询有限公司 Visual education teaching analysis method, system and storage medium

Similar Documents

Publication Publication Date Title
CN111857334A (en) Human body gesture letter recognition method and device, computer equipment and storage medium
CN107633207B (en) AU characteristic recognition methods, device and storage medium
CN109359538B (en) Training method of convolutional neural network, gesture recognition method, device and equipment
Sagayam et al. Hand posture and gesture recognition techniques for virtual reality applications: a survey
Devanne et al. 3-d human action recognition by shape analysis of motion trajectories on riemannian manifold
Ahmed et al. Vision based hand gesture recognition using dynamic time warping for Indian sign language
Munib et al. American sign language (ASL) recognition based on Hough transform and neural networks
Kulkarni et al. Appearance based recognition of american sign language using gesture segmentation
Agrawal et al. A survey on manual and non-manual sign language recognition for isolated and continuous sign
CN109409994A (en) The methods, devices and systems of analog subscriber garments worn ornaments
Nath et al. Real time sign language interpreter
CN109325408A (en) A kind of gesture judging method and storage medium
CN112052186A (en) Target detection method, device, equipment and storage medium
CN111722713A (en) Multi-mode fused gesture keyboard input method, device, system and storage medium
CN112489129B (en) Pose recognition model training method and device, pose recognition method and terminal equipment
CN106200971A (en) Man-machine interactive system device based on gesture identification and operational approach
Desai et al. Human Computer Interaction through hand gestures for home automation using Microsoft Kinect
Auephanwiriyakul et al. Thai sign language translation using scale invariant feature transform and hidden markov models
CN108073851A (en) A kind of method, apparatus and electronic equipment for capturing gesture identification
Kumar et al. A hybrid gesture recognition method for American sign language
Shin et al. Hand region extraction and gesture recognition using entropy analysis
Nayakwadi et al. Natural hand gestures recognition system for intelligent hci: A survey
CN109359543B (en) Portrait retrieval method and device based on skeletonization
CN108108648A (en) A kind of new gesture recognition system device and method
Yousaf et al. Virtual keyboard: real-time finger joints tracking for keystroke detection and recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination