CN109033954B - Machine vision-based aerial handwriting recognition system and method - Google Patents

Machine vision-based aerial handwriting recognition system and method

Info

Publication number
CN109033954B
CN109033954B · CN201810620085.1A
Authority
CN
China
Prior art keywords
image
algorithm
svm
track
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810620085.1A
Other languages
Chinese (zh)
Other versions
CN109033954A (en)
Inventor
汪梅
王博馨
孙敏
牛钦
翟珂
王刚
张佳楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Science and Technology
Original Assignee
Xian University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Science and Technology
Priority to CN201810620085.1A priority Critical patent/CN109033954B/en
Publication of CN109033954A publication Critical patent/CN109033954A/en
Application granted granted Critical
Publication of CN109033954B publication Critical patent/CN109033954B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes

Abstract

The invention discloses an aerial handwriting recognition system and method based on machine vision. A handwritten character video input part acquires, in real time, an input aerial handwritten character video of a specific color and generates a track picture from the collected track points; a preprocessing part preprocesses the track picture by filtering, gray-level binarization and morphology; a character segmentation part converts the track picture from the RGB color space to HSV and segments the characters with a GA-Otsu segmentation algorithm; a feature extraction part extracts features from the segmented characters; and a classification recognition part uses the extracted features to optimize a support vector machine (SVM) with a particle swarm optimization (PSO) algorithm, obtains an optimal SVM classification model and outputs the recognition result of each character. The invention only needs to capture the writing track of a specific color, is efficient, saves cost and is convenient to carry, and it does not need to judge the start, stop and end of writing.

Description

Machine vision-based aerial handwriting recognition system and method
Technical Field
The invention relates to the technical field of machine vision recognition, in particular to an aerial handwriting recognition system and method based on machine vision.
Background
Air handwriting is a novel, comfortable and natural means of human-machine interaction. Unlike traditional interaction modes, it allows a user to write in the air in a natural, unconstrained way, providing a more intuitive, convenient and comfortable interaction experience. The whole air-handwriting process mainly involves two technologies: dynamic target capture and handwriting recognition. As a novel interaction mode, air handwriting opens a new era of human-machine interaction and is expected to play an important role in future interaction applications. At present, air handwriting recognition is mostly realized in two ways: (1) based on an acceleration sensor, the air-handwriting motion data are collected and analyzed, feature vectors are then extracted from the acquired data, and a pattern recognition algorithm is used for classification and recognition; (2) based on a computer camera, the writing gesture is captured with a specific algorithm, and the specific content formed by the resulting motion track is then analyzed and recognized. However, the following common problems remain in the practical application of these methods:
1. When writing in the air, the start and end states of writing need to be distinguished. If a pause occurs during writing, excessive noise data are produced, so a truly comfortable, cheap and natural human-machine interaction mode cannot be realized.
2. The requirements on the environmental scene are generally strict: the user must wear a specific data glove or sensing equipment and be equipped with a position-tracking locator, so such systems are expensive, of low practicability and difficult to popularize.
3. Non-contact designs require real-time monitoring of a specific target and therefore have certain limitations.
4. Air writing is carried out in three-dimensional space without a plane for support, unlike a writing pad. The overlapping of strokes and the imbalance of character proportions therefore greatly increase the difficulty of recognition.
5. The accuracy and speed of recognition cannot meet people's needs.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a machine vision-based aerial handwriting recognition system and method.
In order to achieve this purpose, the invention is implemented according to the following technical scheme:
An aerial handwriting recognition system based on machine vision is composed of a handwritten character video input part and a preprocessing part, a character segmentation part, a feature extraction part and a classification recognition part installed in a computer. The handwritten character video input part acquires, in real time, an input aerial handwritten character video of a specific color and generates a track picture from the collected track points; the preprocessing part preprocesses the track picture by filtering, gray-level binarization and morphology; the character segmentation part converts the track picture from the RGB color space to HSV and segments the characters with a GA-Otsu segmentation algorithm; the feature extraction part extracts features from the segmented characters; and the classification recognition part uses the extracted features to optimize the support vector machine algorithm SVM with the particle swarm optimization (PSO) algorithm, obtains an optimal SVM classification model and outputs the recognition result of each character.
In addition, the invention also provides an aerial handwriting recognition method based on machine vision, which comprises the following steps:
S1, handwritten character video input and preprocessing: an aerial handwritten character video of a specific color is captured by a camera; the first detection of the specific color gives the first track point, and acquisition ends when the number of track points reaches 24; the track points are connected with straight lines to form the character, which is simultaneously saved as a picture for judgment and recognition; finally the saved character picture is normalized to a size of 50 x 50, stored in png format and sent to the computer, where the operation processing part performs the preprocessing operations of filtering, gray-level binarization and morphology;
S2, character segmentation: for a preprocessed image I(x, y) of m × n arbitrarily selected pixels, a threshold t is preset; the proportion of pixels in the foreground range is ω1, with mean μ1; the proportion of pixels in the background range is ω2, with mean μ2; the overall mean of the image I(x, y) is μ and the between-class variance is g_th; the number of pixels with gray value greater than the threshold is n2 and the number below the threshold is n1, where the n2 pixels are foreground points and the n1 pixels are background points:
ω1 = n2/(m × n), ω2 = n1/(m × n),
n1 + n2 = m × n, ω1 + ω2 = 1, μ = ω1·μ1 + ω2·μ2, g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)²; differentiating g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² and setting the derivative equal to zero gives the optimal threshold t taken by the method; taking any character image as a sample, the optimal threshold obtained by the Otsu method is t = 230; a genetic algorithm is used to optimize Otsu so as to segment the characters in the image;
S3, feature extraction: firstly, LBP (local binary patterns) is used to extract the texture features of the image and obtain its texture information; secondly, multilayer HOG features of the image are extracted to obtain its contour information; finally, the two kinds of features are fused as the final feature information of the image;
S4, classification and recognition: the particle swarm optimization (PSO) algorithm is used to optimize the gamma and C parameters of the support vector machine algorithm SVM to obtain an optimal SVM classification model; since the prediction precision is directly influenced by changes of the gamma and C parameters, the recognition accuracy is used as the fitness function of the PSO to continuously optimize gamma and C so that the fitness value is maximized; the initial particle swarm population is set to 20, the number of evolution generations to 20 and the learning factors to c1 = c2 = 10; the SVM parameter C varies over the range 10^-1 to 10^2 and gamma over the range 10^-2 to 10^3, and the recognition result is finally output.
Further, the optimization of Otsu by the genetic algorithm in S2 comprises the following steps:
S21, chromosome coding: in image segmentation, selection, crossover and mutation are realized by a binary coding method; since the images to be extracted are uniformly stored as 50 x 50 8-bit gray-scale maps, an 8-bit binary code between 00000000 and 11111111 can represent a segmentation threshold; the chromosome string is 10 bits long, of which the first 8 bits form the logic pattern and the last 2 bits represent the threshold value and the fitness value;
S22, setting the population size M, where M = 20-100;
S23, determining a fitness function according to the following criteria: the value of the fitness function is not less than zero; during optimization the objective function changes, but its direction of change must be the same as the direction of change of the fitness function in the population evolution; g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² is selected as the fitness function, and since the problem is a maximization problem and the objective is non-negative, the fitness function is the objective function itself;
S24, determining the genetic control parameters: single-point crossover is adopted for chromosome crossover, with crossover probability P = 0.6, and the mutation operator uses the following strategy: the evolution process is divided into early, middle and late stages; a smaller mutation probability P_m1 = 0.05 is used in the early stage, so that mutated forms are protected and diversity is preserved; a larger probability P_m2 = 0.1 is selected in the middle stage, when the algorithm enters the convergence phase; a still larger probability P_m3 = 0.3 is selected in the late stage; the segmentation process has two further important parameters, the generation gap G and the termination generation number T, where G represents the proportion of individuals updated in each generation, and G = 0.4 and T = 50 are taken;
S25, executing the genetic algorithm: M individuals X1 to XM are generated arbitrarily between 0 and 255; for Otsu, they are first encoded as 8-bit binary codes to form the initial population; after decoding, the fitness value of each individual in the population is calculated; finally selection, crossover and mutation are performed, and it is judged whether the preset termination criterion, namely the preset number of evolution generations, has been reached; when the loop ends, the individual with the largest fitness value is recorded as the selected threshold for image segmentation.
Further, the specific steps of S3 are:
S31, HOG features are extracted from the image n times, giving n HOG feature maps denoted HOG(n), n = 1, 2, 3;
S32, each HOG(n) (n = 1, 2, 3) feature image is divided in the same way into symmetrical, non-overlapping sub-block images of equal size;
S33, the HOG histogram features of all sub-block images are calculated, and the histogram features of all sub-blocks are concatenated in turn to form the feature vector of each layer's HOG(n) (n = 1, 2, 3) feature image, i.e. the hierarchical HOGi (i = 1, 2, 3) features are obtained;
S34, texture features are extracted from the image to obtain its texture information;
S35, the texture features are serially fused with each of the HOGi hierarchical features.
Further, the specific step of S4 is:
s41, taking the character feature data of the handwritten character as the input of the SVM;
s42, initializing a kernel function parameter gamma and a penalty factor C of the SVM;
s43, initializing the position and the speed of the population, and taking the accuracy rate calculated by an SVM algorithm as a fitness function of the particles;
s44, updating each individual particle by using a PSO algorithm, and calculating the fitness value of the newly generated particle;
s45, judging whether the individual extreme value of the current particle is the global optimal solution of the population or not, and if so, taking the extreme value of the particle as the global optimal solution; if not, returning to the step S44;
and S46, training the training samples by the SVM by adopting the optimized parameters to generate a classification model, and testing by using the test set.
Compared with the prior art, the invention has the beneficial effects that:
the invention realizes the identification of the handwritten characters in the air by combining the common camera and the specific color target without judging the start and the end of writing, thereby solving the two problems of quick positioning of a 'control hand' and the start and the end of writing.
The invention provides an Otsu threshold optimization algorithm, namely a Ga-Otsu algorithm, based on a traditional Otsu segmentation algorithm. And continuously optimizing the Otsu segmentation threshold value through a genetic algorithm to obtain the threshold value with the best segmentation effect. The experimental comparison result verifies the feasibility of the improved algorithm in image segmentation.
The invention provides a serial fusion algorithm based on LBP characteristics and HOG hierarchical characteristics, which not only effectively shows the edge information of an image, but also can enhance the texture detail information of the image. The experimental result shows that the average correct rate of the LBP-HOG3 feature fusion algorithm is the highest and is 92%; the average time was the shortest, 12.375 s.
The invention adopts a particle swarm optimization algorithm to carry out parameter optimization on the traditional SVM classification algorithm. Experimental results show that the method can accelerate the network learning speed and improve the character classification and recognition accuracy. The classifier accuracy designed based on the method reaches 93.8%, is improved by 5% compared with the classifier designed based on the traditional SVM algorithm, and is improved by 9.8% compared with the classifier based on the BP neural network.
The invention has strong practicability, can be integrated into various cross-platform household appliances to control the appliances, and can even be developed into an intelligent household system.
The system has strong expandability and can write any characters such as numbers, characters and the like in the air in an expandable way.
Drawings
FIG. 1 is a block diagram of the system architecture of the present invention.
Fig. 2 is a schematic diagram of character image acquisition according to an embodiment of the present invention.
FIG. 3 is a sample of a partial character image collected according to an embodiment of the present invention.
FIG. 4 is a flowchart of genetic algorithm optimization Otsu according to an embodiment of the present invention.
FIG. 5 is a comparison graph of the segmentation effect optimized by the genetic algorithm and the effect before optimization according to the embodiment of the present invention.
FIG. 6 is a flowchart of the fused algorithm design according to the embodiment of the present invention.
FIG. 7 is a flow chart of an algorithm for optimizing SVM parameters by PSO according to an embodiment of the present invention.
Fig. 8 is a block diagram of an aerial handwriting recognition system according to an embodiment of the present invention.
Fig. 9 is an interactive interface of an aerial handwriting recognition system according to an embodiment of the present invention.
FIG. 10 is a diagram of the recognition results of upper and lower case letters according to the embodiment of the present invention: (a) the result of capital letter identification; (b) recognition results of lower case letters.
FIG. 11 is a diagram of recognition results of similar letters according to an embodiment of the present invention: (a) a recognition result of one of the similar letters; (b) and the recognition result of another similar letter.
Fig. 12 is a process diagram of image extraction by using 3-layer LBP and 3-layer HOG features according to the embodiment of the present invention.
Fig. 13 is a schematic flow chart illustrating fusion of 3-layer LBP and 3-layer HOG features according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to specific examples, which are illustrative of the invention and are not to be construed as limiting the invention.
As shown in fig. 1, the machine vision-based aerial handwriting recognition system of this embodiment is composed of a handwritten character video input part and a preprocessing part 2, a character segmentation part 3, a feature extraction part 4 and a classification recognition part 5 installed in a computer. The handwritten character video input part includes a camera 1, which acquires, in real time, an input aerial handwritten character video of a specific color and generates a track picture from the collected track points; the preprocessing part 2 preprocesses the track picture by filtering, gray-level binarization and morphology; the character segmentation part 3 converts the track picture from the RGB color space to HSV and segments the characters with the GA-Otsu segmentation algorithm; the feature extraction part 4 extracts features from the segmented characters; and the classification recognition part 5 uses the extracted features to optimize the support vector machine algorithm SVM with particle swarm optimization PSO, obtains an optimal SVM classification model and outputs the recognition result of each character.
When the machine vision-based aerial handwriting recognition system of this embodiment is used for machine vision-based aerial handwriting recognition, the method specifically comprises the following steps:
Video input and preprocessing of handwritten characters: an aerial handwritten character video of a specific color is captured by the camera; the first detection of the specific color gives the first track point, and acquisition ends when the number of track points reaches 24; the track points are connected with straight lines to form the character, which is simultaneously saved as a picture for judgment and recognition; finally the saved character picture is normalized to a size of 50 x 50, stored in png format and sent to the computer for the preprocessing operations of filtering, gray-level binarization and morphology. A schematic diagram of character image acquisition is shown in fig. 2; a sample of collected character images is shown in fig. 3. A minimal sketch of this acquisition step is given below.
Since the threshold of the maximum between-class variance method (Otsu) cannot by itself reach an ideal binarization result, the invention optimizes the threshold using a genetic algorithm (GA), which can effectively find the global optimum over the variable space and thus solves the difficulty of obtaining the variable value in image segmentation. An image I(x, y) of m × n pixels is chosen arbitrarily and a threshold t is preset. The proportion of pixels in the foreground range is ω1, with mean μ1; the proportion of pixels in the background range is ω2, with mean μ2. The overall mean of the image I(x, y) is μ and the between-class variance is g_th. The number of pixels with gray value greater than the threshold is n2 and the number below the threshold is n1, where the n2 pixels are foreground points and the n1 pixels are background points.
ω1 = n2/(m × n) (1)
ω2 = n1/(m × n) (2)
n1 + n2 = m × n (3)
ω1 + ω2 = 1 (4)
μ = ω1·μ1 + ω2·μ2 (5)
g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² (6)
Differentiating formula (6) and setting the derivative equal to zero gives the optimal threshold t taken by the method. Taking any character image as a sample, the optimal threshold obtained by the Otsu method is t = 230. The genetic algorithm is used to optimize Otsu; the specific implementation process is as follows:
Step 1: chromosome coding. In image segmentation, selection, crossover and mutation are realized by a binary coding method. Since the images to be extracted are uniformly saved as 50 x 50 8-bit gray-scale images, an 8-bit binary code between 00000000 and 11111111 can represent a segmentation threshold. The chromosome string is 10 bits long: the first 8 bits form the logic pattern and the last 2 bits represent the threshold value and the fitness value. The parameters are coded as follows:
logic pattern (e.g. 011……001) | real value of threshold | fitness of threshold
Step 2: setting the population size M. When M is very small, the running time of the GA drops greatly, but population diversity is cut down; this drawback occasionally causes premature convergence of the GA, which greatly reduces the segmentation quality of the image region. When M is very large, the efficiency of the algorithm is low and the running time increases. Generally M = 20-100 is chosen; after repeated tests, pop = {a1, a2, …, a20} is used, i.e. M = 20 is selected in the invention.
Step 3: determining the fitness function. This function is usually derived from the objective function and must satisfy the criterion that the value of the fitness function is not less than zero. The formula is defined as:
Fit(f(x)) = f(x) - F_min, if f(x) > F_min; Fit(f(x)) = 0 otherwise,
where F_min is a specified input value, or the minimum value of f(x) over all generations so far or over the latest K generations.
During optimization the objective function changes, but its direction of change must be the same as the direction of change of the fitness function in the population evolution. The final objective function to be solved is the between-class variance function of Otsu, formula (6) above. Since the problem is a maximization problem and the objective is non-negative, the fitness function is taken directly as the objective function.
Step 4: determining the genetic control parameters. In the image segmentation algorithm, individuals are selected by the commonly used roulette-wheel method. The invention adopts single-point crossover for chromosome crossover, with crossover probability P = 0.6.
The mutation operator uses the following strategy: the evolution process is divided into early, middle and late stages. A smaller mutation probability P_m1 = 0.05 is used in the early stage, so that mutated forms are protected and diversity is preserved. A larger probability P_m2 = 0.1 is selected in the middle stage, when the algorithm enters the convergence phase. A still larger probability P_m3 = 0.3 is selected in the late stage, which improves the local search capability at this stage. The segmentation process has two further important parameters, the generation gap G and the termination generation number T, where G represents the proportion of individuals updated in each generation; G = 0.4 and T = 50 are taken.
Step 5: executing the genetic algorithm. After the population size is determined, M individuals X1 to XM are generated arbitrarily between 0 and 255, since the target image is an 8-bit bmp image. For Otsu they are first encoded in binary form as 8-bit binary codes, forming the initial population. After decoding, the fitness value of each individual in the population is calculated. Finally selection, crossover and mutation are performed, and it is judged whether the preset termination criterion has been reached (i.e. whether the preset number of evolution generations has been reached). When the loop ends, the individual with the largest fitness value is recorded as the selected threshold for image segmentation.
The flow chart of genetic algorithm optimization of Otsu is shown in fig. 4, and the segmentation effect after optimization is compared with the effect before optimization in fig. 5. A Python sketch of the GA-Otsu search follows.
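The following sketch illustrates the GA-Otsu search of steps 1-5, using the quoted population size 20, crossover probability 0.6, staged mutation probabilities 0.05/0.1/0.3 and termination generation T = 50, with the between-class variance g_th of formula (6) as fitness. Roulette selection and single-point crossover on the 8-bit codes are implemented in simplified form (elitism and the generation gap G are omitted), so this is a sketch of the procedure rather than the embodiment's exact program.

```python
# Minimal GA-Otsu sketch: fitness is the between-class variance of formula (6).
import numpy as np

def between_class_variance(hist, t):
    # g_th = w1*(mu1 - mu)**2 + w2*(mu2 - mu)**2; the value is symmetric in the
    # two classes, so either class may be labelled foreground
    total = hist.sum()
    levels = np.arange(256)
    w1 = hist[:t].sum() / total
    w2 = 1.0 - w1
    if w1 == 0.0 or w2 == 0.0:
        return 0.0
    mu1 = (levels[:t] * hist[:t]).sum() / hist[:t].sum()
    mu2 = (levels[t:] * hist[t:]).sum() / hist[t:].sum()
    mu = w1 * mu1 + w2 * mu2
    return w1 * (mu1 - mu) ** 2 + w2 * (mu2 - mu) ** 2

def ga_otsu(image, pop_size=20, generations=50, p_cross=0.6):
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    rng = np.random.default_rng()
    pop = rng.integers(0, 256, size=pop_size)              # 8-bit individuals
    for gen in range(generations):
        fit = np.array([between_class_variance(hist, int(t)) for t in pop])
        # roulette-wheel selection
        probs = fit / fit.sum() if fit.sum() > 0 else np.full(pop_size, 1.0 / pop_size)
        pop = rng.choice(pop, size=pop_size, p=probs)
        # single-point crossover on the 8-bit binary codes
        for i in range(0, pop_size - 1, 2):
            if rng.random() < p_cross:
                mask = (1 << int(rng.integers(1, 8))) - 1
                a, b = pop[i], pop[i + 1]
                pop[i], pop[i + 1] = (a & ~mask) | (b & mask), (b & ~mask) | (a & mask)
        # staged mutation: P_m1 = 0.05 early, P_m2 = 0.1 middle, P_m3 = 0.3 late
        p_mut = 0.05 if gen < generations // 3 else (0.1 if gen < 2 * generations // 3 else 0.3)
        for i in range(pop_size):
            if rng.random() < p_mut:
                pop[i] ^= 1 << int(rng.integers(0, 8))     # flip one random bit
    fit = np.array([between_class_variance(hist, int(t)) for t in pop])
    return int(pop[np.argmax(fit)])                        # best threshold found
```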
Feature extraction is a core technology of image classification and recognition. The invention proposes a feature extraction algorithm based on the fusion of LBP texture and HOG gradient features to improve the accuracy of classification and recognition. The main idea of the fusion algorithm is: firstly, LBP is used to extract the texture features of the image and obtain its texture information; secondly, multilayer HOG features of the image are extracted to obtain its contour information; finally, the two kinds of features are fused as the final feature information of the image. The specific algorithm steps are as follows:
Step 1: HOG features are extracted from the image n times, giving n HOG feature maps denoted HOG(n), n = 1, 2, 3. Three layers of HOG features are extracted: the first-layer HOG feature is extracted from the original image, the HOG feature is then extracted again from the first-layer feature image, and so on, so that the HOG features are extracted three times;
Step 2: each HOG(n) (n = 1, 2, 3) feature image is divided in the same way into uniform, non-overlapping sub-block images of equal size;
Step 3: the HOG histogram features of all sub-block images are calculated, and the histogram features of all sub-blocks are concatenated in turn to form the feature vector of each layer's HOG(n) (n = 1, 2, 3) feature image, i.e. the hierarchical HOGi (i = 1, 2, 3) features are obtained;
Step 4: LBP features are extracted from the image to obtain its texture information, the feature extraction process being shown in fig. 1;
Step 5: the texture features are serially fused with each of the HOGi hierarchical features; the effect is shown in fig. 12.
As can be seen from fig. 12, both the multilayer LBP features and the multilayer HOG features contain image information. In fig. 12, (a) shows the texture structure of the whole image well, with prominent texture and contour information; the texture information in (b) is not as clear as in (a), but the effective information is still displayed well; the information in (c) is the weakest; (d) clearly shows the edge information of the image; and (e) and (f) display edge information such as the image contour well. Each feature layer therefore clearly displays different information. Hence, to extract the same effective information in less time, the method of fusing the LBP features with the hierarchical HOG features is proposed here for feature extraction; the effect of the feature extraction is shown in fig. 13.
The design flow chart of the fusion algorithm is shown in fig. 6; a minimal code sketch of the fusion is given below.
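The following Python sketch illustrates the LBP + hierarchical HOG fusion (the LBP-HOG3 combination selected below), assuming scikit-image is available; the LBP neighbourhood (P = 8, R = 1) and the HOG cell and block sizes are illustrative assumptions, not values from the embodiment.

```python
# Minimal LBP + hierarchical HOG fusion sketch, assuming scikit-image.
import numpy as np
from skimage.feature import hog, local_binary_pattern

def lbp_histogram(image, P=8, R=1):
    # texture feature a: normalized histogram of uniform LBP codes
    lbp = local_binary_pattern(image, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def layered_hog(image, layers=3):
    # b1, b2, b3: HOG(1) on the original image, HOG(2) on the HOG(1) feature
    # image, HOG(3) on the HOG(2) feature image
    feats, current = [], image
    for _ in range(layers):
        vec, hog_img = hog(current, orientations=9, pixels_per_cell=(5, 5),
                           cells_per_block=(2, 2), visualize=True)
        feats.append(vec)
        current = hog_img
    return feats

def lbp_hog3(image):
    # serial fusion z3 = [a1, b3]: concatenate the texture feature with the
    # third-layer HOG feature (both are already normalized)
    a1 = lbp_histogram(image)
    b = layered_hog(image, layers=3)
    return np.concatenate([a1, b[2]])
```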
First, taking the letter B as an example, the texture feature of the letter B is extracted and denoted a, with base feature a1; the HOG features are extracted at the same time and denoted b, the three hierarchical HOG forms being extracted as the features b1, b2, b3. a1 is serially fused with b1, b2 and b3 respectively, denoted z1 = [a1, b1]; z2 = [a1, b2]; z3 = [a1, b3]. A new fusion function relation is established as:
F(x) = a1·a + bi·b (7)
where i = 1, 2, 3 and a1 + bi = 1, a1 and bi acting as weighting coefficients. All letters are processed according to this fusion principle and all feature fusion combinations are recorded. The three fusion combinations are classified separately, the accuracies are sorted in descending order, and the combination with the best accuracy is selected for feature extraction. The accuracy and test time of the fused features under support vector machine (SVM) classification are shown in Table 1.
TABLE 1 accuracy and test time of fusion features under SVM classification
Fusion feature | Average accuracy | Average test time
LBP-HOG1 | 84.75% | 17.925 s
LBP-HOG2 | 87% | 15.225 s
LBP-HOG3 | 92% | 12.375 s
As can be seen from Table 1, the average accuracy of LBP-HOG1 is 84.75% with an average time of 17.925 s; the average accuracy of LBP-HOG2 is 87% with an average time of 15.225 s; and the average accuracy of LBP-HOG3 is 92% with an average time of 12.375 s. LBP-HOG3 is thus the best fusion combination, and the LBP-HOG3 algorithm is selected for feature extraction in the invention. When features are extracted with LBP-HOG3 they have already been normalized to [0, 1], so the resulting fused features are also normalized and no further normalization is required; the fused features can be used directly for the classification and recognition of this system.
Particle swarm optimization (PSO) is used to optimize the gamma and C parameters of the support vector machine (SVM) algorithm to obtain the optimal SVM classification model. The prediction precision is directly influenced by changes in the values of the parameters gamma and C, so the recognition accuracy is used as the fitness function of the PSO to continuously optimize gamma and C until the fitness value reaches its maximum. The initial particle swarm population is set to 20, the number of evolution generations to 20 and the learning factors to c1 = c2 = 10; the SVM parameter C varies over the range 10^-1 to 10^2 and gamma over the range 10^-2 to 10^3. The algorithm comprises the following steps:
step 1: taking character feature data of the handwriting character as input of the SVM;
step 2: initializing a penalty factor C of the SVM to be 34.7321, and setting a kernel function parameter gamma to be 4.5175;
and step 3: the position and velocity of the population are initialized. The accuracy rate calculated by the SVM algorithm is used as a fitness function of the particles;
and 4, step 4: updating each individual particle by using a PSO algorithm, and calculating the fitness value of the newly generated particle;
and 5: and judging whether the individual extreme value of the current particle is the global optimal solution of the population. If so, taking the extreme value of the particle as a global optimal solution; if not, returning to the step 4;
step 6: and training the training samples by the SVM by adopting the optimized parameters to generate a classification model, and testing by using the test set.
PSO is adopted to optimize the SVM, with the initial population set to 20, the number of evolution generations 20 and c1 = c2 = 10; the optimized penalty factor is C = 34.7321 and the kernel function parameter gamma = 4.5175. A flow chart of the parameter optimization process is shown in fig. 7, and a minimal sketch of the optimization loop is given below.
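The following Python sketch illustrates the PSO search over (C, gamma) in steps 1-6, assuming scikit-learn; swarm size 20 and 20 iterations follow the values quoted above, while the inertia weight, the search in log10 space and the 3-fold cross-validation fitness are illustrative assumptions (c1 = c2 = 2 is used as a common textbook default here, whereas the embodiment quotes c1 = c2 = 10).

```python
# Minimal PSO-SVM parameter search sketch, assuming scikit-learn.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def pso_svm(X, y, n_particles=20, n_iter=20, c1=2.0, c2=2.0, w=0.7):
    rng = np.random.default_rng()
    # positions in log10 space: C in [1e-1, 1e2], gamma in [1e-2, 1e3]
    lo, hi = np.array([-1.0, -2.0]), np.array([2.0, 3.0])
    pos = rng.uniform(lo, hi, size=(n_particles, 2))
    vel = np.zeros_like(pos)

    def fitness(p):
        C, gamma = 10.0 ** p
        clf = SVC(C=C, gamma=gamma, kernel="rbf")
        return cross_val_score(clf, X, y, cv=3).mean()   # accuracy as fitness

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[np.argmax(pbest_fit)].copy()
    for _ in range(n_iter):
        r1 = rng.random((n_particles, 2))
        r2 = rng.random((n_particles, 2))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)                 # keep particles in range
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        gbest = pbest[np.argmax(pbest_fit)].copy()       # global best position
    C, gamma = 10.0 ** gbest
    return SVC(C=C, gamma=gamma, kernel="rbf").fit(X, y)  # final classification model
```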
The test results show that the character recognition accuracy of the method is 94.375%. Taking the character b as an example, 5 persons were selected, each performing 20 experiments on each classifier, and the average was compared with the unoptimized SVM algorithm. The comparative results are shown in Table 2.
Table 2 experimental algorithm comparison effect table
In the off-line training and testing process, 500 samples (250 uppercase and 250 lowercase characters) were selected as training samples and the remaining 160 samples (80 uppercase and 80 lowercase) as test samples. The test results show that the system recognizes English characters accurately in both uppercase and lowercase writing modes. Statistically, the recognition rate of uppercase characters is generally higher than that of lowercase characters, and the recognition rate of non-similar characters is higher than that of similar characters: non-similar characters such as A, X, Y and Z reach a recognition accuracy of 96%, while similar characters such as o, p, q, a and e reach only 90%. The recognition comparison of similar and non-similar characters is shown in Table 3, the comparison of uppercase and lowercase recognition under different algorithms in Table 4, and the recognition comparison for different testers in Table 5.
TABLE 3 comparison table for identifying similar and dissimilar characters under different algorithms
TABLE 4 comparison table of upper and lower case character recognition rates under different algorithms
TABLE 5 comparison table of character recognition rates under different testers
As can be seen from Table 5, without considering the influence of other factors, the average recognition accuracy of English characters with the PSO-SVM algorithm is higher than with the traditional BP neural network and SVM algorithms: the BP algorithm has the lowest average accuracy, 84.0%; the SVM method is intermediate, at 88.8%; and the PSO-SVM method is the highest, reaching 93.8%.
To further verify the invention, the system uses Windows 10 as the software platform and Python 2.7 with the IDLE Python (GUI) + NumPy function libraries as the development environment, so the gesture recognition can be observed more intuitively and used more conveniently. The Gui class completes the design of the main interface; its INit function defines the layout of the main interface, comprising a main window, a track window, two buttons and a result window. The STRain function calls the training function and reads the training samples under the letter folder for training; the reset function resets the array that stores the 24 track points and empties the track-point storage; the cb function stores the track filled with 24 points into a test image and obtains the final recognition result; the play function is a timer response function: each time the camera reads a frame it is displayed in real time on the main interface (written with PyQt), the frame is simultaneously examined to check whether the blue point has moved, the number of track points in the fixed-length array of 24 is returned, and when 24 points have been read the image is saved and then recognized.
The Video class completes the reading of the camera and processes the corresponding frames. At initialization it defines how many points are used to store a track, here 24, self.pts being the store of length 24. The writeFrame function returns three results: the first is the current frame with the bounding box and track drawn on it; the second is the square image background, white in color, the square being as large as the green box in the first image; and the third is how many points are currently stored in the track. The setImage function converts the current frame into an image that can be displayed on the PyQt interface. Fig. 8 is a block diagram of the aerial virtual writing system, fig. 9 the interactive interface of the aerial handwriting recognition system of the invention, fig. 10 the results of uppercase and lowercase letter recognition, and fig. 11 the results of similar-letter recognition.
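As an illustration of the fixed-length track storage described above, the following minimal Python sketch mirrors the reset/append/full behaviour of the 24-point buffer; the class and method names are hypothetical and are not taken from the system's source code.

```python
# Minimal sketch of the fixed-length track storage; names are hypothetical.
class TrackBuffer:
    MAX_POINTS = 24                      # a track consists of 24 points

    def __init__(self):
        self.pts = []                    # stores up to 24 (x, y) track points

    def reset(self):
        self.pts = []                    # empty the track-point storage

    def add(self, point):
        if not self.full():
            self.pts.append(point)
        return self.full()               # True signals: save the picture and recognize

    def full(self):
        return len(self.pts) >= self.MAX_POINTS
```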
In summary, the present invention is distinguished from the prior art by:
1. There is no need to wear sensor gloves or to segment the finger to serve as a writing pen; only the writing track of a specific color needs to be captured. The method is efficient, saves cost and is convenient to carry, and it does not need to judge the start and end of writing.
2. The threshold of the maximum between-class variance method (Otsu) is optimized using the genetic algorithm to obtain the optimal threshold, so that the segmented image information is clearer and a more ideal binarization effect is obtained.
3. In feature extraction, LBP features can extract effective texture information but cannot describe the edge and direction information of an image well; HOG features describe the edge information of an image from its local shape changes but cannot describe texture information well. The LBP-HOG fusion feature algorithm proposed by the invention makes the two complementary, effectively representing the edge information of the image while enhancing its texture detail information.
The technical solution of the present invention is not limited to the limitations of the above specific embodiments, and all technical modifications made according to the technical solution of the present invention fall within the protection scope of the present invention.

Claims (5)

1. A recognition method of an aerial handwriting recognition system based on machine vision, characterized by comprising the following steps:
S1, handwritten character video input and preprocessing: an aerial handwritten character video of a specific color is captured by a camera; the first detection of the specific color gives the first track point, and acquisition ends when the number of track points reaches 24; the track points are connected with straight lines to form the character, which is simultaneously saved as a picture for judgment and recognition; finally the saved character picture is normalized to a size of 50 x 50, stored in png format and sent to the computer, where the preprocessing part performs the preprocessing operations of filtering, gray-level binarization and morphology;
S2, character segmentation: for a preprocessed image I(x, y) of m × n arbitrarily selected pixels, a threshold t is preset; the proportion of pixels in the foreground range is ω1, with mean μ1; the proportion of pixels in the background range is ω2, with mean μ2; the overall mean of the image I(x, y) is μ and the between-class variance is g_th; the number of pixels with gray value greater than the threshold is n2 and the number below the threshold is n1, where the n2 pixels are foreground points and the n1 pixels are background points:
ω1 = n2/(m × n) (1)
ω2 = n1/(m × n) (2)
n1 + n2 = m × n, ω1 + ω2 = 1, μ = ω1·μ1 + ω2·μ2, g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)²; differentiating g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² and setting the derivative equal to zero gives the optimal threshold t taken by the method; taking any character image as a sample, the optimal threshold obtained by the Otsu method is t = 230; a genetic algorithm is used to optimize Otsu so as to segment the characters in the image;
S3, feature extraction: firstly, LBP (local binary patterns) is used to extract the texture features of the image and obtain its texture information; secondly, multilayer HOG features of the image are extracted to obtain its contour information; finally, the two kinds of features are fused as the final feature information of the image;
S4, classification and recognition: particle swarm optimization (PSO) is used to optimize the gamma and C parameters of the support vector machine (SVM) algorithm to obtain an optimal SVM classification model; since the prediction precision is directly influenced by changes of the gamma and C parameters, the recognition accuracy is used as the fitness function of the PSO to continuously optimize gamma and C so that the fitness value is maximized; the initial particle swarm population is set to 20, the number of evolution generations to 20 and the learning factors to c1 = c2 = 10; the SVM parameter C varies over the range 10^-1 to 10^2 and gamma over the range 10^-2 to 10^3, and the final recognition result is output.
2. The recognition method of the machine-vision-based aerial handwriting recognition system of claim 1, characterized in that the optimization of Otsu by the genetic algorithm in S2 comprises the following steps:
S21, chromosome coding: in image segmentation, selection, crossover and mutation are realized by a binary coding method; since the images to be extracted are uniformly stored as 50 x 50 8-bit gray-scale maps, an 8-bit binary code between 00000000 and 11111111 can represent a segmentation threshold; the chromosome string is 10 bits long, of which the first 8 bits form the logic pattern and the last 2 bits represent the threshold value and the fitness value;
S22, setting the population size M, where M = 20-100;
S23, determining a fitness function according to the following criteria: the value of the fitness function is not less than zero; during optimization the objective function changes, but its direction of change must be the same as the direction of change of the fitness function in the population evolution; g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² is selected as the fitness function, and since the problem is a maximization problem and the objective is non-negative, the fitness function is the objective function itself;
S24, determining the genetic control parameters: single-point crossover is adopted for chromosome crossover, with crossover probability P = 0.6, and the mutation operator uses the following strategy: the evolution process is divided into early, middle and late stages; a smaller mutation probability P_m1 = 0.05 is used in the early stage, so that mutated forms are protected and diversity is preserved; a larger probability P_m2 = 0.1 is selected in the middle stage, when the algorithm enters the convergence phase; a still larger probability P_m3 = 0.3 is selected in the late stage; the segmentation process has two further important parameters, the generation gap G and the termination generation number T, where G represents the proportion of individuals updated in each generation, and G = 0.4 and T = 50 are taken;
S25, executing the genetic algorithm: M individuals X1 to XM are generated arbitrarily between 0 and 255; for Otsu, they are first encoded as 8-bit binary codes to form the initial population; after decoding, the fitness value of each individual in the population is calculated; finally selection, crossover and mutation are performed, and it is judged whether the preset termination criterion, namely the preset number of evolution generations, has been reached; when the loop ends, the individual with the largest fitness value is recorded as the selected threshold for image segmentation.
3. The recognition method of the machine-vision-based aerial handwriting recognition system of claim 1, characterized in that the specific steps of S3 are:
S31, HOG features are extracted from the image n times, giving n HOG feature maps denoted HOG(n), n = 1, 2, 3;
S32, each HOG(n) (n = 1, 2, 3) feature image is divided in the same way into symmetrical, non-overlapping sub-block images of equal size;
S33, the HOG histogram features of all sub-block images are calculated, and the histogram features of all sub-blocks are concatenated in turn to form the feature vector of each layer's HOG(n) (n = 1, 2, 3) feature image, i.e. the hierarchical HOGi (i = 1, 2, 3) features are obtained;
S34, texture features are extracted from the image to obtain its texture information;
S35, the texture features are serially fused with each of the HOGi hierarchical features.
4. The recognition method of the machine-vision-based aerial handwriting recognition system of claim 1, characterized in that: the specific steps of S4 are as follows:
s41, taking the character feature data of the handwritten character as the input of the SVM;
s42, initializing a kernel function parameter gamma and a penalty factor C of the SVM;
s43, initializing the position and the speed of the population, and taking the accuracy rate calculated by an SVM algorithm as a fitness function of the particles;
s44, updating each individual particle by using a PSO algorithm, and calculating the fitness value of the newly generated particle;
s45, judging whether the individual extreme value of the current particle is the global optimal solution of the population or not, and if so, taking the extreme value of the particle as the global optimal solution; if not, returning to the step S44;
and S46, training the training samples by the SVM by adopting the optimized parameters to generate a classification model, and testing by using the test set.
5. An aerial handwriting recognition system based on machine vision, which executes the method according to claim 1, is characterized by comprising a handwriting character video input part, a preprocessing part, a character segmentation part, a feature extraction part and a classification recognition part which are installed in a computer, wherein the handwriting character video input part comprises a camera, and the camera is used for acquiring the input aerial handwriting character video with specific colors in real time and generating track pictures from the acquired track points; the preprocessing part is used for preprocessing the track picture generated by the track points by filtering, gray level binarization and morphology; the character segmentation part is used for converting the RGB color space of the track picture generated by the preprocessed track points into HSV and segmenting characters in the HSV by using a GA-Otsu segmentation algorithm; the characteristic extraction part is used for extracting the characteristics of the divided characters; and the classification recognition part is used for optimizing the support vector machine algorithm SVM by using the extracted features through a Particle Swarm Optimization (PSO) algorithm to obtain an optimal SVM classification model and outputting recognition results of all characters.
CN201810620085.1A 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method Active CN109033954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810620085.1A CN109033954B (en) 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810620085.1A CN109033954B (en) 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method

Publications (2)

Publication Number Publication Date
CN109033954A CN109033954A (en) 2018-12-18
CN109033954B true CN109033954B (en) 2022-02-08

Family

ID=64609783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810620085.1A Active CN109033954B (en) 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method

Country Status (1)

Country Link
CN (1) CN109033954B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741302B (en) * 2018-12-20 2021-04-30 江南大学 SD card form recognition system and method based on machine vision
CN109858501A (en) * 2019-02-20 2019-06-07 云南农业大学 A kind of two phase flow pattern feature extracting method
CN110006907A (en) * 2019-04-10 2019-07-12 清华大学深圳研究生院 A kind of die casting detection method of surface flaw and system based on machine vision
CN110619274A (en) * 2019-08-14 2019-12-27 深圳壹账通智能科技有限公司 Identity verification method and device based on seal and signature and computer equipment
CN110751082B (en) * 2019-10-17 2023-12-12 烟台艾易新能源有限公司 Gesture instruction recognition method for intelligent home entertainment system
CN111340033B (en) * 2020-03-17 2023-05-02 北京工业大学 Secondary identification method for easily-mixed characters
CN111476158B (en) * 2020-04-07 2020-12-04 金陵科技学院 Multi-channel physiological signal somatosensory gesture recognition method based on PSO-PCA-SVM
CN111913585A (en) * 2020-09-21 2020-11-10 北京百度网讯科技有限公司 Gesture recognition method, device, equipment and storage medium
CN113128372A (en) * 2021-04-02 2021-07-16 西安融智芙科技有限责任公司 Blackhead identification method and device based on image processing and terminal equipment
CN113157644B (en) * 2021-04-14 2023-01-13 上海创建达一智能科技有限公司 Real-time evaluation system for note effect of class and meeting based on paper handwriting
CN113239761B (en) * 2021-04-29 2023-11-14 广州杰赛科技股份有限公司 Face recognition method, device and storage medium
CN113378648A (en) * 2021-05-19 2021-09-10 上海可深信息科技有限公司 Artificial intelligence port and wharf monitoring method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5058182A (en) * 1988-05-02 1991-10-15 The Research Foundation Of State Univ. Of New York Method and apparatus for handwritten character recognition
CN101937286A (en) * 2009-06-29 2011-01-05 比亚迪股份有限公司 Light pen track identification system and method
CN104992192A (en) * 2015-05-12 2015-10-21 浙江工商大学 Visual motion tracking telekinetic handwriting system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5058182A (en) * 1988-05-02 1991-10-15 The Research Foundation Of State Univ. Of New York Method and apparatus for handwritten character recognition
CN101937286A (en) * 2009-06-29 2011-01-05 比亚迪股份有限公司 Light pen track identification system and method
CN104992192A (en) * 2015-05-12 2015-10-21 浙江工商大学 Visual motion tracking telekinetic handwriting system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Handwriting Recognition in Free Space Using WIMU-Based Hand Motion Analysis;Shashidhar Patil,等;《Journal of Sensors》;20161230;全文 *
Image threshold segmentation and quantitative recognition based on the GA-Otsu method; Zhao Fuqun, et al.; Journal of Jilin University (Engineering and Technology Edition); 2017-05-15 (No. 03); pp. 1-6 *
Research on the application of handwritten digit recognition based on PCA and PSO-SVM; Zhang Xiaofei, et al.; Journal of Chongqing University of Technology (Natural Science); 2017-07-15 (No. 07); pp. 1-5 *
Prototype design of an aerial writing system based on monocular vision; Yu Yan, et al.; Software Guide; 2012-07-30 (No. 07); pp. 1-3 *

Also Published As

Publication number Publication date
CN109033954A (en) 2018-12-18

Similar Documents

Publication Publication Date Title
CN109033954B (en) Machine vision-based aerial handwriting recognition system and method
CN110619369B (en) Fine-grained image classification method based on feature pyramid and global average pooling
CN108334848B (en) Tiny face recognition method based on generation countermeasure network
CN106599854B (en) Automatic facial expression recognition method based on multi-feature fusion
CN110929593B (en) Real-time significance pedestrian detection method based on detail discrimination
CN108171196B (en) Face detection method and device
CN110263712B (en) Coarse and fine pedestrian detection method based on region candidates
Wang et al. Small-object detection based on yolo and dense block via image super-resolution
CN112784736B (en) Character interaction behavior recognition method based on multi-modal feature fusion
CN110555481A (en) Portrait style identification method and device and computer readable storage medium
CN110298297A (en) Flame identification method and device
CN108280421B (en) Human behavior recognition method based on multi-feature depth motion map
CN114758288A (en) Power distribution network engineering safety control detection method and device
CN106650617A (en) Pedestrian abnormity identification method based on probabilistic latent semantic analysis
CN111881731A (en) Behavior recognition method, system, device and medium based on human skeleton
CN108734200B (en) Human target visual detection method and device based on BING (building information network) features
CN113435319B (en) Classification method combining multi-target tracking and pedestrian angle recognition
CN112036260A (en) Expression recognition method and system for multi-scale sub-block aggregation in natural environment
CN110263868A (en) Image classification network based on SuperPoint feature
Bappy et al. Real estate image classification
CN113011253A (en) Face expression recognition method, device, equipment and storage medium based on ResNeXt network
CN113297956B (en) Gesture recognition method and system based on vision
CN111582057B (en) Face verification method based on local receptive field
CN111080754A (en) Character animation production method and device for connecting characteristic points of head and limbs
Huang et al. Multifeature selection for 3D human action recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant