CN109033954B - Machine vision-based aerial handwriting recognition system and method - Google Patents

Machine vision-based aerial handwriting recognition system and method

Info

Publication number
CN109033954B
CN109033954B · CN201810620085.1A
Authority
CN
China
Prior art keywords
image
algorithm
svm
track
characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810620085.1A
Other languages
Chinese (zh)
Other versions
CN109033954A (en)
Inventor
汪梅
王博馨
孙敏
牛钦
翟珂
王刚
张佳楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Science and Technology
Original Assignee
Xian University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Science and Technology
Priority to CN201810620085.1A priority Critical patent/CN109033954B/en
Publication of CN109033954A publication Critical patent/CN109033954A/en
Application granted granted Critical
Publication of CN109033954B publication Critical patent/CN109033954B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes

Abstract

The invention discloses an aerial handwriting recognition system and method based on machine vision. A handwritten character video input part acquires, in real time, an input aerial handwritten character video of a specific color and generates a track picture from the collected track points; a preprocessing part preprocesses the track picture by filtering, gray-level binarization and morphology; a character segmentation part converts the track picture from the RGB color space to HSV and segments the characters with a GA-Otsu segmentation algorithm; a feature extraction part extracts features from the segmented characters; and a classification recognition part uses the extracted features to optimize a support vector machine (SVM) with a particle swarm optimization (PSO) algorithm, obtains an optimal SVM classification model and outputs the recognition result of each character. The invention only needs to capture the writing track of a specific color, is efficient, saves cost and is convenient to carry, and it does not need to judge the start, stop and end of writing.

Description

Machine vision-based aerial handwriting recognition system and method
Technical Field
The invention relates to the technical field of machine vision recognition, in particular to an aerial handwriting recognition system and method based on machine vision.
Background
Air handwriting is a novel, comfortable and natural means of human-machine interaction. Unlike traditional interaction modes, it allows a user to write in the air in a natural, unconstrained way, providing a more intuitive, convenient and comfortable interaction experience. The whole air-handwriting process mainly involves two technologies: dynamic target capture and handwriting recognition. As a novel interaction mode, air handwriting opens a new era of human-machine interaction and is expected to play an important role in future interaction applications. At present, air handwriting recognition is mostly realized in two ways: (1) based on an acceleration sensor, the air-handwriting motion data are collected and analyzed, feature vectors are then extracted from the acquired data, and a pattern recognition algorithm is used for classification and recognition; (2) based on a computer camera, the writing gesture is captured with a specific algorithm, and the specific content formed by the resulting motion track is then analyzed and recognized. However, the following common problems remain in the practical application of these methods:
1. When writing in the air, the start and end states of writing need to be distinguished. If a pause occurs during writing, excessive noise data are produced, so a truly comfortable, cheap and natural human-machine interaction mode cannot be realized.
2. The requirements on the environmental scene are generally strict: the user must wear a specific data glove or sensing equipment and be equipped with a position-tracking locator, so such systems are expensive, of low practicability and difficult to popularize.
3. Non-contact designs require real-time monitoring of a specific target and therefore have certain limitations.
4. Air writing is carried out in three-dimensional space without a plane for support, unlike a writing pad. The overlapping of strokes and the imbalance of character proportions therefore greatly increase the difficulty of recognition.
5. The accuracy and speed of recognition cannot meet people's needs.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a machine vision-based aerial handwriting recognition system and method.
In order to achieve this purpose, the invention is implemented according to the following technical scheme:
An aerial handwriting recognition system based on machine vision is composed of a handwritten character video input part and a preprocessing part, a character segmentation part, a feature extraction part and a classification recognition part installed in a computer. The handwritten character video input part acquires, in real time, an input aerial handwritten character video of a specific color and generates a track picture from the collected track points; the preprocessing part preprocesses the track picture by filtering, gray-level binarization and morphology; the character segmentation part converts the track picture from the RGB color space to HSV and segments the characters with a GA-Otsu segmentation algorithm; the feature extraction part extracts features from the segmented characters; and the classification recognition part uses the extracted features to optimize the support vector machine algorithm SVM with the particle swarm optimization (PSO) algorithm, obtains an optimal SVM classification model and outputs the recognition result of each character.
In addition, the invention also provides an aerial handwriting recognition method based on machine vision, which comprises the following steps:
S1, handwritten character video input and preprocessing: an aerial handwritten character video of a specific color is captured by a camera; the first detection of the specific color gives the first track point, and acquisition ends when the number of track points reaches 24; the track points are connected with straight lines to form the character, which is simultaneously saved as a picture for judgment and recognition; finally the saved character picture is normalized to a size of 50 x 50, stored in png format and sent to the computer, where the operation processing part performs the preprocessing operations of filtering, gray-level binarization and morphology;
S2, character segmentation: for a preprocessed image I(x, y) of m × n arbitrarily selected pixels, a threshold t is preset; the proportion of pixels in the foreground range is ω1, with mean μ1; the proportion of pixels in the background range is ω2, with mean μ2; the overall mean of the image I(x, y) is μ and the between-class variance is g_th; the number of pixels with gray value greater than the threshold is n2 and the number below the threshold is n1, where the n2 pixels are foreground points and the n1 pixels are background points:
ω1 = n2/(m × n), ω2 = n1/(m × n),
n1 + n2 = m × n, ω1 + ω2 = 1, μ = ω1·μ1 + ω2·μ2, g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)²; differentiating g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² and setting the derivative equal to zero gives the optimal threshold t taken by the method; taking any character image as a sample, the optimal threshold obtained by the Otsu method is t = 230; a genetic algorithm is used to optimize Otsu so as to segment the characters in the image;
S3, feature extraction: firstly, LBP (local binary patterns) is used to extract the texture features of the image and obtain its texture information; secondly, multilayer HOG features of the image are extracted to obtain its contour information; finally, the two kinds of features are fused as the final feature information of the image;
S4, classification and recognition: the particle swarm optimization (PSO) algorithm is used to optimize the gamma and C parameters of the support vector machine algorithm SVM to obtain an optimal SVM classification model; since the prediction precision is directly influenced by changes of the gamma and C parameters, the recognition accuracy is used as the fitness function of the PSO to continuously optimize gamma and C so that the fitness value is maximized; the initial particle swarm population is set to 20, the number of evolution generations to 20 and the learning factors to c1 = c2 = 10; the SVM parameter C varies over the range 10^-1 to 10^2 and gamma over the range 10^-2 to 10^3, and the recognition result is finally output.
Further, the optimization of Otsu by the genetic algorithm in S2 comprises the following steps:
S21, chromosome coding: in image segmentation, selection, crossover and mutation are realized by a binary coding method; since the images to be extracted are uniformly stored as 50 x 50 8-bit gray-scale maps, an 8-bit binary code between 00000000 and 11111111 can represent a segmentation threshold; the chromosome string is 10 bits long, of which the first 8 bits form the logic pattern and the last 2 bits represent the threshold value and the fitness value;
S22, setting the population size M, where M = 20-100;
S23, determining a fitness function according to the following criteria: the value of the fitness function is not less than zero; during optimization the objective function changes, but its direction of change must be the same as the direction of change of the fitness function in the population evolution; g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² is selected as the fitness function, and since the problem is a maximization problem and the objective is non-negative, the fitness function is the objective function itself;
S24, determining the genetic control parameters: single-point crossover is adopted for chromosome crossover, with crossover probability P = 0.6, and the mutation operator uses the following strategy: the evolution process is divided into early, middle and late stages; a smaller mutation probability P_m1 = 0.05 is used in the early stage, so that mutated forms are protected and diversity is preserved; a larger probability P_m2 = 0.1 is selected in the middle stage, when the algorithm enters the convergence phase; a still larger probability P_m3 = 0.3 is selected in the late stage; the segmentation process has two further important parameters, the generation gap G and the termination generation number T, where G represents the proportion of individuals updated in each generation, and G = 0.4 and T = 50 are taken;
S25, executing the genetic algorithm: M individuals X1 to XM are generated arbitrarily between 0 and 255; for Otsu, they are first encoded as 8-bit binary codes to form the initial population; after decoding, the fitness value of each individual in the population is calculated; finally selection, crossover and mutation are performed, and it is judged whether the preset termination criterion, namely the preset number of evolution generations, has been reached; when the loop ends, the individual with the largest fitness value is recorded as the selected threshold for image segmentation.
Further, the specific steps of S3 are:
S31, HOG features are extracted from the image n times, giving n HOG feature maps denoted HOG(n), n = 1, 2, 3;
S32, each HOG(n) (n = 1, 2, 3) feature image is divided in the same way into symmetrical, non-overlapping sub-block images of equal size;
S33, the HOG histogram features of all sub-block images are calculated, and the histogram features of all sub-blocks are concatenated in turn to form the feature vector of each layer's HOG(n) (n = 1, 2, 3) feature image, i.e. the hierarchical HOGi (i = 1, 2, 3) features are obtained;
S34, texture features are extracted from the image to obtain its texture information;
S35, the texture features are serially fused with each of the HOGi hierarchical features.
Further, the specific step of S4 is:
s41, taking the character feature data of the handwritten character as the input of the SVM;
s42, initializing a kernel function parameter gamma and a penalty factor C of the SVM;
s43, initializing the position and the speed of the population, and taking the accuracy rate calculated by an SVM algorithm as a fitness function of the particles;
s44, updating each individual particle by using a PSO algorithm, and calculating the fitness value of the newly generated particle;
s45, judging whether the individual extreme value of the current particle is the global optimal solution of the population or not, and if so, taking the extreme value of the particle as the global optimal solution; if not, returning to the step S44;
and S46, training the training samples by the SVM by adopting the optimized parameters to generate a classification model, and testing by using the test set.
Compared with the prior art, the invention has the beneficial effects that:
the invention realizes the identification of the handwritten characters in the air by combining the common camera and the specific color target without judging the start and the end of writing, thereby solving the two problems of quick positioning of a 'control hand' and the start and the end of writing.
The invention provides an Otsu threshold optimization algorithm, namely a Ga-Otsu algorithm, based on a traditional Otsu segmentation algorithm. And continuously optimizing the Otsu segmentation threshold value through a genetic algorithm to obtain the threshold value with the best segmentation effect. The experimental comparison result verifies the feasibility of the improved algorithm in image segmentation.
The invention provides a serial fusion algorithm based on LBP characteristics and HOG hierarchical characteristics, which not only effectively shows the edge information of an image, but also can enhance the texture detail information of the image. The experimental result shows that the average correct rate of the LBP-HOG3 feature fusion algorithm is the highest and is 92%; the average time was the shortest, 12.375 s.
The invention adopts a particle swarm optimization algorithm to carry out parameter optimization on the traditional SVM classification algorithm. Experimental results show that the method can accelerate the network learning speed and improve the character classification and recognition accuracy. The classifier accuracy designed based on the method reaches 93.8%, is improved by 5% compared with the classifier designed based on the traditional SVM algorithm, and is improved by 9.8% compared with the classifier based on the BP neural network.
The invention has strong practicability, can be integrated into various cross-platform household appliances to control the appliances, and can even be developed into an intelligent household system.
The system has strong expandability and can write any characters such as numbers, characters and the like in the air in an expandable way.
Drawings
FIG. 1 is a block diagram of the system architecture of the present invention.
Fig. 2 is a schematic diagram of character image acquisition according to an embodiment of the present invention.
FIG. 3 is a sample of a partial character image collected according to an embodiment of the present invention.
FIG. 4 is a flowchart of genetic algorithm optimization Otsu according to an embodiment of the present invention.
FIG. 5 is a comparison graph of the segmentation effect optimized by the genetic algorithm and the effect before optimization according to the embodiment of the present invention.
FIG. 6 is a flowchart of the fused algorithm design according to the embodiment of the present invention.
FIG. 7 is a flow chart of an algorithm for optimizing SVM parameters by PSO according to an embodiment of the present invention.
Fig. 8 is a block diagram of an aerial handwriting recognition system according to an embodiment of the present invention.
Fig. 9 is an interactive interface of an aerial handwriting recognition system according to an embodiment of the present invention.
FIG. 10 is a diagram of the recognition results of upper and lower case letters according to the embodiment of the present invention: (a) the result of capital letter identification; (b) recognition results of lower case letters.
FIG. 11 is a diagram of recognition results of similar letters according to an embodiment of the present invention: (a) a recognition result of one of the similar letters; (b) and the recognition result of another similar letter.
Fig. 12 is a process diagram of image extraction by using 3-layer LBP and 3-layer HOG features according to the embodiment of the present invention.
Fig. 13 is a schematic flow chart illustrating fusion of 3-layer LBP and 3-layer HOG features according to an embodiment of the present invention.
Detailed Description
The present invention will be further described with reference to specific examples, which are illustrative of the invention and are not to be construed as limiting the invention.
As shown in fig. 1, the machine vision-based aerial handwriting recognition system of this embodiment is composed of a handwritten character video input part and a preprocessing part 2, a character segmentation part 3, a feature extraction part 4 and a classification recognition part 5 installed in a computer. The handwritten character video input part includes a camera 1, which acquires, in real time, an input aerial handwritten character video of a specific color and generates a track picture from the collected track points; the preprocessing part 2 preprocesses the track picture by filtering, gray-level binarization and morphology; the character segmentation part 3 converts the track picture from the RGB color space to HSV and segments the characters with the GA-Otsu segmentation algorithm; the feature extraction part 4 extracts features from the segmented characters; and the classification recognition part 5 uses the extracted features to optimize the support vector machine algorithm SVM with particle swarm optimization PSO, obtains an optimal SVM classification model and outputs the recognition result of each character.
When the machine vision-based aerial handwriting recognition system of this embodiment is used for machine vision-based aerial handwriting recognition, the method specifically comprises the following steps:
Video input and preprocessing of handwritten characters: an aerial handwritten character video of a specific color is captured by the camera; the first detection of the specific color gives the first track point, and acquisition ends when the number of track points reaches 24; the track points are connected with straight lines to form the character, which is simultaneously saved as a picture for judgment and recognition; finally the saved character picture is normalized to a size of 50 x 50, stored in png format and sent to the computer for the preprocessing operations of filtering, gray-level binarization and morphology. A schematic diagram of character image acquisition is shown in fig. 2; a sample of collected character images is shown in fig. 3. A minimal sketch of this acquisition step is given below.
Since the threshold of the maximum between-class variance method (Otsu) cannot by itself reach an ideal binarization result, the invention optimizes the threshold using a genetic algorithm (GA), which can effectively find the global optimum over the variable space and thus solves the difficulty of obtaining the variable value in image segmentation. An image I(x, y) of m × n pixels is chosen arbitrarily and a threshold t is preset. The proportion of pixels in the foreground range is ω1, with mean μ1; the proportion of pixels in the background range is ω2, with mean μ2. The overall mean of the image I(x, y) is μ and the between-class variance is g_th. The number of pixels with gray value greater than the threshold is n2 and the number below the threshold is n1, where the n2 pixels are foreground points and the n1 pixels are background points.
ω1 = n2/(m × n) (1)
ω2 = n1/(m × n) (2)
n1 + n2 = m × n (3)
ω1 + ω2 = 1 (4)
μ = ω1·μ1 + ω2·μ2 (5)
g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² (6)
Differentiating formula (6) and setting the derivative equal to zero gives the optimal threshold t taken by the method. Taking any character image as a sample, the optimal threshold obtained by the Otsu method is t = 230. The genetic algorithm is used to optimize Otsu; the specific implementation process is as follows:
Step 1: chromosome coding. In image segmentation, selection, crossover and mutation are realized by a binary coding method. Since the images to be extracted are uniformly saved as 50 x 50 8-bit gray-scale images, an 8-bit binary code between 00000000 and 11111111 can represent a segmentation threshold. The chromosome string is 10 bits long: the first 8 bits form the logic pattern and the last 2 bits represent the threshold value and the fitness value. The parameters are coded as follows:
logic pattern (e.g. 011……001) | real value of threshold | fitness of threshold
Step 2: setting the population size M. When M is very small, the running time of the GA drops greatly, but population diversity is cut down; this drawback occasionally causes premature convergence of the GA, which greatly reduces the segmentation quality of the image region. When M is very large, the efficiency of the algorithm is low and the running time increases. Generally M = 20-100 is chosen; after repeated tests, pop = {a1, a2, …, a20} is used, i.e. M = 20 is selected in the invention.
Step 3: determining the fitness function. This function is usually derived from the objective function and must satisfy the criterion that the value of the fitness function is not less than zero. The formula is defined as:
Fit(f(x)) = f(x) - F_min, if f(x) > F_min; Fit(f(x)) = 0 otherwise,
where F_min is a specified input value, or the minimum value of f(x) over all generations so far or over the latest K generations.
During optimization the objective function changes, but its direction of change must be the same as the direction of change of the fitness function in the population evolution. The final objective function to be solved is the between-class variance function of Otsu, formula (6) above. Since the problem is a maximization problem and the objective is non-negative, the fitness function is taken directly as the objective function.
Step 4: determining the genetic control parameters. In the image segmentation algorithm, individuals are selected by the commonly used roulette-wheel method. The invention adopts single-point crossover for chromosome crossover, with crossover probability P = 0.6.
The mutation operator uses the following strategy: the evolution process is divided into early, middle and late stages. A smaller mutation probability P_m1 = 0.05 is used in the early stage, so that mutated forms are protected and diversity is preserved. A larger probability P_m2 = 0.1 is selected in the middle stage, when the algorithm enters the convergence phase. A still larger probability P_m3 = 0.3 is selected in the late stage, which improves the local search capability at this stage. The segmentation process has two further important parameters, the generation gap G and the termination generation number T, where G represents the proportion of individuals updated in each generation; G = 0.4 and T = 50 are taken.
Step 5: executing the genetic algorithm. After the population size is determined, M individuals X1 to XM are generated arbitrarily between 0 and 255, since the target image is an 8-bit bmp image. For Otsu they are first encoded in binary form as 8-bit binary codes, forming the initial population. After decoding, the fitness value of each individual in the population is calculated. Finally selection, crossover and mutation are performed, and it is judged whether the preset termination criterion has been reached (i.e. whether the preset number of evolution generations has been reached). When the loop ends, the individual with the largest fitness value is recorded as the selected threshold for image segmentation.
The flow chart of genetic algorithm optimization of Otsu is shown in fig. 4, and the segmentation effect after optimization is compared with the effect before optimization in fig. 5. A Python sketch of the GA-Otsu search follows.
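The following sketch illustrates the GA-Otsu search of steps 1-5, using the quoted population size 20, crossover probability 0.6, staged mutation probabilities 0.05/0.1/0.3 and termination generation T = 50, with the between-class variance g_th of formula (6) as fitness. Roulette selection and single-point crossover on the 8-bit codes are implemented in simplified form (elitism and the generation gap G are omitted), so this is a sketch of the procedure rather than the embodiment's exact program.

```python
# Minimal GA-Otsu sketch: fitness is the between-class variance of formula (6).
import numpy as np

def between_class_variance(hist, t):
    # g_th = w1*(mu1 - mu)**2 + w2*(mu2 - mu)**2; the value is symmetric in the
    # two classes, so either class may be labelled foreground
    total = hist.sum()
    levels = np.arange(256)
    w1 = hist[:t].sum() / total
    w2 = 1.0 - w1
    if w1 == 0.0 or w2 == 0.0:
        return 0.0
    mu1 = (levels[:t] * hist[:t]).sum() / hist[:t].sum()
    mu2 = (levels[t:] * hist[t:]).sum() / hist[t:].sum()
    mu = w1 * mu1 + w2 * mu2
    return w1 * (mu1 - mu) ** 2 + w2 * (mu2 - mu) ** 2

def ga_otsu(image, pop_size=20, generations=50, p_cross=0.6):
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    rng = np.random.default_rng()
    pop = rng.integers(0, 256, size=pop_size)              # 8-bit individuals
    for gen in range(generations):
        fit = np.array([between_class_variance(hist, int(t)) for t in pop])
        # roulette-wheel selection
        probs = fit / fit.sum() if fit.sum() > 0 else np.full(pop_size, 1.0 / pop_size)
        pop = rng.choice(pop, size=pop_size, p=probs)
        # single-point crossover on the 8-bit binary codes
        for i in range(0, pop_size - 1, 2):
            if rng.random() < p_cross:
                mask = (1 << int(rng.integers(1, 8))) - 1
                a, b = pop[i], pop[i + 1]
                pop[i], pop[i + 1] = (a & ~mask) | (b & mask), (b & ~mask) | (a & mask)
        # staged mutation: P_m1 = 0.05 early, P_m2 = 0.1 middle, P_m3 = 0.3 late
        p_mut = 0.05 if gen < generations // 3 else (0.1 if gen < 2 * generations // 3 else 0.3)
        for i in range(pop_size):
            if rng.random() < p_mut:
                pop[i] ^= 1 << int(rng.integers(0, 8))     # flip one random bit
    fit = np.array([between_class_variance(hist, int(t)) for t in pop])
    return int(pop[np.argmax(fit)])                        # best threshold found
```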
Feature extraction is a core technology of image classification and recognition. The invention proposes a feature extraction algorithm based on the fusion of LBP texture and HOG gradient features to improve the accuracy of classification and recognition. The main idea of the fusion algorithm is: firstly, LBP is used to extract the texture features of the image and obtain its texture information; secondly, multilayer HOG features of the image are extracted to obtain its contour information; finally, the two kinds of features are fused as the final feature information of the image. The specific algorithm steps are as follows:
Step 1: HOG features are extracted from the image n times, giving n HOG feature maps denoted HOG(n), n = 1, 2, 3. Three layers of HOG features are extracted: the first-layer HOG feature is extracted from the original image, the HOG feature is then extracted again from the first-layer feature image, and so on, so that the HOG features are extracted three times;
Step 2: each HOG(n) (n = 1, 2, 3) feature image is divided in the same way into uniform, non-overlapping sub-block images of equal size;
Step 3: the HOG histogram features of all sub-block images are calculated, and the histogram features of all sub-blocks are concatenated in turn to form the feature vector of each layer's HOG(n) (n = 1, 2, 3) feature image, i.e. the hierarchical HOGi (i = 1, 2, 3) features are obtained;
Step 4: LBP features are extracted from the image to obtain its texture information, the feature extraction process being shown in fig. 1;
Step 5: the texture features are serially fused with each of the HOGi hierarchical features; the effect is shown in fig. 12.
As can be seen from fig. 12, both the multilayer LBP features and the multilayer HOG features contain image information. In fig. 12, (a) shows the texture structure of the whole image well, with prominent texture and contour information; the texture information in (b) is not as clear as in (a), but the effective information is still displayed well; the information in (c) is the weakest; (d) clearly shows the edge information of the image; and (e) and (f) display edge information such as the image contour well. Each feature layer therefore clearly displays different information. Hence, to extract the same effective information in less time, the method of fusing the LBP features with the hierarchical HOG features is proposed here for feature extraction; the effect of the feature extraction is shown in fig. 13.
The design flow chart of the fusion algorithm is shown in fig. 6; a minimal code sketch of the fusion is given below.
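The following Python sketch illustrates the LBP + hierarchical HOG fusion (the LBP-HOG3 combination selected below), assuming scikit-image is available; the LBP neighbourhood (P = 8, R = 1) and the HOG cell and block sizes are illustrative assumptions, not values from the embodiment.

```python
# Minimal LBP + hierarchical HOG fusion sketch, assuming scikit-image.
import numpy as np
from skimage.feature import hog, local_binary_pattern

def lbp_histogram(image, P=8, R=1):
    # texture feature a: normalized histogram of uniform LBP codes
    lbp = local_binary_pattern(image, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def layered_hog(image, layers=3):
    # b1, b2, b3: HOG(1) on the original image, HOG(2) on the HOG(1) feature
    # image, HOG(3) on the HOG(2) feature image
    feats, current = [], image
    for _ in range(layers):
        vec, hog_img = hog(current, orientations=9, pixels_per_cell=(5, 5),
                           cells_per_block=(2, 2), visualize=True)
        feats.append(vec)
        current = hog_img
    return feats

def lbp_hog3(image):
    # serial fusion z3 = [a1, b3]: concatenate the texture feature with the
    # third-layer HOG feature (both are already normalized)
    a1 = lbp_histogram(image)
    b = layered_hog(image, layers=3)
    return np.concatenate([a1, b[2]])
```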
First, taking the letter B as an example, the texture feature of the letter B is extracted and denoted a, with base feature a1; the HOG features are extracted at the same time and denoted b, the three hierarchical HOG forms being extracted as the features b1, b2, b3. a1 is serially fused with b1, b2 and b3 respectively, denoted z1 = [a1, b1]; z2 = [a1, b2]; z3 = [a1, b3]. A new fusion function relation is established as:
F(x) = a1·a + bi·b (7)
where i = 1, 2, 3 and a1 + bi = 1, a1 and bi acting as weighting coefficients. All letters are processed according to this fusion principle and all feature fusion combinations are recorded. The three fusion combinations are classified separately, the accuracies are sorted in descending order, and the combination with the best accuracy is selected for feature extraction. The accuracy and test time of the fused features under support vector machine (SVM) classification are shown in Table 1.
TABLE 1 accuracy and test time of fusion features under SVM classification
Fusion feature | Average accuracy | Average test time
LBP-HOG1 | 84.75% | 17.925 s
LBP-HOG2 | 87% | 15.225 s
LBP-HOG3 | 92% | 12.375 s
As can be seen from Table 1, the average accuracy of LBP-HOG1 is 84.75% with an average time of 17.925 s; the average accuracy of LBP-HOG2 is 87% with an average time of 15.225 s; and the average accuracy of LBP-HOG3 is 92% with an average time of 12.375 s. LBP-HOG3 is thus the best fusion combination, and the LBP-HOG3 algorithm is selected for feature extraction in the invention. When features are extracted with LBP-HOG3 they have already been normalized to [0, 1], so the resulting fused features are also normalized and no further normalization is required; the fused features can be used directly for the classification and recognition of this system.
Particle swarm optimization (PSO) is used to optimize the gamma and C parameters of the support vector machine (SVM) algorithm to obtain the optimal SVM classification model. The prediction precision is directly influenced by changes in the values of the parameters gamma and C, so the recognition accuracy is used as the fitness function of the PSO to continuously optimize gamma and C until the fitness value reaches its maximum. The initial particle swarm population is set to 20, the number of evolution generations to 20 and the learning factors to c1 = c2 = 10; the SVM parameter C varies over the range 10^-1 to 10^2 and gamma over the range 10^-2 to 10^3. The algorithm comprises the following steps:
step 1: taking character feature data of the handwriting character as input of the SVM;
step 2: initializing a penalty factor C of the SVM to be 34.7321, and setting a kernel function parameter gamma to be 4.5175;
and step 3: the position and velocity of the population are initialized. The accuracy rate calculated by the SVM algorithm is used as a fitness function of the particles;
and 4, step 4: updating each individual particle by using a PSO algorithm, and calculating the fitness value of the newly generated particle;
and 5: and judging whether the individual extreme value of the current particle is the global optimal solution of the population. If so, taking the extreme value of the particle as a global optimal solution; if not, returning to the step 4;
step 6: and training the training samples by the SVM by adopting the optimized parameters to generate a classification model, and testing by using the test set.
PSO is adopted to optimize the SVM, with the initial population set to 20, the number of evolution generations 20 and c1 = c2 = 10; the optimized penalty factor is C = 34.7321 and the kernel function parameter gamma = 4.5175. A flow chart of the parameter optimization process is shown in fig. 7, and a minimal sketch of the optimization loop is given below.
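The following Python sketch illustrates the PSO search over (C, gamma) in steps 1-6, assuming scikit-learn; swarm size 20 and 20 iterations follow the values quoted above, while the inertia weight, the search in log10 space and the 3-fold cross-validation fitness are illustrative assumptions (c1 = c2 = 2 is used as a common textbook default here, whereas the embodiment quotes c1 = c2 = 10).

```python
# Minimal PSO-SVM parameter search sketch, assuming scikit-learn.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def pso_svm(X, y, n_particles=20, n_iter=20, c1=2.0, c2=2.0, w=0.7):
    rng = np.random.default_rng()
    # positions in log10 space: C in [1e-1, 1e2], gamma in [1e-2, 1e3]
    lo, hi = np.array([-1.0, -2.0]), np.array([2.0, 3.0])
    pos = rng.uniform(lo, hi, size=(n_particles, 2))
    vel = np.zeros_like(pos)

    def fitness(p):
        C, gamma = 10.0 ** p
        clf = SVC(C=C, gamma=gamma, kernel="rbf")
        return cross_val_score(clf, X, y, cv=3).mean()   # accuracy as fitness

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[np.argmax(pbest_fit)].copy()
    for _ in range(n_iter):
        r1 = rng.random((n_particles, 2))
        r2 = rng.random((n_particles, 2))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)                 # keep particles in range
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        gbest = pbest[np.argmax(pbest_fit)].copy()       # global best position
    C, gamma = 10.0 ** gbest
    return SVC(C=C, gamma=gamma, kernel="rbf").fit(X, y)  # final classification model
```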
The test results show that the character recognition accuracy of the method is 94.375%. Taking the character b as an example, 5 persons were selected, each performing 20 experiments on each classifier, and the average was compared with the unoptimized SVM algorithm. The comparative results are shown in Table 2.
Table 2 experimental algorithm comparison effect table
In the off-line training and testing process, 500 samples (250 uppercase and 250 lowercase characters) were selected as training samples and the remaining 160 samples (80 uppercase and 80 lowercase) as test samples. The test results show that the system recognizes English characters accurately in both uppercase and lowercase writing modes. Statistically, the recognition rate of uppercase characters is generally higher than that of lowercase characters, and the recognition rate of non-similar characters is higher than that of similar characters: non-similar characters such as A, X, Y and Z reach a recognition accuracy of 96%, while similar characters such as o, p, q, a and e reach only 90%. The recognition comparison of similar and non-similar characters is shown in Table 3, the comparison of uppercase and lowercase recognition under different algorithms in Table 4, and the recognition comparison for different testers in Table 5.
TABLE 3 comparison table for identifying similar and dissimilar characters under different algorithms
TABLE 4 comparison table of upper and lower case character recognition rates under different algorithms
TABLE 5 comparison table of character recognition rates under different testers
As can be seen from Table 5, without considering the influence of other factors, the average recognition accuracy of English characters with the PSO-SVM algorithm is higher than with the traditional BP neural network and SVM algorithms: the BP algorithm has the lowest average accuracy, 84.0%; the SVM method is intermediate, at 88.8%; and the PSO-SVM method is the highest, reaching 93.8%.
To further verify the invention, the system uses Windows 10 as the software platform and Python 2.7 with the IDLE Python (GUI) + NumPy function libraries as the development environment, so the gesture recognition can be observed more intuitively and used more conveniently. The Gui class completes the design of the main interface; its INit function defines the layout of the main interface, comprising a main window, a track window, two buttons and a result window. The STRain function calls the training function and reads the training samples under the letter folder for training; the reset function resets the array that stores the 24 track points and empties the track-point storage; the cb function stores the track filled with 24 points into a test image and obtains the final recognition result; the play function is a timer response function: each time the camera reads a frame it is displayed in real time on the main interface (written with PyQt), the frame is simultaneously examined to check whether the blue point has moved, the number of track points in the fixed-length array of 24 is returned, and when 24 points have been read the image is saved and then recognized.
The Video class completes the reading of the camera and processes the corresponding frames. At initialization it defines how many points are used to store a track, here 24, self.pts being the store of length 24. The writeFrame function returns three results: the first is the current frame with the bounding box and track drawn on it; the second is the square image background, white in color, the square being as large as the green box in the first image; and the third is how many points are currently stored in the track. The setImage function converts the current frame into an image that can be displayed on the PyQt interface. Fig. 8 is a block diagram of the aerial virtual writing system, fig. 9 the interactive interface of the aerial handwriting recognition system of the invention, fig. 10 the results of uppercase and lowercase letter recognition, and fig. 11 the results of similar-letter recognition.
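As an illustration of the fixed-length track storage described above, the following minimal Python sketch mirrors the reset/append/full behaviour of the 24-point buffer; the class and method names are hypothetical and are not taken from the system's source code.

```python
# Minimal sketch of the fixed-length track storage; names are hypothetical.
class TrackBuffer:
    MAX_POINTS = 24                      # a track consists of 24 points

    def __init__(self):
        self.pts = []                    # stores up to 24 (x, y) track points

    def reset(self):
        self.pts = []                    # empty the track-point storage

    def add(self, point):
        if not self.full():
            self.pts.append(point)
        return self.full()               # True signals: save the picture and recognize

    def full(self):
        return len(self.pts) >= self.MAX_POINTS
```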
In summary, the present invention is distinguished from the prior art by:
1. There is no need to wear sensor gloves or to segment the finger to serve as a writing pen; only the writing track of a specific color needs to be captured. The method is efficient, saves cost and is convenient to carry, and it does not need to judge the start and end of writing.
2. The threshold of the maximum between-class variance method (Otsu) is optimized using the genetic algorithm to obtain the optimal threshold, so that the segmented image information is clearer and a more ideal binarization effect is obtained.
3. In feature extraction, LBP features can extract effective texture information but cannot describe the edge and direction information of an image well; HOG features describe the edge information of an image from its local shape changes but cannot describe texture information well. The LBP-HOG fusion feature algorithm proposed by the invention makes the two complementary, effectively representing the edge information of the image while enhancing its texture detail information.
The technical solution of the present invention is not limited to the limitations of the above specific embodiments, and all technical modifications made according to the technical solution of the present invention fall within the protection scope of the present invention.

Claims (5)

1. A recognition method of an aerial handwriting recognition system based on machine vision, characterized by comprising the following steps:
S1, handwritten character video input and preprocessing: an aerial handwritten character video of a specific color is captured by a camera; the first detection of the specific color gives the first track point, and acquisition ends when the number of track points reaches 24; the track points are connected with straight lines to form the character, which is simultaneously saved as a picture for judgment and recognition; finally the saved character picture is normalized to a size of 50 x 50, stored in png format and sent to the computer, where the preprocessing part performs the preprocessing operations of filtering, gray-level binarization and morphology;
S2, character segmentation: for a preprocessed image I(x, y) of m × n arbitrarily selected pixels, a threshold t is preset; the proportion of pixels in the foreground range is ω1, with mean μ1; the proportion of pixels in the background range is ω2, with mean μ2; the overall mean of the image I(x, y) is μ and the between-class variance is g_th; the number of pixels with gray value greater than the threshold is n2 and the number below the threshold is n1, where the n2 pixels are foreground points and the n1 pixels are background points:
ω1 = n2/(m × n) (1)
ω2 = n1/(m × n) (2)
n1 + n2 = m × n, ω1 + ω2 = 1, μ = ω1·μ1 + ω2·μ2, g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)²; differentiating g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² and setting the derivative equal to zero gives the optimal threshold t taken by the method; taking any character image as a sample, the optimal threshold obtained by the Otsu method is t = 230; a genetic algorithm is used to optimize Otsu so as to segment the characters in the image;
S3, feature extraction: firstly, LBP (local binary patterns) is used to extract the texture features of the image and obtain its texture information; secondly, multilayer HOG features of the image are extracted to obtain its contour information; finally, the two kinds of features are fused as the final feature information of the image;
S4, classification and recognition: particle swarm optimization (PSO) is used to optimize the gamma and C parameters of the support vector machine (SVM) algorithm to obtain an optimal SVM classification model; since the prediction precision is directly influenced by changes of the gamma and C parameters, the recognition accuracy is used as the fitness function of the PSO to continuously optimize gamma and C so that the fitness value is maximized; the initial particle swarm population is set to 20, the number of evolution generations to 20 and the learning factors to c1 = c2 = 10; the SVM parameter C varies over the range 10^-1 to 10^2 and gamma over the range 10^-2 to 10^3, and the final recognition result is output.
2. The recognition method of the machine-vision-based aerial handwriting recognition system of claim 1, characterized in that the optimization of Otsu by the genetic algorithm in S2 comprises the following steps:
S21, chromosome coding: in image segmentation, selection, crossover and mutation are realized by a binary coding method; since the images to be extracted are uniformly stored as 50 x 50 8-bit gray-scale maps, an 8-bit binary code between 00000000 and 11111111 can represent a segmentation threshold; the chromosome string is 10 bits long, of which the first 8 bits form the logic pattern and the last 2 bits represent the threshold value and the fitness value;
S22, setting the population size M, where M = 20-100;
S23, determining a fitness function according to the following criteria: the value of the fitness function is not less than zero; during optimization the objective function changes, but its direction of change must be the same as the direction of change of the fitness function in the population evolution; g_th = ω1(μ1 - μ)² + ω2(μ2 - μ)² is selected as the fitness function, and since the problem is a maximization problem and the objective is non-negative, the fitness function is the objective function itself;
S24, determining the genetic control parameters: single-point crossover is adopted for chromosome crossover, with crossover probability P = 0.6, and the mutation operator uses the following strategy: the evolution process is divided into early, middle and late stages; a smaller mutation probability P_m1 = 0.05 is used in the early stage, so that mutated forms are protected and diversity is preserved; a larger probability P_m2 = 0.1 is selected in the middle stage, when the algorithm enters the convergence phase; a still larger probability P_m3 = 0.3 is selected in the late stage; the segmentation process has two further important parameters, the generation gap G and the termination generation number T, where G represents the proportion of individuals updated in each generation, and G = 0.4 and T = 50 are taken;
S25, executing the genetic algorithm: M individuals X1 to XM are generated arbitrarily between 0 and 255; for Otsu, they are first encoded as 8-bit binary codes to form the initial population; after decoding, the fitness value of each individual in the population is calculated; finally selection, crossover and mutation are performed, and it is judged whether the preset termination criterion, namely the preset number of evolution generations, has been reached; when the loop ends, the individual with the largest fitness value is recorded as the selected threshold for image segmentation.
3. The recognition method of the machine-vision-based aerial handwriting recognition system of claim 1, characterized in that the specific steps of S3 are:
S31, HOG features are extracted from the image n times, giving n HOG feature maps denoted HOG(n), n = 1, 2, 3;
S32, each HOG(n) (n = 1, 2, 3) feature image is divided in the same way into symmetrical, non-overlapping sub-block images of equal size;
S33, the HOG histogram features of all sub-block images are calculated, and the histogram features of all sub-blocks are concatenated in turn to form the feature vector of each layer's HOG(n) (n = 1, 2, 3) feature image, i.e. the hierarchical HOGi (i = 1, 2, 3) features are obtained;
S34, texture features are extracted from the image to obtain its texture information;
S35, the texture features are serially fused with each of the HOGi hierarchical features.
4. The recognition method of the machine-vision-based aerial handwriting recognition system of claim 1, characterized in that: the specific steps of S4 are as follows:
s41, taking the character feature data of the handwritten character as the input of the SVM;
s42, initializing a kernel function parameter gamma and a penalty factor C of the SVM;
s43, initializing the position and the speed of the population, and taking the accuracy rate calculated by an SVM algorithm as a fitness function of the particles;
s44, updating each individual particle by using a PSO algorithm, and calculating the fitness value of the newly generated particle;
s45, judging whether the individual extreme value of the current particle is the global optimal solution of the population or not, and if so, taking the extreme value of the particle as the global optimal solution; if not, returning to the step S44;
and S46, training the training samples by the SVM by adopting the optimized parameters to generate a classification model, and testing by using the test set.
5. An aerial handwriting recognition system based on machine vision, which executes the method according to claim 1, is characterized by comprising a handwriting character video input part, a preprocessing part, a character segmentation part, a feature extraction part and a classification recognition part which are installed in a computer, wherein the handwriting character video input part comprises a camera, and the camera is used for acquiring the input aerial handwriting character video with specific colors in real time and generating track pictures from the acquired track points; the preprocessing part is used for preprocessing the track picture generated by the track points by filtering, gray level binarization and morphology; the character segmentation part is used for converting the RGB color space of the track picture generated by the preprocessed track points into HSV and segmenting characters in the HSV by using a GA-Otsu segmentation algorithm; the characteristic extraction part is used for extracting the characteristics of the divided characters; and the classification recognition part is used for optimizing the support vector machine algorithm SVM by using the extracted features through a Particle Swarm Optimization (PSO) algorithm to obtain an optimal SVM classification model and outputting recognition results of all characters.
CN201810620085.1A 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method Active CN109033954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810620085.1A CN109033954B (en) 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810620085.1A CN109033954B (en) 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method

Publications (2)

Publication Number Publication Date
CN109033954A CN109033954A (en) 2018-12-18
CN109033954B true CN109033954B (en) 2022-02-08

Family

ID=64609783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810620085.1A Active CN109033954B (en) 2018-06-15 2018-06-15 Machine vision-based aerial handwriting recognition system and method

Country Status (1)

Country Link
CN (1) CN109033954B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109741302B (en) * 2018-12-20 2021-04-30 江南大学 SD card form recognition system and method based on machine vision
CN109858501A (en) * 2019-02-20 2019-06-07 云南农业大学 A kind of two phase flow pattern feature extracting method
CN110006907A (en) * 2019-04-10 2019-07-12 清华大学深圳研究生院 A kind of die casting detection method of surface flaw and system based on machine vision
CN110619274A (en) * 2019-08-14 2019-12-27 深圳壹账通智能科技有限公司 Identity verification method and device based on seal and signature and computer equipment
CN110751082B (en) * 2019-10-17 2023-12-12 烟台艾易新能源有限公司 Gesture instruction recognition method for intelligent home entertainment system
CN111340033B (en) * 2020-03-17 2023-05-02 北京工业大学 Secondary identification method for easily-mixed characters
CN111476158B (en) * 2020-04-07 2020-12-04 金陵科技学院 Multi-channel physiological signal somatosensory gesture recognition method based on PSO-PCA-SVM
CN111913585A (en) * 2020-09-21 2020-11-10 北京百度网讯科技有限公司 Gesture recognition method, device, equipment and storage medium
CN113128372A (en) * 2021-04-02 2021-07-16 西安融智芙科技有限责任公司 Blackhead identification method and device based on image processing and terminal equipment
CN113157644B (en) * 2021-04-14 2023-01-13 上海创建达一智能科技有限公司 Real-time evaluation system for note effect of class and meeting based on paper handwriting
CN113239761B (en) * 2021-04-29 2023-11-14 广州杰赛科技股份有限公司 Face recognition method, device and storage medium
CN113378648A (en) * 2021-05-19 2021-09-10 上海可深信息科技有限公司 Artificial intelligence port and wharf monitoring method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5058182A (en) * 1988-05-02 1991-10-15 The Research Foundation Of State Univ. Of New York Method and apparatus for handwritten character recognition
CN101937286A (en) * 2009-06-29 2011-01-05 比亚迪股份有限公司 Light pen track identification system and method
CN104992192A (en) * 2015-05-12 2015-10-21 浙江工商大学 Visual motion tracking telekinetic handwriting system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5058182A (en) * 1988-05-02 1991-10-15 The Research Foundation Of State Univ. Of New York Method and apparatus for handwritten character recognition
CN101937286A (en) * 2009-06-29 2011-01-05 比亚迪股份有限公司 Light pen track identification system and method
CN104992192A (en) * 2015-05-12 2015-10-21 浙江工商大学 Visual motion tracking telekinetic handwriting system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Handwriting Recognition in Free Space Using WIMU-Based Hand Motion Analysis;Shashidhar Patil,等;《Journal of Sensors》;20161230;全文 *
Image threshold segmentation and quantitative recognition based on the GA-Otsu method; Zhao Fuqun, et al.; Journal of Jilin University (Engineering and Technology Edition); 2017-05-15 (No. 03); pp. 1-6 *
Research on the application of handwritten digit recognition based on PCA and PSO-SVM; Zhang Xiaofei, et al.; Journal of Chongqing University of Technology (Natural Science); 2017-07-15 (No. 07); pp. 1-5 *
Prototype design of an aerial writing system based on monocular vision; Yu Yan, et al.; Software Guide; 2012-07-30 (No. 07); pp. 1-3 *

Also Published As

Publication number Publication date
CN109033954A (en) 2018-12-18

Similar Documents

Publication Publication Date Title
CN109033954B (en) Machine vision-based aerial handwriting recognition system and method
CN110619369B (en) Fine-grained image classification method based on feature pyramid and global average pooling
CN108334848B (en) Tiny face recognition method based on generation countermeasure network
CN106599854B (en) Automatic facial expression recognition method based on multi-feature fusion
CN110929593B (en) Real-time significance pedestrian detection method based on detail discrimination
CN108171196B (en) Face detection method and device
CN110263712B (en) Coarse and fine pedestrian detection method based on region candidates
Wang et al. Small-object detection based on yolo and dense block via image super-resolution
CN112784736B (en) Character interaction behavior recognition method based on multi-modal feature fusion
CN110555481A (en) Portrait style identification method and device and computer readable storage medium
CN110298297A (en) Flame identification method and device
CN108280421B (en) Human behavior recognition method based on multi-feature depth motion map
CN114758288A (en) Power distribution network engineering safety control detection method and device
CN106650617A (en) Pedestrian abnormity identification method based on probabilistic latent semantic analysis
CN111881731A (en) Behavior recognition method, system, device and medium based on human skeleton
CN108734200B (en) Human target visual detection method and device based on BING (building information network) features
CN113435319B (en) Classification method combining multi-target tracking and pedestrian angle recognition
CN112036260A (en) Expression recognition method and system for multi-scale sub-block aggregation in natural environment
CN110263868A (en) Image classification network based on SuperPoint feature
Bappy et al. Real estate image classification
CN113011253A (en) Face expression recognition method, device, equipment and storage medium based on ResNeXt network
CN113297956B (en) Gesture recognition method and system based on vision
CN111582057B (en) Face verification method based on local receptive field
CN111080754A (en) Character animation production method and device for connecting characteristic points of head and limbs
Huang et al. Multifeature selection for 3D human action recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant