CN111382598B - Identification method and device and electronic equipment - Google Patents


Info

Publication number
CN111382598B
CN111382598B (application CN201811615801.3A)
Authority
CN
China
Prior art keywords
gesture
writing
hand
video image
image
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811615801.3A
Other languages
Chinese (zh)
Other versions
CN111382598A (en)
Inventor
辛晓哲
秦波
李瑞楠
孙博
王帅
黄海兵
李斌
陈伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201811615801.3A
Publication of CN111382598A
Application granted
Publication of CN111382598B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G06V40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An embodiment of the invention provides an identification method, an identification device and an electronic device. The identification method includes: entering a writing state, and acquiring a first hand video image from the video images captured by a camera; performing image processing on the first hand video image to acquire fingertip position information of a writing finger, and displaying a writing track on a display screen according to the fingertip position information; and after the writing state ends, recognizing the writing track according to the fingertip position information to obtain and display corresponding candidate information. The writing track is thus recognized from fingertip position information determined by image processing, without using depth information, so recognition efficiency can be improved. In addition, because no depth information is needed for recognition, no depth camera is needed to capture the video images, which reduces the recognition cost of mid-air handwriting.

Description

Identification method and device and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular to an identification method and apparatus, and an electronic device.
Background
Characters are a commonly used tool for transmitting and exchanging information and play an important role in human-computer interaction systems. Widely used character input modes currently include the keyboard, the touch screen, the handwriting pad and the like; with these, the user must touch the handwriting device in order to write. In some situations, however, such as while driving, when noise causes speech recognition to fail, touch-based handwriting is particularly inconvenient; mid-air handwriting has therefore emerged.
At present, mid-air handwriting recognition is based on a depth camera, which has two drawbacks: first, a depth camera must be used for image acquisition, which is costly; second, depth information is required in the actual recognition process, so recognition efficiency is low.
Disclosure of Invention
The embodiment of the invention provides an identification method, which is used for reducing the recognition cost of mid-air handwriting and improving recognition efficiency.
Correspondingly, the embodiment of the invention also provides an identification device and electronic equipment, which are used for guaranteeing the implementation and application of the method.
In order to solve the above problems, an embodiment of the present invention discloses an identification method, which specifically includes: entering a writing state, and acquiring a first hand video image from a video image acquired by a camera; performing image processing on the first hand video image to acquire fingertip position information of a writing finger and displaying a writing track on a display screen according to the fingertip position information; and after the writing state is finished, identifying the writing track according to the fingertip position information, and obtaining and displaying the corresponding candidate information.
Optionally, the acquiring a first hand video image from the video image acquired by the camera includes: and acquiring a video image acquired by a camera, and extracting an image of a handwriting area in the video image as a first hand video image.
Optionally, performing image processing on the first hand video image includes: preprocessing the first hand video image to obtain a first preprocessed image; judging, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture; and if the hand gesture in the first hand video image is a writing gesture, executing the step of acquiring the fingertip position information of the writing finger.
Optionally, the preprocessing the first hand video image to obtain a first preprocessed image includes: performing background subtraction on the first hand video image by adopting a Gaussian mixture model, and extracting the foreground from the first hand video image to obtain a first image; filtering the first image by adopting a skin color model to obtain a second image; performing morphological processing on the second image to obtain a third image; and performing binarization processing on the third image to obtain the first preprocessed image.
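For illustration, the preprocessing chain above can be sketched in Python. The patent specifies a Gaussian mixture model for background subtraction (e.g. OpenCV's MOG2 would normally play that role); the single-reference-frame differencing, the YCrCb skin-color thresholds and the 3x3 morphological opening below are simplified stand-ins, not values or algorithms taken from the patent:

```python
import numpy as np

def _shifted_stack(m):
    """All nine 3x3-neighbourhood shifts of a boolean mask."""
    p = np.pad(m, 1)
    h, w = m.shape
    return np.stack([p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)])

def _erode(m):   # 3x3 erosion: pixel kept only if its whole neighbourhood is set
    return _shifted_stack(m).all(axis=0)

def _dilate(m):  # 3x3 dilation: pixel set if any neighbour is set
    return _shifted_stack(m).any(axis=0)

def preprocess(frame_ycrcb, background_ycrcb, diff_thresh=30,
               cr_range=(133, 173), cb_range=(77, 127)):
    """Simplified sketch of the claimed preprocessing chain on a YCrCb frame."""
    # 1. Background subtraction -> foreground mask ("first image").
    diff = np.abs(frame_ycrcb.astype(np.int16)
                  - background_ycrcb.astype(np.int16)).max(axis=2)
    foreground = diff > diff_thresh

    # 2. Skin-color filtering on the Cr/Cb channels ("second image").
    cr, cb = frame_ycrcb[..., 1], frame_ycrcb[..., 2]
    skin = ((cr >= cr_range[0]) & (cr <= cr_range[1])
            & (cb >= cb_range[0]) & (cb <= cb_range[1]))
    mask = foreground & skin

    # 3. Morphological opening to remove speckle noise ("third image").
    mask = _dilate(_erode(mask))

    # 4. Binarization -> "first preprocessed image" (0 / 255).
    return (mask * 255).astype(np.uint8)
```

The result is a binary mask in which, ideally, only the hand region remains set, which is what the later contour and convex hull steps operate on.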
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: performing contour detection on the first preprocessed image, and determining the hand contour in the first preprocessed image; performing convex hull detection on the hand contour, and determining the number of protruding fingers and the fingertip position information of each protruding finger; and if the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture, determining that the hand gesture in the first hand video image is the writing gesture.
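A minimal sketch of the convex-hull route to fingertip detection follows; it assumes an upright hand (image y grows downward), and the `margin` threshold is purely illustrative. The patent does not prescribe a particular hull algorithm; Andrew's monotone chain is used here:

```python
def _cross(o, a, b):
    """z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone-chain convex hull over (x, y) pairs."""
    pts = sorted(map(tuple, points))
    def half_hull(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and _cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out
    lower, upper = half_hull(pts), half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]   # hull vertices, no duplicated endpoints

def fingertip_candidates(contour, margin=20):
    """Hull vertices well above the hand centroid are treated as
    protruding fingertips (image y grows downward)."""
    hull = convex_hull(contour)
    centroid_y = sum(p[1] for p in contour) / len(contour)
    return [p for p in hull if p[1] < centroid_y - margin]

def is_writing_gesture(contour, expected_tips=1):
    """Writing gesture := the number of protruding fingers matches the
    set number (one extended index finger)."""
    tips = fingertip_candidates(contour)
    return len(tips) == expected_tips, tips
```

The same call yields both the gesture decision and the fingertip positions, matching the claim in which the protruding finger's fingertip position is reused as the writing finger's fingertip position.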
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: inputting the first preprocessed image into a static gesture recognition model to obtain probability scores of all gestures, wherein the gestures at least comprise: start writing gesture, write gesture, select gesture, delete gesture, and other gestures; and if the probability score of the writing gesture is highest, determining that the gesture of the hand in the first hand video image is the writing gesture.
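The scoring flow of the static gesture recognition model can be sketched as follows; the linear `model` is a hypothetical stand-in for the trained recognition model, whose architecture the patent does not detail. Only the probability-score/argmax flow is taken from the text:

```python
import numpy as np

GESTURES = ["start_writing", "writing", "selection", "delete", "other"]

def classify_gesture(preprocessed, model):
    """Return the top gesture and a probability score per gesture.
    `model` maps a flattened binary image to one logit per gesture."""
    x = preprocessed.reshape(-1) / 255.0
    logits = model["W"] @ x + model["b"]
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    scores = exp / exp.sum()              # probability score per gesture
    best = int(np.argmax(scores))
    return GESTURES[best], dict(zip(GESTURES, scores))
```

If the writing gesture has the highest probability score, the hand gesture in the frame is taken to be the writing gesture.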
Optionally, the acquiring fingertip position information of the writing finger includes: the fingertip position information of the protruding finger is determined as fingertip position information of the writing finger.
Optionally, the method further comprises: acquiring a second hand video image from the video image acquired by the camera; judging whether the hand gesture in the second hand video image is a writing starting gesture or not; and if the hand gesture in the second hand video image is a gesture for starting writing, entering a writing state.
Optionally, the method further comprises: and if the hand gesture in the first hand video image is a selection gesture, determining that the writing state is ended.
Optionally, the candidate information includes a plurality of candidates, and after the presenting step, the method further includes: periodically polling each candidate of the current page, and acquiring a third hand video image from the video images captured by the camera; and when it is detected that the hand gesture in the third hand video image is the start-writing gesture, committing the currently polled candidate to the screen and entering the writing state again.
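The candidate-polling behaviour described above can be sketched as a small loop; advancing one candidate per observed frame is an assumption made for the sketch, since the patent only states that the polling is periodic:

```python
from itertools import cycle

def poll_candidates(candidates, gesture_stream):
    """Cycle through the candidates of the current page; when a
    start-writing gesture is observed, the candidate being polled at
    that moment is committed (put on screen) and the writing state is
    entered again."""
    polled = cycle(candidates)
    current = next(polled)
    for gesture in gesture_stream:
        if gesture == "start_writing":
            return current, "writing"   # committed candidate, new state
        current = next(polled)          # advance to the next candidate
    return None, "selecting"            # no selection was made
```

Because the polling wraps around with `cycle`, a user who misses the desired candidate simply waits for it to come around again before performing the start-writing gesture.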
Optionally, the method further comprises: if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has not been acquired since entering the writing state, deleting the last candidate committed to the screen.
Optionally, the method further comprises: if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state, clearing the writing track displayed on the display screen.
Optionally, the identifying the writing track according to the fingertip position information to obtain corresponding candidate information includes: inputting the fingertip position information into a handwriting engine so that the handwriting engine can recognize a writing track according to the fingertip position information to obtain candidate information and return the candidate information; and receiving candidate information returned by the handwriting engine.
The embodiment of the invention also discloses an identification device, which specifically comprises: the acquisition module is used for entering a writing state and acquiring a first hand video image from the video image acquired by the camera; the image processing module is used for performing image processing on the first hand video image, acquiring fingertip position information of a writing finger and displaying a writing track on a display screen according to the fingertip position information; and the identification module is used for identifying the writing track according to the fingertip position information after the writing state is ended, so as to obtain and display the corresponding candidate information.
Optionally, the acquiring module is configured to acquire a video image acquired by the camera, and extract an image of a handwriting area in the video image as the first hand video image.
Optionally, the image processing module includes: a preprocessing sub-module, used for preprocessing the first hand video image to obtain a first preprocessed image; a gesture judging sub-module, used for judging, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture; and a position acquisition sub-module, used for acquiring the fingertip position information of the writing finger if the hand gesture in the first hand video image is the writing gesture.
Optionally, the preprocessing sub-module is used for performing background subtraction on the first hand video image by adopting a Gaussian mixture model, and extracting the foreground from the first hand video image to obtain a first image; filtering the first image by adopting a skin color model to obtain a second image; performing morphological processing on the second image to obtain a third image; and performing binarization processing on the third image to obtain the first preprocessed image.
Optionally, the gesture judging sub-module includes: a first judging unit, used for performing contour detection on the first preprocessed image and determining the hand contour in the first preprocessed image; performing convex hull detection on the hand contour, and determining the number of protruding fingers and the fingertip position information of each protruding finger; and if the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture, determining that the hand gesture in the first hand video image is the writing gesture.
Optionally, the gesture determination submodule includes: the second judging unit is configured to input the first preprocessed image into a static gesture recognition model, and obtain probability scores of each gesture, where the gesture at least includes: start writing gesture, write gesture, select gesture, delete gesture, and other gestures; and if the probability score of the writing gesture is highest, determining that the gesture of the hand in the first hand video image is the writing gesture.
Optionally, the position acquisition sub-module is configured to determine fingertip position information of the protruding finger as fingertip position information of the writing finger.
Optionally, the apparatus further comprises: a writing-state start determining module, used for acquiring a second hand video image from the video image captured by the camera; judging whether the gesture corresponding to the hand in the second hand video image is a start-writing gesture; and if the gesture corresponding to the hand in the second hand video image is a start-writing gesture, entering the writing state.
Optionally, the apparatus further comprises: and the writing state ending determining module is used for determining that the writing state is ended if the hand gesture in the first hand video image is a selection gesture.
Optionally, the candidate information includes a plurality of candidates, and the apparatus further includes: a polling module, used for periodically polling each candidate of the current page and acquiring a third hand video image from the video images captured by the camera; and a commit module, used for committing the currently polled candidate to the screen and entering the writing state again when it is detected that the hand gesture in the third hand video image is the start-writing gesture.
Optionally, the apparatus further comprises: a first deleting module, used for deleting the last candidate committed to the screen if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has not been acquired since entering the writing state.
Optionally, the apparatus further comprises: and the second deleting module is used for clearing the writing track displayed on the display screen if the gesture in the first hand video image is a deleting gesture and the fingertip position information of the writing finger is acquired after the writing state is entered.
Optionally, the recognition module is configured to input the fingertip position information into a handwriting engine, so that the handwriting engine recognizes a writing track according to the fingertip position information, obtains candidate information, and returns the candidate information; and receiving candidate information returned by the handwriting engine.
The embodiment of the invention also discloses a readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the identification method according to any one of the embodiments of the invention.
The embodiment of the invention also discloses an electronic device, which comprises a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs comprise instructions for: entering a writing state, and acquiring a first hand video image from a video image acquired by a camera; performing image processing on the first hand video image to acquire fingertip position information of a writing finger and displaying a writing track on a display screen according to the fingertip position information; and after the writing state is finished, identifying the writing track according to the fingertip position information, and obtaining and displaying the corresponding candidate information.
Optionally, the acquiring a first hand video image from the video image acquired by the camera includes: and acquiring a video image acquired by a camera, and extracting an image of a handwriting area in the video image as a first hand video image.
Optionally, the performing image processing on the first hand video image includes: preprocessing the first hand video image to obtain a first preprocessed image; judging, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture; and if the hand gesture in the first hand video image is a writing gesture, executing the step of acquiring the fingertip position information of the writing finger.
Optionally, the preprocessing the first hand video image to obtain a first preprocessed image includes: performing background subtraction on the first hand video image by adopting a Gaussian mixture model, and extracting the foreground from the first hand video image to obtain a first image; filtering the first image by adopting a skin color model to obtain a second image; performing morphological processing on the second image to obtain a third image; and performing binarization processing on the third image to obtain the first preprocessed image.
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: performing contour detection on the first preprocessed image, and determining the hand contour in the first preprocessed image; performing convex hull detection on the hand contour, and determining the number of protruding fingers and the fingertip position information of each protruding finger; and if the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture, determining that the hand gesture in the first hand video image is the writing gesture.
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: inputting the first preprocessed image into a static gesture recognition model to obtain probability scores of all gestures, wherein the gestures at least comprise: start writing gesture, write gesture, select gesture, delete gesture, and other gestures; and if the probability score of the writing gesture is highest, determining that the gesture of the hand in the first hand video image is the writing gesture.
Optionally, the acquiring fingertip position information of the writing finger includes: the fingertip position information of the protruding finger is determined as fingertip position information of the writing finger.
Optionally, further comprising instructions for: acquiring a second hand video image from the video image acquired by the camera; judging whether the hand gesture in the second hand video image is a writing starting gesture or not; and if the hand gesture in the second hand video image is a gesture for starting writing, entering a writing state.
Optionally, further comprising instructions for: and if the hand gesture in the first hand video image is a selection gesture, determining that the writing state is ended.
Optionally, the candidate information includes a plurality of candidates, and after the presenting step, further includes instructions for: periodically polling each candidate of the current page, and acquiring a third hand video image from the video images captured by the camera; and when it is detected that the hand gesture in the third hand video image is the start-writing gesture, committing the currently polled candidate to the screen and entering the writing state again.
Optionally, further comprising instructions for: if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has not been acquired since entering the writing state, deleting the last candidate committed to the screen.
Optionally, further comprising instructions for: if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state, clearing the writing track displayed on the display screen.
Optionally, the identifying the writing track according to the fingertip position information to obtain corresponding candidate information includes: inputting the fingertip position information into a handwriting engine so that the handwriting engine can recognize a writing track according to the fingertip position information to obtain candidate information and return the candidate information; and receiving candidate information returned by the handwriting engine.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, upon determining that the writing state has been entered, a first hand video image can be acquired from the video images captured by the camera; image processing is then performed on the first hand video image to acquire fingertip position information of the writing finger, and a writing track is displayed on the display screen according to the fingertip position information; after the writing state ends, the writing track is recognized according to the fingertip position information, and the corresponding candidate information is obtained and displayed. The writing track is thus recognized from fingertip position information determined by image processing, without using depth information, so recognition efficiency can be improved. In addition, since the embodiment of the invention does not need depth information for recognition, no depth camera is required to capture the video images, which reduces the recognition cost of mid-air handwriting.
Drawings
FIG. 1 is a flow chart of steps of an embodiment of an identification method of the present invention;
FIG. 2a is a schematic diagram of a writing trace display interface according to an embodiment of the present invention;
FIG. 2b is a schematic diagram of a candidate information presentation interface according to an embodiment of the present invention;
FIG. 3a is a schematic diagram of a start writing gesture according to an embodiment of the present invention;
FIG. 3b is a schematic diagram of a writing gesture according to an embodiment of the present invention;
FIG. 3c is a schematic diagram of a selection gesture according to an embodiment of the present invention;
FIG. 3d is a schematic diagram of a delete gesture according to an embodiment of the present invention;
FIG. 4 is a flowchart of the steps of an alternate embodiment of an identification method of the present invention;
FIG. 5a is a schematic illustration of a handwriting area in a video image according to an embodiment of the invention;
FIG. 5b is a schematic illustration of a fourth image according to an embodiment of the invention;
FIG. 5c is a schematic illustration of a sixth image according to an embodiment of the invention;
FIG. 5d is a schematic illustration of a second preprocessed image according to an embodiment of the present invention;
FIG. 5e is a schematic illustration of a hand with contour and edge detection in accordance with an embodiment of the present invention;
FIG. 5f is a flowchart of method steps for determining whether a writing gesture is initiated according to an embodiment of the present invention;
FIG. 5g is a schematic diagram of a polling candidate according to an embodiment of the invention;
FIG. 5h is a schematic diagram of a screen-up candidate according to an embodiment of the present invention;
FIG. 6 is a block diagram of an embodiment of an identification device of the present invention;
FIG. 7 is a block diagram of an alternative embodiment of an identification device of the present invention;
FIG. 8 illustrates a block diagram of an electronic device for identification, according to an exemplary embodiment;
Fig. 9 is a schematic structural view of an electronic device for recognition according to another exemplary embodiment of the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
One of the core concepts of the embodiments of the invention is that, in the process of mid-air handwriting, a camera can be used to capture video images; fingertip position information of the writing finger is then obtained by performing image processing on the hand video images, and after the writing state ends, the writing track is recognized according to the obtained fingertip position information. Recognition therefore does not require depth information, which improves recognition efficiency. Correspondingly, no depth camera is required to capture the video images, which reduces the recognition cost of mid-air handwriting.
Referring to fig. 1, a flowchart of the steps of an embodiment of an identification method of the present invention is shown; the method may specifically include the following steps:
step 102, entering a writing state, and acquiring a first hand video image from video images acquired by a camera.
And 104, performing image processing on the first hand video image, acquiring fingertip position information of a writing finger, and displaying a writing track on a display screen according to the fingertip position information.
And 106, after the writing state is finished, identifying the writing track according to the fingertip position information, and obtaining and displaying the corresponding candidate information.
In the embodiment of the invention, the mid-air handwriting device can use a camera to capture video images of the user's mid-air handwriting, and the writing track of the mid-air handwriting is recognized by performing image processing on those video images. An ordinary camera (such as an RGB camera) rather than a depth camera can be used to capture the video images, which reduces the recognition cost of mid-air handwriting. In addition, the camera may be a camera built into the mid-air handwriting device or an external camera, which is not limited in the embodiment of the present invention. According to the embodiment of the invention, the mid-air handwriting device can process the captured video images in real time; that is, each time a frame of video is captured, the hand video image can be extracted from that frame, and image processing can then be performed on the hand video image.
In the process of performing mid-air handwriting, a user can inform the mid-air handwriting device that writing is to begin by performing the start-writing gesture, then write in mid-air using the writing gesture, and inform the device that writing is finished by performing the gesture that ends writing. After the user performs the start-writing gesture, the mid-air handwriting device can enter the writing state; then, while the user writes in mid-air with the writing gesture, a hand video image can be acquired from each video image captured by the camera. To distinguish the hand video images acquired while the user performs different gestures, the hand video images acquired from entering the writing state until the writing state ends may be called first hand video images. Image processing, such as background elimination, foreground extraction, hand contour detection and convex hull detection, can then be performed on the first hand video image to obtain the fingertip position information of the writing finger. After the fingertip position information is acquired, it can, on the one hand, be saved, so that the writing track can later be recognized from the saved fingertip position information; on the other hand, a mapping relationship can be looked up based on the fingertip position information to determine the corresponding display position on the display screen, and that display position can then be marked and connected with the previous display position, thereby displaying the corresponding writing track.
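The fingertip-to-display mapping and track drawing just described can be sketched as follows. The mirrored linear scaling is an assumed mapping, since the patent only says that a mapping relationship is looked up; the class and method names are illustrative:

```python
def to_display(tip_xy, cam_size, screen_size, mirror=True):
    """Map a fingertip position in camera coordinates to a display
    position. Mirroring the x axis is assumed so the on-screen track
    moves the same way the hand does."""
    (x, y), (cw, ch), (sw, sh) = tip_xy, cam_size, screen_size
    if mirror:
        x = cw - 1 - x
    return (x * sw // cw, y * sh // ch)

class WritingTrack:
    """Accumulates fingertip display positions; each new position is
    connected to the previous one, which renders the writing track."""
    def __init__(self):
        self.points = []      # saved for later recognition
        self.segments = []    # line segments shown on the display

    def add(self, display_xy):
        if self.points:
            self.segments.append((self.points[-1], display_xy))
        self.points.append(display_xy)
```

The saved `points` list corresponds to the recorded fingertip position information that is later handed to the recognition step, while `segments` corresponds to what is drawn on screen.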
When the user finishes writing, the selection gesture can be performed; the mid-air handwriting device can then determine that the writing state has ended, and the recorded fingertip position information can be used to recognize the writing track produced between entering and ending the writing state, for example by calling a handwriting engine to recognize the writing track, after which the corresponding candidate information is determined and displayed; the candidate information may include a plurality of candidates. The user can then select the desired candidate for committing to the screen by performing the corresponding gesture, thereby completing the mid-air handwriting input.
In one example of the present invention, the process by which a user writes the word "search" in mid-air is as follows: the user performs the start-writing gesture, and the mid-air handwriting device enters the writing state; the user then writes "search" using the writing gesture, and the mid-air handwriting device correspondingly acquires first hand video images from the video images captured by the camera; image processing is then performed on the first hand video images to acquire the fingertip position information of the writing finger, and the writing track is displayed on the display screen according to the fingertip position information, as shown in fig. 2a. After the user has written "search", the selection gesture is performed, and the mid-air handwriting device can determine that the writing state has ended; the writing track can then be recognized according to the fingertip position information, and the corresponding candidate information is obtained and displayed, as shown in fig. 2b.
In summary, in the embodiment of the invention, upon determining that the writing state has been entered, a first hand video image can be acquired from the video images captured by the camera; image processing is then performed on the first hand video image to acquire the fingertip position information of the writing finger, and a writing track is displayed on the display screen according to the fingertip position information; after the writing state ends, the writing track is recognized according to the fingertip position information, and the corresponding candidate information is obtained and displayed. The writing track is thus recognized from fingertip position information determined by image processing, without using depth information, so recognition efficiency can be improved. In addition, since the embodiment of the invention does not need depth information for recognition, it also does not need a depth camera to capture the video images, thereby reducing the recognition cost of mid-air handwriting.
In one example of the invention, after the blank handwriting device displays the candidate information corresponding to the writing track, the user can perform the start writing gesture to select a suitable candidate; correspondingly, the blank handwriting device commits the selected candidate to the screen and re-enters the writing state, so that the user can continue writing. Of course, it should be understood that this is only one gesture for selecting a candidate; in practical applications a dedicated gesture may be defined for the candidate-selection operation.
In another example of the present invention, the user may commit a candidate to the screen by mistake. Therefore, after the blank handwriting device commits the candidate and re-enters the writing state, the user can perform the delete gesture, whereupon the blank handwriting device deletes the corresponding on-screen candidate; the user can then perform the start writing gesture again and resume handwriting input.
In yet another example of the present invention, the user may make a mistake while writing a character. In this case, the user can perform the delete gesture, whereupon the blank handwriting device deletes the previously input strokes; the user can then perform the start writing gesture again and resume handwriting input.
The start writing gesture may refer to a gesture by which the user informs the blank handwriting device to start writing, and may also serve as the gesture for selecting a displayed candidate; fig. 3a shows one start writing gesture, i.e., the palm faces the camera with all five fingers spread. The writing gesture may refer to the gesture the user adopts while writing; fig. 3b shows one writing gesture, i.e., the palm faces the camera with the index finger extended and the other four fingers curled toward the palm. The delete gesture may refer to a gesture by which the user deletes information such as the writing track or a displayed candidate; fig. 3c shows one delete gesture, i.e., the palm faces the camera with the thumb and index finger extended and the other three fingers curled toward the palm. The selection gesture may refer to the gesture by which the user informs the blank handwriting device that writing is finished; fig. 3d shows one selection gesture, i.e., the palm faces the camera with all five fingers curled toward the palm center. The gestures described above may be set according to the needs of the user, which is not limited in the embodiments of the present invention. In addition, the above gestures may be referred to as set gestures, and gestures other than the set gestures may be referred to as other gestures, which may correspond to other functions; the embodiments of the present invention do not limit this either.
In order to explain the order in which the user performs gestures during blank handwriting, each gesture may be numbered:
11. start writing gesture; 12. writing gesture; 13. selection gesture; 14. delete gesture.
For example, one order in which the user performs gestures may be: 11 → 12 → 13 → 11;
for another example: 11 → 12 → 13 → 11 → 14 → 11;
for yet another example: 11 → 12 → 14 → 11 → 12 → 13 → 11;
and for another example: 11 → 12 → 14 → 11 → 12 → 13 → 11 → 14 → 11.
Of course, during blank handwriting of a character, other combinations of the above gestures are also possible; they are not enumerated here, and the embodiments of the present invention are not limited thereto.
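The numbered gesture sequences above can be illustrated with a small sketch. The following is a hypothetical, simplified state machine (the class and method names are not from the patent) showing how gestures 11 to 14 could drive the writing state, the writing track, and the on-screen candidates:

```python
# Illustrative sketch (not from the patent text): a minimal state machine
# dispatching the numbered gestures (11 start writing, 12 writing,
# 13 selection, 14 delete) to device actions.

GESTURE_START, GESTURE_WRITE, GESTURE_SELECT, GESTURE_DELETE = 11, 12, 13, 14

class AirWritingSession:
    def __init__(self):
        self.writing = False      # whether the device is in the writing state
        self.strokes = []         # fingertip points collected while writing
        self.committed = []       # candidates already committed to the screen

    def on_gesture(self, gesture, point=None, candidate=None):
        if gesture == GESTURE_START:
            self.writing = True           # enter (or re-enter) the writing state
        elif gesture == GESTURE_WRITE and self.writing:
            self.strokes.append(point)    # record a fingertip position
        elif gesture == GESTURE_SELECT and self.writing:
            self.writing = False          # end writing and commit a candidate
            self.committed.append(candidate)
            self.strokes = []
        elif gesture == GESTURE_DELETE:
            if self.strokes:
                self.strokes = []         # delete the current writing track
            elif self.committed:
                self.committed.pop()      # delete the last on-screen candidate
```

For instance, the sequence 11 → 12 → 13 → 11 → 14 first commits a candidate and then deletes it again, matching the opening of the second example order above.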
Based on the gesture of the hand in the hand video image, the blank handwriting device decides whether to enter the writing state, acquire the fingertip position information of the writing finger, end the writing state, delete the writing track, commit a candidate to the screen, delete an on-screen candidate, and so on. Therefore, in the embodiment of the invention, after each frame of hand video image is acquired, the gesture of the hand in that image can be determined by image processing, and the corresponding operation can then be executed.
Referring to fig. 4, a flowchart illustrating steps of an embodiment of an identification method of the present invention may specifically include the following steps:
step 402, acquiring a second hand video image from the video image acquired by the camera.
In the embodiment of the invention, after the blank handwriting device is started, a hand video image can be acquired from the video images captured by the camera to judge whether to enter the writing state; this applies before the hand gesture in the video image is determined to be the start writing gesture, and also after the hand gesture is determined to be the selection gesture and before it is again determined to be the start writing gesture. The hand video image acquired in this stage may be referred to as the second hand video image. In one example of the present invention, a video image captured by the camera may be acquired, and the image of the handwriting area may be extracted from it as the second hand video image. In the embodiment of the invention, after the camera of the blank handwriting device captures a video image, the video image can be displayed on the display screen with a handwriting area marked in it, for example the upper-right corner area; in fig. 5a, the area inside the box labeled A is the handwriting area. The user performs blank handwriting inside the handwriting area, which makes it easier to acquire the user's hand video image and improves recognition accuracy. Therefore, after a video image captured by the camera is acquired, the image corresponding to the handwriting area can be extracted from it, according to the area information of the handwriting area, as the second hand video image.
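Extracting the handwriting area from each frame amounts to a simple crop. The sketch below (the region coordinates and function name are illustrative assumptions, since the patent only gives the upper-right corner as an example) shows how the second hand video image could be cut out of a frame:

```python
# Hypothetical sketch: cropping the handwriting area (the box marked "A"
# in fig. 5a) out of a camera frame. Coordinates are illustrative.
import numpy as np

def extract_handwriting_area(frame, region):
    """Return the sub-image used as the second hand video image.

    frame  -- H x W x 3 video frame
    region -- (top, left, height, width) of the handwriting area
    """
    top, left, h, w = region
    return frame[top:top + h, left:left + w]

frame = np.zeros((480, 640, 3), dtype=np.uint8)        # stand-in camera frame
roi = extract_handwriting_area(frame, (0, 400, 240, 240))  # upper-right area
```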
Step 404, determining whether the gesture corresponding to the hand in the second hand video image is a gesture for starting writing.
After the second hand video image is obtained, gesture recognition can be performed on it to judge whether the gesture of the hand in the second hand video image is the start writing gesture. If it is, it can be determined that the user wants to write; the device enters the writing state, and step 406 is executed. If it is not, it can be determined that the user does not want to write; a new second hand video image is acquired from the next frame of video captured by the camera, and step 402 is executed again.
In one example of the present invention, whether the gesture of the hand in the second hand video image is the start writing gesture may be judged as follows: preprocess the second hand video image to obtain a second preprocessed image, and judge, according to the second preprocessed image, whether the gesture in the second hand video image is the start writing gesture. If it is, the writing state is entered; if it is not, a new second hand video image is acquired from the next frame of video captured by the camera, and step 402 is executed again.
The step of preprocessing the second hand video image to obtain a second preprocessed image may include the following sub-steps:
Sub-step A2, performing background subtraction on the second hand video image by adopting a Gaussian mixture model, and extracting the foreground from the second hand video image to obtain a fourth image.
Sub-step A4, filtering the fourth image by adopting a skin color model to obtain a fifth image.
Sub-step A6, performing morphological processing on the fifth image to obtain a sixth image.
Sub-step A8, performing binarization processing on the sixth image to obtain the second preprocessed image.
In the embodiment of the invention, a Gaussian mixture model can be used to perform background subtraction on the second hand video image, extracting the foreground to obtain a fourth image, as shown in fig. 5b. The fourth image is then filtered with a pre-trained skin color model, which may for example be a Bayes skin color model, to obtain a fifth image. Morphological processing is then applied to the fifth image to restore the hand shape, which further improves gesture recognition accuracy, yielding a sixth image, as shown in fig. 5c. Finally, the sixth image is binarized so that the hand region is separated from the background, which facilitates subsequent recognition of the hand gesture, yielding the second preprocessed image, as shown in fig. 5d.
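Sub-steps A2 to A8 can be sketched as a single pipeline. The version below is a deliberately simplified stand-in: a single reference frame replaces the Gaussian mixture background model, fixed RGB bounds replace the trained Bayes skin color model, and the morphological closing is a minimal 4-neighbour dilation and erosion. Only the structure of the pipeline, not the exact models, follows the patent.

```python
# Simplified stand-in for sub-steps A2-A8 (all thresholds are assumptions).
import numpy as np

def dilate(mask):
    out = mask.copy()
    for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        out |= np.roll(mask, (dy, dx), axis=(0, 1))
    return out

def erode(mask):
    out = mask.copy()
    for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        out &= np.roll(mask, (dy, dx), axis=(0, 1))
    return out

def preprocess(frame, background,
               skin_lo=(80, 30, 20), skin_hi=(255, 200, 170)):
    # Sub-step A2: background subtraction (the patent uses a Gaussian
    # mixture model; a single reference frame stands in for it here).
    fg = np.abs(frame.astype(int) - background.astype(int)).sum(axis=2) > 40
    # Sub-step A4: skin-colour filtering (the patent uses e.g. a Bayes
    # skin model; fixed per-channel RGB bounds stand in for it here).
    lo, hi = np.array(skin_lo), np.array(skin_hi)
    skin = np.all((frame >= lo) & (frame <= hi), axis=2)
    mask = fg & skin
    # Sub-step A6: morphological closing to recover the hand shape.
    mask = erode(dilate(mask))
    # Sub-step A8: binarisation, returning a 0/255 image.
    return (mask * 255).astype(np.uint8)
```

Running it on a synthetic frame containing one skin-coloured square on a static black background yields a clean binary mask of the square.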
In an embodiment of the present invention, there may be multiple ways of judging, according to the second preprocessed image, whether the gesture in the second hand video image is the start writing gesture. In one example of the present invention, one such way may include the following sub-steps:
Sub-step A10, performing contour detection on the second preprocessed image, and determining the hand contour in the second preprocessed image;
Sub-step A12, performing convex hull detection on the hand contour, and determining the number of protruding fingers and the position information of the tips of the protruding fingers;
Sub-step A14, judging whether the number of protruding fingers matches the set number of protruding fingers corresponding to the start writing gesture;
Sub-step A16, if they match, determining that the hand gesture in the second hand video image is the start writing gesture.
The hand contour may be determined while ignoring the influence of the background, the texture inside the hand, and noise in the second preprocessed image. Convex hull detection is then performed on the determined hand contour: the outermost points of the contour are determined (which may include determining the position information of the tips of the fingers), and those outermost points are connected to form a convex polygon, as shown in fig. 5e. Further, according to the position information of the fingertips, it is determined which fingers are protruding, and the protruding fingers are counted; the position information of the tip of a protruding finger is the fingertip position information of that finger. For example, for the gesture shown in fig. 3a, the set number of protruding fingers may be 5; if the number of protruding fingers is 5, it can be determined that the hand gesture in the second preprocessed image is the start writing gesture. If it is not 5, it can be determined that the hand gesture in the second preprocessed image is not the start writing gesture, and the process of judging whether the hand gesture in the second hand video image is the start writing gesture can end.
In another example of the present invention, another way of determining whether the hand gesture in the second hand video image is a start writing gesture may include the following sub-steps:
Sub-step A18, inputting the second preprocessed image into a static gesture recognition model to obtain a probability score for each gesture, wherein the gestures may include the set gestures and other gestures;
Sub-step A20, judging whether the probability score of the start writing gesture is the highest;
Sub-step A22, if it is, determining that the hand gesture in the second hand video image is the start writing gesture.
The static gesture recognition model can be trained with the set gestures and other gestures. For example, an image of the start writing gesture can be input into the static gesture recognition model to obtain the probability score of each gesture; a loss function is then computed from the probability score of the start writing gesture, and the weights of the static gesture recognition model are adjusted according to the loss function. The other set gestures and other gestures can of course also be used to train the model in the same way. The trained static gesture recognition model can then be used to recognize the hand gesture in the second preprocessed image, after which it is judged whether the probability score of the start writing gesture is the highest. If it is, it is determined that the hand gesture in the second preprocessed image is the start writing gesture, and sub-step A22 is executed; if it is not, the current process of judging whether the hand gesture in the second hand video image is the start writing gesture ends.
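The decision in sub-steps A18 to A22 reduces to checking whether the start writing class receives the highest probability score. A minimal sketch, assuming the trained model outputs one logit per gesture class (the class list and function names are illustrative, not from the patent):

```python
# Hypothetical sketch of sub-steps A18-A22: turn per-class logits into
# probability scores and check whether "start writing" scores highest.
import math

GESTURES = ["start_writing", "writing", "selection", "delete", "other"]

def softmax(logits):
    m = max(logits)                      # subtract max for numeric stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def is_start_writing(logits):
    """True iff the start-writing class has the highest probability score."""
    probs = softmax(logits)
    return probs.index(max(probs)) == GESTURES.index("start_writing")
```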
In the embodiment of the present invention, after sub-step A8, either sub-steps A10 to A16 or sub-steps A18 to A22 may be executed. Of course, the two ways can also be combined to judge whether the hand gesture in the second preprocessed image is the start writing gesture: only when both the first way and the second way determine that the gesture is the start writing gesture is it finally determined to be the start writing gesture. For example, after sub-steps A10 to A14 are executed, if the number of protruding fingers matches the set number of protruding fingers corresponding to the start writing gesture, sub-steps A18 to A22 are then executed; referring to fig. 5f, a flowchart of this way of judging the start writing gesture according to an embodiment of the present invention is shown.
Step 406, acquiring a first hand video image from the video image acquired by the camera.
If the gesture of the hand in the second hand video image is determined to be the start writing gesture, the writing state can be entered and a first hand video image can be acquired from the video images captured by the camera; specifically, a video image captured by the camera can be acquired, and the image of the handwriting area in that video image can be extracted as the first hand video image.
Step 408, determining whether the gesture corresponding to the hand in the first hand video image is a writing gesture.
Specifically, the first hand video image is preprocessed to obtain a first preprocessed image, and whether the gesture in the first hand video image is the writing gesture is judged according to the first preprocessed image. If the gesture in the first hand video image is the writing gesture, step 410 is executed. If it is the selection gesture, it is determined that the writing state has ended, and step 414 can be executed. If it is the delete gesture and no fingertip position information of the writing finger has been acquired since entering the writing state, step 422 is executed. If it is the delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state, step 416 is executed.
The step of preprocessing the first hand video image to obtain a first preprocessed image may include the following sub-steps:
Sub-step B2, performing background subtraction on the first hand video image by adopting a Gaussian mixture model, and extracting the foreground from the first hand video image to obtain a first image.
Sub-step B4, filtering the first image by adopting a skin color model to obtain a second image.
Sub-step B6, performing morphological processing on the second image to obtain a third image.
Sub-step B8, performing binarization processing on the third image to obtain the first preprocessed image.
Sub-step B10, performing contour detection on the first preprocessed image, and determining the hand contour in the first preprocessed image;
Sub-step B12, performing convex hull detection on the hand contour, and determining the number of protruding fingers and the position information of the tips of the protruding fingers;
Sub-step B14, judging whether the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture;
Sub-step B16, if they match, determining that the hand gesture in the first hand video image is the writing gesture.
Sub-step B18, inputting the first preprocessed image into the static gesture recognition model to obtain a probability score for each gesture, wherein the gestures may include the set gestures and other gestures;
Sub-step B20, judging whether the probability score of the writing gesture is the highest;
Sub-step B22, if it is, determining that the hand gesture in the first hand video image is the writing gesture.
The sub-steps B2-B22 are similar to the sub-steps A2-A22 described above, and are not repeated here.
Step 410, if it is determined that the hand gesture in the first hand video image is a writing gesture, determining the fingertip position information of the protruding finger as the fingertip position information of the writing finger.
And step 412, displaying a writing track on a display screen according to the fingertip position information.
In the embodiment of the invention, if the hand gesture in the first hand video image is determined to be the writing gesture, it can be determined that the user is performing blank handwriting, and the fingertip position information of the writing finger can be acquired. The writing gesture may be the gesture shown in fig. 3b, in which one finger is extended and the other fingers are curled; the protruding finger can therefore be taken as the writing finger, and its fingertip position information taken as the fingertip position information of the writing finger. A mapping relation is then looked up based on the fingertip position information to determine the corresponding display position on the display screen; that position is marked, the current display position is connected to the previous one, and the corresponding writing track is displayed. When the distance between two adjacent display positions exceeds a distance threshold, interpolation can be performed between them, and the two positions are connected through the interpolated points, so that the displayed writing track is smoother. The next first hand video image can then be acquired from the next frame of video captured by the camera, and step 406 is executed again.
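The interpolation rule described above can be sketched as follows; the threshold value and function name are assumptions, and only the even-spacing idea follows the description:

```python
# Sketch of the track-smoothing rule: connect adjacent display positions
# directly, but insert evenly spaced points when the gap is too large.
import math

def track_points(p_prev, p_curr, threshold=10.0):
    """Return the points to draw between two adjacent display positions."""
    dx, dy = p_curr[0] - p_prev[0], p_curr[1] - p_prev[1]
    dist = math.hypot(dx, dy)
    if dist <= threshold:
        return [p_curr]                      # close enough: connect directly
    n = int(math.ceil(dist / threshold))     # number of sub-segments
    return [(p_prev[0] + dx * i / n, p_prev[1] + dy * i / n)
            for i in range(1, n + 1)]        # interpolated points, then p_curr
```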
Step 414, if the hand gesture in the first hand video image is a selection gesture, identifying a writing track according to the fingertip position information, and obtaining and displaying corresponding candidate information.
In the embodiment of the invention, if the hand gesture in the first hand video image is the selection gesture, it is determined that the user has finished writing. The fingertip position information is then input into a handwriting engine, so that the handwriting engine recognizes the writing track according to the fingertip position information, obtains candidate information, and returns it; the candidate information returned by the handwriting engine is received and displayed. The candidate information may include multiple candidates, and while returning the candidate information the handwriting engine may also return a candidate score for each candidate; the candidates may be presented in descending order of candidate score, after which step 418 may be executed.
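Presenting candidates in descending order of their scores is a one-line sort. A sketch, assuming the engine returns (text, score) pairs (the exact engine output format is not specified in the text):

```python
# Hypothetical sketch: order engine candidates by candidate score,
# highest first, for display in the candidate bar.

def order_candidates(candidates):
    """candidates: list of (text, score) pairs returned by the engine."""
    return [text for text, _ in
            sorted(candidates, key=lambda c: c[1], reverse=True)]
```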
Step 416, if the hand gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger is obtained after entering the writing state, clearing the writing track displayed on the display screen.
In the embodiment of the invention, the user can delete the input writing track during writing. Therefore, if the hand gesture in the first hand video image is determined to be the delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state, it can be determined that the user wants to delete the track written so far; the writing track displayed on the display screen can then be cleared, and the fingertip position information stored since entering the writing state can be deleted. The user can subsequently perform the start writing gesture again to resume blank handwriting, and the blank handwriting device correspondingly executes step 402.
Step 418, periodically polling each candidate of the current page, and acquiring a third hand video image from the video images captured by the camera.
Step 420, when the hand gesture in the third hand video image is detected to be the start writing gesture, committing the polled candidate to the screen and entering the writing state again.
In the embodiment of the present invention, the number of candidates displayed per page in the candidate bar may be preset; when the number of candidates in the candidate information exceeds the number displayed per page, the candidate information may be spread over two or more pages of the candidate bar. After the candidate information is displayed, if the user finds the required candidate on the current page, the user can select it from the current page: the blank handwriting device periodically polls each candidate of the current page, and when the polled candidate is the one the user wants, the user can perform the start writing gesture to select it; until then, the user can keep holding the selection gesture. If the required candidate is not on the current page, the user can perform a page turn gesture, whereupon the blank handwriting device turns to the next page, and the user can select the required candidate from that page in the same way.
Thus, while periodically polling each candidate of the current page, the blank handwriting device can also acquire hand video images from the video images captured by the camera; the hand video images acquired between displaying the candidate information and determining that the hand gesture is the start writing gesture are called third hand video images, and the polling period can be set as required. The device then judges whether the hand gesture in the third hand video image is the start writing gesture. When it is, step 420 is executed: the polled candidate is committed to the screen and the writing state is entered again, after which step 406 can be executed, so the user can continue writing immediately after selecting the candidate, keeping the operation smooth and simple. When the hand gesture in the third hand video image is detected to be a page turn gesture, the page can be turned and the candidates of the next page displayed, after which the flow returns to step 418. When the hand in the third hand video image is still holding the selection gesture, step 418 continues, that is, polling continues and the next third hand video image is acquired from the next video frame. As shown in fig. 2b, the first candidate is being polled; when the polling period arrives, if the hand gesture in the third hand video image is not the start writing gesture, polling continues. As shown in fig. 5g, when the 6th candidate is being polled and before the next polling period arrives, if the hand gesture in the third hand video image is detected to be the start writing gesture, the 6th candidate is committed to the screen, as shown in fig. 5h.
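Steps 418 and 420 can be summarised as a polling loop. The sketch below is a hypothetical simulation (the gesture labels and function names are assumptions) in which each loop iteration stands for one polling period, and the gesture observed in the third hand video image during that period decides what happens:

```python
# Hypothetical sketch of steps 418-420: each iteration is one polling
# period; a start-writing gesture commits the polled candidate, a
# page-turn gesture advances the page, and a held selection gesture
# lets polling move on to the next candidate.

def run_polling(pages, gesture_per_period):
    """pages: list of candidate pages; gesture_per_period: the gesture
    detected in the third hand video image during each polling period."""
    page, idx = 0, 0                          # start at the first candidate
    for gesture in gesture_per_period:
        if gesture == "start_writing":
            return pages[page][idx]           # commit the polled candidate
        if gesture == "page_turn":
            page = (page + 1) % len(pages)    # show the next page
            idx = 0
        else:                                 # selection gesture still held
            idx = (idx + 1) % len(pages[page])
    return None                               # nothing committed
```

For example, with two pages of candidates, holding the selection gesture for two periods and then performing the start writing gesture would commit the third candidate of the first page.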
Step 422, if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger is not obtained after entering the writing state, deleting the last screen-on candidate.
In the embodiment of the invention, after a candidate has been committed to the screen and the writing state has been entered again, the user can perform the delete gesture before performing the writing gesture, in order to delete the last on-screen candidate. Therefore, after entering the writing state and acquiring the first hand video image, if the gesture in the first hand video image is the delete gesture and no fingertip position information of the writing finger has been acquired since entering the writing state, it can be determined that the user wants to delete the last on-screen candidate, and the last on-screen candidate can be deleted. The user can thus delete on-screen content by performing the delete operation.
In summary, in the embodiment of the invention, upon determining that the writing state has been entered, a first hand video image can be acquired from the video images captured by the camera; image processing is then performed on the first hand video image to acquire the fingertip position information of the writing finger, and the writing track is displayed on the display screen according to that information. After the writing state ends, the writing track is recognized according to the fingertip position information, and the corresponding candidate information is obtained and displayed. Recognition of the writing track is thus achieved through fingertip position information determined by image processing, without resorting to depth information, which improves recognition efficiency. Moreover, because no depth information is needed for recognition, no depth camera is required to capture the video images, which reduces the recognition cost of blank handwriting.
Secondly, in the embodiment of the invention, a Gaussian mixture model is adopted to perform background subtraction on the first hand video image, extracting the foreground to obtain a first image; the first image is filtered with a skin color model to obtain a second image; morphological processing is performed on the second image to obtain a third image; and binarization processing is performed on the third image to obtain the first preprocessed image. Judging whether the hand gesture in the first hand video image is the writing gesture according to the first preprocessed image improves the accuracy of subsequently recognizing the gesture of the hand in the first hand video image.
Further, in the embodiment of the invention, in the process of judging, according to the first preprocessed image, whether the hand gesture in the first hand video image is the writing gesture, gesture recognition can be performed by contour detection and convex hull detection on the first preprocessed image; a static gesture recognition model can also be adopted to recognize the hand gesture in the first preprocessed image, which improves the accuracy of gesture recognition.
Further, in the embodiment of the present invention, after the candidate information is displayed, each candidate of the current page may be periodically polled and a third hand video image may be acquired from the video images captured by the camera; when the hand gesture in the third hand video image is detected to be the start writing gesture, the polled candidate is committed to the screen and the writing state is entered again. This not only commits the candidate selected by the user but also re-enters the writing state, so that the user can continue performing the writing gesture for handwriting input; the operations between committing a candidate and writing the next character are reduced, making the operation simple and quick and improving the user experience.
Further, in the embodiment of the present invention, if the gesture in the first hand video image is the delete gesture and no fingertip position information of the writing finger has been acquired since entering the writing state, the last on-screen candidate is deleted, which makes it convenient for the user to delete on-screen content; and if the gesture is the delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state, the writing track displayed on the display screen is cleared, which makes it convenient for the user to delete the input track, improving the user experience.
In the embodiment of the invention, the fingertip position information can be input into the handwriting engine, so that the handwriting engine recognizes the writing track according to the fingertip position information, obtains candidate information, and returns it, and the candidate information returned by the handwriting engine is received; this improves the accuracy of writing track recognition.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to fig. 6, a block diagram of an embodiment of an identification device of the present invention is shown, and may specifically include the following modules:
The acquisition module 602 is configured to enter a writing state, and acquire a first hand video image from video images acquired by the camera;
the image processing module 604 is configured to perform image processing on the first hand video image, obtain fingertip position information of a writing finger, and display a writing track on a display screen according to the fingertip position information;
And the recognition module 606 is used for recognizing the writing track according to the fingertip position information after the writing state is finished, so as to obtain and display the corresponding candidate information.
Referring to fig. 7, a block diagram of an alternative embodiment of an identification device of the present invention is shown.
In an alternative embodiment of the present invention, the obtaining module 602 is configured to obtain a video image collected by a camera, and extract an image of the handwriting area in the video image as the first hand video image.
In an alternative embodiment of the present invention, the image processing module 604 includes:
A preprocessing sub-module 6042, configured to preprocess the first hand video image to obtain a first preprocessed image;
the gesture judging submodule 6044 is configured to judge whether a gesture of a hand in the first hand video image is a writing gesture according to the first preprocessed image;
The position obtaining submodule 6046 is configured to obtain fingertip position information of a writing finger if a hand gesture in the first hand video image is a writing gesture.
In an optional embodiment of the present invention, the preprocessing submodule 6042 is configured to perform background subtraction on the first hand video image by using a Gaussian mixture model, extracting the foreground from the first hand video image to obtain a first image; filter the first image by using a skin color model to obtain a second image; perform morphological processing on the second image to obtain a third image; and perform binarization processing on the third image to obtain the first preprocessed image.
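The four preprocessing stages can be sketched as follows. This is a simplified illustration, not the patented implementation: a plain frame difference stands in for the Gaussian mixture background model, a fixed Cr/Cb box stands in for a trained skin color model, and all thresholds are illustrative assumptions:

```python
import numpy as np

def binary_erode(mask):
    # 3x3 cross erosion: a pixel survives only if it and its 4-neighbours are set.
    m = np.pad(mask, 1, constant_values=False)
    return (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
            & m[1:-1, :-2] & m[1:-1, 2:])

def binary_dilate(mask):
    # 3x3 cross dilation, the counterpart of the erosion above.
    m = np.pad(mask, 1, constant_values=False)
    return (m[1:-1, 1:-1] | m[:-2, 1:-1] | m[2:, 1:-1]
            | m[1:-1, :-2] | m[1:-1, 2:])

def preprocess_hand_image(frame_ycrcb, background_ycrcb, diff_thresh=25,
                          cr_range=(133, 173), cb_range=(77, 127)):
    """frame_ycrcb / background_ycrcb: HxWx3 uint8 arrays in YCrCb space."""
    # Stage 1: background subtraction -> foreground mask ("first image").
    diff = np.abs(frame_ycrcb.astype(np.int16) - background_ycrcb.astype(np.int16))
    foreground = diff.max(axis=2) > diff_thresh

    # Stage 2: skin-colour filtering ("second image").
    cr, cb = frame_ycrcb[..., 1], frame_ycrcb[..., 2]
    skin = ((cr >= cr_range[0]) & (cr <= cr_range[1])
            & (cb >= cb_range[0]) & (cb <= cb_range[1]))
    mask = foreground & skin

    # Stage 3: morphological opening to suppress speckle ("third image").
    mask = binary_dilate(binary_erode(mask))

    # Stage 4: binarization into a 0/255 image ("first preprocessed image").
    return mask.astype(np.uint8) * 255
```

In a production pipeline these stages would typically be OpenCV calls (a MOG2 background subtractor, `inRange`, `morphologyEx`, `threshold`); the NumPy versions above only show the data flow.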
In an alternative embodiment of the present invention, the gesture determination submodule 6044 includes:
A first judging unit 60442, configured to perform contour detection on the first preprocessed image and determine the hand contour in the first preprocessed image; perform convex hull detection on the hand contour and determine the number of protruding fingers and the fingertip position information of each protruding finger; and, if the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture, determine that the gesture of the hand in the first hand video image is the writing gesture.
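Assuming the hand contour is available as a list of (x, y) pixel coordinates, the convex hull step and the protruding-finger count can be sketched as below. The hull is computed with Andrew's monotone chain, and the `margin` used to decide what counts as "protruding" is an illustrative assumption (image y grows downward, so fingertips sit above the contour centroid):

```python
def convex_hull(points):
    """Andrew's monotone chain; `points` is an iterable of (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):  # z-component of (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def detect_writing_gesture(contour, expected_fingers=1, margin=5):
    """Returns (is_writing_gesture, fingertip_positions).

    Hull vertices lying more than `margin` pixels above the contour centroid
    are treated as protruding fingertips; the gesture matches when their
    count equals the number set for the writing gesture.
    """
    hull = convex_hull(contour)
    centroid_y = sum(y for _, y in contour) / len(contour)
    tips = [p for p in hull if p[1] < centroid_y - margin]
    return len(tips) == expected_fingers, tips
```

An OpenCV implementation would instead use `cv2.findContours` and `cv2.convexHull` (optionally `cv2.convexityDefects` to separate fingers); the logic above mirrors the decision rule, not the library calls.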
In an alternative embodiment of the present invention, the gesture determination submodule 6044 includes:
the second determining unit 60444 is configured to input the first preprocessed image into a static gesture recognition model, and obtain a probability score of each gesture, where the gesture at least includes: start writing gesture, write gesture, select gesture, delete gesture, and other gestures; and if the probability score of the writing gesture is highest, determining that the gesture of the hand in the first hand video image is the writing gesture.
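The decision rule of the static gesture recognition model can be illustrated as follows. The model itself (e.g. a small CNN over the preprocessed image) is external; the function below only turns hypothetical raw model scores into per-gesture probabilities and picks the highest, as the unit above does:

```python
import numpy as np

GESTURES = ("start_writing", "writing", "select", "delete", "other")

def classify_gesture(logits):
    """Softmax the model's raw scores and return (best_gesture, scores)."""
    z = np.asarray(logits, dtype=float)
    exp = np.exp(z - z.max())          # subtract max for numerical stability
    probs = exp / exp.sum()
    best = GESTURES[int(np.argmax(probs))]
    return best, dict(zip(GESTURES, probs))
```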
In an alternative embodiment of the present invention, the position obtaining submodule 6046 is configured to determine the fingertip position information of the protruding finger as the fingertip position information of the writing finger.
In an alternative embodiment of the present invention, the apparatus further comprises:
a writing state start determining module 608, configured to acquire a second hand video image from the video images acquired by the camera; judging whether the gesture corresponding to the hand in the second hand video image is a writing starting gesture or not; and if the gesture corresponding to the hand in the second hand video image is the gesture for starting writing, entering a writing state.
In an alternative embodiment of the present invention, the apparatus further comprises:
The writing state ending determining module 610 is configured to determine that the writing state ends if the gesture of the hand in the first hand video image is a selection gesture.
In an alternative embodiment of the present invention, the candidate information includes a plurality of candidates, and the apparatus further includes:
The polling module 612 is configured to periodically poll each candidate item of the current page, and obtain a third hand video image from the video images collected by the camera;
The screen-up module 614 is configured to, when detecting that the hand gesture in the third hand video image is the start-writing gesture, put the polled candidate on the screen and enter the writing state again.
In an alternative embodiment of the present invention, the apparatus further comprises:
the first deleting module 616 is configured to delete the last on-screen candidate if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger has not been acquired since entering the writing state.
In an alternative embodiment of the present invention, the apparatus further comprises:
the second deleting module 618 is configured to clear a writing track displayed on the display screen if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger is acquired after the writing state is entered.
In an alternative embodiment of the present invention, the recognition module 606 is configured to input the fingertip position information into a handwriting engine, so that the handwriting engine recognizes a writing track according to the fingertip position information, obtains candidate information, and returns the candidate information; and receiving candidate information returned by the handwriting engine.
In the embodiment of the invention, when it is determined that the writing state is entered, a first hand video image can be acquired from the video images collected by a camera; the first hand video image is then subjected to image processing to obtain the fingertip position information of the writing finger, and a writing track is displayed on a display screen according to the fingertip position information; after the writing state ends, the writing track is recognized according to the fingertip position information, and the corresponding candidate information is obtained and displayed. Because the writing track is recognized from fingertip position information determined by image processing, depth information is not required, which improves recognition efficiency. Moreover, since no depth information is used, no depth camera is needed to collect the video images, which reduces the cost of recognizing in-air handwriting.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
Fig. 8 is a block diagram illustrating a configuration of an electronic device 800 for identification, according to an example embodiment. For example, electronic device 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 8, an electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. Processing element 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power component 806 provides power to the various components of the electronic device 800. Power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operational mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front or rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessment of various aspects of the electronic device 800. For example, the sensor assembly 814 may detect an on/off state of the device 800, a relative positioning of the components, such as a display and keypad of the electronic device 800, the sensor assembly 814 may also detect a change in position of the electronic device 800 or a component of the electronic device 800, the presence or absence of a user's contact with the electronic device 800, an orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 804 including instructions executable by processor 820 of electronic device 800 to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
A non-transitory computer readable storage medium, which when executed by a processor of an electronic device, causes the electronic device to perform a method of identification, the method comprising: entering a writing state, and acquiring a first hand video image from a video image acquired by a camera; performing image processing on the first hand video image to acquire fingertip position information of a writing finger and displaying a writing track on a display screen according to the fingertip position information; and after the writing state is finished, identifying the writing track according to the fingertip position information, and obtaining and displaying the corresponding candidate information.
Optionally, the acquiring a first hand video image from the video image acquired by the camera includes: and acquiring a video image acquired by a camera, and extracting an image of a handwriting area in the video image as a first hand video image.
Optionally, performing image processing on the first hand video image includes: preprocessing a first hand video image to obtain a first preprocessed image; judging whether the hand gesture in the first hand video image is a writing gesture or not according to the first preprocessing image; and if the hand gesture in the first hand video image is a writing gesture, executing the step of acquiring the fingertip position information of the writing finger.
Optionally, the preprocessing the first hand video image to obtain a first preprocessed image includes: performing background subtraction on the first hand video image by adopting a Gaussian mixture model, and extracting a foreground from the first hand video image to obtain a first image; filtering the first image by adopting the skin color model to obtain a second image; morphological processing is carried out on the second image, and a third image is obtained; and performing binarization processing on the third image to obtain a first preprocessed image.
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: performing contour detection on the first preprocessed image, and determining the hand contour in the first preprocessed image; performing convex hull detection on the hand outline, and determining the number of the convex fingers and fingertip position information of each convex finger; and if the number of the protruding fingers is matched with the set number of the protruding fingers corresponding to the writing gesture, determining that the gesture of the hand in the first hand video image is the writing gesture.
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: inputting the first preprocessed image into a static gesture recognition model to obtain probability scores of all gestures, wherein the gestures at least comprise: start writing gesture, write gesture, select gesture, delete gesture, and other gestures; and if the probability score of the writing gesture is highest, determining that the gesture of the hand in the first hand video image is the writing gesture.
Optionally, the acquiring fingertip position information of the writing finger includes: the fingertip position information of the protruding finger is determined as fingertip position information of the writing finger.
Optionally, the method further comprises: acquiring a second hand video image from the video image acquired by the camera; judging whether the hand gesture in the second hand video image is a writing starting gesture or not; and if the hand gesture in the second hand video image is a gesture for starting writing, entering a writing state.
Optionally, the method further comprises: and if the hand gesture in the first hand video image is a selection gesture, determining that the writing state is ended.
Optionally, the candidate information includes a plurality of candidates, and after the displaying step, the method further includes: periodically polling each candidate of the current page, and acquiring a third hand video image from the video images collected by the camera; when the hand gesture in the third hand video image is detected to be the start-writing gesture, putting the polled candidate on the screen and entering the writing state again.
Optionally, the method further comprises: and if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger is not acquired after the writing state is entered, deleting the last screen-on candidate.
Optionally, the method further comprises: if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger has been acquired after entering the writing state, clearing the writing track displayed on the display screen.
Optionally, the identifying the writing track according to the fingertip position information to obtain corresponding candidate information includes: inputting the fingertip position information into a handwriting engine so that the handwriting engine can recognize a writing track according to the fingertip position information to obtain candidate information and return the candidate information; and receiving candidate information returned by the handwriting engine.
Fig. 9 is a schematic structural view of an electronic device 900 for identification according to another exemplary embodiment of the present invention. The electronic device 900 may be a server, which may vary widely in configuration or performance and may include one or more central processing units (CPU) 922 (e.g., one or more processors), memory 932, and one or more storage media 930 (e.g., one or more mass storage devices) storing applications 942 or data 944. The memory 932 and the storage medium 930 may be transitory or persistent storage. The program stored in the storage medium 930 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Still further, the central processor 922 may be arranged to communicate with the storage medium 930 and execute, on the server, the series of instruction operations in the storage medium 930.
The server may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input/output interfaces 958, one or more keyboards 956, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
An electronic device comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for: entering a writing state, and acquiring a first hand video image from a video image acquired by a camera; performing image processing on the first hand video image to acquire fingertip position information of a writing finger and displaying a writing track on a display screen according to the fingertip position information; and after the writing state is finished, identifying the writing track according to the fingertip position information, and obtaining and displaying the corresponding candidate information.
Optionally, the acquiring a first hand video image from the video image acquired by the camera includes: and acquiring a video image acquired by a camera, and extracting an image of a handwriting area in the video image as a first hand video image.
Optionally, the performing image processing on the first hand video image includes: preprocessing a first hand video image to obtain a first preprocessed image; judging whether the hand gesture in the first hand video image is a writing gesture or not according to the first preprocessing image; and if the hand gesture in the first hand video image is a writing gesture, executing the step of acquiring the fingertip position information of the writing finger.
Optionally, the preprocessing the first hand video image to obtain a first preprocessed image includes: performing background subtraction on the first hand video image by adopting a Gaussian mixture model, and extracting a foreground from the first hand video image to obtain a first image; filtering the first image by adopting the skin color model to obtain a second image; morphological processing is carried out on the second image, and a third image is obtained; and performing binarization processing on the third image to obtain a first preprocessed image.
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: performing contour detection on the first preprocessed image, and determining the hand contour in the first preprocessed image; performing convex hull detection on the hand outline, and determining the number of the convex fingers and fingertip position information of each convex finger; and if the number of the protruding fingers is matched with the set number of the protruding fingers corresponding to the writing gesture, determining that the gesture of the hand in the first hand video image is the writing gesture.
Optionally, the determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture includes: inputting the first preprocessed image into a static gesture recognition model to obtain probability scores of all gestures, wherein the gestures at least comprise: start writing gesture, write gesture, select gesture, delete gesture, and other gestures; and if the probability score of the writing gesture is highest, determining that the gesture of the hand in the first hand video image is the writing gesture.
Optionally, the acquiring fingertip position information of the writing finger includes: the fingertip position information of the protruding finger is determined as fingertip position information of the writing finger.
Optionally, further comprising instructions for: acquiring a second hand video image from the video image acquired by the camera; judging whether the hand gesture in the second hand video image is a writing starting gesture or not; and if the hand gesture in the second hand video image is a gesture for starting writing, entering a writing state.
Optionally, further comprising instructions for: and if the hand gesture in the first hand video image is a selection gesture, determining that the writing state is ended.
Optionally, the candidate information includes a plurality of candidates, and after the displaying step, further comprising instructions for: periodically polling each candidate of the current page, and acquiring a third hand video image from the video images collected by the camera; when the hand gesture in the third hand video image is detected to be the start-writing gesture, putting the polled candidate on the screen and entering the writing state again.
Optionally, further comprising instructions for: and if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger is not acquired after the writing state is entered, deleting the last screen-on candidate.
Optionally, further comprising instructions for: if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger has been acquired after entering the writing state, clearing the writing track displayed on the display screen.
Optionally, the identifying the writing track according to the fingertip position information to obtain corresponding candidate information includes: inputting the fingertip position information into a handwriting engine so that the handwriting engine can recognize a writing track according to the fingertip position information to obtain candidate information and return the candidate information; and receiving candidate information returned by the handwriting engine.
In this specification, each embodiment is described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts between embodiments, reference may be made to one another.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or terminal device that comprises the element.
The foregoing has outlined a detailed description of a method of identifying, a device of identifying and an electronic device according to the present invention, wherein specific examples are provided herein to illustrate the principles and embodiments of the present invention, the description of the above examples being only for the purpose of aiding in the understanding of the method of the present invention and the core ideas thereof; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present invention, the present description should not be construed as limiting the present invention in view of the above.

Claims (34)

1. A method of identification, comprising:
entering a writing state, and acquiring a first hand video image from a video image acquired by a camera;
Performing image processing on the first hand video image to acquire fingertip position information of a writing finger and displaying a writing track on a display screen according to the fingertip position information;
after the writing state is finished, recognizing the writing track according to the fingertip position information to obtain corresponding candidate information and displaying the candidate information;
the candidate information includes a plurality of candidates, and after the step of presenting, the method further includes:
periodically polling each candidate item of the current page, and acquiring a third hand video image from the video images acquired by the camera;
when the hand gesture in the third hand video image is detected to be the start-writing gesture, putting the polled candidate on the screen and entering the writing state again;
the method further comprises the following steps:
and if the gesture in the first hand video image is a delete gesture and the fingertip position information of the writing finger is not acquired after the writing state is entered, deleting the last screen-on candidate.
2. The method of claim 1, wherein the acquiring a first hand video image from the video image acquired by the camera comprises:
acquiring a video image acquired by a camera, and extracting an image of a handwriting area in the video image as the first hand video image.
3. The method of claim 1, wherein said image processing said first hand video image comprises:
preprocessing the first hand video image to obtain a first preprocessed image;
determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture;
and if the hand gesture in the first hand video image is a writing gesture, performing the step of acquiring the fingertip position information of the writing finger.
4. A method according to claim 3, wherein preprocessing the first hand video image to obtain a first preprocessed image comprises:
performing background subtraction on the first hand video image using a Gaussian mixture model to extract the foreground from the first hand video image, obtaining a first image;
filtering the first image using a pre-trained skin color model to obtain a second image;
performing morphological processing on the second image to obtain a third image;
and performing binarization processing on the third image to obtain the first preprocessed image.
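The four-stage pipeline of claim 4 can be sketched end to end. In practice the background subtraction would use a Gaussian mixture model (e.g. OpenCV's `cv2.createBackgroundSubtractorMOG2`) and the skin filter a trained skin-colour model; to keep the sketch self-contained, plain frame differencing and a fixed HSV range stand in for both, so every threshold below is an illustrative assumption:

```python
import numpy as np

def binary_erode(m):
    """3x3 binary erosion on a boolean mask (no-library morphology)."""
    p = np.pad(m, 1)
    out = np.ones_like(m)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            out &= p[dy:dy + m.shape[0], dx:dx + m.shape[1]]
    return out

def binary_dilate(m):
    """3x3 binary dilation on a boolean mask."""
    p = np.pad(m, 1)
    out = np.zeros_like(m)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            out |= p[dy:dy + m.shape[0], dx:dx + m.shape[1]]
    return out

def preprocess(frame, background, skin_lo=(0, 40, 60), skin_hi=(50, 255, 255)):
    """Sketch of the claim-4 steps on an HSV uint8 frame of shape (H, W, 3)."""
    # 1) background subtraction -> foreground mask ("first image");
    #    frame differencing stands in for the claimed Gaussian mixture model
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16)).sum(axis=2)
    fg = diff > 30
    # 2) skin-colour filtering ("second image"); fixed HSV range stands in
    #    for the pre-trained skin colour model
    lo, hi = np.array(skin_lo), np.array(skin_hi)
    skin = np.all((frame >= lo) & (frame <= hi), axis=2)
    mask = fg & skin
    # 3) morphological opening to remove speckle noise ("third image")
    mask = binary_dilate(binary_erode(mask))
    # 4) binarization -> the "first preprocessed image"
    return (mask * 255).astype(np.uint8)
```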
5. The method of claim 3, wherein determining whether the hand gesture in the first hand video image is a writing gesture based on the first pre-processed image comprises:
performing contour detection on the first preprocessed image, and determining the hand contour in the first preprocessed image;
performing convex hull detection on the hand contour, and determining the number of protruding fingers and the fingertip position information of each protruding finger;
and if the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture, determining that the hand gesture in the first hand video image is a writing gesture.
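The convex-hull step of claim 5 can be sketched without an image library: compute the hull of the contour points (Andrew's monotone chain stands in for `cv2.convexHull`) and treat hull points that sit well above the contour centroid as protruding fingertips. The `margin` value and the "one protruding finger means a writing gesture" rule are illustrative assumptions, not values from the patent:

```python
def cross(o, a, b):
    """Z-component of the cross product (o->a) x (o->b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone-chain convex hull of 2-D integer points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h

    lower, upper = half(pts), half(reversed(pts))
    return lower[:-1] + upper[:-1]

def fingertip_candidates(contour, margin=20):
    """Hull points at least `margin` pixels above the contour centroid
    (image y grows downward) are taken as protruding fingertips."""
    cy = sum(y for _, y in contour) / len(contour)
    return [p for p in convex_hull(contour) if cy - p[1] > margin]
```

For a contour shaped like a fist with one extended finger, this yields a single fingertip, matching the writing-gesture condition under the stated assumption.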
6. The method of claim 3, wherein determining whether the hand gesture in the first hand video image is a writing gesture based on the first pre-processed image comprises:
inputting the first preprocessed image into a static gesture recognition model to obtain a probability score for each gesture, wherein the gestures include at least: a start-writing gesture, a writing gesture, a selection gesture, a delete gesture, and other gestures;
and if the writing gesture has the highest probability score, determining that the hand gesture in the first hand video image is a writing gesture.
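The decision rule of claim 6 reduces to an argmax over per-gesture probability scores. The sketch below shows only that scoring step; the raw logits would come from a trained static gesture recognition model (the model itself is out of scope here, and the gesture label names are our own):

```python
import math

GESTURES = ["start_writing", "writing", "select", "delete", "other"]

def classify(logits):
    """Convert the model's raw scores into probabilities (softmax) and
    return the gesture with the highest probability, as in claim 6."""
    m = max(logits)                              # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return GESTURES[best], probs[best]
```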
7. The method of claim 5, wherein the acquiring fingertip position information of the writing finger comprises:
determining the fingertip position information of the protruding finger as the fingertip position information of the writing finger.
8. The method of claim 1, wherein the method further comprises:
acquiring a second hand video image from the video image acquired by the camera;
determining whether the hand gesture in the second hand video image is a start-writing gesture;
and if the hand gesture in the second hand video image is a start-writing gesture, entering the writing state.
9. A method according to claim 3, wherein the method further comprises:
if the hand gesture in the first hand video image is a selection gesture, determining that the writing state has ended.
10. A method according to claim 3, wherein the method further comprises:
if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state, clearing the writing track displayed on the display screen.
11. The method of claim 1, wherein the identifying the writing trace according to the fingertip position information to obtain the corresponding candidate information comprises:
inputting the fingertip position information into a handwriting engine, so that the handwriting engine recognizes the writing track according to the fingertip position information, obtains candidate information, and returns the candidate information;
and receiving the candidate information returned by the handwriting engine.
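The engine interaction in claim 11 is a narrow interface: fingertip positions in, ranked candidates out. The class below is an illustrative stand-in, not the patent's engine; its stroke-direction heuristic exists only so the interface can be exercised:

```python
from typing import List, Tuple

Point = Tuple[float, float]

class HandwritingEngine:
    """Mock engine showing the claim-11 interface only; a real engine
    would run full trajectory recognition over the writing track."""

    def recognize(self, trace: List[Point]) -> List[str]:
        if not trace:
            return []
        # toy heuristic: compare horizontal vs. vertical stroke extent
        dx = max(x for x, _ in trace) - min(x for x, _ in trace)
        dy = max(y for _, y in trace) - min(y for _, y in trace)
        return ["一", "二"] if dx >= dy else ["丨", "十"]

def recognize_trace(engine: HandwritingEngine, trace: List[Point]) -> List[str]:
    """The claimed flow: pass the fingertip positions to the engine and
    receive the returned candidate list for display."""
    return engine.recognize(trace)
```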
12. An identification device, comprising:
an acquisition module, configured to enter a writing state and acquire a first hand video image from a video image acquired by a camera;
an image processing module, configured to perform image processing on the first hand video image, acquire fingertip position information of a writing finger, and display a writing track on a display screen according to the fingertip position information;
a recognition module, configured to recognize, after the writing state ends, the writing track according to the fingertip position information, so as to obtain and present corresponding candidate information;
wherein the candidate information includes a plurality of candidates, and the apparatus further includes:
a polling module, configured to periodically poll each candidate of the current page, and acquire a third hand video image from the video images acquired by the camera;
a screen-committing module, configured to commit the currently polled candidate to the screen when the hand gesture in the third hand video image is detected to be a selection gesture;
the apparatus further comprising:
a first deleting module, configured to delete the most recently committed candidate if the gesture in the first hand video image is a delete gesture and no fingertip position information of the writing finger has been acquired since entering the writing state.
13. The apparatus of claim 12, wherein:
the acquisition module is configured to acquire a video image acquired by the camera, and to extract an image of a handwriting area in the video image as the first hand video image.
14. The apparatus of claim 12, wherein the image processing module comprises:
a preprocessing sub-module, configured to preprocess the first hand video image to obtain a first preprocessed image;
a gesture judging sub-module, configured to determine, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture;
and a position acquisition sub-module, configured to acquire the fingertip position information of the writing finger if the hand gesture in the first hand video image is a writing gesture.
15. The apparatus of claim 14, wherein:
the preprocessing sub-module is configured to perform background subtraction on the first hand video image using a Gaussian mixture model to extract the foreground from the first hand video image and obtain a first image; filter the first image using a pre-trained skin color model to obtain a second image; perform morphological processing on the second image to obtain a third image; and perform binarization processing on the third image to obtain the first preprocessed image.
16. The apparatus of claim 14, wherein the gesture determination submodule comprises:
a first judging unit, configured to perform contour detection on the first preprocessed image and determine the hand contour in the first preprocessed image; perform convex hull detection on the hand contour, and determine the number of protruding fingers and the fingertip position information of each protruding finger; and if the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture, determine that the hand gesture in the first hand video image is a writing gesture.
17. The apparatus of claim 14, wherein the gesture determination submodule comprises:
a second judging unit, configured to input the first preprocessed image into a static gesture recognition model to obtain a probability score for each gesture, wherein the gestures include at least: a start-writing gesture, a writing gesture, a selection gesture, a delete gesture, and other gestures; and if the writing gesture has the highest probability score, determine that the hand gesture in the first hand video image is a writing gesture.
18. The apparatus of claim 16, wherein:
the position acquisition sub-module is configured to determine the fingertip position information of the protruding finger as the fingertip position information of the writing finger.
19. The apparatus of claim 12, wherein said apparatus further comprises:
a writing state start determining module, configured to acquire a second hand video image from the video image acquired by the camera; determine whether the hand gesture in the second hand video image is a start-writing gesture; and if the hand gesture in the second hand video image is a start-writing gesture, enter the writing state.
20. The apparatus of claim 14, wherein said apparatus further comprises:
a writing state end determining module, configured to determine that the writing state has ended if the hand gesture in the first hand video image is a selection gesture.
21. The apparatus of claim 14, wherein said apparatus further comprises:
a second deleting module, configured to clear the writing track displayed on the display screen if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state.
22. The apparatus of claim 14, wherein:
the recognition module is configured to input the fingertip position information into a handwriting engine, so that the handwriting engine recognizes the writing track according to the fingertip position information, obtains candidate information, and returns the candidate information; and to receive the candidate information returned by the handwriting engine.
23. A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the identification method according to any one of claims 1-11.
24. An electronic device comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for:
entering a writing state, and acquiring a first hand video image from a video image acquired by a camera;
performing image processing on the first hand video image to acquire fingertip position information of a writing finger, and displaying a writing track on a display screen according to the fingertip position information;
after the writing state ends, recognizing the writing track according to the fingertip position information to obtain corresponding candidate information, and presenting the candidate information;
wherein the candidate information includes a plurality of candidates, and after the presenting step the one or more programs further include instructions for:
periodically polling each candidate of the current page, and acquiring a third hand video image from the video images acquired by the camera;
when the hand gesture in the third hand video image is detected to be a start-writing gesture, committing the currently polled candidate to the screen and re-entering the writing state;
and further include instructions for:
if the gesture in the first hand video image is a delete gesture and no fingertip position information of the writing finger has been acquired since entering the writing state, deleting the most recently committed candidate.
25. The electronic device of claim 24, wherein the capturing the first hand video image from the video image captured by the camera comprises:
acquiring a video image acquired by a camera, and extracting an image of a handwriting area in the video image as the first hand video image.
26. The electronic device of claim 24, wherein the image processing on the first hand video image comprises:
preprocessing the first hand video image to obtain a first preprocessed image;
determining, according to the first preprocessed image, whether the hand gesture in the first hand video image is a writing gesture;
and if the hand gesture in the first hand video image is a writing gesture, performing the step of acquiring the fingertip position information of the writing finger.
27. The electronic device of claim 26, wherein preprocessing the first hand video image to obtain a first preprocessed image comprises:
performing background subtraction on the first hand video image using a Gaussian mixture model to extract the foreground from the first hand video image, obtaining a first image;
filtering the first image using a pre-trained skin color model to obtain a second image;
performing morphological processing on the second image to obtain a third image;
and performing binarization processing on the third image to obtain the first preprocessed image.
28. The electronic device of claim 26, wherein the determining whether the hand gesture in the first hand video image is a writing gesture based on the first pre-processed image comprises:
performing contour detection on the first preprocessed image, and determining the hand contour in the first preprocessed image;
performing convex hull detection on the hand contour, and determining the number of protruding fingers and the fingertip position information of each protruding finger;
and if the number of protruding fingers matches the set number of protruding fingers corresponding to the writing gesture, determining that the hand gesture in the first hand video image is a writing gesture.
29. The electronic device of claim 26, wherein the determining whether the hand gesture in the first hand video image is a writing gesture based on the first pre-processed image comprises:
inputting the first preprocessed image into a static gesture recognition model to obtain a probability score for each gesture, wherein the gestures include at least: a start-writing gesture, a writing gesture, a selection gesture, a delete gesture, and other gestures;
and if the writing gesture has the highest probability score, determining that the hand gesture in the first hand video image is a writing gesture.
30. The electronic device of claim 28, wherein the obtaining fingertip position information for the writing finger comprises:
determining the fingertip position information of the protruding finger as the fingertip position information of the writing finger.
31. The electronic device of claim 24, further comprising instructions to:
acquiring a second hand video image from the video image acquired by the camera;
determining whether the hand gesture in the second hand video image is a start-writing gesture;
and if the hand gesture in the second hand video image is a start-writing gesture, entering the writing state.
32. The electronic device of claim 26, further comprising instructions to:
if the hand gesture in the first hand video image is a selection gesture, determining that the writing state has ended.
33. The electronic device of claim 26, further comprising instructions to:
if the gesture in the first hand video image is a delete gesture and fingertip position information of the writing finger has been acquired since entering the writing state, clearing the writing track displayed on the display screen.
34. The electronic device of claim 24, wherein the identifying the writing trace according to the fingertip position information to obtain the corresponding candidate information comprises:
inputting the fingertip position information into a handwriting engine, so that the handwriting engine recognizes the writing track according to the fingertip position information, obtains candidate information, and returns the candidate information;
and receiving the candidate information returned by the handwriting engine.
CN201811615801.3A 2018-12-27 2018-12-27 Identification method and device and electronic equipment Active CN111382598B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811615801.3A CN111382598B (en) 2018-12-27 2018-12-27 Identification method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111382598A CN111382598A (en) 2020-07-07
CN111382598B true CN111382598B (en) 2024-05-24

Family

ID=71214489

Country Status (1)

Country Link
CN (1) CN111382598B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199015B (en) * 2020-09-15 2022-07-22 安徽鸿程光电有限公司 Intelligent interaction all-in-one machine and writing method and device thereof
CN112684895A (en) * 2020-12-31 2021-04-20 安徽鸿程光电有限公司 Marking method, device, equipment and computer storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102142084A (en) * 2011-05-06 2011-08-03 北京网尚数字电影院线有限公司 Method for gesture recognition
WO2013075466A1 (en) * 2011-11-23 2013-05-30 中兴通讯股份有限公司 Character input method, device and terminal based on image sensing module
CN103403650A (en) * 2012-10-31 2013-11-20 华为终端有限公司 Drawing control method, apparatus and mobile terminal
CN103631510A (en) * 2012-08-20 2014-03-12 三星电子株式会社 Method of controlling function execution in a mobile terminal by recognizing writing gesture and apparatus for performing the same
CN104036251A (en) * 2014-06-20 2014-09-10 上海理工大学 Method for recognizing gestures on basis of embedded Linux system
CN104793724A (en) * 2014-01-16 2015-07-22 北京三星通信技术研究有限公司 Sky-writing processing method and device
CN105320248A (en) * 2014-06-03 2016-02-10 深圳Tcl新技术有限公司 Mid-air gesture input method and device
CN107831894A (en) * 2017-11-06 2018-03-23 浙江工业大学 It is a kind of suitable for mobile terminal every empty-handed gesture writing on the blackboard method
CN108646910A (en) * 2018-03-20 2018-10-12 重庆邮电大学 A kind of Three-Dimensional Dynamic finger text input system and method based on depth image

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8817087B2 (en) * 2010-11-01 2014-08-26 Robert Bosch Gmbh Robust video-based handwriting and gesture recognition for in-car applications
US9292093B2 (en) * 2010-11-18 2016-03-22 Alpine Electronics, Inc. Interface method and apparatus for inputting information with air finger gesture

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Optical Flow Based Finger Stroke Detection; Zhongdi Zhu, Bin Li, Kongqiao Wang; Visual Communications and Image Processing 2010; 2010-08-04; full text *
Research on gesture recognition based on Leap Motion in virtual interaction; Huang Jun, Jing Hong; Application Research of Computers; 2017-04; Vol. 34, No. 4, pp. 1231-1234 *
Static gesture digit recognition based on shape features; Cui Jiali, Xie Wei, Wang Yiding, Jia Ruiming; Journal of North China University of Technology; 2015-09-15 (03); full text *
Fingertip tracking and trajectory recognition based on depth image information; Li Zhe, Peng Siwei; Computer Applications and Software; 2016-04-15 (04); full text *
Zhang Chen. Research on Moving Object Detection and Tracking in Complex Environments. Xuzhou: China University of Mining and Technology Press, 2014, p. 26. *
Huang Xiaojian. Multimedia Technology. Beijing: Beijing University of Posts and Telecommunications Press, 2010, p. 216. *

Similar Documents

Publication Publication Date Title
US11138422B2 (en) Posture detection method, apparatus and device, and storage medium
US11455491B2 (en) Method and device for training image recognition model, and storage medium
CN107688399B (en) Input method and device and input device
JP2018503201A (en) Region extraction method, model training method and apparatus
CN106484138B (en) A kind of input method and device
CN104735243A (en) Method and device for displaying contact list
WO2019007236A1 (en) Input method, device, and machine-readable medium
CN109471919B (en) Zero pronoun resolution method and device
CN111382598B (en) Identification method and device and electronic equipment
CN112381091B (en) Video content identification method, device, electronic equipment and storage medium
CN110858291A (en) Character segmentation method and device
CN113936697B (en) Voice processing method and device for voice processing
CN110554780A (en) sliding input method and device
CN109388249B (en) Input information processing method and device, terminal and readable storage medium
CN108073291B (en) Input method and device and input device
CN110968246A (en) Intelligent Chinese handwriting input recognition method and device
CN109542244B (en) Input method, device and medium
CN113127613B (en) Chat information processing method and device
US20170220847A1 (en) Method and device for fingerprint recognition
CN113873165A (en) Photographing method and device and electronic equipment
CN107688400B (en) Input error correction method and device for input error correction
CN107643821B (en) Input control method and device and electronic equipment
CN105159893A (en) Character string saving method and device
CN113220208B (en) Data processing method and device and electronic equipment
CN110703968A (en) Searching method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant