CN108846339B - Character recognition method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN108846339B
Authority
CN
China
Legal status
Active
Application number
CN201810563940.XA
Other languages
Chinese (zh)
Other versions
CN108846339A (en)
Inventor
马宝兴
黄茵洁
武学良
钟维涛
Current Assignee
Gree Electric Appliances Inc of Zhuhai
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Application filed by Gree Electric Appliances Inc of Zhuhai
Priority to CN201810563940.XA
Publication of CN108846339A
Application granted
Publication of CN108846339B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a character recognition method, a character recognition device, electronic equipment and a storage medium, which address the inaccuracy and inconvenience of character input in the prior art. The method comprises the following steps: detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture; and determining and outputting the target character corresponding to the character to be output according to each piece of stored coordinate information corresponding to the character to be output. When the user writes in mid-air, the electronic equipment determines the characters written by the user according to each three-dimensional coordinate of the target gesture during writing, so that characters can be input quickly and accurately.

Description

Character recognition method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of character recognition technologies, and in particular, to a character recognition method and apparatus, an electronic device, and a storage medium.
Background
With the progress of science, technology and society, living standards continue to improve, and various electronic devices have gradually entered people's lives and brought convenience to users.
When characters are input into existing electronic equipment, they are generally entered through a keyboard or a touch screen. However, many people are not proficient with a keyboard and type slowly. The touch screen of an electronic device is also easily damaged or may be insensitive, which makes character input through the touch screen inconvenient.
Users are generally very familiar with handwriting, and how to take advantage of this familiarity to input characters quickly and accurately is a problem to be studied.
Disclosure of Invention
The embodiment of the invention discloses a character recognition method, a character recognition device, electronic equipment and a storage medium, which are used for solving the problems of inaccuracy and inconvenience in character input in the prior art.
In order to achieve the above object, an embodiment of the present invention discloses a text recognition method applied to an electronic device, where the method includes:
detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture;
and determining and outputting the target characters corresponding to the characters to be output according to each coordinate information corresponding to the stored characters to be output.
Further, the detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera includes:
determining an overlapping area in the two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the determining coordinate information of the target gesture comprises:
according to the overlapping area, determining a first three-dimensional coordinate of the target gesture.
Further, after determining the first three-dimensional coordinates of the target gesture, the method further comprises:
and mapping the first three-dimensional coordinate into a first two-dimensional coordinate of a writing display area according to a mapping relation between a pre-stored overlapping area and the writing display area of the display screen, and displaying in the writing display area according to the first two-dimensional coordinate.
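The mapping step above can be sketched as follows. The text only states that a pre-stored mapping relation between the overlapping area and the writing display area exists; this sketch assumes, purely for illustration, a linear scaling of the x/y extent of the overlapping area onto the display rectangle with the depth component ignored. The names `Box` and `map_to_display` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned x/y extent of a region (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def map_to_display(point3d: Tuple[float, float, float],
                   overlap: Box, display: Box) -> Tuple[int, int]:
    """Map a 3D gesture coordinate to a 2D pixel in the writing display area.

    Assumption: the stored mapping relation is a plain linear scaling and
    the depth (z) component is not used.
    """
    x, y, _z = point3d
    # Normalise the position inside the overlapping area to [0, 1].
    u = (x - overlap.x_min) / (overlap.x_max - overlap.x_min)
    v = (y - overlap.y_min) / (overlap.y_max - overlap.y_min)
    # Clamp so that points slightly outside the area still land on screen.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    px = display.x_min + u * (display.x_max - display.x_min)
    py = display.y_min + v * (display.y_max - display.y_min)
    return round(px), round(py)
```

For example, with an overlapping area spanning 400 mm by 200 mm and a display area of 400 by 200 pixels, the center of the overlapping area maps to the center pixel of the display area.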
Further, after the first three-dimensional coordinate of the target gesture is determined, before the target character corresponding to the character to be output is determined according to each piece of coordinate information corresponding to the stored character to be output, the method further includes:
judging whether the current stroke is completely written or not according to the first three-dimensional coordinate;
if yes, the subsequent steps are carried out.
Further, if there is no preset target gesture for writing in the overlap area, before determining the target text corresponding to the text to be output according to each piece of coordinate information corresponding to the stored text to be output, the method further includes:
judging whether a second three-dimensional coordinate of the target gesture is stored for a previous frame of image of the current frame of image; if yes, the subsequent steps are carried out.
Further, the determining, according to each stored three-dimensional coordinate corresponding to the text to be output, the target text corresponding to the text to be output includes:
determining a first character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output;
and matching the first character with a character library stored in advance, and determining a target character corresponding to the character to be output according to a matching result.
Further, the determining, according to the matching result, the target text corresponding to the text to be output includes:
determining the second character with the highest matching degree as the target character corresponding to the character to be output; or
displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
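The two selection options above can be sketched as follows. How the matching degree itself is computed is not specified in the text, so the sketch takes the scores as given input; `rank_candidates`, `pick_target` and the `user_choice` index are hypothetical names introduced only for illustration.

```python
from typing import Dict, List, Optional


def rank_candidates(scores: Dict[str, float], top_n: int) -> List[str]:
    """Return the top_n library characters, highest matching degree first."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [ch for ch, _ in ordered[:top_n]]


def pick_target(scores: Dict[str, float], top_n: int = 1,
                user_choice: Optional[int] = None) -> Optional[str]:
    """Pick the target character.

    Without a user choice, the best match is returned (first option).
    With a user choice, it is an index into the displayed candidate list
    (second option).
    """
    candidates = rank_candidates(scores, top_n)
    if not candidates:
        return None
    if user_choice is None:
        return candidates[0]
    return candidates[user_choice]
```

Showing several candidates lets the user correct a near miss between visually similar characters, at the cost of one extra selection step.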
Further, the determining, according to each stored three-dimensional coordinate corresponding to the text to be output, the first text corresponding to the text to be output includes:
and determining a first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, wherein each stroke corresponding to the character to be output is determined according to each three-dimensional coordinate corresponding to the stroke after the stroke is determined to be completely written.
Further, according to the first three-dimensional coordinate, the process of determining whether the current stroke is completely written includes:
determining the current writing speed according to the first distance between the first three-dimensional coordinate and a second three-dimensional coordinate saved aiming at the previous frame of image of the current frame of image;
judging whether the current writing speed is within a stroke writing speed range which is stored in advance;
if so, taking the first three-dimensional coordinate as an effective three-dimensional coordinate of the determined stroke;
if not, taking the first three-dimensional coordinate as an invalid three-dimensional coordinate of the determined stroke; and judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining the stroke, if so, taking the second three-dimensional coordinate as an ending three-dimensional coordinate of the current stroke, and determining that the current stroke is completely written.
Further, after the first three-dimensional coordinates are taken as valid three-dimensional coordinates for determining strokes, the method further comprises:
judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining strokes;
and if not, taking the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
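The stroke start and end rules above can be sketched as a single pass over the per-frame coordinates. This is a minimal sketch under stated assumptions: the frame interval `frame_dt` is constant, the very first sample (which has no previous coordinate) is treated as invalid, and a stroke still open at the end of the input is emitted. None of these three points is specified in the text, and `split_strokes` is a hypothetical name.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def split_strokes(points: Sequence[Point], frame_dt: float,
                  v_min: float, v_max: float) -> List[List[Point]]:
    """Group per-frame 3D coordinates into strokes using the writing speed."""
    strokes: List[List[Point]] = []
    current: List[Point] = []
    prev = None
    prev_valid = False
    for p in points:
        if prev is None:
            # Assumption: no speed can be computed for the first sample.
            valid = False
        else:
            speed = math.dist(p, prev) / frame_dt
            valid = v_min <= speed <= v_max
        if valid:
            if not prev_valid:
                current = [p]       # previous invalid: p starts a stroke
            else:
                current.append(p)   # stroke continues
        elif prev_valid:
            strokes.append(current)  # previous coordinate ended the stroke
            current = []
        prev, prev_valid = p, valid
    if prev_valid:
        # Assumption: flush a stroke that is still open at end of input.
        strokes.append(current)
    return strokes
```

With a speed range of 0.5 to 2 units per frame, a hand that moves steadily, pauses for two frames and then moves again produces two strokes, and the paused samples belong to neither.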
Further, after determining each stroke corresponding to the character to be output, before determining the first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, the method further includes:
for each stroke, determining whether the time consumption of the stroke is less than a preset first time length threshold value;
if so, the stroke is filtered out.
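A minimal sketch of this filtering step follows, assuming each stroke carries the timestamps of its starting and ending coordinates. The `Stroke` type and its field names are hypothetical; the text only states that strokes shorter than the first time length threshold are discarded.

```python
from typing import List, NamedTuple


class Stroke(NamedTuple):
    """Start and end time of one stroke, in seconds (hypothetical type)."""
    start_time: float
    end_time: float


def filter_short_strokes(strokes: List[Stroke],
                         min_duration: float) -> List[Stroke]:
    """Drop strokes whose duration is less than the threshold."""
    return [s for s in strokes
            if (s.end_time - s.start_time) >= min_duration]
```

Very short strokes are plausibly caused by hand jitter or detection noise rather than intentional writing, which would explain why they are removed before the character is assembled.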
Further, after determining the target text corresponding to the text to be output according to each piece of stored coordinate information corresponding to the text to be output, the method further includes:
and deleting each coordinate information corresponding to the characters to be output.
Further, if the second three-dimensional coordinates of the target gesture are not saved for the previous frame image of the current frame image, the method further comprises:
and updating the frame number of the image which is saved in advance and has no target gesture in the overlapped area.
Further, the method further comprises:
when it is recognized that a character writing end condition is met, broadcasting at least one written character by voice.
Further, recognizing that the writing end condition is satisfied includes:
recognizing that a character writing ending instruction has been received; or
recognizing that the pre-saved number of frames of images in which no target gesture exists in the overlapping area is larger than a preset number.
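The two end conditions above can be sketched as a small counter object. This is a simplification under stated assumptions: every frame without the target gesture increments the counter, and the counter is reset when the gesture reappears. The text does not state when the counter is reset, and `WritingEndDetector` is a hypothetical name.

```python
class WritingEndDetector:
    """Tracks consecutive gesture-free frames to detect the end of writing."""

    def __init__(self, max_empty_frames: int) -> None:
        self.max_empty_frames = max_empty_frames
        self.empty_frames = 0

    def update(self, gesture_present: bool,
               end_instruction: bool = False) -> bool:
        """Process one frame; return True if the end condition is met."""
        if gesture_present:
            # Assumption: seeing the gesture again resets the counter.
            self.empty_frames = 0
        else:
            self.empty_frames += 1
        return end_instruction or self.empty_frames > self.max_empty_frames
```

With a preset number of 3, the condition becomes true on the fourth consecutive frame without the gesture, or immediately when an explicit ending instruction arrives.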
The embodiment of the invention discloses a character recognition device, which comprises:
the target gesture detection module is used for detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by the camera;
the coordinate information determining module is used for determining and storing the coordinate information of the target gesture when the detection result of the target gesture detecting module is positive;
the character determining module is used for determining target characters corresponding to the characters to be output according to each piece of coordinate information corresponding to the stored characters to be output;
and the display module is used for displaying the determined target characters.
Further, the target gesture detection module is specifically configured to determine an overlapping area in two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the coordinate information determining module is specifically configured to determine a first three-dimensional coordinate of the target gesture according to the overlap area when the detection result of the target gesture detecting module is yes.
Further, the display module is further configured to map the first three-dimensional coordinate into a first two-dimensional coordinate of a writing display area of a display screen according to a mapping relationship between a pre-stored overlapping area and the writing display area after the first three-dimensional coordinate of the target gesture is determined, and to perform display in the writing display area according to the first two-dimensional coordinate.
Further, the apparatus further comprises:
the first judgment module is used for judging whether the current stroke is completely written according to the first three-dimensional coordinate after the coordinate information determination module determines the first three-dimensional coordinate of the target gesture;
and if the judgment result of the first judgment module is yes, triggering the character determination module.
Further, the apparatus further comprises:
the second judgment module is used for judging whether a second three-dimensional coordinate of the target gesture is stored for a previous frame image of a current frame image or not when the detection result of the target gesture detection module is negative; and if the judgment result of the second judgment module is yes, triggering the character determination module.
Further, the character determining module is specifically configured to determine, according to each stored three-dimensional coordinate corresponding to a character to be output, a first character corresponding to the character to be output;
and matching the first character with a character library stored in advance, and determining a target character corresponding to the character to be output according to a matching result.
Further, the character determining module is specifically configured to determine a second character with a highest matching degree as a target character corresponding to the character to be output; or displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
Further, the character determining module is specifically configured to determine, according to each stroke corresponding to the determined character to be output, a first character corresponding to the character to be output, where each stroke corresponding to the character to be output is determined according to each three-dimensional coordinate corresponding to the stroke after it is determined that the stroke is completely written.
Further, the text determination module is specifically configured to determine a current writing speed according to a first distance between the first three-dimensional coordinate and a second three-dimensional coordinate stored for a previous frame of image of the current frame of image;
judging whether the current writing speed is within a stroke writing speed range which is stored in advance;
if so, taking the first three-dimensional coordinate as an effective three-dimensional coordinate of the determined stroke;
if not, taking the first three-dimensional coordinate as an invalid three-dimensional coordinate of the determined stroke; and judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining the stroke, if so, taking the second three-dimensional coordinate as an ending three-dimensional coordinate of the current stroke, and determining that the current stroke is completely written.
Further, the character determining module is further configured to, after the first three-dimensional coordinate is used as an effective three-dimensional coordinate for determining strokes, determine whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining strokes;
and if not, taking the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
Further, the character determining module is further configured to determine, after determining each stroke corresponding to the character to be output, whether the time consumption of the stroke is less than a preset first time-length threshold value for each stroke before determining, according to each stroke corresponding to the determined character to be output, the first character corresponding to the character to be output; if so, the stroke is filtered out.
Further, the apparatus further comprises:
and the deleting module is used for deleting each coordinate information corresponding to the character to be output after the execution of the character determining module is finished.
Further, the character determining module is further configured to update the number of frames of the image, which is pre-stored and has no target gesture in the overlap area, if the second three-dimensional coordinate of the target gesture is not stored in the previous frame of image of the current frame of image.
Further, the apparatus further comprises:
and the voice broadcasting module is used for broadcasting at least one written character by voice when it is recognized that the writing ending condition is met.
Further, the voice broadcasting module is specifically configured to recognize that a character writing ending instruction has been received; or to recognize that the pre-stored number of frames of images in which no target gesture exists in the overlapping area is larger than the preset number.
The embodiment of the invention discloses an electronic device, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory has stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of any of the methods described above.
The embodiment of the invention discloses a computer readable storage medium, which stores a computer program executable by an electronic device, and when the program runs on the electronic device, the electronic device is caused to execute the steps of any one of the methods.
The invention discloses a character recognition method, a character recognition device, electronic equipment and a storage medium, wherein the method comprises the following steps: detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture; and determining and outputting the target character corresponding to the character to be output according to each piece of stored coordinate information corresponding to the character to be output. When the user writes in mid-air, the electronic equipment determines the characters written by the user according to each three-dimensional coordinate of the target gesture during writing, so that characters can be input quickly and accurately.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1A is a schematic diagram of a character recognition process according to an embodiment of the present invention;
FIG. 1B is a schematic diagram of a character recognition process according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a process for determining strokes of a character according to an embodiment of the present invention;
FIG. 3 is a block diagram of a character recognition apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a character recognition method, which is applied to electronic equipment and comprises the following steps:
detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture;
and determining and outputting the target characters corresponding to the characters to be output according to each coordinate information corresponding to the stored characters to be output.
The character recognition method provided by the embodiment of the invention is applied to electronic equipment, which may be a computer, a mobile phone, or the like. The camera on the electronic equipment collects image information in the form of video. When recognizing characters from the video stream, the electronic equipment can either recognize characters in real time from each collected frame, or store the collected video stream and recognize characters from the stored stream.
In the embodiment of the invention, before the user writes in mid-air, the gesture used for writing is first entered into the electronic device. The gesture may be a palm, a fist or a finger, and the gesture used for writing is called the target gesture.
The electronic equipment analyzes each frame of image in the video stream, determines whether the target gesture exists in each frame of image, determines coordinate information corresponding to the target gesture if the target gesture exists, and stores the corresponding coordinate information for each frame.
The electronic equipment can determine and output the target characters corresponding to the characters to be output according to the stored coordinate information corresponding to the characters to be output.
After the electronic device knows what the target gesture is, the process of determining whether the target gesture exists belongs to the prior art, and details are not repeated in the embodiment of the present invention.
In the embodiment of the invention, the user writes characters in mid-air, and the electronic equipment determines the written characters from the coordinate information of the target gesture during writing, so that the characters can be determined quickly and accurately.
Example 1:
In order to make the determined character more accurate, the detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera includes:
determining an overlapping area in the two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the determining coordinate information of the target gesture comprises:
according to the overlapping area, determining a first three-dimensional coordinate of the target gesture.
The character recognition method provided by the embodiment of the invention is applied to electronic equipment, the electronic equipment comprises at least two cameras, and an overlapping area exists in image frames acquired by the at least two cameras.
The embodiment of the invention identifies the characters of the video streams collected by two cameras with overlapped areas in the collected image frames.
The two cameras for collecting the video streams can be arranged at any positions, as long as their collecting areas have an overlapping area; preferably, the two cameras form a binocular camera. The positions of the two cameras are generally fixed throughout the character recognition process, so the overlapping area of the images collected at any moment is generally the same throughout that process.
Each camera continuously collects images, and both cameras collect images at the same moments, so two frames of images are collected at each moment. The electronic equipment can analyze the two frames collected at the same moment and determine the overlapping area in them.
The image currently being analyzed to determine the overlapping area is called the current frame image.
The process of determining the overlapping area in the two images belongs to the prior art, and is not described in detail in the embodiment of the present invention.
When the electronic device analyzes the current frame images acquired by the cameras, it can determine whether the target gesture for writing exists in the overlapping area of the two current frame images. If so, the user is currently writing in mid-air; if not, the user is not currently writing in mid-air.
The process of detecting whether the target gesture exists in the overlapping area by the electronic device belongs to the prior art, and is not described in detail in the embodiment of the present invention.
The character output by the electronic equipment is called a target character, and the target character can be understood as the finally recognized character.
Generally, when the electronic device detects that the target gesture exists in the overlap area, the three-dimensional coordinates of the target gesture can be determined according to the overlap area, and the three-dimensional coordinates of the target gesture are saved for each frame of image. The three-dimensional coordinates determined from the overlap region corresponding to the current frame are referred to as first three-dimensional coordinates.
At any given moment the target gesture has a single three-dimensional coordinate. The target gesture may be a palm, a fist or a finger, each of which typically occupies multiple coordinate points; one of them, such as the center point, may be selected as the first three-dimensional coordinate of the target gesture.
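One way to reduce the many points of a detected hand to a single coordinate is to take their mean position. The sketch below assumes that "center point" means the arithmetic mean of the points; the text does not define it more precisely, and `gesture_centroid` is a hypothetical name.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def gesture_centroid(points: Sequence[Point]) -> Point:
    """Return the mean position of the points occupied by the gesture."""
    n = len(points)
    if n == 0:
        raise ValueError("gesture has no points")
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    return (sx / n, sy / n, sz / n)
```

Whichever point is chosen, using the same rule in every frame matters more than the rule itself, because the stroke is reconstructed from the frame-to-frame motion of that point.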
The process of determining the three-dimensional coordinates of the target gesture according to the overlap area belongs to the prior art, and is not described in detail in the embodiment of the present invention.
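For readers unfamiliar with that prior art, a common approach for a calibrated and rectified binocular camera is triangulation from disparity. The sketch below is not taken from the patent; it uses the standard pinhole relation Z = f * B / d, and all parameter names (`focal_px`, `baseline`, `cx`, `cy`) are assumptions introduced for illustration.

```python
from typing import Tuple


def triangulate(u_left: float, v_left: float, u_right: float,
                focal_px: float, baseline: float,
                cx: float, cy: float) -> Tuple[float, float, float]:
    """Recover a 3D point from matching pixels in a rectified stereo pair.

    u_left, v_left : pixel position of the point in the left image
    u_right        : horizontal pixel position in the right image
    focal_px       : focal length in pixels (same for both cameras)
    baseline       : distance between the two camera centers
    cx, cy         : principal point of the left camera in pixels
    The result is expressed in the left camera frame, in baseline units.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a visible point")
    z = focal_px * baseline / disparity
    x = (u_left - cx) * z / focal_px
    y = (v_left - cy) * z / focal_px
    return (x, y, z)
```

For example, with a focal length of 800 pixels, a 100 mm baseline and a disparity of 40 pixels, the point lies 2000 mm in front of the cameras.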
In the embodiment of the present invention, the condition under which the target character corresponding to the character to be output is determined according to each piece of stored coordinate information may be specifically implemented as follows:
if the preset target gesture for writing does not exist in the overlapping area, judging whether a second three-dimensional coordinate of the target gesture is stored for a previous frame of image of the current frame of image; if so, determining that the target character determining condition is currently met, that is, the target character corresponding to the character to be output can be determined according to each piece of coordinate information corresponding to the stored character to be output.
If the second three-dimensional coordinate is not stored, the next frame image can be taken as the current frame image for analysis and recognition.
If the user does not start writing, or the user has written a certain character at the moment corresponding to the previous image frame and does not write the next character at present, the target gesture will not be detected in the overlapping area corresponding to the current frame image, and the target gesture will not be detected in the overlapping area corresponding to the previous frame image of the current frame image.
After it is determined that the target gesture does not exist in the overlapping area corresponding to the current frame, that is, the user is not currently writing in mid-air, whether the user had just finished writing a character at the moment corresponding to the previous frame image can be determined according to whether the target gesture exists in the overlapping area corresponding to the previous frame image. If so, the target character determining condition is met and the target character needs to be determined.
When determining whether the target gesture exists in the overlapping area corresponding to the previous frame of image of the current frame of image, it may be determined whether a second three-dimensional coordinate of the target gesture is stored for the previous frame of image of the current frame of image, and the three-dimensional coordinate of the target gesture corresponding to the previous frame of image of the current frame of image is referred to as the second three-dimensional coordinate. If the second three-dimensional coordinate is stored, it can be determined that the user has just written one word at the time corresponding to the previous frame, and the word can be determined.
The electronic equipment stores each piece of coordinate information, namely each three-dimensional coordinate, corresponding to the character to be output. When it determines that the target character determining condition is currently met, it can determine the target character corresponding to the character to be output according to each stored three-dimensional coordinate. After the target character is determined, it can be displayed on the screen of the electronic equipment.
While a user is writing a character, the target gesture for writing stays in the overlapping area; once the target gesture is no longer in the overlapping area, the user is considered to have finished writing the character. The target gesture may be absent from the overlapping area because the user has moved the hand out of the overlapping area, because the user has left the overlapping area, or because the user has changed to a gesture different from the target gesture. Each three-dimensional coordinate corresponding to the character to be output may be each three-dimensional coordinate determined in a first time period, where the target gesture exists in the overlapping area corresponding to every frame image in the first time period, and the first time period includes the image frame adjacent to the current frame image.
Assume that the target gesture exists in the overlapping areas corresponding to the 1st to 3rd frame images, does not exist in the 4th frame image, and exists in the overlapping areas corresponding to the 5th to 10th frame images; the three-dimensional coordinate of the target gesture is determined for each frame image in which it exists. If the current frame is the 11th frame and the target gesture does not exist in the overlapping area corresponding to the current frame image, the character can be determined according to each three-dimensional coordinate determined in the 5th to 10th frames, and the first time period is the time period formed by the 5th to 10th frames. The 1st to 3rd frames correspond to a character already written by the user; because the target gesture does not exist in the 4th frame, that character was already determined at that point and does not need to be determined again.
In the embodiment of the invention, the acquisition regions of the two cameras have an overlapping region, and the electronic device can predetermine the spatial lattice corresponding to the overlapping region. Each time the electronic device determines a three-dimensional coordinate, it may determine the pixel point corresponding to that coordinate in the spatial lattice. When determining the target character corresponding to the character to be output according to each three-dimensional coordinate corresponding to the character to be output, it may use the pre-stored three-dimensional coordinate of each pixel point in the lattice of the overlapping region to determine the target pixel point corresponding to each three-dimensional coordinate, and then determine the character formed by the target pixel points as the target character. The three-dimensional coordinate of each target pixel point in the lattice is the same as the corresponding three-dimensional coordinate determined according to the overlapping area.
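The lookup from measured coordinates to lattice pixel points can be illustrated with a short sketch. It assumes a regular lattice described by a hypothetical `origin` and `spacing`, and it snaps each coordinate to the nearest lattice point by rounding; the description above only states that the lattice coordinate equals the measured coordinate, so the rounding step is an assumption made for illustration.

```python
from typing import Iterable, Set, Tuple

Point3D = Tuple[float, float, float]
LatticeIndex = Tuple[int, int, int]


def to_lattice(
    points: Iterable[Point3D],
    origin: Point3D,
    spacing: float,
) -> Set[LatticeIndex]:
    """Return the set of lattice pixel points hit by the given coordinates.

    origin and spacing are hypothetical parameters of a regular lattice;
    each coordinate is snapped to its nearest lattice index (assumption).
    """
    target_pixels: Set[LatticeIndex] = set()
    for point in points:
        index = tuple(round((c - o) / spacing) for c, o in zip(point, origin))
        target_pixels.add(index)
    return target_pixels
```

The resulting set of target pixel points is what the later character determination step would treat as the written shape.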
Fig. 1A is a schematic diagram of a text recognition process provided in an embodiment of the present invention, where the process includes the following steps:
S101: Determine the overlapping area in the two current frame images according to the current frame images respectively acquired by the two cameras.
S102: Detect whether a preset target gesture for writing exists in the overlapping area; if yes, perform S103; if no, perform S104.
S103: Determine and save a first three-dimensional coordinate of the target gesture according to the overlapping area.
S104: Judge whether a second three-dimensional coordinate of the target gesture is stored for the previous frame image of the current frame image; if yes, perform S105; if no, take the next frame image as the current frame image and continue the analysis and recognition.
S105: Determine and output the target character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output.
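The flow of S102 to S105 can be sketched as a small per-frame loop. This is a minimal illustration rather than the patented implementation: each frame is assumed to be already reduced to either a three-dimensional coordinate (target gesture found in the overlapping area) or `None` (no target gesture), and `recognize` is a hypothetical placeholder for the character determination step.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Point3D = Tuple[float, float, float]


def segment_characters(
    frames: Iterable[Optional[Point3D]],
    recognize: Callable[[List[Point3D]], str],
) -> List[str]:
    """Run S102-S105 over a sequence of per-frame detection results.

    Each item is the 3D coordinate of the target gesture in that frame,
    or None when no target gesture is in the overlapping area.
    """
    saved: List[Point3D] = []  # coordinates of the character being written
    output: List[str] = []
    for coord in frames:
        if coord is not None:
            # S103: gesture present, save its coordinate.
            saved.append(coord)
        elif saved:
            # S104/S105: gesture absent now but a coordinate was saved for
            # the previous frame, so the character is finished.
            output.append(recognize(saved))
            saved = []  # clear stored coordinates before the next character
        # Otherwise the gesture was absent in both frames: go to next frame.
    return output
```

Because the stored coordinates are cleared after each character, a non-empty `saved` list on a gesture-free frame means the previous frame did contain the gesture. Coordinates collected after the last gesture-free frame are not recognized until another gesture-free frame arrives.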
In the embodiment of the invention, the user writes the characters in the near space, and the electronic equipment determines the characters written in the near space by determining the three-dimensional coordinates of the target gesture when the user writes, so that the characters can be determined quickly and accurately.
Example 2:
on the basis of the above embodiment, in an embodiment of the present invention, after the first three-dimensional coordinate of the target gesture is determined, before the target text corresponding to the text to be output is determined according to each piece of stored coordinate information corresponding to the text to be output, the method further includes:
judging whether the current stroke is completely written or not according to the first three-dimensional coordinate;
if yes, the subsequent steps are carried out.
In the embodiment of the invention, when a user writes a character, the target gesture for writing remains in the overlapping area. A user generally writes a character stroke by stroke; after finishing one stroke of a character, the target gesture does not leave the overlapping area while moving on to the next stroke of that character, so the target gesture still exists in the overlapping area.
After determining each three-dimensional coordinate of the target gesture, the electronic device may determine whether the stroke is completely written according to the determined three-dimensional coordinate, if the stroke is completely written, it is determined that the target character determination condition is satisfied, and may determine the target character corresponding to the character to be output according to each stored coordinate information corresponding to the character to be output.
Taking the first three-dimensional coordinate as an example, if a target gesture exists in the overlapping area, determining whether the current stroke is completely written according to the first three-dimensional coordinate, and if so, determining that a target character determination condition is currently met; if not, determining that the target character determination condition is not met currently.
Fig. 1B is a schematic diagram of a character recognition process provided in an embodiment of the present invention, where the character recognition process includes the following steps:
S106: Determine the overlapping area in the two current frame images according to the current frame images respectively acquired by the two cameras.
S107: Detect whether a preset target gesture for writing exists in the overlapping area; if yes, perform S108.
S108: Determine and save a first three-dimensional coordinate of the target gesture according to the overlapping area.
S109: Determine and output the target character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output.
Example 3:
in order to facilitate the user to view his writing situation, on the basis of the foregoing embodiments, in an embodiment of the present invention, after determining the first three-dimensional coordinate of the target gesture, the method further includes:
and mapping the first three-dimensional coordinate into a first two-dimensional coordinate of a writing display area according to a mapping relation between a pre-stored overlapping area and the writing display area of the display screen, and displaying in the writing display area according to the first two-dimensional coordinate.
In the embodiment of the present invention, a writing display area is provided in the display screen of the electronic device, and the writing display area can show what the user is writing. A mapping relationship between the overlapping area and the writing display area of the display screen is pre-stored in the electronic device, and a three-dimensional coordinate can be mapped into a two-dimensional coordinate according to this mapping relationship. That is, after the first three-dimensional coordinate is determined, it can be mapped into a first two-dimensional coordinate of the writing display area according to the mapping relationship and then displayed in the writing display area according to the first two-dimensional coordinate; specifically, the pixel point corresponding to the first two-dimensional coordinate can be displayed in the writing display area.
In general, a writing display area displays pixel points corresponding to two-dimensional coordinates after mapping of each three-dimensional coordinate corresponding to characters to be output. The user can check the writing condition of the user according to the pixel points displayed in the writing display area, correct errors in time and improve the writing accuracy.
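A minimal sketch of such a mapping follows. The description does not specify the form of the pre-stored mapping relationship, so this example assumes a simple linear scaling of the x and y components onto the display area with the depth component discarded; the function name and all parameters are hypothetical.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]


def map_to_display(
    point: Point3D,
    region_min: Tuple[float, float],
    region_max: Tuple[float, float],
    display_size: Tuple[int, int],
) -> Tuple[int, int]:
    """Map a 3D coordinate in the overlapping area to a display pixel.

    Assumption: the mapping is linear in x and y and ignores depth.
    region_min/region_max bound the overlapping area in x and y;
    display_size is (width, height) of the writing display area in pixels.
    """
    x, y, _depth = point
    (x0, y0), (x1, y1) = region_min, region_max
    width, height = display_size
    # Normalize to [0, 1] and clamp points that fall outside the region.
    u = min(max((x - x0) / (x1 - x0), 0.0), 1.0)
    v = min(max((y - y0) / (y1 - y0), 0.0), 1.0)
    return int(u * (width - 1)), int(v * (height - 1))
```

Calling this for every saved coordinate of the character to be output yields the pixel points to draw in the writing display area.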
Example 4:
in order to make the output target text more accurate, on the basis of the foregoing embodiments, in an embodiment of the present invention, determining the target text corresponding to the text to be output according to each stored three-dimensional coordinate corresponding to the text to be output includes:
determining a first character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output;
and matching the first character with a character library stored in advance, and determining a target character corresponding to the character to be output according to a matching result.
In the embodiment of the present invention, a text library is pre-stored in the electronic device, and after determining each coordinate information, i.e., three-dimensional coordinates, of the text to be output, a first text corresponding to the text to be output may be determined according to each three-dimensional coordinate, where a specific process is similar to that in the above embodiment, and is not described herein again.
After the first character is determined, it can be matched with the characters in the character library, and the target character corresponding to the character to be output is determined according to the matching result. The matching result may be that no character matches, in which case no target character can be output, or that a plurality of characters match.
When the target character corresponding to the character to be output is determined according to the matching result, the second character with the highest matching degree can be determined as the target character corresponding to the character to be output; or displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
The process of matching the first character with the character library and determining the matching degree of the second character in the character library and the determined first character belongs to the prior art, and is not repeated in the embodiment of the invention.
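The two selection strategies (take the best match, or show the top candidates for the user to choose) can be sketched as follows. Since the description leaves the matching degree to the prior art, this example substitutes a Jaccard overlap between pixel sets purely as a stand-in; `rank_candidates`, the library layout, and the scoring are all assumptions.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Pixel = Tuple[int, int]


def rank_candidates(
    first_character: Set[Pixel],
    library: Dict[str, FrozenSet[Pixel]],
    top_n: int = 3,
) -> List[str]:
    """Return up to top_n library characters, best match first.

    The matching degree here is Jaccard overlap of pixel sets, used only
    as a placeholder for whatever prior-art matcher is actually applied.
    """
    scored = []
    for character, pixels in library.items():
        union = first_character | pixels
        score = len(first_character & pixels) / len(union) if union else 0.0
        if score > 0.0:  # characters with no overlap count as "no match"
            scored.append((score, character))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [character for _score, character in scored[:top_n]]


# Hypothetical two-entry library for illustration.
LIBRARY = {
    "one": frozenset({(0, 0), (1, 0), (2, 0)}),
    "two": frozenset({(0, 0), (1, 0), (2, 0), (0, 2), (1, 2), (2, 2)}),
}
```

Taking the first element of the returned list corresponds to outputting the second character with the highest matching degree; displaying the whole list corresponds to letting the user select. An empty list corresponds to the case in which no target character can be output.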
One specific embodiment: the method comprises the steps that a preset target gesture for writing does not exist in an overlapping area, second three-dimensional coordinates of the target gesture are stored for a previous frame of image of a current frame of image, it is determined that a target character determining condition is met, and then a first character corresponding to a character to be output is determined according to each stored three-dimensional coordinate corresponding to the character to be output; matching the first characters with a character library stored in advance, and determining second characters with the highest matching degree as target characters corresponding to the characters to be output, or displaying a preset number of second characters from high to low according to the matching degree; and determining the second character selected by the user as the target character corresponding to the character to be output.
Another specific embodiment: a preset target gesture for writing exists in the overlapping area, and according to the first three-dimensional coordinates, it is determined that the current stroke is completely written, it is determined that a target character determination condition is met, and then according to each stored three-dimensional coordinate corresponding to the character to be output, a first character corresponding to the character to be output is determined, that is, according to each stroke corresponding to the character to be output, a first character corresponding to the character to be output is determined; matching the first characters with a character library stored in advance, and displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
In the embodiment of the invention, the first characters are matched with the second characters in the character library to determine the target characters, so that the characters can be determined more accurately.
Example 5:
on the basis of the above embodiments, in the embodiment of the present invention, determining the first character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output includes:
and determining a first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, wherein each stroke corresponding to the character to be output is determined according to each three-dimensional coordinate corresponding to the stroke after the stroke is determined to be completely written.
In the embodiment of the present invention, when determining the first character, each stroke corresponding to the character to be output may be determined according to each three-dimensional coordinate corresponding to the character to be output, and subsequently, the first character corresponding to the character to be output may be determined according to each determined stroke.
Each time the electronic device determines the three-dimensional coordinate of the target gesture corresponding to a frame image, it may judge whether the stroke is completely written and, if so, perform the process of determining the stroke. Alternatively, it may determine the strokes after determining that the user has finished writing a character, that is, after determining that the second three-dimensional coordinate of the target gesture is stored for the previous frame image of the current frame image. The strokes may also be determined at any other time before the character is determined.
In the embodiment of the present invention, after each three-dimensional coordinate is determined, it may be determined whether the stroke is completely written, taking determining the first three-dimensional coordinate as an example for description, as shown in fig. 2, a process of determining whether the current stroke is completely written according to the first three-dimensional coordinate includes the following steps:
S201: Determine the current writing speed according to the first distance between the first three-dimensional coordinate and the stored second three-dimensional coordinate.
The second three-dimensional coordinate is the three-dimensional coordinate saved for the previous frame image of the current frame image.
S202: Judge whether the current writing speed is within the pre-stored stroke writing speed range; if yes, perform S203; if no, perform S207.
S203: Take the first three-dimensional coordinate as a valid three-dimensional coordinate for determining the stroke.
S207: Take the first three-dimensional coordinate as an invalid three-dimensional coordinate for determining the stroke, and judge whether the second three-dimensional coordinate is a valid three-dimensional coordinate for determining the stroke; if yes, perform S208.
S208: Take the second three-dimensional coordinate as the ending three-dimensional coordinate of the current stroke, and determine that the current stroke is completely written.
Further, the current stroke corresponding to the character to be output can be determined according to each effective three-dimensional coordinate corresponding to the stored current stroke.
In the embodiment of the invention, a user typically has a habitual writing speed when writing a character, and the speed is generally roughly constant while a single stroke is being written.
After the electronic device determines the three-dimensional coordinates of the target gesture each time, whether the writing of one stroke is finished or not can be determined according to the comparison between the determined three-dimensional coordinates and the three-dimensional coordinates determined last time.
First, taking a current frame image as an example, the electronic device may determine a current writing speed according to a first three-dimensional coordinate corresponding to the current frame and a second three-dimensional coordinate corresponding to a previous frame of the current frame, specifically, determine a first distance between the first three-dimensional coordinate and the second three-dimensional coordinate, and determine the current writing speed according to the first distance and a time interval between two adjacent image frames.
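The speed computation described here is the Euclidean distance between the two coordinates divided by the frame interval. A minimal sketch, with hypothetical function and parameter names and the frame interval given in seconds:

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]


def writing_speed(first: Point3D, second: Point3D, frame_interval: float) -> float:
    """Speed between the previous frame (second) and current frame (first).

    The first distance is the Euclidean distance between the two
    coordinates; frame_interval is the time between adjacent frames.
    """
    first_distance = math.dist(first, second)
    return first_distance / frame_interval


def is_within_range(speed: float, speed_range: Tuple[float, float]) -> bool:
    """Check the speed against a pre-stored (low, high) stroke speed range."""
    low, high = speed_range
    return low <= speed <= high
```

For a camera running at 25 frames per second, `frame_interval` would be 0.04 under this convention.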
The electronic device may store a stroke writing speed range in advance, where the speed range may be set by the user in the electronic device, or may be determined by the electronic device according to the user's writing speed within a preset time period.
After the current writing speed is determined, it can be judged whether the current writing speed is within the pre-stored stroke writing speed range. If yes, the user is considered to be writing a stroke normally, and the first three-dimensional coordinate is taken as a valid three-dimensional coordinate for determining the stroke; if not, the user is considered not to be writing a stroke normally, and the first three-dimensional coordinate is taken as an invalid three-dimensional coordinate for determining the stroke.
If the first three-dimensional coordinate is the effective three-dimensional coordinate for determining the stroke, the role of the first three-dimensional coordinate in the stroke may also be determined according to whether the second three-dimensional coordinate is the effective three-dimensional coordinate, and specifically, after the first three-dimensional coordinate is taken as the effective three-dimensional coordinate for determining the stroke, as shown in fig. 2, the method further includes the following steps:
S204: Judge whether the second three-dimensional coordinate is a valid three-dimensional coordinate for determining the stroke; if yes, perform S205; if no, perform S206.
S205: Take the first three-dimensional coordinate as an intermediate three-dimensional coordinate of the current stroke.
S206: Take the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
If the first three-dimensional coordinate is an invalid three-dimensional coordinate for determining the stroke, whether the stroke is completely written can be determined according to whether the second three-dimensional coordinate is a valid three-dimensional coordinate. Specifically, it is judged whether the second three-dimensional coordinate is a valid three-dimensional coordinate for determining the stroke. If yes, the stroke is determined to be completely written, and the second three-dimensional coordinate is taken as the ending three-dimensional coordinate of the current stroke. If the second three-dimensional coordinate is also an invalid three-dimensional coordinate for determining the stroke, the current movement is not a normal stroke of the character.
After the second three-dimensional coordinate is taken as the ending three-dimensional coordinate of the current stroke, the current stroke corresponding to the character to be output can be determined according to each saved effective three-dimensional coordinate corresponding to the current stroke.
The valid three-dimensional coordinates corresponding to the current stroke comprise the starting three-dimensional coordinate, the ending three-dimensional coordinate and the intermediate three-dimensional coordinates of the current stroke. The starting three-dimensional coordinate is the three-dimensional coordinate corresponding to the starting point of the stroke, and the ending three-dimensional coordinate is the three-dimensional coordinate corresponding to the ending point of the stroke. The three-dimensional coordinates determined in the time period between the time of determining the starting three-dimensional coordinate and the time of determining the ending three-dimensional coordinate are the intermediate three-dimensional coordinates of the stroke. The starting point, the ending point and each point corresponding to an intermediate three-dimensional coordinate are pixel points in the lattice.
Each stroke represents the motion track of the target gesture, the starting point of the stroke is the starting point of the motion track, and the ending point of the stroke is the ending point of the motion track.
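The classification in S201 to S208 can be sketched as a single pass over the saved coordinates. This is an illustrative sketch under stated assumptions: the very first coordinate has no predecessor and is treated as invalid, a stroke that is still in progress when the input ends is not emitted, and the names `split_strokes`, `frame_interval` and `speed_range` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]


def split_strokes(
    coords: Sequence[Point3D],
    frame_interval: float,
    speed_range: Tuple[float, float],
) -> List[List[Point3D]]:
    """Group consecutive valid coordinates into completed strokes."""
    low, high = speed_range
    strokes: List[List[Point3D]] = []
    current: List[Point3D] = []
    previous: Optional[Point3D] = None
    previous_valid = False
    for coord in coords:
        valid = False
        if previous is not None:
            # S201/S202: speed from the distance to the previous coordinate.
            speed = math.dist(coord, previous) / frame_interval
            valid = low <= speed <= high
        if valid:
            # S203-S206: a starting coordinate if the previous one was
            # invalid, otherwise an intermediate coordinate.
            current.append(coord)
        elif previous_valid:
            # S207/S208: the previous coordinate ends the current stroke.
            strokes.append(current)
            current = []
        previous, previous_valid = coord, valid
    return strokes
```

Each returned list runs from the starting coordinate through the intermediate coordinates to the ending coordinate of one stroke. A pause (speed below the range) or a jump between strokes (speed above the range) both close the stroke in progress.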
Example 6:
in order to improve the accuracy of the determined characters, on the basis of the above embodiments, in an embodiment of the present invention, after each stroke corresponding to a character to be output is determined, before a first character corresponding to the character to be output is determined according to each stroke corresponding to the determined character to be output, the method further includes:
determining whether the time consumption of each stroke is less than a preset first time length or not;
if so, the stroke is filtered out.
In the embodiment of the invention, a first duration is pre-stored in the electronic device. Writing a stroke that belongs to a character generally takes a relatively long time, whereas a stroke that does not belong to the character is considered to take only a short time. To determine the character more accurately, such useless strokes can be filtered out. Specifically, for each stroke, it is determined whether the time consumption of the stroke is less than the preset first duration; if so, the stroke is filtered out, and the first character is determined according to the remaining strokes.
The process of determining the elapsed time of a stroke includes: and determining the frame number of the image frame corresponding to the stroke, and determining the time consumption of the stroke according to the frame number and the time interval of two adjacent frames of images.
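A minimal sketch of this filter follows. It assumes the time consumption of a stroke is the number of its frames multiplied by the frame interval; the description only says the two quantities are combined, so the exact product is an assumption, as are the names used.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def filter_short_strokes(
    strokes: Sequence[Sequence[Point3D]],
    frame_interval: float,
    first_duration: float,
) -> List[Sequence[Point3D]]:
    """Drop strokes whose time consumption is below the first duration.

    Each stroke holds one coordinate per image frame, so its frame count
    is its length (assumption: duration = frame count * frame interval).
    """
    kept = []
    for stroke in strokes:
        time_consumption = len(stroke) * frame_interval
        if time_consumption >= first_duration:
            kept.append(stroke)
    return kept
```

With a 0.04 second frame interval and a 0.2 second threshold, a two-frame flick would be discarded while a ten-frame stroke would be kept.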
Example 7:
in order to make the determined text more accurate and save the storage space for the electronic device, on the basis of the foregoing embodiments, in an embodiment of the present invention, after determining the target text corresponding to the text to be output according to each piece of stored coordinate information corresponding to the text to be output, the method further includes:
and deleting each coordinate information corresponding to the characters to be output.
After the electronic equipment determines one character each time, the coordinate information, such as the three-dimensional coordinate, corresponding to the character can be deleted, so that when the next character is determined, the three-dimensional coordinates corresponding to the next character can be clearly and accurately known, that is, each three-dimensional coordinate corresponding to the character to be output can be all the three-dimensional coordinates currently stored in the electronic equipment, and the accuracy of determining the character is improved. On the other hand, the storage space is saved for the electronic equipment.
Example 8:
on the basis of the foregoing embodiments, in an embodiment of the present invention, if the second three-dimensional coordinate of the target gesture is not stored for a previous frame image of a current frame image, the method further includes:
and updating the frame number of the image which is saved in advance and has no target gesture in the overlapped area.
When determining that the target gesture does not exist in the overlap region corresponding to a certain frame, the electronic device may record the number of frames of the image in which the target gesture does not exist in the overlap region, and if the target gesture does not exist in the overlap region corresponding to the previous frame of the current frame, the electronic device may update the stored number of frames, and generally add 1 to the stored number of frames.
Example 9:
When a user writes in the near space, the user generally writes not just one character but a plurality of characters, and the electronic device can also recognize whether the user has finished writing. On the basis of the above embodiments, in the embodiment of the present invention, the method further includes:
when the recognition meets the character writing end condition, voice broadcasting is carried out on at least one written character.
In the embodiment of the invention, the condition of ending writing of the characters is pre-stored in the electronic equipment, when the condition of ending writing of the characters is identified to be met, the written characters can be displayed for a user, and in order to further improve the user experience, the electronic equipment can also broadcast the characters written by the user.
When the electronic equipment identifies whether the character writing end condition is met, whether a character writing end instruction is received or not can be identified; if so, determining that the character writing end condition is met.
The word writing end instruction can be an end operation of the user on a screen of the electronic device, or the electronic device receives an end instruction sent by other devices.
When the electronic device identifies whether the character writing end condition is met, it may also detect whether the preset target gesture for writing has been absent from the overlapping region for longer than a preset second duration; if so, it determines that the character writing end condition is met. Specifically, if the stored number of frames of images in which the target gesture does not exist in the overlapping region is greater than a preset number, it is determined that the character writing end condition is satisfied. The preset number is typically greater than 2, for example 5 or 10.
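Both end conditions (an explicit end instruction, or too many gesture-free frames) can be combined in a small counter. This sketch simplifies the bookkeeping by counting every consecutive gesture-free frame and resetting the count when the gesture reappears; the reset, the class name and the default threshold of 5 are assumptions made for illustration.

```python
class WritingEndDetector:
    """Track gesture-free frames and report when writing has ended."""

    def __init__(self, preset_number: int = 5) -> None:
        self.preset_number = preset_number
        self.missing_frames = 0  # frames without the target gesture

    def update(self, gesture_present: bool, end_instruction: bool = False) -> bool:
        """Process one frame; return True if the end condition is met."""
        if gesture_present:
            # Assumption: a new gesture restarts the count.
            self.missing_frames = 0
        else:
            self.missing_frames += 1
        return end_instruction or self.missing_frames > self.preset_number
```

When `update` returns `True`, the device would display the written characters and could start the voice broadcast.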
Example 10:
fig. 3 is a structural diagram of a character recognition device according to an embodiment of the present invention, applied to a character recognition device including at least two cameras, where the character recognition device includes:
the target gesture detection module 31 is configured to detect whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera;
the coordinate information determining module 32 is configured to determine and store coordinate information of the target gesture if the detection result of the target gesture detecting module is yes;
the character determining module 33 is configured to determine a target character corresponding to the character to be output according to each stored coordinate information corresponding to the character to be output;
and the display module 34 is used for displaying the determined target characters.
Further, the target gesture detection module 31 is specifically configured to determine an overlapping area in two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the coordinate information determining module 32 is specifically configured to determine the first three-dimensional coordinate of the target gesture according to the overlap area when the detection result of the target gesture detecting module is yes.
Further, the display module 34 is further configured to map the first three-dimensional coordinate into a first two-dimensional coordinate of a writing display area of a display screen according to a mapping relationship between a pre-stored overlapping area and the writing display area of the display screen after determining the first three-dimensional coordinate of the target gesture, and display the first two-dimensional coordinate in the writing display area according to the first two-dimensional coordinate.
Further, the apparatus further comprises:
the first judging module 35 is configured to, after the coordinate information determining module 32 determines the first three-dimensional coordinate of the target gesture, judge whether the current stroke is completely written according to the first three-dimensional coordinate;
if the judgment result of the first judgment module 35 is yes, the text determination module 33 is triggered.
Further, the apparatus further comprises:
a second determining module 36, configured to determine, when the detection result of the target gesture detecting module 31 is negative, whether a second three-dimensional coordinate of the target gesture is stored in a previous frame of image of the current frame of image; if the judgment result of the second judgment module 36 is yes, the text determination module 33 is triggered.
Further, the text determining module 33 is specifically configured to determine, according to each stored three-dimensional coordinate corresponding to a text to be output, a first text corresponding to the text to be output;
and match the first character with a character library stored in advance, and determine the target character corresponding to the character to be output according to the matching result.
Further, the text determining module 33 is specifically configured to determine a second text with the highest matching degree as a target text corresponding to the text to be output; or displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
Further, the character determining module 33 is specifically configured to determine, according to each stroke corresponding to the determined character to be output, a first character corresponding to the character to be output, where each stroke corresponding to the character to be output is determined according to each three-dimensional coordinate corresponding to the stroke after it is determined that the stroke is completely written.
Further, the text determining module 33 is specifically configured to determine a current writing speed according to a first distance between the first three-dimensional coordinate and a second three-dimensional coordinate stored for a previous frame of image of the current frame of image;
judging whether the current writing speed is within a stroke writing speed range which is stored in advance;
if so, taking the first three-dimensional coordinate as an effective three-dimensional coordinate of the determined stroke;
if not, taking the first three-dimensional coordinate as an invalid three-dimensional coordinate of the determined stroke; and judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining the stroke, if so, taking the second three-dimensional coordinate as an ending three-dimensional coordinate of the current stroke, and determining that the current stroke is completely written.
Further, the text determining module 33 is further configured to, after the first three-dimensional coordinate is used as an effective three-dimensional coordinate of a determined stroke, determine whether the second three-dimensional coordinate is an effective three-dimensional coordinate of the determined stroke;
and if not, taking the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
Further, the character determining module 33 is further configured to, after determining each stroke corresponding to the character to be output, determine whether the time consumption of the stroke is less than a preset first duration threshold value for each stroke before determining the first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output; if so, the stroke is filtered out.
Further, the apparatus further comprises:
and the deleting module 37 is configured to delete each piece of coordinate information corresponding to the character to be output after the execution of the character determining module 33 is completed.
Further, the text determining module 33 is further configured to update the pre-saved number of frames of images in which no target gesture is present in the overlapping area, if the second three-dimensional coordinate of the target gesture is not saved for the previous frame of image of the current frame of image.
Further, the apparatus further comprises:
the voice broadcast module 38 is configured to broadcast, by voice, at least one written character when it is recognized that the character writing end condition is satisfied.
Further, the voice broadcast module 38 is specifically configured to recognize that the character writing end condition is satisfied when a character writing end instruction is received, or when the pre-saved number of frames of images in which no target gesture is present in the overlapping area is greater than a preset number.
Example 11:
Fig. 4 shows an electronic device provided in an embodiment of the present invention, including a processor 41, a communication interface 42, a memory 43 and a communication bus 44, wherein the processor 41, the communication interface 42 and the memory 43 communicate with one another through the communication bus 44;
the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the steps of:
detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture;
and determining and outputting the target characters corresponding to the characters to be output according to each coordinate information corresponding to the stored characters to be output.
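The two steps above can be sketched as a simple per-frame loop. This is only an outline under stated assumptions: `recognize_stream`, `detect_gesture` and `recognize` are hypothetical names, the detector and recognizer are passed in as callables because the text does not fix their implementation, and a character is treated as finished as soon as the gesture disappears while coordinates are saved.

```python
def recognize_stream(frames, detect_gesture, recognize):
    """Collect gesture coordinates frame by frame and emit recognized characters.

    detect_gesture(frame) returns a coordinate, or None if no target gesture is found.
    recognize(coords) turns the saved coordinates of one character into output.
    """
    saved = []      # coordinate information of the character to be output
    outputs = []
    for frame in frames:
        coord = detect_gesture(frame)
        if coord is not None:
            saved.append(coord)             # gesture present: determine and store
        elif saved:
            outputs.append(recognize(saved))
            saved = []                      # saved coordinates are discarded afterwards
    if saved:                               # flush whatever remains at end of stream
        outputs.append(recognize(saved))
    return outputs
```

Passing the detector and recognizer as parameters keeps the loop independent of any particular gesture detector or character matcher.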
Further, the detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera includes:
determining an overlapping area in the two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the determining coordinate information of the target gesture comprises:
according to the overlapping area, determining a first three-dimensional coordinate of the target gesture.
Further, after the first three-dimensional coordinate of the target gesture is determined, the first three-dimensional coordinate is mapped into the first two-dimensional coordinate of the writing display area according to a mapping relation between a pre-stored overlapping area and the writing display area of the display screen, and the first two-dimensional coordinate is displayed in the writing display area.
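One possible form of the mapping relation is a linear scaling of the x and y components from the overlapping area to the writing display area, ignoring depth. The sketch below assumes exactly that; the function name `map_to_display` and the box layout `(x_min, y_min, x_max, y_max)` are illustrative choices, not taken from the source.

```python
def map_to_display(point3d, overlap_box, display_box):
    """Linearly map the x/y of a 3D point from the overlap area to display pixels.

    overlap_box and display_box are (x_min, y_min, x_max, y_max) tuples (assumed layout).
    The depth component is dropped, which is an assumption of this sketch.
    """
    x, y, _z = point3d
    ox0, oy0, ox1, oy1 = overlap_box
    dx0, dy0, dx1, dy1 = display_box
    u = dx0 + (x - ox0) / (ox1 - ox0) * (dx1 - dx0)
    v = dy0 + (y - oy0) / (oy1 - oy0) * (dy1 - dy0)
    return round(u), round(v)
```

With an overlap area of `(-0.2, -0.1, 0.2, 0.1)` and a display area of `(0, 0, 800, 400)`, the centre of the overlap area lands at the centre of the writing display area.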
Further, after the first three-dimensional coordinate of the target gesture is determined, before the target character corresponding to the character to be output is determined according to each piece of coordinate information corresponding to the stored character to be output, the method further includes:
judging whether the current stroke is completely written or not according to the first three-dimensional coordinate;
if yes, the subsequent steps are carried out.
Further, if there is no preset target gesture for writing in the overlap area, before determining the target text corresponding to the text to be output according to each piece of coordinate information corresponding to the stored text to be output, the method further includes:
judging whether a second three-dimensional coordinate of the target gesture is stored for a previous frame of image of the current frame of image; if yes, the subsequent steps are carried out.
Further, the determining, according to each stored three-dimensional coordinate corresponding to the text to be output, the target text corresponding to the text to be output includes:
determining a first character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output;
and matching the first character with a character library stored in advance, and determining a target character corresponding to the character to be output according to a matching result.
Further, the determining, according to the matching result, the target text corresponding to the text to be output includes:
determining the second character with the highest matching degree as the target character corresponding to the character to be output; or
displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
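Both options can be served by one ranking function, as in the sketch below. The matching degree used here (the inverse of the mean point-to-point distance between equally sampled point sequences) is an assumption made for illustration; the source does not define how the matching degree is computed, and `match_candidates` is a hypothetical name.

```python
import math


def match_candidates(first_char, library, top_n=3):
    """Rank library characters by matching degree against the written character.

    first_char: list of (x, y) points; library: dict mapping a character to a
    template point list of the same length (assumed representation).
    """
    def degree(template):
        mean_dist = sum(math.dist(p, q) for p, q in zip(first_char, template)) / len(template)
        return 1.0 / (1.0 + mean_dist)   # higher means a closer match

    ranked = sorted(library, key=lambda ch: degree(library[ch]), reverse=True)
    return ranked[:top_n]
```

Taking the first element of the result corresponds to the first option; showing the whole list and letting the user pick corresponds to the second.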
Further, the determining, according to each stored three-dimensional coordinate corresponding to the text to be output, the first text corresponding to the text to be output includes:
and determining a first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, wherein each stroke corresponding to the character to be output is determined according to each three-dimensional coordinate corresponding to the stroke after the stroke is determined to be completely written.
Further, according to the first three-dimensional coordinate, the process of determining whether the current stroke is completely written includes:
determining the current writing speed according to the first distance between the first three-dimensional coordinate and a second three-dimensional coordinate saved aiming at the previous frame of image of the current frame of image;
judging whether the current writing speed is within a stroke writing speed range which is stored in advance;
if so, taking the first three-dimensional coordinate as an effective three-dimensional coordinate of the determined stroke;
if not, taking the first three-dimensional coordinate as an invalid three-dimensional coordinate of the determined stroke; and judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining the stroke, if so, taking the second three-dimensional coordinate as an ending three-dimensional coordinate of the current stroke, and determining that the current stroke is completely written.
Further, after the first three-dimensional coordinate is used as an effective three-dimensional coordinate for determining strokes, judging whether the second three-dimensional coordinate is the effective three-dimensional coordinate for determining the strokes;
and if not, taking the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
Further, after determining each stroke corresponding to the character to be output, before determining the first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, determining whether the time consumption of the stroke is less than a preset first time-length threshold value or not for each stroke;
if so, the stroke is filtered out.
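The duration filter can be sketched as below, assuming each stroke is stored as a non-empty list of timestamped samples. The threshold value and the names `MIN_STROKE_SECONDS` and `filter_short_strokes` are placeholders; the source only refers to a preset first duration threshold.

```python
MIN_STROKE_SECONDS = 0.1  # hypothetical value for the preset first duration threshold


def filter_short_strokes(strokes, min_duration=MIN_STROKE_SECONDS):
    """Drop strokes whose writing time is below the threshold.

    Each stroke is a non-empty list of (timestamp_seconds, (x, y, z)) samples.
    """
    kept = []
    for stroke in strokes:
        duration = stroke[-1][0] - stroke[0][0]   # time from first to last sample
        if duration >= min_duration:
            kept.append(stroke)
    return kept
```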
Further, after the target character corresponding to the character to be output is determined according to each coordinate information corresponding to the stored character to be output, each coordinate information corresponding to the character to be output is deleted.
Further, if the second three-dimensional coordinate of the target gesture is not stored for the previous frame of image of the current frame of image, updating the number of frames of images which are stored in advance and have no target gesture in the overlapping area.
Further, when it is recognized that the character writing end condition is satisfied, at least one written character is broadcast by voice.
Further, recognizing that the character writing end condition is satisfied includes:
recognizing that a character writing end instruction has been received; or
recognizing that the pre-saved number of frames of images in which no target gesture is present in the overlapping area is greater than a preset number.
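The frame counting and the end condition can be sketched together as follows. The class name `EndOfWritingDetector`, the default limit, and the decision to reset the counter whenever the gesture reappears are assumptions of this sketch; the source states only when the count is updated and when the condition is met.

```python
class EndOfWritingDetector:
    def __init__(self, max_idle_frames=45):
        self.max_idle_frames = max_idle_frames   # the "preset number" (hypothetical value)
        self.idle_frames = 0                     # frames with no target gesture

    def observe(self, gesture_present, prev_frame_had_coordinate):
        """Update the idle-frame count for the current frame."""
        if gesture_present:
            self.idle_frames = 0                 # assumption: writing resumed, reset count
        elif not prev_frame_had_coordinate:
            self.idle_frames += 1                # no gesture now and none saved last frame

    def finished(self, end_instruction=False):
        """True if an end instruction arrived or the idle count exceeds the limit."""
        return end_instruction or self.idle_frames > self.max_idle_frames
```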
The communication bus mentioned in the electronic device in each of the above embodiments may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
And the communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a central processing unit, a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
Example 10:
an embodiment of the present invention provides a computer-readable storage medium storing a computer program executable by an electronic device, and when the program runs on the electronic device, the program causes the electronic device to execute the following steps:
detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture;
and determining and outputting the target characters corresponding to the characters to be output according to each coordinate information corresponding to the stored characters to be output.
Further, the detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera includes:
determining an overlapping area in the two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the determining coordinate information of the target gesture comprises:
according to the overlapping area, determining a first three-dimensional coordinate of the target gesture.
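One common way to obtain a three-dimensional coordinate from two cameras is triangulation from disparity. The sketch below assumes a rectified, horizontally aligned stereo pair with a shared pinhole model; this is a swapped-in textbook technique for illustration, not a method stated in the source, and `triangulate` and all of its parameters are hypothetical.

```python
def triangulate(left_px, right_px, focal_px, baseline_m, principal_point):
    """Recover (x, y, z) in metres from matching pixels in a rectified stereo pair.

    left_px / right_px: (u, v) pixel position of the gesture in each image.
    focal_px: focal length in pixels; baseline_m: distance between the cameras.
    """
    xl, yl = left_px
    xr, _ = right_px
    cx, cy = principal_point
    disparity = xl - xr
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (xl - cx) * z / focal_px
    y = (yl - cy) * z / focal_px
    return x, y, z
```

Only points visible to both cameras have a disparity, which is consistent with restricting detection to the overlapping area.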
Further, after the first three-dimensional coordinate of the target gesture is determined, the first three-dimensional coordinate is mapped into the first two-dimensional coordinate of the writing display area according to a mapping relation between a pre-stored overlapping area and the writing display area of the display screen, and the first two-dimensional coordinate is displayed in the writing display area.
Further, after the first three-dimensional coordinate of the target gesture is determined, before the target character corresponding to the character to be output is determined according to each piece of coordinate information corresponding to the stored character to be output, the method further includes:
judging whether the current stroke is completely written or not according to the first three-dimensional coordinate;
if yes, the subsequent steps are carried out.
Further, if there is no preset target gesture for writing in the overlap area, before determining the target text corresponding to the text to be output according to each piece of coordinate information corresponding to the stored text to be output, the method further includes:
judging whether a second three-dimensional coordinate of the target gesture is stored for a previous frame of image of the current frame of image; if yes, the subsequent steps are carried out.
Further, the determining, according to each stored three-dimensional coordinate corresponding to the text to be output, the target text corresponding to the text to be output includes:
determining a first character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output;
and matching the first character with a character library stored in advance, and determining a target character corresponding to the character to be output according to a matching result.
Further, the determining, according to the matching result, the target text corresponding to the text to be output includes:
determining the second character with the highest matching degree as the target character corresponding to the character to be output; or
displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
Further, the determining, according to each stored three-dimensional coordinate corresponding to the text to be output, the first text corresponding to the text to be output includes:
and determining a first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, wherein each stroke corresponding to the character to be output is determined according to each three-dimensional coordinate corresponding to the stroke after the stroke is determined to be completely written.
Further, according to the first three-dimensional coordinate, the process of determining whether the current stroke is completely written includes:
determining the current writing speed according to the first distance between the first three-dimensional coordinate and a second three-dimensional coordinate saved aiming at the previous frame of image of the current frame of image;
judging whether the current writing speed is within a stroke writing speed range which is stored in advance;
if so, taking the first three-dimensional coordinate as an effective three-dimensional coordinate of the determined stroke;
if not, taking the first three-dimensional coordinate as an invalid three-dimensional coordinate of the determined stroke; and judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining the stroke, if so, taking the second three-dimensional coordinate as an ending three-dimensional coordinate of the current stroke, and determining that the current stroke is completely written.
Further, after the first three-dimensional coordinate is used as an effective three-dimensional coordinate for determining strokes, judging whether the second three-dimensional coordinate is the effective three-dimensional coordinate for determining the strokes;
and if not, taking the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
Further, after determining each stroke corresponding to the character to be output, before determining the first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, determining whether the time consumption of the stroke is less than a preset first time-length threshold value or not for each stroke;
if so, the stroke is filtered out.
Further, after the target character corresponding to the character to be output is determined according to each coordinate information corresponding to the stored character to be output, each coordinate information corresponding to the character to be output is deleted.
Further, if the second three-dimensional coordinate of the target gesture is not stored for the previous frame of image of the current frame of image, updating the number of frames of images which are stored in advance and have no target gesture in the overlapping area.
Further, when it is recognized that the character writing end condition is satisfied, at least one written character is broadcast by voice.
Further, recognizing that the character writing end condition is satisfied includes:
recognizing that a character writing end instruction has been received; or
recognizing that the pre-saved number of frames of images in which no target gesture is present in the overlapping area is greater than a preset number.
The computer readable storage medium in the above embodiments may be any available medium or data storage device that can be accessed by a processor in an electronic device, including but not limited to magnetic memory such as floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc., optical memory such as CDs, DVDs, BDs, HVDs, etc., and semiconductor memory such as ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs), etc.
The invention discloses a character recognition method, a character recognition device, electronic equipment and a storage medium, wherein the method comprises the following steps: detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture; and determining and outputting the target characters corresponding to the characters to be output according to each coordinate information corresponding to the stored characters to be output. When the user writes in the air, the electronic equipment determines the characters written by the user according to each three-dimensional coordinate of the target gesture during writing, so that characters can be input quickly and accurately.
Since the system/apparatus embodiments are substantially similar to the method embodiments, they are described only briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.
It is to be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or operation from another entity or operation without necessarily requiring or implying any actual such relationship or order between such entities or operations.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (24)

1. A character recognition method, applied to an electronic device, the method comprising the following steps:
detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by a camera; if so, determining and storing the coordinate information of the target gesture;
determining and outputting target characters corresponding to the characters to be output according to each coordinate information corresponding to the stored characters to be output;
the method for detecting whether the preset target gesture for writing exists in the current frame image in the video stream acquired by the camera comprises the following steps:
determining an overlapping area in the two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the determining coordinate information of the target gesture comprises:
determining a first three-dimensional coordinate of the target gesture according to the overlapping area;
if the preset target gesture for writing does not exist in the overlapping area, before determining the target character corresponding to the character to be output according to each piece of coordinate information corresponding to the stored character to be output, the method further comprises the following steps:
judging whether a second three-dimensional coordinate of the target gesture is stored for a previous frame of image of the current frame of image; if yes, carrying out the subsequent steps;
after the first three-dimensional coordinate of the target gesture is determined, before the target character corresponding to the character to be output is determined according to each piece of coordinate information corresponding to the stored character to be output, the method further comprises the following steps:
judging whether the current stroke is completely written or not according to the first three-dimensional coordinate;
if yes, carrying out the subsequent steps;
wherein, according to the first three-dimensional coordinate, the process of determining whether the current stroke is completely written comprises the following steps:
determining the current writing speed according to the first distance between the first three-dimensional coordinate and a second three-dimensional coordinate saved aiming at the previous frame of image of the current frame of image;
judging whether the current writing speed is within a stroke writing speed range which is stored in advance;
if so, taking the first three-dimensional coordinate as an effective three-dimensional coordinate of the determined stroke;
if not, taking the first three-dimensional coordinate as an invalid three-dimensional coordinate of the determined stroke; and judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining the stroke, if so, taking the second three-dimensional coordinate as an ending three-dimensional coordinate of the current stroke, and determining that the current stroke is completely written.
2. The method of claim 1, wherein after determining the first three-dimensional coordinates of the target gesture, the method further comprises:
and mapping the first three-dimensional coordinate into a first two-dimensional coordinate of a writing display area according to a mapping relation between a pre-stored overlapping area and the writing display area of the display screen, and displaying in the writing display area according to the first two-dimensional coordinate.
3. The method of claim 1, wherein determining the target text corresponding to the text to be output according to each stored three-dimensional coordinate corresponding to the text to be output comprises:
determining a first character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output;
and matching the first character with a character library stored in advance, and determining a target character corresponding to the character to be output according to a matching result.
4. The method of claim 3, wherein the determining the target text corresponding to the text to be output according to the matching result comprises:
determining the second character with the highest matching degree as the target character corresponding to the character to be output; or
displaying a preset number of second characters according to the matching degree from high to low; and determining the second character selected by the user as the target character corresponding to the character to be output.
5. The method of claim 3, wherein the determining the first character corresponding to the character to be output according to each stored three-dimensional coordinate corresponding to the character to be output comprises:
and determining a first character corresponding to the character to be output according to each stroke corresponding to the determined character to be output, wherein each stroke corresponding to the character to be output is determined according to each three-dimensional coordinate corresponding to the stroke after the stroke is determined to be completely written.
6. The method of claim 1, wherein after taking the first three-dimensional coordinates as valid three-dimensional coordinates for determining strokes, the method further comprises:
judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining strokes;
and if not, taking the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
7. The method of claim 5, wherein after determining each stroke of the text to be output, before determining the first text of the text to be output based on each stroke of the text to be output, the method further comprises:
for each stroke, determining whether the time consumption of the stroke is less than a preset first time length threshold value;
if so, the stroke is filtered out.
8. The method of claim 1, wherein after determining the target text corresponding to the text to be output according to each coordinate information corresponding to the saved text to be output, the method further comprises:
and deleting each coordinate information corresponding to the characters to be output.
9. The method of claim 1, wherein if the second three-dimensional coordinates of the target gesture are not saved for a previous frame image of a current frame image, the method further comprises:
and updating the frame number of the image which is saved in advance and has no target gesture in the overlapped area.
10. The method of claim 9, wherein the method further comprises:
when it is recognized that the character writing end condition is satisfied, broadcasting at least one written character by voice.
11. The method of claim 10, wherein recognizing that the character writing end condition is satisfied comprises:
recognizing that a character writing end instruction has been received; or
recognizing that the pre-saved number of frames of images in which no target gesture is present in the overlapping area is greater than a preset number.
12. A character recognition apparatus, comprising:
the target gesture detection module is used for detecting whether a preset target gesture for writing exists in a current frame image in a video stream acquired by the camera;
the coordinate information determining module is used for determining and storing the coordinate information of the target gesture when the detection result of the target gesture detecting module is positive;
the character determining module is used for determining target characters corresponding to the characters to be output according to each piece of coordinate information corresponding to the stored characters to be output;
the display module is used for displaying the determined target characters;
the target gesture detection module is specifically used for determining an overlapping area in two current frame images according to the current frame images in the video stream respectively acquired by the two cameras; detecting whether a preset target gesture for writing exists in the overlapping area;
the coordinate information determining module is specifically configured to determine a first three-dimensional coordinate of the target gesture according to the overlap area when the detection result of the target gesture detecting module is yes;
the second judging module is used for judging whether a second three-dimensional coordinate of the target gesture is stored for a previous frame image of a current frame image or not when the detection result of the target gesture detecting module is negative; if the judgment result of the second judgment module is yes, triggering the character determination module;
wherein the apparatus further comprises:
the first judgment module is used for judging whether the current stroke is completely written according to the first three-dimensional coordinate after the coordinate information determination module determines the first three-dimensional coordinate of the target gesture;
if the judgment result of the first judgment module is yes, triggering the character determination module;
the character determining module is specifically configured to determine a current writing speed according to a first distance between the first three-dimensional coordinate and a second three-dimensional coordinate stored for a previous frame of image of the current frame of image;
judging whether the current writing speed is within a stroke writing speed range which is stored in advance;
if so, taking the first three-dimensional coordinate as an effective three-dimensional coordinate of the determined stroke;
if not, taking the first three-dimensional coordinate as an invalid three-dimensional coordinate of the determined stroke; and judging whether the second three-dimensional coordinate is an effective three-dimensional coordinate for determining the stroke, if so, taking the second three-dimensional coordinate as an ending three-dimensional coordinate of the current stroke, and determining that the current stroke is completely written.
13. The apparatus of claim 12, wherein the display module is further configured to map the first three-dimensional coordinate into a first two-dimensional coordinate of a writing display area of a display screen according to a pre-stored mapping relationship between an overlapping area and the writing display area after determining the first three-dimensional coordinate of the target gesture, and to display the first two-dimensional coordinate in the writing display area.
14. The apparatus according to claim 12, wherein the character determination module is specifically configured to determine, according to each stored three-dimensional coordinate corresponding to the character to be output, a first character corresponding to the character to be output;
and to match the first character against a pre-stored character library, and determine a target character corresponding to the character to be output according to the matching result.
15. The apparatus according to claim 14, wherein the character determination module is specifically configured to determine the second character with the highest matching degree as the target character corresponding to the character to be output; or to display a preset number of second characters in descending order of matching degree, and determine the second character selected by the user as the target character corresponding to the character to be output.
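The two selection strategies of claims 14 and 15 can be sketched as follows. The claims do not say how the matching degree is computed, so this example swaps in a toy measure: each character is encoded as a hypothetical stroke-code string (`H` for horizontal, `V` for vertical) and compared with `difflib.SequenceMatcher`. The library contents and the function names are assumptions for illustration only.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# Hypothetical pre-stored character library: character -> stroke-code template.
LIBRARY: Dict[str, str] = {"十": "HV", "土": "HVH", "王": "HHVH"}


def rank_candidates(first_character: str,
                    library: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score every library entry and sort by matching degree, highest first."""
    scored = [
        (label, SequenceMatcher(None, first_character, template).ratio())
        for label, template in library.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def best_match(first_character: str, library: Dict[str, str]) -> str:
    """First strategy: take the entry with the highest matching degree."""
    return rank_candidates(first_character, library)[0][0]


def top_candidates(first_character: str, library: Dict[str, str],
                   preset_number: int) -> List[str]:
    """Second strategy: list a preset number of candidates for the user to choose from."""
    return [label for label, _ in rank_candidates(first_character, library)[:preset_number]]
```

With the toy library above, the stroke sequence `"HVH"` matches 土 exactly, so `best_match` returns it directly, while `top_candidates(..., 2)` would offer 土 and 王 for the user to pick from.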
16. The apparatus according to claim 14, wherein the character determination module is specifically configured to determine, according to each determined stroke corresponding to the character to be output, the first character corresponding to the character to be output, wherein each stroke corresponding to the character to be output is determined, after the stroke is determined to be completely written, according to each three-dimensional coordinate corresponding to that stroke.
17. The apparatus of claim 12, wherein the character determination module is further configured to judge, after the first three-dimensional coordinate is taken as a valid three-dimensional coordinate for determining a stroke, whether the second three-dimensional coordinate is a valid three-dimensional coordinate for determining a stroke;
and if not, to take the first three-dimensional coordinate as the starting three-dimensional coordinate of the current stroke.
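Taken together, the stroke-start rule of claim 17 and the stroke-end rule of claim 12 amount to segmenting a sequence of per-frame coordinates by their validity flags. The sketch below shows one way to express that. The function name `segment_strokes` is hypothetical, and flushing an unfinished stroke at the end of the input is an assumption the claims do not state.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def segment_strokes(points: Sequence[Point3D],
                    valid_flags: Sequence[bool]) -> List[List[Point3D]]:
    """Group consecutive valid coordinates into strokes."""
    strokes: List[List[Point3D]] = []
    current: List[Point3D] = []
    previous_valid = False
    for point, valid in zip(points, valid_flags):
        if valid and not previous_valid:
            current = [point]            # valid point after an invalid one: stroke start
        elif valid:
            current.append(point)        # stroke continues
        elif previous_valid:
            strokes.append(current)      # invalid point after a valid one: previous point ended the stroke
            current = []
        previous_valid = valid
    if previous_valid and current:
        strokes.append(current)          # assumption: close a stroke still open at end of input
    return strokes
```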
18. The apparatus of claim 16, wherein the character determination module is further configured to, after determining each stroke corresponding to the character to be output and before determining the first character corresponding to the character to be output according to those strokes, judge for each stroke whether the duration of the stroke is less than a preset first duration threshold; and if so, filter out the stroke.
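The duration filter of claim 18 is simple enough to show directly. In this sketch the `Stroke` record, its timestamp fields and the function name `filter_short_strokes` are hypothetical; only the comparison against a preset first duration threshold comes from the claim.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stroke:
    points: List[Tuple[float, float, float]]
    start_s: float   # timestamp of the starting coordinate, in seconds
    end_s: float     # timestamp of the ending coordinate, in seconds

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def filter_short_strokes(strokes: List[Stroke],
                         first_duration_threshold_s: float) -> List[Stroke]:
    """Drop strokes whose duration is below the threshold (likely accidental movements)."""
    return [s for s in strokes if s.duration_s >= first_duration_threshold_s]
```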
19. The apparatus of claim 12, wherein the apparatus further comprises:
a deleting module, configured to delete each piece of coordinate information corresponding to the character to be output after the character determination module finishes executing.
20. The apparatus of claim 12, wherein the character determination module is further configured to update a pre-stored count of image frames in which the target gesture is absent from the overlapping area if no second three-dimensional coordinate of the target gesture has been stored for the frame image preceding the current frame image.
21. The apparatus of claim 20, wherein the apparatus further comprises:
a voice broadcast module, configured to broadcast by voice at least one written character upon recognizing that a writing end condition is met.
22. The apparatus of claim 21, wherein the voice broadcast module is specifically configured to recognize that the writing end condition is met when a character writing end instruction is received, or when the pre-stored count of image frames in which the target gesture is absent from the overlapping area is greater than a preset number.
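The writing end condition of claims 20 to 22 can be sketched as a small counter. The class name `WritingEndDetector` and its parameters are hypothetical, and resetting the counter whenever the gesture reappears is an assumption; the claims only describe updating the count and comparing it with a preset number.

```python
class WritingEndDetector:
    """Tracks consecutive frames without the target gesture in the overlapping area."""

    def __init__(self, preset_number: int) -> None:
        self.preset_number = preset_number
        self.missing_frames = 0

    def on_frame(self, gesture_detected: bool,
                 previous_coordinate_stored: bool,
                 end_instruction: bool = False) -> bool:
        """Process one frame and return True when the writing end condition is met."""
        if gesture_detected:
            self.missing_frames = 0          # assumption: the count restarts when writing resumes
        elif not previous_coordinate_stored:
            self.missing_frames += 1         # gesture absent in this frame and the previous one
        # Ends on an explicit instruction or when the absence count exceeds the preset number.
        return end_instruction or self.missing_frames > self.preset_number
```

A frame in which the gesture has just disappeared (the previous frame still has a stored coordinate) does not increment the count here, since claim 12 routes that case to character determination instead.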
23. An electronic device, comprising: a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another via the communication bus;
the memory has stored therein a computer program which, when executed by the processor, causes the processor to carry out the steps of the method of any one of claims 1-11.
24. A computer-readable storage medium, characterized in that it stores a computer program executable by an electronic device, which program, when run on the electronic device, causes the electronic device to carry out the steps of the method according to any one of claims 1-11.
CN201810563940.XA 2018-06-04 2018-06-04 Character recognition method and device, electronic equipment and storage medium Active CN108846339B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810563940.XA CN108846339B (en) 2018-06-04 2018-06-04 Character recognition method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810563940.XA CN108846339B (en) 2018-06-04 2018-06-04 Character recognition method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108846339A (en) 2018-11-20
CN108846339B (en) 2020-11-27

Family

ID=64210683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810563940.XA Active CN108846339B (en) 2018-06-04 2018-06-04 Character recognition method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN108846339B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059678A (en) * 2019-04-17 2019-07-26 上海肇观电子科技有限公司 A kind of detection method, device and computer readable storage medium
CN111031232B (en) * 2019-04-24 2022-01-28 广东小天才科技有限公司 Dictation real-time detection method and electronic equipment
CN111081103B (en) * 2019-05-17 2022-03-01 广东小天才科技有限公司 Dictation answer obtaining method, family education equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104793724A (en) * 2014-01-16 2015-07-22 北京三星通信技术研究有限公司 Sky-writing processing method and device
CN106484108A (en) * 2016-09-30 2017-03-08 天津大学 Chinese character recognition method based on dual-viewpoint gesture recognition
CN107728916A (en) * 2017-09-20 2018-02-23 科大讯飞股份有限公司 Method and device for displaying in-air handwriting tracks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7372993B2 (en) * 2004-07-21 2008-05-13 Hewlett-Packard Development Company, L.P. Gesture recognition

Also Published As

Publication number Publication date
CN108846339A (en) 2018-11-20

Similar Documents

Publication Publication Date Title
CN108960163B (en) Gesture recognition method, device, equipment and storage medium
KR101825154B1 (en) Overlapped handwriting input method
KR101763891B1 (en) Method for region extraction, method for model training, and devices thereof
CN103065134B (en) A kind of fingerprint identification device and method with information
CN108038176B (en) Method and device for establishing passerby library, electronic equipment and medium
CN108846339B (en) Character recognition method and device, electronic equipment and storage medium
CN106650648B (en) Recognition method and system for erasing handwriting
KR20170061631A (en) Method and device for region identification
US8417026B2 (en) Gesture recognition methods and systems
KR20130128412A (en) Multi-character continuous handwriting input method
US20140354540A1 (en) Systems and methods for gesture recognition
CN110930419A (en) Image segmentation method and device, electronic equipment and computer storage medium
CN111291601B (en) Lane line identification method and device and electronic equipment
CN109829368A (en) Recognition methods, device, computer equipment and the storage medium of palm feature
CN113128520B (en) Image feature extraction method, target re-identification method, device and storage medium
CN103106388B (en) Method and system of image recognition
CN113867521B (en) Handwriting input method and device based on gesture visual recognition and electronic equipment
CN107861684A (en) Writing recognition method and device, storage medium and computer equipment
CN110858291A (en) Character segmentation method and device
CN110334576B (en) Hand tracking method and device
CN109977737A (en) A kind of character recognition Robust Method based on Recognition with Recurrent Neural Network
CN112163400A (en) Information processing method and device
CN110991438A (en) Font scoring method, system and storage medium thereof
CN116661618A (en) Touch mode switching method and device for touch screen
CN107169517B (en) Method for judging repeated strokes, terminal equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant