CN112686253A - Screen character extraction system and method for electronic whiteboard - Google Patents

Info

Publication number
CN112686253A
Authority
CN
China
Prior art keywords
text
character
area
electronic whiteboard
screen
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011598383.9A
Other languages
Chinese (zh)
Inventor
朱玉荣
汤鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Wenxiang Information Technology Co Ltd
Original Assignee
Anhui Wenxiang Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-12-29
Filing date
2020-12-29
Publication date
2021-04-20

Landscapes

  • Character Input (AREA)

Abstract

The invention discloses a screen character extraction system and method for an electronic whiteboard, and relates to the technical field of electronic whiteboard manufacturing. The system comprises a text box-selection module, a text detection module, a text feature extraction module and a recognition module. The text box-selection module frames the area of the electronic whiteboard screen from which characters are to be extracted and captures that area as a high-definition picture; the text detection module detects the specific position of the character extraction area within the high-definition picture; the text feature extraction module extracts features of the characters in the character extraction area; and the recognition module recognizes the target characters in the character picture after feature extraction. The invention frames a region on the electronic whiteboard screen, locates the text in the framed image, separates the characters from the background, performs edge detection and fine segmentation on the separated characters, and finally obtains the characters in the framed region.

Description

Screen character extraction system and method for electronic whiteboard
Technical Field
The invention belongs to the technical field of electronic whiteboard manufacturing, and particularly relates to a screen character extraction system and a screen character extraction method for an electronic whiteboard.
Background
The electronic whiteboard is a digital teaching and demonstration device that replaces the traditional blackboard and chalk. It can be operated entirely without a mouse or keyboard: editing, annotating, saving and similar operations on computer files are carried out directly on the whiteboard with a finger or a dedicated pen, which is very convenient for users.
When a teacher uses an electronic whiteboard to teach in a classroom, multimedia information consisting mainly of images, audio and video needs to be played. The text in an image or video reflects, to some extent, part of its important content, and usually forms a concise description or illustration of that content. During lessons, a teacher therefore needs to extract the characters in pictures or videos in order to teach students. Many character region extraction algorithms exist in practice, including edge-based methods, connected-domain-based methods and texture-based methods. However, the characters in images with complex backgrounds vary in size and in color, and the traditional methods suffer from limitations such as low efficiency, complex computation and low precision.
Disclosure of Invention
The invention aims to provide a screen character extraction system and a screen character extraction method for an electronic whiteboard.
In order to solve the technical problems, the invention is realized by the following technical scheme:
The invention relates to a screen character extraction system for an electronic whiteboard, which comprises a text box-selection module, a text detection module, a text feature extraction module and a recognition module; the text box-selection module is used for framing the area of the electronic whiteboard screen from which characters are to be extracted and capturing that area as a high-definition picture; the text detection module is used for detecting the specific position of the character extraction area in the high-definition picture; the text feature extraction module is used for extracting features of the characters in the character extraction area; and the recognition module is used for recognizing the target characters in the character picture after feature extraction.
Preferably, the text detection module is specifically configured to extract the attributes of the text region at the selected position; the attributes of the text region comprise confidence, scale and vertex coordinates; the confidence indicates the probability that the character extraction region contains a text line.
Preferably, the text detection module is used to set text feature thresholds in advance; the text feature thresholds cover font size, height, width, pixel gray value, pixel gradient value and character spacing.
The invention also relates to a screen character extraction method for an electronic whiteboard, which comprises the following steps:
step S1: drawing by hand, on the electronic whiteboard screen, a candidate area to be recognized;
step S2: taking a screenshot of the candidate area to obtain a high-definition image;
step S3: locating the text in the high-definition image;
step S4: separating the text from the background in the image by a binarization method;
step S5: performing edge detection and fine segmentation on the separated character image;
step S6: checking and screening the character regions to recognize the final characters.
Preferably, in step S1, a coordinate system is established with the lower left corner of the electronic whiteboard screen as the origin; any point on the screen is selected as the starting point of the capture, a finger slides across the electronic whiteboard, and the end-point coordinate is acquired when the finger leaves the whiteboard; the coordinates of the starting point and the end point together define the candidate area.
Preferably, in step S3, the text positioning is performed by connected component analysis and sliding window classification.
Preferably, in step S5, edge detection uses the Prewitt operator: the R, G and B components of each pixel in the image are first determined and then substituted into the Prewitt operator to obtain the Euclidean distance in color; the color image I(i, j) is converted into an edge gray image S(i, j), a threshold T is solved by an iterative method, and S(i, j) is binarized; when S(i, j) is smaller than T, pixel (i, j) is regarded as a background point and its value is set to 0; when S(i, j) is larger than T, pixel (i, j) is regarded as an edge point and its value is set to 1.
Preferably, in step S6, when the character image is finely segmented, the edge map is scanned line by line and the number of edge points E_i in each line i is accumulated; when E_i is greater than 0, the line is a character line, and when E_i equals 0, the line is a background line; a run of consecutive character lines is regarded as a character region, and a run of consecutive background lines is regarded as a background region.
Preferably, the height of a character region is not less than six pixels; a determined character region less than six pixels high is treated as noise.
The invention has the following beneficial effects:
(1) The invention frames a region on the electronic whiteboard screen, locates the text in the framed image, separates the characters from the background, performs edge detection and fine segmentation on the separated characters, and finally obtains the characters in the framed region.
(2) The invention sets text feature thresholds for the extracted characters, covering font size, height, width, pixel gray value, pixel gradient value and character spacing. The text content is checked against these thresholds during screening, so that recognized characters can be screened quickly, characters that do not meet the specification are eliminated, and recognition accuracy is improved.
Of course, it is not necessary for any product in which the invention is practiced to achieve all of the above-described advantages at the same time.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a screen character extraction system for an electronic whiteboard according to the present invention;
Fig. 2 is a step diagram of a screen character extraction method for an electronic whiteboard according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to Fig. 1, the present invention is a screen character extraction system for an electronic whiteboard, including a text box-selection module, a text detection module, a text feature extraction module and a recognition module; the text box-selection module is used for framing the area of the electronic whiteboard screen from which characters are to be extracted and capturing that area as a high-definition picture; the text detection module is used for detecting the specific position of the character extraction area in the high-definition picture; the text feature extraction module is used for extracting features of the characters in the character extraction area; and the recognition module is used for recognizing the target characters in the character picture after feature extraction.
The text detection module is specifically configured to extract the attributes of the character region at the selected position; the attributes of the character region comprise confidence, scale and vertex coordinates; the confidence indicates the probability that the character extraction region contains a text line.
A coordinate system is constructed with the lower left corner of the electronic whiteboard as the origin. The initial vertex is set to (x1, y1), and the position at which the finger is lifted after sliding is set to (x2, y2); the rectangular area spanned by these two points serves as the area to be recognized by the text box.
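The rectangle construction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Rect` and `candidate_region` are hypothetical, and the conversion from the lower-left origin used in the text to a top-left origin (common for screenshot cropping) is an assumption of this sketch.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Crop rectangle in top-left-origin pixel coordinates (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int


def candidate_region(start: Tuple[int, int], end: Tuple[int, int],
                     screen_height: int) -> Rect:
    """Rectangle spanned by the touch-down point and the lift-off point.

    Inputs use a lower-left origin with y pointing up, as in the text.
    The result uses a top-left origin with y pointing down; that
    conversion is an assumption made for this sketch.
    """
    (x1, y1), (x2, y2) = start, end
    # The finger may travel in any direction, so order the coordinates.
    left, right = min(x1, x2), max(x1, x2)
    low, high = min(y1, y2), max(y1, y2)
    return Rect(left, screen_height - high, right, screen_height - low)


# Drag from (100, 200) down and to the right, ending at (400, 50).
region = candidate_region((100, 200), (400, 50), screen_height=1080)
```

Ordering the coordinates first means the same rectangle results regardless of the drag direction.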
The text detection module is used to set text feature thresholds in advance; the text feature thresholds cover font size, height, width, pixel gray value, pixel gradient value and character spacing.
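One way such preset thresholds might be applied during screening is sketched below. The class names, field names and all numeric values are hypothetical placeholders chosen for illustration, except the minimum height of 6 pixels, which is the figure stated in this description; only a subset of the listed features is modelled.

```python
from dataclasses import dataclass


@dataclass
class TextFeatureThresholds:
    """Preset screening limits. All values are illustrative assumptions,
    apart from min_height = 6, which comes from the description."""
    min_height: int = 6
    max_height: int = 200
    min_width: int = 2
    max_width: int = 2000
    min_mean_gradient: float = 10.0
    max_char_spacing: int = 40


@dataclass
class Candidate:
    """Measured features of one candidate character region (hypothetical)."""
    height: int
    width: int
    mean_gradient: float
    char_spacing: int


def passes(c: Candidate, t: TextFeatureThresholds) -> bool:
    """True when every measured feature lies within the preset limits."""
    return (t.min_height <= c.height <= t.max_height
            and t.min_width <= c.width <= t.max_width
            and c.mean_gradient >= t.min_mean_gradient
            and c.char_spacing <= t.max_char_spacing)
```

Candidates that fail any single limit are rejected, which matches the stated goal of eliminating characters that do not meet the specification.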
Referring to Fig. 2, the present invention is a screen character extraction method for an electronic whiteboard, including the following steps:
step S1: drawing by hand, on the electronic whiteboard screen, a candidate area to be recognized;
step S2: taking a screenshot of the candidate area to obtain a high-definition image;
step S3: locating the text in the high-definition image;
step S4: separating the text from the background in the image by a binarization method;
step S5: performing edge detection and fine segmentation on the separated character image;
step S6: checking and screening the character regions to recognize the final characters.
In step S1, a coordinate system is established with the lower left corner of the electronic whiteboard screen as the origin; any point on the screen is selected as the starting point of the capture, a finger slides across the electronic whiteboard, and the end-point coordinate is acquired when the finger leaves the whiteboard; the coordinates of the starting point and the end point together define the candidate area.
In step S3, the text in the image is located by connected component analysis and sliding window classification.
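The connected component part of this step can be illustrated with a small labeling routine. This is a sketch under assumptions: the description does not specify the connectivity or the output format, so 4-connectivity and bounding boxes are choices made here, and the sliding window classification stage is not shown.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def connected_components(binary: List[List[int]]) -> List[Box]:
    """Bounding boxes of 4-connected foreground (value 1) components."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Each returned box is a candidate character or stroke group that a later classification stage could accept or reject.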
In step S5, edge detection uses the Prewitt operator: the R, G and B components of each pixel in the image are first determined and then substituted into the Prewitt operator to obtain the Euclidean distance; the color image I(i, j) is converted into an edge gray image S(i, j), a threshold T is solved by an iterative method, and S(i, j) is binarized; when S(i, j) is smaller than T, pixel (i, j) is regarded as a background point and its value is set to 0; when S(i, j) is larger than T, pixel (i, j) is regarded as an edge point and its value is set to 1; the edges are then thinned and denoised to obtain a clearer image.
The Prewitt operator is as follows:
[Formula image BDA0002870549580000061: the Prewitt operator; image not reproduced in this text]
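Since the formula image is not available in this text, the following sketch uses the standard textbook 3x3 Prewitt kernels, which may differ from the exact form in the original figure. It shows one plausible reading of the step: apply the kernels to each of the R, G and B channels, combine the responses as a Euclidean norm to get S(i, j), then pick T iteratively and binarize. The iterative rule used here (averaging the means of the two classes until T stabilizes) is an assumption, as the description only says "an iterative method"; all function names are hypothetical.

```python
import math
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)

# Standard Prewitt kernels (textbook form, assumed here).
PREWITT_X = ((-1, 0, 1), (-1, 0, 1), (-1, 0, 1))
PREWITT_Y = ((-1, -1, -1), (0, 0, 0), (1, 1, 1))


def color_edge_map(image: List[List[Pixel]]) -> List[List[float]]:
    """Edge strength S(i, j); border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    edge = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            total = 0.0
            for ch in range(3):  # R, G, B
                gx = gy = 0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        v = image[i + di][j + dj][ch]
                        gx += PREWITT_X[di + 1][dj + 1] * v
                        gy += PREWITT_Y[di + 1][dj + 1] * v
                total += gx * gx + gy * gy
            # Euclidean norm over both directions and all three channels.
            edge[i][j] = math.sqrt(total)
    return edge


def iterative_threshold(edge: List[List[float]], eps: float = 0.5) -> float:
    """Start at the global mean, then repeatedly average the two class means."""
    values = [v for row in edge for v in row]
    t = sum(values) / len(values)
    while True:
        low = [v for v in values if v < t]
        high = [v for v in values if v >= t]
        if not low or not high:
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


def binarize(edge: List[List[float]], t: float) -> List[List[int]]:
    """1 for edge points (S > T), 0 for background points."""
    return [[1 if v > t else 0 for v in row] for row in edge]
```

The description leaves the case S(i, j) = T unspecified; this sketch treats it as background.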
In step S6, when the character image is finely segmented, the edge map is scanned line by line and the number of edge points E_i in each line i is accumulated; when E_i is greater than 0, the line is a character line, and when E_i equals 0, the line is a background line; a run of consecutive character lines is regarded as a character region, and a run of consecutive background lines is regarded as a background region.
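The line scan above amounts to a horizontal projection of the binary edge map. A minimal sketch follows, where each scan line is one pixel row of the edge map; the function name `row_runs` and the inclusive (start, end) output format are choices made for this illustration.

```python
from typing import List, Tuple


def row_runs(edge_map: List[List[int]]) -> List[Tuple[int, int]]:
    """Inclusive (start_row, end_row) runs of consecutive rows with E_i > 0."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, row in enumerate(edge_map):
        e_i = sum(row)  # number of edge points in row i
        if e_i > 0 and start is None:
            start = i                      # a character region begins
        elif e_i == 0 and start is not None:
            runs.append((start, i - 1))    # a background row ends the region
            start = None
    if start is not None:
        runs.append((start, len(edge_map) - 1))
    return runs
```

Rows outside the returned runs are background rows in the sense of the description.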
The height of a character region is not less than six pixels; a determined character region less than six pixels high is treated as noise. Multiple experiments show that, for characters to be clearly visible, their height is generally not less than 6 pixels. Therefore, if the number of lines in a candidate character line region is less than 6, the region is treated as noise. The horizontal projection of the part corresponding to a character line shows a very pronounced peak.
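The six-pixel rule can be expressed as a simple filter over candidate line regions. The sketch assumes regions are given as inclusive (start_row, end_row) pairs; that representation and the function name are hypothetical.

```python
from typing import List, Tuple

MIN_TEXT_HEIGHT = 6  # minimum character height in pixels, as stated in the text


def drop_noise_runs(runs: List[Tuple[int, int]],
                    min_height: int = MIN_TEXT_HEIGHT) -> List[Tuple[int, int]]:
    """Keep only candidate regions at least min_height rows tall."""
    return [(s, e) for s, e in runs if e - s + 1 >= min_height]


# A 3-row region is discarded; 6-row and 11-row regions are kept.
kept = drop_noise_runs([(0, 2), (10, 15), (20, 30)])
```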
The divided character line region not only includes character edges but also includes certain noise edges, but the edge information arrangement of the character region is compact, and the background edge information arrangement is dispersed, so that the character region can be further divided by column division.
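Column division can be sketched as a vertical projection inside one character line region: columns containing edge points are grouped when they lie close together, and narrow isolated groups are dropped as dispersed background edges. The parameters `max_gap` and `min_width` and their default values are hypothetical; the description gives no numbers for this step.

```python
from typing import List, Tuple


def column_segments(band: List[List[int]], max_gap: int = 2,
                    min_width: int = 3) -> List[Tuple[int, int]]:
    """Inclusive (start_col, end_col) spans of compact edge columns.

    band is the binary edge map restricted to one character line region.
    Edge columns separated by at most max_gap empty columns are merged;
    spans narrower than min_width are discarded as noise.
    """
    cols = len(band[0])
    counts = [sum(row[j] for row in band) for j in range(cols)]
    segments: List[Tuple[int, int]] = []
    start = last = None
    for j, n in enumerate(counts):
        if n == 0:
            continue
        if start is None:
            start = last = j
        elif j - last - 1 <= max_gap:
            last = j                      # close enough: extend the span
        else:
            segments.append((start, last))
            start = last = j              # large gap: begin a new span
    if start is not None:
        segments.append((start, last))
    return [(s, e) for s, e in segments if e - s + 1 >= min_width]
```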
It should be noted that, in the above system embodiment, each included unit is only divided according to functional logic, but is not limited to the above division as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
In addition, it is understood by those skilled in the art that all or part of the steps in the method for implementing the embodiments described above may be implemented by a program instructing associated hardware, and the corresponding program may be stored in a computer-readable storage medium.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (9)

1. A screen character extraction system for an electronic whiteboard, characterized by comprising a text box-selection module, a text detection module, a text feature extraction module and a recognition module;
the text box-selection module is used for framing the area of the electronic whiteboard screen from which characters are to be extracted and capturing that area as a high-definition picture;
the text detection module is used for detecting the specific position of the character extraction area in the high-definition picture;
the text feature extraction module is used for extracting features of the characters in the character extraction area;
and the recognition module is used for recognizing the target characters in the character picture after feature extraction.
2. The screen character extraction system for an electronic whiteboard of claim 1, wherein the text detection module is specifically configured to extract the attributes of the text region at the selected position; the attributes of the text region comprise confidence, scale and vertex coordinates; the confidence indicates the probability that the character extraction region contains a text line.
3. The screen character extraction system for an electronic whiteboard of claim 1, wherein the text detection module is used to set text feature thresholds in advance; the text feature thresholds cover font size, height, width, pixel gray value, pixel gradient value and character spacing.
4. A screen character extraction method for an electronic whiteboard, characterized by comprising the following steps:
step S1: drawing by hand, on the electronic whiteboard screen, a candidate area to be recognized;
step S2: taking a screenshot of the candidate area to obtain a high-definition image;
step S3: locating the text in the high-definition image;
step S4: separating the text from the background in the image by a binarization method;
step S5: performing edge detection and fine segmentation on the separated character image;
step S6: checking and screening the character regions to recognize the final characters.
5. The screen character extraction method for an electronic whiteboard of claim 4, wherein in step S1, a coordinate system is established with the lower left corner of the electronic whiteboard screen as the origin; any point on the screen is selected as the starting point of the capture, a finger slides across the electronic whiteboard, and the end-point coordinate is acquired when the finger leaves the whiteboard; the coordinates of the starting point and the end point together define the candidate area.
6. The method for extracting screen characters from an electronic whiteboard of claim 4, wherein in the step S3, the text positioning is performed by connected component analysis and sliding window classification.
7. The screen character extraction method for an electronic whiteboard of claim 4, wherein in step S5, edge detection uses the Prewitt operator: the R, G and B components of each pixel in the image are first determined and then substituted into the Prewitt operator to obtain the Euclidean distance; the color image I(i, j) is converted into an edge gray image S(i, j), a threshold T is solved by an iterative method, and S(i, j) is binarized; when S(i, j) is smaller than T, pixel (i, j) is regarded as a background point and its value is set to 0; when S(i, j) is larger than T, pixel (i, j) is regarded as an edge point and its value is set to 1.
8. The screen character extraction method for an electronic whiteboard of claim 4, wherein in step S6, when the character image is finely segmented, the edge map is scanned line by line and the number of edge points E_i in each line i is accumulated; when E_i is greater than 0, the line is a character line; when E_i equals 0, the line is a background line; a run of consecutive character lines is regarded as a character region, and a run of consecutive background lines is regarded as a background region.
9. The screen character extraction method for an electronic whiteboard of claim 8, wherein the height of a character region is not less than six pixels, and a determined character region less than six pixels high is treated as noise.
CN202011598383.9A 2020-12-29 2020-12-29 Screen character extraction system and method for electronic whiteboard Withdrawn CN112686253A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011598383.9A CN112686253A (en) 2020-12-29 2020-12-29 Screen character extraction system and method for electronic whiteboard

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011598383.9A CN112686253A (en) 2020-12-29 2020-12-29 Screen character extraction system and method for electronic whiteboard

Publications (1)

Publication Number Publication Date
CN112686253A true CN112686253A (en) 2021-04-20

Family

ID=75454191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011598383.9A Withdrawn CN112686253A (en) 2020-12-29 2020-12-29 Screen character extraction system and method for electronic whiteboard

Country Status (1)

Country Link
CN (1) CN112686253A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898409A (en) * 2022-07-14 2022-08-12 深圳市海清视讯科技有限公司 Data processing method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898409A (en) * 2022-07-14 2022-08-12 深圳市海清视讯科技有限公司 Data processing method and device
CN114898409B (en) * 2022-07-14 2022-09-30 深圳市海清视讯科技有限公司 Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210420