CN110598217B - Click-to-read content identification method and device, home teaching machine and storage medium - Google Patents

Info

Publication number
CN110598217B
Authority
CN
China
Prior art keywords
click
page image
read
content
read page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910887010.4A
Other languages
Chinese (zh)
Other versions
CN110598217A (en)
Inventor
崔颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd
Priority to CN201910887010.4A
Publication of CN110598217A
Application granted
Publication of CN110598217B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/142 Image acquisition using hand-held instruments; Constructional details of the instruments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Character Input (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention belongs to the field of home teaching machines and discloses a method and a device for identifying click-to-read content, a home teaching machine and a storage medium. The method comprises the following steps: acquiring a click-to-read page image; identifying an occlusion region in the click-to-read page image; when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio, acquiring the content of the peripheral area of the occlusion region; completing the occlusion region in the click-to-read page image according to the content of the peripheral area; and acquiring the click-to-read content pointed to by the indicator in the completed click-to-read page image. When an occlusion region exists in the click-to-read page image, the occlusion region is completed by natural language processing technology, which improves the accuracy of click-to-read content recognition and solves the prior-art problem that occlusion causes low recognition accuracy or failed recognition.

Description

Click-to-read content identification method and device, home teaching machine and storage medium
Technical Field
The invention belongs to the technical field of home teaching machines, and particularly relates to a method and a device for identifying click-to-read content, a home teaching machine and a storage medium.
Background
In the course of learning and growing up, children need to read a large number of books, and to protect their eyesight, parents generally have them read paper books. Children often run into difficulties when reading paper books, such as words they do not recognize or cannot understand. When this happens, the child needs help from a parent, but parents are often busy with work and cannot always help in time, which reduces the child's interest in reading and hinders learning. The advent of home teaching machines has largely solved this problem.
The home teaching machine provides a click-to-read function. When this function is used to help a child read a book, an image of the page the user is pointing at is acquired first, the page image is then recognized, and finally the content pointed to by the indicator is identified in the page image. In actual use, a nonstandard pointing gesture may cause the user's fingers to cover text on the page, so that the content the user intends to point at cannot be determined and recognition accuracy is low.
Disclosure of Invention
The invention aims to provide a method and a device for identifying click-to-read content, a home teaching machine and a storage medium, so as to solve the problem of low click-to-read recognition accuracy caused by finger occlusion.
The technical scheme provided by the invention is as follows:
In one aspect, a method for identifying click-to-read content is provided, including:
acquiring a click-to-read page image;
identifying an occlusion region in the click-to-read page image;
when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio, acquiring the content of the peripheral area of the occlusion region;
completing the occlusion region in the click-to-read page image according to the content of the peripheral area;
and acquiring the click-to-read content pointed to by the indicator in the completed click-to-read page image.
Further preferably, identifying the occlusion region in the click-to-read page image specifically includes:
performing binarization processing on the click-to-read page image according to the color difference between the occluding object and the page in the click-to-read page image, to obtain a binarized click-to-read page image;
and identifying the occlusion region in the click-to-read page image according to the binarized click-to-read page image.
Further preferably, acquiring the content of the peripheral area of the occlusion region when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio specifically includes:
when the ratio of the occlusion region to the click-to-read page image is larger than the preset ratio, deleting the pixel points of the occlusion region from the click-to-read page image, and filling the blank area in each row of characters within the occlusion region with preset characters;
and acquiring the sentences containing the preset characters;
and completing the occlusion region in the click-to-read page image according to the content of the peripheral area specifically includes:
performing semantic analysis on the sentences through a natural language processing model to obtain semantic analysis results of the sentences;
and completing the occlusion region in the click-to-read page image according to the semantic analysis results of the sentences.
Further preferably, acquiring the click-to-read content pointed to by the indicator in the completed click-to-read page image specifically includes:
searching a target storage page matched with the completed click-to-read page image in a database;
identifying and positioning an indicator in the click-to-read page image;
and acquiring click-to-read content corresponding to the indicator from the target storage page according to the indicator.
In another aspect, a device for identifying click-to-read content is also provided, including:
an image acquisition module, used for acquiring a click-to-read page image;
an identification module, used for identifying an occlusion region in the click-to-read page image;
a content acquisition module, used for acquiring the content of the peripheral area of the occlusion region when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio;
and a complement module, used for completing the occlusion region in the click-to-read page image according to the content of the peripheral area;
wherein the content acquisition module is also used for acquiring the click-to-read content pointed to by the indicator in the completed click-to-read page image.
Further preferably, the identification module includes:
an image processing unit, used for performing binarization processing on the click-to-read page image according to the color difference between the occluding object and the page in the click-to-read page image, to obtain a binarized click-to-read page image;
and an identification unit, used for identifying the occlusion region in the click-to-read page image according to the binarized click-to-read page image.
Further preferably, the content acquisition module includes:
a filling unit, used for deleting the pixel points of the occlusion region from the click-to-read page image when the ratio of the occlusion region to the click-to-read page image is larger than the preset ratio, and filling the blank area in each row of characters within the occlusion region with preset characters;
and a sentence acquisition unit, used for acquiring the sentences containing the preset characters;
and the complement module includes:
a semantic analysis unit, used for performing semantic analysis on the sentences through a natural language processing model to obtain semantic analysis results of the sentences;
and a complement unit, used for completing the occlusion region in the click-to-read page image according to the semantic analysis results of the sentences.
Further preferably, the content acquisition module further includes:
a search unit, used for searching a database for a target storage page that matches the completed click-to-read page image;
an identification and positioning unit, used for identifying and locating the indicator in the click-to-read page image;
and a content acquisition unit, used for acquiring, from the target storage page, the click-to-read content corresponding to the indicator.
In yet another aspect, there is also provided a home teaching machine including a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method for recognizing click-to-read content as described in any one of the above when the computer program is executed.
In yet another aspect, there is also provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for recognizing click-to-read content of any one of the above.
Compared with the prior art, the identification method and device for the click-to-read content, the home teaching machine and the storage medium have the following beneficial effects:
When an occlusion region exists in the click-to-read page image, the occlusion region is completed by natural language processing technology, which improves the accuracy of click-to-read content recognition and solves the prior-art problem that occlusion causes low recognition accuracy or failed recognition.
Drawings
The above characteristics, technical features, advantages and implementations of the method and device for identifying click-to-read content, the home teaching machine and the storage medium are further described below in a clear and understandable manner with reference to the accompanying drawings.
FIG. 1 is a flow chart of one embodiment of a method for recognizing click-to-read content according to the present invention;
FIG. 2 is a flow chart of another embodiment of a method for recognizing click-to-read content according to the present invention;
FIG. 3 is a flow chart of a further embodiment of a method for recognizing click-to-read content according to the present invention;
FIG. 4 is a flow chart of yet another embodiment of a method for recognizing click-to-read content according to the present invention;
FIG. 5 is a schematic block diagram of the structure of an embodiment of a device for recognizing click-to-read content according to the present invention;
FIG. 6 is a schematic block diagram of the structure of an embodiment of a home teaching machine according to the present invention.
Description of the reference numerals
110. image acquisition module; 120. identification module; 121. image processing unit; 122. identification unit; 130. content acquisition module; 131. filling unit; 132. sentence acquisition unit; 133. search unit; 134. identification and positioning unit; 135. content acquisition unit; 140. complement module; 141. semantic analysis unit; 142. complement unit; 200. home teaching machine; 210. memory; 220. processor.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will explain the specific embodiments of the present invention with reference to the accompanying drawings. It is evident that the drawings in the following description are only examples of the invention, from which other drawings and other embodiments can be obtained by a person skilled in the art without inventive effort.
It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
To keep the drawings simple, the figures show only schematically the parts relevant to the present invention, and they do not represent the actual structure of a product. In addition, to simplify the drawings for ease of understanding, where several components in a drawing have the same structure or function, only one of them is drawn or labeled. Herein, "a" or "one" covers not only "exactly one" but also "more than one".
The invention provides an embodiment of a method for identifying click-to-read content. As shown in FIG. 1, the method comprises the following steps:
S100, acquiring a click-to-read page image;
Specifically, when the user is studying, the front camera of the home teaching machine can be turned on and the click-to-read mode entered. When the user points at a book and the pointing finger has come to rest, the camera photographs the page the finger is pointing at; the resulting page image is the click-to-read page image.
S200, identifying an occlusion region in the click-to-read page image;
Specifically, when the user points at the book with a finger, the finger itself creates an occlusion region in the photographed click-to-read page image, so the occlusion region must first be identified.
The occlusion region can be identified with a trained image recognition model: training samples are obtained first, and the constructed image recognition model is then trained on them to obtain the trained model. The training samples include at least page images in which the page is occluded by objects such as fingers and pens.
S300, when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio, acquiring the content of the peripheral area of the occlusion region;
Specifically, after the occlusion region is identified in the click-to-read page image, the ratio of the occlusion region to the click-to-read page image is calculated, that is, the percentage of the click-to-read page image that the occlusion region occupies. When the ratio is smaller than the preset ratio, the occlusion is caused only by the pointing indicator, such as a finger; the occlusion region is small and does not affect identification of the click-to-read content. In this case, a matching target storage page is searched for in the database directly according to the click-to-read page image, to determine which page of the book the image corresponds to; the position of the click-to-read area pointed to by the indicator is identified in the click-to-read page image; and finally the corresponding click-to-read content is acquired from the target storage page according to that position and is played or displayed.
When searching the database for the matching target storage page, the search can be based on the characters in the click-to-read page image. For example, the database can be searched directly for a storage page whose text repetition rate with the page image is greater than a preset threshold. The preset threshold may be set according to the preset ratio: it should be lowered appropriately when the preset ratio is raised, but it must not be set too low, so that the search remains accurate.
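The text-based page search described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not define "text repetition rate", so the sketch assumes it is the share of recognized characters that also occur in the stored page (counted as a multiset). The names `text_repetition_rate` and `find_target_page`, the dictionary-based database and the threshold 0.8 are all hypothetical.

```python
from collections import Counter
from typing import Dict, Optional


def text_repetition_rate(page_text: str, stored_text: str) -> float:
    """Assumed definition: fraction of non-space characters of the
    recognized page that are also present in the stored page."""
    page = Counter(ch for ch in page_text if not ch.isspace())
    stored = Counter(ch for ch in stored_text if not ch.isspace())
    total = sum(page.values())
    if total == 0:
        return 0.0
    shared = sum((page & stored).values())  # multiset intersection
    return shared / total


def find_target_page(page_text: str,
                     database: Dict[str, str],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the id of the best stored page whose rate exceeds the
    (hypothetical) preset threshold, or None if no page qualifies."""
    best_id, best_rate = None, threshold
    for page_id, stored_text in database.items():
        rate = text_repetition_rate(page_text, stored_text)
        if rate > best_rate:
            best_id, best_rate = page_id, rate
    return best_id
```

Character counting is used here only because it needs no tokenizer; a real system might compare words or lines instead.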
If the ratio of the occlusion region to the click-to-read page image is larger than the preset ratio, occluding objects other than the pointing finger are present in the click-to-read page image. For example, the user may rest several fingers or a clenched fist on the page while pointing, so that the photographed click-to-read page image contains a large occluded area. When the occlusion region is large, matching directly against the database according to the click-to-read page image may return several storage pages, making the match inaccurate, so the occlusion region needs to be completed first. To complete the occlusion region, the content of its peripheral area must be acquired first. It should be noted that if the occlusion region is very large, it cannot be completed; in that case the user needs to be prompted and the click-to-read page image acquired again.
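The three cases described for step S300 (small occlusion, large occlusion, occlusion too large to complete) can be sketched as a simple decision function. The source gives no numeric values, so `preset_ratio=0.1` and `max_ratio=0.5` are hypothetical, as are the function names and the representation of the occlusion region as a 0/1 mask.

```python
from typing import List


def occlusion_ratio(mask: List[List[int]]) -> float:
    """Fraction of image pixels marked as occluded (mask value 1)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    occluded = sum(sum(row) for row in mask)
    return occluded / total


def choose_action(mask: List[List[int]],
                  preset_ratio: float = 0.1,
                  max_ratio: float = 0.5) -> str:
    """Pick the next step; both thresholds are illustrative assumptions."""
    ratio = occlusion_ratio(mask)
    if ratio > max_ratio:
        return "recapture"       # too large to complete: prompt the user
    if ratio > preset_ratio:
        return "complete"        # complete the occlusion region first
    return "match_directly"      # only the indicator occludes the page
```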
S400, completing the occlusion region in the click-to-read page image according to the content of the peripheral area;
Specifically, after the content of the peripheral area is obtained, it is processed by a natural language processing model according to the relevance between the preceding and following content in the peripheral area, so as to complete the occlusion region.
The natural language processing model is trained on corpus samples from a corpus. The corpus can be obtained by digitizing paper texts, or by crawling data from the web. After the corpus is obtained, it is preprocessed, for example by data cleaning, word segmentation, part-of-speech tagging and feature word removal. Data cleaning removes unneeded noise data, for example advertisements, tags and comments in crawled web page content. Common data cleaning approaches include manual deduplication, alignment, deletion and labeling, as well as regular expression matching, extraction by part of speech and named entity, and batch processing with scripts or code.
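As an illustration of the regular-expression style of data cleaning mentioned above, the following minimal sketch strips markup tags and advertisement lines from crawled text and removes duplicate sentences. The `[AD]` marker format and the function names are hypothetical; real crawled pages would need patterns tailored to their actual noise.

```python
import re
from typing import List

TAG_RE = re.compile(r"<[^>]+>")                      # HTML-like tags
AD_RE = re.compile(r"\[(?:ad|advertisement)\][^\n]*",
                   re.IGNORECASE)                    # hypothetical ad marker
WS_RE = re.compile(r"\s+")


def clean_crawled_text(raw: str) -> str:
    """Remove tags and marked advertisement lines, then collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = AD_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip()


def deduplicate(sentences: List[str]) -> List[str]:
    """Drop repeated sentences while keeping the original order."""
    seen = set()
    unique = []
    for sentence in sentences:
        if sentence not in seen:
            seen.add(sentence)
            unique.append(sentence)
    return unique
```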
Word segmentation splits sentences or paragraphs into individual characters or words. Part-of-speech tagging labels each character or word with its part of speech, such as adjective, verb or noun. Words are marked as occluded in the preprocessed corpus to form training samples, and the natural language processing model is trained on these samples; the trained model can then be used to complete the occlusion region according to the content of the peripheral area.
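One plausible way to form such training samples is to hide a contiguous run of words in an already segmented sentence and keep the hidden words as the target. The sketch below assumes exactly that; the source does not say how occluded words are chosen, so the random span selection, the `_` placeholder token and the function name are illustrative assumptions.

```python
import random
from typing import List, Tuple

MASK = "_"  # hypothetical placeholder token for an occluded word


def make_masked_sample(tokens: List[str],
                       span_len: int,
                       rng: random.Random) -> Tuple[List[str], List[str]]:
    """Hide `span_len` consecutive tokens; return (masked input, target)."""
    if not 0 < span_len <= len(tokens):
        raise ValueError("span_len must be between 1 and len(tokens)")
    start = rng.randrange(len(tokens) - span_len + 1)
    target = tokens[start:start + span_len]
    masked = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]
    return masked, target
```

A contiguous span is used because a finger or fist typically hides adjacent characters in a row rather than scattered ones.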
S500, acquiring click-to-read content in the completed click-to-read page image.
Specifically, after the occlusion region in the click-to-read page image is completed, the content the user clicked on can be accurately obtained from the completed click-to-read page image.
After the content clicked on by the user is acquired, the corresponding content is retrieved from the database according to the clicked content combined with the user's voice information, and is played as audio or displayed. For example, if the content the user clicked on is an exercise question and the voice information is "how to solve", the solution process for the question is retrieved from the database by combining the clicked content with the voice information, and is displayed to the user.
In this embodiment, when an occlusion region exists in the click-to-read page image, the occlusion region is completed by natural language processing technology, which improves the accuracy of click-to-read content recognition and solves the prior-art problem that occlusion causes low recognition accuracy or failed recognition.
The present invention provides another embodiment of a method for identifying click-to-read content. As shown in FIG. 2, the method includes:
S100, acquiring a click-to-read page image;
S210, performing binarization processing on the click-to-read page image according to the color difference between the occluding object and the page in the click-to-read page image, to obtain a binarized click-to-read page image;
Specifically, the occlusion region can be identified in the click-to-read page image according to the color difference between the occluding object and the page. In the click-to-read page image, the page is treated as the background and the occluding object as the target. Using the difference between target and background, the image is divided into two levels: a suitable threshold is selected to decide whether each pixel belongs to the target or the background, and the image is binarized accordingly to obtain the binarized click-to-read page image.
S220, identifying the occlusion region in the click-to-read page image according to the binarized click-to-read page image;
Specifically, a binarized image shows a clear black-and-white effect: white areas generally represent the background and black areas the target, so the two are easy to distinguish. Therefore, once the binarized click-to-read page image is obtained, the contour of the occlusion region can be extracted from it, and the area enclosed by that contour is the occlusion region.
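Steps S210 and S220 can be sketched on a tiny grayscale image as follows. This is a simplification under stated assumptions: the image is a list of rows of gray levels, the fixed threshold 128 is hypothetical, every dark pixel is treated as part of the occluding object (printed text is ignored), and a bounding box stands in for the contour. A practical system would threshold on color and extract true contours with an image-processing library.

```python
from typing import List, Optional, Tuple

Gray = List[List[int]]  # grayscale image, 0 (black) to 255 (white)


def binarize(image: Gray, threshold: int = 128) -> List[List[int]]:
    """Return a mask: 1 for target (darker than threshold), 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def occlusion_bbox(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Inclusive bounding box (top, left, bottom, right) of the target
    pixels, or None when the mask contains no target pixel."""
    coords = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)
```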
S300, when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio, acquiring the content of the peripheral area of the occlusion region;
Specifically, after the occlusion region is identified in the click-to-read page image, the ratio of the occlusion region to the click-to-read page image is calculated, that is, the percentage of the click-to-read page image that the occlusion region occupies. When the ratio is smaller than the preset ratio, the occlusion is caused only by the pointing indicator, such as a finger; the occlusion region is small and does not affect identification of the click-to-read content, so the content the user clicked on can be identified directly from the click-to-read page image.
If the ratio of the occlusion region to the click-to-read page image is larger than the preset ratio, occluding objects other than the pointing finger are present in the click-to-read page image. For example, the user may rest several fingers or a clenched fist on the page while pointing, so that the photographed click-to-read page image contains a large occluded area. When the occlusion region is large, matching directly against the database according to the click-to-read page image may be inaccurate, so the occlusion region needs to be completed first. To complete the occlusion region, the content of its peripheral area must be acquired first. It should be noted that if the occlusion region is very large, it cannot be completed; in that case the user needs to be prompted and the click-to-read page image acquired again.
S400, completing the occlusion region in the click-to-read page image according to the content of the peripheral area;
Specifically, after the content of the peripheral area is obtained, it is processed by a natural language processing model according to the relevance between the preceding and following content in the peripheral area, so as to complete the occlusion region.
S500, acquiring the click-to-read content pointed by the indicator in the completed click-to-read page image.
The present invention provides a further embodiment of a method for identifying click-to-read content. As shown in FIG. 3, the method includes:
S100, acquiring a click-to-read page image;
S200, identifying an occlusion region in the click-to-read page image;
S310, when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio, deleting the pixel points of the occlusion region from the click-to-read page image, and filling the blank area in each row of characters within the occlusion region with preset characters;
Specifically, after the occlusion region is identified in the click-to-read page image and it is determined that the occlusion region needs to be completed, all pixel points corresponding to the occlusion region are deleted from the image, and the blank area in each row of characters within the occlusion region is then filled with preset characters. The blank space between rows of characters does not need to be filled, which keeps the rows distinguishable and makes it easier to obtain the sentences containing the preset characters later. The preset characters may be underscores, wavy lines or other symbols.
For example, suppose the page contains 15 rows of characters and the occlusion region covers part of the characters in rows four through eight. After the occlusion region is deleted from the click-to-read page image, the occluded characters in rows four through eight are replaced with underscores.
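The filling step can be illustrated at the text level, assuming the rows of the page have already been recognized as strings and the occluded character span of each affected row is known. Both assumptions, the `_` placeholder and the function name are hypothetical; the source operates on image pixels and does not specify a data structure.

```python
from typing import Dict, List, Tuple

PLACEHOLDER = "_"  # one of the preset characters suggested in the text


def fill_occluded(lines: List[str],
                  occluded: Dict[int, Tuple[int, int]]) -> List[str]:
    """Replace the occluded span of each affected row with placeholders.

    `occluded` maps a 0-based row index to a (start, end) character
    span, with `end` exclusive. Rows not listed are left unchanged.
    """
    filled = []
    for index, line in enumerate(lines):
        if index in occluded:
            start, end = occluded[index]
            start = max(0, start)
            end = min(len(line), end)
            width = max(0, end - start)
            line = line[:start] + PLACEHOLDER * width + line[end:]
        filled.append(line)
    return filled
```

Filling row by row, rather than across the whole region, mirrors the point above that the space between rows is left unfilled so rows stay distinguishable.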
S320, acquiring sentences containing the preset characters;
Specifically, after the blank area corresponding to the occlusion region in each row of the click-to-read page image is filled with the preset characters, every sentence that includes the preset characters is extracted from the image. Sentences can be delimited by punctuation marks: in general, one sentence-ending mark is taken as the starting point and the next adjacent sentence-ending mark as the end point, and the characters between them form one sentence. According to the filled-in preset characters, all individual sentences that include them are extracted from the click-to-read page image; each extracted sentence contains at least one preset character.
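A minimal sketch of this sentence extraction, assuming the page text has already been joined into one string and that sentences end with a full stop, exclamation mark or question mark (Chinese or Western). The function names and the `_` placeholder are hypothetical.

```python
import re
from typing import List

# A run of non-terminator characters followed by an optional terminator.
SENTENCE_RE = re.compile(r"[^。！？.!?]+[。！？.!?]?")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences at sentence-ending punctuation."""
    return [s.strip() for s in SENTENCE_RE.findall(text) if s.strip()]


def sentences_with_placeholder(text: str, placeholder: str = "_") -> List[str]:
    """Keep only sentences containing at least one placeholder character."""
    return [s for s in split_sentences(text) if placeholder in s]
```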
S410, carrying out semantic analysis on the sentence through a natural language processing model to obtain a semantic analysis result of the sentence;
Specifically, after the sentences that include the preset characters are extracted from the click-to-read page image, each sentence is input into the trained natural language processing model. The model performs syntactic analysis on each sentence, parsing its clause substructures and phrases and finding the relationships among the words and phrases and their roles in the sentence, and then infers the semantics of each sentence.
S420, completing the occlusion region in the click-to-read page image according to the semantic analysis results of the sentences;
Specifically, after the natural language processing model infers the semantics of each sentence, the content of the occlusion region can be deduced from those semantics, and the occlusion region is then completed with that content.
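To make the idea of context-based completion concrete, the toy sketch below fills each placeholder with the word that co-occurs most often with its left and right neighbours in a small corpus. This bigram counter is a deliberately simple stand-in chosen for illustration; it is not the natural language processing model of the invention, which the source does not specify in detail, and all names are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def train_bigrams(corpus: List[List[str]]) -> Counter:
    """Count adjacent word pairs in a tokenized corpus."""
    counts: Counter = Counter()
    for tokens in corpus:
        counts.update(zip(tokens, tokens[1:]))
    return counts


def _score(word: str, left: Optional[str], right: Optional[str],
           bigrams: Counter) -> int:
    """How often `word` follows `left` plus how often it precedes `right`."""
    score = 0
    if left is not None:
        score += bigrams[(left, word)]
    if right is not None:
        score += bigrams[(word, right)]
    return score


def complete(tokens: List[str], bigrams: Counter,
             placeholder: str = "_") -> List[str]:
    """Replace each placeholder with the best-scoring known word;
    leave it unchanged when no word fits the context at all."""
    vocab = sorted({w for pair in bigrams for w in pair})
    result = list(tokens)
    for i, tok in enumerate(result):
        if tok != placeholder or not vocab:
            continue
        left = result[i - 1] if i > 0 else None
        right = result[i + 1] if i + 1 < len(result) else None
        best = max(vocab, key=lambda w: _score(w, left, right, bigrams))
        if _score(best, left, right, bigrams) > 0:
            result[i] = best
    return result
```

Leaving a placeholder unfilled when no candidate scores above zero corresponds loosely to the case in which the occlusion cannot be completed and the image should be acquired again.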
S500, acquiring the click-to-read content pointed by the indicator in the completed click-to-read page image.
The present invention provides yet another embodiment of a method for identifying click-to-read content. As shown in FIG. 4, the method includes:
S100, acquiring a click-to-read page image;
S200, identifying an occlusion region in the click-to-read page image;
S300, when the ratio of the occlusion region to the click-to-read page image is larger than a preset ratio, acquiring the content of the peripheral area of the occlusion region;
S400, completing the occlusion region in the click-to-read page image according to the content of the peripheral area;
S510, searching a database for a target storage page that matches the completed click-to-read page image;
S520, identifying and locating the indicator in the click-to-read page image;
S530, acquiring, from the target storage page, the click-to-read content corresponding to the indicator.
Specifically, after the content of the shielding area is complemented, searching a matched target storage page in the database according to the complemented click-to-read page image, and searching and matching can be performed according to characters in the complemented click-to-read page image during matching. For example, the database can be directly searched for a storage page with a text repetition rate greater than a preset threshold in the completed click-to-read page image.
The indicator in the click-to-read page image is then identified and located. The indicator is a finger, a pen, or another tool that the user uses for click-to-read. The click-to-read content corresponding to the indicator is obtained from the target storage page according to the position of the indicator in the click-to-read page image.
After the content clicked by the user is acquired, the corresponding content is obtained from the database according to the clicked content in combination with the user's voice information, and is then played by voice or displayed. For example, if the content clicked by the user is a question and the voice information is "how to solve", the solution process for the question is obtained from the database by combining the clicked content with the voice information, and is displayed to the user.
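The lookup of click-to-read content from the indicator position can be sketched as follows. The sketch assumes that the target storage page stores each piece of content with a rectangular bounding box, and that the indicator position has already been mapped into the coordinate system of the storage page; the `Region` class and the `content_at` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """A rectangular area of a stored page and the content it holds (hypothetical layout)."""
    left: float
    top: float
    right: float
    bottom: float
    content: str


def content_at(regions: List[Region], x: float, y: float) -> Optional[str]:
    """Return the content of the first region containing point (x, y), else None."""
    for region in regions:
        if region.left <= x <= region.right and region.top <= y <= region.bottom:
            return region.content
    return None
```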
It should be understood that, in the foregoing embodiments, the magnitude of the step sequence numbers does not imply an order of execution. The order in which the steps are executed should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation process of the embodiments of the present invention in any way.
The present invention also provides an embodiment of a device for recognizing click-to-read content, as shown in fig. 5, the device for recognizing click-to-read content includes:
an image acquisition module 110, configured to acquire a click-to-read page image;
specifically, when a user is studying, the front camera of the home teaching machine can be started and the click-to-read mode entered. When the user performs click-to-read on a book, once the finger used for click-to-read has become stable, an image of the book page pointed to by the finger is captured by the camera; this image is the click-to-read page image.
The identifying module 120 is configured to identify an occlusion region in the click-to-read page image;
specifically, when a user uses a finger to perform click-to-read on a book, the presence of the finger causes a shielding area to exist in the photographed click-to-read page image; therefore, the shielding area in the click-to-read page image needs to be identified first.
The shielding area in the click-to-read page image can be identified by a trained image identification model: training samples are obtained first, and the constructed image identification model is then trained with the training samples to obtain the trained image identification model. The training samples include at least page images in which the page is shielded by objects such as fingers or pens.
The content acquisition module 130 is configured to acquire content of a peripheral area of the shielding area when a ratio of the shielding area to the click page image is greater than a preset ratio;
specifically, after the shielding area is identified in the click-to-read page image, the ratio of the shielding area to the click-to-read page image is calculated, that is, the percentage of the click-to-read page image occupied by the shielding area. When the ratio is smaller than the preset ratio, the shielding is caused only by an indicator such as a finger; the shielding area is small and does not affect identification of the click-to-read content. In this case, a matched target storage page is searched for directly in the database according to the click-to-read page image, so as to determine which page of the book the click-to-read page image corresponds to. The position of the click-to-read area pointed to by the indicator is then identified in the click-to-read page image, the corresponding click-to-read content is acquired from the target storage page according to that position, and the acquired click-to-read content is played or displayed.
When searching the database for the matched target storage page, the search can be performed according to the characters in the click-to-read page image. For example, the database can be searched directly for a storage page whose text repetition rate with the click-to-read page image is greater than a preset threshold. The preset threshold may be set according to the preset ratio and should be appropriately reduced when the preset ratio is increased; however, to ensure the accuracy of the search, the preset threshold cannot be set too low.
If the ratio of the shielding area to the click-to-read page image is larger than the preset ratio, this indicates that the click-to-read page image contains other shielding objects besides the finger used for click-to-read. For example, the user may rest several fingers or a clenched fist on the page while clicking, so that the photographed click-to-read page image contains a large shielded area. When the shielding area is large, matching directly in the database according to the click-to-read page image may match a plurality of storage pages and produce an inaccurate result, so the shielding area needs to be completed first. To complete the shielding area, the content of its peripheral area must be acquired first. It should be noted that if the shielding area is very large, it cannot be completed; in this case, the user needs to be prompted and the click-to-read page image needs to be acquired again.
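The ratio calculation and the three resulting cases can be sketched as follows. The sketch assumes the shielding area is available as a rectangular 0/1 mask of the same size as the page image. The names `occlusion_ratio` and `choose_action` and both numeric limits are hypothetical; in particular, the text gives no value for when a shielding area counts as "very large", so `max_ratio` is purely an assumption.

```python
from typing import List


def occlusion_ratio(mask: List[List[int]]) -> float:
    """Fraction of pixels marked as shielded (non-zero) in a 0/1 mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    shielded = sum(1 for row in mask for value in row if value)
    return shielded / total


def choose_action(ratio: float, preset_ratio: float = 0.05,
                  max_ratio: float = 0.5) -> str:
    """Map the ratio to one of the three cases described in the text."""
    if ratio > max_ratio:
        return "reacquire"        # too much is hidden: prompt the user, take a new image
    if ratio > preset_ratio:
        return "complete"         # complete the shielding area before matching
    return "match_directly"       # only the indicator shields the page
```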
The complement module 140 is configured to complement the occlusion area in the click-to-read page image according to the content of the peripheral area;
specifically, after the content of the peripheral area is obtained, the content of the peripheral area is processed by a natural language processing model according to the relevance between the preceding and following content of the peripheral area, so as to complete the shielding area.
The natural language processing model is obtained by training on corpus samples from a corpus. The corpus can be obtained by digitizing paper texts, or by crawling data from the web. After the corpus is obtained, it is preprocessed, for example by data cleaning, word segmentation, part-of-speech tagging, feature word removal and the like. Data cleaning removes unnecessary noise data, for example advertisements, tags, and comments in crawled web page content. Common data cleaning methods include manual deduplication, alignment, deletion, and labeling, as well as regular expression matching, extraction according to parts of speech and named entities, and batch processing with scripts or code.
Word segmentation divides sentences or paragraphs into individual words or terms. Part-of-speech tagging labels each word or term with its part of speech, such as adjective, verb, or noun. Masked words are then marked in the preprocessed corpus to form training samples, and the natural language processing model is trained with these samples. The trained natural language processing model can be used to complete the shielding area according to the content of the peripheral area.
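The regular-expression style of data cleaning mentioned above can be sketched as follows. This is an illustration only: it removes markup tags from crawled text and splits the result into words and punctuation marks. The names `clean_text` and `segment` are hypothetical, and the simple pattern-based segmentation assumes space-delimited text; segmenting Chinese text would need a dedicated segmenter, which is not shown here.

```python
import re
from typing import List

# Matches anything that looks like a markup tag, including comments such as <!-- ad -->.
TAG_PATTERN = re.compile(r"<[^>]+>")


def clean_text(raw: str) -> str:
    """Remove markup tags and collapse runs of whitespace."""
    without_tags = TAG_PATTERN.sub(" ", raw)
    return " ".join(without_tags.split())


def segment(text: str) -> List[str]:
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)
```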
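Forming training samples by masking words can be sketched as follows. The text does not say how the masked words are chosen, so random selection with a fixed probability is an assumption here, as are the `[MASK]` placeholder and the function name `make_masked_sample`.

```python
import random
from typing import List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token


def make_masked_sample(tokens: List[str], mask_prob: float,
                       rng: random.Random) -> Tuple[List[str], List[int]]:
    """Return the masked token list and the positions that were masked."""
    masked: List[str] = []
    positions: List[int] = []
    for index, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            positions.append(index)
        else:
            masked.append(token)
    return masked, positions
```

The original tokens at the returned positions serve as the labels that the model is trained to predict.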
The content obtaining module 130 is further configured to obtain click-to-read content pointed by the pointer in the completed click-to-read page image.
Specifically, after the shielding area in the click-to-read page image is complemented, the content clicked by the user can be accurately obtained according to the complemented click-to-read page image.
After the content clicked by the user is acquired, the corresponding content is obtained from the database according to the clicked content in combination with the user's voice information, and is then played by voice or displayed. For example, if the content clicked by the user is a question and the voice information is "how to solve", the solution process for the question is obtained from the database by combining the clicked content with the voice information, and is displayed to the user.
In this embodiment, when a shielding area exists in the click-to-read page image, the shielding area is completed by natural language processing technology, which improves the accuracy of click-to-read content identification and solves the problem in the prior art that, because of shielding, the identification accuracy is low or the content cannot be identified at all.
Preferably, the identification module 120 includes:
an image processing unit 121, configured to perform binarization processing on the click-to-read page image according to a color difference between a shielding object in the click-to-read page image and the click-to-read page to obtain a binarized click-to-read page image;
specifically, the shielding area in the click-to-read page image can be identified according to the color difference between the shielding object and the click-to-read page. In the click-to-read page image, the page can be treated as the background and the shielding object as the target. Using the difference between the target and the background, the pixels of the click-to-read page image are divided into two levels: a suitable threshold is selected to decide whether each pixel in the image belongs to the target or to the background. Binarizing the click-to-read page image in this way yields the binarized click-to-read page image.
The identifying unit 122 is configured to identify an occlusion region in the click-to-read page image according to the binarized click-to-read page image.
Specifically, in the binarized image, the entire image exhibits an obvious black-and-white effect, a white area generally represents a background, a black area represents a target, and the background and the target can be distinguished easily according to the obvious black-and-white effect. Therefore, after the binarization click-to-read page image is obtained, the contour information of the shielding area can be obtained from the binarization click-to-read page image, and the area formed by the contour of the shielding area is the shielding area.
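The binarization and region extraction described above can be sketched as follows. The sketch works on a rectangular grayscale image given as a list of rows, uses a fixed threshold (an assumption; the text only says a proper threshold is selected), and takes the largest 4-connected group of target pixels as the shielding area. The function names are hypothetical, and taking the largest connected region stands in for the contour extraction that the text mentions.

```python
from collections import deque
from typing import List, Set, Tuple


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Mark pixels darker than the threshold as target (1), the rest as background (0)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def largest_region(binary: List[List[int]]) -> Set[Tuple[int, int]]:
    """Return the (row, col) pixels of the largest 4-connected target region."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen: Set[Tuple[int, int]] = set()
    best: Set[Tuple[int, int]] = set()
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or (r, c) in seen:
                continue
            # Breadth-first search collects every target pixel connected to (r, c).
            region: Set[Tuple[int, int]] = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best
```

On a real page the printed characters are also dark, so a practical system would need a color-based criterion or a size filter to separate text from the shielding object; that refinement is left out of this sketch.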
Preferably, the content acquisition module 130 includes:
the filling unit 131 is configured to delete a pixel point in the shielding area in the click-to-read page image when the ratio of the shielding area to the click-to-read page image is greater than a preset ratio, and fill a blank area in each line of characters in the shielding area with preset characters;
specifically, after the shielding area is identified in the click-to-read page image and it is determined that the shielding area needs to be completed, all pixel points corresponding to the shielding area are deleted from the click-to-read page image, and the blank area of each row of characters within the shielding area is then filled with preset characters. The blank space between rows of characters is not filled, which keeps the rows distinguishable and makes it easier to obtain the sentences containing the preset characters later. The preset characters may be underlines, wavy lines, or other symbols.
For example, the page comprises 15 lines of characters, wherein the shielding area shields part of the characters in the fourth line to the eighth line, and after deleting the shielding area in the click-to-read page image, the characters shielded by the shielding object in the fourth line to the eighth line are filled by underline.
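The filling step in this example can be sketched as follows. The sketch assumes the page is already available as a list of text rows and that, for each affected row, the shielded column range is known; the function name `fill_occluded` and this data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def fill_occluded(lines: List[str], occluded: Dict[int, Tuple[int, int]],
                  preset: str = "_") -> List[str]:
    """Replace columns [start, end) of each listed row with the preset character."""
    filled = list(lines)
    for row, (start, end) in occluded.items():
        line = filled[row]
        start = max(0, start)
        end = min(end, len(line))
        if start < end:
            filled[row] = line[:start] + preset * (end - start) + line[end:]
    return filled
```

Rows that are not listed in `occluded` are returned unchanged, which mirrors the statement that only the shielded part of each row is filled.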
A sentence acquisition unit 132 for acquiring a sentence containing the preset character;
specifically, after the blank area corresponding to the shielding area in each row of the click-to-read page image is filled with preset characters, each sentence comprising the preset characters is extracted from the click-to-read page image. Sentences can be divided according to punctuation marks: generally, one full stop is taken as the starting point and the next adjacent full stop as the end point, and the characters between the starting point and the end point form one sentence. All single sentences comprising the preset characters are extracted from the click-to-read page image according to the filled preset characters, and each extracted sentence comprises at least one preset character.
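The sentence extraction can be sketched as follows. The sketch assumes the filled page text is available as a single string and splits it at common sentence-ending punctuation marks (both Western and Chinese); the function name `sentences_with_preset` is hypothetical.

```python
import re
from typing import List

# A sentence is a run of non-terminator characters plus an optional terminator.
SENTENCE_PATTERN = re.compile(r"[^.!?。！？]+[.!?。！？]?")


def sentences_with_preset(text: str, preset: str = "_") -> List[str]:
    """Return the sentences of text that contain at least one preset character."""
    sentences = SENTENCE_PATTERN.findall(text)
    return [sentence.strip() for sentence in sentences if preset in sentence]
```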
The complement module 140 includes:
the semantic analysis unit 141 is configured to perform semantic analysis on the sentence through a natural language processing model, so as to obtain a semantic analysis result of the sentence;
specifically, after the sentences comprising preset characters are extracted from the click-to-read page image, each sentence is input into a trained natural language processing model. The model performs syntactic analysis on each sentence, parses its sub-structures and phrases, finds the relationships among the words and phrases in the sentence, and then deduces the semantics of the sentence.
And a complementing unit 142, configured to complement the occlusion region in the click-to-read page image according to the semantic parsing result of the sentence.
Specifically, after the semantics of each sentence are deduced by the natural language processing model, the content of the shielding area can be deduced according to the semantics of the sentences, and then the content of the shielding area is complemented.
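The interface of the completion step can be illustrated with a deliberately simple stand-in. The sketch below does not use a trained natural language processing model; it swaps in plain pattern matching against a list of known sentences, treating each run of preset characters as a wildcard. The function name `complete_sentence` and the idea of a list of candidate sentences are assumptions made only to show what goes in and what comes out.

```python
import re
from typing import List, Optional


def complete_sentence(masked: str, candidates: List[str],
                      preset: str = "_") -> Optional[str]:
    """Return the first candidate that agrees with masked outside the preset runs."""
    parts = []
    # The capturing group keeps the runs of preset characters in the split result.
    for chunk in re.split("(" + re.escape(preset) + "+)", masked):
        if not chunk:
            continue
        if chunk.strip(preset) == "":
            parts.append(".+?")            # a shielded span: any non-empty text
        else:
            parts.append(re.escape(chunk))  # visible text must match literally
    pattern = re.compile("".join(parts))
    for candidate in candidates:
        if pattern.fullmatch(candidate):
            return candidate
    return None
```

A trained model would instead predict the shielded words from the semantics of the sentence, which also works when no stored sentence matches.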
Preferably, the content acquisition module 130 further includes:
a searching unit 133, configured to search a database for a target storage page that matches the completed click-to-read page image;
the identifying and positioning unit 134 is configured to identify and position the pointer in the click-to-read page image;
and the content obtaining unit 135 is configured to obtain, according to the indicator, click-to-read content corresponding to the indicator in the target storage page.
Specifically, after the content of the shielding area is complemented, searching a matched target storage page in the database according to the complemented click-to-read page image, and searching and matching can be performed according to characters in the complemented click-to-read page image during matching. For example, the database can be directly searched for a storage page with a text repetition rate greater than a preset threshold in the completed click-to-read page image.
The indicator in the click-to-read page image is then identified and located. The indicator is a finger, a pen, or another tool that the user uses for click-to-read. The click-to-read content corresponding to the indicator is obtained from the target storage page according to the position of the indicator in the click-to-read page image.
After the content clicked by the user is acquired, the corresponding content is obtained from the database according to the clicked content in combination with the user's voice information, and is then played by voice or displayed. For example, if the content clicked by the user is a question and the voice information is "how to solve", the solution process for the question is obtained from the database by combining the clicked content with the voice information, and is displayed to the user.
Fig. 6 is a schematic structural view of a home teaching machine provided in an embodiment of the present invention. As shown in fig. 6, the home teaching machine 200 includes a memory 210, a processor 220, and a computer program stored in the memory 210 and executable on the processor 220, for example a click-to-read content identification program. When executing the computer program, the processor 220 implements the steps in the above embodiments of the method for identifying click-to-read content, or implements the functions of the modules in the above embodiments of the device for identifying click-to-read content.
The home teaching machine 200 includes, but is not limited to, a processor 220, a memory 210. It will be appreciated by those skilled in the art that fig. 6 is merely an example of the home teaching machine 200 and is not limiting of the home teaching machine 200, and may include more or fewer components than shown, or may combine certain components, or different components, such as: the home teaching machine 200 may further include an input-output device, a display device, a network access device, a bus, and the like.
The processor 220 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 210 may be an internal storage unit of the home teaching machine 200, for example: hard disk or memory of the home teaching machine 200. The memory 210 may also be an external storage device of the home teaching machine 200, for example: a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) and the like provided in the home teaching machine 200. Further, the memory 210 may also include both an internal storage unit and an external storage device of the home teaching machine 200. The memory 210 is used to store computer programs and other programs and data required by the home teaching machine 200. The memory 210 may also be used to temporarily store data that has been output or is to be output.
In the foregoing embodiments, each embodiment is described with its own emphasis. For parts that are not described or depicted in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/home teaching machine and method may be implemented in other manners. For example, the apparatus/home teaching machine embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present application further provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for identifying click-to-read content of the above-described embodiments.
All or part of the flow of the above method embodiments of the present invention may also be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer readable storage medium and, when executed by the processor 220, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content included in the computer readable storage medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, computer readable media do not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
It should be noted that the above embodiments can be freely combined as needed. The foregoing is merely a preferred embodiment of the present invention. Those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are also intended to fall within the scope of protection of the present invention.

Claims (8)

1. A method for identifying click-to-read content, comprising:
acquiring a click-to-read page image;
identifying a shielding area in the click-to-read page image;
when the ratio of the shielding area to the click-to-read page image is larger than a preset ratio, acquiring the content of the peripheral area of the shielding area;
according to the content of the peripheral area, completing the shielding area in the click-to-read page image;
acquiring the click-to-read content pointed to by an indicator in the completed click-to-read page image;
wherein, when the ratio of the shielding area to the click-to-read page image is larger than a preset ratio, the acquiring the content of the peripheral area of the shielding area specifically comprises:
when the ratio of the shielding area to the click-to-read page image is larger than a preset ratio, deleting pixel points in the shielding area in the click-to-read page image, and filling blank areas in each row of characters in the shielding area by adopting preset characters;
acquiring sentences containing the preset characters;
according to the content of the peripheral area, the complementing of the shielding area in the click-to-read page image specifically comprises the following steps:
carrying out semantic analysis on the sentences through a natural language processing model to obtain semantic analysis results of the sentences;
and according to the semantic analysis result of the sentence, completing the shielding area in the click-to-read page image.
2. The method for identifying click-to-read content according to claim 1, wherein the identifying the shielding area in the click-to-read page image specifically comprises:
according to the color difference between the shielding object in the click-to-read page image and the click-to-read page, performing binarization processing on the click-to-read page image to obtain a binarized click-to-read page image;
and identifying a shielding area in the click-to-read page image according to the binarized click-to-read page image.
3. The method for identifying click-to-read content according to claim 1 or 2, wherein the obtaining the click-to-read content pointed by the indicator in the completed click-to-read page image specifically includes:
searching a target storage page matched with the completed click-to-read page image in a database;
identifying and positioning an indicator in the click-to-read page image;
and acquiring click-to-read content corresponding to the indicator from the target storage page according to the indicator.
4. A recognition apparatus for a click-to-read content, comprising:
the image acquisition module is used for acquiring a click-to-read page image;
the identification module is used for identifying the shielding area in the click-to-read page image;
the content acquisition module is used for acquiring the content of the peripheral area of the shielding area when the ratio of the shielding area to the click-to-read page image is larger than a preset ratio;
the complement module is used for complementing the shielding area in the click-to-read page image according to the content of the peripheral area;
the content acquisition module is also used for acquiring the click-to-read content pointed by the indicator in the completed click-to-read page image;
wherein, the content acquisition module includes:
the filling unit is used for deleting the pixel points in the shielding area in the click-to-read page image when the ratio of the shielding area to the click-to-read page image is larger than a preset ratio, and filling the blank area in each row of characters in the shielding area by adopting preset characters;
a sentence acquisition unit configured to acquire a sentence containing the preset character;
the complement module includes:
the semantic analysis unit is used for carrying out semantic analysis on the sentences through a natural language processing model to obtain semantic analysis results of the sentences;
and the complementing unit is used for complementing the shielding area in the click-to-read page image according to the semantic analysis result of the sentence.
5. The apparatus for recognizing click-to-read content according to claim 4, wherein the recognition module comprises:
the image processing unit is used for carrying out binarization processing on the click-to-read page image according to the color difference between the shielding object in the click-to-read page image and the click-to-read page to obtain a binarized click-to-read page image;
and the identification unit is used for identifying the shielding area in the click-to-read page image according to the binarized click-to-read page image.
6. The apparatus for recognizing click-to-read content according to claim 4 or 5, wherein the content acquisition module further comprises:
the searching unit is used for searching a target storage page matched with the completed click-to-read page image in the database;
the identifying and positioning unit is used for identifying and positioning the indicator in the click-to-read page image;
and the content acquisition unit is used for acquiring the click-to-read content corresponding to the indicator in the target storage page according to the indicator.
7. A home teaching machine comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method for identifying click-to-read content according to any one of claims 1-3.
8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method for identifying click-to-read content according to any one of claims 1-3.
CN201910887010.4A 2019-09-19 2019-09-19 Click-to-read content identification method and device, home teaching machine and storage medium Active CN110598217B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910887010.4A CN110598217B (en) 2019-09-19 2019-09-19 Click-to-read content identification method and device, home teaching machine and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910887010.4A CN110598217B (en) 2019-09-19 2019-09-19 Click-to-read content identification method and device, home teaching machine and storage medium

Publications (2)

Publication Number Publication Date
CN110598217A CN110598217A (en) 2019-12-20
CN110598217B true CN110598217B (en) 2023-10-20

Family

ID=68861103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910887010.4A Active CN110598217B (en) 2019-09-19 2019-09-19 Click-to-read content identification method and device, home teaching machine and storage medium

Country Status (1)

Country Link
CN (1) CN110598217B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111708902A (en) * 2020-06-04 2020-09-25 南京晓庄学院 Multimedia data acquisition method
CN112163513A (en) * 2020-09-26 2021-01-01 深圳市快易典教育科技有限公司 Information selection method, system, device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108494996A (en) * 2018-05-14 2018-09-04 Oppo广东移动通信有限公司 Image processing method, device, storage medium and mobile terminal
CN108551552A (en) * 2018-05-14 2018-09-18 Oppo广东移动通信有限公司 Image processing method, device, storage medium and mobile terminal
CN109656465A (en) * 2019-02-26 2019-04-19 广东小天才科技有限公司 Content acquisition method applied to family education equipment and family education equipment
CN109766412A (en) * 2019-01-16 2019-05-17 广东小天才科技有限公司 Learning content acquisition method based on image recognition and electronic equipment
CN109947273A (en) * 2019-03-25 2019-06-28 广东小天才科技有限公司 Point reading positioning method and device

Also Published As

Publication number Publication date
CN110598217A (en) 2019-12-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant