CN110969161A - Image processing method, circuit, visual impairment assisting apparatus, electronic apparatus, and medium - Google Patents


Info

Publication number
CN110969161A
CN110969161A (application CN201911214755.0A)
Authority
CN
China
Prior art keywords: text, line, stored, data, characters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911214755.0A
Other languages
Chinese (zh)
Other versions
CN110969161B (en)
Inventor
封宣阳
蔡海蛟
冯歆鹏
周骥
Current Assignee
NextVPU Shanghai Co Ltd
Original Assignee
NextVPU Shanghai Co Ltd
Priority date
Filing date
Publication date
Application filed by NextVPU Shanghai Co Ltd filed Critical NextVPU Shanghai Co Ltd
Priority to CN201911214755.0A
Publication of CN110969161A
Application granted
Publication of CN110969161B
Legal status: Active (granted)

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 — Scenes; Scene-specific elements
    • G06V20/60 — Type of objects
    • G06V20/62 — Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V30/00 — Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 — Character recognition
    • G06V30/14 — Image acquisition
    • G06V30/148 — Segmentation of character regions
    • G06V30/153 — Segmentation of character regions using recognition of characters or words

Abstract

Provided are an image processing method, a circuit, a visual impairment assisting apparatus, an electronic apparatus, and a medium. The image processing method comprises the following steps: acquiring an image, wherein the image comprises a text area; in the text area, performing character recognition on a text line to be recognized to obtain text data of the text line; and storing the text data of the text line to the text to be read.

Description

Image processing method, circuit, visual impairment assisting apparatus, electronic apparatus, and medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method, an image processing circuit, a vision-impairment assisting apparatus, an electronic apparatus, and a medium.
Background
In recent years, image processing techniques have been widely used in various fields, and among them, techniques related to recognition, storage, and reading of text data have been one of the focuses of interest in the industry.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be considered as having been acknowledged in any prior art.
Disclosure of Invention
According to an aspect of the present disclosure, there is provided an image processing method including: acquiring an image, wherein the image comprises a text area; in the text area, performing character recognition on a text line to be recognized to obtain text data of the text line; and storing the text data of the text line to the text to be read.
According to another aspect of the present disclosure, there is provided an electronic circuit comprising: circuitry configured to perform the steps of the above-described method.
According to another aspect of the present disclosure, there is also provided a vision-impairment assisting apparatus including: a camera configured to acquire an image; the electronic circuit described above; circuitry configured to perform text detection and recognition on text contained in the image to obtain text data; circuitry configured to read the text data.
According to another aspect of the present disclosure, there is also provided an electronic device including: a processor; and a memory storing a program comprising instructions which, when executed by the processor, cause the processor to perform the method described above.
According to another aspect of the present disclosure, there is also provided a non-transitory computer readable storage medium storing a program, the program comprising instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the above-described method.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and, together with the description, serve to explain their exemplary implementations. The illustrated embodiments are for purposes of illustration only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.
Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment of the present disclosure;
FIG. 2 illustrates an exemplary image including a text region having a plurality of lines of text therein;
fig. 3 is a flowchart illustrating an image processing method according to another exemplary embodiment of the present disclosure;
fig. 4 is a flowchart illustrating an image processing method according to another exemplary embodiment of the present disclosure;
FIG. 5 illustrates a recognition and storage process for a text line to be recognized according to an exemplary embodiment of the present disclosure;
FIG. 6 illustrates a recognition and storage process for a text line to be recognized according to another exemplary embodiment of the present disclosure;
fig. 7 is a flowchart illustrating an image processing method according to another exemplary embodiment of the present disclosure;
fig. 8 is a flowchart illustrating an image processing method according to still another exemplary embodiment of the present disclosure;
fig. 9(a), 9(b), 9(c), 9(d) are diagrams illustrating data storage processes according to exemplary embodiments of the present disclosure;
FIG. 10 is a schematic process showing reading of recognized text data according to an example embodiment of the present disclosure; and
fig. 11 is a block diagram illustrating an example of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In the present disclosure, unless otherwise specified, the use of the terms "first", "second", etc. to describe various elements is not intended to limit the positional relationship, the timing relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, based on the context, they may also refer to different instances.
The terminology used in the description of the various described examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. Furthermore, the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of the listed items.
Although image processing techniques related to character recognition have been widely used in various fields, currently, some challenges still remain in text data reading.
A conventional reading method may take one of two forms. The first is to store the recognized text data word by word in a data structure such as an array or a list. The disadvantage of this method is that reading the desired text out of such a structure is cumbersome, and the read text lacks semantic continuity and context. The second is to recognize all the characters in the image, store them as a whole, and then read the stored characters. The disadvantage of this approach is the long reading wait time it requires.
The present disclosure provides an image processing method. Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment of the present disclosure.
In this disclosure, a text line refers to a sequence of characters whose adjacent character spacing is not greater than a threshold spacing, i.e., a continuous line of characters. The adjacent character spacing is the distance between corresponding coordinates of adjacent characters, such as the distance between their upper-left corner coordinates, lower-right corner coordinates, or centroid coordinates. If the adjacent character spacing is not greater than the threshold spacing, the adjacent characters are considered continuous and are divided into the same text line. If it is greater than the threshold spacing, the adjacent characters are considered discontinuous (for example, they may belong to different paragraphs, or to left and right columns, respectively) and are divided into different text lines. The threshold spacing may be set according to the text size: for example, adjacent characters in a font size larger than size four (such as size three or size two) may use a larger threshold spacing than adjacent characters in a font size smaller than size four (such as small four or size five).
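The line-grouping rule above can be sketched as follows. This is a hypothetical illustration, not the patent's implementation: `group_into_lines`, the coordinate format, and the fixed threshold are all assumptions made for the example.

```python
# Hypothetical sketch of the line-grouping rule: adjacent characters whose
# spacing (distance between corresponding coordinates, here top-left corners)
# is not greater than the threshold belong to the same text line; a larger
# gap starts a new text line.

def group_into_lines(chars, threshold):
    """chars: (x, y) top-left coordinates of consecutive characters in
    reading order. Returns a list of text lines (lists of coordinates)."""
    lines = []
    current = []
    prev = None
    for pos in chars:
        if prev is not None:
            gap = ((pos[0] - prev[0]) ** 2 + (pos[1] - prev[1]) ** 2) ** 0.5
            if gap > threshold:  # discontinuous: e.g. a new paragraph/column
                lines.append(current)
                current = []
        current.append(pos)
        prev = pos
    if current:
        lines.append(current)
    return lines

# Characters 10 px apart, then a 100 px jump to a separate line or column.
print(group_into_lines([(0, 0), (10, 0), (20, 0), (120, 0)], threshold=50))
# → [[(0, 0), (10, 0), (20, 0)], [(120, 0)]]
```

A size-dependent threshold, as described above, could be supplied per region instead of the fixed value used here.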
As shown in fig. 2, the example image includes a text region having 5 text lines (5 lines of text). Note that an image is not limited to a single text region; there may also be multiple text regions, and each text region in the image may be processed using the image processing method of the present disclosure.
As shown in fig. 1, an image processing method according to an exemplary embodiment of the present disclosure includes: step S101, acquiring an image, wherein the image comprises a text area; step S102, in the text area, performing character recognition on a text line to be recognized to obtain text data of the text line; and step S103, storing the text data of the text line to the text to be read.
In step S101, an image is acquired, the image including a text region.
The acquired image may include a text region (i.e., a region containing text) with 5 lines of text as shown in fig. 2. As described above, the image acquired in step S101 may also include a plurality of text regions, and each text region may include a plurality of text lines.
According to some embodiments, the acquired image may contain text regions, each of which may contain at least two lines of text (at least 2 text lines), and the contained text may take various forms (including various characters, numbers, and the like). In addition to the text region, the image may also include pictures and the like.
According to some embodiments, the acquired image may be directly an image captured by the camera, or an image obtained by applying one or more pre-processing operations to the captured image, such as denoising, contrast enhancement, resolution adjustment, grayscale conversion, or deblurring.
According to some embodiments, the images captured by the camera may be acquired in real time, or may be acquired some time after the images are captured by the camera.
According to some embodiments, the acquired image may be a pre-screened image, for example the clearest image selected from multiple shots.
According to some embodiments, the camera used to capture the images is capable of still or motion image capture, which may be a stand-alone device (e.g., a camera, a video camera, a webcam, etc.), or may be included in a variety of electronic devices (e.g., a mobile phone, a computer, a personal digital assistant, a vision-impaired auxiliary device, a tablet computer, a reading auxiliary device, a wearable device, etc.).
According to some embodiments, the camera may be provided on a device such as a wearable device or glasses of the user, for example.
In step S102, in the text area, character recognition is performed on one text line to be recognized to obtain text data of the text line.
According to some embodiments, the text data of a text line may be obtained by performing character recognition on the text line, for example using optical character recognition (OCR).
According to some embodiments, text line detection is performed after the image is acquired and before character recognition.
According to some embodiments, each text line to be recognized in one text region may be sequentially detected and recognized, resulting in text data of the text line.
Taking the image shown in fig. 2 as an example, character recognition may be performed on the line 1 text first to obtain its text data ("onset", i.e., "turn on vision."). Then, character recognition is performed on the subsequent lines in sequence to obtain the corresponding text data.
Note, however, that detection and recognition do not necessarily have to start from the first line of the text region, but may start directly from other lines.
In step S103, the text data of the text line is stored in the text to be read.
According to some embodiments, every time a text line is identified, the identified text data of the text line can be stored to the text to be read for the reading device to read the text to be read.
According to some embodiments, the text to be read may be actively acquired by the reading device, or may be provided to the reading device by the recognition device for performing text line recognition.
According to some embodiments, after being read, the text to be read becomes read text, and the read text may be additionally stored as needed, for example, the read text may be additionally stored in order for use. According to some embodiments, the read text may not be stored, as the case may be.
Therefore, by recognizing the text lines in the text region line by line and, after each line is recognized, storing the obtained text data to the text to be read for the reading device, the disclosed method realizes asynchronous processing of recognition and reading and reduces the reading wait time.
Moreover, through the above steps, the text data of the detected and recognized text lines is spliced and stored to form the text to be read, so that the content read from the spliced text data has normal semantic continuity and context. This overcomes the word-by-word stuttering and lack of semantic continuity caused by reading word by word, and also largely overcomes the unnatural per-line breaks (hard, mechanical pauses or stutters between lines) caused by reading line by line. For example, according to an exemplary embodiment of the present disclosure, after the previous reading finishes, the text data (at least one line) that has been stored but not yet read is acquired and read continuously, line after line, so that the reading remains semantically linked and coherent. This differs from prior-art line-by-line reading, which can read only one line at a time and stutters noticeably between one line and the next, losing semantic continuity and coherence.
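The asynchronous recognize-and-store/read behaviour described above can be sketched with a simple producer-consumer pair. This is a minimal illustration under assumed names (`recognizer`, `reader`), not the patent's implementation; per-line OCR and text-to-speech output are replaced by stand-ins.

```python
# Minimal sketch of the asynchronous pipeline: recognition appends each
# line's text data to a shared "text to be read" buffer while the reader
# consumes whatever has been stored so far, so reading never waits for the
# whole region to be recognized.
import queue
import threading

def recognizer(lines, to_be_read):
    for line in lines:        # stands in for per-line character recognition
        to_be_read.put(line)  # store text data to the text to be read
    to_be_read.put(None)      # sentinel: recognition finished

def reader(to_be_read, spoken):
    while True:
        text = to_be_read.get()
        if text is None:
            break
        spoken.append(text)   # stands in for text-to-speech output

to_be_read = queue.Queue()
spoken = []
lines = ["line 1 text", "line 2 text", "line 3 text"]
t = threading.Thread(target=recognizer, args=(lines, to_be_read))
t.start()
reader(to_be_read, spoken)
t.join()
print(spoken)  # → ['line 1 text', 'line 2 text', 'line 3 text']
```

Because the queue preserves order, the reader receives the lines in recognition order, which is what keeps the read text semantically continuous.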
The image processing method according to an exemplary embodiment of the present disclosure may further include step S104, step S105, and step S106, as shown in fig. 3.
In step S104, the text data of the text line is stored to the stored text as one line of data in the stored text.
According to some embodiments, in addition to storing the recognized text data of the text line into the text to be read, it may also be stored into a stored text and, at the time of storage, may be stored as a line of data in the stored text. That is, the recognized text data may be stored also by line, as in the presentation form in the text region of the image. Specifically, for example, when the text data of 1 text line in the recognized text area is stored in the stored text, the text data is also stored separately as one line of data, so as to facilitate subsequent processing.
According to some embodiments, the text to be read and the stored text may be stored in different storage spaces. In addition, according to some embodiments, the read text may also be stored in a different storage space than the text to be read and the stored text.
In step S105, the sum of the number of characters of the stored text is calculated as the stored total number of characters.
The number of characters of a text line represents the length of that line. In the case where 1 Chinese character counts as 2 characters and 1 English letter, 1 digit, or 1 punctuation mark counts as 1 character, the number of characters of a text line can be calculated as "number of Chinese characters × 2 + number of English letters + number of digits + number of punctuation marks". For example, the text data of line 1 in fig. 2 ("onset", i.e., "turn on vision.") has 20 characters: 7 Chinese characters × 2 + 6 punctuation marks = 20 characters. The number of characters of a text line may also be calculated in other ways and is not limited to the exemplary manner shown here.
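The character-count rule just described can be sketched as follows. `char_count` is an assumed helper, and checking only the CJK Unified Ideographs block is a simplification; a fuller implementation would cover additional CJK ranges.

```python
# Sketch of the counting rule above: one Chinese character counts as
# 2 characters; letters, digits, and punctuation marks count as 1.

def char_count(line):
    total = 0
    for ch in line:
        # CJK Unified Ideographs block only (simplified check)
        if '\u4e00' <= ch <= '\u9fff':
            total += 2
        else:
            total += 1
    return total

# 7 Chinese characters x 2 + 6 punctuation marks = 20, as in the
# line-1 example (the text here is placeholder Chinese, not the figure's).
print(char_count("“一二三”，“四五六七”。"))  # → 20
```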
The stored total number of characters, the sum of the character counts of the stored text, changes as text lines are continually recognized and their text data stored; it is therefore updated with each store operation. This step focuses on that update.
The stored total number of characters may be calculated from the character counts of the stored text, for use in calculating the updated cutoff ratios in the subsequent step S106.
In the case where only one text line has been recognized and stored, so that the stored text contains only one line of data (the text data of that text line), the stored total number of characters is simply the number of characters of that line. For example, if only the line 1 text in fig. 2 has been recognized as the first text line, the stored total number of characters is 20 (with 1 Chinese character counted as 2 characters and 1 punctuation mark as 1 character).
In the case where multiple text lines have been recognized and stored, so that the stored text contains multiple lines of data, each corresponding to the text data of one of those text lines, the stored total number of characters is the sum of the character counts of those lines. When the line 1 through line 5 text shown in fig. 2 has been recognized and stored in the stored text, the stored total number of characters is 142, i.e., 20+26+30+25+41, the total from line 1 through line 5.
According to some embodiments, this step may be combined with step S106 into one step.
In step S106, the cutoff ratio of each line of data in the stored text is calculated and stored, where the cutoff ratio of a line is the ratio of (the sum of the character counts of the lines preceding it in the stored text, plus the character count of the line itself) to the stored total number of characters.
If the position of a recognized text line in the text area is to be determined, this can be done, for example, by calculating the cut-off ratio of the text data corresponding to the text line (i.e., the data of the corresponding line in the stored text).
Each time the text data of a new text line is recognized and stored, the cutoff ratios of the previously stored text lines change, so they are recalculated and updated; in addition, the cutoff ratio of the newly stored text line itself is calculated. How to calculate the cutoff ratio of a text line is described in more detail below with a specific example.
For example, assume the text data of 5 text lines, the line 1 through line 5 text in the text area shown in fig. 2, is already stored in the stored text; the stored total number of characters is then 142 (20+26+30+25+41), the total for lines 1 through 5.
For the line 3 text, the cutoff ratio is the ratio of the number of characters from line 1 through line 3 to the stored total number of characters, i.e., (20+26+30)/142 ≈ 54%.
The method of obtaining the cutoff ratio from character counts is given above to illustrate the meaning of "cutoff ratio". The cutoff ratio of each line of data may also be determined from other parameters, for example from the spatial position of the corresponding text line in the text region: the proportion of the area from the first text line through that text line to the entire text region, the proportion of the number of lines from the first text line through that text line to the total number of lines in the region, and so on.
According to some embodiments, the cutoff ratio of the most recently recognized and stored text line is always a particular value, 100%, because up to and including that line it covers all lines of the stored text, making its cutoff ratio unambiguous. This special value applies only to the most recently recognized and stored text line.
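The cutoff-ratio bookkeeping of steps S105–S106 can be sketched as follows; `cutoff_ratios` is an assumed helper that recomputes every line's ratio from the stored per-line character counts, so earlier lines' ratios shrink as new lines are stored and the newest line is always 100%.

```python
# Sketch of the cutoff-ratio rule: for each stored line, the ratio of
# (characters up to and including that line) to the stored total.

def cutoff_ratios(char_counts):
    total = sum(char_counts)          # S105: stored total number of characters
    ratios = []
    running = 0
    for n in char_counts:             # S106: one cutoff ratio per stored line
        running += n
        ratios.append(running / total)
    return ratios

# Character counts of the 5 example lines (20+26+30+25+41 = 142).
ratios = cutoff_ratios([20, 26, 30, 25, 41])
print([round(r * 100) for r in ratios])  # → [14, 32, 54, 71, 100]
```

Line 3's value, 76/142 ≈ 54%, matches the worked example above, and the last line is 100% as noted.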
The above describes identifying and storing text lines in a manner that facilitates multi-line reading, for example by splicing the stored text data together. Here, "spliced storage" refers to storing data obtained piecewise (for example, line by line) together in order.
Through steps S104-S106, the related information of a text line (such as its cutoff ratio) is stored in real time along with its text data, which facilitates management of the text data. Because the cutoff ratios of the stored text lines are dynamically updated during storage, an accurate cutoff ratio is always available; this greatly simplifies locating and re-reading text data, can substantially improve reading speed and accuracy, and thereby provides the service the user needs.
According to some embodiments, as shown in fig. 4, in step S110, a location identifier of the text line for indicating a location of the text line in the text region may be stored.
Wherein, the position identifier for indicating the position of the text line in the text area may be a line number. For example, the first row may be represented by "001". The location identity may also be represented in other ways and is not limited to the way illustrated here.
Although step S110 shown in fig. 4 immediately follows step S104 of storing text data to a stored text, the storage of the location identity may be performed together with the storage of the text data, i.e. step S110 may be merged to step S104. Alternatively, step S110 of storing the position identifier of the text line may be executed after step S106. In summary, the present disclosure is not limited to the steps and their order of execution shown in FIG. 4.
FIG. 5 illustrates a character recognition and storage process according to an exemplary embodiment of the present disclosure.
As shown in fig. 5, for example, in the case where the text line to be recognized is the line 1 text, the following are stored for it: the text data ("onset", i.e., "turn on vision."), the position identifier [001], and the cutoff ratio [100%]. Here, the values 001 and 100% shown in fig. 5 are exemplary, and the present disclosure is not limited to these numerical forms.
For convenience of description, information related to a text line (excluding the text data of the line itself) is sometimes referred to herein as "text line related information"; as shown in fig. 5, it may include, for example, the position identifier and the cutoff ratio.
According to some embodiments, the text data, the position identifier, and the cutoff ratio may be stored in a storage space such as a cache, or in other forms of storage space.
In the present disclosure, the storage of each text data, position identification, and cutoff duty is not necessarily in the order and position illustrated in fig. 5. Moreover, they are not necessarily stored adjacently, but the storage sequence and the storage position may be specifically arranged according to the actual requirement or the actual situation of the storage space, and will not be described in detail when the description of the storage is referred to later.
According to some embodiments, each line of data in the stored text is stored in association with its position identifier and cutoff ratio.
As described above, the related information may include the position identifier, the cutoff ratio, and the like of the text line.
In the present disclosure, the meaning of "associatively storing" may include, for example, storing text data of text lines (i.e., data of corresponding lines in stored text) and related information in the same storage area in order to facilitate uniform collection and management of all text data and related information; it may also include storing the text data of the text line together, for example, in the same storage area, so as to facilitate reading of the text data, and storing the related information of the text line in a different storage area from the text data, so as to facilitate collection and management of the related information. The "associative storage" may also include other storage manners, which are not mentioned herein. In short, the storage mode of the text data and the related information can be flexibly set according to the requirement.
According to some embodiments, when the text data is stored separately from the related information, a desired association may be established between the text data store and the related information store by, for example, using the same index number. The location identification may be employed, for example, as an index to the text data store and the related information store.
Alternatively, the text-data storage area may store only text data. Although the stored text data then carries no explicit correspondence to text-line positions, the separately stored related information, including the cutoff ratio, allows the text data of a text line to be read to be located quickly in the text-data storage area.
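One possible shape for the "associative storage" described above is two stores linked by the position identifier as a common index. The layout and names here (`text_store`, `info_store`, `store_line`) are illustrative assumptions, not the patent's data structures.

```python
# Hypothetical layout: text data and per-line related information kept in
# separate stores, linked by the position identifier used as a common index.

text_store = {}   # position identifier -> text data
info_store = {}   # position identifier -> related information

def store_line(pos_id, text, char_count, cutoff_ratio):
    text_store[pos_id] = text
    info_store[pos_id] = {"chars": char_count, "cutoff": cutoff_ratio}

store_line("001", "line 1 text", 20, 1.0)

# Locating a line for re-reading via its related information (here, the
# line whose cutoff ratio is 100%, i.e. the most recently stored line):
target = next(pid for pid, info in info_store.items() if info["cutoff"] >= 1.0)
print(text_store[target])  # → line 1 text
```

Because the two stores share the index, related information can be collected and managed apart from the text data while still allowing fast lookup in either direction.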
In addition, the number of characters of each recognized text line may also be stored as one kind of the related information of the text line, for example, together with the related information of the position identification, the cutoff ratio, and the like. Here, the operation of storing the number of characters of the text line may be performed in step S110, and of course, may also be performed in step S104 or step S106, which is not limited herein.
Fig. 6 illustrates a character recognition and storage process according to another exemplary embodiment of the present disclosure; compared with fig. 5, fig. 6 additionally stores the number of characters of the text line.
As shown in fig. 6, for example, in the case where the text line to be recognized is the line 1 text, the following are stored for it: the text data ("onset", i.e., "turn on vision."), the number of characters [20], the position identifier [001], and the cutoff ratio [100%]. Here, fig. 6 shows an example of storing the number of characters of the line 1 text together with the position identifier and the cutoff ratio.
As described above, in addition to the position identifier and the cutoff ratio of the text line, the related information may also include its number of characters.
Storing the number of characters of each recognized text line facilitates calculating and updating the cutoff ratios.
According to some embodiments, as shown in fig. 7, the image processing method of the present disclosure may further include: step S120, after the step S101 of obtaining the image, performing character recognition on the first text line to be recognized in the text region, and separately storing the obtained text data in the first storage area in the text to be read.
That is, the text data and/or related information of the first line of text is not stored together with the text data and/or related information of the subsequent line of text, but is stored separately in one storage area (e.g., the first storage area in the text to be read).
Here, the "first text line to be recognized" may be the line 1 text of the entire text region, or it may be the first line of some subset of the lines (a part of all lines) within the region rather than line 1 of the whole region.
The operation performed on the first text line to be recognized is described separately here to distinguish it from the text lines whose text data is spliced and stored together. For example, the first text line may be stored separately, in a storage region (e.g., the first storage area) different from that of the other text lines. One purpose of this separate storage is fast reading: the line can be read as soon as it is stored, without waiting for the recognition and storage of subsequent text lines. This greatly reduces the reading wait time and effectively increases reading speed; in particular, quickly reading the line 1 text of interest is very helpful to the perceived performance of the reading device and improves the user experience.
In addition, the relevant information of the first text line to be identified can be stored together with the text data thereof, so that the access, the collection and the management are convenient.
After the line 1 text of interest is recognized and stored separately, the remaining text lines of interest participate in the spliced storage of text data during subsequent recognition and storage, to facilitate multi-line reading. When recognition and storage are faster than reading, several lines may be recognized and stored while 1 line is being read, so the stored text data stays ahead of the reading device in this recognize-store-read mode. There is no need to wait for all text lines to be recognized and stored, as in the prior art, so the reading wait time can be greatly reduced and reading speed and efficiency improved.
As shown in fig. 8, the image processing method according to the present disclosure may further include: step S107, after calculating and storing the cutoff ratios of the stored text (the first text line is stored separately and is therefore not counted in the stored text), determining whether there is a next text line to be recognized in the text region; if so, going to step S102, performing the character recognition of step S102 on that next text line, and continuing with the operations of steps S103 to S107 in sequence. The operations of steps S102 to S107 are repeated in this way until all text lines to be recognized in the text region have been recognized and stored. That is, if there is no next text line to be recognized, the recognition and storage of the text region may end.
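The loop of steps S102 to S107 can be sketched in Python as follows. This is only an illustrative sketch under assumed names (process_text_region, update_cutoff_ratios are not from the disclosure); character recognition itself (e.g., OCR) is outside the sketch, so already-recognized line strings are taken as input, and the cutoff ratio is kept as a fraction:

```python
def update_cutoff_ratios(stored_text):
    """Steps S105-S106: recompute each line's cutoff ratio as
    (characters up to and including this line) / (total stored characters)."""
    total = sum(entry["chars"] for entry in stored_text)
    running = 0
    for entry in stored_text:
        running += entry["chars"]
        entry["cutoff"] = running / total   # last stored line is always 1.0 (100%)

def process_text_region(recognized_lines):
    """Store the 1st line separately for immediate reading (step S120);
    splice the remaining lines into the stored text (steps S102-S107)."""
    first_line_store = []   # separate storage area for the 1st text line
    stored_text = []        # spliced storage: text, char count, position id, cutoff ratio
    for position_id, text in enumerate(recognized_lines, start=1):
        if position_id == 1:
            first_line_store.append(text)   # readable at once, no waiting
            continue
        stored_text.append({"text": text, "chars": len(text),
                            "pos": position_id, "cutoff": 1.0})
        update_cutoff_ratios(stored_text)   # update after every newly stored line
    return first_line_store, stored_text
```

With the example character counts used below (20, 26, 30, 25, 41), the final cutoff ratios for lines 2 to 5 come out to roughly 21%, 46%, 66%, and 100%.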
The process of recognizing and storing the text lines in a text region will be described below with reference to a specific example, and in particular, how the cutoff ratio is updated will be described in detail.
In the example given here, in step S120, the 1st line of text in the text region shown in fig. 2 is recognized and stored separately. The text data of the line 1 text may be stored separately in, for example, a first storage area of the text to be read, and may also be stored in the stored text.
In addition, in step S120, related information of the line 1 text may be stored. As shown in fig. 6, the text data and related information (here including the number of characters, the position identifier, and the cutoff ratio) of the separately stored line 1 text are:
["Turn on vision."], [20], [001], [100%].
As described above, the text data and related information of the line 1 text are stored separately, i.e., not together with the subsequent text lines. In the case of separate storage, the cutoff ratio of the line 1 text may not be stored, and, as shown in fig. 5, the number of characters of the line 1 text may not be stored either.
As described above, the text data of the first text line is stored separately for quick reading.
Then, in steps S102 to S104, the 2nd line of text in the text region may be recognized and stored to the text to be read (S103) and to the stored text (S104), respectively.
As shown in fig. 9(a), the text data of the line 2 text in the text to be read and in the stored text at this time is:
[NextVPU Electronics is dedicated to computer vision].
Since the line 1 text can be stored separately (for example, in a first storage area of the text to be read and/or of the stored text), no other text line is currently stored together with the line 2 text, which therefore accounts for 100% of the total. Thus, for the line 2 text, steps S105 to S106 may be omitted, since its cutoff ratio can be determined directly at step S104.
Note that the cutoff ratio of the line 2 text, which is the first line of data in the stored text, is the same as that of the first text line of the text region, both being 100%, because, as described above, the line 2 text is stored in a different storage space from the line 1 text to allow quick reading. However, once the next text line to be recognized is stored, the cutoff ratio of the line 2 text will be updated to a different value.
Next, in step S107, it is determined whether there is a next text line to be recognized. When it is determined that there is one (for example, the 3rd text line), the process proceeds to step S102, the 3rd line of characters is recognized, and its text data is stored in steps S103 and S104, respectively. The position identifier of this line may then be stored in the subsequent step S110, and the number of characters may also be stored in that step.
Since the number of characters of this line is 30 and its position identifier is 003, the text data and related information of the 3rd line of characters stored at this time are:
[processor and artificial intelligence application product innovation and], [30], [003].
In step S105, the total number of stored characters is calculated, i.e., the number of characters in line 2 plus that in line 3: 26 + 30 = 56. The cutoff ratio of each stored text line preceding the line 3 text is then updated; in this case only the line 2 text precedes it, so only the cutoff ratio of the line 2 text needs updating.
In step S106, the cutoff ratios of the stored text (excluding the separately stored line 1 text) are calculated and stored. The cutoff ratio of the line 2 text is updated from the previous 100% to (number of characters of line 2)/(total number of stored characters), i.e., 26/56 ≈ 46%, and the cutoff ratio of the line 3 text is 100%.
At this time, as shown in fig. 9(b), with the line 1 text stored separately, the stored text data and related information of the subsequent text lines are as follows:
[NextVPU Electronics is dedicated to computer vision], [26], [002], [46%]
[processor and artificial intelligence application product innovation and], [30], [003], [100%].
Next, in step S107, it is determined whether there is a next text line to be recognized. When it is determined that the next text line to be recognized is, for example, the 4 th text line, the process goes to step S102, the 4 th line of characters is recognized, and the text data of the 4 th line of characters is stored in steps S103 and S104, respectively.
Since the number of characters of this line is 25 and its position identifier is 004, the text data and related information of the 4th line of characters after storage in step S110 are:
[development of robots, drones, unmanned], [25], [004].
In step S105, the total number of stored characters is calculated as the sum for lines 2, 3, and 4: 26 + 30 + 25 = 81. The cutoff ratios of the text lines preceding the line 4 text are then updated; in this case the line 4 text is preceded by the line 2 and line 3 text (line 1 is stored separately and is not counted), so the cutoff ratios of the line 2 and line 3 text need to be updated.
In step S106, the cutoff ratio of the line 2 text may be updated to (number of characters of line 2)/(total number of stored characters), i.e., 26/81 ≈ 32%; the cutoff ratio of the line 3 text may be updated to (characters of line 2 + characters of line 3)/(total stored characters), i.e., (26 + 30)/81 ≈ 69%; and the cutoff ratio of the line 4 text may be determined to be 100%.
At this time, as shown in fig. 9(c), the stored text data and related information of the text lines following the line 1 text (lines 2 to 4) are as follows:
[NextVPU Electronics is dedicated to computer vision], [26], [002], [32%]
[processor and artificial intelligence application product innovation and], [30], [003], [69%]
[development of robots, drones, unmanned], [25], [004], [100%].
Next, in step S107, it is determined whether there is a next text line to be recognized. If it is determined that the next line of text to be recognized is, for example, the 5 th line of text, the process proceeds to step S102, the 5 th line of text is recognized, and the text data of the 5 th line of text is stored in steps S103 and S104, respectively.
Since the number of characters of this line is 41 and its position identifier is 005, the text data and related information of the 5th line of characters after storage in step S110 are:
[vehicles, security monitoring and other professional fields with end-to-end solutions.], [41], [005].
In step S105, the total number of stored characters is calculated as the sum for lines 2, 3, 4, and 5: 26 + 30 + 25 + 41 = 122. The cutoff ratios of the text lines preceding the line 5 text are then updated; in this case the line 5 text is preceded by the line 2, 3, and 4 text, so the cutoff ratios of those three lines need to be updated.
In step S106, the cutoff ratio of the line 2 text may be updated to (number of characters of line 2)/(total number of stored characters), i.e., 26/122 ≈ 21%; the cutoff ratio of the line 3 text to (characters of line 2 + characters of line 3)/(total stored characters), i.e., (26 + 30)/122 ≈ 46%; the cutoff ratio of the line 4 text to (characters of lines 2 + 3 + 4)/(total stored characters), i.e., (26 + 30 + 25)/122 ≈ 66%; and the cutoff ratio of the line 5 text is determined to be 100% at this time.
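The running totals above can be verified with a few lines of Python (character counts 26, 30, 25, and 41 taken from the example; percentages rounded to the nearest integer as in the text):

```python
# Worked check of the cutoff ratios for lines 2-5 (line 1 is stored
# separately and therefore not counted in the stored text).
char_counts = [26, 30, 25, 41]   # lines 2, 3, 4, 5
total = sum(char_counts)         # 26 + 30 + 25 + 41 = 122

cutoffs = []
running = 0
for n in char_counts:
    running += n                          # cumulative characters up to this line
    cutoffs.append(round(100 * running / total))

print(total)    # 122
print(cutoffs)  # [21, 46, 66, 100]
```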
At this time, as shown in fig. 9(d), the stored text data and related information other than the line 1 text are as follows:
[NextVPU Electronics is dedicated to computer vision], [26], [002], [21%]
[processor and artificial intelligence application product innovation and], [30], [003], [46%]
[development of robots, drones, unmanned], [25], [004], [66%]
[vehicles, security monitoring and other professional fields with end-to-end solutions.], [41], [005], [100%].
According to some embodiments, as described above, the number of characters in the stored related information shown in figs. 9(a) to 9(d) is not necessary; that is, the number of characters need not be stored. According to some embodiments, the number of characters may be replaced by another parameter, for example the area of the text region occupied by the corresponding text line, and the cutoff ratio in the related information may likewise be derived from that area parameter.
Regarding the calculation and updating of the cutoff ratio, the cutoff ratio of each text line may instead be calculated once, after all text lines to be recognized in the entire text region have been recognized, so that the calculation and updating need not be performed every time a text line is recognized.
As described above, the storage method of the present disclosure is not particularly limited, and for example, the text data and the related information in one text region may be stored together, or the text data and the related information may be stored separately.
In the above description, for the case where one text region is included in one image, the identifying and storing operations described above may be performed for each text region, respectively, until all text lines in the text region, or those text lines of interest, have been identified and stored.
According to some embodiments, when a plurality of text regions are included in one image, text data of the plurality of text regions may be stored together or separately, which do not affect the essence of the present disclosure.
According to the present disclosure, the stored text data and related information are available for reading. The text data and related information of each text line obtained through the above steps are very advantageous for various reading operations. For example, they allow the recognition and storage of text data to proceed in parallel with its reading: there is no need, as in the prior art, to wait until all text lines have been recognized and stored before reading begins. Instead, reading can proceed while recognition and storage continue, and reading does not affect them, thereby greatly increasing the reading speed. Reading is described in detail below.
According to some embodiments, the text data in the currently stored text to be read is read while the identifying and the storing are sequentially performed for the text region.
For example, the text data in the currently stored text to be read may be read while recognizing and storing the text data in order. For example, if the characters in one text area are arranged in lines, the text data may be detected and recognized in lines, and each line of recognized text data may be stored in sequence, and each text data in the currently stored text to be read may be read.
According to some embodiments, each time the reading device acquires and reads a text to be read, that text becomes read text. Therefore, the currently stored text to be read consists of the newly recognized, unread text data stored after the reading device acquired the previous text to be read.
For example, assume that the currently stored text to be read is the text data of lines 2 to 3 of a text region. The reading device acquires and reads it, and it becomes read text. If the newly recognized text data of lines 4 to 6 is then stored as the current text to be read, the text acquired and read next by the reading device is the text data of lines 4 to 6.
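The hand-off between storage and reading described above can be sketched as follows. This is a minimal illustration with assumed names (ToReadBuffer, fetch_for_reading are not from the disclosure); a real device would feed each fetched chunk to speech synthesis:

```python
class ToReadBuffer:
    """Holds newly recognized lines until the reading device fetches them;
    fetched lines then become read text."""

    def __init__(self):
        self._pending = []   # newly identified, not yet read
        self.read_text = []  # everything already handed to the reader

    def store(self, line_text):
        self._pending.append(line_text)

    def fetch_for_reading(self):
        """Reader takes all currently pending lines as one coherent chunk."""
        chunk = " ".join(self._pending)   # spliced so the speech flows across lines
        self.read_text.extend(self._pending)
        self._pending.clear()
        return chunk

buf = ToReadBuffer()
buf.store("line 2 text")
buf.store("line 3 text")
print(buf.fetch_for_reading())   # "line 2 text line 3 text"
buf.store("line 4 text")
print(buf.fetch_for_reading())   # "line 4 text"
```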
By splicing and storing the text data of the text lines in the manner described above, coherent reading can be achieved, unnecessary pauses between lines in the prior art are reduced, and reading need not wait until all text lines are recognized and stored, thereby improving reading efficiency.
An example of sequential reading will be described in detail below with reference to fig. 10. For the text region shown in fig. 2, reading can start immediately after the first text line is recognized and stored (e.g., at step S120), i.e., the "first reading" shown in fig. 10, shortening the reading wait time and increasing the reading speed. After the first text line is stored, that is, while the first text line is being read, subsequent text lines continue to be recognized and stored, achieving the beneficial technical effect of reading while recognizing and storing.
After the reading of the first text line is completed, the currently stored text to be read is read (the "second reading" shown in fig. 10). Assuming that the line 2 and line 3 text has already been recognized and stored while the first text line was being read, the currently stored text to be read is the line 2 and line 3 text: [NextVPU Electronics is dedicated to computer vision] and [processor and artificial intelligence application product innovation and] (the stored line 1 text data may by then have become read text). The line 2 and line 3 text data in the text to be read are therefore read together. That is, the content of the second reading is [NextVPU Electronics is dedicated to computer vision processor and artificial intelligence application product innovation and], so that the speech spanning the line 2 and line 3 text is coherent, with semantic connection and context, overcoming the stiff, mechanical pauses or stuttering that occur when reading word by word or line by line in the prior art.
In addition, during the second reading, subsequent text lines continue to be recognized and stored, again achieving the beneficial technical effect of reading while recognizing and storing. After the second reading, reading may continue, for example with a third reading, to read the content not yet read among the currently stored text data (such as lines 4 and 5). This cycle repeats until all text lines in the entire text region have been read, completing the sequential reading of the text region.
According to the present disclosure, the spliced storage described above enables recognition and storage while reading, achieving more coherent and fluent reading. The methods and related devices of the present disclosure may help, for example, visually impaired users, elderly users, and reading-impaired users more easily comprehend, for example, information automatically read aloud from a text region by a vision-impairment assisting device.
According to some embodiments, the stored text data of a text line may be modified, with the number of characters and the cutoff ratio of the text line updated accordingly. For example, modification operations such as replacement (e.g., replacing an incorrectly recognized character with the correct one), deletion (e.g., deleting surplus characters), or addition (e.g., adding missed characters that were not recognized) may be performed on characters in the stored text data to make it more accurate.
After such a modification, the number of characters of the text data may change, so the character count of the modified text data and the corresponding cutoff ratios may be updated accordingly to keep the stored information accurate.
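A sketch of such a modification step, assuming the per-line record layout used in the earlier examples (text, character count, position identifier, cutoff ratio as a percentage); modify_line is an illustrative name, not from the disclosure:

```python
def modify_line(stored_text, pos, new_text):
    """Replace the text of the line at position `pos`, then re-derive its
    character count and refresh every line's cutoff ratio."""
    for entry in stored_text:
        if entry["pos"] == pos:
            entry["text"] = new_text
            entry["chars"] = len(new_text)   # count changes with the edit
            break
    total = sum(e["chars"] for e in stored_text)
    running = 0
    for e in stored_text:
        running += e["chars"]
        e["cutoff"] = round(100 * running / total)

stored = [{"text": "ab", "chars": 2, "pos": 2, "cutoff": 50},
          {"text": "cd", "chars": 2, "pos": 3, "cutoff": 100}]
modify_line(stored, 2, "abcdef")   # line 2 grows from 2 to 6 characters
print([e["cutoff"] for e in stored])   # [75, 100]  (6/8 and 8/8)
```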
According to some embodiments, for a specific type of text line, a specific type identifier representing the type of the text line is stored, and based on that identifier a prompt is issued to the user upon reading.
That is, for a text line of such a specific type, a specific type identifier indicating the type of the line may be stored. When reading, if a text line to be read is determined to correspond to a specific type identifier, a corresponding prompt can be issued to the user. For example, if a text line to be read is determined to be a title line, the user may be prompted with information such as "this is a title line". If a text line to be read is determined to be a blurred line, the user may be prompted with information such as "this line of text cannot be recognized clearly".
According to some embodiments, the prompt may include one of an audible prompt, a vibratory prompt, a text prompt, an image prompt, a video prompt, or a combination thereof.
According to some embodiments, the specific types of text line include: a first type of text line, determined by text size; and a second type of text line, determined by text line sharpness. For example, the first type may be a title line, a header, or a footer, whose text size often differs from that of other text lines. The second type refers to a text line that cannot be clearly recognized, i.e., one whose sharpness is low (e.g., below a preset sharpness threshold).
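As an illustration, the prompt selection could look like the following; the identifier values and prompt messages are assumptions for the sketch, not those of the disclosure:

```python
# Hypothetical type identifiers for the two special line types.
TITLE_LINE, BLURRED_LINE = "T", "B"

def prompt_for(type_id):
    """Map a stored specific-type identifier to a user prompt."""
    if type_id == TITLE_LINE:     # first type: determined by text size
        return "This is a title line."
    if type_id == BLURRED_LINE:   # second type: determined by line sharpness
        return "This line of text cannot be recognized clearly."
    return None                   # ordinary line: no prompt needed

print(prompt_for("T"))  # This is a title line.
```

The returned message could then be delivered as any of the prompt forms mentioned above (sound, vibration, text, image, or video).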
According to some embodiments, the text rows may be arranged in a transverse direction, a vertical direction, or an oblique direction.
According to another aspect of the present disclosure, there is also provided an electronic circuit, which may include: circuitry configured to perform the steps of the above-described method.
According to another aspect of the present disclosure, there is also provided a vision-impairment assisting apparatus including: a camera configured to acquire an image; the electronic circuit described above; circuitry configured to perform text detection and recognition on text contained in the image to obtain text data; circuitry configured to read the text data.
According to another aspect of the present disclosure, there is also provided an electronic device including: a processor; and a memory storing a program comprising instructions which, when executed by the processor, cause the processor to perform the method described above.
According to another aspect of the present disclosure, there is also provided a non-transitory computer readable storage medium storing a program, the program comprising instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the above-described method.
Fig. 11 is a block diagram illustrating an example of an electronic device according to an exemplary embodiment of the present disclosure. Note that the structure shown in fig. 11 is only one example; depending on the specific implementation, the electronic device of the present disclosure may include only one or more of the components shown in fig. 11.
The electronic device 2000 may be, for example, a general purpose computer (e.g., various computers such as a laptop computer, a tablet computer, etc.), a mobile phone, a personal digital assistant. According to some embodiments, the electronic device 2000 may be a vision-impaired aid or a reading aid.
The electronic device 2000 may be configured to capture an image, process the captured image, and provide a prompt in response to the processing. For example, the electronic device 2000 may be configured to capture an image, perform text detection and recognition on the image to obtain text data, convert the text data into sound data, and may output the sound data for listening by a user and/or output the text data for viewing by the user.
According to some embodiments, the electronic device 2000 may be configured to include a spectacle frame or to be detachably mounted to a spectacle frame (e.g., a rim, a bridge connecting two rims, a temple, or any other part) so as to be able to capture an image that approximately covers the user's field of view.
According to some embodiments, the electronic device 2000 may also be mounted to or integrated with other wearable devices. The wearable device may be, for example: a head-mounted device (e.g., a helmet or hat, etc.), an ear-wearable device, etc. According to some embodiments, the electronic device may be implemented as an accessory attachable to a wearable device, for example as an accessory attachable to a helmet or cap, or the like.
According to some embodiments, the electronic device 2000 may also have other forms. For example, the electronic device 2000 may be a mobile phone, a general purpose computing device (e.g., a laptop computer, a tablet computer, etc.), a personal digital assistant, and so forth. The electronic device 2000 may also have a base so as to be able to be placed on a table top.
According to some embodiments, the electronic device 2000 may be used as a vision-impairment assisting device to assist reading, in which case it is sometimes also referred to as an "electronic reader" or "reading aid". With the electronic device 2000, a user who cannot read autonomously (e.g., a visually impaired person or a reading-impaired person) can "read" conventional reading material (e.g., a book or a magazine) in a posture similar to a normal reading posture. During "reading", the electronic device 2000 may acquire an image, perform character recognition on the text lines in the image to obtain text data, store the obtained text data in the text to be read for reading while also storing it in the stored text, and further store related information such as the position identifier (e.g., line number), the number of characters, and the cutoff ratio of each line. This enables quick reading of the text data with semantic connection and context, avoiding the stiff stuttering caused by reading line by line or word by word.
The electronic device 2000 may include a camera 2004 for capturing and acquiring images. The camera 2004, which may include but is not limited to a still or video camera, may capture static or dynamic images and is configured to acquire an initial image including the object to be recognized. The electronic device 2000 may further comprise electronic circuitry 2100, the electronic circuitry 2100 comprising circuitry configured to perform the steps of the method described above. The electronic device 2000 may also include a text recognition circuit 2005 configured to perform text detection and recognition (e.g., OCR processing) on the text in the image to obtain text data; the text recognition circuit 2005 may be implemented, for example, by a dedicated chip. The electronic device 2000 may further include a sound conversion circuit 2006 configured to convert the text data into sound data; the sound conversion circuit 2006 may likewise be implemented by a dedicated chip. The electronic device 2000 may further include a sound output circuit 2007 configured to output the sound data; the sound output circuit 2007 may include, but is not limited to, an earphone, a loudspeaker, or a vibrator, together with its corresponding drive circuit. The aforementioned recognition device may comprise, for example, the text recognition circuit 2005, and the aforementioned reading device may comprise, for example, the sound conversion circuit 2006 and the sound output circuit 2007.
According to some embodiments, the electronic device 2000 may further include image processing circuitry 2008, and the image processing circuitry 2008 may include circuitry configured to perform various image processing on the image. The image processing circuitry 2008 may include, for example, but not limited to, one or more of the following: the image processing system comprises circuitry configured to reduce noise in the image, circuitry configured to deblur the image, circuitry configured to geometrically correct the image, circuitry configured to feature extract the image, circuitry configured to target detect and identify a target object in the image, circuitry configured to text detect text contained in the image, circuitry configured to extract lines of text from the image, circuitry configured to extract text coordinates from the image, and so forth.
According to some embodiments, the electronic circuitry 2100 may further include word processing circuitry 2009 that may be configured to perform various processing based on the extracted word-related information (e.g., word data, text box, paragraph coordinates, text line coordinates, word coordinates, etc.) to obtain processing results such as paragraph ordering, word semantic analysis, layout analysis, and so on.
For example, one or more of the various circuits described above may be implemented by programming hardware (e.g., programmable logic circuits including Field Programmable Gate Arrays (FPGAs) and/or Programmable Logic Arrays (PLAs)) in an assembly language or a hardware description language such as VERILOG, VHDL, or C++, using logic and algorithms in accordance with the present disclosure.
According to some embodiments, the electronic device 2000 may also include communication circuitry 2010, which may be any type of device or system that enables communication with an external device and/or a network, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication device, and/or a chipset, such as a Bluetooth device, an 802.11 device, a WiFi device, a WiMax device, a cellular communication device, and/or the like.
According to some embodiments, the electronic device 2000 may also include an input device 2011, which may be any type of device capable of inputting information to the electronic device 2000, and may include, but is not limited to, various sensors, a mouse, a keyboard, a touch screen, buttons, a joystick, a microphone, and/or a remote control.
According to some embodiments, the electronic device 2000 may also include an output device 2012, which may be any type of device capable of presenting information, and may include, but is not limited to, a display, a visual output terminal, a vibrator, and/or a printer. Although according to some embodiments the electronic device 2000 is used as a vision-impairment assisting device, a vision-based output device may facilitate a user's family members or service personnel in obtaining output information from the electronic device 2000.
According to some embodiments, the electronic device 2000 may further comprise a processor 2001. The processor 2001 may be any type of processor and may include, but is not limited to, one or more general-purpose processors and/or one or more special-purpose processors (e.g., special-purpose processing chips); it may be, for example, but not limited to, a central processing unit (CPU) or a microprocessor (MPU). The electronic device 2000 may also include a working memory 2002, which may store programs (including instructions) and/or data (e.g., images, text, sound, and other intermediate data) useful for the operation of the processor 2001, and may include, but is not limited to, random access memory and/or read-only memory. The electronic device 2000 may also include a storage device 2003, which may be any non-transitory storage device capable of data storage, and may include, but is not limited to, a disk drive, an optical storage device, solid-state memory, a floppy disk, a flexible disk, a hard disk, magnetic tape or any other magnetic medium, an optical disc or any other optical medium, a ROM (read-only memory), a RAM (random access memory), a cache memory, and/or any other memory chip or cartridge, and/or any other medium from which a computer can read data, instructions, and/or code. The working memory 2002 and the storage device 2003 may be collectively referred to as "memory" and may in some cases each serve the other's role.
According to some embodiments, the processor 2001 may control and schedule at least one of the camera 2004, the text recognition circuit 2005, the voice conversion circuit 2006, the voice output circuit 2007, the image processing circuit 2008, the text processing circuit 2009, the communication circuit 2010, the electronic circuit 2100, and other various devices and circuits included in the electronic device 2000. According to some embodiments, at least some of the various components described in fig. 11 may be interconnected and/or in communication by wires 2013.
Software elements (programs) may reside in the working memory 2002 including, but not limited to, an operating system 2002a, one or more application programs 2002b, drivers, and/or other data and code.
According to some embodiments, instructions for performing the aforementioned control and scheduling may be included in the operating system 2002a or one or more application programs 2002 b.
According to some embodiments, instructions to perform the method steps described in the present disclosure may be included in one or more application programs 2002b, and the various modules of the electronic device 2000 described above may be implemented by the processor 2001 reading and executing the instructions of the one or more application programs 2002 b. In other words, the electronic device 2000 may comprise a processor 2001 as well as a memory (e.g. working memory 2002 and/or storage device 2003) storing a program comprising instructions which, when executed by the processor 2001, cause the processor 2001 to perform a method according to various embodiments of the present disclosure.
According to some embodiments, some or all of the operations performed by at least one of the text recognition circuit 2005, the voice conversion circuit 2006, the image processing circuit 2008, the text processing circuit 2009, and the electronic circuit 2100 may be implemented by the processor 2001 reading and executing instructions of the one or more application programs 2002b.
Executable code or source code of instructions of the software elements (programs) may be stored in a non-transitory computer readable storage medium, such as the storage device 2003, and may be loaded into the working memory 2002 (and possibly compiled and/or installed) upon execution. Accordingly, the present disclosure provides a computer readable storage medium storing a program comprising instructions that, when executed by a processor of an electronic device (e.g., a vision-impaired auxiliary device), cause the electronic device to perform a method as described in various embodiments of the present disclosure. According to another embodiment, the executable code or source code of the instructions of the software elements (programs) may also be downloaded from a remote location.
It will also be appreciated that various modifications may be made according to specific requirements. For example, customized hardware might be used, and/or individual circuits, units, modules, or elements might be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. For example, some or all of the circuits, units, modules, or elements encompassed by the disclosed methods and apparatus may be implemented by programming hardware (e.g., programmable logic circuitry including Field Programmable Gate Arrays (FPGAs) and/or Programmable Logic Arrays (PLAs)) in an assembly language or a hardware programming language such as VERILOG, VHDL, or C++, using logic and algorithms in accordance with the present disclosure.
According to some embodiments, the processor 2001 in the electronic device 2000 may be distributed over a network. For example, some processing may be performed using one processor while other processing is performed by another processor remote from the first. Other modules of the electronic device 2000 may be similarly distributed. As such, the electronic device 2000 may be interpreted as a distributed computing system that performs processing at multiple locations.
Some exemplary aspects of the disclosure will be described below.
Aspect 1. An image processing method, comprising:
acquiring an image, wherein the image comprises a text area;
in the text area, performing character recognition on a text line to be recognized to obtain text data of the text line; and
storing the text data of the text line into a text to be read.
Aspect 2. The image processing method according to aspect 1, further comprising:
storing the text data of the text line into a stored text as a line of data in the stored text;
calculating the total number of characters in the stored text as a stored total character count; and
calculating and storing a cutoff ratio for each line of data in the stored text, wherein the cutoff ratio is the ratio of the cumulative number of characters up to and including that line of data to the stored total character count.
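The cutoff-ratio bookkeeping of aspect 2 can be sketched in a few lines (an illustrative sketch only; function and variable names are hypothetical and not taken from the disclosure):

```python
# Sketch of the cutoff-ratio bookkeeping of aspect 2 (hypothetical names).
# Each recognized line is a line of data in the stored text; a line's cutoff
# ratio is the cumulative character count up to and including that line,
# divided by the total character count of the stored text.

def compute_cutoff_ratios(stored_lines):
    """Return the cutoff ratio for each line of the stored text."""
    total_chars = sum(len(line) for line in stored_lines)  # stored total character count
    ratios = []
    cumulative = 0
    for line in stored_lines:
        cumulative += len(line)
        ratios.append(cumulative / total_chars if total_chars else 0.0)
    return ratios

lines = ["Hello world", "Second line", "End"]
print(compute_cutoff_ratios(lines))  # the last ratio is always 1.0 for non-empty text
```

Such ratios give, for every stored line, the fraction of the full text that has been covered once that line is read, which is what makes progress reporting possible without re-scanning the text.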
Aspect 3. The image processing method according to aspect 2, further comprising:
storing a position identifier of the text line representing the position of the text line in the text area, wherein the position identifier comprises a line number.
Aspect 4. The image processing method of aspect 3, wherein each line of data in the stored text is associated with its stored position identifier and its cutoff ratio.
Aspect 5. The image processing method according to aspect 1, further comprising:
after the image is acquired, performing character recognition on the first text line to be recognized in the text area, and separately storing the obtained text data in a first storage area of the text to be read.
Aspect 6. The image processing method according to aspect 2, further comprising:
after calculating and storing the cutoff ratio of each line of data in the stored text, determining whether the text area has a next text line to be recognized; and
if there is a next text line to be recognized, performing character recognition on the next text line to be recognized.
Aspect 7. The image processing method according to aspect 1, further comprising:
reading aloud the currently stored text data in the text to be read while sequentially performing the recognition and the storing for the text area.
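Aspect 7's overlap of recognition and reading is a classic producer-consumer arrangement; a minimal sketch (hypothetical names, with recognition and text-to-speech stubbed out) might look like:

```python
# Producer-consumer sketch of aspect 7 (illustrative only; recognition and
# text-to-speech are stubbed). The recognizer thread stores each line of
# text data as it is produced, while the reader consumes and "reads"
# whatever has been stored so far, without waiting for the whole text area.
import queue
import threading

SENTINEL = None  # marks the end of the text area

def recognize_lines(text_lines, out_q):
    """Producer: stub for line-by-line character recognition."""
    for line in text_lines:
        out_q.put(line)          # store the line's text data
    out_q.put(SENTINEL)

def read_aloud(out_q, spoken):
    """Consumer: stub for reading the currently stored text data."""
    while True:
        line = out_q.get()
        if line is SENTINEL:
            break
        spoken.append(line)      # a real device would synthesize speech here

q = queue.Queue()
spoken = []
producer = threading.Thread(target=recognize_lines, args=(["line 1", "line 2"], q))
consumer = threading.Thread(target=read_aloud, args=(q, spoken))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(spoken)  # ['line 1', 'line 2']
```

The FIFO queue preserves line order, so the reader speaks lines in the order they appear in the text area even though the two activities run concurrently.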
Aspect 8. The image processing method of aspect 2, wherein the stored text is modified, and the corresponding cutoff ratio is updated based on the modification.
Aspect 9. The image processing method of aspect 8, wherein the modification comprises addition, deletion, or replacement.
Aspect 10. The image processing method according to aspect 1, wherein, for a text line of a specific type, a specific type position identifier indicating the type of the text line is stored, and a prompt is issued to the user during reading based on the specific type position identifier.
Aspect 11. The image processing method of aspect 10, wherein the specific type of text line comprises:
a first type of text line, wherein the first type of text line is determined by text size; and
a second type of text line, wherein the second type of text line is determined by text line sharpness.
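As one illustration of how the two line types of aspect 11 might be detected (the height threshold and the gradient-based sharpness metric below are assumptions made for this sketch, not taken from the disclosure):

```python
# Illustrative detectors for the two special line types of aspect 11.
# Assumptions (not from the disclosure): text size is approximated by the
# pixel height of the line's bounding box, and sharpness by the mean
# absolute difference between adjacent grayscale pixels of the line image.

def is_small_text(box_height_px, min_height=8):
    """First type: the line's text is too small to recognize reliably."""
    return box_height_px < min_height

def is_blurry(gray_pixels, min_mean_gradient=20.0):
    """Second type: the line image is too blurry (edges are weak)."""
    diffs = [abs(a - b) for a, b in zip(gray_pixels, gray_pixels[1:])]
    return sum(diffs) / len(diffs) < min_mean_gradient
```

A vision-impairment assisting device could store a specific type position identifier for lines flagged by either check and announce, for example, "text too small" or "text unclear" when reading reaches them.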
Aspect 12. The image processing method according to any one of aspects 1 to 11, wherein the text lines are arranged in a horizontal direction, a vertical direction, or an oblique direction.
Aspect 13. An electronic circuit, comprising:
circuitry configured to perform the steps of the method of any of aspects 1-12.
Aspect 14. A visual impairment assisting apparatus, comprising:
a camera configured to acquire an image;
the electronic circuit of aspect 13;
circuitry configured to perform text detection and recognition on text contained in the image to obtain text data; and
circuitry configured to read the text data.
Aspect 15. An electronic device, comprising:
a processor; and
a memory storing a program comprising instructions that, when executed by the processor, cause the processor to perform the method of any of aspects 1-12.
Aspect 16. A non-transitory computer readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the method of any of aspects 1-12.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the methods, systems, and apparatus described above are merely exemplary embodiments or examples, and that the scope of the present disclosure is not limited by these embodiments or examples but only by the claims as granted and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalents, and the steps may be performed in an order different from that described in the present disclosure. Further, various elements of the embodiments or examples may be combined in various ways. It is important to note that, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (10)

1. An image processing method, comprising:
acquiring an image, wherein the image comprises a text area;
in the text area, performing character recognition on a text line to be recognized to obtain text data of the text line; and
storing the text data of the text line into a text to be read.
2. The image processing method according to claim 1, further comprising:
storing the text data of the text line into a stored text as a line of data in the stored text;
calculating the total number of characters in the stored text as a stored total character count; and
calculating and storing a cutoff ratio for each line of data in the stored text, wherein the cutoff ratio is the ratio of the cumulative number of characters up to and including that line of data to the stored total character count.
3. The image processing method according to claim 2, further comprising:
storing a position identifier of the text line representing the position of the text line in the text area, wherein the position identifier comprises a line number.
4. The image processing method according to claim 1, further comprising:
after the image is acquired, performing character recognition on the first text line to be recognized in the text area, and separately storing the obtained text data in a first storage area of the text to be read.
5. The image processing method according to claim 2, further comprising:
after calculating and storing the cutoff ratio of each line of data in the stored text, determining whether the text area has a next text line to be recognized; and
if there is a next text line to be recognized, performing character recognition on the next text line to be recognized.
6. The image processing method according to any one of claims 1 to 5, further comprising:
reading aloud the currently stored text data in the text to be read while sequentially performing the recognition and the storing for the text area.
7. An electronic circuit, comprising:
circuitry configured to perform the steps of the method of any of claims 1-6.
8. A visual impairment assistance device, comprising:
a camera configured to acquire an image;
the electronic circuit of claim 7;
circuitry configured to perform text detection and recognition on text contained in the image to obtain text data; and
circuitry configured to read the text data.
9. An electronic device, comprising:
a processor; and
a memory storing a program comprising instructions that, when executed by the processor, cause the processor to perform the method of any of claims 1-6.
10. A non-transitory computer readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the method of any of claims 1-6.
CN201911214755.0A 2019-12-02 2019-12-02 Image processing method, circuit, vision-impaired assisting device, electronic device, and medium Active CN110969161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911214755.0A CN110969161B (en) 2019-12-02 2019-12-02 Image processing method, circuit, vision-impaired assisting device, electronic device, and medium


Publications (2)

Publication Number Publication Date
CN110969161A true CN110969161A (en) 2020-04-07
CN110969161B CN110969161B (en) 2023-11-07

Family

ID=70032598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911214755.0A Active CN110969161B (en) 2019-12-02 2019-12-02 Image processing method, circuit, vision-impaired assisting device, electronic device, and medium

Country Status (1)

Country Link
CN (1) CN110969161B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180225306A1 (en) * 2017-02-08 2018-08-09 International Business Machines Corporation Method and system to recommend images in a social application
CN108596168A (en) * 2018-04-20 2018-09-28 北京京东金融科技控股有限公司 For identification in image character method, apparatus and medium
CN109389115A (en) * 2017-08-11 2019-02-26 腾讯科技(上海)有限公司 Text recognition method, device, storage medium and computer equipment


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG ZAIYIN; TONG LIJING; ZHAN JIAN; SHEN CHONG: "Correction of distorted document images based on text region segmentation and text line detection" *

Also Published As

Publication number Publication date
CN110969161B (en) 2023-11-07

Similar Documents

Publication Publication Date Title
US9274646B2 (en) Method and apparatus for selecting text information
US9589198B2 (en) Camera based method for text input and keyword detection
US7949157B2 (en) Interpreting sign language gestures
CN110991455B (en) Image text broadcasting method and equipment, electronic circuit and storage medium thereof
EP3940589A1 (en) Layout analysis method, electronic device and computer program product
US10592726B2 (en) Manufacturing part identification using computer vision and machine learning
EP2797032A2 (en) Method and system using two parallel optical character recognition processes
CN108256523B (en) Identification method and device based on mobile terminal and computer readable storage medium
CN111160333A (en) AR glasses, text translation method and device thereof, and computer-readable storage medium
JP2006107048A (en) Controller and control method associated with line-of-sight
KR101429882B1 (en) Image Processor, Image Processing Method, Control Program, and Recording Medium
US10965801B2 (en) Method for inputting and processing phone number, mobile terminal and storage medium
JP2009123020A (en) Information processor, information processing method, program and recording medium
EP3467820A1 (en) Information processing device and information processing method
US11776286B2 (en) Image text broadcasting
US10915778B2 (en) User interface framework for multi-selection and operation of non-consecutive segmented information
CN110969161B (en) Image processing method, circuit, vision-impaired assisting device, electronic device, and medium
US11367296B2 (en) Layout analysis
KR20140134844A (en) Method and device for photographing based on objects
CN112613510A (en) Picture preprocessing method, character recognition model training method and character recognition method
JP4371306B2 (en) Color image processing apparatus and color image processing program
CN115346205A (en) Page information identification method and device and electronic equipment
CN117876540A (en) Animation display method, device, equipment and computer readable storage medium
CN115115958A (en) Augmented reality display method and device and electronic equipment
JP2006331216A (en) Image processor, processing object range designation method in image processor, image processing range designation program and recording medium for recording image processing range designation program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant