CN111429767A

CN111429767A - Image-text photographing and identifying device

Info

Publication number: CN111429767A
Application number: CN201910811258.2A
Authority: CN
Inventors: 陈旭
Original assignee: Individual
Current assignee: Individual
Priority date: 2013-01-25
Filing date: 2014-01-21
Publication date: 2020-07-17
Also published as: CN107967824A; CN111050017A; CN107742446A; CN109300343A

Abstract

The invention discloses an image-text photographing and recognition device. The device includes an image information recognition unit that determines the name of a book by recognizing image information of the book's cover, and/or determines the page currently being read by recognizing image information of the book's inner pages. Embodiments of the invention can provide a reading aid for people who find it inconvenient to read ordinary books directly.

Description

Image-text photographing and identifying device

Technical Field

The invention relates to the technical field of electronics, in particular to image-text recognition.

Background

At present, some people, such as children, blind people, and old people, find it inconvenient to read books directly.

Disclosure of Invention

The invention aims to provide a book reading device, so that ordinary books can be read to people who find it inconvenient to read books directly.

The purpose of the invention is realized by the following technical scheme:

an image-text photographing recognition apparatus includes:

an image information recognition unit including:

the book name is determined by recognizing characters in the image information of the book cover, or by recognizing a label in the image information of the book cover; and/or the page currently being read is determined by recognizing the image information of an inner page of the book, or by recognizing the page number, printed in characters or digits, in that image information.

An image-text photographing recognition apparatus includes:

a camera module that photographs all or part of the right-hand side of the two sides of the open book.

An image-text photographing recognition apparatus includes:

an image information recognition unit that, from image information of the current reading page, of the current reading position on the printed matter, or of the reading operation indication position, or from image information containing book information or page number information, identifies the corresponding text content information and the audio information corresponding to it, and notifies an audio unit; the audio unit outputs the corresponding audio information as sound.

An image-text photographing recognition apparatus includes:

a flat plate-shaped or flat cuboid housing, and a camera module that is located on the upper part of a large face of the upright housing and photographs downward.

An image-text photographing recognition apparatus includes:

in use, the image-text photographing device stands upright in front of a book, and the camera module is located on the upper part of the large face of the upright housing that faces the book and photographs downward toward the book,

and/or,

the downward photographing direction of the camera module is inclined outward relative to the upright large face of the flat plate-shaped or flat cuboid housing,

and/or,

the positions at which the camera module may be placed relative to the area whose image information is to be collected include: above the edge of the area and/or obliquely above it,

and/or,

the camera module comprises a camera located, left-right symmetrically, at the middle of the upper part of the large face of the upright housing or on the vertical center line of that upper part.

An image-text photographing recognition apparatus includes:

the camera module includes at least one movable camera whose shooting angle and/or position is adjustable, and/or a fixed camera that includes one or more lenses.

An image-text photographing recognition apparatus includes:

the shooting angle and/or position of the movable camera is adjustable, and the movement of the movable camera is controlled according to a predetermined control mode or received control information so as to photograph the object from multiple angles and/or multiple positions,

and/or,

if the fixed camera comprises a plurality of lenses, each lens is controlled to acquire the image-text content information according to a predetermined control mode or received control information.

An image-text photographing recognition apparatus includes:

a display unit for displaying set content information and/or image and text information obtained during photographing and recognition and/or content information obtained externally,

and/or,

a storage unit for storing image and/or text information obtained during photographing and recognition and/or content information obtained externally,

and/or,

a communication unit for communicating with a computer,

and/or,

an audio input unit for acquiring audio information,

and/or,

an interactive processing module for acquiring interactive operation control information of a user and executing a preset interactive operation according to that information, wherein the interactive operation control information comprises at least one of limb actions, actions of operating an object, voice information, screen input, or operation keys.

The flat plate-shaped or flat cuboid housing has two large faces, which are its main faces; the other four surrounding faces count only as side faces or side edges. The camera module is mounted on a large face; in operation the housing stands upright, and the camera module photographs downward from the upper part of that large face.

With the technical scheme provided by the invention, an embodiment can identify the page of an ordinary book while the book is being browsed naturally, and can then perform further operations, such as assisting by producing sound. Photographing only one side of the open book lowers the required camera height and reduces the size of the device. The scheme also takes advantage of the fact that books open toward the left: because the right side of the open book is photographed, the book does not need to be moved between recognizing the cover and recognizing the inner pages.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a first schematic structural diagram of a specific implementation of the book reading device according to an embodiment of the present invention;

Fig. 2 is a second schematic structural diagram of a specific implementation of the book reading device according to an embodiment of the present invention;

Fig. 3 is a third schematic structural diagram of a specific implementation of the book reading device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.

In an embodiment of the invention, the multipoint image information acquisition unit transmits the image information it acquires to the image information recognition unit, which recognizes image-text content information from that image information. The multipoint image information acquisition unit acquires the image information by photographing, that is, through the camera or cameras it contains.

Further, in order to realize multi-angle and/or multi-position multi-point image shooting for the shot object, the corresponding multi-point image information acquisition unit can be realized by adopting any one of the following structures:

(1) The multipoint image information acquisition unit may include at least one movable camera, whose movement is controlled according to a predetermined control mode or received control information so as to photograph the object from multiple angles and/or multiple positions. For example, the angle and/or position of the movable camera may be adjusted automatically under feedback control, that is, according to feedback obtained after recognizing the captured image, for instance when the finger position, part of the text, or the page number falls out of range, so that the product needs little or no manual intervention while working. Alternatively, the camera may be rotated or moved according to control information input by the user (for example, specific limb actions performed by the user, or predetermined control information entered through operation keys), or rotated or moved automatically at a preset time interval.

Specifically, the movable camera may be a rotatable camera and/or a translatable camera, that is, it can rotate, or both move and rotate; alternatively, the movable camera may comprise one or more movable lenses. If there are several movable cameras, each collects image-text content information for all or part of the scene. The positions at which the movable camera may be placed relative to the area whose image information is to be collected include: above the edge of the area and/or obliquely above it and/or directly above it.
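The feedback adjustment described in scheme (1) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented control method: the field of view is modeled as a rectangle on the desk plane, the recognized target (for example a fingertip or a page number) as a point, and the names `View` and `feedback_shift` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class View:
    """Assumed model: camera field of view as a rectangle (arbitrary units)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def shifted(self, dx: float, dy: float) -> "View":
        return View(self.left + dx, self.top + dy,
                    self.right + dx, self.bottom + dy)


def _axis_shift(value: float, low: float, high: float) -> float:
    """Smallest shift of the interval [low, high] that makes it contain value."""
    if value < low:
        return value - low
    if value > high:
        return value - high
    return 0.0


def feedback_shift(view: View, x: float, y: float) -> Tuple[float, float]:
    """Return the (dx, dy) view shift that brings the point (x, y) into range.

    (0.0, 0.0) means the recognized target is already in range, so the
    movable camera stays where it is.
    """
    return (_axis_shift(x, view.left, view.right),
            _axis_shift(y, view.top, view.bottom))
```

In such a sketch, the recognition result of each captured image would supply the target point, and a non-zero shift would be translated into a rotation or movement command for the movable camera.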

(2) The multipoint image information acquisition unit comprises a plurality of cameras, that is, two, three, four, or more cameras, which together complete the multi-angle and/or multi-position multi-point image shooting. Each camera is a fixed camera or a movable camera. A movable camera is moved according to a predetermined control mode, according to received control information, or manually, so as to photograph the object from multiple angles and/or multiple positions. For example, the angle and/or position of the movable camera may be adjusted automatically under feedback control, that is, according to feedback obtained after recognizing the captured image, for instance when the finger position, part of the text, or the page number falls out of range, so that the product needs little or no manual intervention while working. Alternatively, the camera may be rotated or moved according to control information input by the user (for example, specific actions performed by the user, or predetermined control information entered through operation keys), or rotated or moved automatically at a preset time interval.

Specifically, a fixed camera includes one or more lenses. If it includes several lenses, each lens is controlled to acquire the image-text content information according to a predetermined control mode or received control information; the predetermined control mode includes controlling all or some of the plurality of cameras to perform multi-angle and/or multi-position multi-point image shooting.

Each of the plurality of cameras collects image-text content information for all or part of the scene.

In scheme (2), the positions at which a camera may be placed relative to the area whose image information is to be collected include: above the edge of the area and/or obliquely above it and/or directly above it.

That is to say, in the image-text acquisition and recognition device, the multipoint image information acquisition unit may include a plurality of cameras, fixedly or movably arranged above the edge of the area where the object is located and/or obliquely above it and/or directly above it, for example above the edge of a book and/or obliquely or directly above the book. The positions must be chosen so that the cameras do not get in the reader's way while reading. Because several cameras are used, the coverage each camera must provide is greatly reduced while the overall coverage increases, which guarantees the coverage needed for recognition. The cameras can each shoot and recognize separately, or their shooting results can be combined for recognition.
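The coverage argument for several cameras can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that the page is a grid of unit cells and that each camera view is an axis-aligned rectangle; `covered_fraction` is a hypothetical helper, not part of the described device.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def covered_fraction(page_w: int, page_h: int, views: List[Rect]) -> float:
    """Fraction of unit cells of a page_w x page_h page whose centre is seen
    by at least one camera view."""
    if page_w <= 0 or page_h <= 0:
        raise ValueError("page dimensions must be positive")
    seen = 0
    for row in range(page_h):
        for col in range(page_w):
            cx, cy = col + 0.5, row + 0.5  # centre of the current cell
            if any(l <= cx <= r and t <= cy <= b for (l, t, r, b) in views):
                seen += 1
    return seen / (page_w * page_h)
```

For example, on a 20 x 10 page a single view `(0, 0, 10, 10)` covers half of the cells, while two overlapping views `(0, 0, 12, 10)` and `(8, 0, 20, 10)` cover all of them, even though neither view covers the page on its own.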

(3) The multipoint image information acquisition unit comprises a fixed camera with a plurality of lenses. Each lens is controlled to acquire the image-text content information according to a predetermined control mode or received control information, so that multi-angle and/or multi-position multi-point image shooting of the object is achieved through the lenses; the predetermined control mode includes controlling all or some of the lenses to perform that shooting. For example, the shooting angle and/or position may be adjusted automatically under feedback control, that is, according to feedback obtained after recognizing the captured image, for instance when the finger position, part of the text, or the page number falls out of range, so that the product needs little or no manual intervention while working. Alternatively, the lenses may be controlled to shoot the object from multiple angles and/or positions according to control information input by the user (for example, specific limb actions performed by the user, or predetermined control information entered through operation keys). The lenses may also be controlled automatically in a preset manner: for example, the lenses may shoot the object in sequence, all lenses may shoot the object at the same time, or only some of the lenses may shoot the object, in each case obtaining the corresponding image information.

Specifically, in scheme (3), the positions at which the fixed camera may be placed relative to the area whose image information is to be collected include: above the edge of the area and/or obliquely above it and/or directly above it.
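The three preset shooting manners mentioned for scheme (3), namely in sequence, at the same time, or with only some of the lenses, can be sketched as a capture plan. The names `CaptureMode` and `plan_capture` and the lens identifiers are hypothetical and serve only as an illustration.

```python
from enum import Enum
from typing import List, Optional, Sequence


class CaptureMode(Enum):
    SEQUENTIAL = "sequential"      # lenses shoot one after another
    SIMULTANEOUS = "simultaneous"  # all lenses shoot at the same time
    SUBSET = "subset"              # only some of the lenses shoot


def plan_capture(lenses: Sequence[str], mode: CaptureMode,
                 subset: Optional[Sequence[str]] = None) -> List[List[str]]:
    """Return the capture plan as a list of batches.

    Lenses inside one batch are triggered together; batches run in order.
    """
    if mode is CaptureMode.SEQUENTIAL:
        return [[lens] for lens in lenses]
    if mode is CaptureMode.SIMULTANEOUS:
        return [list(lenses)]
    if mode is CaptureMode.SUBSET:
        wanted = set(subset or [])
        chosen = [lens for lens in lenses if lens in wanted]
        return [chosen] if chosen else []
    raise ValueError(f"unsupported capture mode: {mode!r}")
```

With lenses `["left", "centre", "right"]`, the sequential mode yields three single-lens batches, the simultaneous mode yields one batch with all three lenses, and the subset mode yields one batch containing only the requested lenses.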

A fixed camera usually has a fixed field of view, but several cameras together can provide full coverage. A movable camera has a fixed field of view at any given angular position, but changes its field of view by moving, and can therefore also provide full coverage. In a concrete implementation, a camera that cannot move cannot be regarded as a movable camera; likewise, even if a camera is movable, if the working process does not rely on its movement to obtain the required effect, such as full coverage, the arrangement is in fact still a fixed-camera scheme. For example, if the movable camera is adjusted to a suitable angle and position in advance and does not need to move during actual use, or its movement contributes little to effects such as full coverage, the arrangement is in practice still a fixed-camera scheme.

In the image-text acquisition and recognition device provided by the embodiment of the present invention, the image-text content information may include, but is not limited to, at least one of: picture or text content information of printed matter, picture information of a stationary object in space, limb action information, indication information of a reading operation performed on printed matter, and action information of operating an object.

That is, from the acquired image information the image information recognition unit may recognize pictures or text in printed matter; or recognize a stationary object in space (for example, determine content information such as a corresponding picture or text description of the object from its image information); or recognize limb action information such as a gesture made by the user (for example, recognize the predetermined instruction that the limb action stands for); or recognize the action of the user operating an object; or recognize a reading operation indication given while the user reads printed matter; and so on. Furthermore, the indication information for a reading operation on printed matter can be conveyed by limb action information or by the action of operating an object, that is, a specific limb action or object-operating action can serve as the indication for a particular reading operation. Such indication information may include reading operations indicated on the printed matter by a hand or a hand-held object, such as an indication of the place to be read, of the content to be read, or of whether to read at all, given for example by pointing, clicking, double-clicking, sliding, or turning pages on the printed matter by hand.

The image-text acquisition and recognition device provided by the embodiment of the invention adopts a distinctive camera arrangement, so that multi-point image information of the object can be acquired flexibly, that is, image information of the object from different angles and/or different positions. The acquired image information therefore reflects the actual condition of the object truly and accurately, and the corresponding image-text content information can be recognized accurately in subsequent processing: for example, recognizing text or pictures in printed matter, the meaning of a user's limb action, the meaning of an action the user performs by operating an object, the meaning of a reading operation the user performs on printed matter such as a book through a limb action or an operated object, or the text or picture the user points to.

To let a user have printed matter read aloud, the image-text acquisition and recognition device provided by the embodiment of the present invention may further include an audio unit. The multipoint image information acquisition unit transmits to the image information recognition unit the collected image information of the current reading page of the book, of the current reading position on the printed matter, or of the reading operation indication position, or image information containing book information or page number information. From that image information the image information recognition unit identifies the corresponding text content information and the audio information corresponding to it, and notifies the audio unit, which outputs the audio information as sound. The text content of printed matter can thus be read aloud, making it easier for people who find it inconvenient to read books directly to obtain the content of ordinary books.

When printed matter is read aloud through the audio unit, the multipoint image information acquisition unit further comprises a reading position information acquisition module, which uses a camera to acquire text image information at the user's reading operation position (that is, the current reading position on the printed matter designated by the user). The image information recognition unit recognizes the text contained in that text image information and notifies the audio unit of the audio information determined to correspond to the text, or of audio information obtained by converting the text. The audio information corresponding to the text may be a reading of that text, or other audio information corresponding to it, such as an explanation of the text.
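The path from a designated reading position to the audio unit can be sketched as below. The sketch assumes that text recognition has already produced words with bounding boxes and that the reading position is a single point; `Word`, `word_at`, `read_aloud`, and the `speak` callback standing in for the audio unit are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Word:
    """A recognized word and its bounding box in image coordinates."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def word_at(words: List[Word], x: float, y: float) -> Optional[Word]:
    """Return the word whose box contains (x, y), else the word whose
    centre is nearest to that point; None if there are no words."""
    if not words:
        return None
    for w in words:
        if w.left <= x <= w.right and w.top <= y <= w.bottom:
            return w

    def squared_distance(w: Word) -> float:
        cx, cy = (w.left + w.right) / 2, (w.top + w.bottom) / 2
        return (cx - x) ** 2 + (cy - y) ** 2

    return min(words, key=squared_distance)


def read_aloud(words: List[Word], x: float, y: float,
               speak: Callable[[str], None]) -> Optional[str]:
    """Pass the text at the reading position to the audio unit stand-in."""
    word = word_at(words, x, y)
    if word is None:
        return None
    speak(word.text)
    return word.text
```

In a real device `speak` would hand the text to speech synthesis or look up prerecorded audio; here any callable that accepts a string, such as `print`, is enough to exercise the flow.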

Character recognition has gradually entered practical use. The recognition process may include: first, preprocessing the photographed image, mainly binarization, noise removal, skew correction, and the like; then extracting character features, that is, thinning the character image and taking the number and positions of stroke end points and crossing points, or the stroke segments, as features; and finally comparing these features against reference features with a matching method to recognize the characters. Since character recognition is prior art, it is not described in detail here.
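As an illustration of the binarization step only, the sketch below uses Otsu's method, a widely used thresholding technique. The description above does not say which binarization method is used, so this choice, the 8-bit grey-level input, and the function names are assumptions.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the grey level t (0..255) that best separates pixels <= t from
    pixels > t, by maximising the between-class variance (Otsu's method)."""
    if not pixels:
        raise ValueError("no pixels")
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    weight_low = 0   # number of pixels <= t
    sum_low = 0      # sum of the grey levels of those pixels
    best_t, best_score = 0, -1.0
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue
        if weight_high == 0:
            break
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        score = weight_low * weight_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(gray: List[List[int]]) -> List[List[int]]:
    """Map a grey image to 1 (dark, assumed ink) and 0 (light, assumed paper)."""
    t = otsu_threshold([p for row in gray for p in row])
    return [[1 if p <= t else 0 for p in row] for row in gray]
```

Noise removal, skew correction, thinning, and feature matching would follow this step; they are omitted here to keep the sketch short.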

Because this image-text acquisition and recognition device can recognize image-text content information accurately, it can perform image-text recognition on ordinary books and, combined with the audio function, read ordinary books aloud. It thus provides a device that assists the reading of ordinary books by voice, so that people who find it inconvenient to read books directly, such as children, blind people, and old people, can read with its help, which greatly eases their reading of ordinary books. Moreover, the accuracy of recognition helps the reading process run smoothly and gives the reading user a better reading experience.

To store the recognized image-text content information, the image-text acquisition and recognition device provided by the embodiment of the invention can further comprise a storage unit, which stores the image-text content information recognized by the image information recognition unit so that it can be retrieved later.

In the image-text acquisition and recognition device provided by the embodiment of the invention, the multipoint image information acquisition unit can also acquire image information containing the book information of a book and transmit it to the image information recognition unit, which recognizes the name of the book from that image information. Further, the book name may be output as audio or on a display, for example read out by the audio unit or shown on the display screen.

Further, the multipoint image information acquisition unit may use the camera to collect image information of a book cover (including the front cover, the back cover, and the like) as the image information containing the book information. The image information recognition unit may determine the book name by recognizing characters in the image information of the cover, by recognizing the cover image itself, or by recognizing a label in the image information of the cover. The label may be a specially made label or code, or an existing label or code such as an ISBN (International Standard Book Number) barcode.
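Where the label is an ISBN barcode, the decoded number can be checked before it is used to look up the book name. The sketch below applies the standard ISBN-13 check-digit rule (alternating weights 1 and 3, weighted sum divisible by 10); the `catalogue` mapping and the function names are hypothetical, and barcode decoding itself is assumed to have happened already.

```python
from typing import Dict, Optional


def _normalize(code: str) -> str:
    """Drop the hyphens and spaces that printed ISBNs usually contain."""
    return code.replace("-", "").replace(" ", "")


def is_valid_isbn13(code: str) -> bool:
    """Check the ISBN-13 check digit (weights 1, 3, 1, 3, ...; sum % 10 == 0)."""
    digits = _normalize(code)
    if len(digits) != 13 or any(c not in "0123456789" for c in digits):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(digits))
    return total % 10 == 0


def book_name_from_isbn(code: str, catalogue: Dict[str, str]) -> Optional[str]:
    """Look up a decoded barcode in a catalogue keyed by hyphen-free ISBN-13."""
    if not is_valid_isbn13(code):
        return None
    return catalogue.get(_normalize(code))
```

Rejecting codes with a bad check digit avoids announcing the wrong book name when the barcode was misread from a blurred or tilted cover image.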

Because the front and back cover images differ from book to book, a specific book can be identified by comparing the captured image information directly, or by extracting features and comparing them, and the corresponding book name is thereby determined. In addition, to make identification easier, a label that is easy to recognize may be provided on the book, so that the name of the current book can be determined from the label. The label may be printed on the book or pasted onto it, and may consist of content such as a picture, a code, or characters. Since image recognition is prior art, it is not described in detail here.
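One simple way to compare cover images by extracted features is an average hash with a Hamming-distance threshold. The description does not name a specific comparison method, so the technique, the tiny grey-level grids, the `max_distance` threshold, and the function names below are illustrative assumptions.

```python
from typing import Dict, List, Optional


def average_hash(gray: List[List[int]]) -> int:
    """Encode a small grey image as bits: 1 where a pixel is above the mean."""
    flat = [p for row in gray for p in row]
    if not flat:
        raise ValueError("empty image")
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def identify_book(gray: List[List[int]], known: Dict[str, int],
                  max_distance: int = 1) -> Optional[str]:
    """Return the known book whose cover hash is closest, if close enough."""
    if not known:
        return None
    h = average_hash(gray)
    name, ref = min(known.items(), key=lambda item: hamming(h, item[1]))
    return name if hamming(h, ref) <= max_distance else None
```

In practice the captured cover would first be cropped and downscaled to a fixed small size (8 x 8 is common for this kind of hash) so that hashes from different photographs are comparable.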

In the image-text acquisition and recognition device provided by the embodiment of the invention, the multipoint image information acquisition unit can also acquire image information containing page number information and transmit it to the image information recognition unit, which recognizes the page number from that image information. Further, the page number may be output as audio or on a display, for example read out by the audio unit or shown on the display screen.

The page number information acquisition module determines the currently read page number by recognizing the image information of an inner page of the book, or by recognizing the page number, printed in characters or digits, in that image information.
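Once the text of an inner page has been recognized, a printed page number can be picked out with a simple rule. The sketch below assumes that the page number stands alone on the last or first non-empty line of the recognized text, possibly decorated as in "- 12 -"; this heuristic and the name `extract_page_number` are illustrative assumptions, not the method of the embodiment.

```python
import re
from typing import List, Optional

# A line that holds only a 1-4 digit number, optionally wrapped in
# punctuation or spaces such as "- 12 -".
_PAGE_LINE = re.compile(r"\W*(\d{1,4})\W*")


def extract_page_number(lines: List[str]) -> Optional[int]:
    """Return the page number found on the last or first non-empty line."""
    non_empty = [line.strip() for line in lines if line.strip()]
    if not non_empty:
        return None
    for candidate in (non_empty[-1], non_empty[0]):
        match = _PAGE_LINE.fullmatch(candidate)
        if match:
            return int(match.group(1))
    return None
```

Lines such as "Chapter 3" are ignored because they contain words as well as a number, which reduces the chance of mistaking a heading for the page number.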

The image-text acquisition and recognition device may further include a display unit for displaying set content information and/or image and text information obtained during acquisition and recognition and/or content information obtained externally. For example, it may display the page number currently being read or the name of the current book, explanatory information about the book (such as an introduction to the author), a user operation instruction recognized by the image information recognition unit, or a video explaining the book.

Specifically, the image-text acquisition and recognition device can further comprise any one or more of the following units:

An audio input unit for acquiring audio information. The acquired audio information can be stored in the storage unit.

A storage unit for storing audio information and/or image and/or text information obtained during acquisition and recognition and/or content information obtained externally. Stored voice information can be played through the audio unit when needed; for example, during language learning, the audio input unit, storage unit, and audio unit can work together to help check and correct the user's pronunciation.

A communication unit for communicating with a computer.

Furthermore, in order to enhance the interactive processing between the user and the image-text acquisition and recognition device and improve the experience of the user in using the image-text acquisition and recognition device, the device can further comprise an interactive processing module which is used for acquiring the interactive operation control information of the user and executing the preset interactive operation according to the interactive operation control information, wherein the interactive operation control information comprises at least one of limb actions, actions of operating objects, voice information, screen input or operation keys; in the interactive processing process, the image-text acquisition and identification device can also play specific prompt sound information to a user through the audio unit, or can display specific content information to the user through the display unit, and the user can transmit corresponding interactive operation control information to the image-text acquisition and identification device according to the corresponding prompt sound information or the displayed specific content information so as to interact with the image-text acquisition and identification device. Specifically, the corresponding interactive operation control information may include reading operation instruction information and the like performed on the printed matter, for example, performing interactive operation through body movement, for a user reading a general book, performing interactive control on a reading mode or reading content through interaction between the movement of a hand or a handheld object and the image-text acquisition and recognition device, for example, re-reading the content of the current position through predetermined gesture control. 
The interaction processing module can identify the limb action or the operation object action of the reading user, so that the device can interact with the reading user, the reading experience of the reading user is improved, and the book becomes an audio medium and an interactive medium.
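The mapping from interactive operation control information to preset interactive operations described above can be sketched as a simple lookup. This is only an illustrative sketch: the names `ControlInfo`, `InteractionModule`, `Reader`, and the gesture labels `swipe_back` and `next` are assumptions introduced here and do not come from the embodiment, and the recognition of the gesture or voice input itself is assumed to have already happened.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Input kinds listed in the description: limb actions, actions of an
# operated object, voice information, screen input, operation keys.
INPUT_KINDS = {"limb_action", "object_action", "voice", "screen", "key"}


@dataclass(frozen=True)
class ControlInfo:
    """Already-recognized interactive operation control information."""
    kind: str   # one of INPUT_KINDS
    value: str  # hypothetical label, e.g. "swipe_back"


class Reader:
    """Hypothetical stand-in for the reading function of the device."""

    def __init__(self, sentences: List[str]) -> None:
        self.sentences = sentences
        self.position = 0

    def reread(self) -> str:
        # Read the content at the current position again.
        return "reading: " + self.sentences[self.position]

    def advance(self) -> str:
        # Move to the next sentence, stopping at the last one.
        self.position = min(self.position + 1, len(self.sentences) - 1)
        return "reading: " + self.sentences[self.position]


class InteractionModule:
    """Executes the preset operation registered for a control input."""

    def __init__(self) -> None:
        self._presets: Dict[Tuple[str, str], Callable[[], str]] = {}

    def register(self, kind: str, value: str, operation: Callable[[], str]) -> None:
        if kind not in INPUT_KINDS:
            raise ValueError(f"unknown input kind: {kind}")
        self._presets[(kind, value)] = operation

    def handle(self, info: ControlInfo) -> str:
        operation = self._presets.get((info.kind, info.value))
        if operation is None:
            # Corresponds to playing a prompt sound or showing a prompt.
            return "prompt: input not recognized"
        return operation()


reader = Reader(["Once upon a time.", "The end."])
module = InteractionModule()
module.register("limb_action", "swipe_back", reader.reread)
module.register("voice", "next", reader.advance)
```

With this registration, a predetermined gesture such as the hypothetical `swipe_back` triggers a re-read of the current position, while an unregistered input falls through to a prompt.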

In the embodiment of the invention, the required acquisition object can be covered without distortion by multi-point image acquisition at close range. Specifically, a plurality of focal lengths are provided for the movable camera or the plurality of cameras, so that every part of the acquired object is at its best focal length and the image of every part is clear.
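One way to picture "every part at its best focal length" is to capture each region of the object several times at different focal lengths and keep the sharpest capture per region. The sketch below assumes this selection strategy and a simple gradient-energy sharpness measure; neither is stated in the embodiment, and the function names are hypothetical. Images are represented as plain lists of grayscale rows.

```python
from typing import List, Sequence

Image = List[List[int]]  # grayscale pixel rows for one region


def sharpness(tile: Image) -> int:
    """Sum of squared differences between adjacent pixels.

    A well-focused tile has strong local contrast and scores higher;
    a defocused (blurred) tile scores lower.
    """
    total = 0
    for y, row in enumerate(tile):
        for x, value in enumerate(row):
            if x + 1 < len(row):            # horizontal neighbour
                total += (row[x + 1] - value) ** 2
            if y + 1 < len(tile):           # vertical neighbour
                total += (tile[y + 1][x] - value) ** 2
    return total


def pick_sharpest(captures: Sequence[Image]) -> int:
    """Index of the sharpest capture among several focal lengths."""
    return max(range(len(captures)), key=lambda i: sharpness(captures[i]))


def best_focus_per_region(regions: Sequence[Sequence[Image]]) -> List[int]:
    """For each region, the index of the focal length to keep."""
    return [pick_sharpest(captures) for captures in regions]


# Two captures of the same region: the first is soft, the second crisp.
blurred = [[100, 110], [105, 115]]
crisp = [[0, 255], [255, 0]]
chosen = pick_sharpest([blurred, crisp])
```

Here `chosen` is the index of the crisp capture, since its adjacent-pixel differences are far larger than those of the soft one.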

Because a structure with multiple cameras or a movable camera (such as a flip-type camera) is adopted, a suitable shooting angle and position is always available for a curved surface (for example, the large curve in the middle of an open book), so the surface can be photographed and recognized effectively. Whether the shooting angle is directly above or along an inclined side plane, the object can be photographed effectively, with every part in good focus, yielding a clear image and good resolution. Moreover, multi-point image acquisition places a lower requirement on the resolution of the camera; equivalently, a camera of the same resolution can capture the photographed object at a higher resolution, which is more beneficial to recognition.
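The resolution benefit can be illustrated with simple arithmetic: when each shot covers only part of the object, the same sensor pixels are spread over a smaller width. The figures below (a sensor 1600 pixels wide, a 16-inch spread, two shots) are hypothetical and chosen only for illustration; overlap between adjacent shots is ignored.

```python
def pixels_per_inch(sensor_pixels: int, covered_inches: float) -> float:
    """Resolution on the object when one shot covers `covered_inches`."""
    if covered_inches <= 0:
        raise ValueError("covered_inches must be positive")
    return sensor_pixels / covered_inches


def multipoint_pixels_per_inch(sensor_pixels: int, object_inches: float, shots: int) -> float:
    """Resolution when the object width is split evenly across `shots`."""
    if shots < 1:
        raise ValueError("shots must be at least 1")
    return pixels_per_inch(sensor_pixels, object_inches / shots)


single_shot = pixels_per_inch(1600, 16.0)                 # whole spread in one shot
two_point = multipoint_pixels_per_inch(1600, 16.0, 2)     # one shot per page
```

Under these assumed figures, photographing the spread in two parts doubles the pixels available per inch with the same camera, which matches the statement that the same camera can reach a higher resolution on the object.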

The above description covers only the preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention falls within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. An image-text photographing identification device, characterized by comprising:

an image information recognition unit including:

2. The image-text photographing identification device according to claim 1, characterized by comprising:

3. The image-text photographing identification device according to claim 1, characterized by comprising:

4. The image-text photographing identification device according to any one of claims 1 to 3, characterized by comprising:

5. The image-text photographing identification device according to claim 4, characterized by comprising:

and/or,

6. The image-text photographing identification device according to claim 4, characterized by comprising:

7. The image-text photographing identification device according to claim 6, characterized by comprising:

and/or,

8. The image-text photographing identification device according to any one of claims 1 to 3, characterized by comprising:

and/or,

a communication unit for communicating with a computer,

and/or,

an audio input unit for acquiring audio information,

and/or,

9. The image-text photographing identification device according to claim 4, characterized by comprising:

and/or,

a communication unit for communicating with a computer,

and/or,

an audio input unit for acquiring audio information,

and/or,

10. The image-text photographing identification device according to claim 5, characterized by comprising:

and/or,

a communication unit for communicating with a computer,

and/or,

an audio input unit for acquiring audio information,

and/or,