CN112836685A - Reading assisting method, system and storage medium - Google Patents

Reading assisting method, system and storage medium

Info

Publication number
CN112836685A
Authority
CN
China
Prior art keywords
user
image
target
watched
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110262244.7A
Other languages
Chinese (zh)
Inventor
何苗
秦林婵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing 7Invensun Technology Co Ltd
Original Assignee
Beijing 7Invensun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing 7Invensun Technology Co Ltd filed Critical Beijing 7Invensun Technology Co Ltd
Priority to CN202110262244.7A priority Critical patent/CN112836685A/en
Publication of CN112836685A publication Critical patent/CN112836685A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/19 Sensors therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Ophthalmology & Optometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • User Interface Of Digital Computer (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The invention discloses a reading assistance method, system and storage medium. The method comprises the following steps: acquiring an image of the target gazed at by the user, and performing text recognition on the image to obtain the text contained in the image; tracking the user's line of sight to obtain the user's gaze point information; mapping the gaze point information onto the image to obtain the target text currently gazed at by the user; and converting the target text into audio and playing it. According to the reading assistance method disclosed by the embodiments of the disclosure, the user's gaze point information is mapped onto the image of the gazed target to obtain the text the user is currently looking at, and that text is converted into audio and played; this enables children to read any text independently, provides reading assistance without a parent's participation, and improves the convenience of assisted reading.

Description

Reading assisting method, system and storage medium
Technical Field
Embodiments of the invention relate to the technical field of reading assistance, and in particular to a reading assistance method, system and storage medium.
Background
At present, products for children's independent reading mainly include point-and-read machines, audio picture books and the like, which address the problem that young children recognize too few characters to read on their own. With an audio picture book, whichever page the child turns to, the corresponding text is read aloud automatically; however, such a book can only recite a whole passage mechanically, the child struggles to stay focused, and can only wait for the reading to finish before turning to the next page. A point-and-read machine can read and teach following the child's instructions, but it requires specific books or hardware (devices with a display screen and the like), and not all books support point-and-read. Likewise, an audio picture book is tied to specific titles; it cannot be adapted to every e-book or physical book.
Therefore, for most ordinary books, parents still need to sit beside the child, turning pages and telling the story. This costs parents considerable time, and without a parent the child cannot read independently.
Disclosure of Invention
Embodiments of the invention provide a reading assistance method, system and storage medium that enable children to read any text independently, provide reading assistance without a parent's participation, and improve the convenience of assisted reading.
In a first aspect, an embodiment of the invention provides a reading assistance method, including:
acquiring an image of the target gazed at by the user, and performing text recognition on the image to obtain the text contained in the image;
tracking the user's line of sight to obtain the user's gaze point information;
mapping the gaze point information onto the image to obtain the target text currently gazed at by the user;
and converting the target text into audio and playing it.
Further, acquiring the image of the target gazed at by the user includes:
if the image of the gazed target is displayed by a display device, acquiring the source of the image;
acquiring the image of the gazed target according to the source; or,
photographing the displayed content with a camera to obtain the image of the gazed target.
Further, acquiring the image of the target gazed at by the user includes:
if the target gazed at by the user is a physical book, photographing the physical book gazed at by the user with the camera to obtain the image of the gazed target.
Further, mapping the gaze point information onto the image to obtain the target text currently gazed at by the user includes:
acquiring the coordinates in the image corresponding to the gaze point information;
and determining the target text currently gazed at by the user based on those coordinates.
Further, converting the target text into audio for playing includes:
judging whether the duration for which the user gazes at the target text exceeds a set value,
and if so, converting the target text into audio and playing it.
In a second aspect, an embodiment of the invention further provides a reading assistance system, including: an image acquisition module, a text recognition module, a gaze tracking module, a gaze point mapping module and an audio playing module;
the image acquisition module is used to acquire an image of the target gazed at by the user; the text recognition module is used to perform text recognition on the image to obtain the text contained in the image; the gaze tracking module is used to track the user's eye movements to obtain the user's gaze point information; the gaze point mapping module is used to map the gaze point information onto the image to obtain the target text currently gazed at by the user; and the audio playing module is used to convert the target text into audio and play it.
Further, the system also includes a camera; the camera is used to photograph the content gazed at by the user, obtain an image of the gazed target and send it to the image acquisition module; the user gazes either at a physical book or at a picture of the target displayed by a display device.
Further, if the image of the gazed target is displayed by a display device, the gaze tracking module is integrated on the display device;
the image acquisition module is further configured to: acquire the source of the image, and acquire the image of the gazed target according to that source.
Further, the camera and the gaze tracking module are integrated on a wearable device.
In a third aspect, an embodiment of the invention further provides a computer-readable storage medium on which a computer program is stored; when executed by a processing apparatus, the program implements the reading assistance method according to the embodiments of the invention.
The embodiments of the invention disclose a reading assistance method, system and storage medium: an image of the target gazed at by the user is acquired and text recognition is performed on it to obtain the text it contains; the user's line of sight is tracked to obtain gaze point information; the gaze point information is mapped onto the image to obtain the target text currently gazed at; and the target text is converted into audio and played. By mapping the user's gaze point information onto the image of the gazed target, obtaining the currently gazed text and converting it into audio, the method enables children to read any text independently, provides reading assistance without a parent's participation, and improves the convenience of assisted reading.
Drawings
FIG. 1 is a flowchart of a reading assistance method according to Embodiment One of the present invention;
FIG. 2 is a schematic structural diagram of a reading assistance system according to Embodiment Two of the present invention;
FIG. 3 is an exemplary diagram of a reading assistance system in Embodiment Two of the present invention;
FIG. 4 is an exemplary diagram of another reading assistance system in Embodiment Two of the present invention;
FIG. 5a is a schematic structural diagram of a wearable device in Embodiment Two of the present invention;
FIG. 5b is a schematic structural diagram of a wearable device in Embodiment Two of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the structures related to the invention rather than all of them.
Gaze tracking, also called eye tracking, is a technique for estimating the gaze direction and/or gaze point of the eyes by measuring eye movement. Specifically, an image of the user's eyes is captured in real time and the relative positions of eye features are analyzed to obtain the user's gaze point information; or eye movement is detected through the capacitance between the eyeball and a capacitive plate; or electrodes are placed at the bridge of the nose, the forehead, the ears or the earlobes, and eye movement is detected from the measured myoelectric signals. Of course, other methods of acquiring the user's gaze point information in real time may also be adopted, and all fall within the scope of the invention.
Eye tracking can be implemented by optical recording. The principle of optical recording is to record the subject's eye movement with an infrared camera, i.e., to obtain eye images that reflect eye movement, and to extract eye features from those images for building a gaze estimation model. The eye features may include: pupil position, pupil shape, iris position, iris shape, eyelid position, eye corner position, and spot (Purkinje image) position, among others. Optical recording methods include the pupil-corneal reflection method: a near-infrared light source illuminates the eye while an infrared camera photographs it, capturing the reflection of the light source on the cornea, i.e., a glint, so that an eye image containing the glint is obtained.
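By way of illustration only (the patent does not specify a mapping algorithm), the pupil-glint vector produced by the pupil-corneal reflection method is commonly mapped to a gaze point with a polynomial regression fitted during a calibration routine. The sketch below assumes second-order features and a least-squares fit; all function names are hypothetical:

```python
import numpy as np

def poly_features(v):
    """Second-order polynomial features of a pupil-glint vector (vx, vy)."""
    vx, vy = v
    return np.array([1.0, vx, vy, vx * vy, vx ** 2, vy ** 2])

def fit_gaze_mapping(pupil_glint_vectors, screen_points):
    """Fit per-axis least-squares mappings from calibration samples.

    pupil_glint_vectors: (N, 2) pupil center minus glint center, in pixels.
    screen_points:       (N, 2) known calibration targets on the display.
    """
    X = np.array([poly_features(v) for v in pupil_glint_vectors])
    targets = np.asarray(screen_points, dtype=float)
    coeff_x, *_ = np.linalg.lstsq(X, targets[:, 0], rcond=None)
    coeff_y, *_ = np.linalg.lstsq(X, targets[:, 1], rcond=None)
    return coeff_x, coeff_y

def estimate_gaze(v, coeff_x, coeff_y):
    """Map a new pupil-glint vector to an estimated on-screen gaze point."""
    f = poly_features(v)
    return float(f @ coeff_x), float(f @ coeff_y)
```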
Of course, besides optical recording, the gaze tracking device may be a MEMS system, for example comprising a MEMS infrared scanning mirror, an infrared light source and an infrared receiver; it may be a capacitive sensor that detects eye movement through the capacitance between the eyeball and a capacitive plate; or it may be a myoelectric detector that places electrodes at the bridge of the nose, the forehead, the ears or the earlobes and detects eye movement from the measured myoelectric signals.
At present, gaze tracking technology offers various methods for acquiring the user's gaze information, which are not described in detail here.
Embodiment One
Fig. 1 is a flowchart of a reading assistance method according to Embodiment One of the invention. The method is applicable to assisting a child in reading, and may be executed by a reading assistance apparatus composed of hardware and/or software, generally integrated into a device with a reading assistance function, such as a server or a server cluster. As shown in Fig. 1, the method specifically includes the following steps:
and step 110, acquiring an image of a user watching a target, and performing character recognition on the image to acquire characters contained in the image.
The target gazed at by the user may be a physical book, or an image of the gazed target may be displayed by a display device. The display device may be the display screen of any electronic equipment, such as a television, a desktop computer or a mobile terminal.
In this embodiment, if the image of the gazed target is displayed by a display device, it may be obtained in one of two ways: acquire the source of the image and obtain the image according to that source; or photograph the displayed content with a camera.
If the image of the gazed target is displayed by a display device, the image may be stored locally as data or transmitted over a network. If it is stored in local memory, it can be read directly via its storage path; if it is transmitted over the network, it can be received through a socket. Alternatively, the content shown on the display device can be photographed by a camera to obtain the image of the gazed target. The camera is placed in front of the display device with its shooting angle facing the display picture, so that the displayed content can be captured completely. Specifically, the camera may shoot the displayed content on receiving a shooting instruction, or it may shoot in real time.
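For illustration, a minimal sketch of those two acquisition paths using OpenCV follows; the file path and camera index are placeholders, not values from the patent:

```python
import cv2

def image_from_source(path):
    """Load the displayed image directly from its storage path."""
    return cv2.imread(path)

def image_from_camera(device_index=0):
    """Grab one frame from a camera facing the display or the book."""
    cap = cv2.VideoCapture(device_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("camera frame could not be captured")
    return frame
```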
In this embodiment, if the user gazes at a physical book, the image of the gazed target may be obtained by controlling the camera to photograph the physical book gazed at by the user.
The camera is placed in front of or above the physical book with its shooting angle facing the book, so that the book's content can be captured completely. Specifically, the camera may shoot on receiving a shooting instruction, or it may shoot in real time.
Text recognition may use existing text recognition technology, which is not described again here. In this embodiment, text recognition must obtain not only the text contained in the image but also the position of each character in the image; the position can be represented by the coordinates of the center of the rectangular box surrounding the character. Specifically, the rectangular box corresponding to each recognized character is obtained, and the coordinates of the box's center are taken as the character's position.
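The patent names no particular OCR engine, so as a hedged example the sketch below uses pytesseract's word-level output to collect each recognized word together with the center of its bounding box, matching the position representation described above:

```python
import pytesseract
from pytesseract import Output

def recognize_words(image):
    """Run OCR and return each word with its bounding box and center."""
    data = pytesseract.image_to_data(image, output_type=Output.DICT)
    words = []
    for i, text in enumerate(data["text"]):
        if not text.strip():
            continue  # skip empty detections
        left, top = data["left"][i], data["top"][i]
        w, h = data["width"][i], data["height"][i]
        words.append({"text": text,
                      "center": (left + w / 2, top + h / 2),
                      "box": (left, top, w, h)})
    return words
```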
Step 120: track the user's line of sight to obtain the user's gaze point information.
In this embodiment, the gaze tracking technology described above may be used to track the user's line of sight. An eye tracking module (e.g., an eye tracker) may be installed to do so.
Step 130: map the gaze point information onto the image to obtain the target text currently gazed at by the user.
The gaze point information can be understood as the intersection of the user's line of sight with the gazed image. Specifically, the gaze point information may be mapped onto the image as follows: acquire the coordinates in the image corresponding to the gaze point information, then determine the target text currently gazed at by the user based on those coordinates.
In this embodiment, the position of each character is acquired together with the text contained in the image of the gazed target, so once the coordinates corresponding to the gaze point in the image are known, the text whose position contains those coordinates, i.e., the target text, can be found.
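A sketch of that lookup, reusing recognize_words from above; the containment test with a nearest-word fallback is an assumption rather than something the patent specifies:

```python
def word_at_gaze(words, gaze_x, gaze_y):
    """Return the word whose box contains the gaze point, else the nearest one."""
    best, best_dist = None, float("inf")
    for w in words:
        left, top, width, height = w["box"]
        if left <= gaze_x <= left + width and top <= gaze_y <= top + height:
            return w  # gaze point falls inside this word's box
        cx, cy = w["center"]
        dist = (cx - gaze_x) ** 2 + (cy - gaze_y) ** 2
        if dist < best_dist:
            best, best_dist = w, dist
    return best
```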
Step 140: convert the target text into audio and play it.
Specifically, after the target text is determined, a text-to-speech module can be invoked to convert the target text into audio and play it.
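Any text-to-speech backend could play this role; as one hedged possibility, the offline library pyttsx3 provides it in a few lines:

```python
import pyttsx3

def speak(text):
    """Convert the target text into audio and play it aloud."""
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```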
Optionally, the target text may be converted into audio for playing as follows: judge whether the duration for which the user gazes at the target text exceeds a set value, and if so, convert the target text into audio and play it.
The set value may be between 1 and 2 seconds. When the duration for which the user gazes at the target text exceeds the set value, this reflects to some extent that the user may not recognize the text; only then is the text converted into audio and played, achieving automatic control of the reading assistance.
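A sketch of that dwell-time trigger follows, reusing speak() from above; the 1.5 s threshold is one point within the 1-2 second range just mentioned, and the polling loop and get_gaze_word callback are assumptions:

```python
import time

DWELL_THRESHOLD_S = 1.5  # set value chosen within the 1-2 second range above

def dwell_loop(get_gaze_word):
    """Speak a word once the gaze has dwelt on it past the threshold."""
    current, since, spoken = None, None, False
    while True:
        word = get_gaze_word()  # e.g. word_at_gaze(...) fed by the eye tracker
        now = time.monotonic()
        if word != current:
            current, since, spoken = word, now, False
        elif current and not spoken and now - since >= DWELL_THRESHOLD_S:
            speak(current["text"])
            spoken = True  # avoid repeating while the gaze stays put
        time.sleep(0.05)
```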
In the technical solution of this embodiment, an image of the target gazed at by the user is acquired and text recognition is performed on it to obtain the text it contains; the user's line of sight is tracked to obtain gaze point information; the gaze point information is mapped onto the image to obtain the target text currently gazed at; and the target text is converted into audio and played. By mapping the user's gaze point information onto the image of the gazed target, obtaining the currently gazed text and converting it into audio, the method enables children to read any text independently, provides reading assistance without a parent's participation, and improves the convenience of assisted reading.
Embodiment Two
Fig. 2 is a schematic structural diagram of a reading assistance system according to Embodiment Two of the invention. As shown in Fig. 2, the system includes: an image acquisition module 210, a text recognition module 220, a gaze tracking module 230, a gaze point mapping module 240 and an audio playing module 250.
The image acquisition module 210 is configured to acquire an image of the target gazed at by the user; the text recognition module 220 is configured to perform text recognition on the image to obtain the text it contains; the gaze tracking module 230 is configured to track the user's eye movements to obtain the user's gaze point information; the gaze point mapping module 240 is configured to map the gaze point information onto the image to obtain the target text currently gazed at by the user; and the audio playing module 250 is configured to convert the target text into audio and play it. A hedged sketch of how these modules could cooperate appears below.
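Tying together the helper functions from Embodiment One, one pass of the pipeline might look like this; the wiring is illustrative, not the patent's implementation:

```python
def assisted_reading_step(image, gaze_x, gaze_y):
    """One pass of the pipeline: OCR, gaze mapping, then speech."""
    words = recognize_words(image)                 # text recognition module
    target = word_at_gaze(words, gaze_x, gaze_y)   # gaze point mapping module
    if target is not None:
        speak(target["text"])                      # audio playing module
    return target
```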
The gaze tracking module 230 may be an eye tracker. If the image of the gazed target is displayed by a display device, the gaze tracking module 230 is integrated on the display device. The display device may be the display screen of any electronic equipment, such as a television, a desktop computer or a mobile terminal. Fig. 3 is an exemplary diagram of a reading assistance system according to Embodiment Two of the invention. As shown in Fig. 3, the gaze tracking module is arranged on the display device; when the user gazes at an image displayed on it, the gaze tracking module acquires the user's gaze point information, maps it onto the image, determines the text corresponding to the gaze point, converts that text into audio and plays it through the playing module. The playing module may be a loudspeaker or headphones. In this embodiment, the gaze point may or may not be shown on the display device.
The image acquisition module 210 is further configured to: acquire the source of the image, and acquire the image of the gazed target according to that source.
Optionally, the system further includes a camera configured to photograph the content gazed at by the user, obtain an image of the gazed target, and send the image to the image acquisition module 210.
The user gazes either at a physical book or at a picture of the target displayed by a display device.
For example, Fig. 4 is a schematic diagram of another reading assistance system provided in Embodiment Two of the invention. As shown in Fig. 4, the user gazes at a physical book; the gaze tracking module and the camera are both arranged on the desk. The camera photographs the physical book and sends the image to the image acquisition module, while the gaze tracking module acquires the user's gaze point information. The text recognition module performs text recognition on the image to obtain the text it contains. The gaze point mapping module maps the gaze point information onto the image to obtain the target text currently gazed at by the user. The audio playing module converts the target text into audio and plays it.
In this embodiment, the audio playing module may be integrated with the camera and the gaze tracking module, or may be a separate module.
Optionally, the camera and the gaze tracking module may also be integrated on a wearable device, such as glasses or a helmet. Illustratively, Figs. 5a-5b are schematic structural diagrams of a wearable device provided by an example of the invention. As shown in Figs. 5a-5b, the glasses carry a front-facing camera and a gaze tracking module. After the user puts on the glasses, the gaze tracking module acquires the user's gaze point information in real time while the front camera photographs the target gazed at by the user. The image acquisition, text recognition, gaze point mapping and audio playing modules may be integrated on the wearable device; alternatively, the wearable device connects by wire or wirelessly to a device integrating those modules and sends it the gaze point information and the image of the gazed target, so that the device processes them to provide reading assistance.
The reading assistance system provided by the embodiment of the invention includes an image acquisition module, a text recognition module, a gaze tracking module, a gaze point mapping module and an audio playing module. The image acquisition module acquires an image of the target gazed at by the user; the text recognition module performs text recognition on the image to obtain the text it contains; the gaze tracking module tracks the user's eye movements to obtain the user's gaze point information; the gaze point mapping module maps the gaze point information onto the image to obtain the target text currently gazed at by the user; and the audio playing module converts the target text into audio and plays it. By mapping the user's gaze point information onto the image of the gazed target, obtaining the currently gazed text and converting it into audio, the system enables children to read any text independently, provides reading assistance without a parent's participation, and improves the convenience of assisted reading.
Embodiment Three
An embodiment of the invention provides a computer-readable storage medium on which a computer program is stored; when executed by a processing device, the program implements the reading assistance method according to the embodiments of the invention. The computer-readable medium described above may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may comprise a propagated data signal with computer-readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated signal may take many forms, including but not limited to electromagnetic or optical forms, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium, other than a computer-readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with digital data communication in any form or medium (e.g., a communications network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an image of a user watching a target, and performing character recognition on the image to acquire characters contained in the image; tracking the sight of the user to obtain the information of the point of regard of the user; mapping the fixation point information to the image to obtain a target character watched by the user at present; and converting the target characters into audio for playing.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of an element does not in some cases constitute a limitation on the element itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A reading assistance method, comprising:
acquiring an image of a target gazed at by a user, and performing text recognition on the image to obtain the text contained in the image;
tracking the user's line of sight to obtain the user's gaze point information;
mapping the gaze point information onto the image to obtain the target text currently gazed at by the user;
and converting the target text into audio and playing it.
2. The method of claim 1, wherein acquiring the image of the target gazed at by the user comprises:
if the image of the gazed target is displayed by a display device, acquiring the source of the image;
acquiring the image of the gazed target according to the source; or,
photographing the displayed content with a camera to obtain the image of the gazed target.
3. The method of claim 1, wherein acquiring the image of the target gazed at by the user comprises:
if the target gazed at by the user is a physical book, photographing the physical book gazed at by the user with the camera to obtain the image of the gazed target.
4. The method of claim 1, wherein mapping the gaze point information onto the image to obtain the target text currently gazed at by the user comprises:
acquiring the coordinates in the image corresponding to the gaze point information;
and determining the target text currently gazed at by the user based on those coordinates.
5. The method of claim 1, wherein converting the target text into audio for playing comprises:
judging whether the duration for which the user gazes at the target text exceeds a set value,
and if so, converting the target text into audio and playing it.
6. A reading assistance system, comprising: an image acquisition module, a text recognition module, a gaze tracking module, a gaze point mapping module and an audio playing module;
wherein the image acquisition module is used to acquire an image of a target gazed at by a user; the text recognition module is used to perform text recognition on the image to obtain the text contained in the image; the gaze tracking module is used to track the user's eye movements to obtain the user's gaze point information; the gaze point mapping module is used to map the gaze point information onto the image to obtain the target text currently gazed at by the user; and the audio playing module is used to convert the target text into audio and play it.
7. The system of claim 6, further comprising a camera, wherein the camera is used to photograph the content gazed at by the user, obtain an image of the gazed target and send it to the image acquisition module; and the user gazes either at a physical book or at a picture of the target displayed by a display device.
8. The system of claim 6 or 7, wherein if the image of the gazed target is displayed by a display device, the gaze tracking module is integrated on the display device;
and the image acquisition module is further configured to: acquire the source of the image, and acquire the image of the gazed target according to the source.
9. The system of claim 7, wherein the camera and the gaze tracking module are integrated on a wearable device.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processing device, implements the reading assistance method according to any one of claims 1 to 5.
CN202110262244.7A 2021-03-10 2021-03-10 Reading assisting method, system and storage medium Pending CN112836685A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110262244.7A CN112836685A (en) 2021-03-10 2021-03-10 Reading assisting method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110262244.7A CN112836685A (en) 2021-03-10 2021-03-10 Reading assisting method, system and storage medium

Publications (1)

Publication Number Publication Date
CN112836685A (en) 2021-05-25

Family

ID=75929894

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110262244.7A Pending CN112836685A (en) 2021-03-10 2021-03-10 Reading assisting method, system and storage medium

Country Status (1)

Country Link
CN (1) CN112836685A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537116A (en) * 2021-07-27 2021-10-22 重庆国翔创新教学设备有限公司 Reading material-matched auxiliary learning system, method, equipment and storage medium
CN114281236A (en) * 2021-12-28 2022-04-05 建信金融科技有限责任公司 Text processing method, device, equipment, medium and program product
CN114281236B (en) * 2021-12-28 2023-08-15 建信金融科技有限责任公司 Text processing method, apparatus, device, medium, and program product

Similar Documents

Publication Publication Date Title
US20210081650A1 (en) Command Processing Using Multimodal Signal Analysis
US10650533B2 (en) Apparatus and method for estimating eye gaze location
JP6574937B2 (en) COMMUNICATION SYSTEM, CONTROL METHOD, AND STORAGE MEDIUM
US20150379896A1 (en) Intelligent eyewear and control method thereof
US20140267642A1 (en) Systems and methods for audible facial recognition
EP2824541A1 (en) Method and apparatus for connecting devices using eye tracking
CN109259724B (en) Eye monitoring method and device, storage medium and wearable device
CN105183170B (en) Wear-type wearable device and its information processing method, device
CN112836685A (en) Reading assisting method, system and storage medium
US20220066207A1 (en) Method and head-mounted unit for assisting a user
US11523240B2 (en) Selecting spatial locations for audio personalization
WO2019039591A4 (en) Read-out system and read-out method
US10643636B2 (en) Information processing apparatus, information processing method, and program
KR20160024733A (en) Method and program for controlling electronic device by wearable glass device
CN109284002B (en) User distance estimation method, device, equipment and storage medium
CN112751582A (en) Wearable device for interaction, interaction method and equipment, and storage medium
CN109145010B (en) Information query method and device, storage medium and wearable device
US11816886B1 (en) Apparatus, system, and method for machine perception
CN112684890A (en) Physical examination guiding method and device, storage medium and electronic equipment
JP2016021063A (en) Image concentration guiding apparatus and image concentration guiding method using the same
EP3882894B1 (en) Seeing aid for a visually impaired individual
US11763560B1 (en) Head-mounted device with feedback
US11924541B2 (en) Automatic camera exposures for use with wearable multimedia devices
JP5989725B2 (en) Electronic device and information display program
Sneha et al. AI-powered smart glasses for blind, deaf, and dumb

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination