CN112836685A

CN112836685A - Assisted reading method, system and storage medium

Info

Publication number: CN112836685A
Application number: CN202110262244.7A
Authority: CN
Inventors: 何苗; 秦林婵
Original assignee: Beijing 7Invensun Technology Co Ltd
Current assignee: Beijing 7Invensun Technology Co Ltd
Priority date: 2021-03-10
Filing date: 2021-03-10
Publication date: 2021-05-25

Abstract

The invention discloses an assisted reading method, system and storage medium. The method includes: acquiring an image of the target the user is gazing at, and performing text recognition on the image to obtain the text contained in the image; performing gaze tracking on the user to obtain the user's gaze point information; mapping the gaze point information into the image to obtain the target text the user is currently gazing at; and converting the target text into audio for playback. By mapping the user's gaze point information into the image of the gaze target, obtaining the target text the user is currently gazing at, and converting that text into audio for playback, the disclosed method enables children to read any text independently, assists reading without parental participation, and makes assisted reading more convenient.

Description

Assisted reading method, system and storage medium

Technical Field

Embodiments of the present invention relate to the technical field of assisted reading, and in particular to an assisted reading method, system, and storage medium.

Background

At present, products for children's independent reading mainly include point-and-read machines and audio picture books, which address the situation where young children recognize few characters and cannot read on their own. With an audio picture book, whichever page the child turns to, the corresponding text is read aloud automatically. However, it can only read the whole passage mechanically, so the child finds it hard to concentrate and has to wait until the reading finishes before turning to the next page. A point-and-read machine can read and teach according to the child's pointing, but it works only with specific books or hardware (such as devices with a display screen), so not every book can be read this way. Likewise, audio picture books work only with specific books. Neither product adapts to all electronic picture books or physical books.

Therefore, for most ordinary books, a parent still needs to sit beside the child, turning the pages while telling the story. This takes up a lot of the parent's time, and without a parent's company the child cannot read books alone.

Summary of the Invention

Embodiments of the present invention provide an assisted reading method, system, and storage medium that enable children to read any text independently, assist reading without parental participation, and make assisted reading more convenient.

In a first aspect, an embodiment of the present invention provides an assisted reading method, including:

acquiring an image of the user's gaze target, and performing text recognition on the image to obtain the text contained in the image;

performing gaze tracking on the user to obtain the user's gaze point information;

mapping the gaze point information into the image to obtain the target text the user is currently gazing at;

converting the target text into audio for playback.

Further, acquiring the image of the user's gaze target includes:

if the image of the user's gaze target is displayed by a display device, acquiring the source of the image;

acquiring the image of the user's gaze target according to the source; or,

photographing the displayed content with a camera to obtain the image of the user's gaze target;

if the user's gaze target is a physical book, photographing the physical book with a camera to obtain the image of the user's gaze target.

Further, mapping the gaze point information into the image to obtain the target text the user is currently gazing at includes:

obtaining the coordinate information corresponding to the gaze point information in the image;

determining, based on the coordinate information, the target text the user is currently gazing at.

Further, converting the target text into audio for playback includes:

determining whether the duration of the user's gaze on the target text exceeds a set value;

if it does, converting the target text into audio for playback.

In a second aspect, an embodiment of the present invention further provides an assisted reading system, including: an image acquisition module, a text recognition module, a gaze tracking module, a gaze point information mapping module, and an audio playback module.

The image acquisition module is configured to acquire an image of the user's gaze target; the text recognition module is configured to perform text recognition on the image to obtain the text contained in the image; the gaze tracking module is configured to perform eye tracking on the user to obtain the user's gaze point information; the gaze point information mapping module is configured to map the gaze point information into the image to obtain the target text the user is currently gazing at; and the audio playback module is configured to convert the target text into audio for playback.

Further, the system includes a camera. The camera is configured to photograph the content the user is gazing at, obtain an image of the user's gaze target, and send the image to the image acquisition module. Here the user is gazing at a physical book, or the image of the user's gaze target is displayed by a display device.

Further, if the image of the user's gaze target is displayed by a display device, the gaze tracking module is integrated on the display device.

The image acquisition module is further configured to: acquire the source of the image; and acquire the image of the user's gaze target according to the source.

Further, the camera and the gaze tracking module are integrated on a wearable device.

In a third aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processing device, it implements the assisted reading method described in the embodiments of the present invention.

Embodiments of the present invention disclose an assisted reading method, system, and storage medium. An image of the user's gaze target is acquired and text recognition is performed on it to obtain the text contained in the image; gaze tracking is performed on the user to obtain the user's gaze point information; the gaze point information is mapped into the image to obtain the target text the user is currently gazing at; and the target text is converted into audio for playback. By mapping the gaze point information into the image of the gaze target and playing the resulting target text as audio, the disclosed method enables children to read any text independently, assists reading without parental participation, and makes assisted reading more convenient.

Description of the Drawings

FIG. 1 is a flowchart of an assisted reading method in Embodiment 1 of the present invention;

FIG. 2 is a schematic structural diagram of an assisted reading system in Embodiment 2 of the present invention;

FIG. 3 is an example diagram of an assisted reading system in Embodiment 2 of the present invention;

FIG. 4 is an example diagram of another assisted reading system in Embodiment 2 of the present invention;

FIG. 5a is a schematic structural diagram of a wearable device in Embodiment 2 of the present invention;

FIG. 5b is a schematic structural diagram of a wearable device in Embodiment 2 of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.

Gaze tracking, also called eye tracking, is a technique that estimates the gaze direction and/or gaze point of the eye by measuring eye movement. Specifically, eye images of the user under test can be captured in real time and the relative positions of eye features analyzed from those images to obtain the user's gaze point information; or eye movement can be detected from the capacitance between the eyeball and a capacitive plate; or electrodes can be placed at the bridge of the nose, forehead, ears, or earlobes and eye movement detected from the pattern of the detected myoelectric signals. Other methods of obtaining the user's gaze point information in real time may of course also be used, and all of them fall within the protection scope of the present invention.

Eye tracking can be implemented with the optical recording method. Its principle is to record the subject's eye movement with an infrared camera, that is, to acquire eye images that reflect eye movement, and to extract eye features from those images in order to build a gaze estimation model. Eye features may include pupil position, pupil shape, iris position, iris shape, eyelid position, eye corner position, and glint position (Purkinje image). The optical recording method includes the pupil-corneal reflection method, in which a near-infrared light source illuminates the eye and an infrared camera photographs it, capturing at the same time the reflection of the light source on the cornea (the glint), so that an eye image containing the glint is obtained.

Besides the optical recording method, the gaze tracking device may also be a MEMS (micro-electro-mechanical system) device, for example one comprising a MEMS infrared scanning mirror, an infrared light source, and an infrared receiver; a capacitive sensor that detects eye movement from the capacitance between the eyeball and a capacitive plate; or a myoelectric detector that detects eye movement from the pattern of myoelectric signals picked up by electrodes placed at the bridge of the nose, forehead, ears, or earlobes.

Gaze tracking technology currently offers many ways to obtain the user's gaze information, which are not enumerated one by one here.

Embodiment 1

FIG. 1 is a flowchart of an assisted reading method provided by Embodiment 1 of the present invention. This embodiment is applicable to assisting children in reading. The method may be performed by an assisted reading apparatus, which may be implemented in hardware and/or software and can generally be integrated in a device with an assisted reading function; the device may be an electronic device such as a server or a server cluster. As shown in FIG. 1, the method includes the following steps:

Step 110: acquire an image of the user's gaze target, and perform text recognition on the image to obtain the text contained in the image.

The user's gaze target may be a physical book, or the image of the gaze target may be displayed by a display device. The display device may be the display screen of any electronic device, such as a television, a desktop computer, or a mobile terminal.

In this embodiment, if the image of the user's gaze target is displayed by a display device, the image may be acquired by obtaining the source of the image and then acquiring the image according to that source, or by photographing the displayed content with a camera.

If the image of the user's gaze target is displayed by a display device, the image may be stored as data in local memory or transmitted over a network. If it is stored in local memory, the image can be obtained directly from its storage path; if it is transmitted over a network, the image is obtained through a socket. Alternatively, a camera can photograph the content shown on the display device to obtain the image of the user's gaze target. The camera is placed in front of the display device, facing the display screen, so that the displayed content is captured in full. The camera may photograph the displayed content on receiving a capture instruction, or photograph it continuously in real time.
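
The two source-based acquisition paths described above (local storage path versus socket) can be sketched as follows. This is only an illustration under stated assumptions: the function names `load_image_from_path` and `load_image_from_socket` are hypothetical, and the sketch assumes the sender's image length in bytes is known in advance, which the patent does not specify.

```python
import socket
from pathlib import Path


def load_image_from_path(path: str) -> bytes:
    """Read the encoded image bytes directly from a local storage path."""
    return Path(path).read_bytes()


def load_image_from_socket(sock: socket.socket, length: int) -> bytes:
    """Read exactly `length` bytes of image data from a connected socket.

    Assumption: the image size is known beforehand (for example, sent in
    a header). The patent only says the image is obtained through a socket.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("socket closed before the full image arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The returned bytes would then be decoded by whatever image library the text recognition step uses.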

In this embodiment, if the user is gazing at a physical book, the image of the user's gaze target may be acquired by controlling a camera to photograph the physical book.

The camera is placed in front of or above the physical book, facing it, so that the content of the book is captured in full. The camera may photograph the content on receiving a capture instruction, or photograph it continuously in real time.

Text recognition on the image may use existing text recognition technology, which is not described further here. In this embodiment, text recognition must yield not only the text contained in the image but also the position of each character in the image. The position can be represented by the coordinates of the center point of the rectangular box enclosing the character. Specifically, the rectangular box corresponding to each recognized character is obtained, then the coordinates of the center of that box are obtained and taken as the position of the character.
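
A minimal sketch of the position representation described above, assuming the text recognition step returns, for each character, a bounding box given as left, top, width, and height in image pixels. The `RecognizedChar` type and `char_positions` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RecognizedChar:
    """One recognized character and its enclosing rectangle (image pixels)."""
    text: str
    left: float
    top: float
    width: float
    height: float

    @property
    def center(self) -> Tuple[float, float]:
        # The character's position is the center point of its rectangle.
        return (self.left + self.width / 2, self.top + self.height / 2)


def char_positions(chars: List[RecognizedChar]) -> List[Tuple[str, Tuple[float, float]]]:
    """Pair each recognized character with its center-point position."""
    return [(c.text, c.center) for c in chars]
```

For example, a character whose box starts at (10, 20) and measures 40 by 40 pixels has the position (30.0, 40.0).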

Step 120: perform gaze tracking on the user to obtain the user's gaze point information.

In this embodiment, gaze tracking on the user may use the gaze tracking techniques described above. A gaze tracking module (such as an eye tracker) can be installed to track the user's gaze.

Step 130: map the gaze point information into the image to obtain the target text the user is currently gazing at.

The gaze point information can be understood as the intersection of the user's line of sight with the gazed image. Specifically, the mapping may proceed by obtaining the coordinate information corresponding to the gaze point information in the image, and then determining the target text the user is currently gazing at based on that coordinate information.

In this embodiment, the position of each character in the image is obtained together with the characters themselves. Once the coordinates of the gaze point in the image are known, the character located at those coordinates can be looked up; that character is the target text.
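
The mapping in step 130 can be sketched as below. The sketch makes several assumptions not stated in the patent: the gaze tracker reports a gaze point normalized to the range 0 to 1 over the area covered by the image, the lookup first tests which character box contains the point, and it falls back to the nearest box center within a hypothetical tolerance `max_dist`. The names `CharBox`, `to_image_coords`, and `find_target` are illustrative only.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class CharBox:
    """A recognized character with its rectangle in image pixel coordinates."""
    text: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def center(self) -> Tuple[float, float]:
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)


def to_image_coords(gaze_norm: Tuple[float, float],
                    image_size: Tuple[int, int]) -> Tuple[float, float]:
    """Convert a normalized gaze point (assumed 0..1) to image pixels."""
    gx, gy = gaze_norm
    width, height = image_size
    return (gx * width, gy * height)


def find_target(boxes: Sequence[CharBox], point: Tuple[float, float],
                max_dist: float = 30.0) -> Optional[str]:
    """Return the character at the gaze point, or None if nothing is near."""
    x, y = point
    for box in boxes:
        if box.contains(x, y):
            return box.text
    # Fallback (an assumption): nearest center within max_dist pixels,
    # which tolerates small gaze estimation errors.
    best, best_dist = None, max_dist
    for box in boxes:
        cx, cy = box.center()
        dist = math.hypot(cx - x, cy - y)
        if dist <= best_dist:
            best, best_dist = box.text, dist
    return best
```

A real system would also need a calibration step that relates the tracker's coordinate frame to the image; that step is outside this sketch.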

Step 140: convert the target text into audio for playback.

Specifically, once the target text is determined, a text-to-speech module can be called to convert it into audio and play it.

Optionally, converting the target text into audio for playback may proceed by determining whether the duration of the user's gaze on the target text exceeds a set value and, if it does, converting the target text into audio for playback.

The set value may be between 1 and 2 seconds. A gaze on the target text that lasts longer than the set value suggests, to some extent, that the user may not recognize the text; only then is it converted into audio and played, which gives automatic control of assisted reading.
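
The dwell-time check can be sketched as a small state machine that is fed the current target text and a timestamp on every gaze sample. The class name `DwellTrigger`, the default of 1.5 seconds (chosen from the 1 to 2 second range given above), and the choice to fire only once per continuous gaze are assumptions made for illustration.

```python
from typing import Optional


class DwellTrigger:
    """Report a target once the gaze has rested on it longer than a threshold."""

    def __init__(self, threshold_s: float = 1.5) -> None:
        self.threshold_s = threshold_s
        self._current: Optional[str] = None
        self._since = 0.0
        self._fired = False

    def update(self, target: Optional[str], now_s: float) -> Optional[str]:
        """Feed one gaze sample; return the target text when it should be played."""
        if target != self._current:
            # Gaze moved to a different target (or away): restart the timer.
            self._current = target
            self._since = now_s
            self._fired = False
            return None
        if (target is not None and not self._fired
                and now_s - self._since > self.threshold_s):
            self._fired = True  # avoid replaying while the gaze stays put
            return target
        return None
```

In use, each value returned by `update` would be handed to the text-to-speech step. Identifying a target by its text alone is a simplification; a page that repeats the same character would need a per-box identifier instead.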

In the technical solution of this embodiment, an image of the user's gaze target is acquired and text recognition is performed on it to obtain the text contained in the image; gaze tracking is performed on the user to obtain the user's gaze point information; the gaze point information is mapped into the image to obtain the target text the user is currently gazing at; and the target text is converted into audio for playback. By mapping the gaze point information into the image of the gaze target and playing the resulting target text as audio, the disclosed method enables children to read any text independently, assists reading without parental participation, and makes assisted reading more convenient.

Embodiment 2

FIG. 2 is a schematic structural diagram of an assisted reading system provided by Embodiment 2 of the present invention. As shown in FIG. 2, the system includes: an image acquisition module 210, a text recognition module 220, a gaze tracking module 230, a gaze point information mapping module 240, and an audio playback module 250.

The image acquisition module 210 is configured to acquire an image of the user's gaze target; the text recognition module 220 is configured to perform text recognition on the image to obtain the text contained in the image; the gaze tracking module 230 is configured to perform eye tracking on the user to obtain the user's gaze point information; the gaze point information mapping module 240 is configured to map the gaze point information into the image to obtain the target text the user is currently gazing at; and the audio playback module 250 is configured to convert the target text into audio for playback.
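
One way to picture how the five modules cooperate is to treat each as a pluggable callable and run them in sequence. The sketch below is an assumption about wiring, not the patent's implementation: the class `AssistedReadingSystem`, its field names, and the box tuple layout are hypothetical, and the recognition, tracking, and playback functions are supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# (text, left, top, right, bottom) in image pixel coordinates
Box = Tuple[str, float, float, float, float]


@dataclass
class AssistedReadingSystem:
    acquire_image: Callable[[], bytes]              # image acquisition module 210
    recognize_text: Callable[[bytes], List[Box]]    # text recognition module 220
    track_gaze: Callable[[], Tuple[float, float]]   # gaze tracking module 230
    play_audio: Callable[[str], None]               # audio playback module 250

    def map_gaze(self, boxes: List[Box],
                 gaze: Tuple[float, float]) -> Optional[str]:
        """Gaze point information mapping module 240: simple hit test."""
        x, y = gaze
        for text, left, top, right, bottom in boxes:
            if left <= x <= right and top <= y <= bottom:
                return text
        return None

    def step(self) -> Optional[str]:
        """Run one pass through all modules and return the text played, if any."""
        image = self.acquire_image()
        boxes = self.recognize_text(image)
        gaze = self.track_gaze()
        target = self.map_gaze(boxes, gaze)
        if target is not None:
            self.play_audio(target)
        return target
```

With stub callables, for instance a recognizer that always returns one box and a tracker that always returns a point inside it, `step()` returns that box's text and passes it to the playback callable.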

The gaze tracking module 230 may be an eye tracker. If the image of the user's gaze target is displayed by a display device, the gaze tracking module 230 is integrated on the display device. The display device may be the display screen of any electronic device, such as a television, a desktop computer, or a mobile terminal. By way of example, FIG. 3 is an example diagram of an assisted reading system provided by Embodiment 2 of the present invention. As shown in FIG. 3, the gaze tracking module is mounted on the display device. When the user gazes at the image shown on the display device, the gaze tracking module obtains the user's gaze point information; the gaze point information is mapped onto the image to determine the text corresponding to the gaze point, and that text is converted into audio and played through the playback module. The playback module may be a speaker or earphones. In this embodiment, the gaze point may or may not be shown on the display device.

The image acquisition module 210 is further configured to: acquire the source of the image; and acquire the image of the user's gaze target according to the source.

Optionally, the system further includes a camera, which is configured to photograph the content the user is gazing at, obtain an image of the user's gaze target, and send the image to the image acquisition module 210.

Here the user is gazing at a physical book, or the image of the user's gaze target is displayed by a display device.

By way of example, FIG. 4 is an example diagram of another assisted reading system provided by Embodiment 2 of the present invention. As shown in FIG. 4, the user is gazing at a physical book, and the gaze tracking module and the camera are both placed on the desk. The camera captures an image of the physical book and sends it to the image acquisition module, and the gaze tracking module obtains the user's gaze point information. The text recognition module performs text recognition on the image to obtain the text it contains. The gaze point information mapping module maps the gaze point information into the image to obtain the target text the user is currently gazing at. The audio playback module converts the target text into audio for playback.

In this embodiment, the audio playback module may be integrated with the camera and the gaze tracking module, or it may be a separate module.

Optionally, the camera and the gaze tracking module may also be integrated on a wearable device, such as glasses or a helmet. By way of example, FIG. 5a and FIG. 5b are schematic structural diagrams of a wearable device provided by an embodiment of the present invention. As shown in FIG. 5a and FIG. 5b, the glasses carry a front-facing camera and a gaze tracking module. Once the user puts on the glasses, the gaze tracking module obtains the user's gaze point information in real time, and the front-facing camera captures the image of the user's gaze target. The image acquisition module, text recognition module, gaze point information mapping module, and audio playback module may all be integrated on the wearable device. Alternatively, the wearable device may connect, by wire or wirelessly, to a device that integrates those four modules and send it the gaze point information and the image of the user's gaze target, so that the device processes them to provide assisted reading.
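
When the wearable device forwards its data to a separate processing device, the gaze point and the captured image have to travel together in some message format. The patent does not define one; the sketch below assumes a JSON message with the image bytes base64-encoded, and the names `pack_frame` and `unpack_frame` are hypothetical.

```python
import base64
import json
from typing import Tuple


def pack_frame(gaze: Tuple[float, float], image_bytes: bytes) -> bytes:
    """Bundle one gaze point and one encoded image into a single message."""
    payload = {
        "gaze": {"x": gaze[0], "y": gaze[1]},
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")


def unpack_frame(message: bytes) -> Tuple[Tuple[float, float], bytes]:
    """Recover the gaze point and image bytes on the processing device."""
    payload = json.loads(message.decode("utf-8"))
    gaze = (payload["gaze"]["x"], payload["gaze"]["y"])
    return gaze, base64.b64decode(payload["image"])
```

Pairing the gaze point with the exact frame it was measured against keeps the mapping step consistent even if the link between the two devices adds delay.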

The assisted reading system provided by this embodiment includes an image acquisition module, a text recognition module, a gaze tracking module, a gaze point information mapping module, and an audio playback module. The image acquisition module acquires an image of the user's gaze target; the text recognition module performs text recognition on the image to obtain the text it contains; the gaze tracking module performs eye tracking on the user to obtain the user's gaze point information; the gaze point information mapping module maps the gaze point information into the image to obtain the target text the user is currently gazing at; and the audio playback module converts the target text into audio for playback. By mapping the gaze point information into the image of the gaze target and playing the resulting target text as audio, the system enables children to read any text independently, assists reading without parental participation, and makes assisted reading more convenient.

Embodiment 3

An embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processing device, it implements the assisted reading method of the embodiments of the present invention. The computer-readable medium may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these.

In this disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to electrical wire, optical cable, RF (radio frequency), or any suitable combination of these.

In some embodiments, the client and the server may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (for example, the Internet), and peer-to-peer networks (for example, ad hoc peer-to-peer networks), as well as any currently known or future-developed network.

The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.

The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire an image of the user's gaze target and perform text recognition on the image to obtain the text contained in the image; perform gaze tracking on the user to obtain the user's gaze point information; map the gaze point information into the image to obtain the target text that the user is currently gazing at; and convert the target text into audio for playback.
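The four operations listed above can be sketched as a single pass in Python. This is only an illustrative sketch, not the implementation disclosed in the patent: the names `TextBox`, `find_target_text`, and `assist_reading_once` are hypothetical, the four callables stand in for the camera or display source, the text recognizer, the eye tracker, and the speech output, and the gaze point is assumed to have already been expressed in the pixel coordinates of the image.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class TextBox:
    """One recognized piece of text and its bounding box in image pixels (hypothetical type)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def find_target_text(boxes: List[TextBox], gaze_xy: Tuple[float, float]) -> Optional[str]:
    """Return the text of the first box that contains the gaze point, or None."""
    x, y = gaze_xy
    for box in boxes:
        if box.contains(x, y):
            return box.text
    return None


def assist_reading_once(
    acquire_image: Callable[[], Any],
    recognize_text: Callable[[Any], List[TextBox]],
    track_gaze: Callable[[], Tuple[float, float]],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run the four steps once; all four callables are placeholders supplied by the caller."""
    image = acquire_image()                    # step 1a: image of the gaze target
    boxes = recognize_text(image)              # step 1b: text recognition on the image
    gaze_xy = track_gaze()                     # step 2: gaze point (assumed in image coordinates)
    target = find_target_text(boxes, gaze_xy)  # step 3: map the gaze point onto the text
    if target is not None:
        speak(target)                          # step 4: convert the target text to audio
    return target
```

Passing the steps in as callables keeps the sketch independent of any particular camera, recognition engine, eye tracker, or speech library, none of which the source names.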

Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
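As an illustration of two successive blocks being executed substantially in parallel, the following Python sketch runs two placeholder steps on a thread pool. The source does not say which blocks may run concurrently; the sketch assumes, purely for illustration, that text recognition and gaze tracking do not consume each other's output. The function names and the returned values are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_text_step():
    """Stand-in for image acquisition plus text recognition (hypothetical)."""
    return ["hello", "world"]


def track_gaze_step():
    """Stand-in for gaze tracking (hypothetical); returns an (x, y) point."""
    return (12.0, 34.0)


def run_steps_concurrently():
    # Neither step uses the other's result, so they can be submitted together
    # and joined before the mapping step that needs both outputs.
    with ThreadPoolExecutor(max_workers=2) as pool:
        text_future = pool.submit(recognize_text_step)
        gaze_future = pool.submit(track_gaze_step)
        return text_future.result(), gaze_future.result()
```

The results are the same as running the two steps one after the other in either order, which is the property the paragraph above relies on.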

The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.

The functions described above may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and the like.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

Note that the above describes only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include further equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.

Claims

1. An auxiliary reading method, characterized by comprising:

acquiring an image of the user's gaze target, and performing text recognition on the image to obtain the text contained in the image;

performing gaze tracking on the user to obtain the user's gaze point information;

mapping the gaze point information into the image to obtain the target text that the user is currently gazing at; and

converting the target text into audio for playback.

2. The method according to claim 1, wherein acquiring the image of the user's gaze target comprises:

if the image of the user's gaze target is displayed through a display device, acquiring the source of the image, and

obtaining the image of the user's gaze target according to the source; or,

photographing the displayed content with a camera to obtain the image of the user's gaze target.

3. The method according to claim 1, wherein acquiring the image of the user's gaze target comprises:

if the user's gaze target is a physical book, photographing the physical book that the user is gazing at with a camera to obtain the image of the user's gaze target.

4. The method according to claim 1, wherein mapping the gaze point information into the image to obtain the target text that the user is currently gazing at comprises:

obtaining coordinate information corresponding to the gaze point information in the image; and

determining, based on the coordinate information, the target text that the user is currently gazing at.

5. The method according to claim 1, wherein converting the target text into audio for playback comprises:

determining whether the duration for which the user gazes at the target text exceeds a set value; and

if so, converting the target text into audio for playback.

6. An auxiliary reading system, comprising: an image acquisition module, a text recognition module, a gaze tracking module, a gaze point information mapping module, and an audio playback module;

the image acquisition module is used to acquire an image of the user's gaze target; the text recognition module is used to perform text recognition on the image to obtain the text contained in the image; the gaze tracking module is used to perform gaze tracking on the user to obtain the user's gaze point information; the gaze point information mapping module is used to map the gaze point information into the image to obtain the target text that the user is currently gazing at; and the audio playback module is used to convert the target text into audio for playback.

7. The system according to claim 6, wherein the system further comprises a camera; the camera is used to photograph the content that the user is gazing at, obtain an image of the user's gaze target, and send the image to the image acquisition module; wherein the user is gazing at a physical book, or the image of the user's gaze target is displayed on a display device.

8. The system according to claim 6 or 7, wherein if the image of the user's gaze target is displayed on a display device, the gaze tracking module is integrated on the display device;

The image acquisition module is further configured to: acquire the source of the image; and acquire the image of the user's gaze target according to the source.

9. The system of claim 7, wherein the camera and the gaze tracking module are integrated on a wearable device.

10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processing device, the auxiliary reading method according to any one of claims 1-5 is implemented.
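The dwell-time condition of claim 5 can be illustrated with a small Python sketch. It is a hedged example under stated assumptions rather than the claimed implementation: the class name `DwellGate`, the `update` method, and the timestamps in seconds are hypothetical, and the choice to play a given target only once per continuous gaze is a design assumption that the claims do not specify.

```python
from typing import Optional


class DwellGate:
    """Releases a target for playback once it has been gazed at longer than a set value."""

    def __init__(self, threshold_s: float) -> None:
        self.threshold_s = threshold_s        # the "set value" of claim 5, in seconds
        self._target: Optional[str] = None    # target currently being gazed at
        self._since: float = 0.0              # time at which that gaze started
        self._fired: bool = False             # whether this gaze already triggered playback

    def update(self, target: Optional[str], now_s: float) -> Optional[str]:
        """Feed the current gaze target and a timestamp; return text to play, or None."""
        if target != self._target:
            # The gaze moved to a different target (or away from text): restart the timer.
            self._target = target
            self._since = now_s
            self._fired = False
            return None
        if target is not None and not self._fired and now_s - self._since > self.threshold_s:
            self._fired = True                # assumption: play once per continuous gaze
            return target
        return None
```

A caller would invoke `update` for each gaze sample and pass any returned text to the audio conversion step; brief glances that end before the threshold never reach playback.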