WO2023272656A1 - Picture book recognition method and apparatus, family education machine, and storage medium

Publication number
WO2023272656A1
WO2023272656A1 PCT/CN2021/103859 CN2021103859W WO2023272656A1 WO 2023272656 A1 WO2023272656 A1 WO 2023272656A1 CN 2021103859 W CN2021103859 W CN 2021103859W WO 2023272656 A1 WO2023272656 A1 WO 2023272656A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture book
standard
target
area
book page
Application number
PCT/CN2021/103859
Other languages
French (fr)
Chinese (zh)
Inventor
张明云
Original Assignee
东莞市小精灵教育软件有限公司
Application filed by 东莞市小精灵教育软件有限公司
Priority to PCT/CN2021/103859
Publication of WO2023272656A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied

Definitions

  • The present application relates to the technical field of image processing, and in particular to a picture book recognition method and device, a tutoring machine and a storage medium.
  • The tutoring machine is an Android tablet that provides high-quality educational resources.
  • The user first places the picture book in front of the tablet, uses the tablet's camera to photograph the picture book, and then uses the picture book app on the tablet to identify the cover and confirm which picture book is to be read.
  • The user then turns the pages of the picture book and points a finger at text on a page, and the picture book app instantly displays the text, audio and video content corresponding to that text. This implements the picture book literacy function, helps users understand the content of the picture book, and achieves assisted reading of picture books through electronic devices such as tablets.
  • This reduces the difficulty children have in reading picture books and frees parents from repeatedly guiding and assisting them.
  • The current picture book recognition scheme for tutoring machines has the following defects:
  • artistic characters cannot be recognized;
  • A picture book recognition method, comprising:
  • obtaining a picture book standard library, where the standard library includes a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, each of the standard blocks is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to preset division rules;
  • determining the click area by using a fingertip positioning method;
  • searching the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block containing the target area, and determining it as the target block;
  • determining a picture book recognition result based on the target block.
  • a picture book recognition device comprising:
  • the obtaining module is used to obtain a standard library of picture books.
  • the standard library includes a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages.
  • Each of the standard blocks is marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to preset division rules;
  • the collection module is used to collect the captured current picture book page and obtain the shooting pixels of the current picture book page when the click operation of the picture book by the user is detected;
  • a retrieval module configured to search in the standard library, determine the standard picture book page corresponding to the current picture book page as the target picture book page, and obtain the standard pixels of the target picture book page;
  • a positioning module configured to determine the click area by using a fingertip positioning method according to the click position corresponding to the click operation
  • a conversion module configured to convert the clicked area into a target area consistent with the coordinate system corresponding to the standard coordinates based on the shooting pixels and the standard pixels;
  • a search module configured to search for a standard block containing the target area from each of the standard coordinates of the plurality of standard blocks corresponding to the target picture book page and determine it as the target block;
  • a recognition module configured to determine a picture book recognition result based on the target block.
  • a tutoring machine includes a memory and a processor, the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the processor performs the following steps:
  • obtaining a picture book standard library, where the standard library includes a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, each of the standard blocks is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to preset division rules;
  • determining the click area by using a fingertip positioning method;
  • searching the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block containing the target area, and determining it as the target block;
  • a picture book recognition result is determined based on the target block.
  • a computer-readable medium storing computer-readable instructions, which, when executed by a processor, cause the processor to perform the following steps:
  • obtaining a picture book standard library, where the standard library includes a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, each of the standard blocks is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to preset division rules;
  • determining the click area by using a fingertip positioning method;
  • searching the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block containing the target area, and determining it as the target block;
  • a picture book recognition result is determined based on the target block.
  • the standard library includes a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, each of the standard blocks is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to preset division rules; when a click operation by the user on the picture book is detected, the captured current picture book page is collected and the shooting pixels of the current picture book page are obtained; the standard library is searched, the standard picture book page corresponding to the current picture book page is determined as the target picture book page, and the standard pixels of the target picture book page are obtained; according to the click position corresponding to the click operation, the click area is determined by using the fingertip positioning method; based on the shooting pixels and the standard pixels, the click area is converted into a target area consistent with the coordinate system corresponding to the standard coordinates; from each of the standard coordinates of the plurality of standard blocks corresponding to the target picture book page,
  • Fig. 1 is a flowchart of a picture book recognition method in one embodiment;
  • Fig. 2 is a flowchart of a picture book recognition method in another embodiment;
  • Fig. 3 is a flowchart of determining the picture book recognition result in one embodiment;
  • Fig. 4 is a flowchart of a method for determining a target area in one embodiment;
  • Fig. 5 is a flowchart of a method for determining a click area in one embodiment;
  • Fig. 6 is a structural block diagram of a picture book recognition device in one embodiment;
  • Fig. 7 is a structural block diagram of the tutoring machine in one embodiment.
  • a method for identifying picture books is provided, and the method for identifying picture books can be applied to both a terminal and a server.
  • This embodiment uses the application to a server as an example for illustration.
  • the picture book identification method specifically includes the following steps:
  • Step 102 obtain the picture book standard library
  • The standard library contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page. Each standard block is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to preset division rules.
  • the standard block refers to an area in a standard picture book page in the picture book.
  • a standard picture book page can be a picture scanned by a scanning device.
  • The preset division rule can divide a standard picture book page according to its content, so that each resulting standard block contains at least one of a character area, a picture area or an artistic word area, where a character area can be an area containing a single piece of text or an area containing multiple pieces of text.
  • Each standard picture book page can be identified based on feature extraction and divided according to the identified content to obtain the standard blocks, and the corresponding standard coordinates are marked for each standard block. It can be understood that in this embodiment each standard picture book page is divided in advance and marked with standard coordinates, so that each standard block can be located and further processing can be performed based on the standard block.
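One way to picture the standard library described above is as pages that each carry a list of blocks with bounding boxes. The following is a minimal sketch only: the class names, the field names and the choice of an axis-aligned `(x_min, y_min, x_max, y_max)` box as the "standard coordinates" are assumptions made for illustration and are not specified in the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StandardBlock:
    # Hypothetical fields; the application only says each block has
    # "standard coordinates" and contains text, a picture or artistic words.
    block_id: str
    kind: str    # assumed labels: "text", "picture" or "art_text"
    box: tuple   # assumed form: (x_min, y_min, x_max, y_max) in page pixels


@dataclass
class StandardPage:
    page_id: str
    width: int   # standard pixels of the scanned page (assumed width/height)
    height: int
    blocks: list = field(default_factory=list)


def build_library(pages):
    """Index the standard picture book pages by their id."""
    return {page.page_id: page for page in pages}


# Illustrative data only.
page = StandardPage("book1-p3", 1200, 800, [
    StandardBlock("b1", "text", (100, 50, 600, 150)),
    StandardBlock("b2", "picture", (100, 200, 1100, 750)),
])
library = build_library([page])
```

Keeping the page size next to the blocks makes it available later, when a clicked area in the photographed image has to be mapped into the same coordinate system.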
  • Step 104 when the user's click operation on the picture book is detected, the captured current picture book page is collected, and the shooting pixels of the current picture book page are acquired.
  • The current picture book page refers to the picture of the picture book currently clicked by the user that needs to be identified.
  • Shooting pixels refer to the pixel information of the current picture book page.
  • When the server detects the user's click operation on the picture book, the current picture book page is captured by the camera device, and the pixel information of the current picture book page is acquired.
  • Step 106 search in the standard library, determine the standard picture book page corresponding to the current picture book page as the target picture book page, and obtain the standard pixels of the target picture book page.
  • the target picture book page refers to a standard picture book page that is consistent with the content of the current picture book page.
  • Standard pixels refer to the pixel information of the target picture book page.
  • Image comparison methods can be used: for example, the image features of the current picture book page and of each standard picture book page are extracted, the standard picture book page whose image features match those of the current picture book page is determined as the target picture book page, and the standard pixels of the target picture book page are obtained.
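The retrieval step can be sketched as nearest-neighbour matching on image features. The application does not say which features are used, so the grey-level histogram and cosine similarity below are stand-in assumptions chosen only because they need no external library; the function names are hypothetical.

```python
import math


def histogram(pixels, bins=4):
    """Normalised grey-level histogram of a 2-D list of values in 0..255."""
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(current_page, standard_pages):
    """Return the id of the standard page whose features best match."""
    feature = histogram(current_page)
    return max(
        standard_pages,
        key=lambda page_id: cosine(feature, histogram(standard_pages[page_id])),
    )
```

A real system would likely use more robust local features, since a photographed page differs from its scan in lighting and perspective; the structure of "extract features, pick the best match" stays the same.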
  • Step 108 according to the click position corresponding to the click operation, use the fingertip positioning method to determine the click area.
  • the fingertip positioning method refers to a positioning method that detects the position of the hand in the image and locates the coordinate information of the fingertip.
  • The fingertip positioning method can be an image recognition positioning method based on deep learning, or a positioning method based on feature extraction.
  • A positioning method based on feature extraction is selected, which avoids the cumbersome deep-learning image recognition and the increase in fingertip positioning time it would cause.
  • Step 110 based on the captured pixels and the standard pixels, transform the clicked area into a target area consistent with the coordinate system corresponding to the standard coordinates.
  • The proportional relationship between the shooting pixels and the standard pixels may be calculated, and the clicked area is then transformed, according to that proportional relationship, into a target area consistent with the coordinate system corresponding to the standard coordinates. Understandably, since the current picture book page is captured by a camera device, the shape and quality of the captured page are affected by the limitations of the camera device. For example, the captured current picture book page may be a trapezoidal picture that is large at the top and small at the bottom.
  • The clicked area is converted into a target area consistent with the coordinate system corresponding to the standard coordinates, which ensures the accuracy of the clicked area and therefore of the corresponding target area, so that picture book recognition can subsequently be performed on the target area of the target picture book page.
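The proportional conversion mentioned above can be sketched as a per-axis scale factor. This assumes that "shooting pixels" and "standard pixels" can be read as `(width, height)` sizes and that the clicked area is an axis-aligned box; both are interpretations for illustration, and `scale_area` is a hypothetical name.

```python
def scale_area(click_box, shot_size, std_size):
    """Scale a box from photographed-image pixels to standard-page pixels.

    click_box: (x_min, y_min, x_max, y_max) in the photographed image.
    shot_size, std_size: (width, height) of the photo and the standard page.
    """
    scale_x = std_size[0] / shot_size[0]
    scale_y = std_size[1] / shot_size[1]
    x0, y0, x1, y1 = click_box
    return (x0 * scale_x, y0 * scale_y, x1 * scale_x, y1 * scale_y)
```

Pure scaling only holds when the photographed page fills the frame without skew; for the trapezoid case the text describes, a matrix-based mapping is the more general tool.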
  • Step 112 searching for a standard block containing the target area from the standard coordinates of the multiple standard blocks corresponding to the target picture book page and determining it as the target block.
  • The target block refers to the area of a standard picture book page on which picture book recognition is to be performed. Specifically, after the target area is determined, the standard coordinates of the multiple standard blocks corresponding to the target picture book page are searched, using the coordinates of the target area, for the standard block that contains the target area. That standard block is taken as the target block, so that subsequent recognition can be performed efficiently based on the target block.
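The lookup in Step 112 amounts to a rectangle containment test. The sketch below assumes axis-aligned boxes of the form `(x_min, y_min, x_max, y_max)` and a plain dictionary of block ids to boxes; these representations and the function names are illustrative assumptions.

```python
def contains(block_box, area):
    """True if `area` lies entirely inside `block_box`."""
    bx0, by0, bx1, by1 = block_box
    ax0, ay0, ax1, ay1 = area
    return bx0 <= ax0 and by0 <= ay0 and ax1 <= bx1 and ay1 <= by1


def find_target_block(blocks, target_area):
    """Return the id of the first standard block containing the target area.

    blocks: mapping of block id to its standard-coordinate box.
    Returns None when no block contains the area.
    """
    for block_id, box in blocks.items():
        if contains(box, target_area):
            return block_id
    return None
```

Returning `None` leaves room for the caller to treat a miss as a click outside every block.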
  • Step 114 determine the picture book recognition result based on the target block.
  • The picture is cropped according to the target area in the target block, OCR is performed on the cropped area, and the picture book recognition result is obtained. It can be understood that in this embodiment the target area is used to recognize the standard block of the standard picture book page; since the quality of the standard picture book page is higher than that of the captured picture, recognition of the remotely captured picture is avoided, which greatly improves the efficiency of picture book recognition.
  • The above picture book recognition method obtains the picture book standard library, which contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, where each standard block is marked with different standard coordinates and the standard blocks are obtained by dividing each standard picture book page according to preset division rules; when the user's click operation on the picture book is detected, collects the captured current picture book page and obtains the shooting pixels of the current picture book page; searches the standard library, determines the standard picture book page corresponding to the current picture book page as the target picture book page, and obtains the standard pixels of the target picture book page; according to the click position corresponding to the click operation, uses the fingertip positioning method to determine the click area; based on the shooting pixels and the standard pixels, converts the clicked area into a target area consistent with the coordinate system corresponding to the standard coordinates; finds the standard block containing the target area from each standard coordinate of the multiple standard blocks corresponding to the target picture book page and determines it as the target block; based
  • Step 116 based on the click area, intercept the first picture book page from the current picture book page;
  • Step 118 respectively extracting the first text information of the first picture book page and the second text information of the target block;
  • Step 120 judging whether the first text information matches the second text information
  • Step 122 if not matching, determine that the recognition result is that the clicked area is a blank area.
  • The first picture book page is cropped from the current picture book page according to the clicked area, and OCR is performed on the first picture book page and on the target block respectively to obtain the first text information of the first picture book page and the second text information of the target block.
  • It is then judged whether the first text information matches the second text information. If they do not match, there is no picture book information matching the clicked area, so the recognition result is determined to be that the clicked area is a blank area. Further, after the clicked area is determined to be a blank area, a new click area can be obtained for picture book recognition, or positioning can be repeated for picture book recognition, to improve the efficiency of picture book recognition.
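The blank-area check in Steps 116 to 122 can be sketched as a text comparison. The application does not define what "matches" means, so the rule below (the normalised first text must be non-empty and occur inside the normalised second text) is an assumption for illustration, as are the function names; the OCR step itself is outside this sketch.

```python
import re


def normalize(text):
    """Lower-case the text and drop whitespace and punctuation."""
    return re.sub(r"[\W_]+", "", text.lower())


def is_blank_click(first_text, second_text):
    """Decide whether the clicked area should be treated as blank.

    first_text: OCR text from the crop of the photographed page.
    second_text: OCR text from the target block of the standard page.
    """
    first = normalize(first_text)
    second = normalize(second_text)
    return not (first and first in second)
```

With noisy OCR on the photographed crop, a similarity threshold would probably be more forgiving than exact containment.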
  • determining the picture book recognition result based on the target block includes:
  • Step 114A based on the target area, intercept the second picture book page from the target block;
  • Step 114B identify the second picture book page, and obtain the picture book recognition result.
  • Since the second picture book page is an area of a standard picture book page, its picture quality is higher than that of the current picture book page. The recognition accuracy for the second picture book page is therefore greatly improved, which guarantees the accuracy of the picture book recognition result.
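Steps 114A and 114B can be sketched as a crop followed by a recognizer call. The page is modelled here as a 2-D list and the recognizer is passed in as a function, because the application does not name a specific OCR engine; `crop`, `recognize_block` and the box convention (end coordinates exclusive) are assumptions.

```python
def crop(image, box):
    """Cut the region (x_min, y_min, x_max, y_max) out of a 2-D list."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def recognize_block(standard_page, target_area, recognizer):
    """Crop the target area from the standard page and recognize it.

    recognizer: any callable that turns a cropped region into a result,
    standing in for an OCR engine.
    """
    return recognizer(crop(standard_page, target_area))
```

Because the crop comes from the scanned standard page and not from the photograph, the recognizer sees the cleaner image, which is the point the text makes.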
  • the clicked area is transformed into a target area consistent with the coordinate system corresponding to the standard coordinates, including:
  • Step 110A calculating the mapping transformation matrix based on the captured pixels and the standard pixels
  • step 110B the clicked area is subjected to coordinate transformation processing according to the mapping transformation matrix to obtain the target area.
  • The affine transformation matrix between the current picture book page and the standard picture book page can be calculated from the mapping relationship between the captured pixels and the standard pixels and used as the mapping transformation matrix; the coordinates of the clicked area are then transformed according to the affine transformation matrix to obtain the target area. It can be understood that in this embodiment the accuracy of the target area is guaranteed by performing an affine transformation on the coordinates of the clicked area.
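An affine matrix of this kind can be estimated from three point correspondences between the photographed page and the standard page. The sketch below assumes such correspondences are available (for example, matched page corners); how they are obtained is not described in the application, and the function names are hypothetical. It solves u = a·x + b·y + c and v = d·x + e·y + f with Cramer's rule.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a * s = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("source points are collinear")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(det3(m) / d)
    return solution


def affine_from_points(src, dst):
    """Return the 2x3 affine matrix mapping three src points onto dst."""
    a = [[x, y, 1.0] for x, y in src]
    row_u = solve3(a, [u for u, _ in dst])
    row_v = solve3(a, [v for _, v in dst])
    return [row_u, row_v]


def apply_affine(matrix, point):
    """Map one (x, y) point through the 2x3 affine matrix."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


def transform_area(matrix, box):
    """Map the two corners of an (x_min, y_min, x_max, y_max) box."""
    x0, y0 = apply_affine(matrix, (box[0], box[1]))
    x1, y1 = apply_affine(matrix, (box[2], box[3]))
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))
```

Mapping only two corners keeps the result a box, which is adequate when the transform has little rotation or shear; otherwise all four corners would need to be mapped and bounded.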
  • the method further includes: respectively identifying and semantically analyzing each standard block in the standard library, and generating a picture book interpretation information mapping table, each standard block corresponding to a piece of picture book interpretation information.
  • each standard block can be identified and semantically analyzed in advance.
  • For example, the text content, picture content or artistic word content can be voice analyzed to generate the picture book interpretation information corresponding to each standard block, and the picture book interpretation information mapping table is stored in the server.
  • the method further includes: obtaining the target picture book interpretation information corresponding to the target block from the picture book interpretation information mapping table; and displaying the target picture book interpretation information.
  • The target picture book interpretation information corresponding to the target block is obtained from the picture book interpretation information mapping table and displayed. This greatly improves the user experience and the user's ability to understand the picture book content during picture book reading.
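The interpretation information mapping table can be sketched as a dictionary keyed by page and block. The key shape, the record fields (`text`, `audio`) and the sample values below are purely illustrative assumptions; the application only states that each standard block corresponds to one piece of interpretation information.

```python
def build_interpretation_table(entries):
    """Build the mapping table from (page_id, block_id, info) triples."""
    return {(page_id, block_id): info for page_id, block_id, info in entries}


def lookup_interpretation(table, page_id, block_id):
    """Return the interpretation info for a target block, or None."""
    return table.get((page_id, block_id))


# Illustrative records only.
table = build_interpretation_table([
    ("book1-p3", "b1", {"text": "cat", "audio": "cat.mp3"}),
    ("book1-p3", "b2", {"text": "a cat on a mat", "audio": "scene.mp3"}),
])
```

Because the table is generated in advance, the lookup at click time is a constant-time dictionary access and needs no recognition work.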
  • the click area is determined by using the fingertip positioning method, including:
  • Step 108A acquiring the clicked image including the click operation performed by the finger
  • Step 108B performing edge detection on the clicked image to obtain finger contour features
  • step 108C the click area is determined based on the contour features of the finger.
  • A click image that includes the finger performing the click operation is acquired, and edge detection is performed on the click image to obtain the finger contour features.
  • The edge detection methods include but are not limited to the Sobel operator, the Laplacian operator and the Canny operator.
  • The fingertip of the index finger is located based on the finger contour features, and the coordinate information of the hand and of the index fingertip is returned to determine the click area; alternatively, the click area can be determined from the coordinate information of the fingertip of the index finger, the middle joint of the index finger, the root of the index finger, the middle joint of the middle finger and the root of the middle finger, detected based on the finger contour features.
  • the precise positioning of the clicked area is realized through the method of edge detection, and the efficiency of fingertip positioning is improved.
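Steps 108A to 108C can be sketched with a Sobel operator on a small grey-level grid. The rule used here to pick the fingertip (the centre of the topmost row of edge pixels, assuming the finger points toward the top of the image) and the fixed-size box around it are assumptions for illustration; the application does not specify how the contour is turned into a click area.

```python
def sobel_edges(img, threshold=2):
    """Mark pixels whose Sobel response |gx| + |gy| reaches the threshold."""
    height, width = len(img), len(img[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


def fingertip(edges):
    """Return (x, y) at the centre of the topmost row containing edges."""
    for y, row in enumerate(edges):
        xs = [x for x, value in enumerate(row) if value]
        if xs:
            return (sum(xs) // len(xs), y)
    return None


def click_area(tip, half=1):
    """Box of side 2 * half + 1 centred on the fingertip."""
    x, y = tip
    return (x - half, y - half, x + half, y + half)
```

Note that the topmost edge response sits on the boundary just above the finger pixels, so the returned point is the contour's tip, not a pixel inside the finger.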
  • As shown in Figure 6, in one embodiment, a picture book recognition device is proposed, and the device includes:
  • the acquiring module 602 is used to acquire a picture book standard library, the standard library includes a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, and each of the standard blocks is marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to preset division rules;
  • the collection module 604 is used to collect the current picture book page obtained by shooting when the user's click operation on the picture book is detected, and obtain the shooting pixels of the current picture book page;
  • a retrieval module 606 configured to search in the standard library, determine the standard picture book page corresponding to the current picture book page as the target picture book page, and obtain the standard pixels of the target picture book page;
  • a positioning module 608, configured to determine the click area by using a fingertip positioning method according to the click position corresponding to the click operation;
  • a conversion module 610 configured to convert the clicked area into a target area consistent with the coordinate system corresponding to the standard coordinates based on the photographing pixels and the standard pixels;
  • a search module 612 configured to search for a standard block containing the target area from each of the standard coordinates of the plurality of standard blocks corresponding to the target picture book page and determine it as the target block;
  • a recognition module 614 configured to determine a picture book recognition result based on the target block.
  • the device also includes:
  • An intercepting module configured to intercept a first picture book page from the current picture book page based on the click area
  • An extraction module configured to extract the first text information of the first picture book page and the second text information of the target block respectively;
  • a matching module configured to determine whether the first text information matches the second text information
  • a determining module configured to determine that the clicked area is a blank area as a result of the recognition if there is no match.
  • the recognition module includes:
  • An intercepting unit configured to intercept a second picture book page from the target block based on the target area
  • the identification unit is configured to identify the second picture book page to obtain the picture book identification result.
  • the conversion module includes:
  • a calculation unit configured to calculate a mapping transformation matrix based on the captured pixels and the standard pixels
  • the transformation unit is configured to perform coordinate transformation processing on the clicked area according to the mapping transformation matrix to obtain the target area.
  • the method further includes: respectively identifying and semantically analyzing each standard block in the standard library, and generating a picture book interpretation information mapping table, each standard block corresponding to a piece of picture book interpretation information.
  • the device also includes:
  • a search unit configured to obtain the target picture book interpretation information corresponding to the target block from the picture book interpretation information mapping table
  • a display unit configured to display the target picture book interpretation information.
  • the positioning module includes:
  • An acquisition unit configured to acquire a click image including a finger click operation
  • An extraction unit configured to perform edge detection on the clicked image to obtain finger contour features
  • a determining unit configured to determine the click area based on the outline features.
  • Fig. 7 shows the internal structure diagram of the tutoring machine in one embodiment.
  • the tutoring machine may specifically be a server, and the server includes but is not limited to a high-performance computer and a cluster of high-performance computers.
  • the tutoring machine includes a processor, a memory and a network interface connected through a system bus.
  • the memory includes a non-volatile storage medium and an internal memory.
  • The non-volatile storage medium of the tutoring machine stores an operating system and also stores computer-readable instructions.
  • When these computer-readable instructions are executed by the processor, the processor can implement the picture book recognition method.
  • Computer-readable instructions may also be stored in the internal memory, and when the computer-readable instructions are executed by the processor, the processor may execute the picture book recognition method.
  • Figure 7 is only a block diagram of the part of the structure related to the solution of this application and does not limit the tutoring machine to which the solution is applied.
  • A specific tutoring machine may include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
  • the picture book recognition method provided in this application can be implemented in the form of a computer-readable instruction, and the computer-readable instruction can be run on the tutoring machine as shown in FIG. 7 .
  • The program modules constituting the picture book recognition device can be stored in the memory of the tutoring machine, for example the acquiring module 602, the collection module 604, the retrieval module 606, the positioning module 608, the conversion module 610, the search module 612 and the recognition module 614.
  • a tutoring machine comprising a memory, a processor, and computer-readable instructions stored in the memory and operable on the processor.
  • When the processor executes the computer-readable instructions, the following steps are implemented: acquiring a picture book standard library, where the standard library includes a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, each of the standard blocks is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to preset division rules; when the user's click operation on the picture book is detected, collecting the captured current picture book page and obtaining the shooting pixels of the current picture book page; searching in the standard library, determining the standard picture book page corresponding to the current picture book page as the target picture book page, and obtaining the standard pixels of the target picture book page; according to the click position corresponding to the click operation, determining the click area by using the fingertip positioning method; based on the shooting pixels and the standard pixels, converting the click area into a target area consistent with
  • Before determining the picture book recognition result based on the target block, the steps further include: intercepting a first picture book page from the current picture book page based on the click area; extracting the first text information of the first picture book page and the second text information of the target block respectively; determining whether the first text information matches the second text information; and if not, determining that the recognition result is that the clicked area is a blank area.
  • Determining the picture book recognition result based on the target block includes: intercepting a second picture book page from the target block based on the target area; and identifying the second picture book page to obtain the picture book recognition result.
  • Converting the click area into a target area consistent with the coordinate system corresponding to the standard coordinates based on the shooting pixels and the standard pixels includes: calculating a mapping transformation matrix based on the shooting pixels and the standard pixels; and performing coordinate transformation processing on the clicked area according to the mapping transformation matrix to obtain the target area.
  • the method further includes: respectively identifying and semantically analyzing each standard block in the standard library, and generating a picture book interpretation information mapping table, each standard block corresponding to a piece of picture book interpretation information.
  • the method further includes: obtaining the target picture book interpretation information corresponding to the target block from the picture book interpretation information mapping table; and displaying the target picture book interpretation information.
  • using the fingertip positioning method to determine the click area includes: acquiring a click image that includes a finger performing a click operation; performing edge detection on the click image to obtain Finger contour features; determining the click area based on the contour features.
  • a computer-readable storage medium storing computer-readable instructions, characterized in that, when the computer-readable instructions are executed by a processor, the following steps are implemented: acquiring a standard picture library of a picture book, the standard picture library containing a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, each of the standard blocks being marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule; when a click operation by the user on the picture book is detected, collecting the current picture book page obtained by shooting, and obtaining the shooting pixels of the current picture book page; searching in the standard picture library, determining the standard picture book page corresponding to the current picture book page as the target picture book page, and obtaining the standard pixels of the target picture book page; determining the click area by a fingertip positioning method according to the click position corresponding to the click operation; based on the shooting pixels and the standard pixels, converting the click area into a target area consistent with the coordinate system
  • before determining the picture book recognition result based on the target block, the steps further include: cropping a first picture book page from the current picture book page based on the click area; extracting first text information of the first picture book page and second text information of the target block, respectively; determining whether the first text information matches the second text information; and if not, determining that the recognition result is that the click area is a blank area.
  • determining the picture book recognition result based on the target block includes: cropping a second picture book page from the target block based on the target area; and recognizing the second picture book page to obtain the picture book recognition result.
  • converting the click area into a target area consistent with the coordinate system corresponding to the standard coordinates based on the shooting pixels and the standard pixels includes: calculating a mapping transformation matrix based on the shooting pixels and the standard pixels; and performing coordinate transformation on the click area according to the mapping transformation matrix to obtain the target area.
  • the method further includes: recognizing and semantically analyzing each standard block in the standard picture library, and generating a picture book interpretation information mapping table, each standard block corresponding to one piece of picture book interpretation information.
  • the method further includes: obtaining the target picture book interpretation information corresponding to the target block from the picture book interpretation information mapping table; and displaying the target picture book interpretation information.
  • determining the click area by the fingertip positioning method includes: acquiring a click image that includes a finger performing the click operation; performing edge detection on the click image to obtain finger contour features; and determining the click area based on the contour features.
  • Nonvolatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in embodiments of the present application is a picture book recognition method, comprising: acquiring a standard picture library of a picture book; upon detecting a click operation of a user on the picture book, collecting the captured current picture book page; searching the standard picture library, and determining the standard picture book page corresponding to the current picture book page as a target picture book page; determining a click area by a fingertip positioning method according to the click position corresponding to the click operation; converting the click area into a target area consistent with the coordinate system of the standard coordinates; searching for the standard block that contains the target area, and determining it as a target block; and determining a picture book recognition result on the basis of the target block. By introducing coordinate positioning and transformation, positioning is performed according to the click area, so that the picture book page is located accurately and recognition is performed on the corresponding area of the standard picture book page. Recognition of the far-end part of the captured image is thereby avoided, and the text recognition rate for the picture book is greatly improved. Also provided are a picture book recognition apparatus, a family education machine, and a storage medium.

Description

Picture book recognition method and apparatus, family education machine, and storage medium
Technical Field
The present application relates to the technical field of image processing, and in particular to a picture book recognition method and apparatus, a family education machine, and a storage medium.
Background
A family education machine is an Android tablet that provides high-quality educational resources. The user first places a picture book in front of the tablet and photographs it with the tablet's camera; the picture book app on the tablet then recognizes the cover to confirm which picture book is to be read. As the user turns the pages and points a finger at text on a page, the app immediately presents the text, audio, and video content corresponding to that text. This picture book literacy function helps users understand the content of the picture book, lets electronic devices such as tablets assist in reading, lowers the difficulty of reading picture books for children, and frees parents from repeatedly guiding and assisting their children.
Technical Problem
However, current picture book recognition schemes for family education machines have the following defects. First, picture book literacy relies mainly on OCR to recognize the text at the position the user clicks, but picture books contain artistic lettering in many styles that general-purpose OCR cannot recognize. Second, because of the form of the device, the captured picture book image is a trapezoid that is large at the top and small at the bottom; the farther a region is from the tablet, the smaller and blurrier its text and the lower the OCR accuracy, so literacy results for the far end of the picture book are poor. A picture book recognition method that improves the text recognition rate in picture books is therefore urgently needed.
Technical Solution
In view of this, it is necessary to provide a picture book recognition method and apparatus, a family education machine, and a storage medium that can improve the text recognition rate in picture books.
A picture book recognition method, the method comprising:
acquiring a standard picture library of a picture book, the standard picture library containing a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block being marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule;
when a click operation by a user on the picture book is detected, collecting the captured current picture book page and obtaining the shooting pixels of the current picture book page;
searching the standard picture library, determining the standard picture book page corresponding to the current picture book page as the target picture book page, and obtaining the standard pixels of the target picture book page;
determining a click area by a fingertip positioning method according to the click position corresponding to the click operation;
converting, based on the shooting pixels and the standard pixels, the click area into a target area consistent with the coordinate system of the standard coordinates;
searching the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block that contains the target area, and determining that block as the target block; and
determining a picture book recognition result based on the target block.
A picture book recognition apparatus, the apparatus comprising:
an acquisition module, configured to acquire a standard picture library of a picture book, the standard picture library containing a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block being marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule;
a collection module, configured to collect the captured current picture book page and obtain the shooting pixels of the current picture book page when a click operation by a user on the picture book is detected;
a retrieval module, configured to search the standard picture library, determine the standard picture book page corresponding to the current picture book page as the target picture book page, and obtain the standard pixels of the target picture book page;
a positioning module, configured to determine a click area by a fingertip positioning method according to the click position corresponding to the click operation;
a conversion module, configured to convert, based on the shooting pixels and the standard pixels, the click area into a target area consistent with the coordinate system of the standard coordinates;
a search module, configured to search the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block that contains the target area, and determine that block as the target block; and
a recognition module, configured to determine a picture book recognition result based on the target block.
A family education machine, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to perform the following steps:
acquiring a standard picture library of a picture book, the standard picture library containing a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block being marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule;
when a click operation by a user on the picture book is detected, collecting the captured current picture book page and obtaining the shooting pixels of the current picture book page;
searching the standard picture library, determining the standard picture book page corresponding to the current picture book page as the target picture book page, and obtaining the standard pixels of the target picture book page;
determining a click area by a fingertip positioning method according to the click position corresponding to the click operation;
converting, based on the shooting pixels and the standard pixels, the click area into a target area consistent with the coordinate system of the standard coordinates;
searching the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block that contains the target area, and determining that block as the target block; and
determining a picture book recognition result based on the target block.
A computer-readable medium storing computer-readable instructions which, when executed by a processor, cause the processor to perform the following steps:
acquiring a standard picture library of a picture book, the standard picture library containing a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block being marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule;
when a click operation by a user on the picture book is detected, collecting the captured current picture book page and obtaining the shooting pixels of the current picture book page;
searching the standard picture library, determining the standard picture book page corresponding to the current picture book page as the target picture book page, and obtaining the standard pixels of the target picture book page;
determining a click area by a fingertip positioning method according to the click position corresponding to the click operation;
converting, based on the shooting pixels and the standard pixels, the click area into a target area consistent with the coordinate system of the standard coordinates;
searching the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block that contains the target area, and determining that block as the target block; and
determining a picture book recognition result based on the target block.
Beneficial Effects
In the above picture book recognition method and apparatus, family education machine, and storage medium, a standard picture library of a picture book is acquired, the standard picture library containing a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block being marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule. When a click operation by a user on the picture book is detected, the captured current picture book page is collected and its shooting pixels are obtained. The standard picture library is searched, the standard picture book page corresponding to the current picture book page is determined as the target picture book page, and the standard pixels of the target picture book page are obtained. A click area is determined by a fingertip positioning method according to the click position corresponding to the click operation. Based on the shooting pixels and the standard pixels, the click area is converted into a target area consistent with the coordinate system of the standard coordinates. The standard coordinates of the standard blocks corresponding to the target picture book page are searched for the standard block that contains the target area, which is determined as the target block, and a picture book recognition result is determined based on the target block. By introducing coordinate positioning and transformation and positioning according to the click area, the picture book page is located precisely and recognition is performed on the corresponding area of the standard picture book page. Recognition of the far-end part of the captured image is thereby avoided, and the text recognition rate in picture books is greatly improved.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
In the drawings:
Fig. 1 is a flowchart of a picture book recognition method in one embodiment;
Fig. 2 is a flowchart of a picture book recognition method in another embodiment;
Fig. 3 is a flowchart of determining a picture book recognition result in one embodiment;
Fig. 4 is a flowchart of a method for determining a target area in one embodiment;
Fig. 5 is a flowchart of a method for determining a click area in one embodiment;
Fig. 6 is a structural block diagram of a picture book recognition apparatus in one embodiment;
Fig. 7 is a structural block diagram of a family education machine in one embodiment.
Embodiments of the Invention
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
As shown in Fig. 1, in one embodiment, a picture book recognition method is provided. The method can be applied to either a terminal or a server; this embodiment takes application to a server as an example. The picture book recognition method specifically includes the following steps:
Step 102: acquire a standard picture library of the picture book. The standard picture library contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, and each standard block is marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule.
Here, a standard block is a region within one standard picture book page of the picture book. A standard picture book page may be an image produced by scanning the page with a scanning device. The preset division rule may divide a standard picture book page according to its content, so that each resulting standard block contains at least an image of a character area, a picture area, or an artistic-lettering area; a character area may contain a single character or several characters. Specifically, each standard picture book page may be recognized by a feature-extraction-based method and divided according to the recognized content to obtain the standard blocks, and each standard block is then marked with its corresponding standard coordinates. Because each standard picture book page is divided and marked with standard coordinates in advance, every standard block can be located, which allows further processing to be performed on the basis of the standard blocks.
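One way to picture the standard picture library described in step 102 is as a small in-memory structure. The sketch below is only an illustration: the class names `StandardBlock` and `StandardPage`, the `kind` labels, and the rectangle form `(x0, y0, x1, y1)` for standard coordinates are assumptions, since the text does not specify a storage format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StandardBlock:
    """One region of a standard page (hypothetical layout, not from the source)."""
    block_id: str
    kind: str   # assumed labels: "text", "picture", or "art_text"
    box: tuple  # assumed standard coordinates (x0, y0, x1, y1), in page pixels


@dataclass
class StandardPage:
    """A scanned standard page together with its pre-divided blocks."""
    page_id: str
    width: int   # standard pixels, horizontal
    height: int  # standard pixels, vertical
    blocks: list = field(default_factory=list)


def build_library(pages):
    """Index pages by id, rejecting duplicate block coordinates within a page."""
    library = {}
    for page in pages:
        boxes = [b.box for b in page.blocks]
        if len(boxes) != len(set(boxes)):
            raise ValueError(f"page {page.page_id}: block coordinates must differ")
        library[page.page_id] = page
    return library


page = StandardPage("p1", 1280, 960, [
    StandardBlock("p1-b1", "text", (100, 80, 600, 160)),
    StandardBlock("p1-b2", "picture", (100, 200, 1180, 900)),
])
library = build_library([page])
```

The duplicate check reflects the requirement that each standard block carries different standard coordinates; everything else about the layout is a free design choice.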
Step 104: when a click operation by the user on the picture book is detected, collect the captured current picture book page and obtain the shooting pixels of the current picture book page.
Here, the current picture book page is the image of the picture book page that the user is currently clicking and that needs to be recognized. The shooting pixels are the pixel information of the current picture book page. Specifically, when the server detects the user's click operation on the picture book, the current picture book page is captured by the camera device and its pixel information is obtained.
Step 106: search the standard picture library, determine the standard picture book page corresponding to the current picture book page as the target picture book page, and obtain the standard pixels of the target picture book page.
Here, the target picture book page is the standard picture book page whose content is consistent with the current picture book page. The standard pixels are the pixel information of the target picture book page. Specifically, an image comparison method may be used: for example, image features are extracted from the current picture book page and from each standard picture book page, the standard picture book page whose image features match those of the current picture book page is determined as the target picture book page, and the standard pixels of the target picture book page are obtained.
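The retrieval in step 106 can be sketched as a nearest-match search over feature vectors. This is a minimal illustration under stated assumptions: the text does not say which image features or matching criterion are used, so the plain numeric vectors, the cosine similarity measure, and the `min_score` threshold below are all hypothetical choices.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_target_page(current_features, library_features, min_score=0.8):
    """Return the id of the best-matching standard page, or None if none is close enough.

    library_features maps page id -> feature vector (assumed precomputed).
    """
    best_id, best_score = None, -1.0
    for page_id, features in library_features.items():
        score = cosine_similarity(current_features, features)
        if score > best_score:
            best_id, best_score = page_id, score
    return best_id if best_score >= min_score else None


library_features = {"p1": [1.0, 0.0, 0.0], "p2": [0.0, 1.0, 0.0]}
target = retrieve_target_page([0.9, 0.1, 0.0], library_features)
```

Returning `None` when no page clears the threshold lets the caller treat an unrecognized page separately instead of forcing a wrong match.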
Step 108: determine the click area by a fingertip positioning method according to the click position corresponding to the click operation.
Here, the fingertip positioning method is a positioning method that detects the position of the hand in the image and locates the coordinates of the fingertip. It may be an image recognition and positioning method based on deep learning, or a positioning method based on feature extraction. In this embodiment, a feature-extraction-based positioning method is preferred in order to improve fingertip positioning efficiency, since the heavier deep-learning image recognition would increase the time needed to locate the fingertip.
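As a rough illustration of a feature-based fingertip locator, the sketch below uses a deliberately simple heuristic: given a binary hand mask, it takes the topmost foreground pixel as the fingertip and builds a small square click area around it. The heuristic, the assumption that the finger points toward the top of the image, and the names `locate_fingertip` and `click_area_around` are not from the source, which does not detail the feature extraction at this point.

```python
def locate_fingertip(mask):
    """Return (x, y) of the topmost foreground pixel in a binary mask, or None.

    mask is a list of rows; truthy values mark hand pixels (assumed input).
    When the top row of the hand spans several pixels, its middle one is used.
    """
    for y, row in enumerate(mask):
        xs = [x for x, value in enumerate(row) if value]
        if xs:
            return (xs[len(xs) // 2], y)
    return None


def click_area_around(tip, half_size, width, height):
    """Square box (x0, y0, x1, y1) around the fingertip, clamped to the image."""
    x, y = tip
    return (
        max(0, x - half_size),
        max(0, y - half_size),
        min(width - 1, x + half_size),
        min(height - 1, y + half_size),
    )


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
tip = locate_fingertip(mask)
area = click_area_around(tip, half_size=1, width=5, height=4)
```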
Step 110: based on the shooting pixels and the standard pixels, convert the click area into a target area consistent with the coordinate system of the standard coordinates.
Specifically, the proportional relationship between the shooting pixels and the standard pixels may be calculated, and the click area is then converted, according to that proportion, into a target area consistent with the coordinate system of the standard coordinates. Because the current picture book page is captured by a camera device, its shape and quality are affected by the form of the camera device; for example, the current picture book page may appear as a trapezoid that is large at the top and small at the bottom. To improve the accuracy of subsequent recognition of the click area, the click area is converted into a target area in the coordinate system of the standard coordinates. This preserves the precision of the click area and hence the accuracy of the corresponding target area, so that picture book recognition can subsequently be performed on the target area of the target picture book page.
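The proportional conversion described for step 110 can be sketched as a per-axis scale. The function name `convert_area` and the `(width, height)` form of the shooting and standard pixels are assumptions made for illustration.

```python
def convert_area(click_area, shot_size, standard_size):
    """Scale a click area from captured-image pixels to standard-page pixels.

    click_area is (x0, y0, x1, y1); shot_size and standard_size are
    (width, height) pairs (assumed representation of the pixel information).
    """
    scale_x = standard_size[0] / shot_size[0]
    scale_y = standard_size[1] / shot_size[1]
    x0, y0, x1, y1 = click_area
    return (x0 * scale_x, y0 * scale_y, x1 * scale_x, y1 * scale_y)


target_area = convert_area((100, 50, 140, 90), (640, 480), (1280, 960))
```

A pure scale like this does not by itself undo the trapezoidal distortion mentioned above; the claims also refer to a mapping transformation matrix computed from the shooting pixels and standard pixels, which would be the place for a projective correction.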
Step 112: search the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block that contains the target area, and determine that block as the target block.
Here, the target block is the region of the standard picture book page on which picture book recognition is to be performed. Specifically, once the target area has been determined, the coordinates of the target area are compared with the standard coordinates of the standard blocks corresponding to the target picture book page, and the standard block that contains the target area is found, giving the target block. Recognition can then be performed efficiently on the basis of the target block.
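The lookup in step 112 amounts to a rectangle containment test. The sketch below assumes that both the target area and each block's standard coordinates are axis-aligned boxes `(x0, y0, x1, y1)`; that representation and the function names are illustrative assumptions.

```python
def contains(block_box, area):
    """True if the area lies entirely inside the block (edges included)."""
    bx0, by0, bx1, by1 = block_box
    ax0, ay0, ax1, ay1 = area
    return bx0 <= ax0 and by0 <= ay0 and ax1 <= bx1 and ay1 <= by1


def find_target_block(blocks, area):
    """Return the id of the first block containing the area, or None.

    blocks maps block id -> standard coordinates of that block.
    """
    for block_id, box in blocks.items():
        if contains(box, area):
            return block_id
    return None


blocks = {
    "title": (100, 80, 600, 160),
    "picture": (100, 200, 1180, 900),
}
target_block = find_target_block(blocks, (200, 100, 280, 150))
```

An area that straddles two blocks matches neither under this strict test; whether to fall back to the block with the largest overlap is left open here, as the text does not address that case.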
Step 114: determine the picture book recognition result based on the target block.
Specifically, the target block is cropped according to the target area, and OCR is performed on the cropped region to obtain the picture book recognition result. In this embodiment, the target area is used to recognize the standard block of the standard picture book page. Because the quality of the standard picture book page is higher than that of the captured image, recognition of the far-end part of the captured image is avoided and picture book recognition efficiency is greatly improved.
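The crop-then-recognize flow of step 114 can be sketched as follows. The block image is modeled as a list of pixel rows, the target area is assumed to be given in page coordinates so that the block's top-left corner must be subtracted, and the OCR engine is passed in as a callable because the text does not name one. All names here are hypothetical.

```python
def crop_block(block_pixels, block_origin, area):
    """Cut the target area out of the target block image.

    block_pixels: list of rows for the block image.
    block_origin: (x, y) of the block's top-left corner in page coordinates.
    area: (x0, y0, x1, y1) in page coordinates, inclusive on both ends.
    """
    ox, oy = block_origin
    x0, y0, x1, y1 = area
    height = len(block_pixels)
    width = len(block_pixels[0]) if height else 0
    if not (ox <= x0 <= x1 < ox + width and oy <= y0 <= y1 < oy + height):
        raise ValueError("target area must lie inside the target block")
    return [row[x0 - ox:x1 - ox + 1] for row in block_pixels[y0 - oy:y1 - oy + 1]]


def recognize(block_pixels, block_origin, area, ocr):
    """Run the supplied OCR callable on the cropped region."""
    return ocr(crop_block(block_pixels, block_origin, area))


# Toy 4x3 block whose pixel values encode their own row and column.
block_pixels = [[r * 10 + c for c in range(4)] for r in range(3)]
patch = crop_block(block_pixels, (100, 200), (101, 201, 102, 202))
```

In practice the `ocr` argument would wrap whichever recognition engine the device ships with; a stand-in callable is enough to exercise the cropping logic.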
In the above picture book recognition method, a standard image library of the picture book is obtained. The standard image library contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, and each standard block is marked with different standard coordinates, where the standard blocks are obtained by dividing each standard picture book page according to a preset division rule. When a click operation by the user on the picture book is detected, the current picture book page obtained by shooting is collected, and the captured pixels of the current picture book page are obtained. A search is performed in the standard image library to determine the standard picture book page corresponding to the current picture book page as the target picture book page, and the standard pixels of the target picture book page are obtained. According to the click position corresponding to the click operation, the click area is determined using a fingertip positioning method. Based on the captured pixels and the standard pixels, the click area is converted into a target area in the same coordinate system as the standard coordinates. Among the standard coordinates of the plurality of standard blocks corresponding to the target picture book page, the standard block that contains the target area is found and determined as the target block. The picture book recognition result is then determined based on the target block. By introducing coordinate positioning and transformation and positioning according to the click area, the method locates the picture book page precisely and then recognizes the corresponding area of the standard picture book page. This avoids recognizing the remotely captured picture and greatly improves the recognition rate of text in the picture book.
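The block lookup step summarized above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes that standard coordinates are axis-aligned rectangles in standard-page pixels, and the names `Rect`, `StandardBlock`, and `find_target_block` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in standard-page pixel coordinates (assumed shape)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, other: "Rect") -> bool:
        # True when `other` lies entirely inside this rectangle.
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and self.x1 >= other.x1 and self.y1 >= other.y1)


@dataclass(frozen=True)
class StandardBlock:
    """One standard block of a standard picture book page (hypothetical type)."""
    block_id: str
    coords: Rect


def find_target_block(blocks: Sequence[StandardBlock],
                      target_area: Rect) -> Optional[StandardBlock]:
    """Return the first standard block whose coordinates contain the target area."""
    for block in blocks:
        if block.coords.contains(target_area):
            return block
    return None  # the target area is not fully inside any block
```

A target area that straddles two blocks returns `None` in this sketch; a real system might instead pick the block with the largest overlap, which the source does not specify.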
As shown in FIG. 2, in one embodiment, before the picture book recognition result is determined based on the target block, the method further includes:
Step 116: based on the click area, cropping a first picture book page from the current picture book page;
Step 118: extracting first text information from the first picture book page and second text information from the target block, respectively;
Step 120: determining whether the first text information matches the second text information;
Step 122: if they do not match, determining that the recognition result is that the click area is a blank area.
In this embodiment, the first picture book page is cropped from the current picture book page according to the click area, and OCR is performed on the first picture book page and on the target block to obtain the first text information and the second text information, respectively. It is then determined whether the first text information matches the second text information. If they do not match, no picture book information matches the click area, so the recognition result is that the click area is a blank area. Further, after the click area is determined to be a blank area, a new click area may be obtained for picture book recognition, or positioning may be performed again, which improves the efficiency of picture book recognition.
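The matching check in steps 120 and 122 could look roughly like the sketch below. The source does not say how a "match" is decided, so the similarity measure (`difflib.SequenceMatcher`) and the `threshold` value are assumptions, and the function names are hypothetical. The two input strings stand in for OCR output.

```python
from difflib import SequenceMatcher


def texts_match(first_text: str, second_text: str, threshold: float = 0.6) -> bool:
    """Decide whether two OCR strings match (assumed rule: similarity ratio >= threshold)."""
    # Remove all whitespace so OCR spacing differences do not affect the comparison.
    a = "".join(first_text.split())
    b = "".join(second_text.split())
    if not a or not b:
        return False  # nothing recognized on one side: treat as no match
    return SequenceMatcher(None, a, b).ratio() >= threshold


def classify_click(first_text: str, second_text: str) -> str:
    """Return 'blank_area' when the texts do not match, as in step 122."""
    return "matched" if texts_match(first_text, second_text) else "blank_area"
```

A tolerant ratio is used here rather than exact equality because text recognized from a camera capture is likely to contain small errors.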
As shown in FIG. 3, in one embodiment, determining the picture book recognition result based on the target block includes:
Step 114A: based on the target area, cropping a second picture book page from the target block;
Step 114B: recognizing the second picture book page to obtain the picture book recognition result.
Specifically, the second picture book page is cropped from the target block according to the target area, and OCR is performed on the second picture book page to obtain the picture book recognition result. Because the second picture book page is a region of the standard picture book page, its image quality is higher than that of the current picture book page. This greatly improves the recognition accuracy for the second picture book page and ensures the accuracy of the picture book recognition result.
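Step 114A can be illustrated with the sketch below, which crops the target area out of a standard page image after limiting it to the target block. The image is modeled as a plain list of pixel rows, boxes are integer `(x0, y0, x1, y1)` tuples with exclusive upper bounds, and both helper names are hypothetical. The OCR call of step 114B is left out because the source names no OCR engine.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), upper bounds exclusive
Image = List[List[int]]           # rows of grayscale pixel values


def clamp_area_to_block(area: Box, block: Box) -> Optional[Box]:
    """Intersect the target area with the target block; None if they do not overlap."""
    x0, y0 = max(area[0], block[0]), max(area[1], block[1])
    x1, y1 = min(area[2], block[2]), min(area[3], block[3])
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by `box`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]
```

The cropped sub-image would then be passed to whatever OCR component the system uses.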
As shown in FIG. 4, in one embodiment, converting the click area, based on the captured pixels and the standard pixels, into a target area in the same coordinate system as the standard coordinates includes:
Step 110A: calculating a mapping transformation matrix based on the captured pixels and the standard pixels;
Step 110B: performing coordinate transformation on the click area according to the mapping transformation matrix to obtain the target area.
In this embodiment, an affine transformation matrix between the current picture book page and the standard picture book page may be calculated from the mapping relationship between the captured pixels and the standard pixels and used as the mapping transformation matrix. The coordinates of the click area are then transformed with this affine transformation matrix to obtain the target area. Applying the affine transformation to the coordinates of the click area ensures the accuracy of the target area.
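Steps 110A and 110B can be sketched as below under a strong simplifying assumption: the captured page is taken to be already aligned with the standard page, so the affine matrix reduces to a per-axis scale derived from the two pixel sizes. A full affine fit (rotation, shear, translation) would need point correspondences, which the source does not describe. All names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]                    # 3x3 homogeneous matrix
Point = Tuple[float, float]
Region = Tuple[float, float, float, float]    # (x0, y0, x1, y1)


def scaling_affine(captured_size: Tuple[int, int],
                   standard_size: Tuple[int, int]) -> Matrix:
    """Build the mapping matrix from captured (w, h) to standard (w, h); scale only."""
    cw, ch = captured_size
    sw, sh = standard_size
    return [[sw / cw, 0.0, 0.0],
            [0.0, sh / ch, 0.0],
            [0.0, 0.0, 1.0]]


def apply_affine(m: Matrix, p: Point) -> Point:
    """Map one point (x, y) through the affine matrix."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def transform_region(m: Matrix, region: Region) -> Region:
    """Map the four corners of the click area and return their bounding box."""
    x0, y0, x1, y1 = region
    corners = [apply_affine(m, (x, y)) for x in (x0, x1) for y in (y0, y1)]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    return (min(xs), min(ys), max(xs), max(ys))
```

Taking the bounding box of all four transformed corners keeps the function correct if the matrix is later replaced by one that includes rotation.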
In one embodiment, the method further includes: recognizing and semantically analyzing each standard block in the standard image library to generate a picture book interpretation information mapping table, in which each standard block corresponds to one piece of picture book interpretation information.
In this embodiment, each standard block may be recognized and semantically analyzed in advance. For example, semantic analysis may be performed on text content, picture content, or artistic lettering to generate the picture book interpretation information corresponding to each standard block, and the picture book interpretation information mapping table is stored on the server.
In one embodiment, the method further includes: obtaining, from the picture book interpretation information mapping table, the target picture book interpretation information corresponding to the target block; and presenting the target picture book interpretation information.
Specifically, the target picture book interpretation information corresponding to the target block is obtained from the picture book interpretation information mapping table and presented, for example by playing it back. This allows artistic lettering and picture content in the picture book to be recognized accurately, and greatly improves both the user experience and the user's understanding of the content while reading the picture book.
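A mapping table of this kind might be held as a simple dictionary, as in the sketch below. The key layout `(page_id, block_id)` and the function names are assumptions made for illustration. The interpretation strings are taken as given, since the source does not describe how the semantic analysis produces them.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical key: (standard page id, standard block id) -> interpretation text.
InterpretationTable = Dict[Tuple[str, str], str]


def build_table(entries: Iterable[Tuple[str, str, str]]) -> InterpretationTable:
    """Build the mapping table, enforcing one interpretation per standard block."""
    table: InterpretationTable = {}
    for page_id, block_id, interpretation in entries:
        key = (page_id, block_id)
        if key in table:
            raise ValueError(f"duplicate interpretation for block {key}")
        table[key] = interpretation
    return table


def lookup(table: InterpretationTable, page_id: str, block_id: str) -> Optional[str]:
    """Fetch the interpretation for the target block, or None if none was generated."""
    return table.get((page_id, block_id))
```

The duplicate check mirrors the statement that each standard block corresponds to exactly one piece of interpretation information.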
As shown in FIG. 5, in one embodiment, determining the click area using the fingertip positioning method according to the click position corresponding to the click operation includes:
Step 108A: acquiring a click image that contains the finger performing the click operation;
Step 108B: performing edge detection on the click image to obtain finger contour features;
Step 108C: determining the click area based on the finger contour features.
In this embodiment, a click image containing the finger performing the click operation is first acquired. Edge detection is performed on the click image to obtain the finger contour features; the edge detection method includes, but is not limited to, the Sobel operator, the Laplacian operator, and the Canny operator. Based on the finger contour features, the tip of the index finger is located, and the coordinates of the hand and of the index fingertip are returned to determine the click area. Alternatively, after the index fingertip has been detected from the finger contour features, the coordinates of the middle joint of the index finger, the base of the index finger, the middle joint of the middle finger, and the base of the middle finger may also be located to determine the click area. In this embodiment, edge detection allows the click area to be located precisely and improves the efficiency of fingertip positioning.
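Steps 108B and 108C can be illustrated with a small pure-Python Sobel sketch. It rests on several assumptions that are not in the source: the image is a grayscale grid, the finger enters the frame from the bottom so that the topmost edge row marks the fingertip, and the click area is a fixed-size square around that point. The function names and the `threshold` and `half_size` parameters are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # rows of grayscale pixel values


def sobel_edges(img: Image, threshold: int = 200) -> List[List[bool]]:
    """Mark pixels whose Sobel gradient magnitude (|gx| + |gy|) reaches the threshold."""
    h, w = len(img), len(img[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            edges[y][x] = abs(gx) + abs(gy) >= threshold
    return edges


def locate_fingertip(edges: List[List[bool]]) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the middle edge pixel on the topmost row that has any edge."""
    for y, row in enumerate(edges):
        xs = [x for x, is_edge in enumerate(row) if is_edge]
        if xs:
            return (xs[len(xs) // 2], y)
    return None


def click_area(tip: Tuple[int, int], half_size: int = 2) -> Tuple[int, int, int, int]:
    """Square click area (x0, y0, x1, y1) centered on the fingertip."""
    x, y = tip
    return (x - half_size, y - half_size, x + half_size, y + half_size)
```

A production system would more likely use a library implementation of Sobel or Canny and a proper contour analysis; the topmost-edge rule here only stands in for the unspecified fingertip positioning step.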
As shown in FIG. 6, in one embodiment, a picture book recognition apparatus is provided. The apparatus includes:
an acquisition module 602, configured to obtain a standard image library of a picture book, where the standard image library contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to a preset division rule;
a collection module 604, configured to, when a click operation by the user on the picture book is detected, collect the current picture book page obtained by shooting and obtain the captured pixels of the current picture book page;
a retrieval module 606, configured to search the standard image library, determine the standard picture book page corresponding to the current picture book page as the target picture book page, and obtain the standard pixels of the target picture book page;
a positioning module 608, configured to determine the click area using a fingertip positioning method according to the click position corresponding to the click operation;
a conversion module 610, configured to convert the click area, based on the captured pixels and the standard pixels, into a target area in the same coordinate system as the standard coordinates;
a search module 612, configured to find, among the standard coordinates of the plurality of standard blocks corresponding to the target picture book page, the standard block that contains the target area and determine it as the target block;
a recognition module 614, configured to determine the picture book recognition result based on the target block.
In one embodiment, the apparatus further includes:
a cropping module, configured to crop a first picture book page from the current picture book page based on the click area;
an extraction module, configured to extract first text information from the first picture book page and second text information from the target block, respectively;
a matching module, configured to determine whether the first text information matches the second text information;
a determining module, configured to determine, if they do not match, that the recognition result is that the click area is a blank area.
In one embodiment, the recognition module includes:
a cropping unit, configured to crop a second picture book page from the target block based on the target area;
a recognition unit, configured to recognize the second picture book page to obtain the picture book recognition result.
In one embodiment, the conversion module includes:
a calculation unit, configured to calculate a mapping transformation matrix based on the captured pixels and the standard pixels;
a transformation unit, configured to perform coordinate transformation on the click area according to the mapping transformation matrix to obtain the target area.
In one embodiment, the method further includes: recognizing and semantically analyzing each standard block in the standard image library to generate a picture book interpretation information mapping table, in which each standard block corresponds to one piece of picture book interpretation information.
In one embodiment, the apparatus further includes:
a lookup unit, configured to obtain, from the picture book interpretation information mapping table, the target picture book interpretation information corresponding to the target block;
a display unit, configured to present the target picture book interpretation information.
In one embodiment, the positioning module includes:
an acquisition unit, configured to acquire a click image that contains the finger performing the click operation;
an extraction unit, configured to perform edge detection on the click image to obtain finger contour features;
a determining unit, configured to determine the click area based on the contour features.
FIG. 7 shows the internal structure of the family education machine in one embodiment. The family education machine may specifically be a server, including but not limited to a high-performance computer or a high-performance computer cluster. As shown in FIG. 7, the family education machine includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the family education machine stores an operating system and may also store computer-readable instructions which, when executed by the processor, cause the processor to implement the picture book recognition method. The internal memory may also store computer-readable instructions which, when executed by the processor, cause the processor to perform the picture book recognition method. Those skilled in the art will understand that the structure shown in FIG. 7 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the family education machine to which the solution is applied. A specific family education machine may include more or fewer components than shown in the figure, combine certain components, or arrange the components differently.
In one embodiment, the picture book recognition method provided in this application may be implemented in the form of computer-readable instructions that can run on the family education machine shown in FIG. 7. The memory of the family education machine may store the program modules that make up the picture book recognition apparatus, for example the acquisition module 602, the collection module 604, the retrieval module 606, the positioning module 608, the conversion module 610, the search module 612, and the recognition module 614.
A family education machine includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When executing the computer-readable instructions, the processor implements the following steps: obtaining a standard image library of a picture book, where the standard image library contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to a preset division rule; when a click operation by the user on the picture book is detected, collecting the current picture book page obtained by shooting and obtaining the captured pixels of the current picture book page; searching the standard image library, determining the standard picture book page corresponding to the current picture book page as the target picture book page, and obtaining the standard pixels of the target picture book page; determining the click area using a fingertip positioning method according to the click position corresponding to the click operation; converting the click area, based on the captured pixels and the standard pixels, into a target area in the same coordinate system as the standard coordinates; finding, among the standard coordinates of the plurality of standard blocks corresponding to the target picture book page, the standard block that contains the target area and determining it as the target block; and determining the picture book recognition result based on the target block.
In one embodiment, before the picture book recognition result is determined based on the target block, the steps further include: cropping a first picture book page from the current picture book page based on the click area; extracting first text information from the first picture book page and second text information from the target block, respectively; determining whether the first text information matches the second text information; and, if they do not match, determining that the recognition result is that the click area is a blank area.
In one embodiment, determining the picture book recognition result based on the target block includes: cropping a second picture book page from the target block based on the target area; and recognizing the second picture book page to obtain the picture book recognition result.
In one embodiment, converting the click area, based on the captured pixels and the standard pixels, into a target area in the same coordinate system as the standard coordinates includes: calculating a mapping transformation matrix based on the captured pixels and the standard pixels; and performing coordinate transformation on the click area according to the mapping transformation matrix to obtain the target area.
In one embodiment, the method further includes: recognizing and semantically analyzing each standard block in the standard image library to generate a picture book interpretation information mapping table, in which each standard block corresponds to one piece of picture book interpretation information.
In one embodiment, the method further includes: obtaining, from the picture book interpretation information mapping table, the target picture book interpretation information corresponding to the target block; and presenting the target picture book interpretation information.
In one embodiment, determining the click area using the fingertip positioning method according to the click position corresponding to the click operation includes: acquiring a click image that contains the finger performing the click operation; performing edge detection on the click image to obtain finger contour features; and determining the click area based on the contour features.
A computer-readable storage medium stores computer-readable instructions, characterized in that, when executed by a processor, the computer-readable instructions implement the following steps: obtaining a standard image library of a picture book, where the standard image library contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to a preset division rule; when a click operation by the user on the picture book is detected, collecting the current picture book page obtained by shooting and obtaining the captured pixels of the current picture book page; searching the standard image library, determining the standard picture book page corresponding to the current picture book page as the target picture book page, and obtaining the standard pixels of the target picture book page; determining the click area using a fingertip positioning method according to the click position corresponding to the click operation; converting the click area, based on the captured pixels and the standard pixels, into a target area in the same coordinate system as the standard coordinates; finding, among the standard coordinates of the plurality of standard blocks corresponding to the target picture book page, the standard block that contains the target area and determining it as the target block; and determining the picture book recognition result based on the target block.
In one embodiment, before the picture book recognition result is determined based on the target block, the steps further include: cropping a first picture book page from the current picture book page based on the click area; extracting first text information from the first picture book page and second text information from the target block, respectively; determining whether the first text information matches the second text information; and, if they do not match, determining that the recognition result is that the click area is a blank area.
In one embodiment, determining the picture book recognition result based on the target block includes: cropping a second picture book page from the target block based on the target area; and recognizing the second picture book page to obtain the picture book recognition result.
In one embodiment, converting the click area, based on the captured pixels and the standard pixels, into a target area in the same coordinate system as the standard coordinates includes: calculating a mapping transformation matrix based on the captured pixels and the standard pixels; and performing coordinate transformation on the click area according to the mapping transformation matrix to obtain the target area.
In one embodiment, the method further includes: recognizing and semantically analyzing each standard block in the standard image library to generate a picture book interpretation information mapping table, in which each standard block corresponds to one piece of picture book interpretation information.
In one embodiment, the method further includes: obtaining, from the picture book interpretation information mapping table, the target picture book interpretation information corresponding to the target block; and presenting the target picture book interpretation information.
In one embodiment, determining the click area using the fingertip positioning method according to the click position corresponding to the click operation includes: acquiring a click image that contains the finger performing the click operation; performing edge detection on the click image to obtain finger contour features; and determining the click area based on the contour features.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by computer-readable instructions that direct the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described. However, any combination of these technical features that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of this application. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of this application, and these all fall within the scope of protection of this application. Therefore, the scope of protection of this application shall be determined by the appended claims.

Claims (10)

  1. A picture book recognition method, characterized in that the method comprises:
    obtaining a standard image library of a picture book, wherein the standard image library contains a plurality of standard picture book pages and a plurality of standard blocks corresponding to each of the standard picture book pages, each of the standard blocks is marked with different standard coordinates, and the standard blocks are obtained by dividing each standard picture book page according to a preset division rule;
    when a click operation by a user on the picture book is detected, collecting a current picture book page obtained by shooting, and obtaining captured pixels of the current picture book page;
    searching the standard image library, determining the standard picture book page corresponding to the current picture book page as a target picture book page, and obtaining standard pixels of the target picture book page;
    determining a click area using a fingertip positioning method according to a click position corresponding to the click operation;
    converting, based on the captured pixels and the standard pixels, the click area into a target area in the same coordinate system as the standard coordinates;
    finding, among the standard coordinates of the plurality of standard blocks corresponding to the target picture book page, the standard block that contains the target area, and determining it as a target block;
    determining a picture book recognition result based on the target block.
  2. The picture book recognition method according to claim 1, characterized in that, before the determining a picture book recognition result based on the target block, the method further comprises:
    cropping a first picture book page from the current picture book page based on the click area;
    extracting first text information from the first picture book page and second text information from the target block, respectively;
    determining whether the first text information matches the second text information;
    if they do not match, determining that the recognition result is that the click area is a blank area.
  3. The picture book recognition method according to claim 1, characterized in that the determining a picture book recognition result based on the target block comprises:
    cropping a second picture book page from the target block based on the target area;
    recognizing the second picture book page to obtain the picture book recognition result.
  4. The picture book recognition method according to claim 1, characterized in that the converting, based on the captured pixels and the standard pixels, the click area into a target area in the same coordinate system as the standard coordinates comprises:
    calculating a mapping transformation matrix based on the captured pixels and the standard pixels;
    performing coordinate transformation on the click area according to the mapping transformation matrix to obtain the target area.
  5. The picture book recognition method according to claim 1, wherein the method further comprises:
    performing recognition and semantic analysis on each standard block in the standard image library to generate a picture book interpretation information mapping table, wherein each standard block corresponds to one piece of picture book interpretation information.
  6. The picture book recognition method according to claim 5, wherein the method further comprises:
    obtaining, from the picture book interpretation information mapping table, target picture book interpretation information corresponding to the target block;
    displaying the target picture book interpretation information.
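The mapping table of claims 5 and 6 can be pictured as a dictionary keyed by block identifier. The sketch below assumes the recognized text of each standard block is already available and treats the semantic analysis as a caller-supplied function; `build_interpretation_table`, `lookup_interpretation`, and the `interpret` parameter are hypothetical names, not part of the claims.

```python
from typing import Callable, Dict, Optional


def build_interpretation_table(block_texts: Dict[str, str],
                               interpret: Callable[[str], str]) -> Dict[str, str]:
    """Create one interpretation entry per standard block."""
    return {block_id: interpret(text) for block_id, text in block_texts.items()}


def lookup_interpretation(table: Dict[str, str],
                          target_block_id: str) -> Optional[str]:
    """Fetch the interpretation for the target block, or None if absent."""
    return table.get(target_block_id)
```

Building the table once, ahead of time, means that handling a click reduces to a single dictionary lookup followed by display.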
  7. The picture book recognition method according to claim 1, wherein the determining of the click area by a fingertip positioning method according to the click position corresponding to the click operation comprises:
    acquiring a click image that contains a finger performing the click operation;
    performing edge detection on the click image to obtain finger contour features;
    determining the click area based on the contour features.
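The three steps above can be illustrated on a toy binary mask. This sketch makes several assumptions not stated in the claim: the finger has already been segmented into a 0/1 mask, an edge pixel is any finger pixel with a background 4-neighbor, the finger points toward the top of the image so the topmost contour pixel is the fingertip, and the click area is a small square around it. A production system would more likely use a library edge detector.

```python
from typing import List, Tuple

Point = Tuple[int, int]           # (x, y), with y growing downward
Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def finger_contour(mask: List[List[int]]) -> List[Point]:
    """Return finger pixels that touch the background (a simple edge test)."""
    height, width = len(mask), len(mask[0])

    def is_background(x: int, y: int) -> bool:
        # Pixels outside the image count as background.
        return not (0 <= x < width and 0 <= y < height) or mask[y][x] == 0

    contour = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] and any(
                is_background(x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            ):
                contour.append((x, y))
    return contour


def fingertip(contour: List[Point]) -> Point:
    """Assumed heuristic: the topmost (then leftmost) contour pixel."""
    return min(contour, key=lambda p: (p[1], p[0]))


def click_area(tip: Point, half_size: int = 1) -> Rect:
    """Square region centered on the fingertip."""
    x, y = tip
    return (x - half_size, y - half_size, x + half_size, y + half_size)


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
```

In this mask the fingertip is the single finger pixel in the second row, and interior pixels such as the one at x=2, y=3 are excluded from the contour.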
  8. A picture book recognition apparatus, wherein the picture book recognition apparatus comprises:
    an acquisition module, configured to acquire a standard image library of a picture book, the standard image library containing a plurality of standard picture book pages and a plurality of standard blocks corresponding to each standard picture book page, each standard block being marked with different standard coordinates, wherein the standard blocks are obtained by dividing each standard picture book page according to a preset division rule;
    a collection module, configured to, when a click operation by a user on the picture book is detected, collect the captured current picture book page and obtain the shooting pixels of the current picture book page;
    a retrieval module, configured to search the standard image library, determine the standard picture book page corresponding to the current picture book page as the target picture book page, and obtain the standard pixels of the target picture book page;
    a positioning module, configured to determine a click area by a fingertip positioning method according to the click position corresponding to the click operation;
    a conversion module, configured to convert, based on the shooting pixels and the standard pixels, the click area into a target area consistent with the coordinate system corresponding to the standard coordinates;
    a search module, configured to search the standard coordinates of the plurality of standard blocks corresponding to the target picture book page for the standard block that contains the target area, and determine that standard block as the target block;
    a recognition module, configured to determine a picture book recognition result based on the target block.
  9. A family education machine, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the picture book recognition method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the picture book recognition method according to any one of claims 1 to 7.
PCT/CN2021/103859 2021-06-30 2021-06-30 Picture book recognition method and apparatus, family education machine, and storage medium WO2023272656A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/103859 WO2023272656A1 (en) 2021-06-30 2021-06-30 Picture book recognition method and apparatus, family education machine, and storage medium

Publications (1)

Publication Number Publication Date
WO2023272656A1 (en) 2023-01-05

Family

ID=84689872

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/103859 WO2023272656A1 (en) 2021-06-30 2021-06-30 Picture book recognition method and apparatus, family education machine, and storage medium

Country Status (1)

Country Link
WO (1) WO2023272656A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447499A (en) * 2015-10-23 2016-03-30 北京爱乐宝机器人科技有限公司 Book interaction method, apparatus, and equipment
US9317486B1 (en) * 2013-06-07 2016-04-19 Audible, Inc. Synchronizing playback of digital content with captured physical content
CN109947273A (en) * 2019-03-25 2019-06-28 广东小天才科技有限公司 Point-reading positioning method and device
JP2020086667A (en) * 2018-11-19 2020-06-04 東京瓦斯株式会社 Picture book story-telling system, schedule adjustment system and program
CN112016346A (en) * 2019-05-28 2020-12-01 阿里巴巴集团控股有限公司 Gesture recognition method, device and system and information processing method
CN112487929A (en) * 2020-11-25 2021-03-12 深圳市云希谷科技有限公司 Image recognition method, device and equipment of children picture book and storage medium

Similar Documents

Publication Publication Date Title
CN107656922B (en) Translation method, translation device, translation terminal and storage medium
CN111476227B (en) Target field identification method and device based on OCR and storage medium
US10303968B2 (en) Method and apparatus for image recognition
TWI685795B (en) Information recognition method and device
US10664519B2 (en) Visual recognition using user tap locations
WO2018233055A1 (en) Method and apparatus for entering policy information, computer device and storage medium
EP3940589B1 (en) Layout analysis method, electronic device and computer program product
WO2019033656A1 (en) Board-writing processing method, device and apparatus, and computer-readable storage medium
CN112087656A (en) Online note generation method and device and electronic equipment
CN111353501A (en) Book point-reading method and system based on deep learning
WO2020186779A1 (en) Image information identification method and apparatus, and computer device and storage medium
WO2019033658A1 (en) Method and apparatus for determining associated annotation information, intelligent teaching device, and storage medium
CN112926469B (en) Certificate identification method based on deep learning OCR and layout structure
CN111813998B (en) Video data processing method, device, equipment and storage medium
WO2021051527A1 (en) Image segmentation-based text positioning method, apparatus and device, and storage medium
US20190042186A1 (en) Systems and methods for using optical character recognition with voice recognition commands
CN108121987B (en) Information processing method and electronic equipment
CN112926421A (en) Image processing method and apparatus, electronic device, and storage medium
CN111881904A (en) Blackboard writing recording method and system
CN111090817A (en) Method for displaying book extension information, electronic equipment and computer storage medium
CN111680177A (en) Data searching method, electronic device and computer-readable storage medium
CN111079777B (en) Page positioning-based click-to-read method and electronic equipment
CN115131693A (en) Text content identification method and device, computer equipment and storage medium
CN112906532A (en) Image processing method and apparatus, electronic device, and storage medium
CN111522992A (en) Method, device and equipment for putting questions into storage and storage medium

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE