CN111753830A - A job image correction method and computing device - Google Patents
A job image correction method and computing device
- Publication number
- CN111753830A (application CN202010573519.4A)
- Authority
- CN
- China
- Prior art keywords
- image
- column
- text
- area
- job
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Input (AREA)
Abstract
Description
Technical Field
The invention relates to the field of image processing, and in particular to a method for correcting homework (job) images and a computing device.
Background
Homework images have particular characteristics. They are generally single-column or double-column, and sometimes three- or four-column, and each column region contains questions. Special cases also occur, such as a mainly single-column page that is locally double-column. Moreover, homework usually sits in an exercise book, and the user obtains the homework image by photographing a page of it; because the book has a certain thickness, the captured image is distorted, for example with content in the upper part of the page warped downward and content in the lower part warped upward. Some homework images also contain regions with little text, for example where a question includes geometric figures, cartoon characters, or answer areas. A method is therefore needed that analyzes and corrects homework images and recognizes their layout, so that the region of each question can be identified.
Summary of the Invention
In view of the above problems, the present invention proposes a homework image correction method and a computing device that seek to solve, or at least alleviate, the problems described above.
According to one aspect of the present invention, a homework image correction method is provided that is suitable for execution in a computing device. The method comprises the steps of: acquiring a homework image to be processed and identifying each text connected component in the image; placing one or more text connected components whose leftmost x-coordinates are close to one another in the same group; determining, from the resulting group or groups, whether the homework image has a single-column or a multi-column layout, together with the corresponding column regions; and, for each column region, determining one or more standard connected components whose width satisfies a predetermined condition, and performing image correction on the column region based on the text lines of those standard connected components.
Optionally, the homework image correction method according to the present invention further comprises the step of: for a multi-column layout, performing image correction on the transition region between two adjacent column regions according to the image correction results of those two column regions.
Optionally, in the homework image correction method according to the present invention, the standard connected components are the uppermost and the lowermost of the text connected components whose width satisfies the predetermined condition.
Optionally, in the homework image correction method according to the present invention, the step of performing image correction on the corresponding column region based on the text lines of the standard connected components comprises: computing a transformation formula for each of the two standard connected components from its text line and the image horizontal line, and correcting the column region according to the two computed transformation formulas.
Optionally, in the homework image correction method according to the present invention, the step of determining from the group or groups whether the homework image has a single-column or a multi-column layout, together with the corresponding column regions, comprises: if the number of groups is 1, determining that the homework image has a single-column layout, the corresponding column region being the region between the minimum and maximum x-coordinates of all text connected components in the homework image; if the number of groups is not 1, generating a rectangular region for each group and deduplicating the rectangular regions, the rectangular regions that remain after deduplication being the column regions.
Optionally, in the homework image correction method according to the present invention, the left and right boundaries of a rectangular region are the leftmost and rightmost boundaries of all text connected components in the corresponding group, and its upper and lower boundaries are the upper and lower boundaries of the homework image.
Optionally, in the homework image correction method according to the present invention, the step of deduplicating the rectangular regions comprises: deleting any rectangular region that overlaps more than one other rectangular region; and, among the remaining rectangular regions, deleting any rectangular region contained in another rectangular region and, of any two intersecting rectangular regions, deleting the one with the smaller width.
Optionally, in the homework image correction method according to the present invention, before the text connected components in the image are identified, the method further comprises: converting the homework image into a binary image and applying morphological dilation to the binary image so that adjacent characters in the same line become connected.
Optionally, in the homework image correction method according to the present invention, the text connected components are text connected components of printed characters.
Optionally, in the homework image correction method according to the present invention, leftmost x-coordinates being close means that the difference between the leftmost x-coordinates of the text connected components is less than a first percentage of the page width.
Optionally, the homework image correction method according to the present invention further comprises: detecting a vertical straight line in the homework image and vertically correcting the homework image according to the angle between that line and the image vertical line; and/or detecting a horizontal straight line in the homework image and horizontally correcting the homework image according to the angle between that line and the image horizontal line.
Optionally, in the homework image correction method according to the present invention, the vertical straight line is a straight line whose angle with the image vertical line is less than a predetermined angle and whose height is greater than a second percentage of the image height, or is a vertical border line of the homework image; the horizontal straight line is a straight line whose angle with the image horizontal line is less than the predetermined angle and whose height is greater than the second percentage of the image height, or is a horizontal border line of the homework image.
Optionally, in the homework image correction method according to the present invention, the width satisfying the predetermined condition means that the width is greater than or equal to 75% of the column width; the first percentage is 8%, the second percentage is 50%, and the predetermined angle is 15 degrees.
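The vertical-line criterion above (angle to the vertical below 15 degrees, height above 50% of the image height) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the segment representation as two endpoints, and the assumption that candidate segments come from some external line detector are all hypothetical.

```python
import math


def is_vertical_guide(x1, y1, x2, y2, image_height,
                      max_angle_deg=15.0, min_height_ratio=0.5):
    """Check a segment against the vertical-line criterion.

    The thresholds default to the values given in the text (15 degrees, 50%).
    The endpoint-based segment representation is an assumption of this sketch.
    """
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return False  # degenerate segment
    # Unsigned angle between the segment and the image vertical, in degrees.
    angle = math.degrees(math.atan2(abs(dx), abs(dy)))
    return angle < max_angle_deg and abs(dy) > min_height_ratio * image_height


def skew_from_vertical(x1, y1, x2, y2):
    """Signed angle (degrees) between the segment and the image vertical."""
    if y2 < y1:  # orient the segment top to bottom (Y axis points down)
        x1, y1, x2, y2 = x2, y2, x1, y1
    return math.degrees(math.atan2(x2 - x1, y2 - y1))
```

A segment that passes `is_vertical_guide` could then supply the angle, via `skew_from_vertical`, by which the image is rotated back toward the vertical.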
Optionally, the homework image correction method according to the present invention further comprises a step of dividing each column region into question regions: for each column region, determining the first text connected component of each line of text in the region; determining, in order from top to bottom, the question-number lines based on the format, content, and x-coordinate position of the first several characters of each of those text connected components; and determining the region of each question from the y-coordinate positions of adjacent question-number lines, and segmenting out and storing each question region.
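Once the question-number lines are known, the division into question regions reduces to pairing adjacent y-coordinates. The sketch below assumes, beyond what the text states, that each question runs from the top of its number line to the top of the next one, and that the last question extends to the bottom of the column; `split_question_regions` is a hypothetical name.

```python
def split_question_regions(question_line_tops, column_bottom):
    """Return (top, bottom) y-ranges, one per question.

    question_line_tops: y-coordinates of the detected question-number lines.
    column_bottom: y-coordinate where the column region ends (assumed to
    close the last question).
    """
    tops = sorted(question_line_tops)
    regions = []
    for i, top in enumerate(tops):
        # A question ends where the next question-number line begins.
        bottom = tops[i + 1] if i + 1 < len(tops) else column_bottom
        regions.append((top, bottom))
    return regions
```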
Optionally, in the homework image correction method according to the present invention, the rules for identifying a question-number line include: the question number is a Chinese numeral or an Arabic numeral; question numbers at the same level have the same character format, consecutive numbers, the same or similar x-coordinates, and the same punctuation mark after the number.
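The rules above can be illustrated with a small parser. Everything here is an assumption made for illustration: the regular expression, the set of accepted punctuation marks, the restriction of Chinese numerals to 1 through 99, and the `x_tolerance` parameter (the text only says "the same or similar" x-coordinates without giving a value).

```python
import re

CN_DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
             "六": 6, "七": 7, "八": 8, "九": 9}

# A leading Arabic or Chinese numeral followed by one punctuation mark.
_NUMBER = re.compile(r"^\s*(?:(\d+)|([一二三四五六七八九十]+))\s*([.、．\)）:：])")


def chinese_to_int(text):
    """Convert a Chinese numeral in the range 1..99; return None otherwise."""
    tens, sep, ones = text.partition("十")
    if not sep:
        return CN_DIGITS.get(text)
    if (tens and tens not in CN_DIGITS) or (ones and ones not in CN_DIGITS):
        return None
    return (CN_DIGITS[tens] if tens else 1) * 10 + (CN_DIGITS[ones] if ones else 0)


def parse_question_number(line):
    """Return (kind, value, punctuation), or None if no question number leads the line."""
    match = _NUMBER.match(line)
    if not match:
        return None
    arabic, chinese, punct = match.groups()
    if arabic:
        return ("arabic", int(arabic), punct)
    value = chinese_to_int(chinese)
    return ("chinese", value, punct) if value is not None else None


def is_next_at_same_level(prev, cur, prev_x, cur_x, x_tolerance=10):
    """Same numeral kind, same punctuation, consecutive value, similar x-coordinate."""
    return (prev[0] == cur[0] and prev[2] == cur[2]
            and cur[1] == prev[1] + 1
            and abs(cur_x - prev_x) <= x_tolerance)
```

A real system would apply these checks to recognized text from the first connected component of each line; a line such as "3.5 + 2" would also match this simple pattern, so additional context checks would be needed.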
Optionally, the homework image correction method according to the present invention further comprises: marking the line containing a picture as belonging to the question region of the nearest question-number line above the picture.
According to another aspect of the present invention, a computing device is provided, comprising: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and, when executed by the processors, implement the steps of the homework image correction method described above.
According to yet another aspect of the present invention, a computer-readable storage medium storing one or more programs is provided; the one or more programs comprise instructions that, when executed by a computing device, implement the steps of the homework image correction method described above.
According to the technical solution of the present invention, one or more groups are determined from the leftmost coordinate of each text connected component in the homework image, and these groups are analyzed to determine whether the homework image is a single-column or a multi-column image. The column region of a single-column image is the overall homework region itself; the column regions of a multi-column image are the regions of the individual columns. Standard connected components of sufficient width are then selected from each column region, and the image is corrected based on the text lines of those standard connected components and the image horizontal line. The invention can not only accurately recognize the layout of complex homework images but also correct distorted homework images, which facilitates subsequent processes such as question content recognition and question segmentation.
Moreover, when performing image correction, the invention selects, from the text connected components of sufficient width in each column region (for example, width greater than or equal to 75% of the column width), the two with the largest and smallest y-coordinates as the standard connected components. Transformation formulas are computed from the text lines of these two components and the horizontal line. The upper part of the image is corrected downward, layer by layer, according to the upper transformation formula, and the lower part is corrected upward, layer by layer, according to the lower transformation formula; the middle rows are assumed to be the least distorted. By correcting layer by layer from both ends (top and bottom) toward the middle, the invention improves the accuracy of the image correction and brings the corrected image closer to a flat printed format.
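The selection of the two standard connected components can be sketched as below. The bounding-box representation `(left, top, right, bottom)` and the function name are assumptions of this sketch; the 75% ratio is the example value from the text. The transformation formulas themselves are not specified in this passage and are not modelled here.

```python
def select_standard_components(boxes, column_left, column_right, min_ratio=0.75):
    """Pick the uppermost and lowermost sufficiently wide components.

    boxes: bounding boxes (left, top, right, bottom) of the text connected
    components inside one column region, with the Y axis pointing down.
    Returns (top_box, bottom_box), or None if no component is wide enough.
    """
    column_width = column_right - column_left
    wide = [b for b in boxes if (b[2] - b[0]) >= min_ratio * column_width]
    if not wide:
        return None
    top_box = min(wide, key=lambda b: b[1])     # smallest y: uppermost
    bottom_box = max(wide, key=lambda b: b[1])  # largest y: lowermost
    return top_box, bottom_box
```

If only one component is wide enough, the same box is returned for both positions, which a caller would need to handle.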
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, and so that the above and other objects, features, and advantages of the invention are easier to understand, specific embodiments of the invention are set out below.
Brief Description of the Drawings
To achieve the above and related objects, certain illustrative aspects are described herein in conjunction with the following description and drawings. These aspects indicate the various ways in which the principles disclosed herein may be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features, and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout this disclosure, the same reference numerals generally refer to the same parts or elements.
FIG. 1 shows a structural diagram of a computing device 100 according to an embodiment of the present invention;
FIG. 2 shows a flowchart of a homework image correction method 200 according to an embodiment of the present invention;
FIG. 3 shows a schematic diagram of multiple text connected components in a homework image according to another embodiment of the present invention;
FIG. 4 shows a schematic diagram of multiple group rectangles in a homework image according to an embodiment of the present invention;
FIG. 5 and FIG. 6 each show a schematic diagram of the relationships between group rectangles in a homework image according to an embodiment of the present invention;
FIG. 7 shows a schematic diagram of the column regions finally delineated in a homework image according to an embodiment of the present invention;
FIG. 8 shows a flowchart of a homework question segmentation method 800 according to an embodiment of the present invention; and
FIG. 9 shows a flowchart of a homework question segmentation method according to another embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope conveyed fully to those skilled in the art.
FIG. 1 is a block diagram of a computing device 100 according to an embodiment of the present invention. In a basic configuration 102, the computing device 100 typically includes a system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processors 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level-1 cache 110 and a level-2 cache 112, a processor core 114, and registers 116. An example processor core 114 may include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP core), or any combination thereof. An example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, the system memory 106 may be any type of memory, including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM or flash memory), or any combination thereof. The system memory 106 may include an operating system 120, one or more applications 122, and program data 124. In some embodiments, the applications 122 may be arranged to operate with the program data 124 on the operating system. The program data 124 includes instructions; in the computing device 100 according to the present invention, the program data 124 contains instructions for executing the homework image correction method 200 and/or the homework question segmentation method 800.
The computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (for example, output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. Example output devices 142 include a graphics processing unit 148 and an audio processing unit 150, which may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication, via one or more I/O ports 158, with external devices such as input devices (for example, a keyboard, mouse, pen, voice input device, or touch input device) or other peripherals (for example, a printer or scanner). An example communication device 146 may include a network controller 160, which may be arranged to facilitate communication with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied as computer-readable instructions, data structures, or program modules in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium. A "modulated data signal" may be a signal in which one or more of its characteristics are set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media, such as a wired network or a dedicated-line network, and various wireless media, such as acoustic, radio frequency (RF), microwave, infrared (IR), or other wireless media. The term computer-readable medium as used herein may include both storage media and communication media.
The computing device 100 may also include a storage interface bus 134. The storage interface bus 134 enables communication from storage devices 132 (for example, removable storage 136 and non-removable storage 138) to the basic configuration 102 via the bus/interface controller 130. At least part of the operating system 120, the applications 122, and the data 124 may be stored on the removable storage 136 and/or the non-removable storage 138 and, when the computing device 100 is powered on or an application 122 is to be executed, loaded into the system memory 106 via the storage interface bus 134 and executed by the one or more processors 104.
The applications 122 execute on the operating system 120; that is, the operating system 120 provides various interfaces for operating hardware devices (for example, the storage devices 132, output devices 142, peripheral interfaces 144, and communication devices) and also provides an environment for application context management (for example, memory management and allocation, interrupt handling, and process management). The applications 122 use the interfaces and environment provided by the operating system 120 to control the computing device 100 to perform the corresponding functions. In some implementations, some applications 122 also provide interfaces, so that other applications 122 can call these interfaces to implement functions.
The computing device 100 may be implemented as a server, such as a file server, database server, application server, or web server, or as part of a small portable (or mobile) electronic device, such as a cellular phone, a personal digital assistant (PDA), a personal media player, a wireless web-browsing device, a personal headset, an application-specific device, or a hybrid device that includes any of the above functions. The computing device 100 may also be implemented as a personal computer, including desktop and notebook configurations. In some embodiments, the computing device 100 is configured to execute the homework image correction method 200 and/or the homework question segmentation method 800.
FIG. 2 shows a flowchart of a homework image correction method 200 according to an embodiment of the present invention. The method 200 is executed in a computing device (such as the computing device 100) and performs distortion correction on a homework image.
As shown in FIG. 2, the method begins at step S210. In step S210, a homework image to be processed is acquired, and each text connected component in the image is identified.
The homework image may be obtained in any manner, such as by photographing or scanning; the invention places no limit on this. According to one embodiment, before the text connected components in the image are identified, the method may further include: converting the homework image into a binary image and applying morphological dilation to the binary image so that adjacent characters in the same line become connected. The amount of dilation is chosen so that adjacent characters in the same line become connected. If two pieces of text are separated by several blank characters or spaces, they generally belong to two different text connected components.
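The binarize, dilate, and label sequence of step S210 can be sketched in plain Python on a small grid. This is an illustration under stated assumptions, not the patented implementation: the fixed threshold, the purely horizontal dilation, the 4-connectivity, and all function names are choices made here; a practical system would more likely use an image-processing library.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Dark pixels (ink) become 1, light pixels (paper) become 0."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def dilate_horizontal(binary, radius):
    """Spread each foreground pixel `radius` columns left and right."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    out[y][nx] = 1
    return out


def connected_boxes(binary):
    """Bounding boxes (left, top, right, bottom) of 4-connected components."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            left = right = x
            top = bottom = y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

With a small radius, characters separated by a narrow gap merge into one component, while text separated by a wide gap (several spaces) stays in separate components, which matches the behaviour described above.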
Further, the invention may first identify the printed-text regions in the homework image and then identify the text connected components of those regions, in which case the identified text connected components are those of printed characters. If a continuous run of text contains only printed text (such as a question stem), or both printed and handwritten text (as in a fill-in-the-blank question where the user has written in the answer), that run of text is one text connected component. If the continuous run of text contains only handwritten text (such as the user's answer in a blank area), it is not identified as a text connected component.
Here, the printed-text regions may be identified before the morphological dilation, in which case only the printed-text regions are dilated. They may also be identified after the morphological dilation, in which case both the printed-text and the handwritten-text regions are dilated. The invention does not limit the order of these steps.
It should be understood that the invention may train a classification model capable of distinguishing printed-text regions from handwritten-text regions; the model is trained on a training set obtained by annotating the printed-text and handwritten-text regions in multiple images. The structure and parameters of the classification model may be set by those skilled in the art as needed, and the invention places no limit on them.
FIG. 3 is a schematic diagram of the text connected components in a homework image. The homework image includes a lesson title at the top and two columns of questions below; the questions include printed question stems, handwritten portions where the user has answered, and pictures in various layouts within the stems. The lesson title and some question stems are purely printed and are identified as text connected components; some fill-in-the-blank questions contain both printed and handwritten text and are also identified as text connected components; the user's answers in blank areas and the picture parts of questions are identified as non-text connected components.
Subsequently, in step S220, one or more text connected components whose leftmost x-coordinates are close to one another are placed in the same group.
Here, leftmost x-coordinates being close means that the difference between the leftmost x-coordinates of the text connected components is less than a first percentage of the page width. The first percentage is 8%, chosen mainly to allow for error and indentation, although it is not limited to this value. The homework image may be given a coordinate system, for example with the upper-left corner of the image as the origin, the X axis pointing right, and the Y axis pointing down, although again this is not limiting. FIG. 4 is a schematic diagram of the groups formed from the text connected components of FIG. 3; each text box in the image is assigned to a group according to its starting position. The three marker lines running through the homework image represent the starting positions of three groups; the text connected components on a given line have the same or similar leftmost x-coordinates and are placed in the same group.
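Step S220 can be sketched as a one-pass grouping over boxes sorted by their left edge. The text gives only the closeness criterion (a difference below 8% of the page width); comparing each box with the first box of the current group is an assumption of this sketch, as are the function name and the `(left, top, right, bottom)` box format.

```python
def group_by_left_edge(boxes, page_width, first_percentage=0.08):
    """Group boxes whose leftmost x-coordinates are close to one another.

    A box joins the current group when its left edge is within
    first_percentage * page_width of the group's first (leftmost) box;
    otherwise it starts a new group.
    """
    threshold = first_percentage * page_width
    groups = []
    for box in sorted(boxes, key=lambda b: b[0]):
        if groups and box[0] - groups[-1][0][0] < threshold:
            groups[-1].append(box)
        else:
            groups.append([box])
    return groups
```

For a page 1000 pixels wide the threshold is 80 pixels, so boxes starting at x = 50, 60, and 95 fall into one group, while a box starting at x = 300 starts another.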
随后,在步骤S230中,根据所划分的一个或多个分组确定作业图像为单栏布局或多栏布局、以及对应的栏目区域。Then, in step S230, it is determined that the job image is a single-column layout or a multi-column layout and a corresponding column area according to the divided one or more groups.
在一种实现方式中,若分组个数为1,则判定作业图像为单栏布局,该单栏布局所对应的栏目区域即为该作业图像中所有文字联通域的最小横坐标和最大横坐标之间区域。栏目区域的上下边界为作业图像的上下边界。In an implementation manner, if the number of groups is 1, it is determined that the job image is a single-column layout, and the column area corresponding to the single-column layout is the minimum abscissa and maximum abscissa of all the text connection fields in the job image. area in between. The upper and lower boundaries of the column area are the upper and lower boundaries of the job image.
In another implementation, if the number of groups is not 1, a rectangular area is generated for each group and the rectangular areas are deduplicated; the rectangular areas that remain after deduplication are the column areas.
Here, the left and right boundaries of a rectangular area are the leftmost and rightmost boundaries of all text connected domains in the corresponding group, and its upper and lower boundaries are those of the homework image. In other words, every group is traversed and a rectangle is computed for it: the top coincides with the top of the picture, the bottom coincides with the bottom of the picture, the left edge is the minimum of the leftmost abscissas of all text connected domains in the group, and the right edge is the maximum of their rightmost abscissas. FIG. 5 is a schematic diagram of the rectangular area corresponding to each group in the homework image, showing three group rectangles by way of example.
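The rectangle construction described above amounts to a minimum and a maximum over the boxes of one group. A minimal sketch, assuming the hypothetical `(left, top, right, bottom)` box layout and the illustrative name `group_rectangle`:

```python
from typing import List, Tuple

# Hypothetical box layout: (left, top, right, bottom), origin at top-left.
Box = Tuple[int, int, int, int]


def group_rectangle(group: List[Box], image_height: int) -> Box:
    """Return the rectangle spanning the full image height whose left and
    right edges are the extreme x-coordinates of the boxes in the group."""
    left = min(box[0] for box in group)
    right = max(box[2] for box in group)
    return (left, 0, right, image_height)
```

Applying the same function to the list of all text connected domains gives the column area of the single-column case.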
According to one embodiment, deduplicating the rectangular areas includes: deleting any rectangular area that overlaps more than one other rectangular area; and, among the remaining rectangular areas, deleting any rectangular area contained in another rectangular area and deleting the narrower of any two intersecting rectangular areas.
The present invention keeps every group whose rectangle does not overlap any other group rectangle. Where overlapping rectangles exist, groups whose rectangles overlap more than one other group rectangle are deleted first. In FIG. 5, for example, the group rectangle formed by the title “Lesson 3: How much money is left over” (“第三课时,节余多少钱”) intersects both of the other group rectangles and is therefore deleted. This removes the influence of some centered titles on the layout. If two intersecting groups still remain at this point, the wider group rectangle is kept and the narrower one is deleted; if a narrower group rectangle is a subset of a wider one, the narrower rectangle is deleted. In FIG. 6, the three narrower group rectangles that are contained in other rectangles will be deleted.
After the rectangular areas have been deduplicated by merging, deletion and similar operations, the number of remaining rectangular areas is the number of columns in the homework image. FIG. 7 shows the rectangular areas after deduplication; there are two, so the homework image has a two-column layout and the two rectangular areas are its two column areas.
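Because every group rectangle spans the full image height, the deduplication can be expressed on x-extents alone. The sketch below follows a literal reading of the two steps; the names `overlaps` and `deduplicate`, the `(left, right)` representation, and the treatment of rectangles that merely touch at a boundary as non-overlapping are assumptions, not details given in the description.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # hypothetical (left, right) x-extent of a group rectangle


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two x-extents share more than a single boundary point."""
    return min(a[1], b[1]) > max(a[0], b[0])


def deduplicate(rects: List[Interval]) -> List[Interval]:
    # Step 1: drop rectangles that overlap more than one other rectangle
    # (for example a centered title that straddles two columns).
    survivors = [
        r for i, r in enumerate(rects)
        if sum(overlaps(r, o) for j, o in enumerate(rects) if j != i) <= 1
    ]
    # Step 2: of any two rectangles that still overlap, drop the narrower one.
    # A rectangle contained in another is never wider, so containment is covered.
    removed = set()
    for i in range(len(survivors)):
        for j in range(i + 1, len(survivors)):
            if i in removed or j in removed:
                continue
            a, b = survivors[i], survivors[j]
            if overlaps(a, b):
                narrower = j if (a[1] - a[0]) >= (b[1] - b[0]) else i
                removed.add(narrower)
    return [r for k, r in enumerate(survivors) if k not in removed]
```

For a left column spanning x = 50 to 480, a right column spanning 520 to 950 and a centered title spanning 300 to 700, the title overlaps both columns and is dropped in step 1, leaving two column areas.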
Then, in step S240, for each column area, one or more standard connected domains whose width meets a predetermined condition are determined, and the column area is corrected based on the text lines of those standard connected domains. Here, the text line of a connected domain is the line connecting the centers of its characters.
In general, a standard connected domain is any text connected domain whose width meets the predetermined condition, namely that its width is greater than or equal to 75% of the column width, although other thresholds may be used. A transformation formula is then calculated from the text line of the standard connected domain and the horizontal line, and the column area is corrected according to this transformation formula together with prior knowledge of how strongly each text row in a distorted image is distorted.
For a two-column picture, for example, one or more text connected domains can first be chosen as standard connected domains from among the connected domains in the left column area whose width exceeds 75% of the column width, and the left column area is then corrected according to the text lines of these standard connected domains and the horizontal line. The right column area can be corrected in the same way.
Further, the standard connected domains may be the uppermost and the lowermost of the text connected domains whose width meets the predetermined condition, two standard connected domains in total. In this case, a transformation formula is calculated for each of the two standard connected domains from its text line and the image horizontal line, and the column area is corrected according to the two resulting transformation formulas.
As described above, the distortion of a homework image gradually decreases from the top and bottom ends toward the middle: the upper part of the picture is distorted downward and the lower part is distorted upward. The present invention therefore selects, from each of the upper and lower parts, one text connected domain that meets the width requirement and lies as close as possible to the corresponding end of the picture, and uses these two as the standard connected domains. Taking the transformation between the horizontal line and the text line of these two connected domains as the reference, the homework image is corrected layer by layer toward the top and the bottom. Some prior knowledge of how image distortion varies is used here as well.
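Selecting the two standard connected domains of a column can be sketched as a width filter followed by picking the topmost and bottommost survivors. The function name `pick_standard_boxes`, the box layout and the `None` return for a column without any sufficiently wide box are assumptions made for illustration; the transformation formula itself is not specified in enough detail to be sketched here.

```python
from typing import List, Optional, Tuple

# Hypothetical box layout: (left, top, right, bottom), origin at top-left.
Box = Tuple[int, int, int, int]


def pick_standard_boxes(boxes: List[Box], column_left: int, column_right: int,
                        min_ratio: float = 0.75) -> Optional[Tuple[Box, Box]]:
    """Return the uppermost and lowermost boxes whose width is at least
    min_ratio of the column width, or None if no box is wide enough."""
    column_width = column_right - column_left
    wide = [b for b in boxes if (b[2] - b[0]) >= min_ratio * column_width]
    if not wide:
        return None
    top = min(wide, key=lambda b: b[1])      # smallest top edge
    bottom = max(wide, key=lambda b: b[3])   # largest bottom edge
    return top, bottom
```

If only one box passes the filter, the same box is returned twice, which corresponds to the simpler case of a single standard connected domain.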
Still further, the present invention may select, from the text connected domains whose width meets the predetermined condition, the uppermost one, the lowermost one and the most central one, three standard connected domains in total. The homework image is then corrected using the transformation formulas of these three text connected domains as the reference.
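According to an embodiment of the present invention, after step S240, method 200 further includes the step of: for a multi-column layout, correcting the transition area between two adjacent column areas according to the image correction results of those column areas. A deformation formula is calculated by linear fitting from the transformed point coordinates on the facing boundary lines of the adjacent column areas, and the whole homework page is then corrected horizontally. For example, the area between the two column areas in FIG. 7 is a transition area. The coordinates before and after correction are known for every point on the two middle boundary lines, so for two points on the same horizontal line of the original homework picture, the deformation formula for the transition points between them can be obtained by linear fitting from the way those two points were transformed, which completes the image correction of the whole transition area.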
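One way to read the linear fitting for the transition area is as linear interpolation of the correction displacement between the two column boundaries along a single horizontal line. The sketch below assumes that reading; the name `transition_shift` and the use of a scalar vertical displacement per boundary point are illustrative choices, not details stated in the description.

```python
def transition_shift(x: float, x_left: float, shift_left: float,
                     x_right: float, shift_right: float) -> float:
    """Linearly interpolate the correction displacement at abscissa x, given the
    displacements already known at the two column boundaries x_left and x_right
    on the same horizontal line of the original image."""
    if x_right == x_left:
        raise ValueError("the two boundary points must have different abscissas")
    t = (x - x_left) / (x_right - x_left)
    return shift_left + t * (shift_right - shift_left)
```

For boundaries at x = 480 and x = 520 with displacements of -4 and +6 pixels, a point midway at x = 500 would be displaced by +1 pixel.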
According to an embodiment of the present invention, method 200 further includes the step of correcting the homework image according to vertical straight lines and/or horizontal straight lines.
Correcting according to a vertical straight line includes: detecting a vertical straight line in the homework image, and correcting the homework image vertically according to the angle between that line and the image vertical. Here, a vertical straight line is either a straight line whose angle with the image vertical is less than a predetermined angle and whose height is greater than a second percentage of the image height (usually a column divider of the image), or a vertical border line of the homework image. A centered column divider is preferred; if there is no column divider, a border line is used.
Correcting according to a horizontal straight line includes: detecting a horizontal straight line in the homework image, and correcting the homework image horizontally according to the angle between that line and the image horizontal. Here, a horizontal straight line is either a straight line whose angle with the image horizontal is less than the predetermined angle and whose height is greater than the second percentage of the image height, or a horizontal border line of the homework image.
In short, vertical skew of the homework image is corrected according to a vertical column divider or image border line, and horizontal skew is corrected according to a horizontal divider or image border line. Optionally, the predetermined angle is 15 degrees and the second percentage is 50%, although other values may be used.
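The acceptance test for a candidate vertical line, with the optional values of 15 degrees and 50%, can be sketched as below. The function name `vertical_skew_angle`, the representation of a line by its two end points, and the sign convention of the returned angle are assumptions; line detection itself is outside the scope of this sketch.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y), origin at top-left, y pointing down


def vertical_skew_angle(p1: Point, p2: Point, image_height: float,
                        max_angle_deg: float = 15.0,
                        min_height_ratio: float = 0.5) -> Optional[float]:
    """Return the signed angle in degrees between the segment p1-p2 and the
    image vertical if the segment qualifies as a vertical straight line,
    otherwise None."""
    (x1, y1), (x2, y2) = p1, p2
    if y2 < y1:  # orient the segment from top to bottom
        (x1, y1), (x2, y2) = (x2, y2), (x1, y1)
    height = y2 - y1
    if height <= min_height_ratio * image_height:
        return None  # too short to be a column divider or border line
    angle = math.degrees(math.atan2(x2 - x1, height))
    if abs(angle) >= max_angle_deg:
        return None  # too far from vertical
    return angle
```

The returned angle could then drive a rotation or shear of the image; which of the two the method uses is not stated in the description.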
The above completes layout determination and distortion and skew correction for the homework image. On this basis, individual question areas can further be segmented from the homework image so that each question is stored separately.
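FIG. 8 shows a flowchart of a homework question segmentation method 800 according to an embodiment of the present invention. Method 800 is executed in a computing device (such as computing device 100). As shown in FIG. 8, the method begins at step S810.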
In step S810, for each column area, the first text connected domain of each row of text in the area is determined, forming a vertical set of connected domains.
Then, in step S820, the question-number rows are determined, working from top to bottom, based on the format and content of the first several characters of each determined text connected domain and on its abscissa position.
The rules for identifying a question-number row include: the question number is a Chinese numeral or an Arabic numeral; question numbers at the same level have the same character format, consecutive values, the same or similar abscissas, and the same punctuation mark after the number. In addition, question numbers form a tree structure: the actual questions sit under the lowest-level question numbers, while the numbers above the lowest level generally denote question types. A row containing a picture (such as a geometric figure or a cartoon character) is never a question-number row; such a row is generally assigned to the question area of the nearest question-number row above the picture. The question number usually appears at the first character or at the second character (when the number is enclosed in parentheses), so the "first several characters" may be the first character, the first two characters or the first three characters.
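The same-level rules above can be illustrated with a small parser that extracts a format signature and a numeric value from the start of a row. Everything in this sketch is an assumption made for illustration: the regular expression, the set of accepted punctuation marks, the limitation of Chinese numerals to one through ninety-nine, and the names `parse_question_number` and `same_level`.

```python
import re
from typing import Optional, Tuple

Format = Tuple[str, bool, bool, str]  # (numeral kind, has "(", has ")", trailing punctuation)

_DIGITS = {c: i for i, c in enumerate("一二三四五六七八九", start=1)}
_NUMBER_ROW = re.compile(
    r"^\s*([((]?)([0-9]+|[一二三四五六七八九十]{1,3})([))]?)([.、.::]?)"
)


def _chinese_to_int(s: str) -> Optional[int]:
    """Convert Chinese numerals from one to ninety-nine; None for anything else."""
    tens, sep, ones = s.partition("十")
    if not sep:
        return _DIGITS.get(s)
    if (tens and tens not in _DIGITS) or (ones and ones not in _DIGITS):
        return None
    return (_DIGITS[tens] if tens else 1) * 10 + (_DIGITS[ones] if ones else 0)


def parse_question_number(text: str) -> Optional[Tuple[Format, int]]:
    """Return (format, value) if the row starts with a candidate question number."""
    m = _NUMBER_ROW.match(text)
    if m is None:
        return None
    open_p, number, close_p, punct = m.groups()
    if number[0] in "0123456789":
        kind, value = "arabic", int(number)
    else:
        kind, value = "chinese", _chinese_to_int(number)
        if value is None:
            return None
    return (kind, bool(open_p), bool(close_p), punct), value


def same_level(prev: Tuple[Format, int], cur: Tuple[Format, int],
               prev_x: float, cur_x: float, x_tolerance: float) -> bool:
    """Same format, consecutive values and similar abscissas."""
    return (prev[0] == cur[0] and cur[1] == prev[1] + 1
            and abs(cur_x - prev_x) <= x_tolerance)
```

A row starting with "1." followed by a row starting with "2." at nearly the same abscissa would pass `same_level`, whereas "1." followed by "(2)" would not, because the formats differ.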
Single-question segmentation of a homework image can be understood with reference to FIG. 9. Each single-column area is processed separately: the text connected domains cut the column area horizontally into a set of vertically arranged blocks, each of which may be a text block or a picture block. The blocks are then identified and processed one by one from top to bottom. During processing, the question number is the main criterion for deciding whether a block starts a new question area, and question-number analysis also takes into account level hierarchy, consecutive numbering and similar logic.
Specifically, the next block to be processed is obtained first. If the block is not a text block, for example a picture block, it is added to the set of blocks of the question currently being identified. If the block is a text block, it is determined whether the first several characters of its text contain a number; if they do not, the text block is likewise added to the set of blocks of the question currently being identified.
If the text block contains a number, it is determined whether the format of that number matches the format of the first-row heading of the question currently being identified. If it matches, identification of the current question area ends and the text block is determined to be the question-number row of the next question. If it does not match, there are generally two possibilities: either the current question has ended and a new higher-level question type has appeared, or the block still belongs to the current question.
It is therefore determined at this point whether a higher-level question number exists. If one exists and the block matches its format, identification of the current question area ends, which also marks the end of the current level's block group, and the text block is determined to be the question-number row of a new higher-level question type, that is, the starting block of the next block group at the current level. If a higher-level question number exists but the block does not match its format, the text block is added to the set of blocks of the question currently being identified. If no higher-level question number exists, it is determined whether question numbers of other levels exist and whether the block matches their format. If it matches, identification of the current question again ends and the block is the starting block for that other-level question number. If no other-level question number exists, the current block is added to the set of blocks of the question being identified.
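The decision flow for a single block, while a question is already being identified, can be condensed into one function. This is a sketch under assumptions: formats are treated as opaque comparable values, the names `classify_block` and the returned labels are hypothetical, and the case where other-level question numbers exist but none matches is treated as "append", which the description does not state explicitly.

```python
from typing import Hashable, Iterable, Optional


def classify_block(block_format: Optional[Hashable],
                   current_format: Optional[Hashable],
                   parent_format: Optional[Hashable],
                   other_formats: Iterable[Hashable] = ()) -> str:
    """Decide what to do with the next block of a column.

    block_format is None for picture blocks and for text blocks whose first
    characters contain no number.
    """
    if block_format is None:
        return "append"            # belongs to the question being identified
    if block_format == current_format:
        return "new_question"      # next question at the same level
    if parent_format is not None:
        # A higher-level number exists: either a new question type starts
        # or the number is just part of the current question's text.
        return "new_parent" if block_format == parent_format else "append"
    if block_format in other_formats:
        return "new_other_level"   # start block for another known level
    return "append"
```

Treating every non-matching number as ordinary text is what gives the flow its tolerance to stray digits at the start of a row.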
The present invention does not rely solely on recognizing individual numbers; it also applies the preset logic described above to provide verification and fault tolerance for question-number information. For example, if a possible question number is found but it does not satisfy the rules for same-level question numbers under the same parent question number, it is treated as an ordinary number rather than as question-number information.
Then, in step S830, the area of each question is determined according to the ordinate positions of two adjacent question-number rows, and each question area is segmented out and stored.
Once the question areas have been confirmed, the individual questions in the single-column picture are segmented to obtain single-question pictures, which are then stored and distributed.
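Deriving question areas from the ordinates of adjacent question-number rows can be sketched as pairing each row's top edge with the next row's top edge. The name `question_regions` and the choice to let the last question extend to the bottom of the column are assumptions for illustration.

```python
from typing import List, Tuple


def question_regions(number_row_tops: List[int],
                     column_bottom: int) -> List[Tuple[int, int]]:
    """Return (top, bottom) ordinate pairs, one per question. Each question runs
    from its own question-number row to the next one; the last question is
    assumed to run to the bottom of the column."""
    tops = sorted(number_row_tops)
    bottoms = tops[1:] + [column_bottom]
    return list(zip(tops, bottoms))
```

Each pair, combined with the left and right boundaries of the column area, gives the crop rectangle of one single-question picture.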
According to the technical solution of the present invention, the layout of a homework image can be determined accurately and the image's distortion and skew can be corrected, so that the homework image is as close as possible to the original layout. At the same time, each single-question image is extracted accurately for subsequent image recognition, per-question assignment, per-question grading, score aggregation and similar tasks. The present invention has high detection accuracy, a small computational load and high computing performance, and can process large batches quickly in real time.
A9. The method of any one of A1-A8, wherein the text connected domains are text connected domains of printed characters.
A10. The method of any one of A1-A9, wherein leftmost abscissas being close means that the leftmost abscissas of the text connected domains differ by less than a first percentage of the page width.
A11. The method of any one of A1-A10, further including the steps of: detecting a vertical straight line in the homework image, and correcting the homework image vertically according to the angle between the vertical straight line and the image vertical; and/or detecting a horizontal straight line in the homework image, and correcting the homework image horizontally according to the angle between the horizontal straight line and the image horizontal.
A12. The method of A11, wherein the vertical straight line is a straight line whose angle with the image vertical is less than a predetermined angle and whose height is greater than a second percentage of the image height, or is a vertical border line of the homework image; and the horizontal straight line is a straight line whose angle with the image horizontal is less than the predetermined angle and whose height is greater than the second percentage of the image height, or is a horizontal border line of the homework image.
A13. The method of A12, wherein the width meeting the predetermined condition means that the width is greater than or equal to 75% of the column width, the first percentage is 8%, the second percentage is 50%, and the predetermined angle is 15 degrees.
A14. The method of any one of A1-A13, further including the step of dividing each column area into question areas: for each column area, determining the first text connected domain of each row of text in the area; determining the question-number rows, from top to bottom, based on the format and content of the first several characters of each determined text connected domain and on its abscissa position; and determining the area of each question according to the ordinate positions of two adjacent question-number rows, and segmenting out each question area for storage.
A15. The method of A14, wherein the rules for identifying a question-number row include: the question number is a Chinese numeral or an Arabic numeral; and question numbers at the same level have the same first-character format, consecutive values, the same or similar abscissas, and the same punctuation mark after the number.
A16. The method of A14, further including the step of: assigning a row containing a picture to the question area of the nearest question-number row above the picture.
The various techniques described herein may be implemented in hardware, in software, or in a combination of the two. Accordingly, the method and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (that is, instructions) embodied in tangible media such as a removable hard disk, a USB flash drive, a floppy disk, a CD-ROM or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention.
Where the program code is executed on a programmable computer, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device and at least one output device. The memory is configured to store the program code, and the processor is configured to execute the method of the present invention according to the instructions in the program code stored in the memory.
By way of example and not limitation, readable media include readable storage media and communication media. Readable storage media store information such as computer-readable instructions, data structures, program modules or other data. Communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, the algorithms and displays are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the examples of the present invention, and the structure required to construct such systems is apparent from the description above. Furthermore, the present invention is not directed to any particular programming language. It should be understood that the invention described herein may be implemented in a variety of programming languages, and that any description of a specific language above is given to disclose the best mode of the invention.
Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules, units or components of the devices in the examples disclosed herein may be arranged in a device as described in the embodiment, or alternatively may be located in one or more devices different from the device in the example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and placed in one or more devices different from that embodiment. The modules, units or components of the embodiments may be combined into one module, unit or component, and may furthermore be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described herein include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the embodiments are described herein as methods, or combinations of method elements, that can be implemented by a processor of a computer system or by other means of carrying out the described functions. A processor with the necessary instructions for carrying out such a method or method element therefore forms a means for carrying out the method or method element. Furthermore, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by the element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinal adjectives "first", "second", "third" and so on to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, whether temporally, spatially, in ranking or in any other manner.
Although the invention has been described in terms of a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be envisaged within the scope of the invention thus described. It should also be noted that the language used in this specification has been selected principally for readability and instructional purposes, and not to delineate or circumscribe the subject matter of the invention. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The disclosure of the invention is intended to be illustrative, and not limiting, of the scope of the invention, which is defined by the appended claims.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010573519.4A CN111753830A (en) | 2020-06-22 | 2020-06-22 | A job image correction method and computing device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111753830A (en) | 2020-10-09 |
Family
ID=72674851
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010573519.4A Pending CN111753830A (en) | 2020-06-22 | 2020-06-22 | A job image correction method and computing device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111753830A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022089196A1 (en) * | 2020-10-27 | 2022-05-05 | 北京字节跳动网络技术有限公司 | Image processing method and apparatus, and electronic device and storage medium |
CN114663902A (en) * | 2022-04-02 | 2022-06-24 | 北京百度网讯科技有限公司 | Document image processing method, device, equipment and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000187705A (en) * | 1998-12-22 | 2000-07-04 | Toshiba Corp | Document reader, document reading method and storage medium |
CN1828581A (en) * | 2006-04-14 | 2006-09-06 | 北京北大方正电子有限公司 | A Typesetting Method for Text Content Adapting to a Rectangular Text Frame |
CN103679168A (en) * | 2012-08-30 | 2014-03-26 | 北京百度网讯科技有限公司 | Detection method and detection device for character region |
CN107798321A (en) * | 2017-12-04 | 2018-03-13 | 海南云江科技有限公司 | A kind of examination paper analysis method and computing device |
CN108304814A (en) * | 2018-02-08 | 2018-07-20 | 海南云江科技有限公司 | A kind of construction method and computing device of literal type detection model |
CN110414529A (en) * | 2019-06-26 | 2019-11-05 | 深圳中兴网信科技有限公司 | Paper information extracting method, system and computer readable storage medium |
WO2019227615A1 (en) * | 2018-06-01 | 2019-12-05 | 平安科技(深圳)有限公司 | Method for correcting invoice image, apparatus, computer device, and storage medium |
Non-Patent Citations (1)
Title |
---|
中公教育教师资格考试研究院: "《2019(中公版)教育知识与能力历年真题及标准预测试卷》", 31 December 2019, 世界图书出版公司, pages: 1 - 4 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108304814B (en) | Method for constructing character type detection model and computing equipment | |
CN109829453B (en) | Method and device for recognizing characters in card and computing equipment | |
CN106940799B (en) | Text image processing method and device | |
US9235758B1 (en) | Robust method to find layout similarity between two documents | |
CN111144400B (en) | Identification method, device, terminal equipment and storage medium for ID card information | |
CN103488711B (en) | A method and system for quickly making vector fonts | |
US9330331B2 (en) | Systems and methods for offline character recognition | |
CN108171104A (en) | A kind of character detecting method and device | |
CN109697414B (en) | Text positioning method and device | |
WO2021147222A1 (en) | Ocr-based table layout restoration method and device, electronic apparatus, and storage medium | |
CN111582267B (en) | Text detection method, computing device and readable storage medium | |
CN112100979A (en) | Typesetting processing method, electronic device and storage medium based on electronic book | |
JP2021043775A (en) | Information processing device and program | |
CN111753830A (en) | A job image correction method and computing device | |
CN105096244A (en) | Method and device for image transformation, method and device for image identification | |
CN105404683A (en) | Format file processing method and apparatus | |
CN107305682B (en) | Method and apparatus for stitching images | |
CN114463767A (en) | Credit card identification method, device, computer equipment and storage medium | |
CN115589786A (en) | Method, device and system for recognizing hand-drawn figure and computer readable storage medium | |
CN110598196B (en) | Table data extraction method and device without outer frame and storage medium | |
CN101017479A (en) | Method for automatically identifying digital document type page | |
CN115223172A (en) | Text extraction method, apparatus and device | |
CN110941972B (en) | Segmentation method and device for characters in PDF document and electronic equipment | |
WO2024140094A1 (en) | Paragraph determination method and apparatus for digital document, and electronic device and storage medium | |
CN113095320A (en) | License plate recognition method and system and computing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||