CN114663873A

CN114663873A - Text region determination method and device, storage medium and electronic equipment

Info

Publication number: CN114663873A
Application number: CN202210320579.4A
Authority: CN
Inventors: 尹康
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2022-03-29
Filing date: 2022-03-29
Publication date: 2022-06-24

Abstract

The disclosure relates to the technical field of text detection, and in particular to a text region determination method and device, a computer-readable storage medium and an electronic device. The method comprises: acquiring an initial image that includes text, and determining in the initial image a rectangular reference region that includes the text, wherein one edge of the rectangular reference region is parallel to a reference direction in the initial image; determining candidate regions according to the rectangular reference region; determining text distribution information of each of the candidate regions; and determining a target text region among the candidate regions based on the text distribution information. The technical solution of the embodiments of the disclosure reduces the amount of computation and increases the computation speed while maintaining precision.

Description

Text area determination method and device, storage medium and electronic device

Technical Field

The present disclosure relates to the technical field of information display, and in particular to a text area determination method and apparatus, a computer-readable storage medium, and an electronic device.

Background

With the development of image recognition technology, techniques that determine a text area in an image in order to obtain the corresponding text are used more and more widely.

Prior-art methods for determining a text area in an image to obtain the corresponding text suffer from either slow computation or insufficient precision; that is, they cannot achieve both precision and computation speed.

It should be noted that the information disclosed in the Background section above is only intended to aid understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art already known to a person of ordinary skill in the art.

Summary of the Invention

The purpose of the present disclosure is to provide a text area determination method, a text area determination apparatus, a computer-readable medium and an electronic device that reduce the amount of computation and increase the computation speed while maintaining precision.

According to a first aspect of the present disclosure, there is provided a text area determination method, comprising: acquiring an initial image including text, and determining in the initial image a rectangular reference area including the text, wherein one edge of the rectangular reference area is parallel to a reference direction in the initial image; determining candidate areas according to the rectangular reference area; determining text distribution information of each of the candidate areas; and determining a target text area among the candidate areas based on the text distribution information.

According to a second aspect of the present disclosure, there is provided a text area determination apparatus, comprising: a target detection module, configured to acquire an initial image including text and to determine in the initial image a rectangular reference area including the text, wherein one edge of the rectangular reference area is parallel to a reference direction in the initial image; a first determination module, configured to determine candidate areas according to the rectangular reference area; an information extraction module, configured to determine text distribution information of each of the candidate areas; and a second determination module, configured to determine a target text area among the candidate areas based on the text distribution information.

According to a third aspect of the present disclosure, there is provided a computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the above method.

According to a fourth aspect of the present disclosure, there is provided an electronic device, comprising: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above method.

In the text area determination method provided by an embodiment of the present disclosure, an initial image including text is acquired, and a rectangular reference area including the text is determined in the initial image, wherein an edge of the rectangular reference area is parallel to a reference direction in the initial image; candidate areas are determined according to the rectangular reference area; text distribution information of each candidate area is determined; and a target text area is determined among the candidate areas based on the text distribution information. Compared with the prior art, on the one hand, an approximate rectangular reference area is first determined in the initial image, and candidate areas are then determined in the rectangular reference area based on geometric relationships, so target detection can begin without correcting the image, which reduces the amount of computation. Moreover, because an edge of the rectangular reference area is parallel to the reference direction in the initial image, all detection boxes can share the same angle during detection, which further reduces the amount of computation and increases the computation speed. On the other hand, after the candidate areas are determined, the target text area is determined based on the text distribution information of the candidate areas, which preserves the precision with which the text area is determined. That is, the present disclosure reduces the amount of computation and increases the speed of determining the text area while preserving the precision of the determination.

It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.

Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure. Obviously, the drawings described below show only some embodiments of the present disclosure, and a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:

FIG. 1 shows a schematic diagram of a bank card image without a rotation angle;

FIG. 2 shows a schematic diagram of a bank card image with a rotation angle;

FIG. 3 shows a schematic diagram of an exemplary system architecture to which embodiments of the present disclosure may be applied;

FIG. 4 schematically shows a flowchart of a text area determination method in an exemplary embodiment of the present disclosure;

FIG. 5 schematically shows a flowchart of acquiring a rectangular reference area in an exemplary embodiment of the present disclosure;

FIG. 6 schematically shows a structural diagram of a rectangular reference area in an exemplary embodiment of the present disclosure;

FIG. 7 schematically shows a diagram of the geometric relationship between an inscribed rectangle and the rectangular reference area in an exemplary embodiment of the present disclosure;

FIG. 8 schematically shows a diagram of the geometric relationship between another inscribed rectangle and the rectangular reference area in an exemplary embodiment of the present disclosure;

FIG. 9 schematically shows a target image corresponding to the rectangular reference area in an exemplary embodiment of the present disclosure;

FIG. 10 schematically shows the position of the inscribed rectangle abcd in the initial image in an exemplary embodiment of the present disclosure;

FIG. 11 schematically shows a target image corresponding to the inscribed rectangle abcd in an exemplary embodiment of the present disclosure;

FIG. 12 schematically shows the position of the inscribed rectangle efgj in the initial image in an exemplary embodiment of the present disclosure;

FIG. 13 schematically shows a target image corresponding to the inscribed rectangle efgj in an exemplary embodiment of the present disclosure;

FIG. 14 schematically shows the positions of key points in an exemplary embodiment of the present disclosure;

FIG. 15 schematically shows the position of a straight line fitted to the key points in an exemplary embodiment of the present disclosure;

FIG. 16 schematically shows the position of a minimum circumscribed rectangle in an exemplary embodiment of the present disclosure;

FIG. 17 schematically shows a flowchart of a preferred embodiment of the text area determination method of the present disclosure;

FIG. 18 schematically shows the composition of a text area determination apparatus in an exemplary embodiment of the present disclosure;

FIG. 19 shows a schematic diagram of an electronic device to which embodiments of the present disclosure may be applied.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments, however, can be implemented in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and repeated descriptions of them are omitted. Some of the block diagrams shown in the drawings are functional entities that do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

In the related art, with the popularity of portable electronic devices such as mobile phones and the large-scale construction of communication infrastructure such as 4G and 5G, mobile payment, typified by payment with a mobile phone, is becoming more and more popular, and some characteristics of a "cashless" society have been realized to a certain extent.

The first step of mobile phone payment is that the user logs in to a payment account on the phone and binds a real-name bank card. An important part of this step is that the user enters identity information and then enters the bank card number for verification. However, because bank card numbers generally have many digits, and their font differs considerably from conventional printed and handwritten characters, the probability of input errors is relatively high. Users then have to re-enter and confirm the number repeatedly, or may even have their account temporarily frozen, which seriously affects the experience. Research on automatic bank card number recognition based on image recognition therefore has significant practical value.

Current mainstream solutions for detecting the bank card number area fall into two categories. The first is based on more traditional computer vision methods such as filters and morphological processing, and locates the card number area through edge detection; the detection accuracy of this type of method is poor. The second is based on object-detection CNNs (Convolutional Neural Networks): a model is trained on large amounts of data so that it can label images on its own and locate the target area directly. The advantage of this type of method is its very high performance; the disadvantages are that training is difficult and that it can generally locate only horizontal rectangular boxes, so it performs poorly on rotated images. As shown in FIG. 1 and FIG. 2, when the image has a rotation angle, the image must be corrected to maintain detection accuracy, which requires a large amount of computation. If the image is not corrected, the horizontal detection box output by a CNN-based object detection solution contains a large amount of irrelevant background, which degrades the performance of the recognition module.

FIG. 3 shows a schematic diagram of a system architecture. The system architecture 300 may include a terminal 310 and a server 320. The terminal 310 may be a terminal device such as a smartphone, a tablet computer, a desktop computer, or a notebook computer. The server 320 generally refers to a back-end system that provides the services related to text area determination in this exemplary embodiment, and may be a single server or a cluster of servers. The terminal 310 and the server 320 may be connected through a wired or wireless communication link to exchange data.

In one embodiment, the above text area determination method may be performed by the terminal 310. For example, after the user captures an image with the terminal 310 or selects an image from the album of the terminal 310, the terminal 310 performs text area determination on the image and outputs the target text area.

In one embodiment, the above text area determination method may be performed by the server 320. For example, after the user captures an image with the terminal 310 or selects an image from the album of the terminal 310, the terminal 310 uploads the image to the server 320, and the server 320 performs text area determination on the image and returns the target text area to the terminal 310.

As can be seen from the above, the text area determination method in this exemplary embodiment may be executed by either the terminal 310 or the server 320, which is not limited in the present disclosure.

The text area determination method in this exemplary embodiment is described below with reference to FIG. 4, which shows an exemplary flow of the method. The flow may include:

Step S410: acquiring an initial image including text, and determining in the initial image a rectangular reference area including the text, wherein one edge of the rectangular reference area is parallel to a reference direction in the initial image;

Step S420: determining candidate areas according to the geometric relationships of the rectangular reference area;

Step S430: determining text distribution information of each of the candidate areas;

Step S440: determining a target text area among the candidate areas based on the text distribution information.
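Steps S410 to S440 can be read as a four-stage pipeline. The following is a minimal sketch of that control flow only; the function name `determine_text_area` and the four callables passed into it are hypothetical placeholders introduced for illustration and do not come from the disclosure.

```python
from typing import Any, Callable, List, Sequence


def determine_text_area(
    image: Any,
    detect_reference: Callable[[Any], Any],              # S410 (hypothetical hook)
    make_candidates: Callable[[Any], Sequence[Any]],     # S420 (hypothetical hook)
    text_distribution: Callable[[Any, Any], Any],        # S430 (hypothetical hook)
    choose_target: Callable[[Sequence[Any], List[Any]], Any],  # S440 (hypothetical hook)
) -> Any:
    """Run the four steps in order and return the chosen target text area."""
    # S410: axis-aligned rectangular reference area containing the text.
    reference = detect_reference(image)
    # S420: candidate areas derived from the geometry of the reference area.
    candidates = make_candidates(reference)
    # S430: text distribution information for every candidate area.
    infos = [text_distribution(image, candidate) for candidate in candidates]
    # S440: pick the target text area based on that information.
    return choose_target(candidates, infos)
```

Passing the stages in as callables keeps the sketch independent of any particular detector or scoring rule, neither of which is fixed at this point in the description.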

Based on the above method, on the one hand, a rectangular reference area is determined in the initial image, and candidate areas are then determined in the rectangular reference area based on geometric relationships, so target detection can begin without correcting the image, which reduces the amount of computation. Moreover, because an edge of the rectangular reference area is parallel to the reference direction in the initial image, all detection boxes can share the same angle during detection, which reduces the amount of computation in the detection process and increases the computation speed. On the other hand, after the candidate areas are determined, the target text area is determined based on the text distribution information of the candidate areas, which preserves the precision with which the text area is determined. That is, the present disclosure reduces the amount of computation and increases the speed of determining the text area while preserving the precision of the determination.

Each step in FIG. 4 is described in detail below.

Referring to FIG. 4, in step S410, an initial image including text is acquired, and a rectangular reference area including the text is determined in the initial image, wherein one edge of the rectangular reference area is parallel to a reference direction in the initial image.

In this exemplary embodiment, the initial image includes text. The initial image may be an image of a card such as a bank card or an ID card, or a document image, or may be customized according to user requirements; this is not specifically limited in this exemplary embodiment.

When the initial image is an image of a card such as a bank card or an ID card, the text may include a bank card number, an ID card number, and the like.

In this exemplary embodiment, the initial image may be rectangular, triangular, circular, and so on; the shape of the initial image is not specifically limited in this exemplary embodiment. The initial image may be a 3-channel color image or a 1-channel grayscale image, or may be customized according to user requirements; this is not specifically limited in this exemplary embodiment.

The rectangular reference area is a rectangular area that includes the text. The reference direction may be a user-defined direction that fixes the deflection angle between the rectangular reference area and the horizontal direction. For example, if the initial image is rectangular, the reference direction may be parallel to one of the edges of the initial image.

In this exemplary embodiment, acquiring the rectangular reference area may include steps S510 to S530, which are described in detail below.

In step S510, target detection is performed on the initial image to obtain a plurality of intermediate text areas.

In step S520, the accuracy of each intermediate text area and the confidence that each intermediate text area includes text of a preset type are determined.

Specifically, referring to FIG. 6, a rectangular coordinate system may first be established and the initial image placed in it. The reference direction may then be a direction parallel to either coordinate axis. Target detection may be performed on the initial image with rectangular detection boxes to obtain a plurality of intermediate text areas. The edges of the rectangular detection boxes are parallel to the reference direction, which fixes the orientation of the boxes; detection boxes in other orientations are not needed, so the number of rectangular detection boxes is reduced, and with it the amount of computation in the detection process.

The following description takes as an example an initial image that includes a bank card, with the bank card number as the preset text type. Referring to FIG. 6, the processor may pass the initial image through multiple cascaded operations such as convolution and pooling and output n 6-dimensional vectors. Here n depends on the specific network structure; n may be any positive integer from 1000 to 99999, such as 10000 or 20000, or may be customized as required, and is not specifically limited in this exemplary embodiment. A single prediction vector has the form (x, y, w, h, c₁, c₂), where x, y, w, and h are positive numbers representing, respectively, the abscissa of the center point, the ordinate of the center point, the width, and the height of an intermediate text area. A plurality of intermediate text areas is thus obtained, with each 6-dimensional vector representing one intermediate text area.

In this exemplary embodiment, c₁ and c₂ are positive numbers between 0 and 1, representing, respectively, the confidence that the center point coordinates are accurate and the confidence that the area inside the box is a bank card number.

In this exemplary embodiment, the product of c₁ and c₂ may be used as the confidence that the intermediate text area includes text of the preset type.

In step S530, the rectangular reference area is determined among the plurality of intermediate text areas according to the accuracy and the confidence.

In this exemplary embodiment, after the n intermediate text areas are obtained, the one with the largest product of c₁ and c₂, that is, the intermediate text area with the highest confidence of including text of the preset type, is taken as the rectangular reference area. As shown in FIG. 6, the prediction result is denoted (X, Y, W, H, C1, C2), and the rectangular reference area is the rectangle ABCD.
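The selection rule described for steps S520 and S530 can be sketched as follows. This is an illustrative sketch only: the `Prediction` type and the function `select_reference` are hypothetical names, and the detector that produces the n vectors is assumed to exist elsewhere.

```python
from typing import NamedTuple, Sequence


class Prediction(NamedTuple):
    """One 6-dimensional prediction vector (x, y, w, h, c1, c2)."""
    x: float   # abscissa of the center point
    y: float   # ordinate of the center point
    w: float   # width of the intermediate text area
    h: float   # height of the intermediate text area
    c1: float  # confidence that the center point is accurate, in (0, 1)
    c2: float  # confidence that the box contains the preset text type, in (0, 1)

    @property
    def confidence(self) -> float:
        # The text uses the product c1 * c2 as the overall confidence.
        return self.c1 * self.c2


def select_reference(predictions: Sequence[Prediction]) -> Prediction:
    """Return the intermediate text area with the largest c1 * c2."""
    return max(predictions, key=lambda p: p.confidence)
```

The returned prediction corresponds to (X, Y, W, H, C1, C2) in the text.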

In an example embodiment of the present disclosure, a preset threshold may first be set. The preset threshold may be 0.5, or 0.4, 0.6, and so on, or may be customized according to user requirements; it is not specifically limited in this example embodiment.

In this exemplary embodiment, take a preset threshold of 0.5 as an example, with the upper left corner of the initial image as the coordinate origin, the AB direction as the positive X axis, and the AD direction as the positive Y axis. If C1×C2 is smaller than the preset threshold, the initial image does not contain a card number area, and the subsequent operations may be skipped. If C1×C2 is greater than or equal to the preset threshold, the coordinates of the center point P of the rectangular reference area ABCD are (X, Y), and the coordinates of points A, B, C, and D are (X-W/2, Y-H/2), (X+W/2, Y-H/2), (X+W/2, Y+H/2), and (X-W/2, Y+H/2), respectively. Here W denotes the length of the long side of the rectangular reference area and H the length of its short side; that is, in the present disclosure "length" refers to the long side and "width" refers to the short side. Determining the rectangular reference area in this way reduces the computation of the target detection algorithm while maintaining precision.
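The threshold check and the corner formulas above can be written out directly. The following is a minimal sketch; the function name `reference_corners` is a hypothetical choice, and the default threshold of 0.5 is simply the example value used in the text.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def reference_corners(
    X: float, Y: float, W: float, H: float, C1: float, C2: float,
    threshold: float = 0.5,  # example value from the text; configurable
) -> Optional[List[Point]]:
    """Return corners [A, B, C, D] of the reference area, or None.

    The origin is the upper left corner of the image, X grows along AB
    and Y grows along AD, as in the text.
    """
    if C1 * C2 < threshold:
        # Below the threshold the image is treated as having no card number area.
        return None
    return [
        (X - W / 2, Y - H / 2),  # A: upper left
        (X + W / 2, Y - H / 2),  # B: upper right
        (X + W / 2, Y + H / 2),  # C: lower right
        (X - W / 2, Y + H / 2),  # D: lower left
    ]
```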

After the rectangular reference area is obtained, step S420 may be executed.

In step S420, candidate areas are determined according to the rectangular reference area.

In this exemplary embodiment, a candidate area is a rectangular area within the rectangular reference area that may include all of the text and whose area is less than or equal to that of the rectangular reference area. Specifically, the candidate areas may be determined according to the geometric relationships of the rectangular reference area. The geometric relationships may include the length, width, aspect ratio, inscribed figures, circumscribed figures, and so on of the rectangular reference area, and are not specifically limited in this exemplary embodiment.

In one embodiment, after the rectangular reference area is determined, the inscribed rectangles of the rectangular reference area are determined first. The inscribed rectangles and the rectangular reference area itself may then be used as the candidate areas; that is, there may be three candidate areas: the rectangular reference area itself and its two inscribed rectangles.

In one embodiment, if W and H of the rectangular reference area are not equal (that is, the rectangular reference area is not a square), the rectangular reference area contains only two inscribed rectangles. It should be noted that the four corners of an inscribed rectangle in this exemplary embodiment lie on the four sides of the rectangular reference area, one on each side. Referring to FIG. 7 and FIG. 8, the inscribed rectangles of the rectangular reference area ABCD may include abcd and efgj, which are equal in size and deflected in opposite directions. The coordinates of the corner points of abcd and efgj can be fitted directly by computer to obtain the two inscribed rectangles, and the inscribed rectangle abcd, the inscribed rectangle efgj, and the rectangular reference area itself are used as the candidate areas. Using the inscribed rectangles as candidate areas, on the one hand, locates the text precisely; on the other hand, because the number of inscribed rectangles is fixed, the number of candidate areas is directly reduced, and with it the amount of computation.
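One way to obtain the second inscribed rectangle from the first is by reflection, sketched below. This rests on an assumption that is not stated in the text: that efgj is the mirror image of abcd about the vertical center line of ABCD. Such a mirror image is also inscribed in ABCD, has the same size, and is deflected in the opposite direction, which is consistent with the description, but the order in which its corners are labeled in FIG. 8 is not known here. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mirror_about_vertical_centerline(
    corners: List[Point], x_left: float, W: float
) -> List[Point]:
    """Reflect points about the line x = x_left + W / 2."""
    x_sum = 2 * x_left + W
    return [(x_sum - px, py) for px, py in corners]


def candidate_areas(
    reference: List[Point], inscribed: List[Point], x_left: float, W: float
) -> List[List[Point]]:
    """Return the three candidates: reference area and two inscribed rectangles."""
    mirrored = mirror_about_vertical_centerline(inscribed, x_left, W)
    return [reference, inscribed, mirrored]
```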

在本公开的一种示例实施方式中，参照图7所示，可以首先确定内接矩形相对于矩形参考区域的偏转角∠α，其中上述偏转角∠α和上述∠BDC具有一一对应的关系，举例而言，在上述∠BDC＝32°时，∠α＝22°；在上述∠BDC＝20°时，∠α＝15°。可以首先确定上述不同的矩形参考区域中∠BDC与上述偏转角之间的对应关系，在得到上述∠BDC之后，可以根据上述对应关系确定上述偏转角，然后基于偏转角计算上述内接矩形的各点的坐标值。In an exemplary embodiment of the present disclosure, referring to FIG. 7 , the deflection angle ∠α of the inscribed rectangle relative to the rectangular reference area may be determined first, wherein the above deflection angle ∠α and the above ∠BDC have a one-to-one correspondence , for example, when the above ∠BDC=32°, ∠α=22°; when the above ∠BDC=20°, ∠α=15°. The corresponding relationship between ∠BDC and the above-mentioned deflection angle in the above-mentioned different rectangular reference regions can be determined first. After the above-mentioned ∠BDC is obtained, the above-mentioned deflection angle can be determined according to the above-mentioned corresponding relationship, and then each of the above-mentioned inscribed rectangles can be calculated based on the deflection angle. The coordinate value of the point.

In this exemplary embodiment, the correspondence between ∠BDC and the deflection angle may be tabulated at a resolution of 1 degree, or at 0.5 degrees, 0.1 degrees, and so on; this is not specifically limited in this exemplary embodiment.
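The correspondence described above can be held in a simple lookup table. The following is a minimal sketch, assuming ∠BDC is the angle between the diagonal BD and the side DC, so that tan ∠BDC = H/W. The names `angle_bdc`, `DEFLECTION_TABLE`, and `lookup_deflection` are hypothetical, and only the two table entries shown come from the text.

```python
import math

# Hypothetical table: angle BDC in degrees -> deflection angle in degrees.
# Only these two pairs are given in the text; a real table would hold one
# entry per step of the chosen resolution.
DEFLECTION_TABLE = {32.0: 22.0, 20.0: 15.0}


def angle_bdc(W, H):
    """Angle between diagonal BD and side DC of a W-by-H rectangle, in degrees."""
    return math.degrees(math.atan2(H, W))


def lookup_deflection(bdc_deg, resolution=1.0, table=DEFLECTION_TABLE):
    """Snap bdc_deg to the table resolution and return the stored deflection angle."""
    key = round(round(bdc_deg / resolution) * resolution, 6)
    if key not in table:
        raise KeyError(f"no deflection angle tabulated for angle BDC = {key}")
    return table[key]
```

A finer table (0.5 or 0.1 degrees) only changes the `resolution` argument and the number of stored entries.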

In an exemplary embodiment, the preset aspect ratio of the text may also be determined first, and the inscribed rectangle is then determined from the preset aspect ratio and the deflection angle. Taking the inscribed rectangle abcd as an example, let w denote the length of its long side and h the length of its short side, and suppose the deflection angle ∠α is m times ∠BDC. The following geometric relationships then hold:

$$\angle\alpha = m\,\angle BDC$$

$$h\sin\alpha + w\cos\alpha = W$$

In addition, the preset aspect ratio of the text, which may be the preset aspect ratio of the text region, gives w = kh, where k denotes the preset aspect ratio. Solving these equations simultaneously yields h and w, from which the coordinates of the four points a, b, c, and d are calculated. For example, if point A has coordinates (x, y), then point a is (x, y + H - h·cos α), point b is (x + W - h·sin α, y), point c is (x + W, y + h·cos α), and point d is (x + h·sin α, y + H).
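The calculation above can be sketched in a few lines. This is an illustrative sketch only: the function name `inscribed_rectangle` and its argument order are assumptions, and it uses just the two relations stated in the text (h·sin α + w·cos α = W and w = kh) together with the listed vertex formulas.

```python
import math


def inscribed_rectangle(x, y, W, H, alpha_deg, k):
    """Return (h, w, (a, b, c, d)) for the inscribed rectangle abcd.

    (x, y) is the corner A of the W-by-H reference region, alpha_deg the
    deflection angle in degrees, and k the preset aspect ratio (w = k * h).
    """
    alpha = math.radians(alpha_deg)
    # Substituting w = k*h into h*sin(alpha) + w*cos(alpha) = W gives h directly.
    h = W / (math.sin(alpha) + k * math.cos(alpha))
    w = k * h
    a = (x, y + H - h * math.cos(alpha))
    b = (x + W - h * math.sin(alpha), y)
    c = (x + W, y + h * math.cos(alpha))
    d = (x + h * math.sin(alpha), y + H)
    return h, w, (a, b, c, d)
```

The sketch does not check that the chosen α and k are also compatible with the height H of the reference region; a caller would need to validate that separately.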

In an exemplary embodiment of the present disclosure, if W and H of the rectangular reference region are equal (that is, the rectangular reference region is a square), the inscribed rectangle may be determined from the preset aspect ratio and the deflection angle, and the candidate regions are determined accordingly. When the rectangular reference region is a square, ∠α = ∠BDC = 45°, and substituting this into the relation h·sin α + w·cos α = W gives

$$\frac{\sqrt{2}}{2}\,(h + w) = W$$

The values of h and w can then be calculated using w = kh, and from them the coordinates of the four points a, b, c, and d. For example, if point A has coordinates (x, y), then the coordinates of point a are

$$\left(x,\; y + H - \tfrac{\sqrt{2}}{2}\,h\right)$$

the coordinates of point b are

$$\left(x + W - \tfrac{\sqrt{2}}{2}\,h,\; y\right)$$

the coordinates of point c are

$$\left(x + W,\; y + \tfrac{\sqrt{2}}{2}\,h\right)$$

and the coordinates of point d are

$$\left(x + \tfrac{\sqrt{2}}{2}\,h,\; y + H\right)$$

In yet another exemplary embodiment, when the inscribed rectangle is determined from the preset aspect ratio and the deflection angle, if the initial image is an image of a bank card, it may be assumed approximately that ∠α = ∠BDC.

In this case, the following geometric relationships are obtained:

$$\sin\alpha = \frac{H}{\sqrt{W^{2}+H^{2}}}$$

$$\cos\alpha = \frac{W}{\sqrt{W^{2}+H^{2}}}$$

$$h\sin\alpha + w\cos\alpha = W$$

Substituting w = kh into these relationships and solving the equations simultaneously yields h and w, from which the coordinates of the four points a, b, c, and d are calculated. For example, if point A has coordinates (x, y), then point a is (x, y + H - h·cos α), point b is (x + W - h·sin α, y), point c is (x + W, y + h·cos α), and point d is (x + h·sin α, y + H).
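For reference, the relationships for the bank card case together with w = kh admit a closed-form solution. The derivation below is a worked consequence of those equations; it is not stated in the original text.

```latex
h\left(\frac{H}{\sqrt{W^{2}+H^{2}}} + k\,\frac{W}{\sqrt{W^{2}+H^{2}}}\right) = W
\quad\Longrightarrow\quad
h = \frac{W\sqrt{W^{2}+H^{2}}}{H + kW},
\qquad
w = kh = \frac{kW\sqrt{W^{2}+H^{2}}}{H + kW}
```

With h known, the vertex coordinates follow from the formulas for a, b, c, and d using the same values of sin α and cos α.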

In this exemplary embodiment, the preset aspect ratio is determined according to the initial image and the text in the initial image. For example, if the initial image is a bank card image and the preset type of text is the bank card number, the preset aspect ratio k may be 14. The preset aspect ratio is not specifically limited in this exemplary embodiment.

The inscribed rectangle efgj is calculated in the same way as the inscribed rectangle abcd, and the details are not repeated here.

After the two inscribed rectangles are obtained, the inscribed rectangles and the rectangular reference region itself are used as the candidate regions, and step S430 is then performed.

In step S430, text distribution information is determined for each of the candidate regions.

The text distribution information is information representing where the text is distributed, and may include a text distribution coefficient, the distances between characters in the text, and the like; this is not specifically limited in this exemplary embodiment.

In this exemplary embodiment, referring to FIG. 9 and FIG. 10, after the candidate regions are obtained, a target image corresponding to each candidate region may be acquired; that is, the image of the candidate region is extracted from the initial image and transformed so that the edges of the candidate region are also parallel to the reference direction. Specifically, referring to FIG. 10 and taking the inscribed rectangle abcd as an example, let the four points of the candidate region in the original image be P₁, P₂, P₃, and P₄ (where P₁ is the point closest to the upper-left corner of the image, and P₁, P₂, P₃, P₄ are arranged in clockwise order), with coordinate set PS = {(x₁, y₁), (x₂, y₂), (x₃, y₃), (x₄, y₄)}, and let the width and height of the candidate region be w1 and h1. The affine transformation matrix from PS to the point set PD = {(0, 0), (w1, 0), (w1, h1), (0, h1)} is solved, and the corresponding affine transformation is applied to the original image. FIG. 9 shows the target image obtained when the rectangular reference region is used as the candidate region; FIG. 10 and FIG. 11 show the target image determined when the inscribed rectangle abcd is the candidate region; and FIG. 12 and FIG. 13 show the target image determined when the inscribed rectangle efgj is the candidate region.
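The mapping from PS to PD can be illustrated without any imaging library. The sketch below assumes that three of the four corners are used to solve the affine matrix by Cramer's rule (an affine map is fixed by three non-collinear point pairs, and the fourth corner of a rectangle then maps correctly as well). The function names `affine_from_three_points` and `apply_affine` are hypothetical.

```python
def affine_from_three_points(src, dst):
    """Solve the 2x3 affine matrix mapping three src points onto three dst points.

    Returns [(a, b, tx), (c, d, ty)] such that
    u = a*x + b*y + tx and v = c*x + d*y + ty.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")
    rows = []
    for i in (0, 1):  # i = 0 solves the u row, i = 1 the v row
        u1, u2, u3 = dst[0][i], dst[1][i], dst[2][i]
        a = (u1 * (y2 - y3) - y1 * (u2 - u3) + (u2 * y3 - u3 * y2)) / det
        b = (x1 * (u2 - u3) - u1 * (x2 - x3) + (x2 * u3 - x3 * u2)) / det
        t = (x1 * (y2 * u3 - y3 * u2) - y1 * (x2 * u3 - x3 * u2)
             + u1 * (x2 * y3 - x3 * y2)) / det
        rows.append((a, b, t))
    return rows


def apply_affine(matrix, point):
    """Map one (x, y) point through the 2x3 affine matrix."""
    (a, b, tx), (c, d, ty) = matrix
    px, py = point
    return (a * px + b * py + tx, c * px + d * py + ty)
```

In practice the pixel resampling would be done by an imaging library; OpenCV's `cv2.getAffineTransform` and `cv2.warpAffine` are one common choice, although the text does not name a library.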

The rotation angle of the text in the target image can then be determined. Specifically, referring to FIG. 14, FIG. 15, and FIG. 16, and taking the case where the candidate region is the rectangular reference region itself as an example, the processor may apply the Shi-Tomasi algorithm to the target image to detect corner points in the candidate region and obtain a plurality of key points, and then obtain the minimum bounding rectangle S of the text from the key points. The angle between the minimum bounding rectangle S and the candidate region is taken as the rotation angle and denoted V. In this exemplary embodiment, the maximum number of corner points may be set to 100, the quality level to 0.005, and the minimum distance to 2, so that the corner points are concentrated in the region where the text is located. The corner detection parameters may also be customized according to user requirements and are not specifically limited in this exemplary embodiment.

In this exemplary embodiment, the text distribution information includes a text distribution coefficient. Referring to FIG. 15, after the key points are obtained, a straight line may be fitted to the key points by the least squares method to obtain the slope K of the fitted line, and the text distribution coefficient R may then be calculated from the slope and the rotation angle. Specifically, R = VK.
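The line fit and the coefficient can be sketched as follows. This is an illustrative sketch that reads R = VK as the product of the rotation angle V and the slope K and uses ordinary least squares of y on x; the function names are hypothetical.

```python
def least_squares_slope(points):
    """Slope K of the ordinary least-squares line y = K*x + c through the key points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    if sxx == 0:
        raise ValueError("all key points share the same x coordinate")
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    return sxy / sxx


def text_distribution_coefficient(rotation_angle, points):
    """R = V * K, with V the rotation angle and K the fitted slope."""
    return rotation_angle * least_squares_slope(points)
```

For key points lying on a horizontal line the slope is zero, so R is zero regardless of V, which matches the idea that a small coefficient indicates near-horizontal text.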

After the text distribution information of each of the candidate regions has been calculated, step S440 may be performed.

In step S440, a target text region is determined among the candidate regions based on the text distribution information.

The target text region may be determined from the text distribution information in several ways. The rotation angle of the text may be determined from the text distribution information, and the candidate region whose target image contains the text closest to horizontal is taken as the target text region. Alternatively, the proportion of the candidate region's area occupied by the text may be determined from the text distribution information, and the candidate region that the text most nearly fills is taken as the target text region.

In an exemplary embodiment of the present disclosure, the candidate region with the smallest text distribution coefficient may be used as the target text region. The text in the candidate region with the smallest text distribution coefficient is closest to horizontal, so the resulting target text region is the most accurate; this approach therefore improves the accuracy of determining the target text region.

In another exemplary embodiment of the present disclosure, the processor may perform the following operations for each candidate region: first, the target image corresponding to the candidate region is divided into a plurality of sub-regions; next, text density information of each sub-region is determined according to the text distribution coefficient; then the standard deviation of the text density information over the sub-regions of each candidate region is calculated.

After the standard deviations of the candidate regions are obtained, the candidate region corresponding to the target image with the smallest standard deviation may be used as the target text region. A smaller standard deviation of text density across the sub-regions of a candidate region indicates a more uniform text distribution, and selecting the candidate region with the most uniform text distribution as the target text region improves the accuracy of the determined target text region.
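The standard-deviation criterion can be illustrated as below. The text does not specify how the density of a sub-region is computed, so this sketch makes two labeled assumptions: the density is the number of key points falling in each of several equal-width vertical strips of the target image, and the population standard deviation is used. All function names are hypothetical.

```python
from statistics import pstdev


def subregion_densities(points, width, n_cols):
    """Assumed density measure: key-point count in each of n_cols equal-width strips."""
    counts = [0] * n_cols
    for px, _ in points:
        # Clamp so that a point on the right edge falls into the last strip.
        col = min(int(px / width * n_cols), n_cols - 1)
        counts[col] += 1
    return counts


def pick_most_uniform(densities_by_region):
    """Return the name of the candidate region whose densities have the smallest std dev."""
    return min(densities_by_region, key=lambda name: pstdev(densities_by_region[name]))
```

For example, `pick_most_uniform({"ABCD": [4, 0, 0, 4], "abcd": [2, 2, 2, 2]})` selects `"abcd"`, whose density is the same in every strip.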

In yet another exemplary embodiment of the present disclosure, the processor may also directly use OCR (Optical Character Recognition) to determine the target text region among the candidate regions.

It should be noted that the target text region may be determined among the candidate regions in various ways; the above are illustrative examples, and the present disclosure does not specifically limit them.

Further, referring to FIG. 17, the method for determining the text region is described through a specific example. First, step S1710 may be performed: acquiring an initial image including the text, and determining in the initial image a rectangular reference region including the text. Step S1720 is then performed: using the rectangular reference region and the inscribed rectangles as the candidate regions. Specifically, when step S1720 is performed, step S1721 may be performed first to determine whether the rectangular reference region is a non-square rectangle or a square. If it is a non-square rectangle, step S1722 is performed to determine the deflection angle according to the geometric relationship, followed by step S1723 to determine the inscribed rectangle in the rectangular reference region based on the deflection angle. If the rectangular reference region is a square, step S1724 is performed to acquire the preset aspect ratio of the text and determine the deflection angle according to the geometric relationship, followed by step S1725 to determine the inscribed rectangle in the rectangular reference region based on the preset aspect ratio and the deflection angle.

After the inscribed rectangles are obtained, the text distribution coefficient in the text distribution information may be determined. Step S1730 may be performed to acquire the target image corresponding to each candidate region; step S1740 is then performed to carry out corner detection on each target image and obtain a plurality of key points; step S1750 is then performed to determine the rotation angle of the text in the target image; step S1760 follows, in which a straight line is fitted to the key points and the slope of the line is determined; finally, step S1770 is performed to determine the text distribution coefficient based on the rotation angle and the slope.

After the text distribution coefficients are obtained, step S1780 may be performed to determine the candidate region corresponding to the target image with the smallest absolute value of the text distribution coefficient as the target text region, which completes the determination of the target text region.
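Step S1780 amounts to an arg-min over the candidate regions. A minimal sketch, assuming the coefficients have already been computed and are stored in a dictionary keyed by hypothetical region names:

```python
def select_target_region(coefficients):
    """Return the candidate whose text distribution coefficient has the smallest absolute value."""
    if not coefficients:
        raise ValueError("no candidate regions supplied")
    return min(coefficients, key=lambda name: abs(coefficients[name]))
```

With three candidates (the reference region and its two inscribed rectangles), a call such as `select_target_region({"ABCD": -0.8, "abcd": 0.05, "efgj": 1.3})` picks `"abcd"`.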

In summary, compared with the prior art: first, an approximate rectangular reference region is determined in the initial image, and candidate regions are then determined within it based on geometric relationships, so that target detection can begin without first correcting the image, which reduces the amount of computation. In addition, an edge of the rectangular reference region is parallel to the reference direction in the initial image, so the detection boxes can all share the same angle during detection, which further reduces the amount of computation and increases the computation speed. Second, the inscribed rectangles of the rectangular reference region and the region itself are used as candidate regions, and the coordinates of the vertices of each candidate region are computed from the relevant geometric relationships; accuracy is maintained without requiring complex operations, which reduces the amount of computation still further. Third, after the candidate regions are determined, the target text region is determined based on the text distribution information of the candidate regions. Specifically, the text distribution coefficient in the text distribution information is obtained from the rotation angle of the text and the slope of the line fitted to the key points, and the text distribution coefficient is used to determine the target text region, which ensures the accuracy of the determination. That is, the present disclosure reduces the amount of computation and increases the speed of determining the text region while maintaining the accuracy of the determination.

It should be noted that the above drawings are only schematic illustrations of the processes included in the method according to exemplary embodiments of the present disclosure and are not intended to be limiting. It is readily understood that the processes shown in the drawings do not indicate or limit their chronological order. It is also readily understood that these processes may be performed synchronously or asynchronously, for example in multiple modules.

Further, referring to FIG. 18, this exemplary embodiment also provides a text region determination apparatus 1800, which includes a target detection module 1810, a first determination module 1820, an information extraction module 1830, and an image generation module 1840, wherein:

The target detection module 1810 may be configured to acquire an initial image including text and to determine in the initial image a rectangular reference region including the text, where one edge of the rectangular reference region is parallel to a reference direction in the initial image. Specifically, target detection is performed on the initial image using rectangular detection boxes to obtain a plurality of intermediate text regions; the accuracy of each intermediate text region and the confidence that each intermediate text region includes text of a preset type are determined; and the rectangular reference region is determined among the intermediate text regions according to the accuracy and the confidence.

The first determination module 1820 may be configured to determine candidate regions according to the rectangular reference region. Specifically, the inscribed rectangles of the rectangular reference region may be determined first; the rectangular reference region is used as a candidate region, and the inscribed rectangles are also used as candidate regions.

In an exemplary embodiment, when determining an inscribed rectangle of the rectangular reference region, the first determination module 1820 may first determine the deflection angle of the inscribed rectangle relative to the rectangular reference region according to the geometric relationship, and then determine the inscribed rectangle in the rectangular reference region based on the deflection angle.

In this exemplary embodiment, when determining the inscribed rectangle in the rectangular reference region based on the deflection angle, the first determination module 1820 may first acquire the preset aspect ratio of the text, and then determine the inscribed rectangle in the rectangular reference region based on the preset aspect ratio and the deflection angle.

The information extraction module 1830 may be configured to determine text distribution information for each of the candidate regions. Specifically, the text distribution information includes a text distribution coefficient. When determining the text distribution information of each candidate region, the information extraction module 1830 may first acquire the target image corresponding to each candidate region, then determine the rotation angle of the text in the target image relative to the candidate region, and finally determine the text distribution coefficient according to the rotation angle.

In this exemplary embodiment, when determining the text distribution coefficient according to the rotation angle, the information extraction module 1830 may first perform corner detection on each target image to obtain a plurality of key points, then fit a straight line to the key points and determine the slope of the line, and finally determine the text distribution coefficient based on the rotation angle and the slope.

The image generation module 1840 may be configured to determine the target text region among the candidate regions based on the text distribution information.

In an exemplary embodiment, the image generation module 1840 is configured to determine, as the target text region, the candidate region corresponding to the target image with the smallest absolute value of the text distribution coefficient.

In another exemplary embodiment, the image generation module 1840 may be configured to first acquire the target image corresponding to each candidate region; then divide each target image into a plurality of sub-regions; next, determine the text density information in each sub-region according to the text distribution information; then calculate the standard deviation of the text density information over the sub-regions of each candidate region; and finally determine the candidate region corresponding to the target image with the smallest standard deviation as the target text region.

The specific details of each module in the above apparatus have been described in detail in the embodiments of the method; for details not disclosed here, refer to the embodiments of the method, which are not repeated.

Exemplary embodiments of the present disclosure also provide an electronic device for performing the above text region determination method; the electronic device may be the terminal 310 or the server 320 described above. In general, the electronic device may include a processor and a memory, where the memory stores instructions executable by the processor, and the processor is configured to perform the above text region determination method by executing the executable instructions.

The structure of the electronic device is described below by way of example, taking the mobile terminal 1900 in FIG. 19 as an example. Those skilled in the art will understand that, apart from components specifically intended for mobile use, the structure in FIG. 19 can also be applied to stationary devices.

As shown in FIG. 19, the mobile terminal 1900 may specifically include a processor 1901, a memory 1902, a bus 1903, a mobile communication module 1904, an antenna 1, a wireless communication module 1905, an antenna 2, a display screen 1906, a camera module 1907, an audio module 1908, a power module 1909, and a sensor module 1910.

The processor 1901 may include one or more processing units. For example, the processor 1901 may include an AP (Application Processor), a modem processor, a GPU (Graphics Processing Unit), an ISP (Image Signal Processor), a controller, an encoder, a decoder, a DSP (Digital Signal Processor), a baseband processor, and/or an NPU (Neural-Network Processing Unit). The text region determination method in this exemplary embodiment may be performed by the AP, the GPU, or the DSP; when the method involves neural-network processing, it may be performed by the NPU.

The processor 1901 may be connected to the memory 1902 or other components through the bus 1903.

The memory 1902 may be used to store computer-executable program code, which includes instructions. The processor 1901 performs the various functional applications and data processing of the mobile terminal 1900 by executing the instructions stored in the memory 1902. The memory 1902 may also store application data, such as images, videos, and other files.

The communication functions of the mobile terminal 1900 may be implemented by the mobile communication module 1904, the antenna 1, the wireless communication module 1905, the antenna 2, the modem processor, the baseband processor, and the like. The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. The mobile communication module 1904 may provide mobile communication solutions such as 2G, 3G, 4G, and 5G for the mobile terminal 1900. The wireless communication module 1905 may provide wireless communication solutions such as wireless local area network, Bluetooth, and near field communication for the mobile terminal 1900.

The sensor module 1910 may include a depth sensor 19101, a pressure sensor 19102, a gyroscope sensor 19103, an air pressure sensor 19104, and the like, to implement the corresponding sensing and detection functions.

Those skilled in the art will appreciate that aspects of the present disclosure may be implemented as a system, a method, or a program product. Accordingly, aspects of the present disclosure may be embodied as an entirely hardware implementation, an entirely software implementation (including firmware, microcode, and the like), or an implementation combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".

Exemplary embodiments of the present disclosure also provide a computer-readable storage medium on which is stored a program product capable of implementing the method described above in this specification. In some possible implementations, aspects of the present disclosure may also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification.

It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, in contrast, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the foregoing.

Furthermore, program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user computing device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).

Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the claims.

It is to be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A text area determination method, comprising:

acquiring an initial image including text, and determining a rectangular reference area including the text in the initial image, wherein one edge of the rectangular reference area is parallel to a reference direction in the initial image;

determining candidate regions according to the rectangular reference region;

determining text distribution information of each of the candidate regions; and

determining a target text region among the candidate regions based on the text distribution information.

2. The method according to claim 1, wherein the determining a rectangular reference area including the text in the initial image comprises:

performing target detection on the initial image to obtain a plurality of intermediate text regions;

determining the accuracy of each of the intermediate text regions and the confidence that each of the intermediate text regions includes text of a preset type;

determining the rectangular reference area among the plurality of intermediate text regions according to the accuracy and the confidence.
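
As an illustration only, the selection step in claim 2 might be sketched as follows. The claim does not state how accuracy and confidence are combined, so the product score, the `min_confidence` threshold, and the `IntermediateRegion` type are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class IntermediateRegion:
    """Axis-aligned detection box with two hypothetical scores in [0, 1]."""
    x: float
    y: float
    width: float
    height: float
    accuracy: float    # assumed score: how well the box localizes the text
    confidence: float  # assumed score: box contains text of the preset type


def pick_reference_region(
    regions: Sequence[IntermediateRegion],
    min_confidence: float = 0.5,
) -> Optional[IntermediateRegion]:
    """Return the eligible region with the highest accuracy * confidence.

    Regions below min_confidence are discarded first; None is returned
    when no region is eligible. Both rules are assumptions of this sketch.
    """
    eligible = [r for r in regions if r.confidence >= min_confidence]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.accuracy * r.confidence)
```

A weighted sum or a two-stage filter would fit the claim wording equally well; the product is used here only because it penalizes a region that is weak on either score.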

3. The method according to claim 1, wherein the determining a candidate region according to the rectangular reference region comprises:

determining the inscribed rectangle of the rectangular reference area;

using both the rectangular reference area and the inscribed rectangle as the candidate regions.

4. The method according to claim 3, wherein the determining the inscribed rectangle of the rectangular reference area comprises:

determining the deflection angle of the inscribed rectangle relative to the rectangular reference area according to a geometric relationship;

determining the inscribed rectangle in the rectangular reference area based on the deflection angle.

5. The method according to claim 4, wherein the determining the inscribed rectangle in the rectangular reference area based on the deflection angle comprises:

acquiring a preset aspect ratio of the text;

determining the inscribed rectangle in the rectangular reference area based on the preset aspect ratio and the deflection angle.
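
One way to read the geometry behind claims 4 and 5 is sketched below. It assumes that the inscribed rectangle (width `w`, height `h`, with `w = aspect * h`) touches all four sides of the axis-aligned reference rectangle, so that `box_w = w*cos(theta) + h*sin(theta)` and `box_h = w*sin(theta) + h*cos(theta)`. This relationship, the function name, and the parameter names are assumptions of the sketch and are not taken from the claims.

```python
import math
from typing import Tuple


def inscribed_rectangle(
    box_w: float, box_h: float, aspect: float
) -> Tuple[float, float, float]:
    """Return (theta, w, h) for a rotated rectangle inscribed in a box.

    theta is the deflection angle in radians, in [0, pi/2). Under the
    assumed geometry, eliminating h from the two side equations gives
        tan(theta) = (box_h*aspect - box_w) / (box_w*aspect - box_h).
    """
    if box_w <= 0 or box_h <= 0 or aspect <= 0:
        raise ValueError("box sides and aspect ratio must be positive")
    num = box_h * aspect - box_w
    den = box_w * aspect - box_h
    if num < 0 or den <= 0:
        # The preset aspect ratio is incompatible with this box under the
        # assumed geometry (for example, the text is less elongated than
        # the box that is supposed to enclose it).
        raise ValueError("no inscribed rectangle with this aspect ratio")
    theta = math.atan2(num, den)
    h = box_w / (aspect * math.cos(theta) + math.sin(theta))
    return theta, aspect * h, h
```

For a box that is already tight around horizontal text (`box_w / box_h == aspect`), the numerator is zero and the sketch returns a deflection angle of zero with the inscribed rectangle equal to the box.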

6. The method according to claim 1, wherein the text distribution information comprises a text distribution coefficient, and the determining the text distribution information of each of the candidate regions comprises:

acquiring a target image corresponding to each of the candidate regions;

determining the rotation angle of the text in the target image relative to the candidate region;

determining the text distribution coefficient according to the rotation angle.

7. The method according to claim 6, wherein the determining the text distribution coefficient according to the rotation angle comprises:

performing corner detection on each of the target images to obtain a plurality of key points;

fitting a straight line based on the key points, and determining the slope of the straight line;

determining the text distribution coefficient based on the rotation angle and the slope.
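
The line-fitting step of claim 7 could be sketched as below, assuming the key points have already been produced by some corner detector (not shown) and are given as `(x, y)` pixel coordinates. Ordinary least squares is one possible fitting method; the claim does not name one. How the slope is combined with the rotation angle into the text distribution coefficient is not specified in the claim and is deliberately left out of the sketch.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def fit_line_slope(points: Sequence[Point]) -> float:
    """Least-squares slope of y against x for the given key points."""
    if len(points) < 2:
        raise ValueError("need at least two key points")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        # All key points share one x value: the line is vertical and has
        # no finite slope.
        raise ValueError("key points are vertically aligned")
    return sxy / sxx


def slope_to_degrees(slope: float) -> float:
    """Angle of the fitted line relative to the x axis, in degrees."""
    return math.degrees(math.atan(slope))
```

A slope near zero suggests the key points lie along a line parallel to the horizontal edge of the candidate region, which is the situation a well-aligned candidate would be expected to produce.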

8. The method according to claim 6, wherein the determining the target text region in the candidate region based on the text distribution information comprises:

determining, as the target text region, the candidate region corresponding to the target image with the smallest absolute value of the text distribution coefficient.

9. The method according to claim 1, wherein the determining the target text region in the candidate region based on the text distribution information comprises:

acquiring a target image corresponding to each of the candidate regions;

dividing each target image into a plurality of sub-regions;

determining text density information in each of the sub-regions according to the text distribution information;

calculating the standard deviation of the text density information of the sub-regions in each of the candidate regions;

determining, as the target text region, the candidate region corresponding to the target image with the smallest standard deviation.
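
A minimal sketch of the density-based selection in claim 9 follows. It assumes each target image has already been reduced to a binary mask (1 for a text pixel, 0 for background), that the sub-regions are equal-width vertical strips, and that density is the fraction of text pixels per strip. The mask representation, the strip layout, and the function names are illustrative assumptions, not details from the claim.

```python
from statistics import pstdev
from typing import Dict, List, Sequence

Mask = Sequence[Sequence[int]]  # rows of 0/1 values, 1 marks a text pixel


def strip_densities(mask: Mask, n_strips: int) -> List[float]:
    """Fraction of text pixels in each of n_strips vertical strips."""
    height, width = len(mask), len(mask[0])
    if n_strips < 1 or width < n_strips:
        raise ValueError("need at least one pixel column per strip")
    bounds = [i * width // n_strips for i in range(n_strips + 1)]
    densities = []
    for left, right in zip(bounds, bounds[1:]):
        text_pixels = sum(sum(row[left:right]) for row in mask)
        densities.append(text_pixels / (height * (right - left)))
    return densities


def pick_most_uniform(candidates: Dict[str, Mask], n_strips: int = 4) -> str:
    """Name of the candidate whose strip densities vary the least."""
    return min(
        candidates,
        key=lambda name: pstdev(strip_densities(candidates[name], n_strips)),
    )
```

The intuition is that a candidate region fitted tightly to the text spreads text evenly across its strips, whereas a loose or misaligned region leaves some strips nearly empty and therefore shows a larger standard deviation.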

10. A device for determining a text area, comprising:

an object detection module, configured to acquire an initial image including text, and to determine a rectangular reference area including the text in the initial image, wherein one edge of the rectangular reference area is parallel to a reference direction in the initial image;

a first determination module, configured to determine candidate regions according to the geometric relationship of the rectangular reference region;

an information extraction module, configured to determine text distribution information of each of the candidate regions;

a second determination module, configured to determine a target text region among the candidate regions based on the text distribution information.

11. A computer-readable storage medium on which a computer program is stored, wherein, when the program is executed by a processor, the text area determination method according to any one of claims 1 to 9 is implemented.

12. An electronic device, comprising:

one or more processors; and

a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the text area determination method according to any one of claims 1 to 9.