CN111414905B

CN111414905B - Text detection method, text detection device, electronic equipment and storage medium

Info

Publication number: CN111414905B
Application number: CN202010117641.0A
Authority: CN
Inventors: 张博熠; 马文伟; 刘设伟; 王亚领
Original assignee: Taikang Insurance Group Co Ltd; Taikang Online Property Insurance Co Ltd
Current assignee: Taikang Insurance Group Co Ltd; Taikang Online Property Insurance Co Ltd
Priority date: 2020-02-25
Filing date: 2020-02-25
Publication date: 2023-08-18
Anticipated expiration: 2040-02-25
Also published as: CN111414905A

Abstract

The application provides a text detection method, a text detection device, electronic equipment and a storage medium, wherein a picture to be detected is firstly obtained; then detecting a target text area in the picture to be detected, wherein the target text area comprises at least one of a header text area, a seal text area and a layout text area; determining a target text detection model corresponding to the target text region according to a preset corresponding relation between the text region and the text detection model; and detecting the text in the target text region by using the target text detection model to obtain a target text detection box. According to the technical scheme, the target text region in the picture to be detected is detected, the text in the region is detected by adopting the text detection model matched with the target text region in a targeted manner by combining the advantages of different text detection models, the integrity of text detection is improved, and powerful support is provided for subsequent text recognition.

Description

Text detection method, text detection device, electronic equipment and storage medium

Technical Field

The present application relates to the field of image recognition technologies, and in particular, to a text detection method, a text detection device, an electronic device, and a storage medium.

Background

With economic development and rising living standards, more and more people choose to purchase medical, commercial, financial, and other insurance. Some insurance companies have gradually begun to offer self-service claim settlement. For example, in the medical claim settlement process, users only need to photograph their outpatient or inpatient bills and upload them to the insurance company's system, and the insurance company's operators then enter the information on the uploaded bill pictures into the claim settlement system.

Optical character recognition (OCR, Optical Character Recognition) can improve the efficiency of bill entry to some extent. However, pictures uploaded by users are affected by the shooting angle and may contain curved or inclined characters. In addition, because bill layouts are complex, they contain long header text, text inside seals, and the like, all of which greatly increase the difficulty of text detection and recognition. For complex layouts, related text detection techniques either fail to detect text or detect it incompletely, and the quality of detection directly affects the final text recognition result. Solving text detection in complex layouts is therefore particularly important in optical character recognition.

Disclosure of Invention

The application provides a text detection method, a text detection device, electronic equipment and a storage medium, which are used for improving the integrity of text detection.

In order to solve the problems, the application discloses a text detection method, which comprises the following steps:

acquiring a picture to be detected;

detecting a target text region in the picture to be detected, wherein the target text region comprises at least one of a header text region, a seal text region and a layout text region;

determining a target text detection model corresponding to the target text region according to a preset corresponding relation between the text region and the text detection model, wherein the corresponding relation between the text region and the text detection model comprises at least one of the following: performing text detection on the seal text region by adopting a first Psenet model, performing text detection on the header text region by adopting a second Psenet model, and performing text detection on the layout text region by adopting an EAST model;

and detecting the text in the target text region by adopting the target text detection model to obtain a target text detection box.

In an optional implementation manner, the step of acquiring the picture to be detected includes:

receiving an original bill picture;

determining the rotation angle of the original bill picture according to the included angle between the edge straight line and the horizontal axis of the original bill picture;

and rotating the original bill picture to the horizontal direction according to the rotation angle to obtain the picture to be detected.

In an optional implementation manner, before the step of determining the rotation angle of the original bill picture according to the included angle between the edge straight line and the horizontal axis of the original bill picture, the method further includes:

and performing edge detection on the original bill picture using Hough line detection to obtain the edge straight line of the original bill picture.

In an optional implementation manner, the step of detecting the target text region in the picture to be detected includes:

performing seal detection on the picture to be detected to obtain a seal detection frame;

determining a preprinted seal area and a non-preprinted seal area according to the length-width ratio of the seal detection frame;

and determining the pre-printing seal area and/or the non-pre-printing seal area as the seal text area.

In an alternative implementation manner, the step of detecting the text in the target text area by using the target text detection model to obtain a target text detection box includes:

detecting the text in the seal text area by adopting a first Psenet model obtained by pre-training to obtain an initial seal text detection box;

and carrying out text correction on the initial seal text detection box to obtain a horizontal seal text detection box.

detecting the picture to be detected by adopting a second Psenet model obtained through pre-training to obtain a plurality of text detection boxes;

determining a header text region according to the length-width ratio of the text detection boxes;

the step of detecting the text in the target text region by using the target text detection model to obtain a target text detection box comprises the following steps:

and determining a text detection box corresponding to the header text region in the text detection boxes as a header text detection box.

determining the area except the header text area and the seal text area in the picture to be detected as the layout text area;

and detecting the text in the layout text area by adopting an EAST model obtained through pre-training to obtain a layout text detection box.

In order to solve the above problems, the present application also discloses a text detection device, which includes:

the acquisition module is configured to acquire a picture to be detected;

the region detection module is configured to detect a target text region in the picture to be detected, wherein the target text region comprises at least one of a header text region, a seal text region and a layout text region;

the model determining module is configured to determine a target text detection model corresponding to the target text region according to a preset corresponding relation between the text region and the text detection model, wherein the corresponding relation between the text region and the text detection model comprises at least one of the following: performing text detection on the seal text region by adopting a first Psenet model, performing text detection on the header text region by adopting a second Psenet model, and performing text detection on the layout text region by adopting an EAST model;

and the text detection module is configured to detect the text in the target text area by adopting the target text detection model to obtain a target text detection box.

In an alternative implementation, the acquisition module is specifically configured to:

receiving an original bill picture;

In an alternative implementation, the acquisition module is further configured to:

In an alternative implementation, the area detection module is specifically configured to:

In an alternative implementation, the text detection module is specifically configured to:

the text detection module is specifically configured to:

In order to solve the above problems, the present application also discloses an electronic device, including:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the text detection method of any of the embodiments.

In order to solve the above-mentioned problem, the present application also discloses a storage medium, which when the instructions in the storage medium are executed by a processor of an electronic device, enables the electronic device to execute the text detection method according to any embodiment.

Compared with the prior art, the application has the following advantages:

the technical scheme of the application provides a text detection method, a text detection device, electronic equipment and a storage medium, wherein a picture to be detected is firstly obtained; then detecting a target text area in the picture to be detected, wherein the target text area comprises at least one of a header text area, a seal text area and a layout text area; determining a target text detection model corresponding to the target text region according to a preset corresponding relation between the text region and the text detection model, wherein the corresponding relation between the text region and the text detection model comprises at least one of the following: performing text detection on the seal text region by adopting a first Psenet model, performing text detection on the header text region by adopting a second Psenet model, and performing text detection on the layout text region by adopting an EAST model; and detecting the text in the target text region by using the target text detection model to obtain a target text detection box. According to the technical scheme, the target text region in the picture to be detected is detected, the text in the region is detected by adopting the text detection model matched with the target text region in a targeted manner by combining the advantages of different text detection models, the integrity of text detection is improved, and powerful support is provided for subsequent text recognition.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments of the present application will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart showing steps of a text detection method according to an embodiment of the present application;

FIG. 2 is a flowchart showing steps for detecting seal text according to an embodiment of the present application;

FIG. 3 is a flowchart showing steps for detecting header text according to one embodiment of the present application;

FIG. 4 is a flowchart showing steps for detecting layout text according to an embodiment of the present application;

FIG. 5 is a flowchart showing steps of a specific implementation manner of a text detection method according to an embodiment of the present application;

FIG. 6 is a diagram showing the effect of text detection on a manual seal using a first Psenet model according to an embodiment of the present application;

FIG. 7 is a diagram showing the effect of text detection on a pre-printed seal using a first Psenet model according to an embodiment of the present application;

FIG. 8 is a diagram showing the effect of text detection on header text using a second Psenet model according to an embodiment of the present application;

FIG. 9 shows an effect diagram of text detection of header text using the EAST model;

FIG. 10 is a diagram showing the effect of text detection on layout text using EAST model according to one embodiment of the present application;

FIG. 11 is a block diagram showing a structure of a text detection device according to an embodiment of the present application.

Detailed Description

In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

Text detection is an important and indispensable link in optical character recognition technology, and the completeness of detection is a central concern. Because bill layouts are complex, they contain long header text, seal text, handwritten text, and the like, as well as curved and inclined text caused by the shooting angle. Through analysis, the inventors found that the root cause of the prior art's difficulty in completely detecting text in complex layouts is that a single model is used to detect all of the text types in the layout.

To solve the above problems, an exemplary embodiment of the present application shows a flowchart of a text detection method, as shown in fig. 1, which may include the steps of:

in step S11, a picture to be detected is acquired.

In a specific implementation, an original bill picture can be received first, and then correction processing is carried out on the original bill picture to obtain a picture to be detected.

For example, this step may further include: receiving an original bill picture; determining the rotation angle of the original bill picture according to the included angle between the edge straight line and the horizontal axis of the original bill picture; and rotating the original bill picture to the horizontal direction according to the rotation angle to obtain a picture to be detected.

The edge straight line of the original bill picture can be obtained by performing edge detection on the original bill picture using Hough line detection. Hough line detection can detect the horizontal and vertical edge straight lines in the original bill picture; the original bill picture is then rotated to the horizontal direction according to the included angles between these edge straight lines and the horizontal axis, so that the characters in the bill picture can subsequently be detected and recognized.
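The angle computation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the edge straight lines have already been detected (for example by a Hough transform line detector) and are supplied as hypothetical `(x1, y1, x2, y2)` endpoint tuples. The function names, the folding of vertical edges onto the horizontal reference, and the use of the median are assumptions made for this sketch.

```python
import math
from statistics import median


def line_angle_deg(x1, y1, x2, y2):
    """Angle between a segment and the horizontal axis, folded into [-45, 45).

    Folding by 90 degrees lets horizontal and vertical bill edges report the
    same skew: an edge at 93 degrees and an edge at 3 degrees both give 3.
    """
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    return (angle + 45.0) % 90.0 - 45.0


def rotation_angle(edge_lines):
    """Estimate the skew of the bill from its detected edge lines.

    The median (an assumption of this sketch) limits the influence of a few
    spurious lines. Returns 0.0 when no lines were detected.
    """
    if not edge_lines:
        return 0.0
    return median(line_angle_deg(*line) for line in edge_lines)


# Two roughly horizontal edges and one roughly vertical edge of a skewed bill.
lines = [(0, 0, 200, 10), (0, 300, 200, 311), (0, 0, -15, 300)]
skew = rotation_angle(lines)
```

The picture would then be rotated by the opposite of `skew` with an image library. The sign convention depends on whether the y axis points up or down in the chosen coordinate system, so it should be checked against the library in use.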

In step S12, a target text region in the picture to be detected is detected, the target text region including at least one of a header text region, a seal text region, and a layout text region.

In a specific implementation, seal detection can be performed on the picture to be detected to obtain a seal text region in the picture to be detected; a text detection algorithm (such as a Psenet algorithm) can be adopted to carry out text detection on the picture to be detected, and a header text region is determined according to the length-width ratio of the text detection box; the non-seal text area and the non-header text area in the picture to be detected can be determined as layout text areas.

The target text region may also include a handwritten text region, a tabular text region, a capped text region, and the like.

In step S13, a target text detection model corresponding to the target text region is determined according to a preset correspondence between the text region and the text detection model, wherein the correspondence between the text region and the text detection model includes at least one of: and performing text detection on the seal text area by adopting a first Psenet model, performing text detection on the header text area by adopting a second Psenet model, and performing text detection on the layout text area by adopting an EAST model.

In practical applications there are many text detection models, such as fast RCNN, SSD, YOLO, Psenet, EAST, RRCNN, TextBoxes, and CTPN, and each has its own strengths and weaknesses. For example, the deep learning model EAST detects horizontal or inclined quadrilateral block text well, but it cannot completely detect long header text (overlong text is detected in fragments), and it cannot effectively detect text inside a seal or two-dimensional codes. Another deep learning model, Psenet, detects curved text (such as text inside a seal), long block text (such as header text), and two-dimensional codes well, but it is prone to detection errors on closely spaced text (including small vertical and horizontal spacing).

Thus, each text detection model is adapted to detect a different text type (text region), i.e. there is a correspondence between the text region and the matching text detection model. Specifically, the text detection can be performed on the seal text area by adopting the first Psenet model, the text detection can be performed on the header text area by adopting the second Psenet model, and the text detection can be performed on the layout text area by adopting the EAST model. By combining the advantages of different text detection models and adopting a proper text detection model to detect the corresponding text types, the complete and effective detection of various text types in the complex layout can be realized.
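The preset correspondence between text regions and text detection models can be pictured as a simple lookup table. The sketch below is illustrative only: the region keys, the model labels, and the `select_model` helper are hypothetical names, and real model objects would replace the string labels.

```python
# Preset correspondence between region type and detection model.
# Keys and labels are hypothetical names chosen for this sketch.
REGION_TO_MODEL = {
    "seal": "first_psenet",     # curved text inside seals
    "header": "second_psenet",  # long header text
    "layout": "east",           # quadrilateral block text in the layout
}


def select_model(region_type):
    """Return the model label matched to a target text region type."""
    try:
        return REGION_TO_MODEL[region_type]
    except KeyError:
        raise ValueError(f"no detection model configured for {region_type!r}")


# A picture may contain any subset of the region types.
detected_regions = ["seal", "layout"]
models = [select_model(region) for region in detected_regions]
```

Keeping the correspondence in one table means a further region type could be supported by adding an entry rather than changing the dispatch logic.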

In step S14, the text in the target text region is detected using the target text detection model, and a target text detection box is obtained.

In a specific implementation, the text in the target text region is detected by adopting the target text detection model determined in the step S13, so as to obtain a target text detection box.

For example, a first Psenet model may be used to detect the seal text region, so as to obtain a seal text detection box; detecting the header text region by adopting a second Psenet model to obtain a header text detection box; and detecting the layout text area by adopting an EAST model to obtain a layout text detection box.

In practical application, the detected target text detection box can be input into a deep learning recognition engine to output a text recognition result.

The embodiment can be applied to OCR emergency bill input (such as the field of medical bill and claim in insurance industry), is responsible for the character detection function of OCR emergency bill pictures, and can solve the pain points of various links such as insurance industry check and claim. The embodiment can not only detect the quadrilateral block characters completely and effectively, but also detect the header long text, the handwritten text, the form text, the capped text, the seal text (curved text) and the two-dimensional code completely, so that the embodiment can be applied to the text detection of other text types such as long text, curved text or quadrilateral block text.

In the text detection method provided by the embodiment, the target text region in the picture to be detected is detected first, and the text in the region is detected by adopting the text detection model suitable for the target text region in a targeted manner by combining the advantages or detection characteristics of different text detection models, so that the integrity of text detection is improved, powerful support is provided for subsequent text recognition, and the problem that various text types in complex layouts cannot be completely detected by adopting a single model in the prior art is solved.

In an alternative implementation, in step S12, the method may further include: and performing seal detection on the picture to be detected to obtain a seal text region. Further, referring to fig. 2, this step may include:

in step S21, seal detection is performed on the picture to be detected, and a seal detection frame is obtained.

In step S22, a pre-printed seal area and a non-pre-printed seal area are determined according to the aspect ratio of the seal detection frame.

In step S23, the pre-printed seal area and/or the non-pre-printed seal area is determined as the seal text area.

In a specific implementation, a technique such as LCSELLIPSE can be used to perform seal detection on the picture to be detected, yielding the seal detection frame, that is, the circumscribed rectangle of the seal, and the aspect ratio of the seal detection frame is then calculated. Because the aspect ratio of a pre-printed seal is fixed (the preset aspect ratio), a seal detection frame whose aspect ratio satisfies the preset aspect ratio (e.g., the absolute value of the difference between the two is smaller than a preset threshold) is determined as a pre-printed seal area, and a seal detection frame whose aspect ratio does not satisfy the preset aspect ratio (e.g., the absolute value of the difference between the two is greater than or equal to the preset threshold) is determined as a non-pre-printed seal area (e.g., a manual seal area).

Since a non-pre-printed seal generally contains information such as a region, the non-pre-printed seal area can be determined as the seal text area in this case. In practical applications, the seal text area may be the pre-printed seal area, the non-pre-printed seal area, or both, according to actual requirements.
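The aspect-ratio test of steps S22 and S23 can be sketched as below. The comparison rule (absolute difference smaller than a preset threshold) follows the description above; the function name, the label strings, and the numeric values in the usage lines are assumptions, since the source does not give concrete values.

```python
def classify_seal(width, height, preset_ratio, threshold):
    """Classify a seal detection frame by its aspect ratio.

    A frame whose aspect ratio differs from the preset ratio by less than
    the threshold is treated as a pre-printed seal; otherwise it is treated
    as a non-pre-printed (for example, manual) seal.
    """
    if width <= 0 or height <= 0:
        raise ValueError("seal detection frame must have positive size")
    ratio = width / height
    if abs(ratio - preset_ratio) < threshold:
        return "preprinted"
    return "non_preprinted"


# Illustrative values only: preset ratio 1.5, threshold 0.25.
frames = [(150, 100), (100, 100), (175, 100)]
labels = [classify_seal(w, h, preset_ratio=1.5, threshold=0.25) for w, h in frames]

# Keep only the non-pre-printed frames as seal text areas, one of the
# options the description allows.
seal_text_frames = [f for f, label in zip(frames, labels) if label == "non_preprinted"]
```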

In a specific implementation, step S14 may further include:

in step S24, the text in the seal text region is detected by using the first Psenet model obtained by training in advance, and an initial seal text detection box is obtained.

In step S25, the initial seal text detection box is subjected to text correction, and a horizontal seal text detection box is obtained.

In a specific implementation, the seal text region can be cropped out of the picture to be detected, and text detection is then performed on the seal text region using the first Psenet model to obtain an initial seal text detection box. FIG. 6 shows the effect of text detection on a manual seal using the first Psenet model, and FIG. 7 shows the effect of text detection on a pre-printed seal using the first Psenet model.

Because the initial seal text detection boxes obtained through detection are mostly curved or inclined, in order to facilitate subsequent text recognition, a text correction network can be adopted to carry out text correction on the initial seal text detection boxes, so that horizontal seal text detection boxes are obtained, and then the horizontal seal text detection boxes are sent to a deep learning recognition engine for text recognition.

The first Psenet model is obtained in advance by training the progressive scale expansion network Psenet with curved text as training samples. Psenet is an instance segmentation network that can locate text of arbitrary shape; it adopts a progressive scale expansion algorithm, which allows it to separate adjacent text instances successfully.

The Psenet model is adopted to detect the text in the seal, and the Psenet model can effectively detect the long block text, so that the integrity of text detection can be improved.

In an alternative implementation, referring to fig. 3, in step S12, it may further include:

in step S31, a second Psenet model obtained by training in advance is adopted to detect the picture to be detected, so as to obtain a plurality of text detection boxes.

In step S32, a header text region is determined based on the aspect ratios of the plurality of text detection boxes.

Accordingly, in step S14, it may further include:

in step S33, a text detection box corresponding to the header text region in the plurality of text detection boxes is determined as a header text detection box.

In a specific implementation, the second Psenet model can be used to perform text detection on the whole picture to be detected, obtaining a plurality of text detection boxes. The text detection boxes are then screened by their features (such as the aspect ratio): the area of the text detection box with the largest aspect ratio is taken as the header text region, and that text detection box is the header text detection box, which completes header text detection. FIG. 8 shows the effect of text detection on header text using the second Psenet model, and FIG. 9 shows the effect of text detection on header text using the EAST model; it can be seen that the second Psenet model detects long header text more completely.
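The screening step can be sketched as follows. The sketch assumes, for simplicity, that each text detection box is an axis-aligned rectangle given as a hypothetical `(x_min, y_min, x_max, y_max)` tuple; a real detector may return polygons, in which case a bounding rectangle would be computed first. The function names are illustrative.

```python
def aspect_ratio(box):
    """Width divided by height of an axis-aligned box (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) / (y_max - y_min)


def pick_header_box(boxes):
    """Return the box with the largest aspect ratio, or None if there are no boxes."""
    valid = [b for b in boxes if b[3] > b[1]]  # skip degenerate zero-height boxes
    return max(valid, key=aspect_ratio) if valid else None


# Example detections: a long header line and two ordinary fields.
boxes = [(40, 10, 760, 40), (40, 80, 200, 110), (300, 80, 420, 110)]
header = pick_header_box(boxes)
```

Here `header` is the first box, whose aspect ratio of 24 is well above the others.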

The second Psenet model is obtained by training a long text of a quadrangle (horizontal or inclined) as a training sample based on a progressive expansion network Psenet.

The method adopts the Psenet model to detect the header long text, and the Psenet model can effectively detect the long block text, so that the integrity of text detection can be further improved.

In an alternative implementation, referring to fig. 4, in step S12, it may further include:

in step S41, the region of the picture to be detected excluding the header text region and the seal text region is determined as the layout text region.

Accordingly, in step S14, it may further include:

in step S42, text in the layout text area is detected by using the EAST model obtained by training in advance, and a layout text detection box is obtained.

In a specific implementation, a non-header text region and a non-seal text region in a picture to be detected can be determined as layout text regions; and then adopting an EAST model to carry out text detection on the layout text area to obtain a layout text detection box. Referring to fig. 10, an effect diagram of text detection of layout text using the EAST model is shown.
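One way to realize "the area other than the header text region and the seal text region" is to blank those regions out before running the layout detector. The source does not say how the exclusion is implemented, so the masking approach below is an assumption of this sketch, as are the function name, the box format `(x_min, y_min, x_max, y_max)`, and the representation of the picture as a list of pixel rows.

```python
def mask_regions(image, regions, fill=255):
    """Return a copy of a grayscale image with the given boxes filled in.

    image   : list of rows, each a list of pixel values
    regions : boxes (x_min, y_min, x_max, y_max), max edges exclusive
    fill    : value written into the excluded regions (white by default)
    """
    out = [row[:] for row in image]  # copy so the input picture is untouched
    height = len(out)
    width = len(out[0]) if height else 0
    for x_min, y_min, x_max, y_max in regions:
        for y in range(max(0, y_min), min(height, y_max)):
            for x in range(max(0, x_min), min(width, x_max)):
                out[y][x] = fill
    return out


# A 4x6 dummy picture with one header box and one seal box excluded.
picture = [[0] * 6 for _ in range(4)]
header_box = (0, 0, 6, 1)
seal_box = (4, 2, 6, 4)
layout_picture = mask_regions(picture, [header_box, seal_box])
```

The layout detector would then be run on `layout_picture`. An alternative under the same assumption is to run the detector on the whole picture and discard boxes that fall inside the excluded regions.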

The EAST model is obtained by training a text of a quadrangle (horizontal or inclined) as a training sample based on an EAST network in advance. The EAST network contains three parts in total: feature extractor stem (feature extraction branch), feature-merge branch, and output layer.

According to the implementation mode, the EAST model is adopted to detect the layout text, and the EAST model has a good detection effect on the horizontal or inclined quadrilateral text block, so that the implementation mode can obtain a text detection result of a more complete and fine quadrilateral text block.

Referring to fig. 5, a flow chart of a specific implementation manner of the multi-mechanism joint text detection method provided in this embodiment is shown, and the implementation manner is mainly divided into 4 steps:

Step 1: Perform rotation correction on the input bill image to obtain a horizontal bill image (the picture to be detected).

Step 2: Perform seal detection on the picture to be detected. The position of the pre-printed seal is detected from its fixed aspect ratio, which determines the coordinate positions of the other seals. The other seals (that is, the non-pre-printed seals) are then cropped out, the Psenet algorithm performs text detection on them, the detected result is corrected by the text correction network, and the result is finally sent to the deep learning recognition engine for character recognition. The aim is to solve the detection of text inside seals.

Step 3: Perform Psenet text detection on the picture to be detected, screen the detected quadrilaterals (text detection boxes) by their features, and select the quadrilateral with the largest aspect ratio as the header text detection box. The aim is to solve header text detection.

Step 4: Perform text detection on the bill layout using the EAST engine. The aim is to obtain a more complete and finer text detection result for quadrilateral text blocks.

In this flow, the Psenet model and seal detection are first used to detect the long header text and the seals in the picture to be detected. Because the long header text has the largest aspect ratio, and a pre-printed bill seal has a fixed format and aspect ratio, the long header text region and the pre-printed seal area are easy to detect. The other seals obtained by seal detection are regarded as manually stamped official seals (that is, non-pre-printed seals); their areas are cropped out, and the Psenet model is then used to detect the text in them. Finally, EAST is used to perform text detection on the whole layout.
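The order of steps 2 to 4 can be summarized in a small orchestration sketch. Everything here is hypothetical scaffolding: the four callables stand in for the seal detector and the trained models, their signatures are assumptions, boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples, and rotation correction (step 1) and text correction are taken to happen outside this function.

```python
def detect_bill_text(picture, seal_detector, first_psenet, second_psenet, east):
    """Run the seal, header, and layout detectors on a corrected picture.

    seal_detector(picture)       -> frames of non-pre-printed seals
    first_psenet(picture, frame) -> text boxes inside one seal frame
    second_psenet(picture)       -> candidate text boxes on the whole picture
    east(picture)                -> layout text boxes
    """
    # Step 2: detect text inside each non-pre-printed seal.
    seal_boxes = []
    for frame in seal_detector(picture):
        seal_boxes.extend(first_psenet(picture, frame))

    # Step 3: the candidate with the largest aspect ratio is the header box.
    candidates = [b for b in second_psenet(picture) if b[3] > b[1]]
    header_box = max(
        candidates,
        key=lambda b: (b[2] - b[0]) / (b[3] - b[1]),
        default=None,
    )

    # Step 4: detect block text in the layout.
    layout_boxes = east(picture)

    return {"seal": seal_boxes, "header": header_box, "layout": layout_boxes}


# Usage with stand-in detectors that return fixed boxes.
result = detect_bill_text(
    picture="bill",
    seal_detector=lambda p: [(0, 0, 10, 10)],
    first_psenet=lambda p, frame: [(1, 1, 9, 3)],
    second_psenet=lambda p: [(0, 0, 10, 10), (0, 20, 100, 30)],
    east=lambda p: [(5, 40, 50, 50)],
)
```

Passing the detectors in as arguments keeps the sketch independent of any particular deep learning framework.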

According to the technical scheme, the rotation correction algorithm of the bill, the seal detection algorithm, the EAST deep learning text detection frame engine and the Psenet curved text detection deep learning frame engine are combined, text detection in a layout is carried out aiming at a complex medical bill layout, and the completeness of the text detection is improved well. The embodiment combines the advantages of different models to solve the problem of detecting layout characters under the complex conditions.

Fig. 11 is a block diagram of a text detection device according to an exemplary embodiment. Referring to fig. 11, the apparatus may include:

an acquisition module 111 configured to acquire a picture to be detected;

a region detection module 112 configured to detect a target text region in the picture to be detected, the target text region including at least one of a header text region, a seal text region, and a layout text region;

a model determining module 113 configured to determine a target text detection model corresponding to the target text region according to a preset correspondence between the text region and the text detection model, wherein the correspondence between the text region and the text detection model includes at least one of: performing text detection on the seal text region by adopting a first Psenet model, performing text detection on the header text region by adopting a second Psenet model, and performing text detection on the layout text region by adopting an EAST model;

a text detection module 114 configured to detect the text in the target text region by using the target text detection model, so as to obtain a target text detection box.
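The preset correspondence used by the model determining module 113 can be pictured as a lookup table from region type to detection model. The following is a minimal sketch under stated assumptions: `Region`, `build_correspondence`, and `detect_text` are hypothetical names, and each model is represented as a plain callable rather than an actual Psenet or EAST network.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple

BoxTuple = Tuple[int, int, int, int]           # (x, y, w, h)
Detector = Callable[[object], List[BoxTuple]]  # region image -> text boxes


class Region(Enum):
    SEAL = "seal"
    HEADER = "header"
    LAYOUT = "layout"


def build_correspondence(
    first_psenet: Detector, second_psenet: Detector, east: Detector
) -> Dict[Region, Detector]:
    """Preset correspondence between text regions and detection models."""
    return {
        Region.SEAL: first_psenet,     # seal text region -> first Psenet model
        Region.HEADER: second_psenet,  # header text region -> second Psenet model
        Region.LAYOUT: east,           # layout text region -> EAST model
    }


def detect_text(
    region_type: Region,
    region_image: object,
    correspondence: Dict[Region, Detector],
) -> List[BoxTuple]:
    """Look up the model for the region type and run it on the region."""
    model = correspondence[region_type]
    return model(region_image)
```

Keeping the mapping in one table means that swapping a model for a given region type does not touch the detection code; this is a design choice of the sketch, not a requirement stated in the application.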

In a specific implementation, the acquisition module 111 may first receive an original bill picture, and then correct the original bill picture to obtain the picture to be detected.

Further, the acquisition module 111 is specifically configured to: receive an original bill picture; determine the rotation angle of the original bill picture according to the included angle between an edge straight line of the original bill picture and the horizontal axis; and rotate the original bill picture to the horizontal direction according to the rotation angle to obtain the picture to be detected.
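The angle computation in this correction step can be sketched with basic trigonometry. The sketch assumes the edge straight line has already been found and is given by two endpoints; how the line is extracted is outside its scope. The function names `rotation_angle_deg` and `rotate_point` are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def rotation_angle_deg(p1: Point, p2: Point) -> float:
    """Angle in degrees between the edge line p1 -> p2 and the horizontal axis.

    The result is folded into (-90, 90] so that the direction in which the
    endpoints are listed does not matter.
    """
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    angle = math.degrees(math.atan2(dy, dx))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle


def rotate_point(pt: Point, center: Point, angle_deg: float) -> Point:
    """Rotate pt about center by -angle_deg, which levels the measured edge."""
    theta = math.radians(-angle_deg)
    x = pt[0] - center[0]
    y = pt[1] - center[1]
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (
        center[0] + x * cos_t - y * sin_t,
        center[1] + x * sin_t + y * cos_t,
    )
```

In practice the same angle would be applied to the whole picture with an image library's rotation routine; the point-wise version is shown only to make the geometry explicit.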

The region detection module 112 can perform seal detection on the picture to be detected to obtain a seal text region in the picture to be detected; the region detection module 112 may perform text detection on the picture to be detected by using a text detection algorithm (such as Psenet algorithm), and determine a header text region according to an aspect ratio of the text detection box; the region detection module 112 may determine the non-seal text region and the non-header text region in the picture to be detected as layout text regions.
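The header selection and the derivation of layout regions can be illustrated as below. This is a sketch under assumptions: boxes are axis-aligned `(x, y, w, h)` tuples, and a text box is treated as belonging to the layout when it does not overlap the header box or any seal box. The overlap test is an assumption of the example; the application only states that the non-seal, non-header regions are the layout text regions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)


def aspect_ratio(box: Box) -> float:
    """Width divided by height of a box."""
    return box[2] / box[3]


def pick_header(text_boxes: Sequence[Box]) -> Box:
    """The text box with the largest aspect ratio is taken as the header."""
    return max(text_boxes, key=aspect_ratio)


def overlaps(a: Box, b: Box) -> bool:
    """True when two axis-aligned boxes share some area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def layout_boxes(
    text_boxes: Sequence[Box], header: Box, seal_boxes: Sequence[Box]
) -> List[Box]:
    """Keep the boxes that lie outside both the header and the seal regions."""
    excluded = [header, *seal_boxes]
    return [b for b in text_boxes if not any(overlaps(b, e) for e in excluded)]
```

For example, with a wide box at the top of the bill, a narrow box below it, and a third box inside a seal region, `pick_header` selects the wide box and `layout_boxes` keeps only the narrow one.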

In a specific implementation, the text detection module 114 detects the text in the target text area by using the target text detection model determined by the model determination module 113, so as to obtain a target text detection box. For example, the text detection module 114 may detect the seal text region by using the first Psenet model to obtain a seal text detection box; detecting the header text region by adopting a second Psenet model to obtain a header text detection box; and detecting the layout text area by adopting an EAST model to obtain a layout text detection box.

The text detection device provided by this embodiment first detects the target text region in the picture to be detected and then, drawing on the advantages or detection characteristics of different text detection models, detects the text in that region with the model suited to it. This improves the completeness of text detection, provides strong support for subsequent text recognition, and solves the prior-art problem that a single model cannot completely detect the various text types in a complex layout.

In an alternative implementation, the region detection module 112 is specifically configured to:

In an alternative implementation, the text detection module 114 is specifically configured to:

the text detection module 114 is specifically configured to:

The specific manner in which the modules of the apparatus in the above embodiments perform their operations has been described in detail in connection with the method embodiments, and is not repeated here.

Another embodiment of the present application also provides an electronic device, including:

a processor;

a memory for storing the processor-executable instructions;

Another embodiment of the present application also provides a storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the text detection method of any of the embodiments.

In this specification, the embodiments are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.

The text detection method, text detection device, electronic device and storage medium provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the application, and the description of the above embodiments is intended only to aid understanding of the method and its core idea. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims

1. A method of text detection, the method comprising:

acquiring a picture to be detected;

detecting a target text region in the picture to be detected, wherein the target text region comprises a header text region, a seal text region and a layout text region;

determining a target text detection model corresponding to the target text region according to a preset corresponding relation between the text region and the text detection model, wherein the corresponding relation between the text region and the text detection model comprises: performing text detection on the seal text region by adopting a first Psenet model, performing text detection on the header text region by adopting a second Psenet model, and performing text detection on the layout text region by adopting an EAST model;

detecting the text in the target text region by adopting the target text detection model to obtain a target text detection box; the text detection of the seal text region by adopting the first Psenet model comprises the following steps:

performing seal detection on the picture to be detected to obtain a seal detection frame; determining a pre-printing seal area and a non-pre-printing seal area according to the length-width ratio of the seal detection frame, and determining the pre-printing seal area and/or the non-pre-printing seal area as the seal text area;

the text detection of the header text region by adopting the second Psenet model comprises the following steps:

performing Psenet character detection on the picture to be detected to obtain a text detection box;

taking the text detection box with the largest length-width ratio of the text detection box as a header text detection box;

the text detection of the layout text area by adopting the EAST model comprises the following steps:

and performing text detection on the bill layout by using an EAST engine.

2. The text detection method according to claim 1, wherein the step of acquiring the picture to be detected includes:

receiving an original bill picture;

3. The text detection method of claim 2, further comprising, before the step of determining the rotation angle of the original bill picture based on the angle between the edge straight line of the original bill picture and the horizontal axis:

4. The text detection method according to claim 1, wherein the step of detecting text in the target text region using the target text detection model to obtain a target text detection box includes:

5. The text detection method according to claim 1, wherein the step of detecting a target text region in the picture to be detected includes:

6. The text detection method according to claim 1, wherein the step of detecting a target text region in the picture to be detected includes:

7. A text detection device, the device comprising:

the acquisition module is configured to acquire a picture to be detected;

the region detection module is configured to detect a target text region in the picture to be detected, wherein the target text region comprises a header text region, a seal text region and a layout text region;

the model determining module is configured to determine a target text detection model corresponding to the target text region according to a preset corresponding relation between the text region and the text detection model, wherein the corresponding relation between the text region and the text detection model comprises: performing text detection on the seal text region by adopting a first Psenet model, performing text detection on the header text region by adopting a second Psenet model, and performing text detection on the layout text region by adopting an EAST model;

the text detection module is configured to detect the text in the target text area by adopting the target text detection model to obtain a target text detection box;

the region detection module is specifically configured to:

determining the pre-printed seal area and/or the non-pre-printed seal area as the seal text area;

the region detection module is specifically configured to:

and performing text detection on the bill layout by using an EAST engine.

8. An electronic device, the electronic device comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the text detection method of any of claims 1 to 6.

9. A storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, cause the electronic device to perform the text detection method of any of claims 1 to 6.