CN111553361B - Pathological section label identification method


Info

Publication number
CN111553361B
CN111553361B (application CN202010199537.0A)
Authority
CN
China
Prior art keywords
characters
pathological section
identification method
network
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010199537.0A
Other languages
Chinese (zh)
Other versions
CN111553361A (en)
Inventor
王杰
郑众喜
向旭辉
陈杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
West China Hospital of Sichuan University
Original Assignee
West China Hospital of Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by West China Hospital of Sichuan University filed Critical West China Hospital of Sichuan University
Priority to CN202010199537.0A priority Critical patent/CN111553361B/en
Publication of CN111553361A publication Critical patent/CN111553361A/en
Application granted granted Critical
Publication of CN111553361B publication Critical patent/CN111553361B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V 30/153 — Segmentation of character regions using recognition of characters or words
    • G06N 3/045 — Combinations of networks
    • G06N 3/084 — Backpropagation, e.g. using gradient descent
    • G06V 10/20 — Image preprocessing
    • G06V 10/32 — Normalisation of the pattern dimensions
    • G06V 30/10 — Character recognition

Abstract

The invention discloses a pathological section label identification method that identifies pathological section label images with a deep learning method. The base network of the model used for the deep learning is a RetinaNet network based on ResNet-50, together with a module for helping the base network identify direction-sensitive characters. The module comprises a vertical self-attention branch, a horizontal self-attention branch and a middle branch, and the branches are fused as follows: O = Cv·β + Ch·(1 − β) (1), where in formula (1) O represents the output, Cv represents the vertical self-attention branch, Ch represents the horizontal self-attention branch, and β is the output of the middle branch. The invention can correctly process characters in different directions.

Description

Pathological section label identification method
Technical Field
The invention relates to the field of medical detection, in particular to a pathological section label identification method.
Background
One of the current methods for pathological section label recognition is Optical Character Recognition (OCR). Mainstream OCR algorithms all comprise the following two steps:
1. detecting characters in a scene;
2. recognizing the detected text.
The output of the first step is usually the position of a word or a line of text, and current techniques are mostly based on general-purpose object detection algorithms. The second step crops the corresponding text out of the image according to the detection result of the first step, scales it to a fixed height, and recognizes it with a CTC- or attention-based method; these methods usually assume that the text is upright and reads from left to right. Most current research focuses on the first step, chiefly on how to recognize irregular text.
Directly applying mainstream OCR algorithms to pathological section label recognition suffers from the following problems:
1. current mainstream OCR technology needs a large amount of training data: the first step usually requires 10k-50k annotated samples and the second step needs more than 1000k training samples; collecting pathological section data of that order is almost impossible, and the number of annotated samples used in this patent is less than 2000, far smaller than the data volume used by mainstream OCR technology;
2. mainstream OCR technology mostly focuses on how to detect irregular characters, as shown in fig. 1, whereas the labels of pathological sections are scanned by a digital section scanner, as shown in fig. 2, and exhibit almost no deformation;
3. the characters in a pathological section label may point in any direction (different directions may coexist in the same label), while mainstream OCR technology pays little attention to this aspect and most OCR methods simply assume that characters are upright and arranged from left to right;
4. mainstream OCR mostly targets natural language, where the recognized unit is a word and semantic correlation exists between words; the characters in a pathological label are highly random and the correlation between characters is small;
5. the few techniques that can directly process characters in any direction are limited to specific usage scenarios, for example the characters are generated at a fixed position according to a rule, an auxiliary locator is required, or a fixed font is used.
As described above, since current mainstream OCR technology and label recognition differ greatly in data volume and focus, directly applying OCR technology to label recognition cannot achieve a good result.
Disclosure of Invention
The invention aims to provide a pathological section label identification method which can correctly process characters in different directions.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention discloses a pathological section label identification method that identifies pathological section label images with a deep learning method; the base network of the model adopted by the deep learning is a RetinaNet network based on ResNet-50, together with a module for helping the base network identify direction-sensitive characters; the module comprises a vertical self-attention branch, a horizontal self-attention branch and a middle branch, and the fusion method of the module is as follows:
O = Cv·β + Ch·(1 − β) (1)
in formula (1): O represents the output, Cv represents the vertical self-attention branch, Ch represents the horizontal self-attention branch, and β is the output of the middle branch.
Preferably, the topmost Anchor box ratios of the base network are 1, 1; the bottommost Anchor box ratios are 1, 2 and 2.
Preferably, the topmost output network and the middle output network of the model share weights, and the bottommost network uses separate weights.
Preferably, the loss function of the training network is as follows:
L = Lcls(p, u) + λ[u ≥ 1]Lloc(t^u, v) + γLdre(p, w) (2)
in formula (2): Lcls(p, u) = −log p_u, where u is the class of the target box in the output result and the background class number is 0; Lloc is the regression loss of the target box; Ldre(p, w) = −log p_w, where w is the direction of the target box in the output result; λ and γ are the weights of the corresponding losses.
Preferably, λ is 10 and γ is 1.
Preferably, the deep learning training phase processing steps are as follows:
step 1, preprocessing an input image;
step 2, performing data enhancement on the preprocessed image by random cropping, left-right flipping, up-down flipping, rotation at an arbitrary angle, color perturbation, random brightness transformation and random noise addition;
step 3, scaling the image processed in step 2 to a fixed size;
step 4, forming a batch from several scaled images;
step 5, performing forward propagation with the model;
step 6, calculating the loss with the loss function, back-propagating, and updating the training parameters;
step 7, performing iterative training until the model converges.
Preferably, the prediction stage processing steps of the deep learning are as follows:
a. preprocessing an input image;
b. scaling the preprocessed image to a fixed size;
c. performing forward propagation with the model;
d. dividing the results output in step c into two groups: words and characters;
e. aggregating the characters into words according to whether the words and the characters overlap;
f. counting the directions of all characters in the same word and determining the direction of the current word by voting;
g. arranging the characters within a word in order along the word's direction;
h. determining whether spaces exist between characters according to the distances between the characters in the word, and adding spaces where needed;
i. outputting the result.
Preferably, the preprocessing method is as follows:
img = (img − μ) / σ (3)
in formula (3), μ is the mean of the image and σ is the variance of the image.
Preferably, the fixed size is 512 × 512 and the number of images per batch is 16.
The invention has the following beneficial effects:
1. The present invention requires only a very small number of training samples. Compared with classical OCR, the network architecture of the invention is easier to train; in addition, training methods such as transfer learning and the addition of simulated data greatly reduce the algorithm's demand for samples. The fewer than 1400 training samples currently used are far below the million-level requirement of classical OCR.
2. The invention can correctly process characters in different directions. The algorithm uses the custom LineAttention module and adds direction prediction to the output; compared with mainstream OCR algorithms (which generally assume that characters are upright and arranged from left to right), the algorithm of the invention can correctly process characters in different directions.
Drawings
FIG. 1 is a schematic view of a picture with irregular text;
FIG. 2 is an example of pathological section label data;
FIG. 3 is a diagram of the model architecture of the present invention;
FIG. 4 is a schematic diagram of a LineAttention module;
FIG. 5 is an exemplary graph of synthetic data samples;
FIG. 6 is a diagram illustrating the detection results.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings.
The invention discloses an algorithm for pathological section label character recognition (hereinafter referred to as label recognition). The algorithm is based on RetinaNet, but RetinaNet is designed for general object detection and cannot correctly recognize characters in different directions. To recognize characters in different directions, a direction prediction branch is added to the network output; and, to correctly handle characters such as '6' and '9' that are direction-sensitive, a dedicated LineAttention module is designed to process direction-sensitive characters effectively. Another improvement to RetinaNet lies in special Anchor box parameter settings, which handle the large aspect ratios common in character detection; the basic architecture of the model is also adjusted. After the individual characters are detected, a corresponding post-processing algorithm combines the characters into lines and outputs them. The details are as follows:
model architecture
The basic architecture of the model is shown in FIG. 3. The invention uses RetinaNet [2] based on ResNet-50 [3] as the basic network architecture. However, RetinaNet is designed for general-purpose object detection and does not achieve the best result when used directly for label character recognition. Therefore, the invention makes the following improvements to RetinaNet:
the invention designs a module called 'LineAttention' (orange boxes in an architecture diagram) to help the model correctly recognize the direction-sensitive characters. FIG. 4 shows a specific structure of LineAttention, and the fusion (fusion) method in FIG. 4 is:
O = Cv·β + Ch·(1 − β) (1)
where O represents the output, Cv represents the vertical self-attention branch (the third branch in the structure diagram), Ch represents the horizontal self-attention branch (the first branch in the structure diagram), and β is the output of the intermediate sigmoid branch. For the detailed implementation of the self-attention mechanism, see [4].
LineAttention can automatically detect the direction of the current character and, by associating and analyzing adjacent characters with the same direction, increase its recognition accuracy; the improvement is especially notable for direction-sensitive characters such as '6', '9', '-' and '_'.
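For illustration only, a minimal sketch of the fusion in formula (1) is given below, assuming PyTorch and assuming that the horizontal and vertical branches are realized as row-wise and column-wise self-attention; the module name, tensor layout and branch internals are assumptions, not the patented implementation.

```python
# Hypothetical sketch of the LineAttention fusion O = Cv*beta + Ch*(1 - beta).
# Only the fusion rule comes from formula (1); the branch internals are assumptions.
import torch
import torch.nn as nn

class LineAttentionSketch(nn.Module):
    def __init__(self, channels, heads=4):          # channels must be divisible by heads
        super().__init__()
        self.row_attn = nn.MultiheadAttention(channels, heads, batch_first=True)  # horizontal branch Ch
        self.col_attn = nn.MultiheadAttention(channels, heads, batch_first=True)  # vertical branch Cv
        self.gate = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())  # middle branch -> beta

    def forward(self, x):                            # x: (B, C, H, W) feature map
        b, c, h, w = x.shape
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)      # attend along each row
        ch, _ = self.row_attn(rows, rows, rows)
        ch = ch.reshape(b, h, w, c).permute(0, 3, 1, 2)

        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)      # attend along each column
        cv, _ = self.col_attn(cols, cols, cols)
        cv = cv.reshape(b, w, h, c).permute(0, 3, 2, 1)

        beta = self.gate(x)                          # (B, 1, H, W), broadcast over channels
        return cv * beta + ch * (1 - beta)           # formula (1)
```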
The RetinaNet model outputs only the position and size of the target box and the category of the target; the invention additionally outputs the direction of the target. Only with this direction information can label data in different directions be processed accurately.
The method optimizes the Anchor box parameters of the different output layers: the topmost Anchor box ratios are 1, 1; the middle-layer Anchor box ratios are 1, 1; the bottommost Anchor box ratios are 1, 2 and 2. The topmost and middle layers are devoted to handling words with large aspect ratios, and the bottommost layer is devoted to handling words with small aspect ratios and characters;
another difference from RetinaNet is that the topmost output network and the middle output network share the weight, and the bottommost network uses a single weight, so that the design is based on the assumption that the topmost and middle output networks are mainly used for detecting words, and the bottommost output network is mainly used for detecting characters, and the tasks are different, so that different weight sharing rules are designed, and RetinaNet does not have the requirement, so that all output layers of RetinaNet share the weight.
Loss function
The loss function used by the training network is defined as follows:
L = Lcls(p, u) + λ[u ≥ 1]Lloc(t^u, v) + γLdre(p, w) (2)
where Lcls(p, u) = −log p_u, u is the class of the target box in the output result (the background class number is 0); Lloc is the regression loss of the target box (defined as in Fast R-CNN [5]); Ldre(p, w) = −log p_w, where w is the direction of the target box in the output result; λ and γ are the weights of the corresponding losses, and in the experiments λ is 10 and γ is 1.
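A hedged sketch of formula (2) follows, assuming per-box class and direction logits with cross-entropy losses and a smooth-L1 regression loss; the tensor layout and helper names are illustrative assumptions rather than the exact patented implementation.

```python
# Illustrative sketch of formula (2): L = Lcls + lambda*[u>=1]*Lloc + gamma*Ldre.
# Assumes one matched ground-truth box per row; layout and names are not from the patent.
import torch
import torch.nn.functional as F

def detection_loss(cls_logits, box_pred, dir_logits, u, v, w, lam=10.0, gamma=1.0):
    """cls_logits: (N, num_classes) with background class 0
       box_pred:   (N, 4) predicted offsets t^u
       dir_logits: (N, num_directions)
       u: (N,) ground-truth class, v: (N, 4) target offsets, w: (N,) ground-truth direction."""
    l_cls = F.cross_entropy(cls_logits, u)                      # Lcls(p, u) = -log p_u
    fg = (u >= 1).float()                                       # Iverson bracket [u >= 1]
    l_loc = (F.smooth_l1_loss(box_pred, v, reduction="none").sum(dim=1) * fg).sum() / fg.sum().clamp(min=1)
    l_dre = F.cross_entropy(dir_logits, w)                      # Ldre(p, w) = -log p_w
    return l_cls + lam * l_loc + gamma * l_dre
```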
Detailed processing steps
The invention is an algorithm based on deep learning and is divided into a training (learning) stage and a prediction (use) stage; the corresponding processing steps are described below.
step 1, preprocessing the input image, wherein the preprocessing method is:
img = (img − μ) / σ (3)
in formula (3), μ is the mean of the image, σ is the variance of the image, and img is the image;
step 2, performing data enhancement on the preprocessed image by random cropping, left-right flipping, up-down flipping, rotation at an arbitrary angle, color perturbation, random brightness transformation and random noise addition;
step 3, scaling the image processed in step 2 to a fixed size (512 × 512);
step 4, forming a batch from several (16) scaled images;
step 5, performing forward propagation with the model;
step 6, calculating the loss with the loss function, back-propagating, and updating the training parameters;
step 7, performing iterative training until the model converges.
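Steps 1-7 could look roughly like the following loop; the dataset, model, augmentations and optimizer are placeholders, and only the normalization of formula (3), the 512 × 512 resize and the batch size of 16 come from the text above.

```python
# Rough sketch of training steps 1-7. The loader is assumed to yield lists of
# already-augmented, variable-size image tensors plus targets (custom collate assumed).
import torch
import torch.nn.functional as F

def preprocess(img):                                  # step 1, formula (3)
    return (img - img.mean()) / (img.std() + 1e-6)

def train(model, loader, loss_fn, epochs=100, lr=1e-4, device="cuda"):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for _ in range(epochs):                                        # step 7: iterate to convergence
        for images, targets in loader:                             # step 2 done inside the loader
            images = torch.stack([                                 # step 3: resize to 512 x 512
                F.interpolate(preprocess(im).unsqueeze(0), size=(512, 512)).squeeze(0)
                for im in images]).to(device)                      # step 4: batch (e.g. 16 images)
            preds = model(images)                                  # step 5: forward pass
            loss = loss_fn(preds, targets)                         # step 6: loss, backprop, update
            opt.zero_grad()
            loss.backward()
            opt.step()
```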
The prediction-stage processing steps of the deep learning are as follows:
a. preprocessing the input image, wherein the preprocessing method is:
img = (img − μ) / σ (3)
in formula (3), μ is the mean of the image, σ is the variance of the image, and img is the image;
b. scaling the preprocessed image to a fixed size (512 × 512);
c. forward propagation using the model;
d. dividing the results output in step c into two groups: words and characters;
e. aggregating the characters into words according to whether the words and the characters overlap;
f. counting the direction of each character in the same word and determining the direction of the current word by voting;
g. arranging the characters within a word in order along the word's direction;
h. determining whether spaces exist between characters according to the distances between the characters in the word, and adding spaces where needed;
i. outputting the result.
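Steps d-i could be implemented roughly as sketched below; the box format, the overlap test, the direction encoding and the space heuristic are assumptions made for illustration.

```python
# Sketch of prediction post-processing (steps d-i): split detections into words and characters,
# attach characters to overlapping words, vote on direction, order characters, insert spaces.
# Box format (x1, y1, x2, y2), direction codes and the space heuristic are assumptions.
from collections import Counter

def overlaps(a, b):
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def decode(detections):
    words = [d for d in detections if d["is_word"]]               # step d: split words / characters
    chars = [d for d in detections if not d["is_word"]]
    lines = []
    for word in words:
        members = [c for c in chars if overlaps(c["box"], word["box"])]  # step e: group by overlap
        if not members:
            continue
        direction = Counter(c["dir"] for c in members).most_common(1)[0][0]  # step f: vote
        axis = 0 if direction in ("left", "right") else 1
        members.sort(key=lambda c: c["box"][axis])                # geometric order along the axis
        pieces, prev = [], None
        for c in members:                                         # step h: add spaces on large gaps
            if prev is not None:
                gap = c["box"][axis] - prev["box"][axis + 2]
                if gap > 0.5 * (prev["box"][axis + 2] - prev["box"][axis]):
                    pieces.append(" ")
            pieces.append(c["label"])
            prev = c
        if direction in ("left", "up"):                           # step g: reading order follows direction
            pieces.reverse()
        lines.append("".join(pieces))                             # step i: output
    return lines
```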
Results of the experiment
In the experiment we used more than 1900 pathological section label samples from more than ten hospitals, 1400 as training data and 500 as test data. For deep learning, 1400 samples are very few, and we use the following methods to alleviate the data-shortage problem:
1. the model is pre-trained on COCO [6] and then transferred to the label character recognition problem;
2. as shown in FIG. 5, we automatically generated about 50000 samples with a program, but during training the weight of an automatically generated sample is 1/30 that of a real sample;
3. data enhancement methods such as random up-down flipping, random left-right flipping, random rotation, random color perturbation and random brightness perturbation are used.
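As an illustration of point 2, weighting a synthetic sample at 1/30 of a real sample could, for example, be done through a per-sample loss weight as sketched below; whether the weighting is applied to the loss or to the sampling probability is an assumption here, not stated by the text.

```python
# Illustrative way to give program-generated samples 1/30 the training weight of real ones.
# This sketch assumes per-sample loss weighting; the actual mechanism is not specified above.
import torch

def weighted_batch_loss(per_sample_losses, is_synthetic, synthetic_weight=1.0 / 30.0):
    """per_sample_losses: (B,) tensor; is_synthetic: (B,) bool tensor."""
    weights = torch.where(is_synthetic,
                          torch.full_like(per_sample_losses, synthetic_weight),
                          torch.ones_like(per_sample_losses))
    return (per_sample_losses * weights).sum() / weights.sum()
```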
The final performance of our model is shown in Table 1.
TABLE 1 Model metrics and test results

Number of test samples | Accuracy | Recall | Direction accuracy | mAP@0.5
500 | 96.5% | 95.7% | 95.9% | 93.1%
With our post-processing algorithm, the label samples can also simply be classified into categories such as Her-2, Ki-67, ER and PR. Automatic classification of the labels provides the necessary prerequisite for the subsequent automatic processing of digital pathological sections. The classification results of the model are shown in Table 2:
TABLE 2 Model classification results

Number of test samples | Accuracy | Recall
925 | 100.0% | 97.5%
FIG. 6 shows an example of the detection results. The colors of the target boxes in FIG. 6 represent different directions, e.g., yellow for right, blue for up and green for left, and the text in a label may be in any direction. Simple character-level detection with a general object detector such as RetinaNet cannot correctly distinguish direction-sensitive characters such as "6", "9", "-" and "_"; with the help of the LineAttention module, we can correctly distinguish these direction-sensitive characters.
The present invention is capable of other embodiments, and various changes and modifications may be made by one skilled in the art without departing from the spirit and scope of the invention.
The prior art documents to which the present invention relates are as follows:
[1]. Yuliang L, Lianwen J, Shuaitao Z, et al. Detecting Curve Text in the Wild: New Dataset and New Solution [J]. 2017.
[2]. Lin T Y, Goyal P, Girshick R, et al. Focal Loss for Dense Object Detection [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, PP(99): 2999-3007.
[3]. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778.
[4]. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. In Neural Information Processing Systems (NIPS), 2017.
[5]. R. Girshick, "Fast R-CNN," in IEEE International Conference on Computer Vision (ICCV), 2015.
[6]. T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740-755. Springer, 2014.

Claims (7)

1. A pathological section label identification method, characterized in that: the pathological section label image is identified by a deep learning method; the base network of the model adopted by the deep learning is a RetinaNet network based on ResNet-50, together with a module for helping the base network identify direction-sensitive characters; the topmost output network and the middle output network of the base network share weights, and the bottommost network uses separate weights; the module comprises a vertical self-attention branch, a horizontal self-attention branch and a middle branch, and the fusion method of the module is as follows:
O = Cv·β + Ch·(1 − β) (1)
in formula (1): O represents the output, Cv represents the vertical self-attention branch, Ch represents the horizontal self-attention branch, and β is the output of the middle branch;
the prediction stage processing steps of the deep learning are as follows:
a. preprocessing an input image;
b. scaling the preprocessed image to a fixed size;
c. performing forward propagation with the model;
d. dividing the results output in step c into two groups: words and characters;
e. aggregating the characters into words according to whether the words and the characters overlap;
f. counting the directions of all characters in the same word and determining the direction of the current word by voting;
g. arranging the characters within a word in order along the word's direction;
h. determining whether spaces exist between characters according to the distances between the characters in the word, and adding spaces where needed;
i. outputting the result.
2. The pathological section label identification method according to claim 1, wherein: the topmost Anchor box ratios of the model are 1, 7 and 7; the bottommost Anchor box ratios are 1, 2 and 2.
3. The pathological section label identification method according to any one of claims 1-2, wherein: the loss function of the training network is as follows:
L = Lcls(p, u) + λ[u ≥ 1]Lloc(t^u, v) + γLdre(p, w) (2)
in formula (2): Lcls(p, u) = −log p_u, where u is the class of the target box in the output result and the background class number is 0; Lloc is the regression loss of the target box; Ldre(p, w) = −log p_w, where w is the direction of the target box in the output result; λ and γ are the weights of the corresponding losses.
4. The pathological section label identification method according to claim 3, wherein: λ is 10 and γ is 1.
5. The pathological section label identification method according to claim 3, wherein: the deep learning training stage comprises the following processing steps:
step 1, preprocessing an input image;
step 2, performing data enhancement on the preprocessed image by random cropping, left-right flipping, up-down flipping, rotation at an arbitrary angle, color perturbation, random brightness transformation and random noise addition;
step 3, scaling the image processed in step 2 to a fixed size;
step 4, forming a batch from the scaled images;
step 5, performing forward propagation with the model;
step 6, calculating the loss with the loss function, back-propagating, and updating the training parameters;
step 7, performing iterative training until the model converges.
6. The pathological section label identification method according to claim 5, wherein: the preprocessing method is as follows:
img = (img − μ) / σ (3)
in formula (3), μ is the mean of the image and σ is the variance of the image.
7. The pathological section label identification method according to claim 5, wherein: the fixed size is 512 × 512 and the number of images per batch is 16.
CN202010199537.0A 2020-03-19 2020-03-19 Pathological section label identification method Active CN111553361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010199537.0A CN111553361B (en) 2020-03-19 2020-03-19 Pathological section label identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010199537.0A CN111553361B (en) 2020-03-19 2020-03-19 Pathological section label identification method

Publications (2)

Publication Number Publication Date
CN111553361A CN111553361A (en) 2020-08-18
CN111553361B (en) 2022-11-01

Family

ID=72001858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010199537.0A Active CN111553361B (en) 2020-03-19 2020-03-19 Pathological section label identification method

Country Status (1)

Country Link
CN (1) CN111553361B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634279B (en) * 2020-12-02 2023-04-07 四川大学华西医院 Medical image semantic segmentation method based on attention Unet model
CN114648680B (en) * 2022-05-17 2022-08-16 腾讯科技(深圳)有限公司 Training method, device, equipment and medium of image recognition model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110245657A (en) * 2019-05-17 2019-09-17 清华大学 Pathological image similarity detection method and detection device
WO2019192397A1 (en) * 2018-04-04 2019-10-10 华中科技大学 End-to-end recognition method for scene text in any shape
CN110781305A (en) * 2019-10-30 2020-02-11 北京小米智能科技有限公司 Text classification method and device based on classification model and model training method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10282414B2 (en) * 2017-02-28 2019-05-07 Cisco Technology, Inc. Deep learning bias detection in text
CN109447078B (en) * 2018-10-23 2020-11-06 四川大学 Detection and identification method for natural scene image sensitive characters
CN109753954A (en) * 2018-11-14 2019-05-14 安徽艾睿思智能科技有限公司 The real-time positioning identifying method of text based on deep learning attention mechanism
CN109697414B (en) * 2018-12-13 2021-06-18 北京金山数字娱乐科技有限公司 Text positioning method and device
CN109977861B (en) * 2019-03-25 2023-06-20 中国科学技术大学 Off-line handwriting mathematical formula recognition method
CN110837835B (en) * 2019-10-29 2022-11-08 华中科技大学 End-to-end scene text identification method based on boundary point detection

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019192397A1 (en) * 2018-04-04 2019-10-10 华中科技大学 End-to-end recognition method for scene text in any shape
CN110245657A (en) * 2019-05-17 2019-09-17 清华大学 Pathological image similarity detection method and detection device
CN110781305A (en) * 2019-10-30 2020-02-11 北京小米智能科技有限公司 Text classification method and device based on classification model and model training method

Also Published As

Publication number Publication date
CN111553361A (en) 2020-08-18

Similar Documents

Publication Publication Date Title
US7570816B2 (en) Systems and methods for detecting text
US8494273B2 (en) Adaptive optical character recognition on a document with distorted characters
Jain et al. Unconstrained scene text and video text recognition for arabic script
CN102385592B (en) Image concept detection method and device
CN112613502A (en) Character recognition method and device, storage medium and computer equipment
CN113591866B (en) Special operation certificate detection method and system based on DB and CRNN
CN113361432B (en) Video character end-to-end detection and identification method based on deep learning
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
CN113343989B (en) Target detection method and system based on self-adaption of foreground selection domain
CN111553361B (en) Pathological section label identification method
CN109213886B (en) Image retrieval method and system based on image segmentation and fuzzy pattern recognition
CN114663904A (en) PDF document layout detection method, device, equipment and medium
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN111178196A (en) Method, device and equipment for cell classification
Nguyen TableSegNet: a fully convolutional network for table detection and segmentation in document images
Li et al. Image pattern recognition in identification of financial bills risk management
CN112508000B (en) Method and equipment for generating OCR image recognition model training data
CN111832497B (en) Text detection post-processing method based on geometric features
CN111767919A (en) Target detection method for multi-layer bidirectional feature extraction and fusion
US20230154217A1 (en) Method for Recognizing Text, Apparatus and Terminal Device
CN111414917A (en) Identification method of low-pixel-density text
CN116030469A (en) Processing method, processing device, processing equipment and computer readable storage medium
CN113205049A (en) Document identification method and identification system
Rani et al. Object Detection in Natural Scene Images Using Thresholding Techniques
Bumbu On classification of 17th century fonts using neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant