CN111522951A - Sensitive data identification and classification technical method based on image identification - Google Patents
- Publication number
- CN111522951A (application number CN202010338824.5A)
- Authority
- CN
- China
- Prior art keywords
- sensitive
- image
- strategy
- identification
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Probability & Statistics with Applications (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses a sensitive data identification and classification method based on image identification, comprising a method for constructing a sensitive strategy model and a method for marking and classifying sensitive information in images. The method extracts the characters in an image directly through OCR (optical character recognition) and classifies the image's sensitive information according to a preset sensitive information strategy. Because classification is driven by the strategy rather than by a trained image classifier, the strategy can be modified directly at a later stage without any further image classification training, which gives the method higher extensibility and finer-grained sensitivity classification and lets it adapt to the differing business requirements of different business systems. By extracting image text and applying sensitive strategies, sensitive information is identified quickly and accurately, sensitive features can be extended, user-defined sensitive information can be matched in images, and leakage of private data is reduced.
Description
Technical Field
The invention belongs to the fields of data security technology and machine learning, and in particular relates to a sensitive data identification and classification method based on image identification.
Background
For identifying sensitive content in images, deep-learning-based image classification methods are currently available. Their core task is to assign an image a label from a predefined set of categories: the method identifies information such as objects, scenes and behaviors in the image and returns the corresponding label information.
With this approach, each object is recognized by a machine learning model trained with TensorFlow (a symbolic mathematics system based on dataflow programming). Training on image data with TensorFlow involves three steps, labeling, training and classifying, of which the labeling step is very time-consuming.
When images are classified with a machine learning model built in TensorFlow, a large image training set is used to train the model, a validation set is used to check whether the model is overfitting, and a test set is used to measure the model's accuracy. The prior art therefore has the following disadvantages: 1. a large amount of sample data is required for learning, and many sample pictures must be provided for each picture type; 2. the computation is heavy, the demands on computer hardware are high, and training takes a long time; 3. revising the classification samples requires retraining; 4. different learning rates can lead to different results: if the learning rate is too high, the accuracy keeps jumping up and down during training, and if it is too low, the expected accuracy may not be reached before training ends; 5. images that look similar but contain different types of sensitive text cannot be finely distinguished.
Image classification with a TensorFlow-trained model therefore cannot accurately classify and extract the text information in an image, so a sensitive data identification and classification method based on image identification is needed.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a sensitive data identification and classification method based on image identification.
To achieve this purpose, the invention provides the following technical solution: a sensitive data identification and classification method based on image identification, comprising a method for constructing a sensitive strategy model and a method for marking and classifying sensitive information in images, wherein the method for constructing the sensitive strategy model comprises the following steps:
S1, acquiring sensitive information features: sensitive information features and sensitive elements are obtained from sensitive samples, wherein the sensitive features are a set of minimal sensitive metadata, are used only for configuring sensitive strategies, and cannot be used directly for sensitive identification;
S2, constructing sensitive strategy rules: different sensitive strategies are constructed according to the actual scenario, industry specifications and scope of use, and sensitivity classification and grading are set on the basis of the sensitive strategies, wherein a sensitive strategy combines sensitive elements and limits the identification range;
the method for marking and classifying sensitive information in images comprises the following steps:
A1, image text region segmentation: the text in an image is segmented into regions by the PSENet algorithm, whose backbone is set to a ResNet structure; for each text instance, multiple predictions are made, corresponding to a plurality of kernels of different scales; the kernels of the minimal scale are gradually expanded by the progressive scale expansion (PSE) algorithm until they match the full shape of the text instance; because there are large geometric margins between the minimal-scale kernels, the PSE algorithm can separate adjacent text instances and detect text instances of arbitrary shape;
A2, text recognition: image-based sequence recognition is performed with a convolutional recurrent neural network (CRNN) structure; the convolutional layers (CNN) extract a feature sequence from the input image to obtain a feature map; the recurrent layers (RNN), a deep bidirectional recurrent neural network (BLSTM), learn each feature vector in the feature sequence and output a predicted label distribution for it; the transcription layer (CTC) uses the CTC loss to convert the series of label distributions obtained from the recurrent layers into the final label sequence by selecting the label sequence with the highest probability;
A3, identification by the sensitive engine: an image to be identified is added to a queue; a background task takes the queue information, recognizes the image, obtains the text information in the image and saves it as a sample file; the available sensitive strategies are obtained according to the system configuration items, and the hit sensitive feature information is summarized; after the feature information has been identified, the strategy rules are matched according to the configured strategy information, and the hit strategy rules are summarized and recorded.
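The queue-driven flow of step A3 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `fake_ocr` is a hypothetical stand-in for steps A1 and A2, the `FEATURES` and `STRATEGIES` tables and their regular expressions are assumptions made for the demo, and saving the recognized text as a sample file is omitted.

```python
import queue
import re
import threading

# Hypothetical sensitive features: feature name -> regular expression.
FEATURES = {
    "mobile_number": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "id_number": re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),
}

# Hypothetical strategies: a strategy is hit when all of its features are hit.
STRATEGIES = {
    "contact_info": {"mobile_number"},
    "identity_record": {"mobile_number", "id_number"},
}


def fake_ocr(image_path):
    """Stand-in for steps A1 and A2: returns canned text for the demo."""
    samples = {
        "a.png": "Tel 13812345678",
        "b.png": "ID 11010519900307123X, Tel 13912345678",
        "c.png": "nothing to see here",
    }
    return samples.get(image_path, "")


def match_strategies(text):
    """Collect the hit features, then return the names of the hit strategies."""
    hit_features = {name for name, rx in FEATURES.items() if rx.search(text)}
    return sorted(s for s, needed in STRATEGIES.items() if needed <= hit_features)


def worker(tasks, results, lock):
    """Background task: take images from the queue until a sentinel arrives."""
    while True:
        image_path = tasks.get()
        if image_path is None:  # sentinel: stop this worker
            break
        hits = match_strategies(fake_ocr(image_path))
        with lock:
            results[image_path] = hits


def run_engine(image_paths, num_workers=2):
    """Queue the images, process them in background threads, return the hits."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for path in image_paths:
        tasks.put(path)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for path, hits in sorted(run_engine(["a.png", "b.png", "c.png"]).items()):
        print(path, hits)
```

Keeping the features and strategies in plain data tables reflects the text's point that the strategy can be changed later without retraining anything; only the tables would need editing.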
Preferably, in step S1, the sensitive features are specifically sensitive information such as mobile phone numbers, names, addresses and identity card numbers.
Preferably, in step S2, the sensitive data identification standard of a sensitive strategy is a combination of sensitive elements, and the ordering rule for the combination is one of three types: unordered, ordered and interval ordering.
Preferably, in step A1, the network structure of the PSENet algorithm is a feature pyramid network structure similar to FPN.
Preferably, in step A1, each of the kernels of different scales has a shape similar to that of the original whole text instance, and all the kernels are located at the same center point as the original whole text instance but differ in scale.
Preferably, in step A2, the CRNN network structure comprises three parts: CNN (convolutional layers), RNN (recurrent layers) and CTC (transcription layer).
Preferably, in step A3, the sensitive engine identifies the images in the queue using multiple threads.
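The three ordering rules for a combination of sensitive elements in step S2 are not defined further in the text, so the sketch below rests on an assumed reading: "unordered" means every element occurs somewhere in the text, "ordered" means the elements occur one after another in the listed order, and "interval" additionally limits the number of characters between consecutive elements. The function names, the `max_gap` parameter and the example patterns are all hypothetical.

```python
import re


def find_spans(text, patterns):
    """Return, for each element pattern, its list of (start, end) match spans."""
    return [[m.span() for m in re.finditer(p, text)] for p in patterns]


def _chain(spans_per_element, index, prev_end, max_gap):
    """Try to pick one span per element so that the spans appear in order."""
    if index == len(spans_per_element):
        return True
    for start, end in spans_per_element[index]:
        if prev_end is not None:
            if start < prev_end:
                continue  # must begin after the previous element ends
            if max_gap is not None and start - prev_end > max_gap:
                continue  # too far from the previous element
        if _chain(spans_per_element, index + 1, end, max_gap):
            return True
    return False


def match_unordered(text, patterns):
    """Every element occurs somewhere in the text, in any order."""
    return all(find_spans(text, patterns))


def match_ordered(text, patterns):
    """Elements occur in the listed order, at any distance."""
    return _chain(find_spans(text, patterns), 0, None, None)


def match_interval(text, patterns, max_gap):
    """Elements occur in order, at most max_gap characters apart."""
    return _chain(find_spans(text, patterns), 0, None, max_gap)


if __name__ == "__main__":
    elements = [r"Name", r"1[3-9]\d{9}"]  # hypothetical sensitive elements
    text = "Name: Zhang San, phone 13812345678"
    print(match_unordered(text, elements))
    print(match_ordered(text, elements))
    print(match_interval(text, elements, max_gap=10))
```

A real engine would presumably attach one of these rules to each strategy; here they are kept as separate functions so that each reading is easy to inspect on its own.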
The technical effects and advantages of the invention are as follows: the method extracts the characters in an image directly through OCR (optical character recognition) and classifies the image's sensitive information according to a preset sensitive information strategy; it can accurately identify and classify image information, the sensitive information strategy can be modified directly at a later stage without further image classification training, and the method offers higher extensibility and finer-grained sensitivity classification;
sensitive strategy rules and sensitivity classification and grading methods are established according to sensitive feature information sequences, and mathematical analysis methods such as clustering, sampling and probability are used to turn the sensitive rules into data for sensitive information classification and grading; the algorithm that identifies sensitive information in the engine according to the sensitive strategies uses mathematical analysis methods such as probability and statistics to match sensitive information; the sensitive features are automatically optimized, updated, improved and enriched according to scenario analysis, so the method can adapt to the differing business requirements of different business systems; by extracting image text and applying sensitive strategies, sensitive information is identified quickly and accurately, sensitive features can be extended, user-defined sensitive information can be matched in images, and leakage of private data is reduced.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A sensitive data identification and classification method based on image identification comprises a method for constructing a sensitive strategy model and a method for marking and classifying sensitive information in images, wherein the method for constructing the sensitive strategy model comprises the following steps:
S1, acquiring sensitive information features: sensitive information features and sensitive elements are obtained from sensitive samples, wherein the sensitive features are a set of minimal sensitive metadata, are used only for configuring sensitive strategies, and cannot be used directly for sensitive identification;
S2, constructing sensitive strategy rules: different sensitive strategies are constructed according to the actual scenario, industry specifications and scope of use, and sensitivity classification and grading are set on the basis of the sensitive strategies, wherein a sensitive strategy combines sensitive elements and limits the identification range;
the method for marking and classifying sensitive information in images comprises the following steps:
A1, image text region segmentation: the text in an image is segmented into regions by the PSENet algorithm, whose backbone is set to a ResNet structure; for each text instance, multiple predictions are made, corresponding to a plurality of kernels of different scales; the kernels of the minimal scale are gradually expanded by the progressive scale expansion (PSE) algorithm until they match the full shape of the text instance; because there are large geometric margins between the minimal-scale kernels, the PSE algorithm can separate adjacent text instances and detect text instances of arbitrary shape;
A2, text recognition: image-based sequence recognition is performed with a convolutional recurrent neural network (CRNN) structure; the convolutional layers (CNN) extract a feature sequence from the input image to obtain a feature map; the recurrent layers (RNN), a deep bidirectional recurrent neural network (BLSTM), learn each feature vector in the feature sequence and output a predicted label distribution for it; the transcription layer (CTC) uses the CTC loss to convert the series of label distributions obtained from the recurrent layers into the final label sequence by selecting the label sequence with the highest probability;
A3, identification by the sensitive engine: an image to be identified is added to a queue; a background task takes the queue information, recognizes the image, obtains the text information in the image and saves it as a sample file; the available sensitive strategies are obtained according to the system configuration items, and the hit sensitive feature information is summarized; after the feature information has been identified, the strategy rules are matched according to the configured strategy information, and the hit strategy rules are summarized and recorded.
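The transcription part of step A2, which turns per-frame label distributions into a final label sequence, can be illustrated with greedy (best-path) CTC decoding. This is a common approximation of "selecting the label sequence with the highest probability" and is an assumption here; the text does not say which decoding procedure is used. The function name, the three-symbol alphabet and the probabilities are made up for the example.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Best-path CTC decoding.

    frame_probs: one probability list per frame, indexed by label id.
    alphabet:    label id -> character (the entry at `blank` is unused).
    """
    # 1. Take the most probable label in every frame.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]

    # 2. Collapse consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for label in best_path:
        if label != previous and label != blank:
            decoded.append(alphabet[label])
        previous = label
    return "".join(decoded)


if __name__ == "__main__":
    alphabet = ["", "a", "b"]  # index 0 is the CTC blank
    frame_probs = [
        [0.10, 0.80, 0.10],  # a
        [0.20, 0.70, 0.10],  # a (repeat, collapsed)
        [0.90, 0.05, 0.05],  # blank separates the two a's
        [0.10, 0.60, 0.30],  # a
        [0.10, 0.20, 0.70],  # b
        [0.20, 0.10, 0.70],  # b (repeat, collapsed)
    ]
    print(ctc_greedy_decode(frame_probs, alphabet))
```

The blank label is what lets the decoder keep a genuinely doubled character: the two `a` runs above survive as two characters only because a blank frame sits between them.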
Further, in step S1, the sensitive features are specifically sensitive information such as mobile phone numbers, names, addresses and identity card numbers.
Further, in step S2, the sensitive data identification standard of a sensitive strategy is a combination of sensitive elements, and the ordering rule for the combination is one of three types: unordered, ordered and interval ordering.
Further, in step A1, the network structure of the PSENet algorithm is a feature pyramid network structure similar to FPN.
Further, in step A1, each of the kernels of different scales has a shape similar to that of the original whole text instance, and all the kernels are located at the same center point as the original whole text instance but differ in scale.
Further, in step A2, the CRNN network structure comprises three parts: CNN (convolutional layers), RNN (recurrent layers) and CTC (transcription layer).
Further, in step A3, the sensitive engine identifies the images in the queue using multiple threads.
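The progressive scale expansion described in step A1 can be sketched as a breadth-first expansion over binary kernel masks. This is a simplified, pure-Python illustration under stated assumptions: the kernels are given as 0/1 grids ordered from smallest to largest, 4-connectivity is used, and contested pixels go to whichever instance reaches them first. The function names are hypothetical and the network that predicts the kernels is not shown.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Give each connected region of the smallest kernel its own label (1, 2, ...)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                pending = deque([(y, x)])
                while pending:
                    cy, cx = pending.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            pending.append((ny, nx))
    return labels


def expand(labels, kernel):
    """Grow the existing labels into the pixels of a larger kernel, breadth-first."""
    h, w = len(kernel), len(kernel[0])
    pending = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
    while pending:
        cy, cx = pending.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = cy + dy, cx + dx
            if (0 <= ny < h and 0 <= nx < w
                    and kernel[ny][nx] and labels[ny][nx] == 0):
                labels[ny][nx] = labels[cy][cx]  # first come, first served
                pending.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """kernels: list of 0/1 grids, ordered from the smallest scale to the largest."""
    labels = label_components(kernels[0])
    for kernel in kernels[1:]:
        labels = expand(labels, kernel)
    return labels


if __name__ == "__main__":
    smallest = [[0, 1, 0, 0, 0, 1, 0]]
    largest = [[1, 1, 1, 1, 1, 1, 1]]
    print(progressive_scale_expansion([smallest, largest]))
```

In the demo the largest kernel is a single merged region, yet the result still contains two labels, because the instances were separated at the smallest scale, where the margin between them is widest.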
In summary, the invention provides a sensitive data identification and classification method based on image identification that extracts the characters in an image directly through OCR (optical character recognition) and classifies the image's sensitive information according to a preset sensitive information strategy; it can accurately identify and classify image information, the sensitive information strategy can be modified directly at a later stage without further image classification training, and the method offers higher extensibility and finer-grained sensitivity classification;
the method overcomes the shortcomings of conventional image sensitive information identification, namely low identification accuracy, few preset classes, low efficiency when sensitive classes are added later, and the inability to accurately distinguish similar images or extract specific sensitive information from them. Sensitive strategies are enriched through machine learning and manual intervention, and classification and extraction of sensitive information in images are achieved by extracting text and matching it against strategies, which improves the accuracy and extensibility of image sensitive information identification. Developers can easily extend the sensitive features to match user-defined sensitive information in images without repeatedly training on and identifying samples, which reduces leakage of private data;
sensitive strategy rules and sensitivity classification and grading methods are established according to sensitive feature information sequences, and mathematical analysis methods such as clustering, sampling and probability are used to turn the sensitive rules into data for sensitive information classification and grading; the algorithm that identifies sensitive information in the engine according to the sensitive strategies uses mathematical analysis methods such as probability and statistics to match sensitive information; the sensitive features are automatically optimized, updated, improved and enriched according to scenario analysis, so the method can adapt to the differing business requirements of different business systems; image text is extracted and sensitive information is identified quickly and accurately using sensitive strategies.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments or portions thereof without departing from the spirit and scope of the invention.
Claims (7)
1. A sensitive data identification and classification method based on image identification, comprising a method for constructing a sensitive strategy model and a method for marking and classifying sensitive information in images, characterized in that:
the method for constructing the sensitive strategy model comprises the following steps:
S1, acquiring sensitive information features: sensitive information features and sensitive elements are obtained from sensitive samples, wherein the sensitive features are a set of minimal sensitive metadata, are used only for configuring sensitive strategies, and cannot be used directly for sensitive identification;
S2, constructing sensitive strategy rules: different sensitive strategies are constructed according to the actual scenario, industry specifications and scope of use, and sensitivity classification and grading are set on the basis of the sensitive strategies, wherein a sensitive strategy combines sensitive elements and limits the identification range;
the method for marking and classifying sensitive information in images comprises the following steps:
A1, image text region segmentation: the text in an image is segmented into regions by the PSENet algorithm, whose backbone is set to a ResNet structure; for each text instance, multiple predictions are made, corresponding to a plurality of kernels of different scales; the kernels of the minimal scale are gradually expanded by the progressive scale expansion (PSE) algorithm until they match the full shape of the text instance; because there are large geometric margins between the minimal-scale kernels, the PSE algorithm can separate adjacent text instances and detect text instances of arbitrary shape;
A2, text recognition: image-based sequence recognition is performed with a convolutional recurrent neural network (CRNN) structure; the convolutional layers (CNN) extract a feature sequence from the input image to obtain a feature map; the recurrent layers (RNN), a deep bidirectional recurrent neural network (BLSTM), learn each feature vector in the feature sequence and output a predicted label distribution for it; the transcription layer (CTC) uses the CTC loss to convert the series of label distributions obtained from the recurrent layers into the final label sequence by selecting the label sequence with the highest probability;
A3, identification by the sensitive engine: an image to be identified is added to a queue; a background task takes the queue information, recognizes the image, obtains the text information in the image and saves it as a sample file; the available sensitive strategies are obtained according to the system configuration items, and the hit sensitive feature information is summarized; after the feature information has been identified, the strategy rules are matched according to the configured strategy information, and the hit strategy rules are summarized and recorded.
2. The sensitive data identification and classification method based on image identification according to claim 1, characterized in that: in step S1, the sensitive features are specifically sensitive information such as mobile phone numbers, names, addresses and identity card numbers.
3. The sensitive data identification and classification method based on image identification according to claim 1, characterized in that: in step S2, the sensitive data identification standard of a sensitive strategy is a combination of sensitive elements, and the ordering rule for the combination is one of three types: unordered, ordered and interval ordering.
4. The sensitive data identification and classification method based on image identification according to claim 1, characterized in that: in step A1, the network structure of the PSENet algorithm is a feature pyramid network structure similar to FPN.
5. The sensitive data identification and classification method based on image identification according to claim 1, characterized in that: in step A1, each of the kernels of different scales has a shape similar to that of the original whole text instance, and all the kernels are located at the same center point as the original whole text instance but differ in scale.
6. The sensitive data identification and classification method based on image identification according to claim 1, characterized in that: in step A2, the CRNN network structure comprises three parts: CNN (convolutional layers), RNN (recurrent layers) and CTC (transcription layer).
7. The sensitive data identification and classification method based on image identification according to claim 1, characterized in that: in step A3, the sensitive engine identifies the images in the queue using multiple threads.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010338824.5A CN111522951A (en) | 2020-04-26 | 2020-04-26 | Sensitive data identification and classification technical method based on image identification |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010338824.5A CN111522951A (en) | 2020-04-26 | 2020-04-26 | Sensitive data identification and classification technical method based on image identification |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111522951A true CN111522951A (en) | 2020-08-11 |
Family
ID=71903881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010338824.5A Pending CN111522951A (en) | 2020-04-26 | 2020-04-26 | Sensitive data identification and classification technical method based on image identification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111522951A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112598016A (en) * | 2020-09-17 | 2021-04-02 | 北京小米松果电子有限公司 | Image classification method and device, communication equipment and storage medium |
CN112990212A (en) * | 2021-02-05 | 2021-06-18 | 开放智能机器(上海)有限公司 | Reading method and device of thermal imaging temperature map, electronic equipment and storage medium |
CN113221890A (en) * | 2021-05-25 | 2021-08-06 | 深圳市瑞驰信息技术有限公司 | OCR-based cloud mobile phone text content supervision method, system and system |
CN113240024A (en) * | 2021-05-20 | 2021-08-10 | 贾晓丰 | Data classification method and device based on deep learning clustering algorithm |
CN114117533A (en) * | 2021-11-30 | 2022-03-01 | 重庆理工大学 | Method and system for classifying picture data |
CN114218391A (en) * | 2021-12-30 | 2022-03-22 | 闪捷信息科技有限公司 | Sensitive information identification method based on deep learning technology |
CN115641594A (en) * | 2022-12-23 | 2023-01-24 | 广州佰锐网络科技有限公司 | OCR technology-based identification card recognition method, storage medium and device |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101859382A (en) * | 2010-06-03 | 2010-10-13 | 复旦大学 | License plate detection and identification method based on maximum stable extremal region |
CN103605992A (en) * | 2013-11-28 | 2014-02-26 | 国家电网公司 | Sensitive image recognizing method in interaction of inner and outer power networks |
CN109189965A (en) * | 2018-07-19 | 2019-01-11 | 中国科学院信息工程研究所 | Pictograph search method and system |
CN109447078A (en) * | 2018-10-23 | 2019-03-08 | 四川大学 | Method for detecting and recognizing sensitive text in natural scene images |
CN109446817A (en) * | 2018-10-29 | 2019-03-08 | 成都思维世纪科技有限责任公司 | Big data detection and auditing system |
CN109492118A (en) * | 2018-10-31 | 2019-03-19 | 北京奇艺世纪科技有限公司 | Data detection method and detection device |
CN110008950A (en) * | 2019-03-13 | 2019-07-12 | 南京大学 | Shape-robust method for text detection in natural scenes |
CN110458132A (en) * | 2019-08-19 | 2019-11-15 | 河海大学常州校区 | End-to-end variable-length text recognition method |
CN110738207A (en) * | 2019-09-10 | 2020-01-31 | 西南交通大学 | Character detection method for fusing character area edge information in character image |
CN110866108A (en) * | 2019-11-20 | 2020-03-06 | 满江(上海)软件科技有限公司 | Sensitive data detection system and detection method thereof |
CN110880000A (en) * | 2019-11-27 | 2020-03-13 | 上海智臻智能网络科技股份有限公司 | Picture character positioning method and device, computer equipment and storage medium |
CN110991440A (en) * | 2019-12-11 | 2020-04-10 | 易诚高科(大连)科技有限公司 | Pixel-driven mobile phone operation interface text detection method |
- 2020-04-26: CN application CN202010338824.5A (publication CN111522951A), status: active, pending
Non-Patent Citations (1)
Title |
---|
WENHAI WANG et al.: "Shape Robust Text Detection with Progressive Scale Expansion Network", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111522951A (en) | | Sensitive data identification and classification technical method based on image identification |
CN114240878A (en) | | Routing inspection scene-oriented insulator defect detection neural network construction and optimization method |
CN105574550A (en) | | Vehicle identification method and device |
CN108304835A (en) | | Character detection method and device |
EP1388804A2 (en) | | Method and system for face detection using pattern classifier |
CN108171184A (en) | | Pedestrian re-identification method based on Siamese networks |
CN111767927A (en) | | Lightweight license plate recognition method and system based on full convolution network |
EP4124993A1 (en) | | Automatic image classification and processing method based on continuous processing structure of multiple artificial intelligence model, and computer program stored in computer-readable recording medium to execute the same |
CN111583180B (en) | | Image tampering identification method and device, computer equipment and storage medium |
CN113963147B (en) | | Key information extraction method and system based on semantic segmentation |
CN102385592A (en) | | Image concept detection method and device |
CN112766170B (en) | | Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image |
CN109858570A (en) | | Image classification method and system, computer equipment and medium |
CN112464925A (en) | | Mobile terminal account opening data bank information automatic extraction method based on machine learning |
CN110751191A (en) | | Image classification method and system |
CN111461143A (en) | | Picture copying identification method and device and electronic equipment |
CN111652846A (en) | | Semiconductor defect identification method based on characteristic pyramid convolution neural network |
CN114639152A (en) | | Multi-modal voice interaction method, device, equipment and medium based on face recognition |
CN111553361B (en) | | Pathological section label identification method |
CN117372956A (en) | | Method and device for detecting state of substation screen cabinet equipment |
CN112364687A (en) | | Improved Faster R-CNN gas station electrostatic sign identification method and system |
CN111444362A (en) | | Malicious picture intercepting method, device, equipment and storage medium |
CN111767919A (en) | | Target detection method for multi-layer bidirectional feature extraction and fusion |
CN112380970B (en) | | Video target detection method based on local area search |
CN114927236A (en) | | Detection method and system for multiple target images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 2020-08-11 |