CN109284756A - A kind of terminal censorship method based on OCR technique
- Publication number
- CN109284756A (application CN201810865946.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- character
- terminal
- carries out
- censorship
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Character Input (AREA)
Abstract
The invention discloses a terminal confidentiality inspection ("censorship") method based on OCR technology, comprising the following steps. A terminal sends an inspection request to a server. The server records information about the terminal, including the connection time and IP address, and uses a file analysis module to determine the type of each file to be inspected. Files that the file analysis module identifies as images are processed by an image processing module. An image cutting module then segments the text in the processed image. A feature extraction module extracts character features from the segmented text and performs classification and recognition. A text matching module performs keyword matching on the recognized characters: if a keyword is found, the information obtained is displayed on the terminal interface; otherwise the system exits directly. The method can further improve the performance of the OCR system, reduce the complexity of the inspection system, and achieve confidentiality inspection of terminals.
Description
Technical field
The invention belongs to the field of computer and information security, and in particular relates to a terminal confidentiality inspection method based on OCR technology.
Background
With the rapid development of Internet technology, people depend ever more heavily on the Internet. The volume of information is growing explosively and producing massive amounts of data, and network leaks and network theft of secrets occur from time to time. Most leaks of state or party secrets today are closely tied to the Internet, computers, and storage media; leakage through these channels has become the main route by which secrets are currently disclosed, and it is rising year by year. In traditional confidentiality inspection, many departments mostly check only files that a computer can process directly, such as Office files and txt files, and are helpless when it comes to image files. Such inspection cannot meet current needs: it is inefficient, wastes considerable resources, and is therefore unsatisfactory.
It is therefore valuable to design a confidentiality inspection method that can inspect image files and is practical to operate, and the present invention arises from this.
Summary of the invention
The purpose of the invention is to provide a terminal confidentiality inspection method based on OCR technology that can further improve the performance of the OCR system, reduce the complexity of the inspection system, and achieve confidentiality inspection of terminals.
To achieve this purpose, the solution of the invention is:
A terminal confidentiality inspection method based on OCR technology, comprising the following steps:
Step 1, a terminal in the confidentiality inspection network sends an inspection request to the server, and the server connects to the terminal after responding to the request;
Step 2, the server records information about the terminal, including the connection time and IP address, and stores it in a database; the server uses the file analysis module to determine the type of the file to be inspected;
Step 3, files that the file analysis module identifies as images are processed by the image processing module, specifically:
If the image contains Gaussian noise, weighted denoising is applied using a method that fuses mean filtering with median filtering, yielding the denoised image;
If the image has a complex background, the character image is binarized to separate the foreground region from the background region;
If the image is tilted, the image is compressed and part of it is selected for a Hough transform, yielding the corrected image;
Step 4, the image cutting module segments the text in the processed image;
Step 5, the feature extraction module extracts character features from the segmented text and performs classification and recognition;
Step 6, the text matching module performs keyword matching on the recognized characters; if a keyword is found, the information obtained is displayed on the terminal interface; otherwise the system exits directly.
With the above scheme, the invention combines B/S and C/S architectures to carry out confidentiality inspection of networked terminals. It overcomes the drawback that many departments, when carrying out inspections, mostly check only files a computer can process directly, such as Office files and txt files, and are helpless with image files. On the one hand, it widens the range of inspected objects: it supports inspecting many file types and their contents, including picture files, Office files, web page files, compressed archives, mail files, and other common files, as well as OCR inspection of pictures. On the other hand, it improves the efficiency of inspection work and eliminates factors that interfere with character recognition. The invention meets current confidentiality inspection needs, improves efficiency overall, and saves considerable resources, so the method has practical merit.
The invention not only overcomes the limitation that most inspections can recognize only text and not image files, but also simplifies how information administrators operate the inspection system.
Description of the drawings
Fig. 1 is a diagram of the overall logical architecture of the invention;
Fig. 2 is a schematic diagram of the overall flow of the invention;
Fig. 3 is a schematic diagram of the file analysis module of the invention;
Fig. 4 is a schematic diagram of the image processing module of the invention;
Fig. 5 is a schematic diagram of the image cutting module of the invention;
Fig. 6 is a schematic diagram of the feature extraction module of the invention;
Fig. 7 shows the results of a routine inspection by the invention;
Fig. 8 shows the results of a deep inspection by the invention.
Specific embodiment
The technical solution and beneficial effects of the invention are described in detail below with reference to the drawings.
As shown in Fig. 1 and Fig. 2, the invention provides a terminal confidentiality inspection method based on OCR technology, comprising the following steps:
Step 1) A terminal in the confidentiality inspection network sends an inspection request to the server, and the server connects to the terminal after responding to the request. The inspection includes the routine inspection of file content shown in Fig. 7, which covers file names, file contents, mail contents, and picture files. It also supports the deep inspection of file content shown in Fig. 8, in which deleted files and operation records are restored and then inspected.
Step 2) The server records information about the terminal, including the connection time and IP address, and stores it in a database. The server uses the file analysis module shown in Fig. 3 to determine the type of the file to be inspected.
If the file is a compressed file, it is decompressed and the type of each file inside it is determined again. If the file is not an image file, the content of the document database is parsed. If the file is an image file, it is processed by the image processing module.
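The type dispatch in step 2 can be sketched as follows. This is a minimal illustration, not the patented file analysis module: the function names `classify` and `dispatch`, the route labels, the use of ZIP as the only archive format, and the magic-byte list are all assumptions made for the example.

```python
import io
import zipfile

# Magic-byte prefixes for a few common image formats (illustrative subset).
IMAGE_SIGNATURES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")


def classify(data: bytes) -> str:
    """Return 'archive', 'image', or 'document' for a file's raw bytes."""
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "archive"
    if data.startswith(IMAGE_SIGNATURES):
        return "image"
    return "document"


def dispatch(name: str, data: bytes, out: list) -> None:
    """Append (name, route) pairs to out, recursing into archives."""
    kind = classify(data)
    if kind == "archive":
        # Decompress and determine the type of every member again.
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for member in zf.namelist():
                if member.endswith("/"):  # skip directory entries
                    continue
                dispatch(f"{name}/{member}", zf.read(member), out)
    elif kind == "image":
        out.append((name, "image_processing"))  # hypothetical route label
    else:
        out.append((name, "content_parsing"))  # hypothetical route label
```

One design caveat: modern Office files such as .docx are themselves ZIP containers, so a real implementation would need to recognize them before treating a file as a plain archive.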
Step 3) Files that the file analysis module shown in Fig. 3 identifies as images are processed by the image processing module shown in Fig. 4.
Step 3.1) If the image contains Gaussian noise, the noise is removed with a method that fuses mean filtering and median filtering: the mean-filter and median-filter outputs are given different weights and combined in a weighted sum to obtain the denoised image;
Step 3.1.1) Set a 3 × 3 window and obtain the pixel value at each position of the window in the image containing Gaussian noise.
Step 3.1.2) Compute the mean and the median of the 3 × 3 window.
Step 3.1.3) Assign different weights to the mean and the median obtained above, compute their weighted sum, and set the result as the pixel value at the window's center position.
Step 3.1.4) Repeat the above steps to denoise the whole image.
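Steps 3.1.1 to 3.1.4 can be sketched in a few lines of plain Python. The weights `w_mean` and `w_median` are hypothetical values chosen for illustration, since the text does not state them, and leaving the border pixels unchanged is likewise an assumption.

```python
from statistics import median


def fused_denoise(img, w_mean=0.4, w_median=0.6):
    """Denoise a grayscale image (list of rows) with a weighted mean/median blend."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # assumption: border pixels are left unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Step 3.1.1: collect the nine pixel values of the 3 x 3 window.
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            # Step 3.1.2: window mean and median.
            mean_val = sum(window) / 9
            median_val = median(window)
            # Step 3.1.3: the weighted sum becomes the new center value.
            out[y][x] = round(w_mean * mean_val + w_median * median_val)
    return out
```

The median term suppresses isolated outliers while the mean term smooths Gaussian-like noise, so the weights trade one effect against the other.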
Step 3.2) If the image has a complex background, the character image is binarized to separate the foreground region from the background region, which reduces the computational complexity of the system.
Step 3.2.1) Compute the threshold T of the whole image using a global threshold method, and find the cluster centre T1.
Step 3.2.2) Using the thresholds computed in step 3.2.1), define the region to be re-thresholded. With c and d as variables, each pixel f(x, y) of the image is tested as follows:
(1-c)T ≤ f(x,y) ≤ (1+c)T
(1-d)T1 ≤ f(x,y) ≤ (1+d)T1
where c and d are preset parameters. If the inequalities above are satisfied, go to step 3.2.3); otherwise the pixel is binarized according to the global threshold method.
Step 3.2.3) When the above inequalities are satisfied, local thresholding is carried out using the improved Bernsen algorithm.
Step 3.2.4) Repeat the above steps to binarize the whole image.
The image is first binarized with the global threshold method, and the result is then binarized further with the improved Bernsen algorithm, which removes the complex background and yields a better foreground image;
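The hybrid scheme of step 3.2 might look like the sketch below. Several points are assumptions rather than details from the text: the global threshold is computed with a simple iterative two-class method, the classic Bernsen midpoint rule stands in for the unspecified "improved" variant, the cluster centre `t1` is supplied by the caller, and a pixel is treated as ambiguous when either inequality holds.

```python
def global_threshold(img, eps=0.5):
    """Iterative global threshold: midpoint of the two class means."""
    pixels = [p for row in img for p in row]
    t = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


def bernsen_threshold(img, y, x, radius=1):
    """Classic Bernsen local threshold: midpoint of the window's min and max."""
    h, w = len(img), len(img[0])
    window = [img[j][i]
              for j in range(max(0, y - radius), min(h, y + radius + 1))
              for i in range(max(0, x - radius), min(w, x + radius + 1))]
    return (min(window) + max(window)) / 2


def binarize(img, t1, c=0.1, d=0.1):
    """Global threshold for clear pixels, Bernsen for pixels near T or T1."""
    t = global_threshold(img)
    out = []
    for y, row in enumerate(img):
        out_row = []
        for x, p in enumerate(row):
            near_t = (1 - c) * t <= p <= (1 + c) * t
            near_t1 = (1 - d) * t1 <= p <= (1 + d) * t1
            # Assumption: either inequality marks the pixel as ambiguous.
            local = bernsen_threshold(img, y, x) if (near_t or near_t1) else t
            out_row.append(255 if p > local else 0)  # 255 = brighter than threshold
        out.append(out_row)
    return out
```

Restricting the local method to pixels near the thresholds keeps most of the image on the cheap global path, which matches the stated aim of reducing computation.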
Step 3.3) If the image is tilted, the image is compressed and part of it is selected for a Hough transform. The Hough transform converts the problem of detecting a straight line into the problem of counting the curves that intersect at a single point in parameter space. To do this, ρ and θ are discretized into N values so that the parameter space is divided into many cells, and an accumulator is established over that space.
Step 3.3.1) Compress the image, divide it into blocks, and select part of the character image as the detection target.
Step 3.3.2) Construct a discrete parameter space in the ρ-θ plane and establish the accumulator matrix A(ρ, θ), initializing every element to 0.
Step 3.3.3) Apply the Hough transform to every non-zero point of the binary image; for each θ and its corresponding ρ, record the result in the accumulator matrix.
Step 3.3.4) Finally, find the maximum value of the accumulator matrix over ρ; the θ value at that maximum is the inclination angle of the detected lines.
After the tilt angle of the character image is detected, the image is corrected by rotation. The transformation formula for rotation correction is as follows:
[formula image not available]
Processing the tilted image in this way improves the speed of image processing and yields the corrected image;
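The accumulator voting of steps 3.3.2 to 3.3.4 can be sketched as follows, using the normal form ρ = x·cos θ + y·sin θ. The function names, the one-degree step, and the sparse dictionary used in place of a dense matrix A(ρ, θ) are assumptions. `rotate_point` shows the standard planar rotation, which is offered only as the usual form of such a correction and not as the patent's own formula.

```python
import math


def hough_skew_angle(points, theta_step_deg=1.0):
    """Return theta (degrees) of the dominant line through the given (x, y) points.

    theta is the angle of the line's normal in rho = x*cos(theta) + y*sin(theta),
    so the line itself is inclined at theta - 90 degrees.
    """
    n_theta = int(round(180 / theta_step_deg))
    accumulator = {}  # (rho_bin, theta_index) -> votes, a sparse A(rho, theta)
    for x, y in points:
        for k in range(n_theta):
            theta = math.radians(k * theta_step_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            accumulator[(rho, k)] = accumulator.get((rho, k), 0) + 1
    # The cell with the most votes identifies the dominant line.
    (_, best_k), _ = max(accumulator.items(), key=lambda item: item[1])
    return best_k * theta_step_deg


def rotate_point(x, y, angle_deg):
    """Rotate (x, y) about the origin by angle_deg, counter-clockwise with y up."""
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))
```

Voting only with the points of a compressed, partial image, as step 3.3.1 suggests, shrinks the inner loop, which is where nearly all of the cost lies.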
Step 4) The image cutting module shown in Fig. 5 segments the text in the processed image. The module comprises a text localization module and a character segmentation module.
Step 4.1) The text localization module shown in Fig. 5 extracts the edge information of the character image with the Sobel operator, then applies character edge detection and morphological processing to the extracted edges to locate the text. This determines the character regions in the text, which are marked with rectangular boxes.
Step 4.2) The character segmentation module shown in Fig. 5 takes the character regions marked with rectangular boxes and uses a projection histogram to count the number of target points in each row or each column of the image space. The counts are laid out in image row and column order and are used to split the characters;
Step 4.3) Using the probability density distribution of each segmented character in the horizontal and vertical directions, the character is scaled proportionally and linearly normalized, giving the final segmented text;
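The projection-histogram split of step 4.2 can be illustrated with a short sketch. It assumes a binary image stored as a list of rows in which 1 marks a target (ink) pixel, and it splits wherever a column contains no ink; the function names are hypothetical.

```python
def column_projection(binary):
    """Count target (1) pixels in each column of a binary image."""
    return [sum(col) for col in zip(*binary)]


def split_characters(binary):
    """Return (start, end) column ranges, end exclusive, of runs that contain ink."""
    spans, start = [], None
    for x, count in enumerate(column_projection(binary)):
        if count and start is None:
            start = x  # a character begins at the first inked column
        elif not count and start is not None:
            spans.append((start, x))  # an empty column closes the character
            start = None
    if start is not None:
        spans.append((start, len(binary[0])))
    return spans
```

Applying the same functions to the transposed image (`list(zip(*binary))`) gives the row projection, which separates text lines in the same way.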
Step 5) The feature extraction module shown in Fig. 6 extracts character features from the segmented text and performs classification and recognition.
Step 5.1) Directional line element features are extracted from each segmented character. Using a 3 × 3 window, all black pixels in the character's pixel matrix are scanned one by one, from top to bottom and from left to right;
If the scanned pixel is black, that is, part of the character pattern, the number of white pixels in the window is counted;
If the number of white pixels is greater than or equal to 2 and less than 8, the current pixel is a contour point of the text;
Otherwise it is judged to be a noise point, and its pixel value is set to 0.
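The contour test of step 5.1 can be sketched as below. The encoding is an assumption: 1 stands for a black (character) pixel and 0 for a white pixel, and positions outside the image are counted as white.

```python
def contour_points(binary):
    """Keep ink pixels (1) whose 3 x 3 neighbourhood holds 2 to 7 white pixels (0)."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):  # top to bottom
        for x in range(w):  # left to right
            if binary[y][x] != 1:
                continue
            white = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    # Assumption: pixels outside the image count as white.
                    if not (0 <= ny < h and 0 <= nx < w) or binary[ny][nx] == 0:
                        white += 1
            if 2 <= white < 8:
                out[y][x] = 1  # contour point
            # otherwise the pixel is dropped (left at 0)
    return out
```

With this rule an isolated pixel (eight white neighbours) is discarded as noise, and a pixel deep inside a stroke (fewer than two white neighbours) is also cleared, so only the outline of each stroke survives.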
Step 5.2) Combining methods based on structural features with methods based on statistical features, the character's feature vector is extracted along eight directions as an eight-dimensional direction vector, from which the directional element features of the black pixels are computed.
Step 5.3) A template matching classifier is selected and performs classification and recognition according to the eight-dimensional direction vectors of the template feature vectors, distinguishing the different characters.
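A simplified version of steps 5.2 and 5.3 is sketched below. The feature definition is an assumption made for illustration: for each of the eight neighbour directions it records the fraction of ink pixels that have an ink neighbour in that direction. The nearest-template rule with Euclidean distance is likewise one plausible reading of "template matching classifier", and the names `direction_vector` and `match_template` are hypothetical.

```python
import math

# Eight neighbour offsets (dy, dx): E, NE, N, NW, W, SW, S, SE.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def direction_vector(binary):
    """Eight-dimensional feature: share of ink pixels with an ink neighbour per direction."""
    h, w = len(binary), len(binary[0])
    counts = [0] * 8
    ink = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            ink += 1
            for k, (dy, dx) in enumerate(DIRECTIONS):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1:
                    counts[k] += 1
    return [c / ink for c in counts] if ink else [0.0] * 8


def match_template(binary, templates):
    """Return the label of the template whose feature vector is nearest (Euclidean)."""
    feature = direction_vector(binary)
    return min(templates, key=lambda label: math.dist(feature, templates[label]))
```

For example, with templates built from a vertical and a horizontal bar, a taller vertical bar is still matched to the vertical template because its votes fall in the same two directions.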
The image processing module shown in Fig. 4 in step 3), the image cutting module shown in Fig. 5 in step 4), and the feature extraction module shown in Fig. 6 in step 5) are the core modules of the OCR engine, whose main job is to convert the text in an image into editable text.
Step 6) The text matching algorithm in the text matching module shown in Fig. 1 performs keyword matching on the recognized characters. The text matching module searches the text for keywords: if a keyword is found, the information obtained is displayed on the terminal interface; otherwise the system exits directly.
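The keyword check of step 6 can be sketched with plain substring search. The text does not name a matching algorithm, so this choice, along with the function names and the report format, is an assumption.

```python
def match_keywords(text, keywords):
    """Return {keyword: [character offsets]} for every keyword found in text."""
    hits = {}
    for kw in keywords:
        if not kw:
            continue  # ignore empty keywords
        positions = []
        start = text.find(kw)
        while start != -1:
            positions.append(start)
            start = text.find(kw, start + 1)
        if positions:
            hits[kw] = positions
    return hits


def report_matches(text, keywords):
    """Return a report string when a keyword is found, or None when nothing matches."""
    hits = match_keywords(text, keywords)
    if not hits:
        return None  # nothing to display; the caller can simply exit
    return "; ".join(f"{kw} at {pos}" for kw, pos in sorted(hits.items()))
```

For large keyword lists a multi-pattern algorithm such as Aho-Corasick would scan the text once instead of once per keyword.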
The invention covers three aspects. First, an architecture combining B/S and C/S is built to handle communication between the terminals and the server and to provide the functions the server needs. Second, confidentiality inspection is developed and implemented for various operating systems; it supports the inspection of image files and processes the text information they contain. Third, the server carries out the main management work.
1, User-entered keyword inspection
In C/S mode the server is connected to all clients and the database is placed on the server side; the C/S structure requires a network environment. When a client makes a connection request, the server responds to it and operates the database to save the relevant data, after which the client carries out the confidentiality inspection. Meanwhile, the server records information about each connected client (including the connection time and IP address) and stores it in the database.
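Recording each client's connection time and IP address could be done as in the sketch below. The text does not name a database, so SQLite, the table name `connections`, and its columns are assumptions made for the example.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path=":memory:"):
    """Open (or create) the connection log; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS connections ("
        "id INTEGER PRIMARY KEY, ip TEXT NOT NULL, connected_at TEXT NOT NULL)"
    )
    return conn


def record_connection(conn, ip, when=None):
    """Store one client's IP address and connection time (UTC, ISO 8601)."""
    when = when or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO connections (ip, connected_at) VALUES (?, ?)",
        (ip, when.isoformat()),
    )
    conn.commit()
```

Using parameter placeholders rather than string formatting keeps client-supplied values from being interpreted as SQL.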
2, Rolling inspection program
The terminal's rolling inspection program supports the routine inspection shown in Fig. 7 (mainly covering terminal security and confidentiality settings, USB device usage records, Internet access records, communication devices, user information, and so on). Beyond the routine inspection results shown in Fig. 7, it also provides confidentiality inspection of the contents of all types of files (classified information inspection), such as document files (Office, PDF, txt, etc.), web page files, compressed archives, mail files, and picture files. It even supports the deep recovery and inspection of file content shown in Fig. 8. Finally, the server displays the inspection results on the terminal.
3, Management of the server
The core of the invention is the inspection function of the terminal inspection program and the management function of the server. Image files are preprocessed, and OCR is then called to parse the processed image file into editable text information. The text is located and then split into characters by character segmentation, features are extracted from the segmented characters, and the feature values are classified. Finally the classified characters are matched against the keywords, and the matching information is displayed on the terminal.
The above embodiments merely illustrate the technical idea of the invention and do not limit its scope of protection. Any change made on the basis of the technical solution in accordance with the technical idea provided by the invention falls within the scope of protection of the invention.
Claims (8)
1. A terminal confidentiality inspection method based on OCR technology, characterized by comprising the following steps:
Step 1, a terminal in the confidentiality inspection network sends an inspection request to the server, and the server connects to the terminal after responding to the request;
Step 2, the server records information about the terminal, including the connection time and IP address, and stores it in a database; the server uses the file analysis module to determine the type of the file to be inspected;
Step 3, files that the file analysis module identifies as images are processed by the image processing module, specifically:
If the image contains Gaussian noise, weighted denoising is applied using a method that fuses mean filtering with median filtering, yielding the denoised image;
If the image has a complex background, the character image is binarized to separate the foreground region from the background region;
If the image is tilted, the image is compressed and part of it is selected for a Hough transform, yielding the corrected image;
Step 4, the image cutting module segments the text in the processed image;
Step 5, the feature extraction module extracts character features from the segmented text and performs classification and recognition;
Step 6, the text matching module performs keyword matching on the recognized characters; if a keyword is found, the information obtained is displayed on the terminal interface; otherwise the system exits directly.
2. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 2, if the file is a compressed file, it is decompressed and the type of each file inside it is determined again; if the file is not an image file, the content of the document database is parsed; if the file is an image file, it is processed by the image processing module.
3. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 3, if the image contains Gaussian noise, the detailed process of applying weighted denoising with the method that fuses mean filtering and median filtering, and obtaining the denoised image, is:
Step 3.1.1, set a 3 × 3 window and obtain the pixel value at each position of the window in the image containing Gaussian noise;
Step 3.1.2, compute the mean and the median of the 3 × 3 window;
Step 3.1.3, assign different weights to the mean and the median obtained, compute their weighted sum, and set the result as the pixel value at the window's center position;
Step 3.1.4, repeat the above steps to denoise the whole image.
4. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 3, if the image has a complex background, the detailed process of binarizing the character image to separate the foreground region from the background region is:
Step 3.2.1, compute the threshold T of the whole image using a global threshold method, and find the cluster centre T1;
Step 3.2.2, using the thresholds computed in step 3.2.1, define the region to be re-thresholded; with c and d as variables, each pixel f(x, y) of the image is tested as follows:
(1-c)T ≤ f(x,y) ≤ (1+c)T
(1-d)T1 ≤ f(x,y) ≤ (1+d)T1
where c and d are preset parameters; if the inequalities above are satisfied, go to step 3.2.3; otherwise the pixel is binarized according to the global threshold method;
Step 3.2.3, when the above inequalities are satisfied, local thresholding is carried out using the improved Bernsen algorithm;
Step 3.2.4, repeat the above steps to binarize the whole image.
5. The terminal confidentiality inspection method based on OCR technology according to claim 4, characterized in that: in step 3.2.4, the image is first binarized with the global threshold method, and the result is then binarized further with the improved Bernsen algorithm, which removes the complex background and yields a better foreground image.
6. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 3, if the image is tilted, the detailed process of compressing the image, selecting part of it for a Hough transform, and obtaining the corrected image is:
Step 3.3.1, compress the image, divide it into blocks, and select part of the character image as the detection target;
Step 3.3.2, construct a discrete parameter space in the ρ-θ plane and establish the accumulator matrix A(ρ, θ), initializing every element to 0;
Step 3.3.3, apply the Hough transform to every non-zero point of the binary image; for each θ and its corresponding ρ, record the result in the accumulator matrix;
Step 3.3.4, finally, find the maximum value of the accumulator matrix over ρ; the θ value at that maximum is the inclination angle of the detected lines;
After the tilt angle of the character image is detected, the image is corrected by rotation; the transformation formula for rotation correction is as follows:
[formula image not available]
Processing the tilted image in this way improves the speed of image processing and yields the corrected image.
7. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 4, the image cutting module comprises a text localization module and a character segmentation module, and the detailed process of step 4 is:
Step 4.1, the text localization module extracts the edge information of the character image with the Sobel operator, then applies character edge detection and morphological processing to the extracted edges to locate the text, thereby determining the character regions in the text, which are marked with rectangular boxes;
Step 4.2, the character segmentation module takes the character regions marked with rectangular boxes and uses a projection histogram to count the number of target points in each row or each column of the image space; the counts are laid out in image row and column order and are used to split the characters;
Step 4.3, using the probability density distribution of each segmented character in the horizontal and vertical directions, the character is scaled proportionally and linearly normalized, giving the final segmented text.
8. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that the detailed process of step 5 is:
Step 5.1, directional line element features are extracted from each segmented character; using a 3 × 3 window, all black pixels in the character's pixel matrix are scanned one by one, from top to bottom and from left to right;
If the scanned pixel is black, that is, part of the character pattern, the number of white pixels in the window is counted;
If the number of white pixels is greater than or equal to 2 and less than 8, the current pixel is a contour point of the text;
Otherwise it is judged to be a noise point, and its pixel value is set to 0;
Step 5.2, combining methods based on structural features with methods based on statistical features, the character's feature vector is extracted along eight directions as an eight-dimensional direction vector, from which the directional element features of the black pixels are computed;
Step 5.3, a template matching classifier is selected and performs classification and recognition according to the eight-dimensional direction vectors of the template feature vectors, distinguishing the different characters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810865946.2A CN109284756A (en) | 2018-08-01 | 2018-08-01 | A kind of terminal censorship method based on OCR technique |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810865946.2A CN109284756A (en) | 2018-08-01 | 2018-08-01 | A kind of terminal censorship method based on OCR technique |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109284756A true CN109284756A (en) | 2019-01-29 |
Family
ID=65183339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810865946.2A Pending CN109284756A (en) | 2018-08-01 | 2018-08-01 | A kind of terminal censorship method based on OCR technique |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109284756A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110390260A (en) * | 2019-06-12 | 2019-10-29 | 平安科技(深圳)有限公司 | Picture scanning part processing method, device, computer equipment and storage medium |
CN111046874A (en) * | 2019-12-12 | 2020-04-21 | 北京小白世纪网络科技有限公司 | Single number identification method based on template matching |
CN111461205A (en) * | 2020-03-30 | 2020-07-28 | 拉扎斯网络科技(上海)有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN111767769A (en) * | 2019-08-14 | 2020-10-13 | 北京京东尚科信息技术有限公司 | Text extraction method and device, electronic equipment and storage medium |
CN112115735A (en) * | 2019-06-19 | 2020-12-22 | 国网江苏省电力有限公司常州供电分公司 | Identification management method for confidential files |
CN112200735A (en) * | 2020-09-18 | 2021-01-08 | 安徽理工大学 | Temperature identification method based on flame image and control method of low-concentration gas combustion system |
CN113191348A (en) * | 2021-05-31 | 2021-07-30 | 山东新一代信息产业技术研究院有限公司 | Template-based text structured extraction method and tool |
CN115328463A (en) * | 2022-08-01 | 2022-11-11 | 无锡雪浪数制科技有限公司 | Design system based on visual business arrangement |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101697228A (en) * | 2009-10-15 | 2010-04-21 | 东莞市步步高教育电子产品有限公司 | Method for processing text images |
CN101859382A (en) * | 2010-06-03 | 2010-10-13 | 复旦大学 | License plate detection and identification method based on maximum stable extremal region |
CN106411650A (en) * | 2016-10-19 | 2017-02-15 | 北京交通大学 | Distributed security and confidentiality checking method |
- 2018-08-01: CN application CN201810865946.2A, published as CN109284756A; status: Pending
Non-Patent Citations (4)
Title |
---|
孔斌等: "保密检查中图像文件内容识别技术研究", 《保密科学技术》 * |
孙李娜等: "视频图像中文本的检测、定位与提取", 《电子科技》 * |
张克: "《温度测控技术及应用》", 30 November 2011 * |
陈育宁: "《西夏文字数学化方法及其应用》", 30 June 2002 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110390260A (en) * | 2019-06-12 | 2019-10-29 | 平安科技(深圳)有限公司 | Picture scanning part processing method, device, computer equipment and storage medium |
WO2020248497A1 (en) * | 2019-06-12 | 2020-12-17 | 平安科技(深圳)有限公司 | Picture scanning document processing method and apparatus, computer device, and storage medium |
CN110390260B (en) * | 2019-06-12 | 2024-03-22 | 平安科技(深圳)有限公司 | Picture scanning piece processing method and device, computer equipment and storage medium |
CN112115735A (en) * | 2019-06-19 | 2020-12-22 | 国网江苏省电力有限公司常州供电分公司 | Identification management method for confidential files |
CN111767769A (en) * | 2019-08-14 | 2020-10-13 | 北京京东尚科信息技术有限公司 | Text extraction method and device, electronic equipment and storage medium |
CN111046874A (en) * | 2019-12-12 | 2020-04-21 | 北京小白世纪网络科技有限公司 | Single number identification method based on template matching |
CN111461205A (en) * | 2020-03-30 | 2020-07-28 | 拉扎斯网络科技(上海)有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN112200735A (en) * | 2020-09-18 | 2021-01-08 | 安徽理工大学 | Temperature identification method based on flame image and control method of low-concentration gas combustion system |
CN113191348A (en) * | 2021-05-31 | 2021-07-30 | 山东新一代信息产业技术研究院有限公司 | Template-based text structured extraction method and tool |
CN113191348B (en) * | 2021-05-31 | 2023-02-03 | 山东新一代信息产业技术研究院有限公司 | Template-based text structured extraction method and tool |
CN115328463A (en) * | 2022-08-01 | 2022-11-11 | 无锡雪浪数制科技有限公司 | Design system based on visual business arrangement |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109284756A (en) | A kind of terminal censorship method based on OCR technique | |
JP3359095B2 (en) | Image processing method and apparatus | |
JP4477468B2 (en) | Device part image retrieval device for assembly drawings | |
CN108805076A (en) | The extracting method and system of environmental impact assessment report table word | |
JP2001167131A (en) | Automatic classifying method for document using document signature | |
CN112861865B (en) | Auxiliary auditing method based on OCR technology | |
CN116403094B (en) | Embedded image recognition method and system | |
US11704925B2 (en) | Systems and methods for digitized document image data spillage recovery | |
CN114444566B (en) | Image forgery detection method and device and computer storage medium | |
JP4391704B2 (en) | Image processing apparatus and method for generating binary image from multi-valued image | |
CN111274762A (en) | Computer expression method based on diversified fonts in Tibetan classical documents | |
CN114842478A (en) | Text area identification method, device, equipment and storage medium | |
Dixit et al. | Automatic logo detection from document image using HOG features | |
Zhan et al. | A robust split-and-merge text segmentation approach for images | |
CN104899551B (en) | A kind of form image sorting technique | |
Kia et al. | Integrated segmentation and clustering for enhanced compression of document images | |
Padma et al. | Identification of Telugu, Devanagari and English Scripts Using Discriminating | |
CN116311297A (en) | Electronic evidence image recognition and analysis method based on computer vision | |
CN114758340A (en) | Intelligent identification method, device and equipment for logistics address and storage medium | |
CN113591657A (en) | OCR (optical character recognition) layout recognition method and device, electronic equipment and medium | |
CN118379753B (en) | Method and system for extracting bad asset contract key information by utilizing OCR technology | |
CN106469267A (en) | A kind of identifying code sample collection method and system | |
Sun | Pornographic image screening by integrating recognition module and image black-list/white-list subsystem | |
Malkawi et al. | Auto Signature Verification Using Line Projection Features Combined with Different Classifiers and Selection Methods | |
Hukkeri et al. | Machine Learning in OCR Technology: Performance Analysis of Different OCR Methods for Slide-to-Text Conversion in Lecture Videos |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190129 |