CN109284756A - A kind of terminal censorship method based on OCR technique - Google Patents

A terminal confidentiality inspection method based on OCR technology

Info

Publication number
CN109284756A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810865946.2A
Other languages
Chinese (zh)
Inventor
李昌利
贾乾
刘翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU filed Critical Hohai University HHU
Priority to CN201810865946.2A priority Critical patent/CN109284756A/en
Publication of CN109284756A publication Critical patent/CN109284756A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Character Input (AREA)

Abstract

The present invention discloses a terminal confidentiality inspection method based on OCR technology. The steps are as follows: a terminal sends a confidentiality inspection request to a server; the server records the terminal's information, including the connection time and IP address, and uses a file parsing module to determine the type of the file to be inspected; files that the file parsing module identifies as images are processed by an image processing module; an image segmentation module segments the text in the processed image; a feature extraction module extracts character features from the segmented text and performs classification and recognition; a text matching module performs keyword matching on the recognized characters, and if a keyword is found, the information obtained is displayed on the terminal interface, otherwise the system exits directly. The method can further improve the performance of the OCR system, reduce the complexity of the confidentiality inspection system, and achieve the goal of terminal confidentiality inspection.

Description

A terminal confidentiality inspection method based on OCR technology
Technical Field
The invention belongs to the field of computer and information security, and in particular relates to a terminal confidentiality inspection method based on OCR technology.
Background
In an era of rapid development of Internet technology, people depend on the Internet more and more deeply. The volume of information has grown explosively and massive amounts of data are produced, while network leaks and network theft of secrets also occur from time to time. Today most leaks of state or party secrets are closely related to the Internet, computers, and storage media; leakage through these channels has become the main route of current leaks and is rising year by year. In traditional confidentiality inspection, many departments mostly target files that a computer can process directly, such as Office files and txt files, and can do nothing with image files. This cannot meet current inspection needs; it is inefficient and wastes considerable resources, so the approach is unsatisfactory.
It is therefore of value to design a confidentiality inspection method that can also inspect image files and is easy to operate, and the present invention arises from this.
Summary of the Invention
The purpose of the present invention is to provide a terminal confidentiality inspection method based on OCR technology that can further improve the performance of the OCR system, reduce the complexity of the confidentiality inspection system, and achieve the goal of terminal confidentiality inspection.
To achieve the above purpose, the solution of the invention is:
A terminal confidentiality inspection method based on OCR technology, comprising the following steps:
Step 1, a terminal in the confidentiality inspection network sends a confidentiality inspection request to the server, and the server connects to the terminal after responding to the inspection request;
Step 2, the server records the terminal's information, including the connection time and IP address, and stores it in a database; the server uses the file parsing module to determine the type of the file to be inspected;
Step 3, files that the file parsing module identifies as images are processed by the image processing module, specifically:
If the image contains Gaussian noise, weighted denoising is performed by fusing mean filtering with median filtering, yielding the denoised image;
If the image has a complex background, the character image is binarized to separate the foreground region from the background region;
If the image is skewed, the image is compressed and part of it is selected for the Hough transform, yielding the corrected image;
Step 4, the image segmentation module segments the text in the processed image;
Step 5, the feature extraction module extracts character features from the segmented text and performs classification and recognition;
Step 6, the text matching module performs keyword matching on the recognized characters; if a keyword is found, the information obtained is displayed on the terminal interface, otherwise the system exits directly.
With the above scheme, the invention combines B/S and C/S architectures to carry out security and confidentiality inspection of networked terminals. It overcomes the drawback that many departments, when conducting confidentiality inspections, mostly target files that a computer can process directly, such as Office files and txt files, and can do nothing with image files. On the one hand, it widens the range of inspected objects, supporting many file types and inspection of their content, including picture files, Office files, web page files, archive files, mail files and other common files, as well as OCR inspection of pictures. On the other hand, it improves the efficiency of inspection work and eliminates interfering factors in character recognition. The invention meets current confidentiality inspection needs: it improves efficiency overall and saves considerable resources, so the method is worth adopting.
The invention not only overcomes the drawback that most inspection tools can recognize only text and cannot recognize image files, but also simplifies the information administrator's operation of the confidentiality inspection system.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the overall logical architecture of the invention;
Fig. 2 is a schematic diagram of the overall flow of the invention;
Fig. 3 is a schematic diagram of the file parsing module of the invention;
Fig. 4 is a schematic diagram of the image processing module of the invention;
Fig. 5 is a schematic diagram of the image segmentation module of the invention;
Fig. 6 is a schematic diagram of the feature extraction module of the invention;
Fig. 7 shows the routine inspection results of the invention;
Fig. 8 shows the deep inspection results of the invention.
Detailed Description
The technical solution and beneficial effects of the invention are described in detail below with reference to the drawings.
As shown in Fig. 1 and Fig. 2, the invention provides a terminal confidentiality inspection method based on OCR technology, comprising the following steps:
Step 1) A terminal in the confidentiality inspection network sends a confidentiality inspection request to the server, and the server connects to the terminal after responding to the inspection request. The confidentiality inspection includes the routine inspection of file content shown in Fig. 7, which covers file names, file content, mail content, and picture files. It also supports the deep inspection of file content shown in Fig. 8, in which deleted files and operation records are recovered and inspected.
Step 2) The server records the terminal's information, including the connection time and IP address, and stores it in a database; the server uses the file parsing module shown in Fig. 3 to determine the type of the file to be inspected.
If the file is a compressed archive, it is decompressed and the file type is determined again for each file it contains; if the file is not an image, its document content is parsed; if the file is an image, it is processed by the image processing module.
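The type dispatch in step 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `route_file`, the extension list `IMAGE_EXTS`, the route labels, and the restriction to `.zip` archives are all hypothetical, since the patent does not say how file types are detected or which archive formats are supported.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical extension list; the patent does not enumerate image formats.
IMAGE_EXTS = {".bmp", ".gif", ".jpeg", ".jpg", ".png", ".tif", ".tiff"}


def route_file(name: str, data: bytes) -> list[tuple[str, str]]:
    """Return (file name, route) pairs, expanding .zip archives recursively."""
    suffix = PurePosixPath(name).suffix.lower()
    if suffix == ".zip" and zipfile.is_zipfile(io.BytesIO(data)):
        routes = []
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if not info.is_dir():
                    # Judge the type of every archive member again.
                    routes.extend(route_file(info.filename, archive.read(info)))
        return routes
    # Images go to the image processing module, everything else to the parser.
    return [(name, "image" if suffix in IMAGE_EXTS else "document")]
```

Checking the extension before calling `zipfile.is_zipfile` is a deliberate choice in this sketch: Office formats such as .docx are themselves ZIP containers and would otherwise be unpacked instead of parsed as documents.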
Step 3) Files that the file parsing module shown in Fig. 3 identifies as images are processed by the image processing module shown in Fig. 4.
Step 3.1) If the image contains Gaussian noise, the noise is removed by fusing mean filtering with median filtering: the mean-filter and median-filter outputs are assigned different weights and combined by weighted calculation to obtain the denoised image;
Step 3.1.1) Set a 3 × 3 window and obtain the pixel value at each position of the window in the image containing Gaussian noise.
Step 3.1.2) Compute the mean and the median of the 3 × 3 window.
Step 3.1.3) Assign different weights to the mean and the median obtained above, compute their weighted sum, and set the result as the pixel value at the window centre.
Step 3.1.4) Repeat the above steps to denoise the whole image.
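Steps 3.1.1) to 3.1.4) can be sketched with NumPy as below. The patent does not give the weights, so the parameter `w_mean` (with the median weighted by `1 - w_mean`), the function name, and the edge-replication padding at the image border are assumptions made for illustration.

```python
import numpy as np


def fused_mean_median_filter(image: np.ndarray, w_mean: float = 0.5) -> np.ndarray:
    """Blend 3x3 mean and 3x3 median filter outputs for an 8-bit gray image."""
    img = image.astype(np.float64)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")  # assumption: replicate border pixels
    # Nine shifted views; axis 0 enumerates the positions of the 3x3 window.
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    mean = windows.mean(axis=0)
    median = np.median(windows, axis=0)
    fused = w_mean * mean + (1.0 - w_mean) * median
    return np.clip(np.rint(fused), 0, 255).astype(np.uint8)
```

A larger `w_mean` smooths Gaussian noise more strongly, while a smaller one leans on the median and preserves edges better; choosing the balance is left to the implementer.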
Step 3.2) If the image has a complex background, the character image is binarized to separate the foreground region from the background region, which reduces the computational complexity of the system.
Step 3.2.1) Compute the threshold T of the whole image using the global threshold method, and find the cluster centre T1.
Step 3.2.2) Using the thresholds computed in step 3.2.1), define the regions in which the threshold is to be recomputed. With c and d as parameters, every pixel value f(x, y) of the image is tested as follows:
(1 - c)T ≤ f(x, y) ≤ (1 + c)T
(1 - d)T1 ≤ f(x, y) ≤ (1 + d)T1
where c and d are preset parameters. If the above inequalities are satisfied, proceed to step 3.2.3); otherwise the pixel is binarized according to the global threshold method.
Step 3.2.3) When the above inequalities are satisfied, a local threshold is computed using the improved Bernsen algorithm.
Step 3.2.4) Repeat the above steps to binarize the whole image.
The image is first binarized by the global threshold method, and the result is then further binarized with the improved Bernsen algorithm, so that the complex background is removed and a better foreground image is obtained;
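The hybrid thresholding of step 3.2 might look roughly like the sketch below. Several points are assumptions rather than the patent's method: the global threshold is computed with an iterative class-mean scheme, the classic Bernsen midpoint of local minimum and maximum stands in for the unspecified "improved" Bernsen algorithm, the cluster centre T1 is passed in by the caller as `t1`, a pixel is treated as ambiguous only when both inequalities hold, and output value 1 marks pixels brighter than the threshold.

```python
import numpy as np


def global_threshold(gray: np.ndarray, eps: float = 0.5) -> float:
    """Iterative global threshold: midpoint of the two class means."""
    t = float(gray.mean())
    for _ in range(100):  # bounded loop as a safeguard
        low, high = gray[gray <= t], gray[gray > t]
        if low.size == 0 or high.size == 0:
            break
        new_t = 0.5 * (float(low.mean()) + float(high.mean()))
        if abs(new_t - t) < eps:
            return new_t
        t = new_t
    return t


def bernsen_threshold(gray: np.ndarray, radius: int = 1) -> np.ndarray:
    """Classic Bernsen local threshold: midpoint of local min and max."""
    g = gray.astype(np.float64)
    h, w = g.shape
    k = 2 * radius + 1
    padded = np.pad(g, radius, mode="edge")
    win = np.stack([padded[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)])
    return 0.5 * (win.min(axis=0) + win.max(axis=0))


def hybrid_binarize(gray: np.ndarray, t1: float, c: float = 0.1, d: float = 0.1) -> np.ndarray:
    """Use the global threshold T, except for pixels inside both bands."""
    g = gray.astype(np.float64)
    t = global_threshold(g)
    near_t = ((1 - c) * t <= g) & (g <= (1 + c) * t)
    near_t1 = ((1 - d) * t1 <= g) & (g <= (1 + d) * t1)
    threshold = np.where(near_t & near_t1, bernsen_threshold(g), t)
    return (g > threshold).astype(np.uint8)  # 1 = brighter than threshold
```

Only the ambiguous pixels pay for a local computation in principle; this sketch evaluates the Bernsen map everywhere for brevity, which a real implementation would likely avoid.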
Step 3.3) If the image is skewed, the image is compressed and part of it is selected for the Hough transform. The Hough transform converts the problem of detecting a straight line into the problem of counting how many curves intersect at the same point in parameter space. To do so, ρ and θ are discretized into N intervals, dividing the parameter space into many cells, on which a parameter-space accumulator is established.
Step 3.3.1) Compress the image and divide it into blocks, and select part of the character image as the detection target.
Step 3.3.2) Construct a discrete parameter space on the ρ-θ plane and establish an accumulator matrix A(ρ, θ), with every element initialized to 0.
Step 3.3.3) Apply the Hough transform to each non-zero pixel of the binary image; whenever a value of θ yields a corresponding ρ, record the result in the accumulator matrix.
Step 3.3.4) Finally, find the maximum over ρ in the accumulator matrix; the θ at which it occurs is the inclination angle of the series of straight lines.
After the skew angle of the character image is detected, the image is corrected by rotation. The rotation-correction formula is as follows: [formula not reproduced in the source text]
Correcting the skewed image in this way improves the speed of image processing and yields the corrected image;
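The accumulator voting of steps 3.3.2) to 3.3.4) can be sketched with NumPy as follows. The function name, the 1° angular step, and the conversion from the normal angle θ to a skew angle (θ minus 90°, so a horizontal text line gives 0°) are illustrative assumptions; the compression and block selection of step 3.3.1) and the rotation itself are left out.

```python
import numpy as np


def estimate_skew_angle(binary: np.ndarray, angle_step: float = 1.0) -> float:
    """Estimate the dominant line angle in degrees from a binary image."""
    ys, xs = np.nonzero(binary)
    if xs.size == 0:
        return 0.0  # nothing to vote with
    thetas = np.deg2rad(np.arange(0.0, 180.0, angle_step))
    diag = int(np.ceil(np.hypot(*binary.shape)))
    # Accumulator A(rho, theta); rho is shifted by diag so indices are >= 0.
    acc = np.zeros((2 * diag + 1, thetas.size), dtype=np.int64)
    for j, theta in enumerate(thetas):
        rho = np.rint(xs * np.cos(theta) + ys * np.sin(theta)).astype(int) + diag
        acc[:, j] = np.bincount(rho, minlength=2 * diag + 1)
    _, theta_idx = np.unravel_index(np.argmax(acc), acc.shape)
    # theta is the angle of the line normal; the line itself is 90 degrees off.
    return float(np.rad2deg(thetas[theta_idx]) - 90.0)
```

The returned angle uses image coordinates (y grows downward), so the sign convention should be checked against whichever rotation routine is used for the correction.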
Step 4) The image segmentation module shown in Fig. 5 segments the text in the processed image. The module comprises a text localization module and a character segmentation module.
Step 4.1) The text localization module shown in Fig. 5 extracts the edge information of the character image with the Sobel operator, and localizes text from the extracted edges by character edge detection and morphological processing, thereby determining the character regions in the text; the character regions are marked with rectangular boxes.
Step 4.2) The character segmentation module shown in Fig. 5 applies projection histograms to the character regions marked with rectangular boxes, counting the number of target pixels in each row or each column of the image; the counts are arranged in row or column order, and the characters are segmented accordingly;
Step 4.3) Using the probability density distribution of each segmented character in the horizontal and vertical directions, the character is scaled proportionally and linearly normalized, giving the final segmented text;
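The projection-histogram segmentation of step 4.2) can be illustrated with the NumPy sketch below. It assumes a binary region in which non-zero pixels are text and characters are separated by fully empty rows or columns; the function names and the (top, bottom, left, right) box convention with exclusive ends are hypothetical.

```python
import numpy as np


def runs(profile: np.ndarray) -> list[tuple[int, int]]:
    """Return (start, end) pairs of consecutive non-zero entries, end exclusive."""
    mask = np.concatenate(([False], profile > 0, [False]))
    edges = np.flatnonzero(mask[1:] != mask[:-1])
    return list(zip(edges[::2].tolist(), edges[1::2].tolist()))


def segment_characters(binary: np.ndarray) -> list[tuple[int, int, int, int]]:
    """Split a text region into character boxes (top, bottom, left, right)."""
    boxes = []
    # Row projection separates text lines.
    for top, bottom in runs(binary.sum(axis=1)):
        line = binary[top:bottom]
        # Column projection inside each line separates characters.
        for left, right in runs(line.sum(axis=0)):
            boxes.append((top, bottom, left, right))
    return boxes
```

Touching or overlapping characters produce a single run in the column projection, so a practical system would need an extra splitting rule that the patent does not describe.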
Step 5) The feature extraction module shown in Fig. 6 extracts character features from the segmented text and performs classification and recognition.
Step 5.1) Directional line element features are extracted from the segmented characters. For each segmented character, all black pixels in the character pixel matrix are scanned one by one with a 3 × 3 window, from top to bottom and from left to right;
If the scanned pixel is a black pixel, i.e. part of the character pattern, the number of white pixels in the window is counted;
If the number of white pixels is greater than or equal to 2 and less than 8, the current pixel is a contour point of the text;
Otherwise it is judged to be a noise point and its pixel value is set to 0.
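The contour test of step 5.1) can be sketched as below, reading the rule literally. The assumptions are: the input is a mask in which non-zero means black (character) pixels, pixels outside the image count as white, and every black pixel that fails the test is cleared, which removes isolated dots and also interior pixels so that only the outline remains.

```python
import numpy as np


def contour_points(char: np.ndarray) -> np.ndarray:
    """Keep black pixels whose 3x3 window holds 2 to 7 white pixels."""
    ink = char.astype(bool)
    h, w = ink.shape
    padded = np.pad(ink, 1, mode="constant", constant_values=False)
    # Number of black pixels in each 3x3 window, centre included.
    ink_in_window = sum(
        padded[dy:dy + h, dx:dx + w].astype(int) for dy in range(3) for dx in range(3)
    )
    white = 9 - ink_in_window
    return ink & (white >= 2) & (white < 8)
```

Vectorizing over shifted views replaces the explicit top-to-bottom, left-to-right scan, but the per-pixel decision is the same.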
Step 5.2) Combining structure-based and statistics-based methods, the feature vector of the character is extracted in eight directions, giving an eight-dimensional direction vector, from which the directional element features of the black pixels are computed.
Step 5.3) A template matching classifier is selected, and classification and recognition are performed according to the eight-dimensional direction vectors of the template feature vectors, so as to distinguish different characters.
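A template matching classifier in the sense of step 5.3) can be as simple as a nearest-template search. The sketch below assumes that each character class has one stored feature vector and that Euclidean distance is the similarity measure; neither the distance, the function name, nor the dictionary layout comes from the patent.

```python
import numpy as np


def classify_by_template(feature, templates: dict) -> str:
    """Return the label of the template closest to the feature vector."""
    labels = list(templates)
    stacked = np.stack([np.asarray(templates[label], dtype=float) for label in labels])
    distances = np.linalg.norm(stacked - np.asarray(feature, dtype=float), axis=1)
    return labels[int(np.argmin(distances))]
```

For example, with hypothetical eight-dimensional templates for two characters, a feature vector close to the second template is assigned the second label.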
The image processing module shown in Fig. 4 in step 3), the image segmentation module shown in Fig. 5 in step 4), and the feature extraction module shown in Fig. 6 in step 5) are the core modules of the OCR engine, whose main role is to convert the text in an image into editable text.
Step 6) The recognized characters are matched against keywords by the text matching algorithm in the text matching module shown in Fig. 1. The module searches the text for keywords; if a keyword is found, the information obtained is displayed on the terminal interface, otherwise the system exits directly.
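The patent does not name the text matching algorithm, so the keyword search of step 6 is sketched here with plain substring search as a stand-in. The function name and the returned structure (keyword mapped to start offsets) are hypothetical.

```python
def find_keywords(text: str, keywords: list[str]) -> dict[str, list[int]]:
    """Map each keyword found in the text to the offsets where it starts."""
    hits = {}
    for keyword in keywords:
        if not keyword:
            continue  # ignore empty keywords
        positions = []
        start = text.find(keyword)
        while start != -1:
            positions.append(start)
            start = text.find(keyword, start + 1)
        if positions:
            hits[keyword] = positions
    return hits
```

An empty result corresponds to the branch in which the system exits directly; a non-empty one carries the information that would be shown on the terminal interface.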
The invention covers three aspects: first, building an architecture that combines B/S and C/S to implement communication between the terminal and the server and the functions required on the server side; second, developing and implementing security and confidentiality inspection for various operating systems, supporting inspection of image files and processing of the text information in them; third, having the server carry the main management work.
1. User-entered keyword inspection
Under the C/S model the server is connected to all clients and the database is placed on the server side; the C/S structure must be based on a network environment. When a client issues a connection request, the server responds to it and operates the database to save the relevant data, after which the client carries out the confidentiality inspection. Meanwhile, the server records the information of each connected client (including connection time, IP address, etc.) and stores it in the database.
2. Terminal inspection program
The terminal inspection program supports the routine inspection of terminal file content shown in Fig. 7 (mainly covering terminal security and confidentiality measures, USB device usage records, Internet access records, communication devices, user information, etc.). In addition to the routine inspection results shown in Fig. 7, it also provides security and confidentiality inspection of the content of all types of files (classified information inspection), such as various document files (Office, PDF, txt, etc.), web page files, archive files, mail files, and picture files, and even supports the deep recovery and inspection of file content shown in Fig. 8. Finally, the server displays the inspection results on the terminal.
3. Server management
The core of the invention is the inspection function of the terminal inspection program and the management function of the server. Image files are preprocessed, and OCR is called to parse the processed image files into editable text information. Character localization and character segmentation are applied to split the text into characters, features are extracted from the segmented characters, the feature values are classified, and the classified characters are finally matched against the keywords, with the matching information displayed on the terminal.
The above embodiments merely illustrate the technical idea of the invention and do not limit its scope of protection; any change made on the basis of the technical solution in accordance with the technical idea proposed by the invention falls within the scope of protection of the invention.

Claims (8)

1. A terminal confidentiality inspection method based on OCR technology, characterized by comprising the following steps:
Step 1, a terminal in the confidentiality inspection network sends a confidentiality inspection request to the server, and the server connects to the terminal after responding to the inspection request;
Step 2, the server records the terminal's information, including the connection time and IP address, and stores it in a database; the server uses the file parsing module to determine the type of the file to be inspected;
Step 3, files that the file parsing module identifies as images are processed by the image processing module, specifically:
If the image contains Gaussian noise, weighted denoising is performed by fusing mean filtering with median filtering, yielding the denoised image;
If the image has a complex background, the character image is binarized to separate the foreground region from the background region;
If the image is skewed, the image is compressed and part of it is selected for the Hough transform, yielding the corrected image;
Step 4, the image segmentation module segments the text in the processed image;
Step 5, the feature extraction module extracts character features from the segmented text and performs classification and recognition;
Step 6, the text matching module performs keyword matching on the recognized characters; if a keyword is found, the information obtained is displayed on the terminal interface, otherwise the system exits directly.
2. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 2, if the file is a compressed archive, it is decompressed and the file type is determined again for each file it contains; if the file is not an image, its document content is parsed; if the file is an image, it is processed by the image processing module.
3. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 3, if the image contains Gaussian noise, the detailed process of performing weighted denoising by fusing mean filtering with median filtering to obtain the denoised image is:
Step 3.1.1, set a 3 × 3 window and obtain the pixel value at each position of the window in the image containing Gaussian noise;
Step 3.1.2, compute the mean and the median of the 3 × 3 window;
Step 3.1.3, assign different weights to the mean and the median obtained, compute their weighted sum, and set the result as the pixel value at the window centre;
Step 3.1.4, repeat the above steps to denoise the whole image.
4. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 3, if the image has a complex background, the detailed process of binarizing the character image to separate the foreground region from the background region is:
Step 3.2.1, compute the threshold T of the whole image using the global threshold method, and find the cluster centre T1;
Step 3.2.2, using the thresholds computed in step 3.2.1, define the regions in which the threshold is to be recomputed; with c and d as parameters, every pixel value f(x, y) of the image is tested as follows:
(1 - c)T ≤ f(x, y) ≤ (1 + c)T
(1 - d)T1 ≤ f(x, y) ≤ (1 + d)T1
where c and d are preset parameters; if the above inequalities are satisfied, proceed to step 3.2.3; otherwise the pixel is binarized according to the global threshold method;
Step 3.2.3, when the above inequalities are satisfied, a local threshold is computed using the improved Bernsen algorithm;
Step 3.2.4, repeat the above steps to binarize the whole image.
5. The terminal confidentiality inspection method based on OCR technology according to claim 4, characterized in that: in step 3.2.4, the image is first binarized by the global threshold method, and the result is then further binarized with the improved Bernsen algorithm, so that the complex background is removed and a better foreground image is obtained.
6. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 3, if the image is skewed, the detailed process of compressing the image and selecting part of it for the Hough transform to obtain the corrected image is:
Step 3.3.1, compress the image and divide it into blocks, and select part of the character image as the detection target;
Step 3.3.2, construct a discrete parameter space on the ρ-θ plane and establish an accumulator matrix A(ρ, θ), with every element initialized to 0;
Step 3.3.3, apply the Hough transform to each non-zero pixel of the binary image; whenever a value of θ yields a corresponding ρ, record the result in the accumulator matrix;
Step 3.3.4, finally, find the maximum over ρ in the accumulator matrix; the θ at which it occurs is the inclination angle of the series of straight lines;
After the skew angle of the character image is detected, the image is corrected by rotation. The rotation-correction formula is as follows: [formula not reproduced in the source text]
Correcting the skewed image in this way improves the speed of image processing and yields the corrected image.
7. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that: in step 4, the image segmentation module comprises a text localization module and a character segmentation module, and the detailed process of step 4 is:
Step 4.1, the text localization module extracts the edge information of the character image with the Sobel operator, and localizes text from the extracted edges by character edge detection and morphological processing, thereby determining the character regions in the text; the character regions are marked with rectangular boxes;
Step 4.2, the character segmentation module applies projection histograms to the character regions marked with rectangular boxes, counting the number of target pixels in each row or each column of the image; the counts are arranged in row or column order, and the characters are segmented accordingly;
Step 4.3, using the probability density distribution of each segmented character in the horizontal and vertical directions, the character is scaled proportionally and linearly normalized, giving the final segmented text.
8. The terminal confidentiality inspection method based on OCR technology according to claim 1, characterized in that the detailed process of step 5 is:
Step 5.1, directional line element features are extracted from the segmented characters; for each segmented character, all black pixels in the character pixel matrix are scanned one by one with a 3 × 3 window, from top to bottom and from left to right;
If the scanned pixel is a black pixel, i.e. part of the character pattern, the number of white pixels in the window is counted;
If the number of white pixels is greater than or equal to 2 and less than 8, the current pixel is a contour point of the text;
Otherwise it is judged to be a noise point and its pixel value is set to 0;
Step 5.2, combining structure-based and statistics-based methods, the feature vector of the character is extracted in eight directions, giving an eight-dimensional direction vector, from which the directional element features of the black pixels are computed;
Step 5.3, a template matching classifier is selected, and classification and recognition are performed according to the eight-dimensional direction vectors of the template feature vectors, so as to distinguish different characters.
CN201810865946.2A 2018-08-01 2018-08-01 A kind of terminal censorship method based on OCR technique Pending CN109284756A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810865946.2A CN109284756A (en) 2018-08-01 2018-08-01 A kind of terminal censorship method based on OCR technique

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810865946.2A CN109284756A (en) 2018-08-01 2018-08-01 A kind of terminal censorship method based on OCR technique

Publications (1)

Publication Number Publication Date
CN109284756A true CN109284756A (en) 2019-01-29

Family

ID=65183339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810865946.2A Pending CN109284756A (en) 2018-08-01 2018-08-01 A kind of terminal censorship method based on OCR technique

Country Status (1)

Country Link
CN (1) CN109284756A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110390260A (en) * 2019-06-12 2019-10-29 平安科技(深圳)有限公司 Picture scanning part processing method, device, computer equipment and storage medium
CN111046874A (en) * 2019-12-12 2020-04-21 北京小白世纪网络科技有限公司 Single number identification method based on template matching
CN111461205A (en) * 2020-03-30 2020-07-28 拉扎斯网络科技(上海)有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN111767769A (en) * 2019-08-14 2020-10-13 北京京东尚科信息技术有限公司 Text extraction method and device, electronic equipment and storage medium
CN112115735A (en) * 2019-06-19 2020-12-22 国网江苏省电力有限公司常州供电分公司 Identification management method for confidential files
CN112200735A (en) * 2020-09-18 2021-01-08 安徽理工大学 Temperature identification method based on flame image and control method of low-concentration gas combustion system
CN113191348A (en) * 2021-05-31 2021-07-30 山东新一代信息产业技术研究院有限公司 Template-based text structured extraction method and tool
CN115328463A (en) * 2022-08-01 2022-11-11 无锡雪浪数制科技有限公司 Design system based on visual business arrangement

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101697228A (en) * 2009-10-15 2010-04-21 东莞市步步高教育电子产品有限公司 Method for processing text images
CN101859382A (en) * 2010-06-03 2010-10-13 复旦大学 License plate detection and identification method based on maximum stable extremal region
CN106411650A (en) * 2016-10-19 2017-02-15 北京交通大学 Distributed security and confidentiality checking method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
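Kong Bin et al.: "Research on Image File Content Recognition Technology in Confidentiality Inspection" (保密检查中图像文件内容识别技术研究), Secrecy Science and Technology (《保密科学技术》) *
Sun Lina et al.: "Detection, Localization and Extraction of Text in Video Images" (视频图像中文本的检测、定位与提取), Electronic Science and Technology (《电子科技》) *
Zhang Ke: "Temperature Measurement and Control Technology and Its Applications" (《温度测控技术及应用》), 30 November 2011 *
Chen Yuning: "Mathematization Methods for Tangut Script and Their Applications" (《西夏文字数学化方法及其应用》), 30 June 2002 *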

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110390260A (en) * 2019-06-12 2019-10-29 平安科技(深圳)有限公司 Picture scanning part processing method, device, computer equipment and storage medium
WO2020248497A1 (en) * 2019-06-12 2020-12-17 平安科技(深圳)有限公司 Picture scanning document processing method and apparatus, computer device, and storage medium
CN110390260B (en) * 2019-06-12 2024-03-22 平安科技(深圳)有限公司 Picture scanning piece processing method and device, computer equipment and storage medium
CN112115735A (en) * 2019-06-19 2020-12-22 国网江苏省电力有限公司常州供电分公司 Identification management method for confidential files
CN111767769A (en) * 2019-08-14 2020-10-13 北京京东尚科信息技术有限公司 Text extraction method and device, electronic equipment and storage medium
CN111046874A (en) * 2019-12-12 2020-04-21 北京小白世纪网络科技有限公司 Single number identification method based on template matching
CN111461205A (en) * 2020-03-30 2020-07-28 拉扎斯网络科技(上海)有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN112200735A (en) * 2020-09-18 2021-01-08 安徽理工大学 Temperature identification method based on flame image and control method of low-concentration gas combustion system
CN113191348A (en) * 2021-05-31 2021-07-30 山东新一代信息产业技术研究院有限公司 Template-based text structured extraction method and tool
CN113191348B (en) * 2021-05-31 2023-02-03 山东新一代信息产业技术研究院有限公司 Template-based text structured extraction method and tool
CN115328463A (en) * 2022-08-01 2022-11-11 无锡雪浪数制科技有限公司 Design system based on visual business arrangement

Similar Documents

Publication Publication Date Title
CN109284756A (en) A kind of terminal censorship method based on OCR technique
JP3359095B2 (en) Image processing method and apparatus
JP4477468B2 (en) Device part image retrieval device for assembly drawings
CN108805076A (en) The extracting method and system of environmental impact assessment report table word
JP2001167131A (en) Automatic classifying method for document using document signature
CN112861865B (en) Auxiliary auditing method based on OCR technology
CN116403094B (en) Embedded image recognition method and system
US11704925B2 (en) Systems and methods for digitized document image data spillage recovery
CN114444566B (en) Image forgery detection method and device and computer storage medium
JP4391704B2 (en) Image processing apparatus and method for generating binary image from multi-valued image
CN111274762A (en) Computer expression method based on diversified fonts in Tibetan classical documents
CN114842478A (en) Text area identification method, device, equipment and storage medium
Dixit et al. Automatic logo detection from document image using HOG features
Zhan et al. A robust split-and-merge text segmentation approach for images
CN104899551B (en) A kind of form image sorting technique
Kia et al. Integrated segmentation and clustering for enhanced compression of document images
Padma et al. Identification of Telugu, Devanagari and English Scripts Using Discriminating
CN116311297A (en) Electronic evidence image recognition and analysis method based on computer vision
CN114758340A (en) Intelligent identification method, device and equipment for logistics address and storage medium
CN113591657A (en) OCR (optical character recognition) layout recognition method and device, electronic equipment and medium
CN118379753B (en) Method and system for extracting bad asset contract key information by utilizing OCR technology
CN106469267A (en) A kind of identifying code sample collection method and system
Sun Pornographic image screening by integrating recognition module and image black-list/white-list subsystem
Malkawi et al. Auto Signature Verification Using Line Projection Features Combined with Different Classifiers and Selection Methods
Hukkeri et al. Machine Learning in OCR Technology: Performance Analysis of Different OCR Methods for Slide-to-Text Conversion in Lecture Videos

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190129