CN102486834A - Method and device for processing characters - Google Patents

Publication number
CN102486834A
CN102486834A CN2010105823030A CN201010582303A CN102486834A CN 102486834 A CN102486834 A CN 102486834A CN 2010105823030 A CN2010105823030 A CN 2010105823030A CN 201010582303 A CN201010582303 A CN 201010582303A CN 102486834 A CN102486834 A CN 102486834A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010105823030A
Other languages
Chinese (zh)
Inventor
郝佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd and Beijing Founder Electronics Co Ltd
Priority to CN2010105823030A
Publication of CN102486834A
Legal status: Pending

Landscapes

  • Storage Device Security (AREA)

Abstract

The invention relates to the technical field of communication, and particularly relates to a method and device for processing characters. The method comprises the following steps of: obtaining a first character flow in an existing file; when detecting that a character to be converted exists in the first character flow, converting the character to be converted according to a pre-set conversion relation to form a character flow to be detected; comparing the character flow to be detected with a standard character flow, wherein the standard character flow is the character flow obtained by converting a second character flow in an original file of the existing file according to the pre-set conversion relation, and a starting position and an ending position which are used for obtaining the first character flow correspond to the starting position and the ending position which are used for obtaining the second character flow; if a compared result is that the character flow to be detected is matched with the standard character flow, outputting the character flow to be detected. By using the method and device for processing the characters, provided by the embodiment of the invention, similar characters are converted so as to assure that the character flow for encrypting a scanned and identified file is consistent with the character flow for encrypting an original file.

Description

Method and device for processing characters
Technical field
The present invention relates to the technical field of communication, and in particular to a method and device for processing characters.
Background
OCR (Optical Character Recognition) technology refers to the process in which an electronic device (for example, a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates the shapes into computer text with a character recognition method. In other words, the text material is scanned and the resulting image file is analyzed to obtain the text and layout information.
However, when OCR is used to extract similar-looking characters, recognition errors may occur, for example O being recognized as 0. The digest generated by encrypting the file produced by OCR may then differ from the digest generated by encrypting the original, which causes errors in subsequent processing of the digest.
Summary of the invention
Embodiments of the present invention provide a method and device for processing characters. By converting similar characters and representing them with a unified sign, they ensure that the character stream used when encrypting a scanned and recognized file is consistent with the character stream used when encrypting the original.
An embodiment of the present invention provides a method for processing characters, the method comprising:
obtaining a first character stream in a current file;
when it is detected that a character to be converted exists in the first character stream, converting the character to be converted according to a preset conversion relation to form a character stream to be detected;
comparing the character stream to be detected with a standard character stream, wherein the standard character stream is the character stream obtained by converting a second character stream in the source document of the current file according to the preset conversion relation, and the start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream;
if the comparison result is a match, outputting the character stream to be detected.
Correspondingly, an embodiment of the present invention provides a device for processing characters, comprising:
an acquisition module, configured to obtain a first character stream in a current file;
a processing module, configured to, when it is detected that a character to be converted exists in the first character stream, convert the character to be converted according to a preset conversion relation to form a character stream to be detected;
a comparing module, configured to compare the character stream to be detected with a standard character stream, wherein the standard character stream is the character stream obtained by converting a second character stream in the source document of the current file according to the preset conversion relation, and the start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream;
an output module, configured to output the character stream to be detected if the comparison result is a match.
The method and device for processing characters provided by the embodiments of the present invention obtain a first character stream in a current file; when it is detected that a character to be converted exists in the first character stream, convert the character to be converted according to a preset conversion relation to form a character stream to be detected; compare the character stream to be detected with a standard character stream, where the standard character stream is the character stream obtained by converting a second character stream in the source document of the current file according to the preset conversion relation, and the start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream; and, if the comparison result is a match, output the character stream to be detected. With this method and device, similar characters are converted and represented with a unified sign, which ensures that the character stream used when encrypting a scanned and recognized file is consistent with the character stream used when encrypting the original.
Description of drawings
Fig. 1 is a schematic flowchart of a method for processing characters in an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a method for processing characters in another embodiment of the present invention;
Fig. 3 is a schematic diagram of a device for processing characters in another embodiment of the present invention.
Embodiment
The main implementation principle, the specific implementations, and the beneficial effects achievable by the technical solutions of the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
To ensure that the character stream used when encrypting a scanned and recognized file is consistent with the character stream used when encrypting the original, an embodiment of the present invention provides a method for processing characters which, as shown in Fig. 1, comprises the following steps:
Step 101: obtain a first character stream in the current file. Specifically, the source document is printed to produce the current file; after the current file is scanned with a device such as a scanner, its first character stream is extracted. Extraction may proceed from top to bottom, from bottom to top, or from one specific position to another specific position.
Step 102: when it is detected that a character to be converted exists in the first character stream, convert the character to be converted according to the preset conversion relation to form the character stream to be detected; the start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream. Specifically, when a character to be converted is detected in the first character stream, it is converted according to the preset conversion relation; after the converted characters replace the characters to be converted, they form the character stream to be detected together with the characters in the first character stream that are not to be converted. The conversion relation comprises converting similar characters into a unified sign; whether characters are judged similar depends on the degree of similarity of the characters themselves, on experience, and on whether they are easily confused during conversion. For example, suppose the first character stream is 52O1L and the preset conversion relation is: 0, O, Q, o are converted into 0; I, 1, L, l are converted into 1. When the characters to be converted O, 1, L are detected in the first character stream, O is converted into 0 and L is converted into 1 according to the preset conversion relation (1 is already the unified sign and stays unchanged). Then 0 replaces the O and 1 replaces the L in the first character stream, forming the character stream to be detected 52011.
Step 103: compare the character stream to be detected with the standard character stream, where the standard character stream is the character stream obtained by converting the second character stream in the source document of the current file according to the preset conversion relation. Specifically, the current file is generated by printing the source document, and its content is identical to that of the source document. The start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream. The second character stream is then converted according to the preset conversion relation described above. For example, if the second character stream is 520IL, then according to the preset conversion relation I is converted into 1 and L is converted into 1; 1 replaces the I and 1 replaces the L in the second character stream, forming the standard character stream 52011.
Step 104: if the comparison result is a match, output the character stream to be detected. Specifically, if the comparison result is a match, it can be confirmed that the current file is a printout of the source document and that the contents of the two are consistent; the character stream to be detected is output for subsequent processing such as encryption.
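Steps 101 to 104 can be illustrated with a minimal Python sketch built around the worked example (52O1L and 520IL). The names `CONVERSION`, `convert`, and `compare_and_output` are hypothetical and do not come from the patent; scanning and OCR are assumed to have already produced the two strings.

```python
from typing import Dict, Optional

# Preset conversion relation from the example in steps 102 and 103.
CONVERSION: Dict[str, str] = {
    "0": "0", "O": "0", "Q": "0", "o": "0",
    "I": "1", "1": "1", "L": "1", "l": "1",
}


def convert(stream: str) -> str:
    """Replace each character to be converted; keep every other character."""
    return "".join(CONVERSION.get(ch, ch) for ch in stream)


def compare_and_output(first_stream: str, second_stream: str) -> Optional[str]:
    """Return the character stream to be detected on a match, else None."""
    to_detect = convert(first_stream)    # step 102
    standard = convert(second_stream)    # standard character stream, step 103
    return to_detect if to_detect == standard else None  # step 104


print(compare_and_output("52O1L", "520IL"))  # 52011
```

Returning `None` on a mismatch is a design choice of this sketch; the text only states what happens when the streams match.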
As can be seen from the above description, the method for processing characters provided by the embodiment of the present invention converts similar characters and represents them with a unified sign, which ensures that the character stream used when encrypting a scanned and recognized file is consistent with the character stream used when encrypting the original.
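The effect on a digest can be checked with a short sketch. The patent does not name the encryption or digest algorithm, so SHA-256 from Python's standard `hashlib` module is used here purely as an assumed stand-in; `TABLE` and `digest` are hypothetical names.

```python
import hashlib

# Two-group conversion relation from the example in step 102
# (0 and 1 are already the unified signs, so they need no entry).
TABLE = str.maketrans("OQoILl", "000111")


def digest(stream: str) -> str:
    """Stand-in for the unspecified encryption step: a SHA-256 hex digest."""
    return hashlib.sha256(stream.encode("utf-8")).hexdigest()


scanned, source = "52O1L", "520IL"   # OCR output vs. text of the source document

raw_equal = digest(scanned) == digest(source)
converted_equal = digest(scanned.translate(TABLE)) == digest(source.translate(TABLE))
print(raw_equal, converted_equal)    # False True
```

The raw strings differ, so their digests differ; after conversion both strings are 52011 and the digests agree.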
The method for processing characters provided by the embodiments of the present invention is described in detail below through a specific embodiment. Suppose file B is printed to produce file A. A first character stream is obtained after file A is scanned and recognized, and a second character stream is obtained at the same position in file B. The same conversion relation is then used to convert the first character stream and the second character stream. Thus, even if similar characters are misrecognized because recognition accuracy is poor, the conversion relation makes the converted character streams identical, so that subsequent processing such as encryption does not produce errors. As shown in Fig. 2, the method comprises the following steps:
Step 201: obtain the first character stream in file A. Specifically, file B is printed to produce file A; after file A is scanned with a device such as a scanner, its first character stream is extracted. Extraction may proceed from top to bottom, from bottom to top, or from one specific position to another specific position.
Step 202: detect whether a character to be converted exists in the first character stream; if so, perform step 203; otherwise, output the first character stream.
Step 203: convert the character to be converted according to the preset conversion relation. Specifically, the conversion relation is configured in advance and updated regularly; it maps similar characters to a unified sign. Whether characters are judged similar depends on the degree of similarity of the characters themselves, on experience, and on whether they are easily confused during conversion. For example: 0, o, O, Q are converted into 0; I, 1, L, l, |, ! are converted into 1; *, *, x, X are converted into X; , 3 are converted into 3; { }, [], () are converted into (); Z, 2, z are converted into 2; $, s, S are converted into S; and so on.
Step 204: replace the characters to be converted with the converted characters to form the character stream to be detected. Specifically, suppose the first character stream is oz*3s; then o is replaced with 0, z with 2, * with X, and s with S, while the remaining characters are unchanged, forming the character stream to be detected 02X3S.
Step 205: compare the character stream to be detected with the standard character stream, where the standard character stream is the character stream obtained by converting the second character stream in file B according to the preset conversion relation. The second character stream is obtained in the same way as the first character stream, and the same conversion relation is applied to it.
Step 206: if the comparison result is a match, output the character stream to be detected. If the comparison result is a mismatch, file A is not a printout of file B.
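Steps 201 to 206 can be sketched end to end as follows. This is an illustration under stated assumptions, not the patented implementation: the names `GROUPS`, `TABLE`, `convert`, and `process` are hypothetical, start and end positions are modelled as string indices, only the groups that are legible in step 203 are included, and splitting the bracket group into opening and closing brackets is an interpretation.

```python
from typing import Optional

# (similar characters, unified sign): only the groups legible in step 203.
GROUPS = [
    ("0oOQ", "0"),
    ("I1Ll|!", "1"),
    ("*xX", "X"),
    ("{[(", "("),   # assumption: opening brackets unify to "("
    ("}])", ")"),   # assumption: closing brackets unify to ")"
    ("Z2z", "2"),
    ("$sS", "S"),
]
TABLE = {ord(ch): unified for similar, unified in GROUPS for ch in similar}


def convert(stream: str) -> str:
    """Steps 203-204: replace every listed character with its unified sign."""
    return stream.translate(TABLE)


def process(scanned_a: str, source_b: str, start: int, end: int) -> Optional[str]:
    first = scanned_a[start:end]                          # step 201
    if not any(ord(ch) in TABLE for ch in first):         # step 202
        return first                                      # nothing to convert
    to_detect = convert(first)                            # steps 203-204
    standard = convert(source_b[start:end])               # same positions in file B
    return to_detect if to_detect == standard else None   # steps 205-206


print(convert("oz*3s"))                          # 02X3S
print(process("No. oz*3s", "No. 0Zx3S", 4, 9))   # 02X3S
print(process("No. oz*3s", "No. 0Zx4S", 4, 9))   # None
```

In the last call the source text contains a 4 where the scan contains a 3, a difference the conversion relation does not cover, so the streams do not match and nothing is output.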
As can be seen from the above description, the method for processing characters provided by the embodiment of the present invention converts similar characters and represents them with a unified sign, which ensures that the character stream used when encrypting a scanned and recognized file is consistent with the character stream used when encrypting the original.
Correspondingly, an embodiment of the present invention also provides a device for processing characters which, as shown in Fig. 3, specifically comprises:
an acquisition module 301, configured to obtain a first character stream in a current file;
a processing module 302, configured to, when it is detected that a character to be converted exists in the first character stream, convert the character to be converted according to a preset conversion relation to form a character stream to be detected;
a comparing module 303, configured to compare the character stream to be detected with a standard character stream, where the standard character stream is the character stream obtained by converting a second character stream in the source document of the current file according to the preset conversion relation, and the start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream;
an output module 304, configured to output the character stream to be detected if the comparison result is a match.
Preferably, the acquisition module 301 is specifically configured to scan the current file and extract the first character stream.
Preferably, the processing module 302 is specifically configured to, when it is detected that a character to be converted exists in the first character stream, convert the character to be converted in the first character stream according to the preset conversion relation; after the converted characters replace the characters to be converted, they form the character stream to be detected together with the characters in the first character stream that are not to be converted.
Preferably, the device further comprises an update module 305, configured to regularly update the conversion relation.
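One possible way to arrange modules 301 to 305 in software is sketched below. The class `CharacterDevice` and its method names are hypothetical; the acquisition step is assumed to receive text that a scanner and OCR engine have already produced, and the default table holds only a few entries for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


def _default_relation() -> Dict[str, str]:
    # Illustrative subset of a preset conversion relation.
    return {"O": "0", "o": "0", "Q": "0", "I": "1", "L": "1", "l": "1"}


@dataclass
class CharacterDevice:
    conversion: Dict[str, str] = field(default_factory=_default_relation)

    def acquire(self, recognized_text: str, start: int, end: int) -> str:
        """Acquisition module 301: take the stream between two positions."""
        return recognized_text[start:end]

    def process(self, first_stream: str) -> str:
        """Processing module 302: form the character stream to be detected."""
        return "".join(self.conversion.get(ch, ch) for ch in first_stream)

    def compare(self, to_detect: str, standard: str) -> bool:
        """Comparing module 303."""
        return to_detect == standard

    def output(self, to_detect: str, standard: str) -> Optional[str]:
        """Output module 304: emit the stream only when it matches."""
        return to_detect if self.compare(to_detect, standard) else None

    def update(self, new_entries: Dict[str, str]) -> None:
        """Update module 305: extend or change the conversion relation."""
        self.conversion.update(new_entries)


device = CharacterDevice()
print(device.output(device.process("5Z0"), "520"))  # None: Z is not yet covered
device.update({"Z": "2", "z": "2"})
print(device.output(device.process("5Z0"), "520"))  # 520
```

The usage lines show why an update module is useful: a confusion that the table does not yet cover causes a mismatch until the relation is extended.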
As can be seen from the above description, the method and device for processing characters provided by the embodiments of the present invention convert similar characters and represent them with a unified sign, which ensures that the character stream used when encrypting a scanned and recognized file is consistent with the character stream used when encrypting the original.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (10)

1. A method for processing characters, characterized by comprising:
obtaining a first character stream in a current file;
when it is detected that a character to be converted exists in the first character stream, converting the character to be converted according to a preset conversion relation to form a character stream to be detected;
comparing the character stream to be detected with a standard character stream, wherein the standard character stream is the character stream obtained by converting a second character stream in the source document of the current file according to the preset conversion relation, and the start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream;
if the comparison result is a match, outputting the character stream to be detected.
2. The method of claim 1, characterized by further comprising: printing the source document to produce the current file.
3. The method of claim 2, characterized in that obtaining the first character stream in the current file comprises: scanning the current file and extracting the first character stream.
4. The method of claim 1, characterized in that, when it is detected that a character to be converted exists in the first character stream, converting the character to be converted according to the preset conversion relation to form the character stream to be detected comprises:
when it is detected that a character to be converted exists in the first character stream, converting the character to be converted in the first character stream according to the preset conversion relation, and, after the converted characters replace the characters to be converted, forming the character stream to be detected together with the characters in the first character stream that are not to be converted.
5. The method of claim 1, characterized in that the conversion relation comprises: converting similar characters into a unified sign.
6. The method of claim 1, characterized by further comprising: regularly updating the conversion relation.
7. A device for processing characters, characterized by comprising:
an acquisition module, configured to obtain a first character stream in a current file;
a processing module, configured to, when it is detected that a character to be converted exists in the first character stream, convert the character to be converted according to a preset conversion relation to form a character stream to be detected;
a comparing module, configured to compare the character stream to be detected with a standard character stream, wherein the standard character stream is the character stream obtained by converting a second character stream in the source document of the current file according to the preset conversion relation, and the start position and end position used to obtain the first character stream correspond to the start position and end position used to obtain the second character stream;
an output module, configured to output the character stream to be detected if the comparison result is a match.
8. The device of claim 7, characterized in that the acquisition module is specifically configured to scan the current file and extract the first character stream.
9. The device of claim 7, characterized in that the processing module is specifically configured to, when it is detected that a character to be converted exists in the first character stream, convert the character to be converted in the first character stream according to the preset conversion relation, and, after the converted characters replace the characters to be converted, form the character stream to be detected together with the characters in the first character stream that are not to be converted.
10. The device of claim 7, characterized by further comprising:
an update module, configured to regularly update the conversion relation.
CN2010105823030A 2010-12-06 2010-12-06 Method and device for processing characters Pending CN102486834A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105823030A CN102486834A (en) 2010-12-06 2010-12-06 Method and device for processing characters

Publications (1)

Publication Number Publication Date
CN102486834A true CN102486834A (en) 2012-06-06

Family

ID=46152325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105823030A Pending CN102486834A (en) 2010-12-06 2010-12-06 Method and device for processing characters

Country Status (1)

Country Link
CN (1) CN102486834A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1300400A (en) * 1998-04-01 2001-06-20 威廉·彼得曼 System and method for searching electronic documents created with optical character recognition
US7403657B2 (en) * 2001-03-22 2008-07-22 Hitachi, Ltd. Method and apparatus for character string search in image
CN1933391A (en) * 2005-09-16 2007-03-21 北京书生国际信息技术有限公司 Hidden code inserting and detecting method
CN101178763A (en) * 2007-12-12 2008-05-14 北京航空航天大学 Government documents ciphering and deciphering method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646068A (en) * 2013-12-04 2014-03-19 Tcl集团股份有限公司 Encryption method, decryption method, method for group messaging and corresponding devices thereof
CN103646068B (en) * 2013-12-04 2017-10-20 Tcl集团股份有限公司 Encryption method, decryption method, the method for sending bulk message and its corresponding intrument
CN106335293A (en) * 2016-10-21 2017-01-18 河南纸纹智能科技有限公司 File processing apparatus
CN110298027A (en) * 2018-03-22 2019-10-01 卡西欧计算机株式会社 Display device, display system, display methods and recording medium

Similar Documents

Publication Publication Date Title
US20200257869A1 (en) Scanner with control logic for resolving package labeling conflicts
WO2006002009A3 (en) Document management system with enhanced intelligent document recognition capabilities
CN102360419B (en) Method and system for computer scanning reading management
CN104281841A (en) Fingerprint identification system and fingerprint processing method and device thereof
CN109759713B (en) Rapid marking method and rapid marking system based on CCD image recognition
CN102201053A (en) Method for cutting edge of text image
JP2006059124A (en) System and apparatus for recognizing character string in landscape
CN104883343A (en) Online sharing method, system and transaction machine thereof
CN102486834A (en) Method and device for processing characters
CN102567711A (en) Method and system for making and using scanning recognition template
CN105025188B (en) Image processing system, image processing apparatus and image processing method
JP2019128690A (en) Handwritten character recognition system
CN101873415A (en) Camera device having translation function and method
US9268998B2 (en) Image determining apparatus, image processing system, and recording medium
US10523848B2 (en) Image processing apparatus for processing marked regions
CN111008387A (en) Anti-counterfeiting tracing system and method for printed document based on digital signature and document DNA
US10679101B2 (en) Optical character recognition systems and methods
CN101335801A (en) Scanning method combining camera and optical positioning
US20140104663A1 (en) Method and apparatus for document digitization
US20130275133A1 (en) Electronic Pen with Printable Arrangement
EP2586539B1 (en) Apparatus and method for upgrading the camera of a sorting / scanning system with backwards compatibility
JP2008244545A (en) Image processor
CN105072141A (en) Intelligent equipment networking method and intelligent equipment
WO2020065980A1 (en) Image processing device, control method and control program
CN208890892U (en) A kind of Novel scanner

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20120606