Background technology
In daily life, people often need to photograph paper documents and save them as JPEG photos or PDF files, so that the documents are digitized and easier to manage. The smartphone is one of the tools most commonly used for this purpose. Because smartphones generally have a camera, a paper document can be photographed with the phone, and the captured image can then be converted, after some image processing, into a JPEG photo or a PDF document. Application software with these functions is widespread, for example CamScanner in the Apple and Google app stores. Such software can automatically detect the four edges of the photographed document in the captured image, use them as a reference to cut away the background outside the document area, and apply correction, image enhancement, and similar processing to the document area. The result is a clean electronic document similar to one obtained with a scanner, which is saved and managed in the format the user specifies.
Pages of paper notebooks are among the paper documents that most commonly need to be digitized. People have long used paper notebooks for records of all kinds, such as meeting minutes, memos, and transaction records. A paper notebook contains tens or even hundreds of pages, and within a notebook of a given type the layout of all pages used for writing is generally uniform.
In practice, users often write entries one by one by hand. For example, a user writes possible weekend activities on a notebook page in three lines: 1. go shopping; 2. see a film; 3. go to the park. After the page has been photographed and digitized, the user decides on option 2, see a film. To save this decision to a to-do list, the user has to type the text again on the electronic device, which is inconvenient. Ideally, the user would only need to tap the handwriting "2. see a film" in the electronic document of the notebook page displayed on the device; the image region containing that handwriting would then be cut out automatically and added to the to-do list.
Many notebooks, however, are printed with ruled lines, and the user's handwriting often overlaps these pre-printed lines; some notebooks also have a background pattern printed on the page. All of this interferes with automatically segmenting, from the position the user taps, the image region that contains the handwriting "2. see a film", and makes the segmentation inaccurate.
Summary of the invention
In view of the above shortcomings of the prior art, the object of the present invention is to provide a method for automatically segmenting handwritten entries in a digitized notebook, so as to solve the problem that the prior art cannot automatically extract the content at a specific position in an electronic document.
To achieve the above object and other related objects, the present invention provides a method for automatically segmenting handwritten entries in a digitized notebook.
The method for automatically segmenting handwritten entries in a digitized notebook comprises:
Capturing a paper page image of the notebook to be digitized;
Determining the four edge lines of the paper page image by a line detection method, and correcting the page area bounded by the four edge lines into a rectangular region;
Determining the type of the paper page from the paper page image, and obtaining the pre-saved blank cutting template for notebook pages of that type, the blank cutting template being composed of a number of character blocks;
Determining the character blocks in which the user's handwriting lies within the rectangular region, and automatically segmenting and extracting the user's handwriting in any one character block, with the character block as the unit.
Preferably, the type of the paper page is determined by the size and the format of the paper page; the format of the paper page comprises the number, size, and spacing of the character blocks that the paper page contains.
Preferably, a character block can be merged with adjacent character blocks, and the user's handwriting in any one merged character block is automatically segmented and extracted with the merged character block as the unit.
Preferably, where the type of the paper page is known in advance, determining the type of the paper page from the paper page image is implemented as follows: the type of the paper page is specified manually.
Preferably, where the type of the paper page is known in advance, determining the type of the paper page from the paper page image is implemented as follows: a type mark is printed at a fixed position on the paper page; the type mark is detected in the paper page image, and the detected type mark is compared one by one with the type marks known in advance, to find the type to which the paper page belongs.
Preferably, where the type of the paper page is not known in advance, determining the type of the paper page from the paper page image is implemented as follows: a new paper page type is created, and the size and format of the unknown paper page are entered.
As described above, the method of the present invention for automatically segmenting handwritten entries in a digitized notebook has the following beneficial effects:
When the paper page of a notebook is digitized, the present invention uses a pre-saved blank cutting template to help locate and segment the text that the user has handwritten on the page. Because the blank cutting template is composed of a number of character blocks, each character block can serve as a unit for cutting the writing on the page. Handwritten entries with complete content are thereby obtained, and the content of the electronic document is segmented and extracted automatically.
Detailed description of the embodiments
The embodiments of the present invention are described below by way of specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention.
Please refer to the accompanying drawings. It should be noted that the drawings provided in these embodiments illustrate the basic concept of the invention only schematically. They show only the components related to the invention and are not drawn according to the number, shape, and size of the components in an actual implementation. In an actual implementation the form, number, and proportion of each component may be changed freely, and the component layout may be more complex.
The present invention is described in detail below with reference to the embodiments and the accompanying drawings.
Embodiment one
This embodiment provides a method for automatically segmenting handwritten entries in a digitized notebook. As shown in Figure 1, the method comprises:
Capturing a paper page image of the notebook to be digitized. In this embodiment, the paper page of the notebook to be digitized can be of any type. For example, the page may be printed with a class identification region, a page number region, a title region, ruled lines and/or dividing lines, or any combination of these.
Determining the four edge lines of the paper page image by a line detection method, and correcting the page area bounded by the four edge lines into a rectangular region. Specifically, the four straight lines that represent the outer edges of the page in the paper page image are obtained by line detection; the background outside the range bounded by these four lines is cut away; and the captured image is corrected with these four lines as a reference, so that the page area they bound becomes a rectangular region.
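The geometry of this step can be sketched as follows. This is a minimal illustration, not the implementation of the invention: it assumes each detected edge is already available as a line a*x + b*y = c, and the function names (intersect, page_corners, target_size) are hypothetical. It computes the four page corners and the size of the rectangle to correct to; the perspective warp itself would be left to an image library.

```python
from math import hypot


def intersect(l1, l2):
    """Intersection of two lines, each given as (a, b, c) with a*x + b*y = c."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel")
    # Cramer's rule for the 2x2 linear system.
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)


def page_corners(top, right, bottom, left):
    """Corners in the order top-left, top-right, bottom-right, bottom-left."""
    return [intersect(top, left), intersect(top, right),
            intersect(bottom, right), intersect(bottom, left)]


def target_size(corners):
    """Width and height of the rectangle the page area is corrected to."""
    tl, tr, br, bl = corners
    width = max(hypot(tr[0] - tl[0], tr[1] - tl[1]),
                hypot(br[0] - bl[0], br[1] - bl[1]))
    height = max(hypot(bl[0] - tl[0], bl[1] - tl[1]),
                 hypot(br[0] - tr[0], br[1] - tr[1]))
    return round(width), round(height)
```

Taking the longer of each pair of opposite sides is one possible choice for the output size; a fixed size taken from the known page type would work equally well.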
Determining the type of the paper page from the paper page image, and obtaining the pre-saved blank cutting template for notebook pages of that type, the blank cutting template being composed of a number of character blocks. In this embodiment, the type of the paper page is determined by the size and the format of the page; the format comprises the number of character blocks the page contains, the size of the character blocks, and the spacing between adjacent character blocks. In other words, the paper page may be composed of block regions of arbitrary shape, each block region being one character block. The character blocks are laid out so that the user's handwriting on the page can be segmented intact.
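One way to represent a page type and derive its blank cutting template is sketched below. The sketch makes simplifying assumptions that the text does not require: every character block is a full-width horizontal strip, all blocks have the same height, and the class and field names (PageType, Block, top_margin, and so on) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One character block, in pixel coordinates of the corrected page."""
    left: int
    top: int
    right: int
    bottom: int


@dataclass(frozen=True)
class PageType:
    """A page type: page size plus format (block count, size, spacing)."""
    name: str
    width: int
    height: int
    block_count: int
    block_height: int
    spacing: int
    top_margin: int = 0


def blank_template(page):
    """Build the blank cutting template as a list of character blocks."""
    blocks = []
    y = page.top_margin
    for _ in range(page.block_count):
        blocks.append(Block(0, y, page.width, y + page.block_height))
        y += page.block_height + page.spacing
    return blocks
```

Because the template depends only on the page type, it can be built once and saved, then reused for every photographed page of that type.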
In the present invention, the captured paper page image of the notebook belongs to a page type that application software, such as the existing CamScanner, has saved in advance. The image region containing the user's handwriting (that is, one character block, or the region covered by several merged character blocks) can therefore be obtained with reference to the pre-saved blank cutting template for that page type, which markedly improves accuracy.
Determining the character blocks in which the user's handwriting lies within the rectangular region, and automatically segmenting and extracting the user's handwriting in any one character block, with the character block as the unit. A character block can also be merged with adjacent character blocks, in which case the user's handwriting is segmented and extracted with the merged character block as the unit. In the corrected notebook page image, the position of the user's handwriting within the blank cutting template is determined with reference to the pre-saved blank cutting template for that notebook page, and the handwriting is cut into character blocks that represent different lines of text. With the method of the present invention, the user can, by a simple manual operation, merge the regions of several adjacent character blocks that together express one complete meaning into a single region. The content of the cut-out character blocks that express a complete meaning can be added to a to-do list in the electronic device, or existing handwriting recognition technology can be used to recognize the text in them, sparing the user the trouble of typing the text manually on the electronic device.
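The lookup, merge, and cut operations described above can be sketched as follows. This is an illustration under stated assumptions: character blocks are axis-aligned boxes given as (left, top, right, bottom) tuples, the corrected page image is a list of pixel rows, and the function names (block_at, merge, crop) are hypothetical.

```python
def block_at(blocks, x, y):
    """Index of the character block containing point (x, y), or None."""
    for i, (left, top, right, bottom) in enumerate(blocks):
        if left <= x < right and top <= y < bottom:
            return i
    return None


def merge(blocks, first, last):
    """Bounding box of the adjacent blocks first..last (inclusive)."""
    span = blocks[first:last + 1]
    return (min(b[0] for b in span), min(b[1] for b in span),
            max(b[2] for b in span), max(b[3] for b in span))


def crop(image, box):
    """Cut the region of one (possibly merged) block out of the page image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

For example, a tap on the page is mapped to a block with block_at, and crop(image, blocks[i]) yields the image of that entry; if the entry spans several lines, the user-selected range is first combined with merge and the merged box is cropped instead.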
When a notebook page is digitized, the present invention uses the character blocks of the pre-saved blank cutting template to help locate and segment the regions containing the user's handwritten text, and obtains image blocks (also called character blocks) that contain handwritten entries with complete content. This facilitates region-by-region digitization of the paper page, as well as the use and management of the digitized document. In other words, because the blank cutting template is composed of a number of character blocks, each character block can serve as a unit for cutting the writing on the page; handwritten entries with complete content are thereby obtained, and the content of the electronic document is segmented and extracted automatically.
Embodiment two
This embodiment provides a method for automatically segmenting handwritten entries in a digitized notebook. It differs from the method of embodiment one in that the type of the paper page is known in advance, and determining the type of the paper page from the paper page image is implemented as follows: the type of the paper page is specified manually. That is, before capturing the image, or after capturing the image and before processing it, the user manually specifies the type to which the notebook page belongs, for example by selecting one from a series of notebook page types pre-saved in application software such as CamScanner.
Embodiment three
This embodiment provides a method for automatically segmenting handwritten entries in a digitized notebook. It differs from the methods of embodiments one and two in that the type of the paper page is known in advance, and determining the type of the paper page from the paper page image is implemented as follows:
A type mark is printed at a fixed position on the paper page. The type mark can be text, a symbol, a figure, or a combination of any two or all three of these.
The type mark is detected in the paper page image, and the detected type mark is compared one by one with the type marks known in advance, to find the type to which the paper page belongs. That is, a pre-designed mark (the type mark) is printed in advance at a specified position on every paper page of the notebook. After the image of the notebook page has been captured, the four outer edges of the page are first detected in the image; with these four outer edges as a reference, the approximate position of the mark in the page image is determined, and the mark is thereby detected in the image. The detected mark is then compared one by one with the pre-saved marks that represent the paper pages of the different notebook types, to find the type to which the captured notebook page belongs. This comparison step relies on techniques that are mature in the art, such as handwriting recognition, text recognition, and image matching, and is therefore not described further here.
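The one-by-one comparison can be sketched as follows. The text leaves the matching technique open, so this sketch substitutes a deliberately simple stand-in: each mark is assumed to be a small binary bitmap of the same size, and marks are compared by counting differing pixels. The names identify_type and max_diff, and the threshold value, are hypothetical.

```python
def hamming(a, b):
    """Number of differing pixels between two equally sized binary bitmaps."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def identify_type(detected, known_marks, max_diff=2):
    """Compare the detected mark with each known mark in turn.

    Returns the name of the closest known type, or None if even the
    closest mark differs in more than max_diff pixels.
    """
    best_name, best_diff = None, None
    for name, mark in known_marks.items():
        diff = hamming(detected, mark)
        if best_diff is None or diff < best_diff:
            best_name, best_diff = name, diff
    if best_diff is not None and best_diff <= max_diff:
        return best_name
    return None
```

A return value of None corresponds to a page whose mark matches no known type, which is the situation handled separately when the page type is not known in advance.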
Embodiment four
This embodiment provides a method for automatically segmenting handwritten entries in a digitized notebook. It differs from the method of embodiment one in that the type of the paper page is not known in advance. In this case, determining the type of the paper page from the paper page image is implemented as follows:
A new paper page type is created, and the size and format of the unknown paper page are entered.
That is, if the captured notebook page does not belong to any paper page type known in advance to application software such as CamScanner (types characterized by printed ruled lines, which may be bold and/or lengthened, and/or dividing lines, and/or a title region), the unknown page type is first added as a newly created paper page type, and the subsequent processing is then carried out.
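The handling of an unknown page type can be sketched as a lookup with a registration fallback. This is only an illustration: the registry is assumed to be a plain dictionary keyed by type name, and the function name get_or_create_type and the record fields size and layout are hypothetical.

```python
def get_or_create_type(registry, name, size=None, layout=None):
    """Return the saved page type, creating it first if it is unknown.

    size is (width, height); layout holds the format of the page, i.e.
    the number, size, and spacing of its character blocks.
    """
    if name in registry:
        return registry[name]
    if size is None or layout is None:
        raise ValueError("unknown page type: size and layout must be entered")
    registry[name] = {"size": size, "layout": layout}
    return registry[name]
```

Once the new type has been registered, later pages of the same notebook are handled like pages of any other known type.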
When the paper page of a notebook is digitized, the present invention uses a pre-saved blank cutting template to help locate and segment the text that the user has handwritten on the page. Because the blank cutting template is composed of a number of character blocks, each character block can serve as a unit for cutting the writing on the page. Handwritten entries with complete content are thereby obtained, and the content of the electronic document is segmented and extracted automatically.
In summary, the present invention effectively overcomes various shortcomings of the prior art and has high industrial utilization value.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.