CN1525378A - Bill definition data generating method and bill processing apparatus - Google Patents


Publication number
CN1525378A
Authority
CN
China
Prior art keywords
definition
data
mentioned
defined range
bill
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2004100006610A
Other languages
Chinese (zh)
Inventor
浅野英辅
新庄广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Omron Financial System Co Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd filed Critical Hitachi Ltd
Publication of CN1525378A publication Critical patent/CN1525378A/en
Pending legal-status Critical Current

Classifications

    • FMECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F41WEAPONS
    • F41BWEAPONS FOR PROJECTING MISSILES WITHOUT USE OF EXPLOSIVE OR COMBUSTIBLE PROPELLANT CHARGE; WEAPONS NOT OTHERWISE PROVIDED FOR
    • F41B11/00Compressed-gas guns, e.g. air guns; Steam guns
    • F41B11/80Compressed-gas guns, e.g. air guns; Steam guns specially adapted for particular purposes
    • F41B11/89Compressed-gas guns, e.g. air guns; Steam guns specially adapted for particular purposes for toys
    • FMECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F41WEAPONS
    • F41BWEAPONS FOR PROJECTING MISSILES WITHOUT USE OF EXPLOSIVE OR COMBUSTIBLE PROPELLANT CHARGE; WEAPONS NOT OTHERWISE PROVIDED FOR
    • F41B11/00Compressed-gas guns, e.g. air guns; Steam guns
    • F41B11/50Magazines for compressed-gas guns; Arrangements for feeding or loading projectiles from magazines
    • FMECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F41WEAPONS
    • F41BWEAPONS FOR PROJECTING MISSILES WITHOUT USE OF EXPLOSIVE OR COMBUSTIBLE PROPELLANT CHARGE; WEAPONS NOT OTHERWISE PROVIDED FOR
    • F41B11/00Compressed-gas guns, e.g. air guns; Steam guns
    • F41B11/70Details not provided for in F41B11/50 or F41B11/60

Abstract

To reduce a user burden upon creation of form definition data. A form processing device extracts layout information about an area specified by a user (Step 200), extracts an item name related to the specified area, and converts the information into definition data (Step 500, Step 600) to automatically create form definition data. The mere specification of the definition position automatically creates the definition data, so that the user burden of setting form definition data can be reduced.

Description

Bill definition data generating method and bill processing apparatus
Technical field
The present invention relates to technology for processing bills, and in particular to technology for generating the bill definition data used in character recognition processing.
Background art
When character strings on a bill, such as those in an amount column, are recognized, character recognition is performed using bill definition data in which information such as the position of each region to be recognized and its number of characters has been registered in advance. Various methods of generating this bill definition data have been proposed; they focus mainly on making it easier to generate definition data, which otherwise requires a great deal of work.
For example, in the bill definition generation method described in Japanese Unexamined Patent Publication No. 2001-126010 (pages 8-9, Fig. 7), definition data is extracted automatically from a blank (unfilled) bill by ruled-line extraction and frame extraction. In addition, by recognizing the preprinted characters and matching them against a keyword dictionary registered in advance, the method can set character entry fields that are not enclosed by ruled lines, or set the character type of a frame located at the position corresponding to a keyword. It further describes a method in which the preprinted characters are read over the whole bill, so that all operations related to definition generation are performed automatically without frames being specified by hand.
However, in the above definition generation method, when a frame is specified, definition data such as the character type cannot be generated unless preprinted text exists inside the specified frame. In addition, recognizing the preprinted characters over the whole bill is currently too time-consuming to be practical. Furthermore, because the position corresponding to a keyword differs from bill to bill, it is impossible to make a keyword dictionary that applies to bills in general.
Summary of the invention
Therefore, the main aim of this disclosure is to solve the above problems and to make it possible to generate bill definition data automatically for a wide variety of bills, regardless of whether the bill is of a specific type and whether or not it has been filled in.
Specifically, for example, when frames, ruled lines, and the like are extracted automatically from an input image and a region to be defined is specified, definition data is generated automatically based on layout information about the preprinted characters around or inside the region to be read. Even when no preprinted text exists inside the region to be read, this method can recognize a character string located around the region and extract definition data by converting the recognition result into definition data. Moreover, even when several character strings exist around the region to be read, the method quantifies how appropriate each one is as a keyword for the region (hereinafter called an item name) from the position and size of the character string, the presence or absence of a frame, and the ratio of the character string size to the frame size, and extracts definition data by converting the recognition result of the most appropriate character string.
With the above processing, bill definition data can be generated automatically whether or not the bill has been filled in, wherever the item name is located relative to the region to be read, and whether or not preprinted text exists inside the specified region.
Various other forms are also possible. For example, the technique may be configured as an automatic definition data generation method within the bill processing described above, or as a computer program that implements this function on a computer. Usable recording media include flexible disks, CD-ROMs, DVDs, magneto-optical disks, IC cards, IC chips, ROM cartridges, punch cards, printed matter bearing symbols such as bar codes, the internal storage of a computer (memory such as RAM and ROM), external storage devices, and other optically, magnetically, or electrically computer-readable media. The features described above may also be combined.
Description of drawings
Fig. 1 is a schematic configuration diagram of the bill processing apparatus.
Fig. 2 shows a bill image and the structure of the bill definition data.
Fig. 3 is a flowchart of the automatic bill definition data generation processing.
Fig. 4 shows example screens used to explain bill definition data generation.
Fig. 5 shows example screens used to explain bill definition data generation.
Fig. 6 is a flowchart of the item name to definition data conversion processing in bill definition data generation.
Fig. 7 shows the positions of the item names corresponding to the defined regions specified during bill definition data generation.
Fig. 8 shows an example of the item name to definition data conversion dictionary used in bill definition data generation.
Embodiment
A preferred embodiment is described below with reference to the drawings, under the following headings.
A. System configuration
B. Structure of the bill definition data
C. Generation of the bill definition data
C1. Item name to definition data conversion processing
A. System configuration
Fig. 1 shows the configuration of a bill processing apparatus that assists in generating bill definition data. The description below takes as an example the case in which new bill definition data is generated automatically from the image data of bill 106, but the bill processing apparatus can also add definition data for further regions to be read to bill definition data that has already been generated.
As shown in the figure, the hardware of this bill processing apparatus consists of a general-purpose personal computer 101 connected to a display 102, a keyboard 103, a mouse 104, and a scanner 105. Application software that implements the functions of the bill processing apparatus is installed on the personal computer 101. The figure shows functional blocks 107 to 113 of the bill processing apparatus. These functional blocks are implemented by the application software; they could of course also be implemented in hardware.
The image input unit 107 controls the scanner 105 and inputs the image data of bill 106, which serves as the sample from which bill definition data is generated. The bill definition data generation unit 108 automatically extracts bill definition data from this image data when a defined region is specified with input devices such as the keyboard 103 and mouse 104. In doing so it refers to the following databases: the character recognition dictionary 110, the item name matching knowledge dictionary 111, and the item name to definition information conversion dictionary 112. The character recognition dictionary 110 is used to match shapes in the image data against characters, one character at a time. The item name matching knowledge dictionary 111 improves the character recognition rate by matching character strings against words that may appear as item names. The item name to definition information conversion dictionary 112 is used to convert an item name obtained by this matching into definition data such as the attribute of the object to be read or its number of characters.
The bill definition data output unit 109 outputs the definition data extracted by the bill definition data generation unit 108. The automatically generated definition data is registered in the bill definition data database 113.
B. Structure of the bill definition data
Fig. 2 shows a bill image and the structure of the bill definition data. The upper part of the figure shows an example bill image 201 to be defined, and the lower part shows an example of the structure of the definition data 202. In bill image 201 the origin is the upper-left corner, and the x and y axes point in the directions shown.
As one example, the bill definition data 202 consists of the coordinates of the region to be recognized, the frame shape, the knowledge dictionary kind, the number of characters, handwritten or printed type, and so on. For example, the character recognition definition data for the request date at the upper right of bill image 201 corresponds to the entry at the upper left of definition data 202. In the definition data, the rectangular range on which character recognition is to be performed is defined by the (x, y) coordinates of its upper-left vertex (start position) and lower-right vertex (end position). In the example in the figure, the upper-left vertex is set to (1200, 100) and the lower-right vertex to (1400, 150). Because a frame is present, the frame shape is set to 'framed'. Because the attribute of the object to be read is a date, the knowledge dictionary kind is set to 'date'; the number of characters is set to '12 characters', and the character kind is set to 'printed'.
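As a minimal sketch, one entry of the definition data described above could be modeled as follows in Python. The class and field names (`FieldDefinition`, `dictionary_kind`, `script`, and so on) are assumptions made for illustration and do not come from the patent; only the values are taken from the date field example of Fig. 2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FieldDefinition:
    """One entry of bill definition data (field names are illustrative)."""
    top_left: Tuple[int, int]       # start position (x, y); origin is the upper-left corner
    bottom_right: Tuple[int, int]   # end position (x, y)
    has_frame: bool                 # frame shape: framed or not
    dictionary_kind: str            # knowledge dictionary kind, e.g. "date"
    char_count: int                 # number of characters
    script: str                     # "printed" or "handwritten"

    @property
    def size(self) -> Tuple[int, int]:
        """Width and height of the region to be recognized."""
        (x1, y1), (x2, y2) = self.top_left, self.bottom_right
        return (x2 - x1, y2 - y1)


# Values from the date field at the upper right of the bill image in Fig. 2.
date_field = FieldDefinition(
    top_left=(1200, 100),
    bottom_right=(1400, 150),
    has_frame=True,
    dictionary_kind="date",
    char_count=12,
    script="printed",
)
```

Keeping the coordinates separate from the attribute fields mirrors the later distinction between definition data derived from layout and definition data derived from the item name.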
The definition data described here is only one example, and various other kinds of information can also be set as definition data. For example, when it is known in advance that the region to be recognized contains printed type with a constant character pitch, the character pitch can be set as definition data, and using this information during character recognition can improve the recognition rate.
C. Generation of the bill definition data
Fig. 3 is a flowchart of the automatic definition data generation processing, which the CPU of computer 101 executes in response to user instructions. When processing starts, the CPU first inputs the image data of the bill through the image input unit 107 (step S100) and performs layout analysis over the whole bill (step S200). That is, it extracts information such as tables, frames, and ruled lines from the input image data, together with the parts recognized as character lines.
The layout information obtained by this processing is presented to the user on a display device such as the display 102 of computer 101. For example, in Fig. 4(a), the frame extraction result obtained by layout analysis is shown in window 405. For simplicity this example shows only the frame extraction result, but in practice the display can be switched to ruled lines or character line information with buttons, commands, and the like.
Next, if a ruled line or frame in a region the user intends to define has been extracted incorrectly, the user corrects the incorrectly extracted layout information (step S300). Corrections are made to the frames and ruled lines shown on a display device such as the display 102, using a pointing device such as the mouse 104 of computer 101. For example, in Fig. 4(b), frame 406 obtained by layout analysis has been extracted incorrectly, so the user presses the modify button 401, selects the frame to be corrected with the mouse 104, and then corrects it by dragging (407). When the CPU detects that the layout information has been corrected, it performs layout analysis again based on the corrected information so that the layout information for the region to be defined is set correctly.
This processing is carried out only when a frame, ruled line, or the like in the layout information of a region to be defined has been extracted incorrectly. It can therefore be skipped when no extraction error is found, or when the only extraction errors lie outside the regions to be defined, which shortens the time needed to generate the definition data.
Layout information is corrected by operations such as adding, deleting, modifying, merging, and splitting frames, ruled lines, and other layout elements. Alternatively, the layout information can be corrected all at once by changing the internally held thresholds that govern layout extraction. For example, by changing the internally held minimum-size and maximum-size thresholds for extractable frames and running layout analysis again, frames that could not be extracted before the change can be extracted in a single pass.
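The threshold-based correction can be illustrated with a small Python sketch. It assumes that frame candidates are axis-aligned rectangles `(x1, y1, x2, y2)` and that a single minimum and maximum size applies to both width and height; the function name `filter_frames` and the sample values are hypothetical, since the patent does not specify how the thresholds are applied.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), origin at the upper-left corner


def filter_frames(candidates: List[Rect], min_size: int, max_size: int) -> List[Rect]:
    """Keep candidate frames whose width and height both lie within the thresholds."""
    kept = []
    for x1, y1, x2, y2 in candidates:
        width, height = x2 - x1, y2 - y1
        if min_size <= width <= max_size and min_size <= height <= max_size:
            kept.append((x1, y1, x2, y2))
    return kept


# Hypothetical candidates: the second frame is only 15 x 15 pixels.
candidates = [(0, 0, 200, 50), (300, 0, 315, 15)]

first_pass = filter_frames(candidates, min_size=20, max_size=1000)   # small frame is dropped
second_pass = filter_frames(candidates, min_size=10, max_size=1000)  # lowering the threshold recovers it
```

Rerunning the extraction with a relaxed threshold recovers every missed frame of that size at once, which is the advantage over correcting frames one by one.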
After the necessary layout information has been obtained in this way, the CPU performs defined region setting processing (step S400). In this processing, the layout information obtained is presented to the user on a display device such as the display 102, and the user is asked to specify which region to define using a pointing device such as the mouse 104. The user may select one of the extracted frames or, to define a region that has no frame, may drag the mouse to enclose the part where the character string to be read is written. For example, in Fig. 4(d), to make frame 408 a defined region, the user presses the select button 402 and selects frame 408 with the mouse 104.
After the user has specified a defined region, the CPU performs layout information to definition data conversion processing (step S500). When a frame extracted as layout information was selected in the defined region setting processing, the information corresponding to the selected frame is obtained from the layout information table and converted into definition data. When a region without a frame was defined, the enclosing rectangle is treated as a virtual frame and definition data is generated from it. The definition data referred to here means the definition items that can be extracted from layout information, such as the rectangular coordinates of the region to be recognized and the presence or absence of a frame.
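A sketch of this conversion in Python follows. It assumes the layout information table is simply a list of frame rectangles and that a selected frame matches one of them exactly; the function name and dictionary keys are illustrative and not taken from the patent.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def layout_to_definition(region: Rect, frames: List[Rect]) -> Dict[str, object]:
    """Convert a specified region into the layout-derived part of the definition data.

    If the region matches an extracted frame, it is recorded as framed.
    Otherwise the dragged rectangle is treated as a virtual frame.
    """
    x1, y1, x2, y2 = region
    return {
        "top_left": (x1, y1),
        "bottom_right": (x2, y2),
        "has_frame": region in frames,
    }


extracted_frames = [(1200, 100, 1400, 150)]
framed = layout_to_definition((1200, 100, 1400, 150), extracted_frames)
virtual = layout_to_definition((100, 300, 400, 340), extracted_frames)  # region dragged by the user
```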
In addition, if the defined region contains several frames and all of them are judged from their height and width to be single-character boxes, the number of characters can be set from the number of frames. For example, amount columns often contain single-character boxes separated by vertical lines. When such a region is defined, this method can extract definition data such as the rectangular coordinates of the region to be recognized, the presence or absence of a frame, and the number of characters.
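The character count rule can be sketched in Python as below. The aspect-ratio limit of 1.5 and the requirement of at least two inner frames are assumptions chosen for illustration; the patent says only that the judgment is made from the height and width of each frame.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def contains(outer: Rect, inner: Rect) -> bool:
    """True if inner lies entirely within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def is_char_box(frame: Rect, max_aspect: float = 1.5) -> bool:
    """Judge a frame to be a single-character box if it is roughly square."""
    width, height = frame[2] - frame[0], frame[3] - frame[1]
    if width <= 0 or height <= 0:
        return False
    return max(width, height) / min(width, height) <= max_aspect


def count_characters(region: Rect, frames: List[Rect]) -> Optional[int]:
    """Return the number of characters if every inner frame is a single-character box."""
    inner = [f for f in frames if f != region and contains(region, f)]
    if len(inner) >= 2 and all(is_char_box(f) for f in inner):
        return len(inner)
    return None  # leave the number of characters unset


# Hypothetical amount column: eight 30 x 40 boxes separated by vertical lines.
amount_column = (100, 200, 340, 240)
boxes = [(100 + 30 * i, 200, 130 + 30 * i, 240) for i in range(8)]
digits = count_characters(amount_column, boxes + [amount_column])
```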
After this processing the CPU also performs item name to definition data conversion processing (step S600). The details are described later; in this processing, definition data such as the read attribute and the number of characters is extracted by recognizing the characters around the specified defined region.
After obtaining definition data through these processes, the CPU organizes the definition data and then presents it to the user on a display device such as the display 102. If the presented definition data contains errors or items that have not been set, the user corrects the definition data or adds to it (step S700). When definition data is presented, automatically set items and unset items are distinguished by color so that the user can tell them apart easily. Among the automatically set items, those with high ambiguity are likewise distinguished by color to draw the user's attention. This is one example of how definition data can be presented to the user; various other presentation methods are also conceivable.
For example, in Fig. 4(e), the definition data extracted from layout analysis and from the item name is organized and shown in window 409. The user checks the displayed definition data. If all of it is correct, no changes are made; if there are errors, the user corrects the individual items. Pressing the OK button 410 then completes the setting of frame 408 as a defined region. If the region is not to be set as a defined region, pressing the cancel button 411 cancels the setting for the selected frame.
In the example of Fig. 4(d) the layout is a table, so the definition data attributes of the frames in each column all have the same value. For example, the frames below 'bank name' all have the attribute 'bank name', and likewise for 'branch name'. When regions that share the same attributes column by column are to be set as defined regions like this, the defined region copy function makes the definition work efficient (step S800).
For example, in Fig. 5(f), to define all the regions under 'bank name', 'branch name', and 'account number' as defined regions, the user first sets the regions 412 directly below each item in the manner described above. Next, after pressing the copy button 403, the user drags the mouse 104 to enclose the region 413 to be copied to, as shown in Fig. 5(g). Within region 413, the CPU detects the defined regions that have already been set and detects the frames whose height and width equal those of the already-set defined regions 412. In this processing, region 413 is searched upward and downward from each already-set defined region 412 for frames of equal height and width. Next, as shown in Fig. 5(h), the CPU copies the defined attribute values of the already-set regions to the detected frames (414). The defined attribute values referred to here are the definition data other than coordinate information, such as the number of characters and the knowledge dictionary kind. Because coordinate information such as the start and end positions differs for each frame, that part of the definition data is taken from the frame information obtained by layout analysis.
This example described copying along a column, but copying along a row can be implemented in the same way. Alternatively, when frames of equal height and width are detected they can be presented to the user on the display 102, and the user can select with the mouse only the frames to which the defined attributes should be copied.
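The column copy described above can be sketched in Python as follows. The pixel tolerance, the same-column test on the left edge, and the sample attribute values are assumptions for illustration; the patent states only that frames of equal height and width above and below the already-set region receive its attribute values while keeping their own coordinates.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def contains(outer: Rect, inner: Rect) -> bool:
    """True if inner lies entirely within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def copy_attributes(source: Rect, attributes: Dict[str, object], frames: List[Rect],
                    copy_region: Rect, tolerance: int = 2) -> List[Dict[str, object]]:
    """Copy attribute values to same-sized frames in the same column of copy_region."""
    source_w, source_h = source[2] - source[0], source[3] - source[1]
    copies = []
    for frame in frames:
        if frame == source or not contains(copy_region, frame):
            continue
        width, height = frame[2] - frame[0], frame[3] - frame[1]
        same_size = abs(width - source_w) <= tolerance and abs(height - source_h) <= tolerance
        same_column = abs(frame[0] - source[0]) <= tolerance
        if same_size and same_column:
            # Coordinates come from the frame itself; only attribute values are copied.
            copies.append({"top_left": (frame[0], frame[1]),
                           "bottom_right": (frame[2], frame[3]),
                           **attributes})
    return copies


# Hypothetical table column: one frame already defined, two more of equal size below it.
defined = (100, 100, 300, 140)
frames = [defined, (100, 140, 300, 180), (100, 180, 300, 220), (300, 140, 450, 180)]
attrs = {"dictionary_kind": "bank name", "char_count": 10}  # char_count is a made-up value
copied = copy_attributes(defined, attrs, frames, copy_region=(100, 100, 450, 220))
```

Copying along a row would use the same size test with the top edge compared instead of the left edge.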
Through the above processing, the bill definition data that has been set is output (step S900), and the automatic definition data generation processing ends. As described above, the generated bill definition data is stored in the bill processing apparatus and used for character recognition of bills. For example, in Fig. 4, after confirming that all the bill definition data has been set correctly, the user can save it by pressing the save button 404.
C1. Item name to definition data conversion processing
Fig. 6 is a flowchart of the item name to definition data conversion processing 600. In this processing, the frames adjacent to the user-specified defined region in the upward and leftward directions are detected (step S601). Here the CPU refers to the frame information table of the layout information extracted in advance over the whole bill and detects the corresponding frame information. For example, in Fig. 7, when region 705, which reads 'December 1, Heisei 14', is specified as the defined region, the frame adjacent to region 705 is region 706.
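A sketch of the adjacent frame detection of step S601 in Python is given below. It assumes rectangles in image coordinates with the origin at the upper left, and treats a frame as adjacent when it overlaps the region along one axis and its near edge lies within a small gap of the region's left or top edge. The gap value and the coordinates are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), y grows downward


def adjacent_frames(region: Rect, frames: List[Rect], max_gap: int = 10) -> List[Tuple[str, Rect]]:
    """Find frames directly to the left of or directly above the defined region."""
    x1, y1, x2, y2 = region
    found = []
    for frame in frames:
        if frame == region:
            continue
        fx1, fy1, fx2, fy2 = frame
        overlaps_vertically = fy1 < y2 and fy2 > y1
        overlaps_horizontally = fx1 < x2 and fx2 > x1
        if overlaps_vertically and 0 <= x1 - fx2 <= max_gap:
            found.append(("left", frame))
        elif overlaps_horizontally and 0 <= y1 - fy2 <= max_gap:
            found.append(("above", frame))
    return found


# Hypothetical layout: one frame to the left, one above, one to the right of the region.
region = (300, 100, 500, 150)
frames = [(100, 100, 300, 150), (300, 50, 500, 100), (600, 100, 700, 150), region]
neighbours = adjacent_frames(region, frames)
```

Frames to the right of or below the region are ignored, which matches the upward and leftward search described in the text.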
Next, the CPU performs character recognition on the character line inside this adjacent frame (step S602) and checks whether a recognition result was obtained (step S603). Here the CPU refers to the character recognition dictionary 110 described earlier and matches the extracted character line image against characters. It then performs knowledge matching, in which the resulting character string is matched against the item name matching knowledge dictionary 111 and fixed as a word.
For example, in Fig. 7, recognizing the character line 707 in frame 706, which is adjacent to the specified defined region 705, with reference to the character recognition dictionary 110 and the item name matching knowledge dictionary 111 yields the item name recognition result 'designated deposit date'. No recognition result is obtained when there is no adjacent frame, when the adjacent frame contains no character line, or when a character line exists but knowledge matching fails. For example, in Fig. 7, region 701 has no adjacent frame and only the adjacent character line 702 exists. Likewise, region 703 has no adjacent frame, and character line 704 lies inside region 703. When there are two or more adjacent frames and two or more item name recognition results, the one with the higher confidence from character recognition takes priority. In this case the candidates may instead be presented to the user so that the user selects the correct item name.
When an item name recognition result has been obtained from an adjacent frame, the CPU converts the recognized item name into definition data (step S609). In this processing the item name is converted into definition data by referring to the item name to definition data conversion dictionary 112 described earlier. Fig. 8 shows an example of the item name to definition data conversion dictionary 112. Taking 'designated deposit date' in region 706 of Fig. 7 as an example, this item name exists in the conversion dictionary 112, where the corresponding knowledge dictionary kind is 'date' and the number of characters is '12 characters'. In this way definition data is extracted from the item name. The definition data associated with an item name is not limited to the knowledge dictionary kind and the number of characters; various other information, such as the character type, can also be set.
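The conversion of step S609 amounts to a dictionary lookup, sketched below in Python. The single entry reflects the deposit date item of Fig. 7 (written here as `designated deposit date`) with the 'date' kind and 12 characters given in the text; the key spelling, the dictionary name, and the behavior for unknown names are assumptions.

```python
from typing import Dict, Optional

# Illustrative stand-in for the item name to definition data conversion dictionary of Fig. 8.
CONVERSION_DICTIONARY: Dict[str, Dict[str, object]] = {
    "designated deposit date": {"dictionary_kind": "date", "char_count": 12},
}


def item_name_to_definition(item_name: Optional[str]) -> Dict[str, object]:
    """Return the attribute part of the definition data for a recognized item name.

    An empty result means the attributes are left unset for the user to fill in.
    """
    if item_name is None:
        return {}
    return dict(CONVERSION_DICTIONARY.get(item_name, {}))


deposit_date = item_name_to_definition("designated deposit date")
unknown = item_name_to_definition("no such item")
```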
If no item name recognition result is obtained for an adjacent frame in step S603, the character lines inside the specified defined region are extracted. Here the CPU refers to the character line information table of the layout information extracted in advance over the whole bill and detects the character line information that lies within the specified region. Character recognition is performed on the extracted character line (step S604), and the CPU checks whether a recognition result was obtained (step S605). As above, the CPU performs character recognition using the character recognition dictionary 110 and the item name matching knowledge dictionary 111.
For example, in Fig. 7, when region 703, which contains the text 'request date', is specified as the defined region, recognizing the character string 704 inside it with reference to the character recognition dictionary 110 and the item name matching knowledge dictionary 111 yields the item name recognition result 'request date'. When an item name recognition result has been obtained from the inner character line, the CPU converts the recognized item name into definition data (step S609).
If no item name recognition result is obtained from the inner character line, the character lines adjacent to the specified defined region in the upward and leftward directions are detected (step S606). Here the CPU refers to the frame information table of the layout information extracted in advance over the whole bill and detects the corresponding character line information. For example, in Fig. 7, when region 701, which reads '_ _ _ _ Your Excellency', is specified as the defined region, the character line adjacent to region 701 is region 702.
Next, the CPU performs character recognition on this adjacent character line (step S607) and checks whether a recognition result was obtained (step S608). As above, the CPU performs character recognition using the character recognition dictionary 110 and the item name matching knowledge dictionary 111. For example, in Fig. 7, recognizing the character line 702 adjacent to the specified defined region 701 with reference to the character recognition dictionary 110 and the item name matching knowledge dictionary 111 yields the item name recognition result 'requester'.
When an item name recognition result has been obtained from the adjacent character line, the CPU converts the recognized item name into definition data (step S609). If no result is obtained, the specified defined region is treated as a region with no item name, and processing ends with definition data such as the knowledge dictionary kind and the number of characters left unset.
The CPU performs the above processing for every specified defined region. In the item name extraction described here, priority is given in the following order: character line in an adjacent frame, character line inside the specified defined region, adjacent character line. This priority may, however, be changed according to the kind of bill. It is also possible to use only one of the three kinds of character line, for example only the character line in an adjacent frame. In this way, for bills in which the item name appears only in limited positions, item name extraction and definition data generation can be performed more accurately.
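The priority order of Fig. 6 can be sketched in Python as a simple fallback chain. The sketch assumes that recognition has already produced, for each source, a list of `(item name, confidence)` pairs; the source labels, the function name, and the sample confidences are hypothetical. The `order` parameter reflects the remark that the priority can be changed or restricted per bill kind.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Candidate = Tuple[str, float]  # (recognized item name, recognition confidence)

# Default priority: adjacent frame, then inside the region, then adjacent character line.
DEFAULT_ORDER = ("adjacent_frame", "inside_region", "adjacent_line")


def choose_item_name(results: Dict[str, List[Candidate]],
                     order: Sequence[str] = DEFAULT_ORDER) -> Optional[str]:
    """Return the item name from the first source that produced a recognition result.

    When a source produced several results, the one with the highest confidence wins.
    Returns None when no source produced a result (region without an item name).
    """
    for source in order:
        candidates = results.get(source, [])
        if candidates:
            name, _confidence = max(candidates, key=lambda c: c[1])
            return name
    return None


# Illustrative inputs modeled on the three cases of Fig. 7; confidences are made up.
from_adjacent_frame = choose_item_name(
    {"adjacent_frame": [("designated deposit date", 0.92), ("deposit", 0.40)]})
from_adjacent_line = choose_item_name({"adjacent_line": [("requester", 0.85)]})
frames_only = choose_item_name({"inside_region": [("request date", 0.90)]},
                               order=("adjacent_frame",))
```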
As described above, the disclosed technique automates the generation of bill definition data as far as possible, and the parts that cannot be automated are handled through limited manual intervention, so that bill definition data generation can be assisted more smoothly.
When the user corrects or adds definition data, automatically set items and unset items are distinguished by color on the display device, giving a screen that is easy for the user to understand. Among the automatically set items, those with high ambiguity are likewise distinguished by color to draw the user's attention.
Furthermore, bill definition data can be generated automatically for a wide variety of bills, regardless of whether the bill is of a specific type and whether or not it has been filled in.
It goes without saying that the disclosed technique is not limited to the embodiment above and can take various forms without departing from its gist. For example, the control processing described above may be implemented in hardware rather than software. The bill definition data generation may also be carried out by a character recognition device forming part of the bill processing apparatus.
According to the disclosed technique, bill definition data can be generated automatically for a wide variety of bills, regardless of whether the bill is of a specific type and whether or not it has been filled in.

Claims (10)

1. a bill definition of data generation method is characterized in that: the view data that obtains bill; From this view data, extract the layout information of Word message; From with the corresponding above-mentioned layout information of specified defined range extract 1st definition of data relevant with the position of this defined range; Identification is present in above-mentioned defined range periphery or inner Word message; This recognition result is transformed into 2nd definition of data relevant with the attribute of this defined range.
2. bill definition of data generation method as claimed in claim 1 is characterized in that: near the existence of the above-mentioned Word message of inspection above-mentioned defined range; The result who checks does not check out the occasion of the existence of Word message near above-mentioned defined range, in the existence of the internal check Word message of this defined range; The result who checks does not check out the occasion of the existence of Word message yet in the inside of above-mentioned defined range, check the last direction of this defined range and left to the existence of position Word message.
3. The bill definition data generation method as claimed in claim 1, characterized by: when the defined ranges are continuous along the column direction, extracting, from the layout information corresponding to each defined range, 1st definition data relating to the position of each defined range; and copying the 2nd definition data as the 2nd definition data relating to the attribute of each defined range.
4. The bill definition data generation method as claimed in claim 1, characterized by: when the layout information is erroneous, extracting the layout information again according to correction information.
5. The bill definition data generation method as claimed in claim 1, characterized by: determining whether character boxes are present from the aspect ratio of each frame in the defined range, obtained from the layout information corresponding to the defined range; and, when character boxes are determined to be present, counting the character boxes and extracting definition data for the number of characters.
6. A bill processing apparatus that generates, from image data of a bill, definition data used in character recognition processing of entered content, characterized by comprising: a device for obtaining image data of a bill; a device for extracting, from the image data, layout analysis information such as frames, ruled lines, and character lines; a device for extracting, from the layout analysis information corresponding to a designated defined range, definition data relating to the position of the defined range; a device for extracting the item name of the defined range from the frames and character lines present around or inside the defined range; a device for performing character recognition on the item name; a device for collating the recognition result obtained by the character recognition processing against an item name dictionary; a device for converting the item name obtained from the collation result into definition data representing the attribute of the defined range; and a device for organizing the definition data and outputting it to a bill definition data file.
7. The bill processing apparatus as claimed in claim 6, characterized by comprising: a device that, when the layout analysis information is erroneous, corrects the layout analysis information such as ruled lines or frames by performing the layout analysis processing again according to layout analysis correction information.
8. A bill processing apparatus having an image input device that reads a bill to obtain image data, and a character recognition device that performs character recognition on the image data from the image input device, characterized in that: the character recognition device extracts layout information of character information from the image data from the image input device; extracts, from the layout information corresponding to a designated defined range, 1st definition data relating to the position of the defined range; recognizes character information present around or inside the defined range; converts the recognition result into 2nd definition data relating to the attribute of the defined range; and stores the 2nd definition data together with the 1st definition data.
9. The bill processing apparatus as claimed in claim 8, characterized in that: when the defined ranges are continuous along the column direction, the character recognition device extracts, from the layout information corresponding to each defined range, 1st definition data relating to the position of each defined range, and copies the 2nd definition data as the 2nd definition data relating to the attribute of each defined range.
10. The bill processing apparatus as claimed in claim 8, characterized in that: the character recognition device determines whether character boxes are present from the aspect ratio of each frame in the defined range, obtained from the layout information corresponding to the defined range, and, when character boxes are determined to be present, counts the character boxes and extracts definition data for the number of characters.
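The steps recited in claims 1, 2, and 5 can be sketched roughly as follows. This is an illustrative sketch only, not the applicant's implementation: the `Box` and `TextLine` types, the pixel `margin` that decides what counts as "in the vicinity", the aspect-ratio limits for a character box, and the contents of `ITEM_NAME_DICTIONARY` are all assumptions, and the recognized text is supplied directly instead of coming from a character recognition engine.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in image pixels (hypothetical representation)."""
    x: int
    y: int
    w: int
    h: int

    def contains(self, o: "Box") -> bool:
        return (self.x <= o.x and self.y <= o.y
                and o.x + o.w <= self.x + self.w
                and o.y + o.h <= self.y + self.h)


@dataclass(frozen=True)
class TextLine:
    box: Box
    text: str  # stand-in for a character recognition result


# Hypothetical item name dictionary: recognized item name -> attribute.
ITEM_NAME_DICTIONARY: Dict[str, str] = {
    "Name": "kanji", "Amount": "numeric", "Date": "date"}


def gap(a: Box, b: Box) -> int:
    """Distance between two boxes; 0 if they touch or overlap."""
    dx = max(b.x - (a.x + a.w), a.x - (b.x + b.w), 0)
    dy = max(b.y - (a.y + a.h), a.y - (b.y + b.h), 0)
    return max(dx, dy)


def find_label(region: Box, lines: List[TextLine],
               margin: int = 20) -> Optional[TextLine]:
    """Search order of claim 2: vicinity, then inside, then above or left."""
    outside = [t for t in lines if not region.contains(t.box)]
    near = [t for t in outside if gap(region, t.box) <= margin]
    if near:
        return min(near, key=lambda t: gap(region, t.box))
    inside = [t for t in lines if region.contains(t.box)]
    if inside:
        return inside[0]

    def above_or_left(t: TextLine) -> bool:
        b = t.box
        above = (b.y + b.h <= region.y
                 and b.x < region.x + region.w and region.x < b.x + b.w)
        left = (b.x + b.w <= region.x
                and b.y < region.y + region.h and region.y < b.y + b.h)
        return above or left

    far = [t for t in outside if above_or_left(t)]
    return min(far, key=lambda t: gap(region, t.box)) if far else None


def count_character_boxes(region: Box, frames: List[Box],
                          lo: float = 0.5, hi: float = 1.5) -> int:
    """Count frames inside the region whose aspect ratio looks like one character."""
    return sum(1 for f in frames
               if f != region and f.h > 0 and region.contains(f)
               and lo <= f.w / f.h <= hi)


def define_region(region: Box, lines: List[TextLine], frames: List[Box]) -> dict:
    label = find_label(region, lines)
    attribute = ITEM_NAME_DICTIONARY.get(label.text) if label else None
    return {
        "position": (region.x, region.y, region.w, region.h),  # 1st definition data
        "attribute": attribute,                                # 2nd definition data
        "char_count": count_character_boxes(region, frames) or None,
    }
```

For example, a defined range at `Box(100, 100, 200, 40)` with an "Amount" text line 10 pixels to its left and five 40 by 40 frames inside it would yield the attribute `"numeric"` and a character count of 5 under these assumptions.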
CNA2004100006610A 2003-02-24 2004-01-15 Bill definition data generating method and bill processing apparatus Pending CN1525378A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2003045406A JP4183527B2 (en) 2003-02-24 2003-02-24 Form definition data creation method and form processing apparatus
JP2003045406 2003-02-24

Publications (1)

Publication Number Publication Date
CN1525378A true CN1525378A (en) 2004-09-01

Family

ID=33112215

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2004100006610A Pending CN1525378A (en) 2003-02-24 2004-01-15 Bill definition data generating method and bill processing apparatus

Country Status (4)

Country Link
JP (1) JP4183527B2 (en)
KR (1) KR100570224B1 (en)
CN (1) CN1525378A (en)
TW (1) TW200416583A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102262616A (en) * 2010-05-24 2011-11-30 株式会社Pfu Form processing system, OCR device, and form creation device
CN102262615A (en) * 2010-05-24 2011-11-30 株式会社Pfu Device, Method, And Computer Readable Medium For Creating Forms
CN102331914A (en) * 2010-05-24 2012-01-25 株式会社Pfu Form processing system, OCR device, form creation device, and form processing method
CN102331913A (en) * 2010-05-24 2012-01-25 株式会社Pfu Form processing system, form creation device, and form processing method
CN101464951B (en) * 2007-12-21 2012-05-30 北大方正集团有限公司 Image recognition method and system
CN102591596A (en) * 2010-10-12 2012-07-18 株式会社Pfu Information processing equipment, and information processing method
CN103092625A (en) * 2013-01-28 2013-05-08 中国航空结算有限责任公司 Method and device used for processing civil aviation passenger transport passenger ticket purchase certificate data and based on .NET Framework platform
CN104391830A (en) * 2014-10-24 2015-03-04 华迪计算机集团有限公司 Method and device for dynamic layout of bill page
CN107533651A (en) * 2015-05-11 2018-01-02 株式会社东芝 Identification device, recognition methods and program
CN111931473A (en) * 2019-05-13 2020-11-13 阿里巴巴集团控股有限公司 Bill processing method and device

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4973063B2 (en) * 2006-08-14 2012-07-11 富士通株式会社 Table data processing method and apparatus
JP5556524B2 (en) 2010-09-13 2014-07-23 株式会社リコー Form processing apparatus, form processing method, form processing program, and recording medium recording the program
JP2013109690A (en) * 2011-11-24 2013-06-06 Oki Electric Ind Co Ltd Business form data input device, and business form data input method
WO2014061081A1 (en) * 2012-10-15 2014-04-24 富士通株式会社 Form creation assistance device, form creation assistance method, and form creation assistance program
CN102930174B (en) * 2012-11-20 2015-07-01 江苏省疾病预防控制中心 System and method for acquiring residential health information
JP6109688B2 (en) * 2013-09-06 2017-04-05 株式会社東芝 Form reader and program
JP7235269B2 (en) * 2017-03-13 2023-03-08 日本電気株式会社 Data item name estimation device, data item name estimation program, and data item name estimation method
JP6445645B1 (en) * 2017-09-21 2018-12-26 株式会社東芝 Form information recognition apparatus and form information recognition method
CN109634606A (en) * 2018-12-10 2019-04-16 山东浪潮通软信息科技有限公司 A kind of method and device of defined function menu
JP7259468B2 (en) 2019-03-25 2023-04-18 富士フイルムビジネスイノベーション株式会社 Information processing device and program
JP2020167618A (en) * 2019-03-29 2020-10-08 キヤノン株式会社 Image processing apparatus, method for controlling the same, and program
JP7468004B2 (en) 2020-03-11 2024-04-16 富士フイルムビジネスイノベーション株式会社 Document processing device and program

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101464951B (en) * 2007-12-21 2012-05-30 北大方正集团有限公司 Image recognition method and system
US9047265B2 (en) 2010-05-24 2015-06-02 Pfu Limited Device, method, and computer readable medium for creating forms
CN102262615A (en) * 2010-05-24 2011-11-30 株式会社Pfu Device, Method, And Computer Readable Medium For Creating Forms
CN102331914A (en) * 2010-05-24 2012-01-25 株式会社Pfu Form processing system, OCR device, form creation device, and form processing method
CN102331913A (en) * 2010-05-24 2012-01-25 株式会社Pfu Form processing system, form creation device, and form processing method
CN102262616A (en) * 2010-05-24 2011-11-30 株式会社Pfu Form processing system, OCR device, and form creation device
US9274732B2 (en) 2010-05-24 2016-03-01 Pfu Limited Form processing system, form creation device, and computer readable medium
CN102591596A (en) * 2010-10-12 2012-07-18 株式会社Pfu Information processing equipment, and information processing method
CN103092625A (en) * 2013-01-28 2013-05-08 中国航空结算有限责任公司 Method and device used for processing civil aviation passenger transport passenger ticket purchase certificate data and based on .NET Framework platform
CN103092625B (en) * 2013-01-28 2016-01-20 中国航空结算有限责任公司 A kind of method and apparatus of the process civil aviation passenger transport passenger ticket ticket data based on .NET Framework platform
CN104391830A (en) * 2014-10-24 2015-03-04 华迪计算机集团有限公司 Method and device for dynamic layout of bill page
CN107533651A (en) * 2015-05-11 2018-01-02 株式会社东芝 Identification device, recognition methods and program
CN107533651B (en) * 2015-05-11 2021-05-04 株式会社东芝 Identification device, identification method, and computer-readable recording medium
CN111931473A (en) * 2019-05-13 2020-11-13 阿里巴巴集团控股有限公司 Bill processing method and device

Also Published As

Publication number Publication date
TW200416583A (en) 2004-09-01
JP2004258706A (en) 2004-09-16
JP4183527B2 (en) 2008-11-19
KR20040078046A (en) 2004-09-08
KR100570224B1 (en) 2006-04-11

Similar Documents

Publication Publication Date Title
CN1525378A (en) Bill definition data generating method and bill processing apparatus
US10824801B2 (en) Interactively predicting fields in a form
US9613267B2 (en) Method and system of extracting label:value data from a document
US8107727B2 (en) Document processing apparatus, document processing method, and computer program product
US8467614B2 (en) Method for processing optical character recognition (OCR) data, wherein the output comprises visually impaired character images
US20040139391A1 (en) Integration of handwritten annotations into an electronic original
KR20060044691A (en) Method and apparatus for populating electronic forms from scanned documents
CN114299528B (en) Information extraction and structuring method for scanned document
JP4785655B2 (en) Document processing apparatus and document processing method
CN106250830A (en) Digital book structured analysis processing method
US20070002054A1 (en) Method of identifying semantic units in an electronic document
JP2006277167A (en) Annotation data processing program, system and method
JP2002352191A (en) Printing control interface system and method having handwriting discrimination capability
WO2007117334A2 (en) Document analysis system for integration of paper records into a searchable electronic database
CN101206639A (en) Method for indexing complex impression based on PDF
JP2010157107A (en) Business document processor
CN115828874A (en) Industry table digital processing method based on image recognition technology
JP2019204399A (en) Information processing device and program
JP2006221569A (en) Document processing system, document processing method, program, and storage medium
JP2008129793A (en) Document processing system, apparatus and method, and recording medium with program recorded thereon
CN102883085B (en) Image processing apparatus and image processing method
JP2008108114A (en) Document processor and document processing method
JP2007041709A (en) Document processing system, control method of document processing system, document processing device, computer program and computer readable storage medium
JP4807618B2 (en) Image processing apparatus and image processing program
JPH08320914A (en) Table recognition method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: HITACHI OMRON FINANCIAL SYSTEMS LTD.

Free format text: FORMER OWNER: HITACHI CO., LTD.

Effective date: 20060512

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20060512

Address after: Tokyo, Japan

Applicant after: Hitachi Omron Financial System Co., Ltd.

Address before: Tokyo, Japan

Applicant before: Hitachi Ltd.

C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Open date: 20040901