JP3515586B2 - Document processing method and apparatus - Google Patents

Document processing method and apparatus

Info

Publication number
JP3515586B2
Authority
JP
Japan
Prior art keywords
document
character string
step
type
predetermined character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP27885792A
Other languages
Japanese (ja)
Other versions
JPH06131225A (en)
Inventor
順一 青江
Original Assignee
株式会社ジャストシステム
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社ジャストシステム
Priority to JP27885792A
Publication of JPH06131225A
Application granted
Publication of JP3515586B2
Anticipated expiration
Application status is Expired - Lifetime

Description

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a document processing method and apparatus. More specifically, it relates to a document processing method and apparatus for classifying created documents according to predetermined rules.

2. Description of the Related Art

Generally, in a document editing apparatus such as a word processor, a created or edited document is stored in a storage device such as a floppy disk or a hard disk. Documents are saved for various reasons, for example to make re-editing easier when a document is incomplete, or to create another document based on one that already exists.

[0004] A storage device can store a large number of documents, as far as its capacity allows.
Therefore, to quickly find a target document among the stored documents, users have saved each document under a file name that best represents its characteristics.

[0005] However, file names are chosen at each user's own discretion, and in practice it is quite difficult to assign appropriate names consistently. Managing documents per floppy disk or per directory is another approach, but this also depends on the user. On an apparatus shared by several people, the management method differs from user to user, so the problem remains.

SUMMARY OF THE INVENTION

The present invention has been made in view of the above prior art, and aims to provide a document processing method and apparatus that classify document files without requiring special attention from the user and thereby make them easier to manage.

To achieve this object, a document processing method according to the present invention comprises: an input step of inputting a document; a detecting step of detecting a predetermined character string in the document input in the input step, together with the position of that character string in the document, the position including information on the line in which it exists; a document type determining step of determining the type of the document based on the combination of the character string detected in the detecting step and its position in the document; and a registration step of classifying and registering the document based on the type determined in the document type determining step.

Another document processing method according to the present invention comprises the same input step, detecting step, and document type determining step, and a type information adding step of adding information on the type determined in the document type determining step to the document.

A document processing apparatus according to the present invention comprises: input means for receiving an input of a document; detecting means for detecting a predetermined character string in the document received by the input means, together with the position of that character string in the document, the position including information on the line in which it exists; document type determining means for determining the type of the document based on the combination of the character string detected by the detecting means and its position in the document; and registration means for classifying and registering the document based on the type determined by the document type determining means.

According to another aspect of the present invention, a document processing apparatus comprises the same input means, detecting means, and document type determining means, and type information adding means for adding information on the type determined by the document type determining means to the document.

In the document processing method and apparatus according to the present invention, the input document information is examined for character strings that are significant for identifying the type of the document, and, where such a string is found, its position is detected. The document information is then classified based on the detection result.

An embodiment of the present invention will be described below in detail with reference to the accompanying drawings.

FIG. 1 shows the block configuration of a document processing apparatus according to the embodiment. In the figure, 1 is a CPU that controls the entire apparatus; 2 is a ROM that stores a boot program; 3 is a RAM that stores the programs related to document editing and the document being edited; and 4 is a keyboard for inputting characters and various commands. 5 is an external storage device (for example, a hard disk device or a floppy disk device) that stores the OS of the apparatus, the above programs, document files, the kana-kanji conversion dictionary, and the document classification analysis table described later. 6 is a VRAM into which the characters to be displayed are expanded, and 7 is a display device that displays the characters and other content expanded in the VRAM 6.

With this configuration, when a document created or edited on the apparatus is stored in the external storage device 5, the embodiment determines the type of the document and stores the document with the determination result added.

The principle of determining the document type is as follows. A user is of course free to create any kind of document on this type of apparatus, but in practice many of the documents created and edited are letters and papers.

[0017] A letter, particularly one addressed to an individual, often begins with "Dear Sirs" or "Abbreviation" (or with "Dear" in the case of English text). A business letter typically begins with the date, followed by the name of the recipient's company, then the name of the sender's company, and only then "Dear Sirs" or a similar salutation. A paper usually begins with a title and later contains a heading such as "reference" or simply "document", followed by the names of the cited documents.

From the above, when a character string such as "Dear Sirs" is at the head of a document, the document can be determined to be a personal letter (a letter from one individual to another).
If the same character string, such as "Dear Sirs", is located in the middle of the document (at least not at the head), the document can be determined to be a business letter. When the character string "document" exists at an intermediate position, the document can be recognized as a paper.

Therefore, the embodiment determines the type of a document based on a character string that is significant for identifying the document type and on the position of that character string. The determination result is then added to the document to be stored.

To determine the document type, the embodiment stores the document classification analysis table shown in FIG. 2 in the external storage device 5. Briefly, the table indicates that a document in which the character string "Dear Sirs" or "Abbreviation" is located at the head is a personal letter, and that a document in which the character string "document" is located in the middle is a paper.

In the embodiment, this classification processing is executed when a document is stored in the external storage device 5. The operation is described below with reference to the flowchart of the document storage procedure. The flowchart shows a routine that is called when an instruction to save a document is given.

First, in step S1, it is determined whether data to be stored exists in the RAM 3. If there is none, the storage request is treated as invalid and the process returns to the main process. If there is data to be stored, the flow advances to step S2, where one classification character string is taken from the document classification analysis table (the first one, initially) and searched for in the document.
In step S3, it is determined whether the search found the character string. Since the search technique itself is publicly known, its description is omitted here. If the string was found, the flow advances to step S4, where the classification character string and the position at which it was found (the line number and so on) are stored in a predetermined area of the RAM 3, and then proceeds to step S5.

In step S5, it is determined whether all classification character strings in the document classification analysis table have been searched for. If not, the process returns to step S2 and the above processing is repeated for the next classification character string.

Because a single search detects both a classification character string and its position, a duplicated classification character string is not searched for again. For example, once "Dear Sirs" has been searched for on behalf of the personal letter entry in FIG. 2, the same string is not searched for again on behalf of the business letter entry. For this purpose, a separate table containing only the classification character strings is provided, as shown in FIG. 3. That is, the search in step S2 is performed only over the entries of this classification character string table.

When all classification character strings have been searched for in this way, the detected classification character strings and their positions are stored in the predetermined area of the RAM 3.
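The search loop of steps S2 to S5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the table contents, the function name `detect_strings`, the use of plain substring matching, and the 1-based line numbering are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for the classification character string table (FIG. 3):
# each string is listed once, even if several document types refer to it.
CLASSIFICATION_STRINGS: List[str] = ["Dear Sirs", "Abbreviation", "document"]


def detect_strings(document: str, strings: List[str]) -> Dict[str, List[int]]:
    """Return {classification string: [1-based line numbers where it occurs]}."""
    lines = document.splitlines()
    found: Dict[str, List[int]] = {}
    for s in strings:  # steps S2 and S5: take each table entry in turn
        hits = [n for n, line in enumerate(lines, start=1) if s in line]
        if hits:  # step S3: was the string found?
            found[s] = hits  # step S4: keep the string and its positions
    return found


print(detect_strings("Dear Sirs\nThank you for your letter.\n", CLASSIFICATION_STRINGS))
# prints {'Dear Sirs': [1]}
```

Because each string is searched for once and all of its line numbers are kept, the later classification step can test both the "head" and the "middle" conditions against the same result, which mirrors the reason the text gives for having a separate string table.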
However, the stored result is not necessarily exactly one set; there may be two or more sets, or none at all.

In step S6, the document is classified based on the classification character strings found by the search and their positions. If none of the classification character strings shown in FIG. 2 exists in the document, the document is regarded as different from the usual document types and is assigned the classification "Other". If two or more classification character strings are found, for example when the document matches both a personal letter and a paper, the classification "classification impossible" is assigned. If there is only one classification character string, the document is classified based on its position. In every case, this processing assigns a classification to the document to be stored.

In step S7, the classification name determined by the classification processing is added at a predetermined position of the document data (in the format control information), and the document is stored in the external storage device 5.

Next, the document reading process in the document processing apparatus of the embodiment will be described. To edit a stored document file again, the file must of course be read into the RAM 3. In the embodiment, when a list of document files to be read is displayed on the screen, the operator specifies the classification of the file to be read. For example, when "business letter" is specified, only files of that classification are listed, which makes the target document easier to find.

The operation is described with reference to the flowchart of the document file list display. First, in step S11, a display target is specified. The display target here means a drive name, such as that of a floppy disk, or a directory name. In step S12, a classification name is specified.
Various designation methods are conceivable; here, a list of classifications is displayed and one or more of them are selected. If several are selected, processing proceeds as if their logical sum (OR) had been specified.

In step S13, the predetermined position of one document file in the target specified in step S11 is examined and the classification of the document is extracted. In step S14, it is determined whether the classification of the file currently under examination is a specified classification. If so, the process proceeds to step S15 and the document file name is displayed on the screen.

In step S16, it is determined whether all document files in the specified target have been processed. If not, the process returns to step S13. Thus, when the result of step S16 becomes "yes", a list of the document files of the specified classification is shown on the display screen. The operator then selects the desired document file, and it is read into the RAM 3.

As described above, the embodiment can automatically classify documents based on specific character strings and their positions.

The document classification analysis table (see FIG. 2) and the classification character string table (see FIG. 3) shown in the embodiment are presented in an easy-to-follow form, and the user can freely change or add entries (the change or addition may be made through ordinary document editing, or when a special command is issued). For example, to classify business letters with a higher identification rate, a document may be classified as a business letter (or a report) if the character string "× △ □, Inc." exists before the x-th line from the head. Business letters can also be classified per recipient company. Internal documents can likewise be classified adequately by handling character strings such as "Regulations" and "circulation".

When various classification character strings are registered in this way, the processing of step S4 is expected to produce several sets of classification character strings and position information in the RAM 3. Priorities may therefore be assigned in the document classification analysis table of FIG. 2 so that classification does not become impossible even when several sets occur. Alternatively, instead of allowing only one classification per document file, several may be allowed; that is, a document file may carry multiple classifications such as "business letter" and "circulation".

<Explanation of Second Embodiment>

In the above embodiment (first embodiment), the type of a document being edited is determined when the document is stored, and the result is stored together with the document data. At display time, the added classification information is used to switch each file between display and non-display. However, the present invention is not limited to this. For example, the document files in a directory may be classified collectively.

As an example, consider copying a plurality of document files from floppy disk A to floppy disk B.
In the example described below, a directory is created on floppy disk B for each classification name, and document files of the same classification are stored and managed in the same directory. Note that floppy disks A and B here denote logical devices, a concept that includes two physically different drives. In other words, to tidy up the documents in a given directory, it suffices to make the input target and the output target the same.

First, in steps S21 and S22, the input target and the output target are specified, respectively. In step S23, one file is read from the input target and loaded into the RAM 3. In step S24, the document data loaded into the RAM 3 is examined and the document is classified. The classification process itself is the same as in the first embodiment. However, in the second embodiment the document types are managed per directory, as described below, so the classification name need not be added to the document data itself.

When classification of the read document file is complete, step S25 determines whether a directory with the corresponding classification name exists in the output target. If it exists, the process proceeds to step S27 and the read document file is stored under that directory. If it does not exist, the subdirectory is created in the output target and the process then proceeds to step S27.

When the classification and output of one document file are complete, the process returns to step S23, and the processing is repeated until all input document files have been handled.

Incidentally, if the targets specified in steps S21 and S22 are logically the same, each document file is stored after classification and the copy that existed before classification is then deleted.

Furthermore, the first and second embodiments have been described with a single character string that is significant for classification, but two or more character strings may be combined. In other words, a rule of the form "character string A is at position a and character string B is at position b" allows classification with a higher identification rate.

As described above, according to the present invention, document files can be classified without special attention from the user, and their management is made easier.
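The classification of step S6 and the directory-based sorting of the second embodiment (steps S23 to S27) can be sketched together in Python. This is a hedged illustration only: the `RULES` table, the function names `classify` and `sort_into_directories`, and the interpretation of "head" as the first non-blank line are assumptions, not details taken from the patent.

```python
import shutil
from pathlib import Path
from typing import Dict, List, Set, Tuple

# Hypothetical analysis table in the spirit of FIG. 2:
# (classification string, required position, document type).
RULES: List[Tuple[str, str, str]] = [
    ("Dear Sirs", "head", "personal letter"),
    ("Dear Sirs", "middle", "business letter"),
    ("document", "middle", "paper"),
]


def classify(text: str) -> str:
    """Classify a document from the strings it contains and where they occur."""
    lines = text.splitlines()
    # Assumption: "head" is the first non-blank line, anything later is "middle".
    head = next((n for n, line in enumerate(lines, start=1) if line.strip()), 0)
    types: Set[str] = set()
    for string, position, doc_type in RULES:
        for number, line in enumerate(lines, start=1):
            if string in line:
                where = "head" if number == head else "middle"
                if where == position:
                    types.add(doc_type)
    if not types:
        return "Other"  # no classification string found
    if len(types) > 1:
        return "classification impossible"  # conflicting matches
    return types.pop()


def sort_into_directories(src: Path, dst: Path) -> Dict[str, str]:
    """Copy each file in src into dst/<type>/ and return {file name: type}."""
    result: Dict[str, str] = {}
    same_target = src.resolve() == dst.resolve()
    for path in sorted(src.iterdir()):  # S23: handle one file at a time
        if not path.is_file():
            continue
        doc_type = classify(path.read_text(encoding="utf-8"))  # S24
        target_dir = dst / doc_type
        target_dir.mkdir(parents=True, exist_ok=True)  # S25: create if missing
        shutil.copy2(path, target_dir / path.name)  # S27: store under the directory
        if same_target:
            path.unlink()  # same input and output: drop the unclassified original
        result[path.name] = doc_type
    return result
```

Calling `sort_into_directories(Path("A"), Path("B"))` on two hypothetical directories would leave the files in `A` untouched and create one subdirectory per classification under `B`. Passing the same directory twice tidies it in place, as the text describes for logically identical targets. A priority column or a multi-label return value, both mentioned in the text as variations, could replace the "classification impossible" branch.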

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a document processing apparatus according to an embodiment.
FIG. 2 is a diagram showing the contents of the document classification analysis table in the embodiment.
FIG. 3 is a diagram showing the contents of the classification character string table in the embodiment.
FIG. 4 is a flowchart illustrating the processing procedure when a document is stored in the embodiment.
FIG. 5 is a flowchart illustrating the processing procedure for displaying a document file list in the embodiment.
FIG. 6 is a flowchart relating to document classification in the second embodiment.

[Description of Signs]
1 CPU
2 ROM
3 RAM
4 Keyboard
5 External storage device
6 VRAM
7 Display device

Continuation of the front page

(58) Fields surveyed (Int. Cl. 7, DB name): G06F 12/00, G06F 17/20-17/26, G06F 17/30

Claims (1)

1. A document processing method comprising: an input step of inputting a document; a detecting step of detecting a predetermined character string in the document input in the input step and the position of the predetermined character string in the document, the position including information on the line in which the character string exists; a document type determining step of determining the type of the document based on a combination of the predetermined character string detected in the detecting step and the position of the predetermined character string in the document; and a registration step of classifying and registering the document based on the type determined in the document type determining step.

2. A document processing method comprising: an input step of inputting a document; a detecting step of detecting a predetermined character string in the document input in the input step and the position of the predetermined character string in the document, the position including information on the line in which the character string exists; a document type determining step of determining the type of the document based on a combination of the predetermined character string detected in the detecting step and the position of the predetermined character string in the document; and a type information adding step of adding information on the type determined in the document type determining step to the document.

3. A document processing apparatus comprising: input means for receiving an input of a document; detecting means for detecting a predetermined character string in the document received by the input means and the position of the predetermined character string in the document, the position including information on the line in which the character string exists; document type determining means for determining the type of the document based on a combination of the predetermined character string detected by the detecting means and the position of the predetermined character string in the document; and registration means for classifying and registering the document based on the type determined by the document type determining means.

4. A document processing apparatus comprising: input means for receiving an input of a document; detecting means for detecting a predetermined character string in the document received by the input means and the position of the predetermined character string in the document, the position including information on the line in which the character string exists; document type determining means for determining the type of the document based on a combination of the predetermined character string detected by the detecting means and the position of the predetermined character string in the document; and type information adding means for adding information on the type determined by the document type determining means to the document.
JP27885792A 1992-10-16 1992-10-16 Document processing method and apparatus Expired - Lifetime JP3515586B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP27885792A JP3515586B2 (en) 1992-10-16 1992-10-16 Document processing method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP27885792A JP3515586B2 (en) 1992-10-16 1992-10-16 Document processing method and apparatus

Publications (2)

Publication Number Publication Date
JPH06131225A JPH06131225A (en) 1994-05-13
JP3515586B2 true JP3515586B2 (en) 2004-04-05

Family

ID=17603105

Family Applications (1)

Application Number Title Priority Date Filing Date
JP27885792A Expired - Lifetime JP3515586B2 (en) 1992-10-16 1992-10-16 Document processing method and apparatus

Country Status (1)

Country Link
JP (1) JP3515586B2 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002222083A (en) 2001-01-29 2002-08-09 Fujitsu Ltd Device and method for instance storage
JP2005115628A (en) * 2003-10-07 2005-04-28 Hewlett-Packard Development Co Lp Document classification apparatus using stereotyped expression, method, program
JP4747591B2 (en) * 2005-01-31 2011-08-17 日本電気株式会社 Confidential document retrieval system, confidential document retrieval method, and confidential document retrieval program
JP5110306B2 (en) * 2007-12-05 2012-12-26 日本電気株式会社 Communication limit system, communication limit device, communication limit method, and communication limit program
JP2015212907A (en) * 2014-05-07 2015-11-26 株式会社リコー Output system, terminal device, program and output method

Also Published As

Publication number Publication date
JPH06131225A (en) 1994-05-13

Similar Documents

Publication Publication Date Title
US6035282A (en) Information processing apparatus and method utilizing useful additional information packet
US5708806A (en) Data processing system and method for generating a representation for and for representing electronically published structured documents
EP0810534B1 (en) Document display system and electronic dictionary
US8107727B2 (en) Document processing apparatus, document processing method, and computer program product
EP0464306B1 (en) Structured document tags invoking specialized functions
CN1205573C (en) Method and apparatus for synchronizing, displaying and manipulating text and image documents
US5404435A (en) Non-text object storage and retrieval
US7143349B2 (en) Document processing system
US5331547A (en) Process and computer system for control of interface software and data files
EP1074925B1 (en) Document management system, information processing apparatus, document management method and computer-readable recording medium
US5832476A (en) Document searching method using forward and backward citation tables
US20060248089A1 (en) Storing and retrieving the visual form of data
US7487190B2 (en) Automated identification and marking of new and changed content in a structured document
US7797622B2 (en) Versatile page number detector
US7110939B2 (en) Process of automatically generating translation-example dictionary, program product, computer-readable recording medium and apparatus for performing thereof
US4868733A (en) Document filing system with knowledge-base network of concept interconnected by generic, subsumption, and superclass relations
US7072983B1 (en) Scheme for systemically registering meta-data with respect to various types of data
JP2004086868A (en) Document group management device
US20050193330A1 (en) Methods and systems for eBook storage and presentation
US20040267734A1 (en) Document search method and apparatus
US6570597B1 (en) Icon display processor for displaying icons representing sub-data embedded in or linked to main icon data
US5628003A (en) Document storage and retrieval system for storing and retrieving document image and full text data
US6874002B1 (en) System and method for normalizing a resume
JP3178475B2 (en) Data processing equipment
US7081975B2 (en) Information input device

Legal Events

Date Code Title Description
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20040113

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20040116

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100123

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130123

Year of fee payment: 9

EXPY Cancellation because of completion of term