CN1380620A - Automatic editing method of book index - Google Patents


Info

Publication number
CN1380620A
Authority
CN
China
Prior art keywords
file
index
coding
entry
chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 01144430
Other languages
Chinese (zh)
Inventor
张弦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN 01144430
Publication of CN1380620A
Current legal status: Pending

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

The invention relates to a method for automatically compiling a book index. The method comprises the following steps. The words or entries to be indexed are marked in the text. The marked words or entries are extracted, generating a catalogue file B and a corresponding page file C. The code corresponding to each character in catalogue file B is looked up in library file A, and these codes are combined with file B to form coded file D. Coded file D is joined with page file C to generate a preliminary file E. File E is sorted to generate sorted file F. Index file G is obtained by extracting the words or entries and their page numbers from sorted file F. The invention can generate multiple index files based on multiple coding schemes.

Description

Automatic editing method of book index
1. Technical Field
The present invention relates to a method for compiling a book index, and in particular to a method that processes a manuscript to obtain the index file directly.
2. Background Art
In book editing, and particularly in the editing of reference books and dictionaries, compiling the index is tedious work. To date, no method is known for automatically extracting entries from the book being edited and compiling an index from them. Existing indexes are still compiled by hand: index entries and their page numbers are collected manually from the edited manuscript and assembled into an index file. If text is later added to or deleted from the manuscript, or the order of entries is adjusted, the indexed words and entries and their page numbers all change, and the index file has to be compiled again. Manual compilation is also prone to omitting or losing entries, which introduces errors into the index file. Compiling the index file by hand is therefore not only time-consuming and laborious but also inaccurate and error-prone.
3. Summary of the Invention
The object of the present invention is to overcome the shortcomings of manual index compilation by providing a method for compiling an index file quickly and automatically.
The automatic book index compilation method of the present invention consists of the following steps:
1) Create a library file A containing characters and their corresponding codes.
Library file A contains characters such as simplified Chinese characters, traditional Chinese characters and Western-language characters, together with the codes corresponding to each character, such as the Chinese character phonetic-order code, the Chinese character stroke code and the Western-language code.
2) Mark the words or entries to be indexed in the manuscript text being edited.
Marking can be done while the manuscript text is being edited or typeset. For example, setting the indexed words or entries in size 5 type or in boldface is one kind of mark.
3) Extract the marked words or entries from the manuscript text to generate catalogue file B.
4) At the same time, generate a page number lock mark at the page position of each marked word or entry in the manuscript text, and from these marks generate page file C.
Each page number in page file C corresponds one-to-one with a word or entry extracted into catalogue file B. If the manuscript text changes so that a word or entry moves to a different page, the corresponding page number in page file C changes automatically and always stays consistent with the manuscript text.
5) Look up in library file A, in turn, the code corresponding to each word or entry in catalogue file B, and combine these codes with file B to generate coded file D.
Coded file D contains everything in catalogue file B plus the codes found in library file A for all the characters in catalogue file B. It is generated as follows: open catalogue file B and, for each entry, look up the code of each character in library file A one character at a time; concatenate the codes of the characters in the order in which they appear in the entry to form the code of the entry, and store it after the entry; then process the next entry, until coded file D is complete. If a character in an entry is not included in library file A, that is, the character has no corresponding code, it is automatically given a designated code, for example "60". This designated code must be one that sorts after every other code.
6) Join coded file D with page file C, entry by entry, to generate the preliminary index file E.
Preliminary file E holds three items for each entry: the word or entry to be indexed, its code or combined code, and the page number on which it appears in the manuscript text.
7) Re-sort preliminary file E by the codes of the words or entries, following the collation order prescribed for the coding, to generate sorted file F.
Different codings have different prescribed collation orders. For example, phonetic-order codes are arranged in English alphabetical order and numerical order, while stroke codes are arranged with digits first, then lowercase English letters, then uppercase English letters.
8) Extract the words or entries and their corresponding page numbers from sorted file F to generate index file G.
Different index files can be generated from different codings as required: for example, a Chinese phonetic-order index from the phonetic-order codes, a Chinese stroke index from the stroke codes, a Western-language index from the Western-language codes, or a combined Chinese and Western-language index from the Western-language codes together with one of the Chinese character codings.
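The collation rule stated for stroke codes in step 7 (digits, then lowercase letters, then uppercase letters) differs from plain ASCII order, where uppercase letters precede lowercase ones. The following Python sketch shows one possible way to express that rule as a sort key. The names `STROKE_ALPHABET` and `stroke_sort_key` are illustrative assumptions and do not come from the patent.

```python
import string

# Collation alphabet for stroke codes as described in step 7:
# digits first, then lowercase letters, then uppercase letters.
STROKE_ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
STROKE_RANK = {ch: rank for rank, ch in enumerate(STROKE_ALPHABET)}


def stroke_sort_key(code: str) -> list[int]:
    """Map each symbol of a stroke code to its rank in the prescribed order."""
    return [STROKE_RANK[ch] for ch in code]


# Hypothetical codes chosen only to show the difference from ASCII order.
codes = ["A1", "a1", "91", "11"]
ordered = sorted(codes, key=stroke_sort_key)
ascii_ordered = sorted(codes)  # plain string order puts "A1" before "a1"
```

With this key, a code beginning with `a` (10 strokes) sorts before one beginning with `A`, which is what the stroke count coding needs.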
The Chinese character phonetic-order code is formed by combining the pinyin of the character with its tone. The code for the pinyin is the pinyin spelling itself, and the tone codes are:
Tone                                                          Code
Level tone (-)                                                1
Rising tone (mark shown in original Figure A0114443000061)    2
Falling-rising tone (∨)                                       3
Falling tone (`)                                              4
For example, the phonetic-order code of 安 ("peace") is an1, that of 备 ("be equipped with") is bei4, and that of 陪 ("accompany") is pei2.
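As an illustration of the tone coding, the Python sketch below converts a pinyin syllable written with a tone mark into a phonetic-order code of the form described above. It assumes the syllable is supplied as Unicode text with standard combining tone marks; the function name `phonetic_code` is hypothetical, and syllables without a tone mark are returned with no digit, a case the description does not cover.

```python
import unicodedata

# Combining marks for the four tones, mapped to the codes 1-4 from the table.
TONE_MARKS = {
    "\u0304": "1",  # macron: level tone
    "\u0301": "2",  # acute: rising tone
    "\u030c": "3",  # caron: falling-rising tone
    "\u0300": "4",  # grave: falling tone
}


def phonetic_code(syllable: str) -> str:
    """Return the pinyin letters followed by the tone digit."""
    letters = []
    tone = ""
    # NFD splits a toned vowel into its base letter plus a combining mark.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    # Recompose what is left so that a letter such as "ü" stays one character.
    return unicodedata.normalize("NFC", "".join(letters)) + tone


codes = [phonetic_code(s) for s in ["ān", "bèi", "péi"]]
```

The three sample syllables correspond to the codes an1, bei4 and pei2 quoted in the description.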
The Chinese character stroke code is formed by combining a stroke count code with a stroke order code. The stroke count code represents the actual number of strokes in the character. For 1 to 9 strokes, the code is the digit 1 to 9. For 10 or more strokes, lowercase English letters replace the digits: 10 → a, 11 → b, 12 → c, and so on. Once the lowercase letters are used up, uppercase English letters are used. The stroke order codes are:
Stroke                                                                                          Code
一 and the stroke in original Figure A0114443000062 (written from left to right)                 1
丨, 亅 (written from top to bottom)                                                              2
丿 and the stroke in original Figure A0114443000063 (written from upper right to lower left)     3
乀, 丶 (written from upper left to lower right)                                                  4
The stroke in original Figure A0114443000071 and 乙 (all turning strokes)                        5
For example:
Character        Stroke count code    Stroke order code
一 ("one")        1                    1
中 ("middle")     4                    2512
说 ("say")        9                    454325135
编 ("compile")    c                    551451325122
The stroke code of the entry "middle school student" (中学生) is 42512844335551531121.
The Western-language code is the Western-language text itself, with uppercase and lowercase letters distinguished.
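The stroke coding can be sketched in a few lines of Python. The sketch assumes the stroke order of each character is already available as a string of digits 1 to 5, and that the stroke count equals the length of that string. The function names are illustrative, and the stroke order strings are read off the examples in the description rather than from any external stroke database.

```python
import string

# Symbols for stroke counts of 10 and above: a-z, then A-Z.
COUNT_SYMBOLS = string.ascii_lowercase + string.ascii_uppercase


def stroke_count_code(count: int) -> str:
    """1-9 map to digits; 10, 11, 12, ... map to a, b, c, ... then A, B, C, ..."""
    if 1 <= count <= 9:
        return str(count)
    index = count - 10
    if 0 <= index < len(COUNT_SYMBOLS):
        return COUNT_SYMBOLS[index]
    raise ValueError(f"stroke count out of range: {count}")


def character_stroke_code(stroke_order: str) -> str:
    """Stroke count code followed by the stroke order code of one character."""
    return stroke_count_code(len(stroke_order)) + stroke_order


def entry_stroke_code(stroke_orders: list[str]) -> str:
    """Concatenate the character codes in the order the characters appear."""
    return "".join(character_stroke_code(order) for order in stroke_orders)


# Stroke orders of the three characters of "middle school student",
# taken from the code quoted in the description.
middle_school_student = entry_stroke_code(["2512", "44335551", "31121"])
```

Applied to the stroke orders 2512, 44335551 and 31121, the sketch reproduces the entry code 42512844335551531121 given in the description.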
The present invention uses a computer to compile the index file automatically from the manuscript text being edited, overcoming the tedium and time cost of compiling an index file by hand. The compilation method is flexible and can be adapted freely, the generated index file is comprehensive and accurate, and generation is quick and convenient, which greatly eases the editing of indexes.
The index file of the present invention is generated from the content of the manuscript text. If the manuscript text is changed by additions, deletions or reordering, the words, entries and page numbers in the index file are adjusted automatically, so the index always stays consistent with the manuscript text. Changes to the manuscript text therefore do not compromise the index file, and the generated index file is accurate and complete, with no omissions.
The present invention can flexibly generate index files in various forms as required, to meet different needs.
4. Description of the Drawings
Fig. 1 is a flowchart of generating catalogue file B in the present invention;
Fig. 2 is a flowchart of generating index file G from catalogue file B in the present invention.
5. Embodiments
Embodiment 1
Fig. 1 is the flowchart for generating catalogue file B. First, select a manuscript file T1 in which the index entries have been marked, then initialize a storage space M1 to hold the entries N1 extracted from T1. Test whether the end of T1 has been reached. If the result is "no", as in the figure, initialize N1, search for the start position of a mark and define it as P1, then search for the end position of the mark and define it as P2. Define the content between P1 and P2 as N1, extract it from T1 and store it in M1. Repeat this loop until the end of manuscript file T1 is reached, at which point catalogue file B is generated. If the search for a start position or an end position returns "no", catalogue file B is generated directly.
While catalogue file B is being generated, editing and typesetting manuscript file T1 also generates page file C. The page numbers in page file C correspond one-to-one with the entries in catalogue file B.
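The extraction loop of Fig. 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the patent marks entries by typography (for example boldface), whereas the sketch uses the hypothetical text markers `[[` and `]]`; it also assumes the manuscript is available as a list of page texts and that no entry spans a page break. None of the function names come from the patent.

```python
def extract_entries(text: str, start: str = "[[", end: str = "]]") -> list[str]:
    """Collect the content between each start mark (P1) and end mark (P2)."""
    entries = []
    pos = 0
    while True:
        p1 = text.find(start, pos)
        if p1 == -1:          # no further start mark: stop
            break
        p2 = text.find(end, p1 + len(start))
        if p2 == -1:          # start mark without end mark: stop
            break
        entries.append(text[p1 + len(start):p2])
        pos = p2 + len(end)
    return entries


def build_b_and_c(pages: list[str]) -> tuple[list[str], list[int]]:
    """Return catalogue file B (entries) and page file C (page numbers), aligned one-to-one."""
    catalogue_b, pages_c = [], []
    for page_number, page_text in enumerate(pages, start=1):
        for entry in extract_entries(page_text):
            catalogue_b.append(entry)
            pages_c.append(page_number)
    return catalogue_b, pages_c


# Hypothetical three-page manuscript.
manuscript = ["intro [[alpha]] text", "no marks here", "[[beta]] and [[gamma]] and [[unclosed"]
catalogue_b, pages_c = build_b_and_c(manuscript)
```

Because B and C are rebuilt from the manuscript on every run, page numbers follow any later changes to the text, which is the property the description emphasizes.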
Fig. 2 is the flowchart for generating index file G from catalogue file B. First, open library file A and catalogue file B, and test whether the end of catalogue file B has been reached. If the result is "no", initialize the storage spaces C1, H1 and B1, then test whether the pointer has reached the end of the current entry in catalogue file B. If the result is "no", store the current character of the current entry in H1, look up H1 in library file A, assign the code corresponding to H1 to B1, and set C1 = C1 + B1. Then look up the code of the next character, and continue until all characters of the entry have been looked up. If the result is "yes", store C1 after the corresponding entry in catalogue file B, then take the next entry from catalogue file B and repeat the loop. If library file A has no code for some character in catalogue file B, automatically set C1 = C1 + "60" and continue the loop. Once the codes of all characters of all entries have been looked up, coded file D is generated. Coded file D contains catalogue file B, built from the words or entries extracted from the manuscript, together with the corresponding codes.
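The character-by-character lookup of Fig. 2 can be sketched in Python as follows. Library file A is modelled as a dictionary from character to code, which is an assumption about its layout; the variable `c1` mirrors the accumulator C1 in the flowchart, and "60" is the example fallback code from the description. The sketch does not check that the fallback actually sorts last, which the description requires of whatever designated code is chosen.

```python
FALLBACK_CODE = "60"  # example designated code for characters missing from library file A


def encode_entry(entry: str, library_a: dict[str, str]) -> str:
    """Accumulate the codes of the characters of one entry, in order."""
    c1 = ""
    for h1 in entry:                          # H1: current character
        b1 = library_a.get(h1, FALLBACK_CODE)  # B1: its code, or the fallback
        c1 = c1 + b1                           # C1 = C1 + B1
    return c1


def build_coded_file(catalogue_b: list[str], library_a: dict[str, str]) -> list[tuple[str, str]]:
    """Coded file D: each entry of catalogue file B followed by its code."""
    return [(entry, encode_entry(entry, library_a)) for entry in catalogue_b]


# Tiny hypothetical library using two stroke codes quoted in the description.
library_a = {"一": "11", "中": "42512"}
coded_d = build_coded_file(["一", "中", "中X"], library_a)
```

The third sample entry contains a character that is absent from the library, so its code ends with the fallback "60".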
Open page file C and join coded file D with page file C to generate preliminary file E.
Sort preliminary file E by code according to the chosen sorting method (phonetic order, stroke order, Western-language order, etc.) to generate sorted file F. Then extract the entries and page numbers from sorted file F to generate index file G. Finally, format index file G according to the editorial format requirements to produce the finished index file.
Embodiment 2
Embodiment 2 gives a concrete example of compiling entries.
For example, the entries 安 ("peace"), 备 ("be equipped with"), 到 ("to"), 按 ("press") and 陪 ("accompany") are extracted from the manuscript text to generate catalogue file B. At the same time, page file C is generated from the page positions of these entries in the manuscript text, namely "1", "2", "3", "4" and "5". The codes corresponding to the entries are found in a library file A containing phonetic-order codes: "an1", "bei4", "dao4", "an4" and "pei2" respectively. Placing each code after the corresponding entry of catalogue file B forms the following coded file D:
安 (peace)               an1
备 (be equipped with)    bei4
到 (to)                  dao4
按 (press)               an4
陪 (accompany)           pei2
Coded file D is then combined with page file C to generate the following preliminary file E:
安 (peace)               an1     1
备 (be equipped with)    bei4    2
到 (to)                  dao4    3
按 (press)               an4     4
陪 (accompany)           pei2    5
Sorting preliminary file E in phonetic order generates the following sorted file F:
安 (peace)               an1     1
按 (press)               an4     4
备 (be equipped with)    bei4    2
到 (to)                  dao4    3
陪 (accompany)           pei2    5
Finally, the entries and page numbers are extracted from sorted file F to generate the phonetic-order index file G:
安 (peace)               1
按 (press)               4
备 (be equipped with)    2
到 (to)                  3
陪 (accompany)           5
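The whole of Embodiment 2 can be traced in a short Python sketch, from catalogue file B and page file C through to index file G. The data structures (lists and tuples) are assumptions made for illustration, and the Chinese characters are those implied by the translated entry names and their pinyin codes. Plain string comparison is used for the phonetic order, which is sufficient for these five single-character entries but is not claimed to be a complete pinyin collation.

```python
# Library file A restricted to the five characters of the example.
library_a = {"安": "an1", "备": "bei4", "到": "dao4", "按": "an4", "陪": "pei2"}

catalogue_b = ["安", "备", "到", "按", "陪"]   # entries in manuscript order
pages_c = [1, 2, 3, 4, 5]                      # page of each entry, one-to-one with B

# Coded file D: each entry followed by the concatenated codes of its characters.
coded_d = [(entry, "".join(library_a[ch] for ch in entry)) for entry in catalogue_b]

# Preliminary file E: entry, code and page number joined row by row.
preliminary_e = [(entry, code, page) for (entry, code), page in zip(coded_d, pages_c)]

# Sorted file F: rows ordered by code.
sorted_f = sorted(preliminary_e, key=lambda row: row[1])

# Index file G: the code column is dropped, leaving entry and page number.
index_g = [(entry, page) for entry, _code, page in sorted_f]

for entry, page in index_g:
    print(entry, page)
```

The printed rows follow the order an1, an4, bei4, dao4, pei2, matching index file G in the embodiment.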
Embodiment 3
The entries "middle school student study newspaper", "middle school student", "middle school student study column" and "middle school student study garden" are extracted from the manuscript text, and their page numbers are obtained at the same time: "17", "36", "49" and "5". The stroke codes of these entries are looked up and assembled from the Chinese character stroke code library file, and together with the entries and page numbers they form the preliminary file:
Middle school student study newspaper    4251284433555153112184433555135171215345    17
Middle school student    42512844335551531121    36
Middle school student study column    425128443355515311218443355513541725113516121525    49
Middle school student study garden    42512844335551531121844335551354172511351    5
The sorted file is:
Middle school student    42512844335551531121    36
Middle school student study newspaper    4251284433555153112184433555135171215345    17
Middle school student study garden    42512844335551531121844335551354172511351    5
Middle school student study column    425128443355515311218443355513541725113516121525    49
The final Chinese character stroke index file is:
Middle school student    36
Middle school student study newspaper    17
Middle school student study garden    5
Middle school student study column    49
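The ordering in Embodiment 3 can be checked with a short Python sketch. The stroke codes are copied from the embodiment with digit-group separators removed, and the English entry names stand in for the Chinese entries. Because all four codes consist only of digits, plain string comparison gives the same result as the digits, lowercase, uppercase rule; codes containing letters would need a custom sort key.

```python
# Preliminary file rows: (entry, stroke code, page number).
preliminary_e = [
    ("middle school student study newspaper",
     "4251284433555153112184433555135171215345", 17),
    ("middle school student",
     "42512844335551531121", 36),
    ("middle school student study column",
     "425128443355515311218443355513541725113516121525", 49),
    ("middle school student study garden",
     "42512844335551531121844335551354172511351", 5),
]

# A code that is a prefix of another sorts first, so the shortest entry
# leads and each longer entry that extends it follows.
sorted_f = sorted(preliminary_e, key=lambda row: row[1])
index_pages = [page for _entry, _code, page in sorted_f]
```

The resulting page sequence is 36, 17, 5, 49, the same order as the stroke index in the embodiment.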
In the present invention, the index file can also be arranged by page number, or compiled in various other forms, such as Western-language order, combined Chinese and Western-language order, or phonetic order combined with stroke order.

Claims (8)

1. A method for automatically compiling a book index, characterized in that it consists of the following steps:
1) creating a library file A containing characters and their corresponding codes;
2) marking the words or entries to be indexed in the manuscript text being edited;
3) extracting the marked words or entries from the manuscript text to generate catalogue file B;
4) at the same time, generating a page number lock mark at the page position of each marked word or entry in the manuscript text, to generate page file C;
5) looking up in library file A, in turn, the code corresponding to each word or entry in catalogue file B, and combining the codes with file B to generate coded file D;
6) joining coded file D with page file C, entry by entry, to generate the preliminary index file E;
7) re-sorting preliminary file E by the codes of the words or entries, following the prescribed collation order, to generate sorted file F;
8) extracting the words or entries and their corresponding page numbers from sorted file F to generate index file G.
2. The method for automatically compiling a book index according to claim 1, characterized in that the characters in library file A include simplified Chinese characters, traditional Chinese characters and Western-language characters.
3. The method for automatically compiling a book index according to claim 1, characterized in that the codes corresponding to the characters in library file A include one or more of the Chinese character phonetic-order code, the Chinese character stroke code and the Western-language code.
4. The method for automatically compiling a book index according to claim 1 or 3, characterized in that the Chinese character phonetic-order code is formed by combining the pinyin of the character with its tone, wherein the code for the pinyin is the pinyin spelling itself, and the tone codes are:
Tone                                                          Code
Level tone (-)                                                1
Rising tone (mark shown in original Figure A0114443000031)    2
Falling-rising tone (∨)                                       3
Falling tone (`)                                              4
5. The method for automatically compiling a book index according to claim 1 or 3, characterized in that the Chinese character stroke code is formed by combining the stroke count and the stroke order of the character, wherein the stroke count code is 1 to 9 for stroke counts 1 to 9, and for stroke counts of 10 or more is assigned in the order a, b, c ... z; A, B, C ... Z, and the stroke order codes are:
Stroke                                                                                          Code
一 and the stroke in original Figure A0114443000032 (written from left to right)                 1
丨, 亅 (written from top to bottom)                                                              2
丿 and the stroke in original Figure A0114443000033 (written from upper right to lower left)     3
乀, 丶 (written from upper left to lower right)                                                  4
乙 (all turning strokes)                                                                        5
6. The method for automatically compiling a book index according to claim 1 or 3, characterized in that the Western-language code is the Western-language text itself.
7. The method for automatically compiling a book index according to claim 1, characterized in that the code of an entry is formed by arranging the codes of the characters in the entry in order.
8. The method for automatically compiling a book index according to claim 1, characterized in that the generated index file may be one or more of a Chinese phonetic-order index, a Chinese stroke index, a combined Chinese phonetic-order and stroke index, a Western-language index, and a combined Chinese and Western-language index.
CN 01144430 2001-12-18 2001-12-18 Automatic editing method of book index Pending CN1380620A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 01144430 CN1380620A (en) 2001-12-18 2001-12-18 Automatic editing method of book index

Publications (1)

Publication Number Publication Date
CN1380620A true CN1380620A (en) 2002-11-20

Family

ID=4677573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 01144430 Pending CN1380620A (en) 2001-12-18 2001-12-18 Automatic editing method of book index

Country Status (1)

Country Link
CN (1) CN1380620A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729402A (en) * 2013-11-22 2014-04-16 浙江大学 Method for establishing mapping knowledge domain based on book catalogue
CN103810199A (en) * 2012-11-12 2014-05-21 北大方正集团有限公司 Method and device for directory production
CN103927339A (en) * 2014-03-27 2014-07-16 北大方正集团有限公司 System and method for reorganizing knowledge
CN108205578A (en) * 2016-12-20 2018-06-26 北大方正集团有限公司 Index generation method and device
CN112380814A (en) * 2020-11-04 2021-02-19 福建亿榕信息技术有限公司 Domestic operating system-based automatic information manuscript combination and edition method
CN117633143A (en) * 2023-11-29 2024-03-01 雅昌文化(集团)有限公司 Chinese vocabulary entry multi-condition compound ordering method

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810199A (en) * 2012-11-12 2014-05-21 北大方正集团有限公司 Method and device for directory production
CN103810199B (en) * 2012-11-12 2017-07-14 北大方正集团有限公司 The preparation method and device of a kind of catalogue
CN103729402A (en) * 2013-11-22 2014-04-16 浙江大学 Method for establishing mapping knowledge domain based on book catalogue
CN103729402B (en) * 2013-11-22 2017-01-18 浙江大学 Method for establishing mapping knowledge domain based on book catalogue
CN103927339A (en) * 2014-03-27 2014-07-16 北大方正集团有限公司 System and method for reorganizing knowledge
CN103927339B (en) * 2014-03-27 2017-10-31 北大方正集团有限公司 Knowledge Reorganizing system and method for knowledge realignment
CN108205578A (en) * 2016-12-20 2018-06-26 北大方正集团有限公司 Index generation method and device
CN112380814A (en) * 2020-11-04 2021-02-19 福建亿榕信息技术有限公司 Domestic operating system-based automatic information manuscript combination and edition method
CN112380814B (en) * 2020-11-04 2022-08-19 福建亿榕信息技术有限公司 Domestic operating system-based automatic information manuscript combination and edition method
CN117633143A (en) * 2023-11-29 2024-03-01 雅昌文化(集团)有限公司 Chinese vocabulary entry multi-condition compound ordering method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication