CN108205578A - Index generation method and device - Google Patents


Info

Publication number
CN108205578A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611187873.3A
Other languages
Chinese (zh)
Inventor
耿红霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-12-20
Filing date
2016-12-20
Publication date
2018-06-26
Application filed by Peking University Founder Group Co Ltd and Beijing Founder Electronics Co Ltd
Priority to CN201611187873.3A
Publication of CN108205578A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures

Abstract

Embodiments of the present invention provide an index generation method and device. The method includes: establishing an index thesaurus according to input index terms or the index data of a reference book; performing word segmentation on the content of the reference book to obtain index terms; establishing a correspondence between index types and index terms; and generating an index for each index type according to a sorting algorithm. Because the index terms are stored by index type and each index is produced by a sorting algorithm, the index of a reference book can be generated automatically and efficiently, which speeds up index generation and thereby improves data query efficiency.

Description

Index generation method and device
Technical field
Embodiments of the present invention relate to the field of reference book compilation, and in particular to an index generation method and device.
Background
A reference book is a book used to look up facts and data. It is generally not intended for systematic reading, but serves as an auxiliary tool for checking and finding knowledge when needed. Reference books resolve difficult questions efficiently and save time and effort in a way that ordinary books cannot. As the volume of information grows rapidly, building an index becomes particularly important.
In the prior art, however, the index of a reference book must be built manually, so the index is built slowly, which reduces data query efficiency.
Summary of the invention
Embodiments of the present invention provide an index generation method and device to increase the speed at which an index is built.
One aspect of the embodiments of the present invention provides an index generation method, including:
establishing an index thesaurus according to input index terms or the index data of a reference book, the index thesaurus including index terms and index conditions;
performing word segmentation on the content of the reference book to obtain index terms;
establishing a correspondence between index types and index terms, the index types including at least one of the following: a stroke index, a content index, and a sound-sequence index;
generating an index for each index type according to a sorting algorithm.
Another aspect of the embodiments of the present invention provides an index generating device, including:
an index thesaurus establishing module, configured to establish an index thesaurus according to input index terms or the index data of a reference book, the index thesaurus including index terms and index conditions;
a word segmentation module, configured to perform word segmentation on the content of the reference book to obtain index terms;
an establishing module, configured to establish a correspondence between index types and index terms, the index types including at least one of the following: a stroke index, a content index, and a sound-sequence index;
an index generation module, configured to generate an index for each index type according to a sorting algorithm.
The index generation method and device provided by the embodiments of the present invention establish an index thesaurus according to input index terms or the index data of a reference book, perform word segmentation on the content of the reference book to obtain index terms, store the index terms by index type, and generate an index for each index type according to a sorting algorithm. The index of a reference book can therefore be generated automatically and efficiently, which speeds up index generation and improves data query efficiency.
Brief description of the drawings
Fig. 1 is a flow chart of the index generation method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the index result of the index generation method provided by an embodiment of the present invention;
Fig. 3 is a flow chart of the stroke index algorithm provided by an embodiment of the present invention;
Fig. 4 is a flow chart of the sound-sequence index algorithm provided by an embodiment of the present invention;
Fig. 5 is a flow chart of the content index algorithm provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of the index generating device provided by an embodiment of the present invention;
Fig. 7 is a structural diagram of the index generating device provided by another embodiment of the present invention.
Detailed description
Fig. 1 is a flow chart of the index generation method provided by an embodiment of the present invention. In the prior art, the index of a reference book must be built manually, which makes index building slow and reduces data query efficiency. To address this, the embodiment provides an index generation method with the following steps:
Step S101: establish an index thesaurus according to input index terms or the index data of a reference book, the index thesaurus including index terms and index conditions.
The method of this embodiment may be executed by a computer or a database that stores reference books; one or more reference books may be stored. A user can enter index terms into the computer or database through an input device such as a mouse, keyboard, or touch screen. The computer or database can also index the content of a reference book to produce index data. From the index terms entered by the user or the index data of the reference book, the computer or database establishes the index thesaurus, which includes index terms and index conditions.
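One way to picture the index thesaurus of step S101 is as a table of entries, each pairing an index term with an index condition. The sketch below is only an illustration under stated assumptions: the names `IndexEntry` and `build_thesaurus` are hypothetical, the index condition is modeled as a plain string because the source does not define its format, and merging both inputs (with user input taking precedence) is a design choice of this sketch rather than something the source specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    term: str       # the index term
    condition: str  # hypothetical: the index condition, modeled as plain text


def build_thesaurus(user_terms, book_index_data):
    """Build a term -> IndexEntry table from the two inputs named in S101.

    user_terms: iterable of term strings entered by the user.
    book_index_data: iterable of (term, condition) pairs taken from a
        reference book's existing index data.
    """
    thesaurus = {}
    for term in user_terms:
        thesaurus[term] = IndexEntry(term, condition="user-input")
    for term, condition in book_index_data:
        # setdefault keeps an existing entry, so user input takes precedence.
        thesaurus.setdefault(term, IndexEntry(term, condition))
    return thesaurus
```

For example, `build_thesaurus(["索引"], [("笔画", "page:30")])` would yield a table with two entries, one from each source.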
Step S102: perform word segmentation on the content of the reference book to obtain index terms.
The computer or database can also perform word segmentation on the content of the reference book to obtain index terms.
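The source does not say which word segmentation technique is used. As a hedged illustration only, the sketch below uses forward maximum matching, a common dictionary-based approach: at each position it takes the longest substring that appears in a vocabulary. The function name `segment`, the `max_len` parameter, and the sample vocabulary are all assumptions.

```python
def segment(text, vocabulary, max_len=4):
    """Forward maximum matching against a vocabulary of known terms.

    Returns the vocabulary terms found in text, in order of appearance.
    Characters that start no known term are skipped.
    """
    terms = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, shrinking until a vocabulary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocabulary:
                match = candidate
                break
        if match is None:
            i += 1  # no known term starts here; move on one character
        else:
            terms.append(match)
            i += len(match)
    return terms


# Hypothetical vocabulary and sentence, for illustration only.
vocabulary = {"工具书", "索引", "生成"}
print(segment("工具书的索引自动生成", vocabulary))
```

A production system would more likely rely on an established segmentation library; the point here is only that segmentation turns running text into candidate index terms.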
Step S103: establish a correspondence between index types and index terms, the index types including at least one of the following: a stroke index, a content index, and a sound-sequence index.
After a large number of index terms have been obtained through the steps above, a correspondence between index types and index terms is established. The index types include at least one of the stroke index, the content index, and the sound-sequence index. A single index term can belong to all three index types at once, so the data can be stored by index type.
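The correspondence of step S103 can be sketched as a mapping from each index type to the set of terms filed under it. This is a minimal sketch, not the patented implementation; the identifiers `INDEX_TYPES` and `map_types_to_terms` and the string keys for the three index types are hypothetical.

```python
from collections import defaultdict

# Hypothetical keys for the three index types named in the text.
INDEX_TYPES = ("stroke", "content", "sound_sequence")


def map_types_to_terms(term_types):
    """Group terms by index type.

    term_types: iterable of (term, iterable of index types) pairs.
    Returns {index_type: set of terms}; a term may appear under several types.
    """
    by_type = defaultdict(set)
    for term, types in term_types:
        for index_type in types:
            if index_type not in INDEX_TYPES:
                raise ValueError(f"unknown index type: {index_type}")
            by_type[index_type].add(term)
    return dict(by_type)
```

Storing terms per type in this way lets each index be generated independently, and a term that belongs to all three types is simply present in all three sets.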
Step S104: generate an index for each index type according to a sorting algorithm.
The stroke index, content index, and sound-sequence index are three different index types, and an index can be generated for each of them according to a sorting algorithm. For the stroke index, for example, the generated index entries are: one stroke, two strokes, three strokes, and so on, where "one stroke" indicates that the beginning stroke count of the index term is one, "two strokes" indicates that it is two, and so on.
In this embodiment, an index thesaurus is established according to input index terms or the index data of a reference book; word segmentation is performed on the content of the reference book to obtain index terms; the index terms are stored by index type; and an index is generated for each index type according to a sorting algorithm. The index of a reference book can therefore be generated automatically and efficiently, which speeds up index generation and improves data query efficiency.
Fig. 2 is a schematic diagram of the index result of the index generation method provided by an embodiment of the present invention. As shown in Fig. 2, the index thesaurus includes an index navigation area 21 and an index content area 22. The index navigation area 21 includes a selection list 211 of index types, such as the stroke index, content index, and sound-sequence index, and also a list 212. If the stroke index is selected as the index type, list 212 shows the stroke index entries generated by the sorting algorithm, such as 0 strokes, 1 stroke, 2 strokes, 3 strokes, and so on.
Building on the above embodiment, an index term entered by the user corresponds to at least one of the stroke index and the sound-sequence index. Optionally, an index term entered by the user belongs to both the stroke index and the sound-sequence index.
In addition, the computer or database can generate index terms from the index data of a reference book; such an index term corresponds to at least one of the stroke index and the sound-sequence index. Optionally, an index term generated from the index data of a reference book belongs to both the stroke index and the sound-sequence index.
In addition, an index term that the computer or database obtains by performing word segmentation on the content of the reference book corresponds to at least one of the stroke index, the content index, and the sound-sequence index. Optionally, such an index term belongs to all three index types at once.
In addition, as shown in Fig. 2, after the user selects an index type on the interface, the interface automatically generates the index for that index type according to the sorting algorithm. When the user selects one index among those generated, the index content area 22 shows the data that matches the index type and the selected index.
Fig. 3 is a flow chart of the stroke index algorithm provided by an embodiment of the present invention; Fig. 4 is a flow chart of the sound-sequence index algorithm provided by an embodiment of the present invention; Fig. 5 is a flow chart of the content index algorithm provided by an embodiment of the present invention.
As shown in Fig. 3, the stroke index algorithm includes the following steps:
Step S301: find all stroke counts of the index terms and sort them in ascending order;
Step S302: look up all index terms of the stroke index and sort them in ascending order;
Step S303: when a stroke count is clicked, find the index terms with that stroke count and sort them in ascending order;
Step S304: display the corresponding result on the user interface.
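Steps S301 to S303 can be sketched as three small functions. Several things here are assumptions rather than statements from the source: a term is filed under the stroke count of its first character, ties are broken by code point, and `FIRST_CHAR_STROKES` is a tiny hand-written sample table standing in for a real stroke-count data source. All function names are hypothetical.

```python
# Tiny illustrative table: stroke count of each term's first character.
FIRST_CHAR_STROKES = {"一": 1, "人": 2, "工": 3, "大": 3, "中": 4}


def stroke_count(term):
    """Stroke count used to file a term (assumed: its first character)."""
    return FIRST_CHAR_STROKES[term[0]]


def stroke_headings(terms):
    """S301: the distinct stroke counts, in ascending order."""
    return sorted({stroke_count(t) for t in terms})


def sort_by_strokes(terms):
    """S302: all stroke-index terms, ascending by stroke count."""
    return sorted(terms, key=lambda t: (stroke_count(t), t))


def terms_for_heading(terms, strokes):
    """S303: the sorted terms shown when one stroke count is clicked."""
    return [t for t in sort_by_strokes(terms) if stroke_count(t) == strokes]


terms = ["中国", "大学", "一般", "工具", "人物"]
print(stroke_headings(terms))
print(terms_for_heading(terms, 3))
```

Step S304, displaying the result, is left to whatever user interface hosts the index.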
As shown in Fig. 4, the sound-sequence index algorithm includes the following steps:
Step S401: find all letters for the index terms and sort them in English alphabetical order;
Step S402: look up all index terms of the sound-sequence index and sort them in sound-sequence order;
Step S403: when a letter is clicked, find the index terms under that letter and sort them in sound-sequence order;
Step S404: display the corresponding result on the user interface.
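The sound-sequence steps follow the same pattern, keyed by letter instead of stroke count. The sketch below assumes that sound-sequence order means ordering by a romanized (pinyin) reading and that each term is filed under the first letter of that reading; the source does not spell this out. The `PINYIN` table is a hand-written sample, and the function names are hypothetical.

```python
# Hand-written sample readings; a real system would use a pronunciation source.
PINYIN = {"索引": "suoyin", "词典": "cidian", "笔画": "bihua", "百科": "baike"}


def initial(term):
    """Upper-case first letter of the term's reading."""
    return PINYIN[term][0].upper()


def letter_headings(terms):
    """S401: the distinct initial letters, in English alphabetical order."""
    return sorted({initial(t) for t in terms})


def sort_by_sound(terms):
    """S402: all sound-sequence terms, ordered by their reading."""
    return sorted(terms, key=lambda t: PINYIN[t])


def terms_for_letter(terms, letter):
    """S403: the sorted terms shown when one letter is clicked."""
    return [t for t in sort_by_sound(terms) if initial(t) == letter.upper()]


terms = ["索引", "词典", "笔画", "百科"]
print(letter_headings(terms))
print(terms_for_letter(terms, "B"))
```

Sorting by the reading rather than by the characters themselves is what distinguishes this index from a plain code-point sort.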
As shown in Fig. 5, the content index algorithm includes the following steps:
Step S501: look up all index terms of the content index and sort them in sound-sequence order;
Step S502: display the corresponding result on the user interface.
Fig. 6 is a structural diagram of the index generating device provided by an embodiment of the present invention. The index generating device can perform the process flow of the index generation method embodiment. As shown in Fig. 6, the index generating device 60 includes an index thesaurus establishing module 61, a word segmentation module 62, an establishing module 63, and an index generation module 64. The index thesaurus establishing module 61 is configured to establish an index thesaurus according to input index terms or the index data of a reference book, the index thesaurus including index terms and index conditions. The word segmentation module 62 is configured to perform word segmentation on the content of the reference book to obtain index terms. The establishing module 63 is configured to establish a correspondence between index types and index terms, the index types including at least one of the following: a stroke index, a content index, and a sound-sequence index. The index generation module 64 is configured to generate an index for each index type according to a sorting algorithm.
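The four modules of Fig. 6 can be pictured as four pluggable functions composed in order. The sketch below is an assumption-laden illustration: the class name `IndexGeneratingDevice`, the field names, and the call signatures are hypothetical, and the stand-in functions at the bottom (whitespace splitting, a single "content" type) are placeholders chosen only so the example runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class IndexGeneratingDevice:
    build_thesaurus: Callable[[Iterable[str]], Set[str]]              # module 61
    segment: Callable[[str, Set[str]], List[str]]                     # module 62
    map_types: Callable[[Iterable[str]], Dict[str, Set[str]]]         # module 63
    generate: Callable[[Dict[str, Set[str]]], Dict[str, List[str]]]   # module 64

    def run(self, input_terms, book_text):
        """Chain the four modules in the order of steps S101 to S104."""
        thesaurus = self.build_thesaurus(input_terms)
        found = self.segment(book_text, thesaurus)
        by_type = self.map_types(found)
        return self.generate(by_type)


# Placeholder modules, for illustration only.
device = IndexGeneratingDevice(
    build_thesaurus=set,
    segment=lambda text, vocab: [w for w in text.split() if w in vocab],
    map_types=lambda terms: {"content": set(terms)},
    generate=lambda by_type: {k: sorted(v) for k, v in by_type.items()},
)
result = device.run(["index", "stroke"], "stroke order and index order")
print(result)
```

Keeping each module behind a plain callable mirrors the text's point that the division into modules is logical, so any one of them can be swapped without touching the others.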
Fig. 7 is a structural diagram of the index generating device provided by another embodiment of the present invention. Building on the above embodiment, the input index terms correspond to at least one of the stroke index and the sound-sequence index. The index generating device 60 further includes an index term generation module 65, configured to generate index terms from the index data of a reference book; these index terms correspond to at least one of the stroke index and the sound-sequence index.
The index terms obtained by performing word segmentation on the content of the reference book correspond to at least one of the stroke index, the content index, and the sound-sequence index.
In addition, the index generating device 60 further includes a query module 66, configured to query data according to the index type and the index of that index type.
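A query by index type and index can be sketched as a two-level lookup. The nested layout `{index_type: {heading: [terms]}}`, the function name `query`, and the sample data are all assumptions made for illustration; the source does not describe how the query module stores or retrieves data.

```python
def query(index, index_type, heading):
    """Return the terms filed under one heading of one index type.

    index is assumed to look like {index_type: {heading: [terms]}}.
    Unknown index types or headings yield an empty list.
    """
    headings = index.get(index_type, {})
    return list(headings.get(heading, []))


# Hypothetical generated index covering two index types.
generated = {
    "stroke": {1: ["一般"], 3: ["大学", "工具"]},
    "sound_sequence": {"B": ["百科", "笔画"], "S": ["索引"]},
}
print(query(generated, "stroke", 3))
print(query(generated, "sound_sequence", "B"))
print(query(generated, "content", "A"))
```

Returning an empty list for a missing type or heading keeps the caller's display logic simple, since it never has to handle a lookup error.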
The index generating device provided by this embodiment can be used to perform the method embodiment shown in Fig. 1; its specific functions are not described again here.
In summary, the embodiments of the present invention establish an index thesaurus according to input index terms or the index data of a reference book, perform word segmentation on the content of the reference book to obtain index terms, store the index terms by index type, and generate an index for each index type according to a sorting algorithm. The index of a reference book can therefore be generated automatically and efficiently, which speeds up index generation and improves data query efficiency.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional modules above is used only as an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. For the specific working process of the device described above, refer to the corresponding process in the foregoing method embodiments; it is not described again here.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An index generation method, characterized by including:
establishing an index thesaurus according to input index terms or the index data of a reference book, the index thesaurus including index terms and index conditions;
performing word segmentation on the content of the reference book to obtain index terms;
establishing a correspondence between index types and index terms, the index types including at least one of the following: a stroke index, a content index, and a sound-sequence index;
generating an index for each index type according to a sorting algorithm.
2. The method according to claim 1, characterized in that the input index terms correspond to at least one of the stroke index and the sound-sequence index.
3. The method according to claim 1, characterized by further including:
generating index terms according to the index data of the reference book, the index terms corresponding to at least one of the stroke index and the sound-sequence index.
4. The method according to claim 1, characterized in that the index terms obtained by performing word segmentation on the content of the reference book correspond to at least one of the stroke index, the content index, and the sound-sequence index.
5. The method according to any one of claims 1 to 4, characterized by further including:
querying data according to the index type and the index of the index type.
6. An index generating device, characterized by including:
an index thesaurus establishing module, configured to establish an index thesaurus according to input index terms or the index data of a reference book, the index thesaurus including index terms and index conditions;
a word segmentation module, configured to perform word segmentation on the content of the reference book to obtain index terms;
an establishing module, configured to establish a correspondence between index types and index terms, the index types including at least one of the following: a stroke index, a content index, and a sound-sequence index;
an index generation module, configured to generate an index for each index type according to a sorting algorithm.
7. The index generating device according to claim 6, characterized in that the input index terms correspond to at least one of the stroke index and the sound-sequence index.
8. The index generating device according to claim 6, characterized by further including:
an index term generation module, configured to generate index terms according to the index data of the reference book, the index terms corresponding to at least one of the stroke index and the sound-sequence index.
9. The index generating device according to claim 6, characterized in that the index terms obtained by performing word segmentation on the content of the reference book correspond to at least one of the stroke index, the content index, and the sound-sequence index.
10. The index generating device according to any one of claims 6 to 9, characterized by further including:
a query module, configured to query data according to the index type and the index of the index type.
CN201611187873.3A 2016-12-20 2016-12-20 Index generation method and device Pending CN108205578A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611187873.3A CN108205578A (en) 2016-12-20 2016-12-20 Index generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611187873.3A CN108205578A (en) 2016-12-20 2016-12-20 Index generation method and device

Publications (1)

Publication Number Publication Date
CN108205578A true CN108205578A (en) 2018-06-26

Family

ID=62604327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611187873.3A Pending CN108205578A (en) 2016-12-20 2016-12-20 Index generation method and device

Country Status (1)

Country Link
CN (1) CN108205578A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020037794A1 (en) * 2018-08-20 2020-02-27 南京师范大学 Index building method for english geographical name, and query method and apparatus therefor

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5913208A (en) * 1996-07-09 1999-06-15 International Business Machines Corporation Identifying duplicate documents from search results without comparing document content
CN1380620A (en) * 2001-12-18 2002-11-20 张弦 Automatic editing method of book index
CN1672957A (en) * 2004-03-06 2005-09-28 龚学胜 International phonetic symbol scheme, Chinese reference book arrangement and single-pinyin keypad input method
CN103823799A (en) * 2012-11-16 2014-05-28 镇江诺尼基智能技术有限公司 New-generation industry knowledge full-text search method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180626