CN102930060A - Method and device for performing fast indexing of database - Google Patents

Info

Publication number
CN102930060A
CN102930060A CN2012104916427A CN201210491642A CN102930060A CN 102930060 A CN102930060 A CN 102930060A CN 2012104916427 A CN2012104916427 A CN 2012104916427A CN 201210491642 A CN201210491642 A CN 201210491642A CN 102930060 A CN102930060 A CN 102930060A
Authority
CN
China
Prior art keywords
file
database
retrieved
index list
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012104916427A
Other languages
Chinese (zh)
Other versions
CN102930060B (en)
Inventor
孙振辉
刘富堂
徐德军
栾晓岩
邢轻
吴国庆
高轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201210491642.7A
Publication of CN102930060A
Application granted
Publication of CN102930060B
Expired - Fee Related
Anticipated expiration

Abstract

The invention provides a method and a device for fast indexing of a database. The method comprises the following steps: classifying the files in a database according to a preset strategy and dividing them into a plurality of sub-databases according to file type; and building an index list of the database according to the number of sub-databases, then searching it using the type of the to-be-retrieved file input by a user as the keyword. Because the sub-databases are built by classifying files, a corresponding index list is built, and the search is keyed on the file type input by the user, retrieval efficiency is higher and fewer resources are occupied.

Description

A method and device for fast database indexing
Technical field
The present invention relates to a method and device for fast database indexing, and belongs to the technical field of data storage.
Background art
A database is a warehouse that organizes, stores, and manages data according to a data structure. With the development of information technology and the market, data management is no longer only about storing and managing data; it has turned into providing the various forms of data management that users need. Databases come in many different types, from the simplest tables storing various data to large database systems capable of mass data storage, and all are widely used in many fields.
A database is a collection of data organized according to a certain data model and stored in secondary storage. Such a data collection has the following features: it avoids duplication as far as possible; it serves multiple applications of a particular organization in an optimal way; its data structure is independent of the application programs that use it; and the insertion, deletion, modification, and retrieval of data are managed and controlled by unified software. The database is the advanced stage of data management, and it grew out of file management systems.
Because the file types stored in a database are varied, such as Word documents, Excel spreadsheets, and txt documents, scanning the full database every time a file is retrieved results in low retrieval efficiency and high resource usage. And if the database is queried through a program, a program deadlock or an erroneous key value can also cause the query to jump to a full-database search, so retrieval efficiency still cannot be improved.
Summary of the invention
The present invention aims to solve the problem in existing database indexing techniques of low retrieval efficiency and high resource usage, which arises when the retrieval program jumps on an error or when a full-database scan is used directly. To this end, the present invention proposes the following technical solution:
A method for fast database indexing, comprising:
classifying the files in a database according to a preset strategy, and dividing the files in the database into several sub-databases by file type;
building an index list of the database according to the number of sub-databases, and searching the index list using the type of the file to be retrieved, as input by the user, as the keyword.
A device for fast database indexing, comprising:
a sub-database division unit, configured to classify the files in a database according to a preset strategy, and to divide the files in the database into several sub-databases by file type;
a file type retrieval unit, configured to build an index list of the database according to the number of sub-databases, and to search the index list using the type of the file to be retrieved, as input by the user, as the keyword.
The present invention builds sub-databases by classifying files, builds a corresponding index list, and then retrieves using the type of the file to be retrieved, as input by the user, as the keyword, which gives higher retrieval efficiency and lower resource usage.
Description of drawings
Fig. 1 is a schematic flowchart of the fast database indexing method provided by the specific embodiment of the present invention;
Fig. 2 is a schematic flowchart of the fast database indexing method provided by the specific embodiment of the present invention, with the added retrieval strategy for the case where the file to be retrieved, as input by the user, does not include a file type;
Fig. 3 is a schematic structural diagram of the fast database indexing device provided by the specific embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the fast database indexing device provided by the specific embodiment of the present invention, with the added filename retrieval unit and index list rebuilding unit.
Embodiment
The specific embodiment of the present invention provides a method for fast database indexing, comprising: classifying the files in a database according to a preset strategy, and dividing the files in the database into several sub-databases by file type; and building an index list of the database according to the number of sub-databases, and searching the index list using the type of the file to be retrieved, as input by the user, as the keyword.
Further, the method may also comprise: if the file to be retrieved, as input by the user, does not include a file type, building the index list of the database according to a predetermined rule, and searching the index list using the filename of the file to be retrieved as the keyword; and, if the type of at least one file in the database changes, rebuilding the index list of the database.
To explain more clearly the fast database indexing method provided by the specific embodiment of the present invention, which can be applied to existing Oracle databases, SQL databases, Access databases, and INFOBANK databases, the method is now described in detail with reference to the accompanying drawings. As shown in Fig. 1, the method may specifically comprise:
Step 11: classify the files in the database according to a preset strategy, and divide the files in the database into several sub-databases by file type.
Specifically, the files in the database can be classified under several strategies. This embodiment may classify by file suffix (such as exe files, txt files, and avi files), by file category (such as document files, graphic files, and multimedia files), or by file size (for example, below 1 MB is a small file, 1 MB to 1 GB is a medium file, and above 1 GB is a large file). After classification is complete, the database sets up one sub-database per class, and each sub-database has the relevant database functions in its own right. The catalogue of the resulting sub-databases can be saved in the database as a separate file for users to query.
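The classification in step 11 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names `by_suffix`, `by_size`, and `partition`, the in-memory dictionary standing in for sub-databases, and the sample file records are all assumptions made for the example. Only the size thresholds (1 MB and 1 GB) and the idea of one sub-database per class come from the text.

```python
from collections import defaultdict
from pathlib import PurePath

MB = 1024 * 1024
GB = 1024 * MB


def by_suffix(name: str, size: int) -> str:
    """Classify by file suffix, e.g. 'txt', 'exe', 'avi'."""
    return PurePath(name).suffix.lower().lstrip(".") or "no_suffix"


def by_size(name: str, size: int) -> str:
    """Classify by size: below 1 MB small, 1 MB to 1 GB medium, above large."""
    if size < MB:
        return "small"
    if size <= GB:
        return "medium"
    return "large"


def partition(files, strategy):
    """Split (name, size) records into sub-databases keyed by class label."""
    sub_databases = defaultdict(list)
    for name, size in files:
        sub_databases[strategy(name, size)].append(name)
    return dict(sub_databases)


# Hypothetical file records: (name, size in bytes).
files = [
    ("report.txt", 2_000),
    ("setup.exe", 5 * MB),
    ("movie.avi", 2 * GB),
    ("notes.txt", 300),
]

subs = partition(files, by_suffix)
# The catalogue of sub-databases, which the text suggests saving as a file.
catalogue = sorted(subs)
```

Passing the strategy as a function keeps the partitioning step unchanged when the preset strategy is swapped, for example from `by_suffix` to `by_size`.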
Step 12: build the index list of the database according to the number of sub-databases, and search the index list using the type of the file to be retrieved, as input by the user, as the keyword.
Specifically, when a user searches the database, the file holding the catalogue of sub-databases can serve as the index list and provide the search function. During retrieval, the type of the file to be retrieved, as input by the user, is used as the keyword to search the index list, which yields the name of the sub-database corresponding to that file in the index list. The file is then searched for directly in that sub-database to obtain the corresponding content. This process first excludes all files whose type differs from that of the file to be retrieved, so the search scope is greatly narrowed, which improves retrieval efficiency and reduces resource usage.
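The two-stage lookup in step 12 might look like the sketch below. The names `build_index_list` and `retrieve`, the `subdb_` naming scheme, and the use of Python sets as sub-databases are assumptions for illustration; the text only specifies that the file type selects a sub-database from the index list and that the search then runs inside that sub-database alone.

```python
def build_index_list(sub_databases):
    """Map each file type to the name of the sub-database that holds it."""
    return {file_type: f"subdb_{file_type}" for file_type in sub_databases}


def retrieve(file_type, file_name, index_list, storage):
    """Find the sub-database by type, then search only inside it."""
    sub_name = index_list.get(file_type)
    if sub_name is None:
        return None  # no sub-database exists for this type
    return file_name if file_name in storage[sub_name] else None


# Hypothetical sub-databases produced by an earlier classification step.
sub_databases = {
    "txt": {"report.txt", "notes.txt"},
    "exe": {"setup.exe"},
}
index_list = build_index_list(sub_databases)
storage = {index_list[t]: names for t, names in sub_databases.items()}

found = retrieve("txt", "notes.txt", index_list, storage)
```

Files of every other type are never touched during the call, which is the narrowing of search scope that the paragraph above describes.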
Further, if the file to be retrieved, as input by the user, does not include a file type, retrieval by file type is impossible. This is because in some cases the user does not know the type of the file to be retrieved, or even its exact name, so the type cannot be determined and only a fuzzy search is possible. In that case the technical solution of steps 11 and 12 above cannot give the user a complete retrieval scheme, so the retrieval rule needs to be reformulated. In this embodiment, as shown in Fig. 2, the following is further added on the basis of step 12:
Step 13: if the file to be retrieved, as input by the user, does not include a file type, build the index list of the database according to a predetermined rule, and search the index list using the filename of the file to be retrieved as the keyword.
The predetermined rule may classify files by the entity that executes them. For example, files with suffixes such as .exe, .bat, and .com are placed in one class, because such files can be recognized and executed directly by the Windows operating system without third-party software. Files with suffixes such as .doc, .xls, and .vsd are placed in one class, because such files can be recognized and executed by the Office suite. Files with suffixes such as .avi, .mp3, and .rmvb are placed in one class, because such files can be recognized and executed by existing general-purpose audio and video decoders. Files with suffixes such as .bmp, .jpeg, and .png are placed in one class, because such files can be recognized and executed by existing general-purpose image decoders. Files recognized and executed by specific third-party software (such as software for PDF, PSD, and RAR files) can also be placed in a class of their own. Each class of files is treated as a sub-database with its own index list, and the filename of the file to be retrieved, as input by the user, is then used as the keyword to search each index list and obtain the corresponding content. This avoids the drawback of falling back to a full-database search when no strategy is available.
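A rough sketch of the step 13 fallback follows. The group labels, the extra `other` group, the single `third_party` class, and the exact-match lookup on the filename stem are assumptions chosen for the example; the text leaves the matching method open and mentions fuzzy search, which this sketch does not implement. The suffix groupings themselves are taken from the paragraph above.

```python
from pathlib import PurePath

# Suffix groups by executing entity, as listed in the text.
GROUPS = {
    "os_executable": {".exe", ".bat", ".com"},
    "office": {".doc", ".xls", ".vsd"},
    "audio_video": {".avi", ".mp3", ".rmvb"},
    "image": {".bmp", ".jpeg", ".png"},
    "third_party": {".pdf", ".psd", ".rar"},
}


def group_of(name):
    """Return the group whose suffix set contains this file's suffix."""
    suffix = PurePath(name).suffix.lower()
    for group, suffixes in GROUPS.items():
        if suffix in suffixes:
            return group
    return "other"  # assumption: unlisted suffixes share one group


def build_name_indexes(names):
    """Build one index per group, keyed by lowercase filename stem."""
    indexes = {}
    for name in names:
        stem = PurePath(name).stem.lower()
        indexes.setdefault(group_of(name), {}).setdefault(stem, []).append(name)
    return indexes


def search_by_name(keyword, indexes):
    """Look the filename up in each group's index; no file-by-file scan."""
    key = keyword.lower()
    hits = []
    for group_index in indexes.values():
        hits.extend(group_index.get(key, []))
    return sorted(hits)


# Hypothetical file names.
names = ["budget.xls", "budget.pdf", "song.mp3", "run.bat"]
indexes = build_name_indexes(names)
matches = search_by_name("budget", indexes)
```

Each group costs one dictionary lookup, so the work grows with the number of groups rather than with the number of stored files.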
In addition, as shown in Fig. 2, the following may be further added on the basis of step 13:
Step 14: if the type of at least one file in the database changes, rebuild the index list of the database. This further improves retrieval accuracy: the index list of the database can be adjusted in real time so that the user retrieves the most accurate content.
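Step 14 can be illustrated with the sketch below, which rebuilds the index list only when a rename changes a file's type. The `TypeIndex` class, its `rename` method, the `rebuilds` counter, and the use of the suffix as the file type are hypothetical choices for this example; the text states only that a type change triggers a rebuild of the index list.

```python
from pathlib import PurePath


def file_type(name):
    """Return the lowercase suffix, used here as the file's type."""
    return PurePath(name).suffix.lower()


class TypeIndex:
    """Index list mapping each file type to the names stored under it."""

    def __init__(self, names):
        self.names = set(names)
        self.rebuilds = 0
        self.rebuild()

    def rebuild(self):
        """Recompute the whole index list from the current files."""
        self.index = {}
        for name in self.names:
            self.index.setdefault(file_type(name), set()).add(name)
        self.rebuilds += 1

    def rename(self, old, new):
        """Rename a file; rebuild the index list only if its type changed."""
        self.names.remove(old)
        self.names.add(new)
        if file_type(old) != file_type(new):
            self.rebuild()
        else:
            bucket = self.index[file_type(old)]
            bucket.discard(old)
            bucket.add(new)


idx = TypeIndex(["a.txt", "b.doc"])
idx.rename("b.doc", "c.doc")  # same type: bucket updated in place
idx.rename("a.txt", "a.doc")  # type changed: index list rebuilt
```

A full rebuild matches the wording of step 14 most directly. Moving the single affected entry between buckets would also keep the index list consistent, at lower cost for large databases.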
With the technical solution provided by this embodiment, sub-databases are built by classifying files, a corresponding index list is built, and retrieval then uses the type of the file to be retrieved, as input by the user, as the keyword, which gives higher retrieval efficiency and lower resource usage.
The specific embodiment of the present invention also provides a device for fast database indexing which, as shown in Fig. 3, may specifically comprise:
a sub-database division unit 31, configured to classify the files in a database according to a preset strategy, and to divide the files in the database into several sub-databases by file type;
a file type retrieval unit 32, configured to build an index list of the database according to the number of sub-databases, and to search the index list using the type of the file to be retrieved, as input by the user, as the keyword.
Preferably, as shown in Fig. 4, the device may further comprise:
a filename retrieval unit 33, configured to, if the file to be retrieved, as input by the user, does not include a file type, build the index list of the database according to a predetermined rule, and search the index list using the filename of the file to be retrieved as the keyword.
Preferably, as shown in Fig. 4, the device may further comprise:
an index list rebuilding unit 34, configured to rebuild the index list of the database if the type of at least one file in the database changes.
The processing functions of the units in the above device have already been described in the method embodiment and are not repeated here. With the technical solution provided by this embodiment, sub-databases are built by classifying files, a corresponding index list is built, and retrieval then uses the type of the file to be retrieved, as input by the user, as the keyword, which gives higher retrieval efficiency and lower resource usage.
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims (6)

1. the method for a database quick indexing is characterized in that, comprising:
According to the strategy that sets in advance the file in the database is classified, and by file type the Divide File in the described database is become several subdata bases;
Set up the index list of described database according to the number of described subdata base, and the type of the file to be retrieved of user's input is retrieved described index list as keyword.
2. method according to claim 1 is characterized in that, described method also comprises:
If the file to be retrieved of user input does not comprise file type, then set up the index list of described database by predetermined rule, and the filename of the file to be retrieved of user's input is retrieved described index list as keyword.
3. method according to claim 1 is characterized in that, described method also comprises:
If the type change of at least one file in the described database then rebulids the index list of described database.
4. the device of a database quick indexing is characterized in that, comprising:
The word bank division unit is used for according to the strategy that sets in advance the file of database being classified, and by file type the Divide File in the described database is become several subdata bases;
The file type retrieval unit is used for setting up according to the number of described subdata base the index list of described database, and the type of the file to be retrieved of user's input is retrieved described index list as keyword.
5. device according to claim 4 is characterized in that, described device also comprises:
The filename retrieval unit, do not comprise file type if be used for the file to be retrieved of user's input, then set up the index list of described database by predetermined rule, and the filename of the file to be retrieved of user's input is retrieved described index list as keyword.
6. device according to claim 4 is characterized in that, described device also comprises:
Index list re-establishes the unit, if be used for the type change of at least one file of described database, then rebulids the index list of described database.
CN201210491642.7A 2012-11-27 2012-11-27 Method and device for fast database indexing Expired - Fee Related CN102930060B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210491642.7A CN102930060B (en) 2012-11-27 2012-11-27 Method and device for fast database indexing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210491642.7A CN102930060B (en) 2012-11-27 2012-11-27 Method and device for fast database indexing

Publications (2)

Publication Number Publication Date
CN102930060A true CN102930060A (en) 2013-02-13
CN102930060B CN102930060B (en) 2016-05-04

Family

ID=47644857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210491642.7A Expired - Fee Related CN102930060B (en) 2012-11-27 2012-11-27 Method and device for fast database indexing

Country Status (1)

Country Link
CN (1) CN102930060B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104239586A (en) * 2014-10-16 2014-12-24 北京奇虎科技有限公司 Method and device for processing information material file
CN105302669A (en) * 2015-10-23 2016-02-03 浙江工商大学 Method and system for data deduplication in cloud backup process
CN106446269A (en) * 2016-10-19 2017-02-22 广东小天才科技有限公司 Data storage method and system
CN106649678A (en) * 2016-12-15 2017-05-10 咪咕文化科技有限公司 Data processing method and system
CN107168966A (en) * 2016-03-07 2017-09-15 阿里巴巴集团控股有限公司 A kind of search engine index construction method and device
CN108460075A (en) * 2017-12-28 2018-08-28 上海顶竹通讯技术有限公司 A kind of file content search method and system
CN109063215A (en) * 2018-10-16 2018-12-21 成都四方伟业软件股份有限公司 Data retrieval method and device
CN109344265A (en) * 2018-09-10 2019-02-15 新华三大数据技术有限公司 A kind of method for managing resource and device
CN110990430A (en) * 2019-11-29 2020-04-10 广西电网有限责任公司 Large-scale data parallel processing system
CN111045994A (en) * 2019-12-25 2020-04-21 山东方寸微电子科技有限公司 KV database-based file classification retrieval method and system
CN111143587A (en) * 2019-12-24 2020-05-12 深圳云天励飞技术有限公司 Data retrieval method and device and electronic equipment
CN111901684A (en) * 2020-07-30 2020-11-06 深圳市康冠科技股份有限公司 File classification method and related device
CN112633686A (en) * 2020-12-22 2021-04-09 华中科技大学同济医学院附属协和医院 Medical system labor dispatch management system and working method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1845032A (en) * 2005-04-06 2006-10-11 杭州波导软件有限公司 Method for realizing classification management of use right of mobile terminal user
CN101930444A (en) * 2009-06-18 2010-12-29 鸿富锦精密工业(深圳)有限公司 Image search system and method
CN102387422A (en) * 2010-08-31 2012-03-21 青岛海信电器股份有限公司 Digital media player, file searching method thereof and television
US20120179709A1 (en) * 2011-01-11 2012-07-12 Wataru Nakano Apparatus, method and program product for searching document

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104239586B (en) * 2014-10-16 2018-10-09 北京奇虎科技有限公司 A kind of method and apparatus of processing information material file
CN104239586A (en) * 2014-10-16 2014-12-24 北京奇虎科技有限公司 Method and device for processing information material file
CN105302669A (en) * 2015-10-23 2016-02-03 浙江工商大学 Method and system for data deduplication in cloud backup process
CN105302669B (en) * 2015-10-23 2019-04-30 浙江工商大学 The method and system of data deduplication in a kind of cloud backup procedure
CN107168966A (en) * 2016-03-07 2017-09-15 阿里巴巴集团控股有限公司 A kind of search engine index construction method and device
CN107168966B (en) * 2016-03-07 2020-10-20 创新先进技术有限公司 Search engine index construction method and device
CN106446269A (en) * 2016-10-19 2017-02-22 广东小天才科技有限公司 Data storage method and system
CN106649678B (en) * 2016-12-15 2020-07-10 咪咕文化科技有限公司 Data processing method and system
CN106649678A (en) * 2016-12-15 2017-05-10 咪咕文化科技有限公司 Data processing method and system
CN108460075A (en) * 2017-12-28 2018-08-28 上海顶竹通讯技术有限公司 A kind of file content search method and system
CN108460075B (en) * 2017-12-28 2021-11-30 上海顶竹通讯技术有限公司 File content retrieval method and system
CN109344265A (en) * 2018-09-10 2019-02-15 新华三大数据技术有限公司 A kind of method for managing resource and device
CN109063215A (en) * 2018-10-16 2018-12-21 成都四方伟业软件股份有限公司 Data retrieval method and device
CN110990430A (en) * 2019-11-29 2020-04-10 广西电网有限责任公司 Large-scale data parallel processing system
CN111143587A (en) * 2019-12-24 2020-05-12 深圳云天励飞技术有限公司 Data retrieval method and device and electronic equipment
CN111045994A (en) * 2019-12-25 2020-04-21 山东方寸微电子科技有限公司 KV database-based file classification retrieval method and system
CN111045994B (en) * 2019-12-25 2023-08-22 山东方寸微电子科技有限公司 File classification retrieval method and system based on KV database
CN111901684A (en) * 2020-07-30 2020-11-06 深圳市康冠科技股份有限公司 File classification method and related device
CN112633686A (en) * 2020-12-22 2021-04-09 华中科技大学同济医学院附属协和医院 Medical system labor dispatch management system and working method thereof

Also Published As

Publication number Publication date
CN102930060B (en) 2016-05-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160504

Termination date: 20161127
