CN105045932B - A data paging query method based on descending storage - Google Patents

Publication number
CN105045932B
Authority
CN
China
Prior art keywords
page
data
rowkey
time
descending
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510557950.9A
Other languages
Chinese (zh)
Other versions
CN105045932A (en)
Inventor
张登银
陈佳敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai BETA Software Co., Ltd.
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University
Priority to CN201510557950.9A
Publication of CN105045932A
Application granted
Publication of CN105045932B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Abstract

The present invention relates to a data paging query method based on descending storage. It targets large-scale Web information systems built on HBase, which does not support the native paging of relational databases, and addresses the high resource consumption and low query efficiency of existing HBase table paging query algorithms. Based on the storage characteristics of data in HBase, the method proposes a paging query algorithm ordered by descending time: rows are arranged in descending timestamp order so that the newest data is stored at the top of the table, meeting users' need to access the latest data. By specifying the size of the queried data and the start time, and using the PageFilter function to control the number of rows returned per page, the method arranges data in descending time order and presents it to the user page by page, thereby reducing the cost of transmitting data over the network.

Description

A data paging query method based on descending storage
Technical field
The present invention relates to a data paging query method based on descending storage, and belongs to the field of data query technology.
Background
With the rapid development of information technologies such as mobile communication networks, the Internet, and the Internet of Things, information technology has become part of everyday life. Communication terminals and sensing devices all over the world are producing more data than in any previous era. At the same time, the rise of e-commerce and social networks is generating data of every kind at every moment. The era of big data has arrived: we live in a society with more information than ever, and people interact with data and networked information more closely than at any time before.
Massive data contains a great deal of information, from which much useful value can be extracted. Large enterprise-level Web information systems store large, even massive, amounts of data, and paging query is an indispensable technique in developing such systems. Relational databases provide a feature-rich SQL language and a complete, mature set of paging query methods.
HBase is a highly reliable, column-oriented, scalable distributed storage system with high read/write performance. HBase stores data as key-value pairs, but unlike some typical NoSQL databases, an HBase key is composed of multiple parts spanning four dimensions: RowKey (row key), Column Family, Column Qualifier (column name), and Timestamp. An HBase key-value pair can therefore be expressed in the form {rowkey, column family, column qualifier, timestamp} --> Value, the purpose of which is to let users access specified data more easily. The Key Value format of HBase is as shown in the table.
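As a rough illustration of the four-part key described above, the following Python sketch models a cell as a mapping from {rowkey, column family, column qualifier, timestamp} to a value. The `CellKey` type, the `scan` helper, and the sample rows are assumptions made for illustration; this is not the HBase client API. The sort order used (rowkey, family, and qualifier ascending, newest timestamp first) reflects how HBase is generally understood to order cells.

```python
from typing import NamedTuple


class CellKey(NamedTuple):
    """Illustrative stand-in for the four dimensions of an HBase key."""
    rowkey: str
    column_family: str
    column_qualifier: str
    timestamp: int


# A toy table: {rowkey, column family, column qualifier, timestamp} --> value
cells = {
    CellKey("row-b", "info", "status", 2): "ok",
    CellKey("row-a", "info", "status", 1): "ok",
    CellKey("row-a", "info", "load", 1): "0.7",
}


def scan(table):
    """Return (key, value) pairs in a simplified HBase-like order.

    Rowkey, family, and qualifier sort ascending (lexicographically);
    within one column the newest timestamp comes first.
    """
    return sorted(
        table.items(),
        key=lambda kv: (kv[0].rowkey, kv[0].column_family,
                        kv[0].column_qualifier, -kv[0].timestamp),
    )
```

Because rows come back in lexicographic rowkey order regardless of insertion order, the choice of rowkey decides what a reader sees first, which is the property the method relies on.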
Unlike a traditional relational database, an HBase table holds a large volume of data while offering limited functions, and it cannot provide paging comparable to the stored-procedure paging of a relational database. HBase paging query is therefore a research topic of considerable practical significance, and some researchers have proposed their own solutions.
(1) An HBase data table paging query method first assigns a sequence number to each row of the HBase table, then stores each sequence number with the corresponding row key in an index table. For a paging query, the sequence number range is computed from the page number and the number of rows per page, the index table is searched by that range to obtain the corresponding set of keys, and finally the original data table is queried with that key set to obtain the data for the page. This solves the inefficiency of jumping between pages, but obtaining the total number of records in HBase is difficult. Although the current total can be kept in the database once all data has been traversed, if many records satisfy the query condition, traversing the data to count them takes a long time.
(2) An HBase paged browsing method in which the front end does not need to record the total number of pages. To process a paging query, it first calls the scan method, then calls the setFilter method to filter the query, and finally uses PageFilter to limit the number of rows returned. It also designs a caching mechanism bound to each logged-in user: as the user browses, the rowKey of the data is recorded, which avoids the resource overhead of scanning all the data and then traversing it row by row. The drawback of this scheme is that it is complex to implement. The present invention solves the above problems well.
Summary of the invention
The object of the present invention is to address the problems of paging query algorithms for HBase tables by proposing a data paging query method based on descending storage. The method solves the high resource consumption and low query efficiency of HBase table paging query algorithms, improves the efficiency of paging queries over HBase table data, and reduces the response latency of data queries.
The technical solution adopted by the present invention is as follows: by specifying the size of the queried data and the start time, and using the PageFilter function to control the number of rows returned per page, the invention arranges data in descending time order and presents it to the user page by page, thereby reducing the cost of transmitting data over the network.
Method flow:
Step 1: Data of the page-block size is retrieved from the database and stored on HDFS;
Step 2: The system generates a timestamp Tm; with the current time Tc, Tm-Tc and the device ID form the Rowkey;
Step 3: After descending processing according to Rowkey, the data is stored in an HBase table;
Step 4: Input the start time Tn and the number of rows displayed per page N, set Tn=startRow, and leave stopRowkey at its default;
Step 5: Call the PageFilter (tableName, startRowkey, N+1) function, which returns M rows.
Further, step 1 of the present invention includes:
1) paging is performed at the database side;
2) on every page-turn operation, data of the page-block size is retrieved from the database.
Further, the Rowkey of step 2 is constructed as follows: data in an HBase table is stored in lexicographic order of Rowkey, and timestamps are increasing, so a new Rowkey is defined as the combination of Tm-Tc and the device ID, where Tm-Tc denotes the timestamp Tm minus the current time Tc.
Further, the storage of step 3 is as follows: in paged browsing, data sorted in ascending time order is displayed from front to back, so historical data is obtained first. Users, however, mostly care about the newest data, so the collected data is processed so that the latest data is stored at the top of the HBase table, meeting users' information needs.
Further, step 4 of the present invention includes:
1) The paging mode does not need to record the total count. As on social networking sites and some forums, the client does not need the total number of records; it only needs to determine whether there is more data after each page, and offers the user "next page" and "previous page" functions;
2) Only previous/next page turning is provided. The entire data table is not paged all at once; each request fetches only the amount of data the user specified, which improves query efficiency.
Further, step 5 of the present invention includes:
1) The PageFilter (tableName, startRowkey, stopRowkey, N+1) function controls the number of rows returned per page, so there is no need to record the total number of rows in the database or to label every row; data is arranged in descending time order and presented to the user page by page;
2) After one page has been queried: on a "next page" request, update page=page+1 and call the page function; on a "previous page" request, update page=page-1 and check whether page is 0; if it is 0 the operation ends, otherwise call the page function to obtain the page data.
The present invention is applied to data paging queries.
Advantageous effects:
1. For large-scale Web information systems implemented on HBase, which does not support the native paging of relational databases, the method improves the efficiency of paging queries over HBase table data and reduces the response latency of data queries.
2. The invention arranges data in descending time order and presents it to the user page by page, reducing the cost of transmitting data over the network.
Brief description of the drawings
Fig. 1 is the flow chart of the method in an embodiment of the present invention.
Fig. 2 is a schematic diagram of the process from a user issuing a request to data being returned.
Fig. 3 is the data storage flow chart of the present invention.
Fig. 4 is the data query flow chart of the present invention.
Fig. 5 shows the storage order of data in the HBase table before and after descending processing.
Detailed description
The invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the present invention is a paging query method for HBase data tables. By generating a new timestamp and composing a new Rowkey, it stores collected data in the HBase table in descending time order. The method does not use the traditional approach of labeling every row and counting the total number of rows; instead it records the starting Rowkey of each page and calls the PageFilter method in HBase to provide previous/next page turning, fulfilling the requirements of paging queries.
The implementation of the present invention includes:
1. Paging query scheme
2. Data descending storage method
3. Paging query algorithm
4. Data query flow
1. Paging query scheme:
Taking the typical three-tier architecture in Fig. 2 as an example, from the moment the user issues a data query request until data is returned, paging can be performed at the database side, the Web server side, or the client browser side. The criteria for choosing the tier at which to page are the speed of data processing and the cost of network transmission resources. Weighing both, paging is performed at the database side.
A reasonable paging query scheme is: on every page-turn operation, retrieve only one page block's worth of data from the database. Although this scheme queries the database on every page turn, the number of records retrieved is small, which greatly reduces the volume of data transmitted over the network. In addition, the time-consuming process of establishing a database connection can be addressed with a database connection pool.
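The connection pool mentioned above can be sketched in a few lines. The patent does not specify a pool implementation, so the `ConnectionPool` class and the `open_connection` factory below are hypothetical; the factory returns a placeholder object where a real system would open a database connection.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Minimal pool: connections are created once and reused across page requests."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back to the pool instead of closing it


created = []


def open_connection():
    # Hypothetical factory: stands in for the expensive connection setup.
    conn = object()
    created.append(conn)
    return conn


pool = ConnectionPool(open_connection, size=2)

# Each page turn borrows a connection for the duration of one query.
with pool.connection() as conn:
    pass  # run the page query with `conn` here
```

With this arrangement the setup cost is paid `size` times at startup rather than once per page turn, which is the trade-off the scheme depends on.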
2. Data descending storage method:
As can be seen from Fig. 3, a data acquisition server collects real-time operating data from the converged network. The data is first stored on HDFS, then the collected data undergoes descending processing, and finally it is stored in an HBase table. Data in an HBase table is stored in lexicographic order of Rowkey; because timestamps are increasing, the newest data ends up at the bottom of the HBase table. In paged browsing, data sorted in ascending time order is displayed from front to back, so historical data is obtained first. Users, however, mostly care about the newest data, so the collected data must be processed so that the latest data is stored at the top of the HBase table.
The descending processing method is as follows:
Step 1: The system generates a large timestamp Tm; the generated timestamp Tm minus the current time Tc is used as part of the Rowkey of the stored data;
Step 2: Tm-Tc and the device ID are combined into the Rowkey;
Step 3: The timestamp subtraction changes the order in which data is stored in HBase.
Fig. 5 shows the storage order of data in the HBase table before and after descending processing. In the original HBase table, Rowkeys are arranged in lexicographic order and the latest data is stored at the bottom of the table. Because timestamps are increasing, the subtraction changes the storage order in HBase so that data is stored in the HBase table in descending time order.
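The descending processing steps can be sketched as follows. The value chosen for Tm, the zero-padding to a fixed width, the underscore separator, and the sample readings are all assumptions for illustration; the source only states that the Rowkey combines Tm-Tc with the device ID. Padding is added here because lexicographic order matches numeric order only when the numeric part has a fixed width.

```python
TM = 10**13            # assumed "large timestamp" Tm in milliseconds; must exceed any Tc
WIDTH = len(str(TM))   # fixed width so that string order equals numeric order


def make_rowkey(tc_millis: int, device_id: str) -> str:
    """Build a rowkey from Tm - Tc and the device id (format is an assumption)."""
    if not 0 <= tc_millis < TM:
        raise ValueError("Tc must lie in [0, Tm)")
    reversed_ts = TM - tc_millis  # later Tc gives a smaller value
    return f"{reversed_ts:0{WIDTH}d}_{device_id}"


# Sample readings as (Tc in milliseconds, device id); values are made up.
readings = [
    (1441152000000, "dev01"),
    (1441152060000, "dev01"),
    (1441152120000, "dev02"),
]

# Sorting the keys lexicographically mimics the order in which the table stores rows.
rowkeys = sorted(make_rowkey(tc, dev) for tc, dev in readings)
```

After sorting, the key built from the latest Tc comes first, so a scan from the top of the table returns the newest data without any extra sort step at query time.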
3. Paging query algorithm:
The present invention uses a paging mode that does not need to record the total count, similar to the practice of social networking sites and some forums: the client does not need the total number of records; it only needs to determine whether there is more data after each page, and offers the user "next page" and "previous page" functions:
(1) A page-number cache is set up to record the starting Rowkey of each page, stored in the format <page number, startRowkey>. Each "next page" operation fetches one extra row and stores its Rowkey in the cache as the startRowkey of the next page;
(2) A page variable and a pageTemp variable are set up to record, respectively, the current page number and the total number of pages the user has browsed. If page equals pageTemp, the database is queried to obtain the startRowkey of the next page and the cache is updated; otherwise, the startRowkey of the previous or next page is read directly from the cache.
The algorithm is described as follows:
Input: the start time Tn and the number of rows displayed per page N.
Output: N rows of data.
Step 1: Input the time Tn and the number of rows displayed per page N.
Step 2: Initialize the page number page=1, initialize pageTemp=1, and initialize Map<pageTemp,Tn> to store the startRowkey of each page.
Step 3: With Tn as startRowkey and stopRowkey left at its default, call the PageFilter (tableName, startRowkey, stopRowkey, N+1) function, which returns M rows.
Step 4: Determine whether M is less than N. If so, there is no next page, and a "next page" request ends.
Step 5: If not, a next page exists; go to step 6 and continue.
Step 6: Determine whether page equals pageTemp. If they are not equal, the page the user requested lies within pageTemp, and step 9 or step 10 can be executed directly; if they are equal, continue.
Step 7: Update pageTemp=pageTemp+1 to record the current page number.
Step 8: Filter the M rows, take the Rowkey of the extra row, and store it in Map<pageTemp,Rowkey> as the startRowkey of the next page.
Step 9: When the user requests "next page", update page=page+1, use page as the key to take the corresponding value from the Map as startRowkey, and call the PageFilter (tableName, startRowkey, stopRowkey, N+1) function. Repeat steps 3, 4, and 5.
Step 10: When the user requests "previous page", update page=page-1 and check whether page equals 0. If so, there is no previous page and the "previous page" request ends; otherwise, use page as the key to take the corresponding value from the Map as startRowkey and call the PageFilter function to obtain the previous page's data.
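The ten steps above can be sketched against an in-memory table. Everything here is a simplified stand-in: `page_filter` imitates the PageFilter (tableName, startRowkey, stopRowkey, N+1) call with a sorted Python list rather than an HBase scan, and the `Pager` class and sample rows are hypothetical. One deliberate difference: the sketch reports a next page only when all N+1 rows come back, which is slightly stricter than the "M less than N" test in Step 4.

```python
import bisect


def page_filter(table, start_rowkey, limit):
    """Stand-in for the PageFilter call: up to `limit` rows with rowkey >= start_rowkey.

    `table` is a list of (rowkey, value) tuples already sorted by rowkey.
    """
    keys = [k for k, _ in table]
    i = bisect.bisect_left(keys, start_rowkey)
    return table[i:i + limit]


class Pager:
    def __init__(self, table, tn, n):
        self.table = table
        self.n = n                   # rows displayed per page (N)
        self.page = 1                # page currently shown
        self.page_temp = 1           # highest page whose start key is known
        self.start_keys = {1: tn}    # Map<page number, startRowkey>
        self.has_next = False

    def current(self):
        """Fetch N+1 rows; the extra row signals, and locates, the next page."""
        rows = page_filter(self.table, self.start_keys[self.page], self.n + 1)
        self.has_next = len(rows) > self.n
        if self.has_next and self.page == self.page_temp:
            # First visit to this page: cache the next page's start key.
            self.page_temp += 1
            self.start_keys[self.page_temp] = rows[self.n][0]
        return rows[:self.n]

    def next_page(self):
        if not self.has_next:
            return None              # no next page
        self.page += 1
        return self.current()

    def prev_page(self):
        if self.page == 1:
            return None              # no previous page
        self.page -= 1
        return self.current()        # start key comes from the cache


table = [(f"k{i}", f"row {i}") for i in range(1, 6)]
pager = Pager(table, tn="", n=2)     # an empty start key begins at the top
first = pager.current()
```

Going back to a page already visited reads its start key from the map instead of recomputing it, so each page turn costs one bounded scan of N+1 rows and no total count is ever needed.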
4. Data query flow
Data query is an important part of how users extract data of interest from a database system and of data analysis and processing. In an HBase system, the data query flow generally has three phases, as shown in Fig. 4:
(1) The user submits a query request through the client, and the client sends the request over the network to the HBase cluster server (i.e., HMaster).
(2) According to the user's request, HMaster notifies RegionServer, which is responsible for retrieving the data.
(3) RegionServer returns the query results to the client.

Claims (1)

1. A data paging query method based on descending storage, characterized in that the method is applied to data paging queries and comprises the following steps:
Step 1: Data of the page-block size is retrieved from the database and stored on HDFS, including:
1) paging is performed at the database side;
2) on every page-turn operation, data of the page-block size is retrieved from the database;
Step 2: The system generates a timestamp Tm; with the current time Tc, Tm-Tc and the device ID form the Rowkey, wherein: data in an HBase table is stored in lexicographic order of Rowkey, the timestamps are increasing, and the Rowkey is defined as the combination of Tm-Tc and the device ID, where Tm-Tc denotes the timestamp Tm minus the current time Tc;
Step 3: After descending processing according to Rowkey, the data is stored in an HBase table; in paged browsing, data sorted in ascending time order is displayed from front to back, so historical data is obtained first; users, however, mostly care about the newest data, so the collected data is processed so that the latest data is stored at the top of the HBase table, meeting users' information needs;
Step 4: Input the start time Tn and the number of rows displayed per page N, set Tn=startRow, and leave stopRowkey at its default, including:
1) the paging mode does not need to record the total count; as on social networking sites and some forums, the client does not need the total number of records, only needs to determine whether there is more data after each page, and offers the user "next page" and "previous page" functions;
2) only previous/next page turning is provided; the entire data table is not paged all at once, and each request fetches only the amount of data the user specified, which improves query efficiency;
Step 5: Call the PageFilter (tableName, startRowkey, N+1) function, which returns M rows, including:
1) the PageFilter (tableName, startRowkey, stopRowkey, N+1) function controls the number of rows returned per page, so there is no need to record the total number of rows in the database or to label every row; data is arranged in descending time order and presented to the user page by page;
2) after one page has been queried: on a "next page" request, update page=page+1 and call the page function; on a "previous page" request, update page=page-1 and check whether page is 0; if it is 0 the operation ends, otherwise call the page function to obtain the page data;
The descending processing method is as follows:
Step 1: The system generates a large timestamp Tm; the generated timestamp Tm minus the current time Tc is used as part of the Rowkey of the stored data;
Step 2: Tm-Tc and the device ID are combined into the Rowkey;
Step 3: The timestamp subtraction changes the order in which data is stored in HBase;
The paging query algorithm of the method:
(1) a page-number cache is set up to record the starting Rowkey of each page, stored in the format <page number, startRowkey>; each "next page" operation fetches one extra row and stores its Rowkey in the cache as the startRowkey of the next page;
(2) a page variable and a pageTemp variable are set up to record, respectively, the current page number and the total number of pages the user has browsed; if page equals pageTemp, the database is queried to obtain the startRowkey of the next page and the cache is updated; otherwise, the startRowkey of the previous or next page is read directly from the cache.
CN201510557950.9A 2015-09-02 2015-09-02 A data paging query method based on descending storage Active CN105045932B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510557950.9A CN105045932B (en) 2015-09-02 2015-09-02 A data paging query method based on descending storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510557950.9A CN105045932B (en) 2015-09-02 2015-09-02 A data paging query method based on descending storage

Publications (2)

Publication Number Publication Date
CN105045932A CN105045932A (en) 2015-11-11
CN105045932B true CN105045932B (en) 2018-11-13

Family

ID=54452478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510557950.9A Active CN105045932B (en) 2015-09-02 2015-09-02 A data paging query method based on descending storage

Country Status (1)

Country Link
CN (1) CN105045932B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10574739B2 (en) * 2016-02-26 2020-02-25 Honeywell International Inc. System and method for smart event paging
CN105843956A (en) * 2016-04-14 2016-08-10 北京搜狐新媒体信息技术有限公司 Paging query method and system
CN106021357B (en) * 2016-05-09 2019-05-03 泰华智慧产业集团股份有限公司 Based on distributed big data paging query method and system
CN106126731B (en) * 2016-07-01 2020-02-14 百势软件(北京)有限公司 Method and device for acquiring Elasticissearch paging data
CN108073661A (en) * 2016-11-18 2018-05-25 北京京东尚科信息技术有限公司 Data retrieval method and device, report generating system and method
CN107391749B (en) * 2017-08-15 2020-07-31 杭州安恒信息技术股份有限公司 Method for realizing waterfall flow by inquiring sub-table data
CN109460404A (en) * 2018-09-03 2019-03-12 中新网络信息安全股份有限公司 A kind of efficient Hbase paging query method based on redis
CN109271597B (en) * 2018-09-19 2022-02-18 郑州云海信息技术有限公司 Method and device for paging display of multi-table scene of non-relational database
CN110597859B (en) * 2019-09-06 2022-03-29 天津车之家数据信息技术有限公司 Method and device for querying data in pages
CN111221815B (en) * 2019-11-07 2021-07-27 南京莱斯网信技术研究院有限公司 Script-based web service paging data acquisition system
CN111400347A (en) * 2020-03-20 2020-07-10 北京思特奇信息技术股份有限公司 Paging query method, system and electronic equipment
CN111488370B (en) * 2020-04-02 2023-09-12 杭州迪普科技股份有限公司 List paging quick response system and method
CN112182040A (en) * 2020-09-30 2021-01-05 深圳前海微众银行股份有限公司 Data query method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103268341A (en) * 2013-05-10 2013-08-28 深圳市葡萄信息技术有限公司 Time line integration method and system on basis of multi-source
CN103617232A (en) * 2013-11-26 2014-03-05 北京京东尚科信息技术有限公司 Paging inquiring method for HBase table
CN104850640A (en) * 2015-05-26 2015-08-19 华北电力大学(保定) HBase based storage and query method and system for power equipment status monitoring data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Cloud Database (云数据库); wwduest; Baidu Wenku (百度文库); 2015-07-29; slide 9 *

Also Published As

Publication number Publication date
CN105045932A (en) 2015-11-11

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20190726

Address after: Room 302 and 303, 86 blocks, 700 Yishan Road, Xuhui District, Shanghai

Patentee after: Shanghai BETA Software Co., Ltd.

Address before: 210003 Gulou District, Jiangsu, Nanjing new model road, No. 66

Patentee before: Nanjing Post & Telecommunication Univ.