CN105045932B - A data paging query method based on descending storage
- Publication number
- CN105045932B (application number CN201510557950.9A)
- Authority
- CN
- China
- Prior art keywords
- page
- data
- rowkey
- time
- descending
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
Abstract
The present invention relates to a data paging query method based on descending storage. The method targets large-scale Web information systems built on HBase, which does not support the native paging of relational databases, and addresses the high resource consumption and low query efficiency of existing paging query algorithms for HBase tables. Based on how HBase stores data, the method proposes a paging query algorithm ordered by descending time: data are arranged in descending timestamp order so that the newest data are stored at the top of the table, which meets users' need to access the latest data. By specifying the size of the data to be queried and a start time, and by using the PageFilter function to limit the number of rows returned per page, the method arranges data in descending time order and presents them to the user page by page, thereby reducing the cost of transmitting data over the network.
Description
Technical field
The present invention relates to a data paging query method based on descending storage and belongs to the field of data query technology.
Background art
With the rapid development of information technologies such as mobile communication networks, the Internet, and the Internet of Things, information technology has become part of people's lives. Communication terminals and sensing devices all over the world produce far more data than in any previous era. At the same time, the rise of e-commerce and social networks is generating all kinds of data at every moment. The era of big data has arrived: we live in a society with more information than ever, and the interaction between people and data or networked information will be closer than at any time before.
Massive data contain a great deal of information, from which much useful value can be extracted. Large enterprise-level Web information systems store large or even massive amounts of data, and paging query is an indispensable technique in developing such systems. Relational databases provide a feature-rich SQL language and a complete, mature set of paging query methods.
HBase is a highly reliable, column-oriented, scalable distributed storage system with high read/write performance. HBase stores data as key-value pairs but, unlike some typical NoSQL databases, the key in HBase is composed of several parts covering four dimensions: RowKey (row key), Column Family, Column Qualifier (column name), and Timestamp. A key-value pair in HBase can therefore be written as {rowkey, column family, column qualifier, timestamp} --> Value. The purpose of this arrangement is to let users access specified data more easily. The key-value format of HBase is shown in the table.
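As a rough illustration of the key structure described above, the following Python sketch models a cell key as a four-part tuple and lists rows in row key order. It is a toy in-memory model and not the HBase API; the `CellKey` class and the sample values are assumptions made for illustration.

```python
from typing import NamedTuple

class CellKey(NamedTuple):
    """Toy model of the four-part key (hypothetical, not an HBase class)."""
    rowkey: str
    column_family: str
    column_qualifier: str
    timestamp: int

# A table is modelled here as a plain mapping {CellKey: value}.
table = {
    CellKey("row-002", "info", "temp", 1441152000): "21.5",
    CellKey("row-001", "info", "temp", 1441151000): "20.9",
    CellKey("row-003", "info", "temp", 1441153000): "22.1",
}

# Tuples compare field by field, so sorting the keys orders rows by rowkey first,
# which mirrors the lexicographic row key order discussed in the text.
scan_order = [key.rowkey for key in sorted(table)]
print(scan_order)
```

Because the row key is the leading component, whatever is encoded at the front of the row key decides where a record lands in the table.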
Unlike a traditional relational database, an HBase table holds a large volume of data while HBase offers a limited set of functions and cannot provide paging through stored procedures as relational databases do. HBase paging query is therefore a research topic of real practical significance, and some researchers have proposed their own solutions.
(1) An HBase table paging query method first assigns a serial number to every row of the HBase table, then stores each serial number together with the corresponding row key in an index table. For a paging query, the serial number range is computed from the page number and the number of rows per page; the index table is searched by that range to obtain the corresponding set of keys; finally the original data table is queried with that key set to obtain the data for paged display. This solves the inefficiency of jumping between pages, but obtaining the total number of records in HBase is very difficult. After all data have been browsed once, the current record total can be saved in the database, but if many records satisfy the query condition, traversing the data once to count them takes a long time.
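The serial-number scheme in (1) can be sketched as follows. This is only an illustration of the range calculation, assuming 1-based page numbers and serial numbers; `index_table` is a hypothetical in-memory stand-in for the index table, and the row key format is invented for the example.

```python
from typing import List

def serial_range(page: int, rows_per_page: int) -> range:
    """Serial numbers (1-based) covered by a 1-based page number."""
    if page < 1 or rows_per_page < 1:
        raise ValueError("page and rows_per_page must be positive")
    first = (page - 1) * rows_per_page + 1
    return range(first, first + rows_per_page)

# Hypothetical index table: serial number -> row key in the original table.
index_table = {n: f"row-{n:04d}" for n in range(1, 26)}

def page_rowkeys(page: int, rows_per_page: int) -> List[str]:
    """Look up the row keys for one page; serial numbers past the end are skipped."""
    return [index_table[n] for n in serial_range(page, rows_per_page) if n in index_table]

print(page_rowkeys(3, 10))  # the last, partially filled page of 25 rows
```

Jumping to an arbitrary page is cheap in this scheme, but every row needs a serial number, which is the bookkeeping cost the text points out.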
(2) An HBase paged browsing method in which the front end does not need to record the total number of pages. When handling a paging query, the scan method is called first, then the setFilter method is called to filter the query, and finally PageFilter limits the number of rows returned. A caching mechanism is also designed: it is bound to each logged-in user and records the rowKey of the data the user is browsing, which avoids the resource overhead of scanning all data and then traversing it row by row. The drawback of this scheme is that it is rather complex to implement. The present invention solves the above problems well.
Summary of the invention
The object of the present invention is to address the problems of paging query algorithms for HBase tables by proposing a data paging query method based on descending storage. The method solves the high resource consumption and low query efficiency of paging query algorithms for HBase tables, improves the efficiency of paging queries over HBase table data, and reduces query response latency.
The technical solution adopted by the present invention is as follows: by specifying the size of the data to be queried and a start time, and by using the PageFilter function to limit the number of rows returned per page, the invention arranges data in descending time order and presents them to the user page by page, thereby reducing the cost of transmitting data over the network.
Method flow:
Step 1: Retrieve data of page-block size from the database; the data are stored on HDFS.
Step 2: The system generates a timestamp Tm; with the current time Tc, Tm-Tc and the device ID form the Rowkey.
Step 3: After descending-order processing according to the Rowkey, the data are stored in the HBase table.
Step 4: Input the start time Tn and the number of records displayed per page N; set startRow to Tn and leave stopRowkey at its default.
Step 5: Call the PageFilter(tableName, startRowkey, N+1) function, which returns M records.
Further, step 1 of the present invention includes:
1) paging is performed on the database side;
2) each time a page is turned, data of page-block size are retrieved from the database.
Further, the Rowkey of step 2 is characterized as follows: data in an HBase table are stored in lexicographic order of the Rowkey, and timestamps are increasing, so a new Rowkey is defined as the combination of Tm-Tc and the device ID, where Tm-Tc denotes the timestamp Tm minus the current time Tc.
Further, the storage of step 3 is characterized as follows: in paged browsing, data sorted in ascending time order are displayed from front to back, so historical data are obtained first. Users, however, mostly care about the newest data, so the collected data are processed so that the latest data are stored at the top of the HBase table, meeting users' information needs.
Further, step 4 of the present invention includes:
1) the paging mode does not need to record the total; as on social networking sites and some forums, the client does not need to obtain the total number of records and only needs to determine whether more data follow each page, offering the user "next page" and "previous page" functions;
2) only previous/next page turning is provided; the data of the entire table are not paginated in one pass, and each time only the amount of data specified by the user is fetched, which improves query efficiency.
Further, step 5 of the present invention includes:
1) the PageFilter(tableName, startRowkey, stopRowkey, N+1) function limits the number of rows returned per page; there is no need to record the total number of records in the database or to label every row, and the data are arranged in descending time order and presented to the user page by page;
2) after one page has been queried: on a "next page" request, update page=page+1 and call the page function; on a "previous page" request, update page=page-1 and check whether page is 0; if it is 0 the operation ends, otherwise call the page function to obtain the page data.
The present invention is applied to data paging queries.
Advantageous effects:
1. For large-scale Web information system scenarios built on HBase, which does not support the native paging of relational databases, the method improves the efficiency of paging queries over HBase table data and reduces query response latency.
2. The present invention arranges data in descending time order and presents them to the user page by page, which reduces the cost of data network transmission.
Brief description of the drawings
Fig. 1 is a flow chart of the method in an embodiment of the present invention.
Fig. 2 is a schematic diagram of the process from a user issuing a request to the data being returned.
Fig. 3 is a flow chart of data storage in the present invention.
Fig. 4 is a flow chart of data query in the present invention.
Fig. 5 shows the storage order in the HBase table before and after descending-order processing.
Detailed description of the embodiments
The invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the present invention is a paging query method for HBase data tables. By generating a new timestamp and composing a new Rowkey, it stores collected data in the HBase table in descending time order. Rather than the traditional approach of labelling every row and counting the total number of records, the method records the starting Rowkey of every page and calls the PageFilter method in HBase to provide previous/next page turning, which satisfies the requirements of paging queries.
The implementation of the present invention includes:
1. Paging query scheme
2. Data descending storage method
3. Paging query algorithm
4. Data query flow
1. Paging query scheme:
Taking the typical three-tier architecture of Fig. 2 as an example, when a user issues a data query request and the data are returned, paging can be performed on the database side, on the Web server side, or on the client side (i.e. in the browser). The criteria for choosing the layer at which to paginate are the speed of data processing and the cost of network transmission resources. On balance, paging is performed on the database side.
A reasonable paging query scheme is to retrieve data of page-block size from the database each time a page is turned. Although this scheme queries the database on every page turn, each query returns only a few records, which greatly reduces the amount of data transmitted over the network. In addition, the time-consuming process of establishing database connections can be addressed with a database connection pool.
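The connection pool mentioned above can be sketched as follows. This is a minimal, generic pool built on Python's standard `queue` module, not the pooling mechanism of any particular database driver; `ConnectionPool` and the `factory` argument are hypothetical names introduced for illustration.

```python
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Minimal fixed-size pool: connections are created once and then reused."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # pay the connection cost up front

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for the next query

# Usage with a dummy factory that just counts how many "connections" were opened.
opened = []
pool = ConnectionPool(lambda: opened.append(object()) or opened[-1], size=2)
for _ in range(5):  # five page turns, each borrowing a pooled connection
    with pool.connection() as conn:
        pass
print(len(opened))
```

Each page turn still issues a query, but it borrows an existing connection instead of opening a new one.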
2. Data descending storage method:
As Fig. 3 shows, a data acquisition server collects real-time operating data of the converged network and first stores these data on HDFS; the collected data then undergo descending-order processing and are finally stored in the HBase table. Data in an HBase table are stored in lexicographic order of the Rowkey; because timestamps are increasing, the newest data are stored at the bottom of the HBase table. In paged browsing, data sorted in ascending time order are displayed from front to back, so historical data are obtained first. Users, however, mostly care about the newest data, so the collected data need to be processed so that the latest data are stored at the top of the HBase table.
The descending-order processing is as follows:
Step 1: The system generates a large timestamp Tm; Tm minus the current time Tc is used as part of the Rowkey of the stored data.
Step 2: Tm-Tc and the device ID are combined into the Rowkey.
Step 3: The timestamp subtraction changes the order in which data are stored in HBase.
Fig. 5 shows the storage order in the HBase table before and after descending-order processing. In the original HBase table, Rowkeys are arranged in lexicographic order and the latest data are stored at the bottom of the table. Because timestamps are increasing, the subtraction changes the storage order of the data in HBase, so that the data are stored in the HBase table in descending time order.
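The Rowkey construction described above can be sketched as follows. The concrete value of Tm, the millisecond unit, the zero padding, and the `-` separator are assumptions made for this illustration; the padding is added so that lexicographic order of the strings matches numeric order of Tm-Tc.

```python
T_MAX = 9_999_999_999_999   # assumed "large" timestamp Tm, in milliseconds
WIDTH = len(str(T_MAX))     # fixed width so string order matches numeric order

def make_rowkey(tc_ms: int, device_id: str) -> str:
    """Build a row key from (Tm - Tc) and the device ID."""
    if not 0 <= tc_ms <= T_MAX:
        raise ValueError("timestamp out of range")
    return f"{T_MAX - tc_ms:0{WIDTH}d}-{device_id}"

# Three readings taken one minute apart (sample values).
samples = [
    (1441152000000, "dev01"),
    (1441152060000, "dev01"),
    (1441152120000, "dev02"),
]

# Lexicographic sorting, as a row-key-ordered table would store the rows.
rowkeys = sorted(make_rowkey(tc, dev) for tc, dev in samples)
for key in rowkeys:
    print(key)
```

The most recent reading has the smallest Tm-Tc and therefore sorts first, which is the "newest data at the top" effect the method relies on.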
3. Paging query algorithm:
The present invention uses a paging mode that does not need to record the total, similar to the practice of social networking sites and some forums: the client does not need to obtain the total number of records and only needs to determine whether more data follow each page, offering the user "next page" and "previous page" functions:
(1) A page-number cache is set up to record the starting Rowkey of every page, stored as <page number, startRowkey>. Each "next page" operation fetches one extra record and stores its Rowkey in the cache as the startRowkey of the next page.
(2) A page variable and a pageTemp variable are set up to record, respectively, the current page number and the total number of pages the user has browsed. If page equals pageTemp, the database is queried to obtain the startRowkey of the next page and the cache is updated; otherwise the startRowkey of the previous or next page is read directly from the cache.
The algorithm is described as follows:
Input: start time Tn and number of records displayed per page N.
Output: N records.
Step 1: Input the time Tn and the number of records displayed per page N.
Step 2: Initialize the page number page=1, initialize pageTemp=1, and initialize Map<pageTemp,Tn>, which stores the startRowkey of every page.
Step 3: With Tn as startRowkey and stopRowkey left at its default, call the PageFilter(tableName, startRowkey, stopRowkey, N+1) function, which returns M records.
Step 4: Check whether M is less than N. If so, there is no next page and the "next page" request ends.
Step 5: If not, a next page exists; go to step 6 and continue.
Step 6: Check whether page equals pageTemp. If they are not equal, the page requested by the user lies within pageTemp and step 9 or step 10 can be executed directly; if they are equal, continue.
Step 7: Update pageTemp=pageTemp+1 to record the current page number.
Step 8: Filter the M records, take the Rowkey of the extra record, and store it in Map<pageTemp,Rowkey> as the startRowkey of the next page.
Step 9: When the user requests "next page", update page=page+1, take the value for key page from the Map as startRowkey, and call the PageFilter(tableName, startRowkey, stopRowkey, N+1) function. Repeat steps 3, 4, and 5.
Step 10: When the user requests "previous page", update page=page-1 and check whether page equals 0. If so, there is no previous page and the "previous page" request ends; otherwise, take the value for key page from the Map as startRowkey and call the PageFilter function to obtain the data of the previous page.
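The steps above can be sketched as follows. This is a toy in-memory model under stated assumptions: the table is a sorted Python list of row keys, `page_filter` is a stand-in for the PageFilter(tableName, startRowkey, stopRowkey, N+1) call, and `Pager` is a hypothetical name. The sketch treats a result of N or fewer rows as the last page, because the extra (N+1)-th row is what signals that a next page exists.

```python
import bisect

def page_filter(table, start_rowkey, limit):
    """Stand-in for a limited scan: up to `limit` row keys >= start_rowkey."""
    i = bisect.bisect_left(table, start_rowkey)
    return table[i:i + limit]

class Pager:
    def __init__(self, table, start_rowkey, n):
        self.table = table                   # sorted list of row keys (toy table)
        self.n = n                           # rows shown per page (N)
        self.page = 1                        # page currently shown
        self.page_temp = 1                   # highest page visited so far
        self.start_keys = {1: start_rowkey}  # cache: page number -> startRowkey
        self.rows, self.has_next = self._load()

    def _load(self):
        # Fetch N+1 rows; the extra row only tells us whether a next page exists.
        rows = page_filter(self.table, self.start_keys[self.page], self.n + 1)
        has_next = len(rows) > self.n
        if has_next and self.page == self.page_temp:
            # Furthest page reached so far: cache the start key of the next page.
            self.start_keys[self.page + 1] = rows[self.n]
        return rows[:self.n], has_next

    def next_page(self):
        if not self.has_next:
            return None                      # no next page
        self.page += 1
        self.page_temp = max(self.page_temp, self.page)
        self.rows, self.has_next = self._load()
        return self.rows

    def prev_page(self):
        if self.page == 1:
            return None                      # no previous page
        self.page -= 1
        self.rows, self.has_next = self._load()  # start key comes from the cache
        return self.rows

table = [f"k{i:02d}" for i in range(7)]      # seven rows, already sorted
pager = Pager(table, "k00", 3)
print(pager.rows, pager.next_page(), pager.next_page(), pager.next_page())
```

No total row count is ever computed: each request reads at most N+1 rows, and moving backwards reuses start keys that were cached on the way forward.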
4. Data query flow:
Data query is how users extract the data they are interested in from a database system, and it is an important part of data analysis and processing. In an HBase system the data query flow generally has three stages, as shown in Fig. 4:
(1) The user submits a query request through the client, which sends the request over the network to the HBase cluster server (i.e. HMaster).
(2) According to the user's request, HMaster notifies the RegionServer, which is responsible for retrieving the data.
(3) The RegionServer returns the query results to the client.
Claims (1)
1. A data paging query method based on descending storage, characterized in that the method is applied to data paging queries and comprises the following steps:
Step 1: retrieving data of page-block size from the database, the data being stored on HDFS, including:
1) performing paging on the database side;
2) each time a page is turned, retrieving data of page-block size from the database;
Step 2: the system generates a timestamp Tm; with the current time Tc, Tm-Tc and the device ID form the Rowkey, wherein: data in the HBase table are stored in lexicographic order of the Rowkey and the timestamps are increasing, so the Rowkey is defined as the combination of Tm-Tc and the device ID, where Tm-Tc denotes the timestamp Tm minus the current time Tc;
Step 3: after descending-order processing according to the Rowkey, storing the data in the HBase table; in paged browsing, data sorted in ascending time order are displayed from front to back, so historical data are obtained first; users, however, mostly care about the newest data, so the collected data are processed so that the latest data are stored at the top of the HBase table, meeting users' information needs;
Step 4: inputting the start time Tn and the number of records displayed per page N, setting startRow to Tn, and leaving stopRowkey at its default, including:
1) the paging mode does not need to record the total; as on social networking sites and some forums, the client does not need to obtain the total number of records and only needs to determine whether more data follow each page, offering the user "next page" and "previous page" functions;
2) only previous/next page turning is provided; the data of the entire table are not paginated in one pass, and each time only the amount of data specified by the user is fetched, which improves query efficiency;
Step 5: calling the PageFilter(tableName, startRowkey, N+1) function, which returns M records, including:
1) the PageFilter(tableName, startRowkey, stopRowkey, N+1) function limits the number of rows returned per page; there is no need to record the total number of records in the database or to label every row, and the data are arranged in descending time order and presented to the user page by page;
2) after one page has been queried: on a "next page" request, updating page=page+1 and calling the page function; on a "previous page" request, updating page=page-1 and checking whether page is 0; if it is 0 the operation ends, otherwise calling the page function to obtain the page data;
the descending-order processing is as follows:
Step 1: the system generates a large timestamp Tm; Tm minus the current time Tc is used as part of the Rowkey of the stored data;
Step 2: Tm-Tc and the device ID are combined into the Rowkey;
Step 3: the timestamp subtraction changes the order in which data are stored in HBase;
the paging query algorithm of the method is:
(1) a page-number cache is set up to record the starting Rowkey of every page, stored as <page number, startRowkey>; each "next page" operation fetches one extra record and stores its Rowkey in the cache as the startRowkey of the next page;
(2) a page variable and a pageTemp variable are set up to record, respectively, the current page number and the total number of pages the user has browsed; if page equals pageTemp, the database is queried to obtain the startRowkey of the next page and the cache is updated; otherwise, the startRowkey of the previous or next page is read directly from the cache.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510557950.9A CN105045932B (en) | 2015-09-02 | 2015-09-02 | A kind of data page querying method based on descending storage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105045932A CN105045932A (en) | 2015-11-11 |
CN105045932B true CN105045932B (en) | 2018-11-13 |
Family
ID=54452478
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510557950.9A Active CN105045932B (en) | 2015-09-02 | 2015-09-02 | A kind of data page querying method based on descending storage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105045932B (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10574739B2 (en) * | 2016-02-26 | 2020-02-25 | Honeywell International Inc. | System and method for smart event paging |
CN105843956A (en) * | 2016-04-14 | 2016-08-10 | 北京搜狐新媒体信息技术有限公司 | Paging query method and system |
CN106021357B (en) * | 2016-05-09 | 2019-05-03 | 泰华智慧产业集团股份有限公司 | Based on distributed big data paging query method and system |
CN106126731B (en) * | 2016-07-01 | 2020-02-14 | 百势软件(北京)有限公司 | Method and device for acquiring Elasticissearch paging data |
CN108073661A (en) * | 2016-11-18 | 2018-05-25 | 北京京东尚科信息技术有限公司 | Data retrieval method and device, report generating system and method |
CN107391749B (en) * | 2017-08-15 | 2020-07-31 | 杭州安恒信息技术股份有限公司 | Method for realizing waterfall flow by inquiring sub-table data |
CN109460404A (en) * | 2018-09-03 | 2019-03-12 | 中新网络信息安全股份有限公司 | A kind of efficient Hbase paging query method based on redis |
CN109271597B (en) * | 2018-09-19 | 2022-02-18 | 郑州云海信息技术有限公司 | Method and device for paging display of multi-table scene of non-relational database |
CN110597859B (en) * | 2019-09-06 | 2022-03-29 | 天津车之家数据信息技术有限公司 | Method and device for querying data in pages |
CN111221815B (en) * | 2019-11-07 | 2021-07-27 | 南京莱斯网信技术研究院有限公司 | Script-based web service paging data acquisition system |
CN111400347A (en) * | 2020-03-20 | 2020-07-10 | 北京思特奇信息技术股份有限公司 | Paging query method, system and electronic equipment |
CN111488370B (en) * | 2020-04-02 | 2023-09-12 | 杭州迪普科技股份有限公司 | List paging quick response system and method |
CN112182040A (en) * | 2020-09-30 | 2021-01-05 | 深圳前海微众银行股份有限公司 | Data query method, device, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103268341A (en) * | 2013-05-10 | 2013-08-28 | 深圳市葡萄信息技术有限公司 | Time line integration method and system on basis of multi-source |
CN103617232A (en) * | 2013-11-26 | 2014-03-05 | 北京京东尚科信息技术有限公司 | Paging inquiring method for HBase table |
CN104850640A (en) * | 2015-05-26 | 2015-08-19 | 华北电力大学(保定) | HBase based storage and query method and system for power equipment status monitoring data |
Non-Patent Citations (1)
Title |
---|
Cloud database; wwduest; Baidu Wenku (百度文库); 2015-07-29; PPT page 9 * |
Also Published As
Publication number | Publication date |
---|---|
CN105045932A (en) | 2015-11-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105045932B (en) | A kind of data page querying method based on descending storage | |
CN104750681B (en) | A kind of processing method and processing device of mass data | |
CN103646051B (en) | Big-data parallel processing system and method based on column storage | |
CN103631909B (en) | System and method for combined processing of large-scale structured and unstructured data | |
CN102446225A (en) | Real-time search method, device and system | |
CN101196900A (en) | Information searching method based on metadata | |
CN102164186A (en) | Method and system for realizing cloud search service | |
CN108509437A (en) | A kind of ElasticSearch inquiries accelerated method | |
CN104424258A (en) | Multidimensional data query method and system, query server and column storage server | |
CN102999563A (en) | Network resource semantic retrieval method and system based on resource description framework | |
CN103955533B (en) | A kind of page tree data acquisition device based on buffer queue and method | |
CN108228743A (en) | A kind of real-time big data search engine system | |
CN102253939A (en) | Searching method and system based on cloud computing technology | |
Wang et al. | A novel blockchain oracle implementation scheme based on application specific knowledge engines | |
CN107798062A (en) | A kind of transformer station's historical data unifies storage method and system | |
CA3062944A1 (en) | An emergency disposal support system | |
CN102156749B (en) | Anatomic search and judgment method, system and distributed server system for map sites | |
CN109189873A (en) | A kind of Meteorological Services big data monitoring analysis system platform | |
CN103218396B (en) | The management and running visual analysis method of static Web page is generated according to visitation frequency feature | |
CN103823805B (en) | Community-based correlation note commending system and recommendation method | |
CN106777395A (en) | A kind of topic based on community's text data finds system | |
CN101788981A (en) | Deep web mobile search method, server and system | |
Tang et al. | Searching the Internet of Things using coding enabled index technology | |
Lu et al. | Research and implementation of big data system of social media | |
Jun et al. | Application of Web services on the real-time data warehouse technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20190726; Address after: Room 302 and 303, 86 blocks, 700 Yishan Road, Xuhui District, Shanghai; Patentee after: Shanghai BETA Software Co., Ltd.; Address before: 210003 Gulou District, Jiangsu, Nanjing new model road, No. 66; Patentee before: Nanjing Post & Telecommunication Univ. |