CN106503243A - Electric power big data query method and system based on HBase secondary indexes - Google Patents

Electric power big data query method and system based on HBase secondary indexes

Info

Publication number
CN106503243A
Authority
CN
China
Prior art keywords
data
secondary index
index table
row
tables
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610980816.4A
Other languages
Chinese (zh)
Other versions
CN106503243B (en)
Inventor
马艳
苏建军
张方正
李红梅
郭志红
陈玉峰
祝永新
盛戈皞
杨祎
许乃媛
沈宇蓝
王畅
刘斌
孙占睿
李程启
林颖
耿玉杰
白德盟
李华东
王勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd
Original Assignee
Shanghai Jiaotong University
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University, State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd filed Critical Shanghai Jiaotong University
Priority to CN201610980816.4A priority Critical patent/CN106503243B/en
Publication of CN106503243A publication Critical patent/CN106503243A/en
Application granted granted Critical
Publication of CN106503243B publication Critical patent/CN106503243B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an electric power big data query method and system based on HBase secondary indexes. The method comprises: step (1), creating secondary index tables; step (2), determining whether a data table has been updated and, if so, updating the corresponding secondary index tables, otherwise leaving them unchanged; step (3), querying the data through the secondary index tables. The invention supports basic update operations and, for each specific business query, efficiently implements join queries between data tables and selection queries, thereby supporting complex business requirements.

Description

Electric power big data query method and system based on HBase secondary indexes
Technical field
The present invention relates to an electric power big data query method and system based on HBase secondary indexes.
Background
The safety of power transmission and transformation equipment is the foundation of safe power grid operation. Data related to the state of such equipment is produced by inspection, testing, live-line detection, online monitoring, grid operation, environmental and meteorological monitoring, and equipment ledgers. It is scattered across different systems, large in volume, and complex in type. Designing an effective distributed storage model for power transmission and transformation equipment big data is the basis for comprehensive and accurate equipment state evaluation and an important support for full coupling analysis of grid big data, and is therefore of great significance.
The HBase database, which runs on the Hadoop platform, is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase, a large-scale storage cluster can be built on low-cost servers to meet the storage needs of grid big data. However, HBase-based big data storage schemes do not fully solve the problem of efficient data retrieval. In particular, faced with the complex and flexible query requirements of electric power big data, a single row key cannot satisfy business query needs, so a big data retrieval method that can meet these needs is urgently required.
[1] Power grid time-series big data storage method, CN 104239447A. This patent proposes a storage method for grid time-series big data. It uses the open-source distributed columnar database HBase as the storage layer and, in combination with the SG-CIM model, re-describes a batch of measuring points that are logically position-correlated in grid business. By designing a suitable index organization for the measuring point data storage table and using the partitioning and load-balancing functions of HBase, the historical data of a batch of position-correlated measuring points is placed adjacently in physical storage. Queries over the historical data of such a batch therefore incur less disk seek time, which improves query efficiency and provides fast query service for business applications.
[2] HBase secondary index method and device, CN 104112013A. This patent proposes building a secondary index for an HBase user table. The index entries of the secondary index are sorted on the values of the user table's rowkey, which makes it convenient to look up the user table by value. Each user table corresponds to one secondary index table, and the user table is stored on the same region server as its secondary index table, which avoids cross-region index access.
Patents [1] and [2] differ from the present invention. [1] proposes a secondary index organization for data that is correlated in business logic; its core idea is to store logically related data physically adjacent so as to improve query efficiency. [2] proposes an index generation scheme on top of HBase; its core idea is that each data table corresponds to one index table, and the data table and its index table are stored on the same server so as to improve query efficiency. The secondary index scheme proposed by the present invention is an electric power big data storage model based on HBase. Secondary index tables are first built on the relevant data columns according to the query business: a basic query business corresponds to one secondary index table, and a complex query business may correspond to several. At query time, the index table is queried first to obtain the row keys of the matching data, and the data table is then queried by those row keys to obtain the data. When an indexed column of a data table is updated, the corresponding secondary index table must be updated at the same time.
Summary of the invention
The purpose of the present invention is to solve the above problems by providing an electric power big data query method and system based on HBase secondary indexes. It supports basic update operations and, for each specific business query, efficiently implements join queries between data tables and selection queries, thereby supporting complex business requirements.
To achieve this purpose, the present invention adopts the following technical solution:
An electric power big data query method based on HBase secondary indexes comprises the following steps:
Step (1): create the secondary index tables;
Step (2): determine whether a data table has been updated; if so, update the secondary index tables; otherwise, do not update them;
Step (3): query the data using the secondary index tables.
In step (1), the secondary index tables are created as follows:
Step (11): generate the secondary index table according to the operation type;
Step (12): generate secondary index entries from the data columns and insert them into the secondary index table;
Step (11) comprises:
Step (111): for a selection query operation, the M data columns involved in the selection query are stored in M secondary index tables respectively, where M is greater than or equal to 1. The row key R of each secondary index table consists of three parts, in order: QUALIFIER, VALUE and ROWKEY, where QUALIFIER is the identifier of the data column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table;
Step (112): for a join query operation, the N data columns involved in the join query are stored in one secondary index table, where N is greater than or equal to 2. The row key R of the secondary index table consists of three parts, in order: PREFIX, VALUE and QUALIFIER, where PREFIX is generated by a hash function and distinguishes join query groups, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the data column in the data table;
Step (113): for both step (111) and step (112), the column value stored in the secondary index table is the ROWKEY of the corresponding data table; this column value and the row key R of the secondary index table together form one entry of the secondary index table;
The secondary index table is created in HBase (specifying its table name), and the association between each data column and its secondary index table is stored in a metadata table. The row key of the metadata table consists of, in order:
the table name of the data table, the column family name, the column name, the operation type of the secondary index table, and a timestamp.
The value corresponding to the metadata table row key is: the operation type of the secondary index table and the name of the secondary index table.
The operation types of a secondary index table are: selection query and join query.
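The key layouts of steps (111) to (113) and of the metadata table can be sketched as follows. This is a minimal illustration, not the patented implementation: the separator byte, the use of MD5 for PREFIX, the 8-character prefix length and all function names are assumptions, since the source only states that PREFIX comes from a hash function and does not specify an encoding.

```python
import hashlib

SEP = "\x00"  # hypothetical field separator; the source does not specify one


def selection_index_key(qualifier, value, rowkey):
    """Selection-index row key: QUALIFIER, VALUE, ROWKEY (step 111)."""
    return SEP.join([qualifier, value, rowkey])


def join_group_prefix(columns):
    """PREFIX shared by every column of one join group (step 112).

    MD5 is an assumption; any hash that is stable per group would do.
    """
    digest = hashlib.md5(SEP.join(sorted(columns)).encode("utf-8")).hexdigest()
    return digest[:8]


def join_index_key(prefix, value, qualifier):
    """Join-index row key: PREFIX, VALUE, QUALIFIER (step 112)."""
    return SEP.join([prefix, value, qualifier])


def metadata_key(table, family, column, op_type, timestamp):
    """Metadata row key: table name, column family, column, op type, timestamp."""
    return SEP.join([table, family, column, op_type, str(timestamp)])
```

Because HBase keeps rows sorted by row key, placing QUALIFIER first groups all entries of one column together, and placing PREFIX then VALUE first groups equal values of one join group together.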
Step (12) comprises:
Step (121): for a selection query operation, scan each of the M data columns, generate secondary index entries in the entry format described in step (113), and insert them into the corresponding secondary index tables.
Step (122): for a join query operation, scan each of the N data columns, generate secondary index entries in the entry format described in step (113), and insert them all into the same secondary index table.
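Steps (121) and (122) can be sketched with in-memory dictionaries standing in for HBase tables. The data layout (`{rowkey: {column: value}}`), the tuple-shaped keys and the function names are assumptions made for illustration; tuples are used so that Python's tuple ordering mimics HBase's sorted row keys.

```python
def build_selection_index(table_name, rows, column):
    """Step (121): one index per column, keyed (QUALIFIER, VALUE, ROWKEY)."""
    qualifier = f"{table_name}.{column}"
    return {
        (qualifier, cols[column], rowkey): rowkey
        for rowkey, cols in rows.items()
        if column in cols
    }


def build_join_index(prefix, sources):
    """Step (122): all N columns of a join group share one index.

    sources is a list of (table_name, rows, column) triples; each entry is
    keyed (PREFIX, VALUE, QUALIFIER) and stores the data-table row key.
    """
    index = {}
    for table_name, rows, column in sources:
        qualifier = f"{table_name}.{column}"
        for rowkey, cols in rows.items():
            if column in cols:
                index[(prefix, cols[column], qualifier)] = rowkey
    return index
```

In both cases the stored value is the data-table ROWKEY, matching the entry format of step (113).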
In step (2), the secondary index table is updated as follows:
Step (21): update the data table: through the Put interface provided by HBase on the Hadoop platform, submit the value of the data column, the row key, the column family and the column identifier to complete the update of the data table;
Step (22): generate secondary index entries: for the column currently being updated, query the metadata table to obtain the secondary index tables that need updating and their operation types; select the corresponding secondary index table format according to the operation type, and use the updated data to generate entries that conform to that format;
Step (23): update the secondary index table: through the interface provided by HBase Coprocessor on the Hadoop platform, submit the value, row key, column family and column identifier of the secondary index entry generated in step (22) to complete the update of the secondary index table.
Step (22) comprises:
Step (221): if the operation type of the secondary index table is selection query, use the updated data to generate entries that conform to the secondary index table format of step (111);
Step (222): if the operation type of the secondary index table is join query, use the updated data to generate entries that conform to the secondary index table format of step (112).
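The update path of steps (21) to (23) can be sketched as a single function over in-memory dictionaries. The metadata layout used here (a mapping from `(table, column)` to a list of `(op_type, index_name, prefix)` triples) and all names are hypothetical simplifications of the metadata table described above; a real deployment would go through the HBase Put interface and a coprocessor hook instead.

```python
def put_with_index(data_tables, metadata, indexes, table, rowkey, column, value):
    """Write one cell and maintain every secondary index covering it."""
    # Step (21): update the data table.
    data_tables.setdefault(table, {}).setdefault(rowkey, {})[column] = value

    # Step (22): look up which indexes cover this column, then build the
    # entry in the format that matches each index's operation type.
    qualifier = f"{table}.{column}"
    for op_type, index_name, prefix in metadata.get((table, column), []):
        if op_type == "select":
            key = (qualifier, value, rowkey)   # format of step (111)
        elif op_type == "join":
            key = (prefix, value, qualifier)   # format of step (112)
        else:
            raise ValueError(f"unknown operation type: {op_type}")

        # Step (23): write the entry; the stored value is the data ROWKEY.
        # Entries left over from an overwritten value are not removed here;
        # the source does not describe that case.
        indexes.setdefault(index_name, {})[key] = rowkey
```

A column with no metadata entry is written to the data table only, so unindexed columns incur no extra writes.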
In step (3), the data is queried using the secondary index table as follows:
Step (31): scan the secondary index table to obtain the row keys of the data to be queried;
Step (32): query the data table using the set of ROWKEYs of the data to be queried.
Step (31) comprises:
Step (311): query method for a selection-query secondary index table:
For each of the M data columns involved in the selection query business, query the metadata table by operation type to obtain the name of the corresponding secondary index table. That secondary index table is then queried as follows:
Using the secondary index row key format of step (111), seek directly to the first entry that satisfies the condition value of the selection query and keep scanning until an entry that does not satisfy it is found. The ROWKEYs of the scanned entries that satisfy the condition form the set of ROWKEYs matching the query condition on the current data column.
If M equals 1, this ROWKEY set is the set of ROWKEYs of the data to be queried;
If M is greater than 1, apply to the per-column ROWKEY sets the set operations that correspond to the logical relations among the M data columns in the query business: logical AND corresponds to set intersection and logical OR to set union. The result is the set of ROWKEYs of the data to be queried.
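The seek-then-scan procedure of step (311) can be sketched over a sorted list of index keys. The half-open range `[low, high)`, the zero-padded string values (so that lexicographic order matches numeric order) and the function name are assumptions for illustration; the qualifier check lets the same function work whether each column has its own index table or several columns share one.

```python
from bisect import bisect_left


def scan_selection_index(sorted_keys, qualifier, low, high):
    """Return data-table row keys whose indexed value v has low <= v < high.

    sorted_keys is a sorted list of (QUALIFIER, VALUE, ROWKEY) tuples.
    """
    # Seek directly to the first entry that can satisfy the condition.
    start = bisect_left(sorted_keys, (qualifier, low, ""))
    matches = set()
    for q, value, rowkey in sorted_keys[start:]:
        # Stop at the first entry that no longer satisfies the condition.
        if q != qualifier or value >= high:
            break
        matches.add(rowkey)
    return matches
```

Compound conditions then reduce to set operations on the per-column results: `s_a & s_c` for logical AND and `s_a | s_c` for logical OR.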
Step (312): query method for a join-query secondary index table:
For the N data columns involved in the join query business, query the metadata table by operation type to obtain the name of the corresponding secondary index table (the N columns share one secondary index table). That secondary index table is then queried as follows:
By the secondary index row key format of step (112), the entries of the N data columns that hold the same value are arranged consecutively in the secondary index table;
If the number of consecutive index entries with the same data column value is N, the ROWKEYs of these N entries form an N-tuple <R1, R2, …, RN> that satisfies the query condition;
Scanning the whole secondary index table yields the set of all N-tuples <R1, R2, …, RN> that satisfy the condition, and this set is the set of ROWKEYs of the data to be queried.
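The run-counting scan of step (312) can be sketched as follows, again with a dictionary standing in for the index table. The function name and the tuple-shaped keys are assumptions; within each emitted tuple the row keys appear in QUALIFIER sort order.

```python
from itertools import groupby


def scan_join_index(index, prefix, n):
    """Return the N-tuples of data-table row keys for one join group.

    index maps (PREFIX, VALUE, QUALIFIER) -> data-table row key.
    """
    # Sorting mimics HBase row-key order: equal VALUEs become adjacent.
    entries = sorted(key for key in index if key[0] == prefix)
    tuples = []
    for _, run in groupby(entries, key=lambda key: key[1]):
        run = list(run)
        # Keys are unique per (VALUE, QUALIFIER), so a run of length N
        # means all N columns of the group hold this value.
        if len(run) == n:
            tuples.append(tuple(index[key] for key in run))
    return tuples
```

Each index entry is visited once, which is where the linear scan cost of this scheme comes from.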
Step (32) comprises:
Using the set of ROWKEYs of the data to be queried obtained in step (311) or step (312), retrieve the corresponding data values from the data table through the Get interface provided by HBase.
An electric power big data query system based on HBase secondary indexes comprises:
a secondary index table creation module, for creating the secondary index tables;
an update determination module, for determining whether a data table has been updated and, if so, updating the secondary index tables, and otherwise not updating them;
a data query module, for querying the data using the secondary index tables.
Beneficial effects of the present invention:
This patent proposes a secondary index design based on HBase. The design effectively supports the most basic join query and selection query operations of relational databases, and thus provides good support for complex grid big data query business. Moreover, because secondary index tables are built per business, a balance can be struck between query performance and business flexibility.
The present invention proposes a secondary index design based on the HBase database that implements the basic selection query and join query functions of relational databases and can support complex query requirements in power grid systems.
Selection query performance: for any table T1 and a query for records satisfying condition <T1.a, a'>, the number of records the present invention needs to scan equals the number of records that satisfy the condition, which is less than the number of records |T1| in the whole table. The number of records scanned is comparable to querying an indexed column in a traditional relational database.
Join query performance: for a join query on any two tables T1 and T2, a traditional relational database needs to scan |T1| * |T2| records, whereas the present invention needs to scan |T1| + |T2| records. Even taking into account the subsequent join operation between sets, the present invention can greatly improve join query performance.
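To make the join comparison concrete, the two scan counts stated above can be evaluated for hypothetical table sizes. The figures 10^6 and 10^4 are assumptions chosen only for illustration and do not come from the source.

```latex
\begin{aligned}
|T_1| &= 10^{6}, \qquad |T_2| = 10^{4} \\
\text{pairwise scan: } |T_1| \cdot |T_2| &= 10^{6} \cdot 10^{4} = 10^{10} \\
\text{index scan: } |T_1| + |T_2| &= 10^{6} + 10^{4} = 1.01 \times 10^{6}
\end{aligned}
```

Under these assumed sizes the index scan touches roughly four orders of magnitude fewer records, before the cost of the final set-level join is added.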
Description of the drawings
Fig. 1 is the data query flow chart of the present invention;
Fig. 2 is the flow chart of the electric power big data selection query method of the present invention;
Fig. 3 is the flow chart of the electric power big data join query method of the present invention.
Detailed description
The invention is further described below with reference to the accompanying drawings and embodiments.
The scheme of the present invention covers two aspects: the update scheme for data tables and secondary indexes, and the query scheme for data tables. The query scheme includes a secondary index organization for basic selection queries and one for basic join queries, as shown in Figs. 1-3.
5.1 Creating secondary index tables
In the present invention, a basic query business corresponds to one secondary index table, and a complex query business may correspond to several. The value, row key, column family and column identifier of the data in a secondary index table are obtained by integrating and rearranging the original data.
a) For the secondary index of selection query operations, the present invention stores the relevant columns of the multiple tables involved in the selection query in one table. The row key R of the index table consists of three parts, in order: QUALIFIER, VALUE, ROWKEY, where QUALIFIER is the identifier of the column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table.
b) For join query operations, the present invention stores the secondary index of the relevant columns of the multiple tables involved in the join query in one table. The row key R of the index table consists of three parts, in order: PREFIX, VALUE, QUALIFIER. PREFIX is generated by a hash function and distinguishes join query groups, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the column in the data table.
The column value of the index table is the row key of the corresponding data; together with the index table row key it forms one entry of the index table.
5.2 Selecting the secondary index table for an operation request
In the present invention, the correspondence between each business and its index tables is stored in the metadata. When the data table of a business is updated or queried, the corresponding secondary index table is obtained from the metadata.
5.3 Data update
5.3.1 Updating the data table
The HBase Coprocessor on the Hadoop platform used by the present invention provides basic support for insert and delete operations on data tables. Through the interface provided by HBase Coprocessor, submitting the value, row key, column family and column identifier of the data updates the data table.
5.3.2 Generating secondary index entries
According to the secondary index table format, the data to be updated is used to generate entries that conform to the corresponding secondary index table format.
5.3.3 Updating the index table
The index table is updated in the same way as the data table: through the interface provided by HBase Coprocessor, submitting the value, row key, column family and column identifier of the index entry updates the index table.
5.4 Data query
5.4.1 Querying the secondary index table
To determine the row keys of the data that meets the conditions, the secondary index table is pre-scanned before the data is queried.
a) Query method for a selection-query index table:
For a compound selection query business, the compound condition is first split into single conditions. The set of entries satisfying each single condition is then obtained through the index table row keys. Finally, set operations on these per-condition entry sets yield all secondary index entries that satisfy the compound condition, from which the matching data table row keys are extracted. To obtain the set of secondary index entries satisfying a single condition, seek directly by index table row key to the first matching entry and scan forward until an entry that does not match is found; the scanned entries are then merged into the set of secondary index entries satisfying that condition.
As shown in Fig. 2, given data tables T1 and T2 and the compound selection query business (Y1): <T1.a, a'> || <T1.c, c'> || <T2.b, b'> (the value of column a in T1 is less than a', or the value of column c in T1 is less than c', or the value of column b in T2 is less than b'), the secondary index table stores the secondary index entries for the columns of T1 and T2 in one table. For Y1, row keys that begin with the same QUALIFIER form a continuous segment of stored records in the secondary index table. For condition <T1.a, a'>, the scan seeks directly by T1.a to the first record that satisfies the condition and continues until it meets the first record that does not, that is, a record whose value is not less than a'. The scan then stops, and the scanned entries are merged into a row key set S1: {R1}, the set of data table row keys of all records satisfying <T1.a, a'>. In the same way, scanning the other parts of the index table in order yields the set S2: {R2} for <T1.c, c'> and the set S3: {R3} for <T2.b, b'>. S1 ∪ S2 ∪ S3 then gives the data table row keys of all records satisfying Y1.
b) Query method for a join-query index table:
For a compound join query business, the query can be split into join query groups (two in the example of Fig. 3). When the data columns of one join query group are inserted into the index table, the hash function produces the same PREFIX value for all of them. The value stored under row key R is the row key of that column's record in its data table. At query time the secondary index table is scanned in full, the matching tuple sets are recorded, and set operations on these tuple sets yield the row keys of the matching data. While recording the tuple sets, a tuple is added only if it is formed from consecutive entries and satisfies the condition of its join query group.
As shown in Fig. 3, given data tables T1, T2, T3 and T4 and the compound join query business (Y2): T1.a = T2.b = T4.d && T1.e = T3.c (where a, b, c, d and e are columns of T1, T2, T3, T4 and T1 respectively), the query can be split into two join query groups, group one (Z1): T1.a = T2.b = T4.d and group two (Z2): T1.e = T3.c. For Y2, all columns in Z1 begin with the same PREFIX and therefore form a continuous segment of stored records in the secondary index table. Scanning this segment from the beginning, records with the same VALUE are encountered consecutively. The scan counts them, and three consecutive records (because Z1 involves 3 tables) with the same VALUE form one result record of the join query. The result is a set of triples S1: {<R1, R2, R4>}, where R1, R2 and R4 are the row keys in T1, T2 and T4 of the three columns with the same VALUE. S1 is the join query result of Z1. Likewise, scanning the records formed by Z2 yields the join query result S2: {<R1, R3>}. Because Z1 and Z2 are related by intersection (&&), joining S1 and S2 on R1 yields the final query result of business Y2, S: {<R1, R2, R3, R4>}.
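The final step of the Fig. 3 example, joining the tuple sets S1 and S2 on their shared T1 row key, can be sketched as a small hash join. The function name, the list-of-tuples representation and the explicit table-name lists are assumptions for illustration; the output columns are ordered as S1's tables followed by the tables that only S2 contributes.

```python
def join_on_shared(s1, s1_tables, s2, s2_tables, shared):
    """Join two row-key tuple sets on the table they have in common.

    s1 / s2 are lists of row-key tuples; s1_tables / s2_tables name the
    table each tuple position refers to. Returns (tables, tuples).
    """
    i1, i2 = s1_tables.index(shared), s2_tables.index(shared)
    # Positions in s2 that contribute tables not already present in s1.
    extra = [i for i, t in enumerate(s2_tables) if t not in s1_tables]

    # Hash s2 by the shared row key so each s1 tuple is matched directly.
    by_key = {}
    for t2 in s2:
        by_key.setdefault(t2[i2], []).append(t2)

    out_tables = list(s1_tables) + [s2_tables[i] for i in extra]
    result = []
    for t1 in s1:
        for t2 in by_key.get(t1[i1], []):
            result.append(t1 + tuple(t2[i] for i in extra))
    return out_tables, result
```

For the Y2 example this would be called with S1 over `["T1", "T2", "T4"]`, S2 over `["T1", "T3"]` and `shared="T1"`, giving one combined tuple per T1 row key present in both sets.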
5.4.2 Retrieving content from the data table by row key
Once the row keys of the data satisfying the query condition are obtained, they can be used to retrieve the corresponding data values from the data table through the Get interface provided by HBase Coprocessor.
Specific embodiment:
Install the Hadoop distributed file system;
Install the HBase database, version 0.92 or later;
Override the prePut and postPut methods of the region observer in HBase Coprocessor so that the corresponding secondary index tables are updated according to newly inserted data;
Implement the preGet method of the region observer in HBase Coprocessor so that the corresponding secondary index table is first accessed according to the query parameters to obtain the row keys of the queried data, and the required data is then queried by row key.
Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those skilled in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the present invention without creative effort still fall within the scope of protection of the invention.

Claims (10)

1. An electric power big data query method based on HBase secondary indexes, characterized by comprising the following steps:
Step (1): create the secondary index tables;
Step (2): determine whether a data table has been updated; if so, update the secondary index tables; otherwise, do not update them;
Step (3): query the data using the secondary index tables.
2. The electric power big data query method based on HBase secondary indexes as claimed in claim 1, characterized in that creating the secondary index tables in step (1) comprises the following steps:
Step (11): generate the secondary index table according to the operation type;
Step (12): generate secondary index entries from the data columns and insert them into the secondary index table.
3. The electric power big data query method based on HBase secondary indexes as claimed in claim 2, characterized in that step (11) comprises:
Step (111): for a selection query operation, the M data columns involved in the selection query are stored in M secondary index tables respectively, where M is greater than or equal to 1, and the row key R of each secondary index table consists of three parts, in order: QUALIFIER, VALUE and ROWKEY, where QUALIFIER is the identifier of the data column in the data table, VALUE is the value of the data column in the data table, and ROWKEY is the row key of the data table;
Step (112): for a join query operation, the N data columns involved in the join query are stored in one secondary index table, where N is greater than or equal to 2, and the row key R of the secondary index table consists of three parts, in order: PREFIX, VALUE and QUALIFIER, where PREFIX is generated by a hash function and distinguishes join query groups, VALUE is the value of the data column in the data table, and QUALIFIER is the identifier of the data column in the data table;
Step (113): for both step (111) and step (112), the column value stored in the secondary index table is the ROWKEY of the corresponding data table; this column value and the row key R of the secondary index table together form one entry of the secondary index table;
The secondary index table is created in HBase, and the association between each data column and its secondary index table is stored in a metadata table, the row key of which consists of, in order:
the table name of the data table, the column family name, the column name, the operation type of the secondary index table, and a timestamp.
The value corresponding to the metadata table row key is: the operation type of the secondary index table and the name of the secondary index table.
4. The electric power big data query method based on HBase secondary indexes as claimed in claim 3, characterized in that
step (12) comprises:
Step (121): For selection queries, scan the M data columns respectively, generate secondary index entries according to the entry format described in step (113), and insert the secondary index entries into the corresponding secondary index tables;
Step (122): For join queries, scan the N data columns respectively, generate secondary index entries according to the entry format described in step (113), and insert the secondary index entries into the same secondary index table.
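Steps (121) and (122) can be sketched with in-memory dictionaries standing in for HBase tables. This is an illustration under stated assumptions, not the patented implementation: tuples stand in for concatenated byte row keys, the data table is a plain dict of `rowkey -> {column: value}`, and the function names, column names and the `"p0"` prefix are hypothetical.

```python
def build_selection_indexes(data_table, columns):
    """Step (121): one index table per column, keyed (QUALIFIER, VALUE, ROWKEY)."""
    indexes = {col: {} for col in columns}
    for rowkey, cells in data_table.items():
        for col in columns:
            if col in cells:
                indexes[col][(col, cells[col], rowkey)] = rowkey
    return indexes


def build_join_index(data_table, columns, prefix):
    """Step (122): a single index table keyed (PREFIX, VALUE, QUALIFIER)."""
    index = {}
    for rowkey, cells in data_table.items():
        for col in columns:
            if col in cells:
                # Two rows with the same value in the same column would share
                # this key; the claims do not say how that case is handled.
                index[(prefix, cells[col], col)] = rowkey
    return index


# Hypothetical sample data: two rows that join on device_id == owner_id.
data_table = {
    "r1": {"device_id": "D7", "owner_id": "D9"},
    "r2": {"device_id": "D9", "owner_id": "D3"},
}
selection = build_selection_indexes(data_table, ["device_id"])
join_index = build_join_index(data_table, ["device_id", "owner_id"], "p0")
```

In this sample the join index holds four entries, and the two entries with value `D9` (one from each column) end up adjacent when the keys are sorted.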
5. The electric power big data query method based on HBase secondary indexes as claimed in claim 3, characterized in that the method of step (2) for updating the secondary index table comprises the following steps:
Step (21): Update the data table: through the Put method interface provided by HBase on the Hadoop platform, submit the value of the data column, the row key, the column family and the column identifier to complete the update of the data table;
Step (22): Generate secondary index entries: for the data column currently being updated, query the metadata table to obtain the secondary index tables that need to be updated and their operation types, select the corresponding secondary index table format according to the operation type, and use the data updated in the data table to generate entries that conform to that format;
Step (23): Update the secondary index table: through the interface methods provided by the HBase Coprocessor on the Hadoop platform, submit the value, row key, column family and column identifier of the secondary index table in the format of the secondary index entries generated in step (22) to complete the update of the secondary index table.
6. The electric power big data query method based on HBase secondary indexes as claimed in claim 5, characterized in that step (22) comprises the following steps:
Step (221): If the operation type of the secondary index table is a selection query, generate entries that conform to the secondary index table format of step (111) from the data updated in the data table;
Step (222): If the operation type of the secondary index table is a join query, generate entries that conform to the secondary index table format of step (112) from the data updated in the data table.
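The update flow of steps (21) to (23), including the dispatch on operation type in steps (221) and (222), can be sketched as follows. Dictionaries stand in for the data table, metadata table and index tables; the function name, the `"select"` and `"join"` labels and the shape of the metadata entries are assumptions for illustration. In an actual HBase deployment the index write would run inside a coprocessor hook rather than in client code.

```python
def put_with_index(data_table, metadata, indexes,
                   table_name, rowkey, family, qualifier, value):
    """Write one cell and maintain every secondary index registered for it."""
    # Step (21): update the data table.
    data_table.setdefault(rowkey, {})[(family, qualifier)] = value

    # Step (22): look up which index tables cover this column and build entries.
    # Assumed metadata shape: (table, family, column) -> [(op_type, index, prefix)].
    for op_type, index_name, prefix in metadata.get((table_name, family, qualifier), []):
        if op_type == "select":      # step (221): QUALIFIER, VALUE, ROWKEY
            key = (qualifier, value, rowkey)
        elif op_type == "join":      # step (222): PREFIX, VALUE, QUALIFIER
            key = (prefix, value, qualifier)
        else:
            raise ValueError(f"unknown operation type: {op_type}")

        # Step (23): write the entry; the cell value is the data-table ROWKEY.
        indexes.setdefault(index_name, {})[key] = rowkey
```

The sketch only adds entries. The claims do not describe removing the index entry for a value that has been overwritten, so that case is left out here as well.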
7. The electric power big data query method based on HBase secondary indexes as claimed in claim 3, characterized in that step (3), querying data using the secondary index table, comprises the following steps:
Step (31): Scan the secondary index table to obtain the row keys of the data to be queried;
Step (32): Query the data table using the set of ROWKEYs of the data to be queried.
8. The electric power big data query method based on HBase secondary indexes as claimed in claim 7, characterized in that step (31) comprises:
Step (311): Query method for the secondary index tables of a selection query:
For each of the M data columns involved in the selection query, query the metadata table according to the operation type to obtain the name of the corresponding secondary index table, then query that secondary index table as follows:
According to the secondary index table row key format of step (111), use the condition value of the selection query to position directly at the first matching entry, then continue scanning until a non-matching entry is found; the scanned matching entries form the set of ROWKEYs that satisfy the query condition on the current data column;
If M equals 1, this ROWKEY set is the set of ROWKEYs of the data to be queried;
If M is greater than 1, apply the corresponding set operations to the ROWKEY sets of the different columns according to the logical relationship among the M data columns in the query: logical AND corresponds to set intersection and logical OR corresponds to set union; the result is the set of ROWKEYs of the data to be queried;
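The selection query of step (311) amounts to a prefix scan on a sorted key space followed by set operations. The sketch below assumes each index table is available as a sorted list of `(QUALIFIER, VALUE, ROWKEY)` tuples and uses binary search to stand in for positioning an HBase scan at a start row; the sample columns and values are hypothetical.

```python
from bisect import bisect_left


def scan_equal(sorted_keys, qualifier, value):
    """Return the ROWKEYs of all index entries matching qualifier == value."""
    # Position directly at the first candidate entry ("" sorts before any rowkey).
    start = bisect_left(sorted_keys, (qualifier, value, ""))
    rowkeys = set()
    for q, v, rowkey in sorted_keys[start:]:
        if (q, v) != (qualifier, value):
            break  # first non-matching entry ends the scan
        rowkeys.add(rowkey)
    return rowkeys


# Hypothetical index tables for two columns of the same data table.
voltage_index = [("voltage", "110", "r2"), ("voltage", "220", "r1"), ("voltage", "220", "r3")]
status_index = [("status", "alarm", "r3"), ("status", "ok", "r1"), ("status", "ok", "r2")]

high_voltage = scan_equal(voltage_index, "voltage", "220")
status_ok = scan_equal(status_index, "status", "ok")

both = high_voltage & status_ok    # logical AND -> intersection
either = high_voltage | status_ok  # logical OR  -> union
```

Only the matching run of entries is read from each index, so the cost of the scan depends on the number of matches rather than on the size of the index table.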
Step (312): Query method for the secondary index table of a join query:
For the N data columns involved in the join query, query the metadata table according to the operation type to obtain the name of the corresponding secondary index table, then query that secondary index table as follows:
According to the secondary index table row key format of step (112), the entries of the N data columns that have the same value are arranged contiguously in the secondary index table;
If the number of contiguous index entries with the same data column value is N, the ROWKEYs of those N entries form an N-tuple <R1, R2, …, RN> that satisfies the query condition;
Scanning the whole secondary index table yields all N-tuples <R1, R2, …, RN> that satisfy the condition; the set {<R1, R2, …, RN>} is the set of ROWKEYs of the data to be queried.
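The join scan of step (312) can be sketched as grouping consecutive entries that share the same PREFIX and VALUE and keeping only the groups of size N. The sketch assumes the index is given as a list of `((PREFIX, VALUE, QUALIFIER), ROWKEY)` pairs already sorted by key; the function name and the sample data are hypothetical.

```python
from itertools import groupby


def join_scan(index_items, n):
    """Return every N-tuple of ROWKEYs whose entries share one (PREFIX, VALUE)."""
    tuples = []
    # groupby only merges adjacent items, which matches the contiguous layout
    # produced by the (PREFIX, VALUE, QUALIFIER) key order.
    for _, group in groupby(index_items, key=lambda item: item[0][:2]):
        rowkeys = [rowkey for _, rowkey in group]
        if len(rowkeys) == n:
            tuples.append(tuple(rowkeys))
    return tuples


# Hypothetical join index over columns device_id and owner_id (N = 2).
index_items = [
    (("p0", "D3", "owner_id"), "r2"),
    (("p0", "D7", "device_id"), "r1"),
    (("p0", "D9", "device_id"), "r2"),
    (("p0", "D9", "owner_id"), "r1"),
]
matches = join_scan(index_items, 2)
```

Here only the value `D9` appears in both columns, so a single 2-tuple is produced. Within each tuple the ROWKEYs follow the sort order of QUALIFIER.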
9. The electric power big data query method based on HBase secondary indexes as claimed in claim 8, characterized in that step (32) comprises:
Using the set of ROWKEYs of the data to be queried obtained in step (311) and step (312), obtain the corresponding data values from the data table through the Get interface method provided by HBase.
10. An electric power big data query system based on HBase secondary indexes, characterized by comprising:
a secondary index table creation module, for creating the secondary index table;
an update determination module, for determining whether the data table has been updated, updating the secondary index table if it has, and leaving the secondary index table unchanged if it has not;
a data query module, for querying data using the secondary index table.
CN201610980816.4A 2016-11-08 2016-11-08 Electric power big data querying method based on HBase secondary index Active CN106503243B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610980816.4A CN106503243B (en) 2016-11-08 2016-11-08 Electric power big data querying method based on HBase secondary index

Publications (2)

Publication Number Publication Date
CN106503243A true CN106503243A (en) 2017-03-15
CN106503243B CN106503243B (en) 2019-08-06

Family

ID=58323974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610980816.4A Active CN106503243B (en) 2016-11-08 2016-11-08 Electric power big data querying method based on HBase secondary index

Country Status (1)

Country Link
CN (1) CN106503243B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112013A (en) * 2014-07-17 2014-10-22 浪潮(北京)电子信息产业有限公司 HBase secondary indexing method and device
CN104217011A (en) * 2014-09-19 2014-12-17 浪潮(北京)电子信息产业有限公司 Method and device for inquiring HBase secondary index table

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108241724A (en) * 2017-05-11 2018-07-03 新华三大数据技术有限公司 A kind of metadata management method and device
CN107341198A (en) * 2017-06-16 2017-11-10 云南电网有限责任公司信息中心 A kind of electric power mass data storage and querying method based on subject example
CN107506464A (en) * 2017-08-30 2017-12-22 武汉烽火众智数字技术有限责任公司 A kind of method that HBase secondary indexs are realized based on ES
CN108398641A (en) * 2017-11-30 2018-08-14 深圳市科列技术股份有限公司 A kind of battery data processing method and battery data server
CN108319665B (en) * 2018-01-18 2022-04-19 努比亚技术有限公司 Hbase column value searching method, terminal and storage medium
CN108319665A (en) * 2018-01-18 2018-07-24 努比亚技术有限公司 Hbase train values lookup method, terminal and storage medium
CN109063186A (en) * 2018-08-27 2018-12-21 郑州云海信息技术有限公司 A kind of General query method and relevant apparatus
CN109299102A (en) * 2018-10-23 2019-02-01 中国电子科技集团公司第二十八研究所 A kind of HBase secondary index system and method based on Elastcisearch
CN109299102B (en) * 2018-10-23 2020-11-13 中国电子科技集团公司第二十八研究所 HBase secondary index system and method based on Elastcissearch
CN109800222B (en) * 2018-12-11 2021-06-01 中国科学院信息工程研究所 HBase secondary index self-adaptive optimization method and system
CN109800222A (en) * 2018-12-11 2019-05-24 中国科学院信息工程研究所 A kind of HBase secondary index adaptive optimization method and system
CN110502524A (en) * 2019-08-15 2019-11-26 济南浪潮数据技术有限公司 Phoenix index data asynchronous updating method and device
CN113742344A (en) * 2021-09-01 2021-12-03 南方电网深圳数字电网研究院有限公司 Method and device for indexing power system data
CN114372064A (en) * 2022-03-22 2022-04-19 飞狐信息技术(天津)有限公司 Data processing apparatus, method, computer readable medium and processor

Also Published As

Publication number Publication date
CN106503243B (en) 2019-08-06

Similar Documents

Publication Publication Date Title
CN106503243A (en) Electric power big data querying method and system based on HBase secondary indexs
CN103714134B (en) Network flow data index method and system
CN107506464A (en) A kind of method that HBase secondary indexs are realized based on ES
CN105338113B (en) A kind of multi-platform data interconnection system for Urban Data resource-sharing
CN105930446B (en) A kind of telecom client label generating method based on Hadoop distributed computing technology
CN104361113B (en) A kind of OLAP query optimization method under internal memory flash memory mixing memory module
CN113723810B (en) Power grid modeling method based on graph database
CN106372114A (en) Big data-based online analytical processing system and method
CN105989076A (en) Data statistical method and device
CN104216989A (en) Method for storing transmission line integrated data based on HBase
CN111258978A (en) Data storage method
CN103377236B (en) A kind of Connection inquiring method and system for distributed data base
CN103412883B (en) Semantic intelligent information distribution subscription method based on P2P technology
CN111737483A (en) Construction method of big data knowledge graph of smart power grid
CN109542846A (en) A kind of Internet of Things vulnerability information management system based on data virtualization
CN107491463A (en) The optimization method and system of data query
KR101255639B1 (en) Column-oriented database system and join process method using join index thereof
CN103034650A (en) System and method for processing data
CN109033173A (en) It is a kind of for generating the data processing method and device of multidimensional index data
CN107147531B (en) CDM cluster website management system
CN112052240A (en) HBase secondary memory index construction method based on coprocessor
CN116737753A (en) Service data processing method, device, computer equipment and storage medium
CN106446143A (en) Intelligent recommendation system and method based on graph structure matching
CN107066581B (en) The storage of distributed traffic monitor video data and quick retrieval system
CN102087655A (en) Web site system capable of embodying interpersonal relation net

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant