CN103488704A - Method and device for storing data - Google Patents

Info

Publication number
CN103488704A
Authority
CN
China
Prior art keywords
attribute field
rowkey
conditioned
hbase
conditioned attribute
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310403001.6A
Other languages
Chinese (zh)
Other versions
CN103488704B (en)
Inventor
张秀伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinle Visual Intelligent Electronic Technology Tianjin Co ltd
Leshi Zhixin Electronic Technology Tianjin Co Ltd
Original Assignee
Leshi Zhixin Electronic Technology Tianjin Co Ltd
Application filed by Leshi Zhixin Electronic Technology Tianjin Co Ltd
Priority to CN201310403001.6A
Publication of CN103488704A
Application granted
Publication of CN103488704B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G06F16/221 Column-oriented storage; Management thereof
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for storing data, belonging to the technical field of databases. The method includes: acquiring, by a storage device, a data record to be stored; determining the attribute fields in the data record that satisfy a preset condition; using those attribute fields as the prefix (Head) of the row key (Rowkey) in an HBase database; and storing the Rowkey in the HBase database. Because the Rowkey in the HBase database is set and stored according to the preset condition, data query speed is increased, retrieval efficiency is improved, and time consumption is reduced.

Description

A data storage method and device
Technical field
The present invention relates to the technical field of databases, and in particular to a data storage method and device.
Background
Hadoop is a distributed system architecture consisting mainly of the Hadoop Distributed File System (HDFS), MapReduce, and HBase. HBase is a distributed, column-oriented, open-source database. It stores data in tables; a table consists of rows and columns, and the columns are grouped into column families. Each row consists of a row key (Rowkey), a timestamp, and a number of columns. The Rowkey is similar to the primary key of a relational database and is used to look up records.
Hive is a data warehouse tool based on Hadoop. It maps structured data files to database tables and provides full Structured Query Language (SQL) query capability. Its advantages are a low learning cost and the ability to implement simple statistics quickly with SQL-like statements, which makes it well suited to statistical analysis in a data warehouse.
In the prior art, the Rowkey design in HBase is simple, and data records are retrieved by querying the data in the HBase table with Hive statements. For example, time filtering in Hive is done on an attribute field: select * from tv_report where ts = '2013-07-23'.
However, querying data in HBase with Hive statements requires scanning all the data in the HBase table, so retrieval efficiency is low and the query is time-consuming.
Summary of the invention
Embodiments of the present invention provide a data storage method and device that set and store the row key (Rowkey) in an HBase database according to a preset condition, thereby increasing data query speed, improving retrieval efficiency, and reducing time consumption.
To achieve the above object, embodiments of the present invention adopt the following technical solutions.
An embodiment of the present invention provides a data storage method, comprising:
acquiring a data record to be stored;
determining the attribute fields in the data record that satisfy a preset condition;
using the attribute fields that satisfy the preset condition as the prefix (Head) of the Rowkey of an HBase database; and
storing the Rowkey in the HBase database, so that when the attribute fields that satisfy the preset condition are queried in HBase, they are obtained by querying the prefix of the Rowkey.
Using the attribute fields that satisfy the preset condition as the prefix (Head) of the Rowkey of the HBase database comprises:
calculating a digest value of the attribute fields that satisfy the preset condition according to the message digest algorithm MD5, the digest value being a hexadecimal character string; and
using the digest value as the prefix (Head) of the Rowkey of the HBase database.
After determining the attribute fields in the data record that satisfy the preset condition, the method further comprises:
determining the attribute fields in the data record that do not satisfy the preset condition;
using the attribute fields that do not satisfy the preset condition as columns of the HBase database; and
storing the columns in the HBase database.
The Rowkey further comprises a suffix with a fixed length of 9 bytes, consisting of the character "=" followed by a long integer represented in 8 bytes.
Storing the Rowkey in the HBase database, so that when the attribute fields that satisfy the preset condition are queried in HBase they are obtained by querying the prefix of the Rowkey, specifically comprises:
storing the Rowkey in the HBase database; and
querying the attribute fields that satisfy the preset condition according to the regular-expression query mechanism of FuzzyRowFilter, the latest fuzzy filter in HBase.
An embodiment of the present invention provides a storage device, comprising:
an acquiring unit, configured to acquire a data record to be stored;
a processing unit, configured to determine the attribute fields in the data record that satisfy a preset condition, and to use those attribute fields as the prefix (Head) of the Rowkey of an HBase database; and
a storage unit, configured to store the Rowkey in the HBase database, so that when the attribute fields that satisfy the preset condition are queried in HBase, they are obtained by querying the prefix of the Rowkey.
The processing unit using the attribute fields that satisfy the preset condition as the prefix (Head) of the Rowkey of the HBase database comprises:
calculating a digest value of the attribute fields that satisfy the preset condition according to the message digest algorithm MD5, the digest value being a hexadecimal character string; and
using the digest value as the prefix (Head) of the Rowkey of the HBase database.
After determining the attribute fields in the data record that satisfy the preset condition, the processing unit is further configured to:
determine the attribute fields in the data record that do not satisfy the preset condition;
use the attribute fields that do not satisfy the preset condition as columns of the HBase database; and
store the columns in the HBase database.
The Rowkey further comprises a suffix with a fixed length of 9 bytes, consisting of the character "=" followed by a long integer represented in 8 bytes.
The storage unit storing the Rowkey in the HBase database, so that when the attribute fields that satisfy the preset condition are queried in HBase they are obtained by querying the prefix of the Rowkey, specifically comprises:
storing the Rowkey in the HBase database; and
querying the attribute fields that satisfy the preset condition according to the regular-expression query mechanism of FuzzyRowFilter, the latest fuzzy filter in HBase.
The present invention provides a data storage method and device. The storage device acquires a data record to be stored, determines the attribute fields in the record that satisfy a preset condition, uses those attribute fields as the prefix (Head) of the Rowkey of the HBase database, and finally stores the Rowkey in the HBase database, so that when those attribute fields are queried in HBase they can be obtained by querying the Rowkey prefix. With this scheme, the storage device sets and stores the Rowkey in the HBase database according to the preset condition, which increases data query speed, improves retrieval efficiency, and reduces time consumption.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a first schematic flowchart of the data storage method according to an embodiment of the present invention;
Fig. 2 is a second schematic flowchart of the data storage method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the storage device according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Hadoop is a distributed system architecture consisting mainly of the Hadoop Distributed File System (HDFS), MapReduce, and HBase. It is a software platform that makes it easier to develop and run applications that process large-scale data; users can develop distributed programs without understanding the underlying distributed details.
HBase is a distributed, column-oriented, open-source database. Unlike a typical relational database, it is suited to storing unstructured data and is column-based rather than row-based. In the prior art, Bigtable is a sparse, distributed, persistent, multidimensional sorted map indexed by row key (Rowkey), column key, and timestamp. HBase uses a data model very similar to Bigtable's. Users store data rows in a table; a table consists of rows and columns, the columns are grouped into column families, and each row consists of a Rowkey, a timestamp, and a number of columns. The Rowkey is similar to the primary key of a relational database and is used to look up records. A data row has a sortable key and an arbitrary number of columns, and the table is stored sparsely, so users can define different columns for different rows. In an HBase table, all records are sorted by Rowkey. There are three ways to access records in an HBase table: by a single Rowkey, by a range of Rowkeys, and by a full table scan.
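The three access modes can be illustrated with a small in-memory model. The following is a minimal sketch, not HBase client code: it assumes a table is simply a list of (Rowkey, value) pairs kept sorted by Rowkey, and the names get, range_scan, and full_scan as well as the sample keys are hypothetical.

```python
import bisect

# Toy stand-in for an HBase table: rows kept sorted by Rowkey (bytes).
rows = sorted([
    (b"alice|20130723", "page-a"),
    (b"alice|20130724", "page-b"),
    (b"bob|20130723", "page-c"),
    (b"carol|20130725", "page-d"),
])
keys = [k for k, _ in rows]


def get(rowkey):
    """Access by a single Rowkey: binary search, no scan."""
    i = bisect.bisect_left(keys, rowkey)
    return rows[i][1] if i < len(keys) and keys[i] == rowkey else None


def range_scan(start, stop):
    """Access by Rowkey range [start, stop): only rows in the range are touched."""
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_left(keys, stop)
    return rows[lo:hi]


def full_scan(predicate):
    """Full table scan: every row is examined."""
    return [(k, v) for k, v in rows if predicate(k, v)]
```

In this model, a filter on a value that is not part of the Rowkey can only be answered by full_scan, whereas a filter on the leading bytes of the Rowkey can be answered by range_scan, which is the property the method below relies on.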
Hive is a data warehouse tool based on Hadoop. It maps structured data files to database tables and provides full SQL query capability. Its advantages are a low learning cost and the ability to implement simple statistics quickly with SQL-like statements, which makes it well suited to statistical analysis in a data warehouse.
Embodiment 1
An embodiment of the present invention provides a data storage method. As shown in Fig. 1, the method comprises:
S101: The storage device acquires a data record to be stored.
Specifically, when storing data, the storage device first acquires the data record to be stored.
The data record includes at least an event attribute and a time attribute; the event includes at least one of open, close, and trigger.
S102: The storage device determines the attribute fields in the data record that satisfy a preset condition.
The preset condition includes at least one of time, event, and account; the event includes at least one of open, close, and trigger.
Specifically, the preset condition is set by the user when analyzing the data.
Optionally, the preset condition may be a single condition or a combination of conditions. For example, if the preset condition is time and/or event, the storage device analyzes the data record to be stored according to time and/or event and determines the attribute fields in the record that correspond to time and/or event.
Specifically, after reading the data record to be stored, the storage device queries the record according to the preset condition and determines the attribute fields that satisfy it.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, the account and start time in the log are determined to be the attribute fields that satisfy the preset condition.
S103: The storage device uses the attribute fields that satisfy the preset condition as the prefix (Head) of the row key (Rowkey) of the HBase database.
Specifically, after reading the data record and determining the attribute fields that satisfy the condition, the storage device adds those fields to the Rowkey of the HBase database.
The Rowkey of HBase is divided into two parts: the first part is a fixed-length prefix (Head), and the second part is the Tail.
Adding the attribute fields that satisfy the condition to the Rowkey of the HBase database facilitates data queries and improves efficiency.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, the account and start time in the log are determined to be the attribute fields that satisfy the preset condition, and they are added to the Head of the Rowkey.
S104: The storage device stores the Rowkey in the HBase database, so that when the attribute fields that satisfy the preset condition are queried in HBase, they are obtained by querying the prefix of the Rowkey.
The preset condition includes at least one of time, event, and account; the event includes at least one of open, close, and trigger.
Specifically, the preset condition is set by the user when analyzing the data.
Specifically, after determining the Rowkey, the storage device saves the Rowkey with the added content in the HBase database, so that when the attribute fields that satisfy the preset condition are queried in HBase, they are obtained by querying the prefix of the Rowkey.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, the account and start time in the log are determined to be the attribute fields that satisfy the preset condition and are added to the Head of the Rowkey. The Rowkey containing the account and start time is saved in the HBase database, so that when data is analyzed in HBase by account and start time, the attribute fields that satisfy the preset condition can be obtained by querying the prefix of the Rowkey.
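Steps S103 and S104 can be sketched with a sorted in-memory table. This is an illustration under assumptions rather than the method's actual encoding: the 8-byte padded account, the "_" padding character, the YYYYmmdd start time, and the helper names make_rowkey, prefix_stop, put, and prefix_scan are all hypothetical.

```python
import bisect


def make_rowkey(account, start_time):
    """Head = account padded to 8 bytes, followed by the start time."""
    return account.ljust(8, "_").encode() + start_time.encode()


def prefix_stop(prefix):
    """Smallest key greater than every key that starts with prefix.

    Assumes the last byte of prefix is not 0xFF.
    """
    return prefix[:-1] + bytes([prefix[-1] + 1])


table = {}  # Rowkey -> value of the remaining field(s)


def put(record):
    table[make_rowkey(record["account"], record["start_time"])] = record["page"]


def prefix_scan(prefix):
    """Return only the rows whose Rowkey starts with prefix."""
    keys = sorted(table)
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, prefix_stop(prefix))
    return [(k, table[k]) for k in keys[lo:hi]]


put({"account": "user001", "start_time": "20130723", "page": "/home"})
put({"account": "user001", "start_time": "20130724", "page": "/news"})
put({"account": "user002", "start_time": "20130723", "page": "/home"})

hits = prefix_scan(b"user001_")  # all accesses by user001
```

Because the keys are sorted, a query by account (or by account and start time) narrows to a contiguous key range instead of examining every row.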
The present invention provides a data storage method. The storage device acquires a data record to be stored, determines the attribute fields in the record that satisfy a preset condition, uses those attribute fields as the prefix (Head) of the Rowkey of the HBase database, and finally stores the Rowkey in the HBase database, so that when those attribute fields are queried in HBase they can be obtained by querying the Rowkey prefix. With this scheme, the storage device sets and stores the Rowkey in the HBase database according to the preset condition, which increases data query speed, improves retrieval efficiency, and reduces time consumption.
Embodiment 2
An embodiment of the present invention provides a data storage method. As shown in Fig. 2, the method comprises:
S201: The storage device acquires a data record to be stored.
Specifically, when storing data, the storage device first acquires the data record to be stored.
The data record includes at least an event attribute and a time attribute; the event includes at least one of open, close, and trigger.
S202: The storage device determines the attribute fields in the data record that satisfy a preset condition.
The preset condition includes at least one of time, event, and account; the event includes at least one of open, close, and trigger.
Specifically, the preset condition is set by the user when analyzing the data.
Optionally, the preset condition may be a single condition or a combination of conditions. For example, if the preset condition is time and/or event, the storage device analyzes the data record to be stored according to time and/or event and determines the attribute fields in the record that correspond to time and/or event.
Specifically, after reading the data record to be stored, the storage device queries the record according to the preset condition and determines the attribute fields that satisfy it.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, the account and start time are determined to be the attribute fields that satisfy the preset condition.
S203: The storage device determines the attribute fields in the data record that do not satisfy the preset condition.
The preset condition includes at least one of time, event, and account; the event includes at least one of open, close, and trigger.
Specifically, the preset condition is set by the user when analyzing the data.
Optionally, the preset condition may be a single condition or a combination of conditions. For example, if the preset condition is time and/or event, the storage device analyzes the data record to be stored according to time and/or event and determines the attribute fields in the record that correspond to time and/or event.
Specifically, after reading the data record to be stored, the storage device queries the record according to the preset condition and determines the attribute fields that satisfy it; correspondingly, it also needs to determine the attribute fields that do not satisfy it.
Further, after analyzing the data record, the storage device stores it in HBase. HBase stores data in tables made up of rows and columns, and the storage device stores data according to this table layout; therefore, after determining the attribute fields that satisfy the preset condition, it also needs to determine those that do not.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, the account and start time are determined to be the attribute fields that satisfy the preset condition, and all attribute fields of the log other than account and start time are determined to be the attribute fields that do not satisfy it.
S204: The storage device uses the attribute fields that satisfy the preset condition as the prefix (Head) of the row key (Rowkey) of the HBase database.
HBase is a distributed, column-oriented, open-source database. It stores data in tables; a table consists of rows and columns, and the columns are grouped into column families. Each row consists of a Rowkey, a timestamp, and a number of columns. The Rowkey is similar to the primary key of a relational database and is used to look up records.
Specifically, after reading the data record and determining the attribute fields that satisfy the condition, the storage device adds those fields to the Rowkey of the HBase database.
The Rowkey of HBase is divided into two parts: the first part is a fixed-length prefix (Head), and the second part is the suffix (Tail).
Optionally, the length of the Head of the Rowkey is set according to the user's needs, and its content includes the time, the event, and the digest value calculated with MD5.
The storage device calculates the digest value of the attribute fields that satisfy the preset condition according to the message digest algorithm MD5; the digest value is a hexadecimal character string.
Optionally, a Head with a length of 26 bytes may contain the following information:
[MD5 hash of mac] 16 bytes
[0x00] 1 byte, reserved
[Event type] 1 byte, 0x00-0xFF, supporting at most 256 event types
[Event time YYYYmmdd] 8 bytes, generated with String.getBytes()
Here, MD5 hash of mac is the MD5 hash of the physical (MAC) address; Event type is the type of the event, the event being at least one of open, close, and trigger; Event time is the time at which the event occurred, and can be generated with the String.getBytes() function.
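The 26-byte layout above can be assembled as in the following sketch. Several points are assumptions: the text describes the digest as a hexadecimal string, which would occupy 32 characters, so the 16-byte slot is filled here with the raw MD5 digest; the event code 0x01 and the function name build_head are hypothetical, since the source does not assign event values.

```python
import hashlib


def build_head(mac, event_type, event_date):
    """Build the 26-byte Head: MD5(mac) | 0x00 | event type | YYYYmmdd."""
    if not 0 <= event_type <= 0xFF:
        raise ValueError("event type must fit in one byte")
    if len(event_date) != 8 or not event_date.isdigit():
        raise ValueError("event date must be YYYYmmdd")
    digest = hashlib.md5(mac.encode("ascii")).digest()  # 16 raw bytes (assumed form)
    head = digest + b"\x00" + bytes([event_type]) + event_date.encode("ascii")
    assert len(head) == 26
    return head


EVENT_OPEN = 0x01  # hypothetical code; the source does not define the values

head = build_head("00:11:22:33:44:55", EVENT_OPEN, "20130723")
```

Because every field has a fixed width, the device hash, the event type, and the date always sit at the same byte offsets, which is what allows position-based filtering on the Head later.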
Specifically, the Rowkey further comprises a suffix with a fixed length of 9 bytes, consisting of the character "=" followed by a long integer represented in 8 bytes.
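A minimal sketch of the 9-byte suffix follows. The source does not say what the long integer represents or which byte order is used; the value below is only a placeholder, and big-endian signed encoding is an assumption. The names build_tail and parse_tail are hypothetical.

```python
import struct


def build_tail(value):
    """9-byte Tail: the ASCII byte "=" followed by an 8-byte signed integer (big-endian assumed)."""
    return b"=" + struct.pack(">q", value)


def parse_tail(tail):
    """Recover the long integer from a 9-byte Tail."""
    if len(tail) != 9 or tail[:1] != b"=":
        raise ValueError("malformed Tail")
    return struct.unpack(">q", tail[1:])[0]


tail = build_tail(1374537600000)  # placeholder value; its meaning is not given in the source
```

Appending such a Tail to a 26-byte Head would give a Rowkey with a fixed total length of 35 bytes.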
Specifically, the storage device adds the attribute fields that satisfy the condition to the Rowkey of the HBase database, so that these attribute fields are recorded in the rows of HBase.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, the account and start time in the log are determined to be the attribute fields that satisfy the preset condition, and they are added to the Head of the Rowkey. In HBase, the account and start time of the data record are thus recorded in the rows of HBase.
S205: The storage device uses the attribute fields that do not satisfy the preset condition as columns of the HBase database.
HBase is a distributed, column-oriented, open-source database. It stores data in tables; a table consists of rows and columns, and the columns are grouped into column families. Each row consists of a Rowkey, a timestamp, and a number of columns. The Rowkey is similar to the primary key of a relational database and is used to look up records.
Specifically, after reading the data record and determining the attribute fields that do not satisfy the preset condition, the storage device adds those fields to the columns of the HBase database; the columns holding these attribute fields form a column family.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, all attribute fields other than account and start time are determined to be the attribute fields that do not satisfy the preset condition and are used as columns of the HBase database; the columns holding these other attribute fields form a column family.
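The split between Rowkey fields and column fields in S202 to S205 can be sketched as follows. The record values, the column family name "info", and the function name split_record are hypothetical; the "family:qualifier" notation is used only to label the cells.

```python
def split_record(record, condition, family="info"):
    """Split a record into Rowkey fields and column cells.

    Returns (key_fields, cells), where cells maps "family:qualifier" to a value.
    The family name "info" is a placeholder.
    """
    key_fields = {k: v for k, v in record.items() if k in condition}
    cells = {f"{family}:{k}": v for k, v in record.items() if k not in condition}
    return key_fields, cells


# Hypothetical user access log record with the six fields from the example.
record = {
    "account": "user001",
    "gender": "F",
    "company": "ExampleCo",
    "start_time": "20130723",
    "end_time": "20130724",
    "page": "/home",
}

key_fields, cells = split_record(record, ("account", "start_time"))
```

Here account and start_time would go into the Head of the Rowkey, while the remaining four fields become columns of a single column family.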
S206: The storage device stores the Rowkey in the HBase database, so that when the attribute fields that satisfy the preset condition are queried in HBase, they are obtained by querying the prefix of the Rowkey.
Specifically, after determining the Rowkey, the storage device saves the Rowkey with the added content in the HBase database. When the attribute fields that satisfy the preset condition are queried in HBase, they are queried according to the regular-expression query mechanism of FuzzyRowFilter, the latest fuzzy filter in HBase, and the required data is obtained.
The preset condition includes at least one of time, event, and account; the event includes at least one of open, close, and trigger.
Specifically, the preset condition is set by the user when analyzing the data.
Optionally, the preset condition may be a single condition or a combination of conditions. For example, if the preset condition is time and/or event, the storage device analyzes the data record to be stored according to time and/or event and determines the attribute fields in the record that correspond to time and/or event.
For example, suppose the data record is a user access log that contains user information and the details of the user's access, namely: account, gender, company, start time, end time, and accessed page. If the user access log is analyzed by start time and account, the account and start time in the log are determined to be the attribute fields that satisfy the preset condition and are added to the Head of the Rowkey, and the Rowkey containing the account and start time is saved in the HBase database. When data is analyzed in HBase by account and start time, the Rowkeys that match the account and start time are queried according to the regular-expression query mechanism of FuzzyRowFilter, the latest fuzzy filter in HBase, and the required data is obtained.
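The fuzzy matching idea can be simulated in a few lines. This sketch does not call HBase; it mirrors the commonly documented FuzzyRowFilter convention of a row-key pattern plus a byte mask in which 0 marks a fixed position and 1 marks a wildcard, rather than a textual regular expression. The 4-byte device id, the sample keys, and the function name fuzzy_match are hypothetical.

```python
def fuzzy_match(rowkey, pattern, mask):
    """Return True if rowkey equals pattern at every fixed position.

    mask[i] == 0: byte i is fixed and must equal pattern[i].
    mask[i] == 1: byte i is a wildcard.
    Bytes of rowkey beyond len(pattern) are not checked.
    """
    if len(pattern) != len(mask):
        raise ValueError("pattern and mask must have equal length")
    if len(rowkey) < len(pattern):
        return False
    return all(m == 1 or r == p for r, p, m in zip(rowkey, pattern, mask))


# Toy keys: a 4-byte device id followed by an 8-byte date.
keys = [b"dev1" + b"20130723", b"dev2" + b"20130723", b"dev1" + b"20130724"]

# Any device, date fixed to 20130723.
pattern = b"????" + b"20130723"
mask = bytes([1] * 4 + [0] * 8)

matches = [k for k in keys if fuzzy_match(k, pattern, mask)]
```

This kind of matching only works when each field occupies a fixed byte range in the Rowkey, which is consistent with the fixed-length Head described in S204.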
The present invention provides a data storage method. The storage device acquires a data record to be stored, determines the attribute fields in the record that satisfy a preset condition, uses those attribute fields as the prefix (Head) of the Rowkey of the HBase database, and finally stores the Rowkey in the HBase database, so that when those attribute fields are queried in HBase they can be obtained by querying the Rowkey prefix. With this scheme, the storage device sets and stores the Rowkey in the HBase database according to the preset condition, which increases data query speed, improves retrieval efficiency, and reduces time consumption.
Embodiment tri-
The invention provides a kind of memory storage, as shown in Figure 3, comprising:
Acquiring unit 10, for obtaining data recording to be stored;
Processing unit 11, for determining that described data recording meets pre-conditioned attribute field; Meet the prefix Head of pre-conditioned attribute field as the line unit Rowkey of HBase database using described;
Storage unit 12, for described Rowkey being stored to described HBase database, so that inquiry is described while meeting pre-conditioned attribute field in described HBase, obtains and meet pre-conditioned attribute field by the prefix of inquiring about described Rowkey.
Further, the unit of processing 11 meet the prefix Head of pre-conditioned attribute field as the line unit Rowkey of described HBase database using described, comprising:
Calculate the described digest value that meets pre-conditioned attribute field according to message digest algorithm MD5, described digest value is hexadecimal character string;
The prefix Head of line unit Rowkey using described digest value as described HBase database.
Further, after described processing unit 11 is determined in described data recording and is met pre-conditioned attribute field, also for:
Determine in described data recording and do not meet described pre-conditioned attribute field;
Do not meet the row of described pre-conditioned attribute field as described HBase database using described;
Described row are stored in the HBase database.
Further, described Rowkey also comprises suffix, and described suffix length is fixed as 9 bytes, the long integer of "=" and 8 byte representations, consists of.
Further, described storage unit 12 is stored to described Rowkey in the HBase database, so that inquiry is described while meeting pre-conditioned attribute field in described HBase, obtain and meet pre-conditioned attribute field by the prefix of inquiring about described Rowkey, specifically comprise:
Described Rowkey is stored in the HBase database;
According to the inquiry mechanism of blur filter FuzzyRowFilter regular expression up-to-date in described Hbase, inquiry meets pre-conditioned attribute field.
The invention provides a storage device that mainly comprises an acquiring unit, a processing unit and a storage unit. The storage device obtains a data record to be stored, determines the attribute field in the data record that meets a preset condition, uses that attribute field as the prefix (Head) of the row key (Rowkey) of the HBase database, and finally stores the Rowkey in the HBase database, so that when the attribute field that meets the preset condition is queried in HBase, it is obtained by querying the prefix of the Rowkey. With this scheme, the storage device sets the row key (Rowkey) stored in the HBase database according to the preset condition, which increases data query speed, improves retrieval efficiency and reduces the time consumed.
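Why a prefix lookup can be fast is easiest to see on a sorted list of keys, since HBase keeps rows sorted by row key. The sketch below is a simplified in-memory model, not HBase code: it uses a binary search to jump to the first key with the given prefix and stops at the first key without it. The function name `prefix_scan` and the short sample keys are hypothetical.

```python
import bisect
from typing import List


def prefix_scan(sorted_keys: List[bytes], prefix: bytes) -> List[bytes]:
    """Return all keys that start with prefix from a lexicographically sorted list."""
    start = bisect.bisect_left(sorted_keys, prefix)  # first key >= prefix
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys sharing a prefix are contiguous in sorted order
        result.append(key)
    return result


# Short stand-in keys; real keys would carry a digest prefix and a 9-byte suffix.
keys = sorted([b"aa=1", b"ab=1", b"ab=2", b"b=1"])
print(prefix_scan(keys, b"ab"))
```

Only the keys that share the prefix are visited, so the cost depends on the number of matching rows rather than on the size of the whole table.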
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the division into the functional modules above is used as an example. In practical applications, the functions above can be assigned to different functional modules as required; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. For the specific working processes of the system, device and units described above, refer to the corresponding processes in the foregoing method embodiments; they are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical or in other forms.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A data storage method, characterized in that it is applied in the data storage process of an HBase database, the method comprising:
Obtaining a data record to be stored;
Determining the attribute field in the data record that meets a preset condition;
Using the attribute field that meets the preset condition as the prefix (Head) of the row key (Rowkey) of the HBase database;
Storing the Rowkey in the HBase database, so that when the attribute field that meets the preset condition is queried in HBase, it is obtained by querying the prefix of the Rowkey.
2. The data storage method according to claim 1, characterized in that using the attribute field that meets the preset condition as the prefix (Head) of the row key (Rowkey) of the HBase database comprises:
Calculating the digest value of the attribute field that meets the preset condition according to the message digest algorithm MD5, the digest value being a hexadecimal character string;
Using the digest value as the prefix (Head) of the row key (Rowkey) of the HBase database.
3. The data storage method according to claim 1, characterized in that, after determining the attribute field in the data record that meets the preset condition, the method further comprises:
Determining the attribute fields in the data record that do not meet the preset condition;
Using the attribute fields that do not meet the preset condition as columns of the HBase database;
Storing the columns in the HBase database.
4. The data storage method according to claim 1, characterized in that the Rowkey also comprises a suffix (Tail); the suffix has a fixed length of 9 bytes and consists of "=" and a long integer represented in 8 bytes.
5. The data storage method according to claim 1, characterized in that storing the Rowkey in the HBase database, so that when the attribute field that meets the preset condition is queried in HBase, it is obtained by querying the prefix of the Rowkey, specifically comprises:
Storing the Rowkey in the HBase database;
Querying the attribute field that meets the preset condition according to the regular-expression query mechanism of FuzzyRowFilter, the latest fuzzy filter in HBase.
6. A storage device, characterized in that it comprises:
An acquiring unit, configured to obtain a data record to be stored;
A processing unit, configured to determine the attribute field in the data record that meets a preset condition, and to use the attribute field that meets the preset condition as the prefix (Head) of the row key (Rowkey) of the HBase database;
A storage unit, configured to store the Rowkey in the HBase database, so that when the attribute field that meets the preset condition is queried in HBase, it is obtained by querying the prefix of the Rowkey.
7. The storage device according to claim 6, characterized in that the processing unit using the attribute field that meets the preset condition as the prefix (Head) of the row key (Rowkey) of the HBase database comprises:
Calculating the digest value of the attribute field that meets the preset condition according to the message digest algorithm MD5, the digest value being a hexadecimal character string;
Using the digest value as the prefix (Head) of the row key (Rowkey) of the HBase database.
8. The storage device according to claim 6, characterized in that, after determining the attribute field in the data record that meets the preset condition, the processing unit is further configured to:
Determine the attribute fields in the data record that do not meet the preset condition;
Use the attribute fields that do not meet the preset condition as columns of the HBase database;
Store the columns in the HBase database.
9. The storage device according to claim 6, characterized in that the Rowkey also comprises a suffix; the suffix has a fixed length of 9 bytes and consists of "=" and a long integer represented in 8 bytes.
10. The storage device according to claim 6, characterized in that the storage unit storing the Rowkey in the HBase database, so that when the attribute field that meets the preset condition is queried in HBase, it is obtained by querying the prefix of the Rowkey, specifically comprises:
Storing the Rowkey in the HBase database;
Querying the attribute field that meets the preset condition according to the regular-expression query mechanism of FuzzyRowFilter, the latest fuzzy filter in HBase.
CN201310403001.6A 2013-09-06 2013-09-06 A kind of data storage method and device Active CN103488704B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310403001.6A CN103488704B (en) 2013-09-06 2013-09-06 A kind of data storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310403001.6A CN103488704B (en) 2013-09-06 2013-09-06 A kind of data storage method and device

Publications (2)

Publication Number Publication Date
CN103488704A true CN103488704A (en) 2014-01-01
CN103488704B CN103488704B (en) 2016-10-05

Family

ID=49828930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310403001.6A Active CN103488704B (en) 2013-09-06 2013-09-06 A kind of data storage method and device

Country Status (1)

Country Link
CN (1) CN103488704B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7852842B2 (en) * 2004-03-31 2010-12-14 Lg Electronics Inc. Data processing method for network layer
CN101916262A (en) * 2010-07-29 2010-12-15 北京用友政务软件有限公司 Acceleration method of financial element matching
CN101950297A (en) * 2010-09-10 2011-01-19 北京大学 Method and device for storing and inquiring mass semantic data
CN103116610A (en) * 2013-01-23 2013-05-22 浙江大学 Vector space big data storage method based on HBase

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331460A (en) * 2014-10-31 2015-02-04 北京思特奇信息技术股份有限公司 Hbase-based data read-write operation method and system
CN104376053A (en) * 2014-11-04 2015-02-25 南京信息工程大学 Storage and retrieval method based on massive meteorological data
CN104376053B (en) * 2014-11-04 2017-12-22 南京信息工程大学 A kind of storage and retrieval method based on magnanimity meteorological data
CN104391910A (en) * 2014-11-17 2015-03-04 西安交通大学 HBase-based tax statistic report storage and calculation method
CN106326317A (en) * 2015-07-09 2017-01-11 中国移动通信集团山西有限公司 Data processing method and device
CN106570036B (en) * 2015-10-13 2019-11-12 北京国双科技有限公司 Data adding method and device based on HBase database
CN106570036A (en) * 2015-10-13 2017-04-19 北京国双科技有限公司 Data adding method and device based on HBase database
CN107515867B (en) * 2016-06-15 2021-06-29 阿里巴巴集团控股有限公司 Data storage and query method and device of NoSQL database and generation method and device of rowKey full combination
CN107515867A (en) * 2016-06-15 2017-12-26 阿里巴巴集团控股有限公司 The generation method and device that data storage, querying method and the device and a kind of rowKey of a kind of NoSQL databases combine entirely
CN106156338A (en) * 2016-07-12 2016-11-23 复旦大学无锡研究院 The date storage method of a kind of INFORMATION DISCOVERY server and INFORMATION DISCOVERY method
CN106326381A (en) * 2016-08-16 2017-01-11 梁猛 HBase data retrieval method based on MapDB construction
CN106326381B (en) * 2016-08-16 2019-06-25 梁猛 HBase data retrieval method based on MapDB building
CN106528674A (en) * 2016-10-31 2017-03-22 厦门服云信息科技有限公司 Method and device for high-performance query based on Hbase row keys
CN106528674B (en) * 2016-10-31 2019-10-01 厦门服云信息科技有限公司 The High Performance Data Query method and apparatus being good for based on Hbase row
CN106777258A (en) * 2016-12-28 2017-05-31 银江股份有限公司 The coding and compression method of Hbase line units in a kind of medical big data storage
CN106777258B (en) * 2016-12-28 2020-01-03 银江股份有限公司 Coding and compressing method for Hbase row key in medical big data storage
CN106940627A (en) * 2017-03-24 2017-07-11 联想(北京)有限公司 A kind of data processing method and server cluster
CN106940627B (en) * 2017-03-24 2020-08-25 联想(北京)有限公司 Data processing method and server cluster
CN107291881A (en) * 2017-06-19 2017-10-24 北京计算机技术及应用研究所 Massive logs storage and querying method based on HBase
CN109918425A (en) * 2017-12-14 2019-06-21 北京京东尚科信息技术有限公司 A kind of method and system realized data and import non-relational database
CN109271413A (en) * 2018-10-11 2019-01-25 江苏易润信息技术有限公司 A kind of method, apparatus and computer storage medium of data query
CN109597857A (en) * 2018-12-06 2019-04-09 中电工业互联网有限公司 A kind of Internet of Things big data calculation method based on Spark
CN112699149A (en) * 2020-12-31 2021-04-23 青岛海尔科技有限公司 Target data acquisition method and device, storage medium and electronic device
CN112699149B (en) * 2020-12-31 2023-09-19 青岛海尔科技有限公司 Target data acquisition method and device, storage medium and electronic device
CN112817969A (en) * 2021-01-14 2021-05-18 内蒙古蒙商消费金融股份有限公司 Data query method, system, electronic device and storage medium
CN112817969B (en) * 2021-01-14 2023-04-14 内蒙古蒙商消费金融股份有限公司 Data query method, system, electronic device and storage medium
US20230120186A1 (en) * 2021-10-20 2023-04-20 Paypal, Inc. Database Management Using Sort Keys
WO2023065134A1 (en) * 2021-10-20 2023-04-27 Paypal, Inc. Database management using sort keys

Also Published As

Publication number Publication date
CN103488704B (en) 2016-10-05

Similar Documents

Publication Publication Date Title
CN103488704A (en) Method and device for storing data
CN107402988B (en) Distributed NewSQL database system and semi-structured data query method
CN104252536B (en) A kind of internet log data query method and device based on hbase
CN103366015B (en) A kind of OLAP data based on Hadoop stores and querying method
CN102184222B (en) Quick searching method in large data volume storage
CN103544261B (en) A kind of magnanimity structuring daily record data global index's management method and device
CN104090889A (en) Method and system for data processing
CN102122285A (en) Data cache system and data inquiry method
CN103810219B (en) Line storage database-based data processing method and device
CN101727502A (en) Data query method, data query device and data query system
CN104424258A (en) Multidimensional data query method and system, query server and column storage server
CN103744913A (en) Database retrieval method based on search engine technology
CN102890721B (en) Based on database building method and the system of row memory technology
CN106326387B (en) A kind of Distributed Storage structure and date storage method and data query method
CN104573065A (en) Report display engine based on metadata
CN105159616A (en) Disk space management method and device
CN109597829B (en) Middleware method for realizing searchable encryption relational database cache
CN105373541A (en) Processing method and system for data operation request of database
CN104035956A (en) Time-series data storage method based on distributive column storage
CN104462161A (en) Structural data query method based on distributed database
CN103034650B (en) A kind of data handling system and method
CN102999600A (en) Method and system for automatically generating embedded database
CN106021357A (en) Distribution-based big data paging query method and system
Cao et al. Leveraging column family to improve multidimensional query performance in HBase
KR101255639B1 (en) Column-oriented database system and join process method using join index thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
PP01 Preservation of patent right
PP01 Preservation of patent right

Effective date of registration: 20170721

Granted publication date: 20161005

PD01 Discharge of preservation of patent

Date of cancellation: 20200721

Granted publication date: 20161005

PD01 Discharge of preservation of patent
CP03 Change of name, title or address

Address after: 300453 Tianjin Binhai New Area, Tianjin Eco-city, No. 126 Animation and Animation Center Road, Area B1, Second Floor 201-427

Patentee after: Xinle Visual Intelligent Electronic Technology (Tianjin) Co.,Ltd.

Address before: 300467 Tianjin Binhai New Area, Tianjin ecological city animation Middle Road, building, No. two, B1 District, 201-427

Patentee before: LE SHI ZHI XIN ELECTRONIC TECHNOLOGY (TIANJIN) Ltd.

Address after: Room 301-1, Room 301-3, Area B2, Animation Building, No. 126 Animation Road, Zhongxin Eco-city, Tianjin Binhai New Area, Tianjin

Patentee after: LE SHI ZHI XIN ELECTRONIC TECHNOLOGY (TIANJIN) Ltd.

Address before: 300453 Tianjin Binhai New Area, Tianjin Eco-city, No. 126 Animation and Animation Center Road, Area B1, Second Floor 201-427

Patentee before: Xinle Visual Intelligent Electronic Technology (Tianjin) Co.,Ltd.

CP03 Change of name, title or address
PP01 Preservation of patent right

Effective date of registration: 20210201

Granted publication date: 20161005

PP01 Preservation of patent right