CN105512216A - Data storage and reading method, device and system

Data storage and reading method, device and system

Info

Publication number
CN105512216A
Authority
CN
China
Prior art keywords
data
key
database
value
level key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510853977.2A
Other languages
Chinese (zh)
Inventor
严峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Haier Intelligent Home Appliance Technology Co Ltd
Original Assignee
Qingdao Haier Intelligent Home Appliance Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Haier Intelligent Home Appliance Technology Co Ltd
Priority to CN201510853977.2A
Publication of CN105512216A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data storage and reading method, device and system. The data reading method comprises: obtaining the value corresponding to a first-level key in a database to obtain second-level keys; dividing the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys; and concurrently reading, through the computing units, the values corresponding to the second-level keys. The technical scheme expands the data access scope of Hadoop MR: a database arranged in two levels is brought within the scope of big data processing, Hadoop MR can read data from the database directly in bulk, and the operating efficiency of Hadoop MR is improved.

Description

Data storage and reading method, device and system
Technical field
The present invention relates to the field of database technology, and in particular to a data storage and reading method, device and system.
Background
Hadoop is currently the most widely used and most mature open-source big data storage and computing platform in the industry. It contains many components, among which MapReduce (MR) is the programming model on the Hadoop platform, suited to distributed computation over large volumes of data. Redis is an open-source in-memory key/value database that supports storage of multiple data formats.
Hadoop MR provides read implementations for many data sources, such as files and the HBase database, but it does not support reading data directly from Redis in bulk well. Specifically, Hadoop MR is a powerful tool for distributed big data computation, and its input is generally HDFS files. Redis is an in-memory key/value database in which a single value is generally read by a single key. A way for Hadoop MR to read data directly from Redis in bulk is therefore currently lacking.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a data storage and reading method, device and system that overcome the above problems or at least partially solve them.
The invention provides a data storage method, comprising:
dividing the keys in a database into two levels according to granularity, wherein the value corresponding to a coarser-grained first-level key is the finer-grained second-level keys;
saving the data to be stored as the value of a second-level key.
The invention also provides a data reading method, used by the Hadoop programming model MR to read data from a database, comprising:
obtaining the value corresponding to a first-level key in the database to obtain the second-level keys;
dividing the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;
concurrently reading, by the computing units, the values corresponding to their respective second-level keys.
The invention also provides a data storage device, comprising:
a level-division module, configured to divide the keys in a database into two levels according to granularity, wherein the value corresponding to a coarser-grained first-level key is the finer-grained second-level keys;
a storage module, configured to save the data to be stored as the value of a second-level key.
The invention also provides a data reading device, arranged in the Hadoop programming model MR, comprising:
an acquisition module, configured to obtain the value corresponding to a first-level key in the database to obtain the second-level keys;
a splitting module, configured to divide the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;
a reading module, configured to concurrently read, through the computing units, the values corresponding to their respective second-level keys.
The invention also provides a data storage and reading system, comprising the above data storage device and the above data reading device.
The beneficial effects of the present invention are as follows:
By arranging the database in two levels, the data access scope of Hadoop MR is extended and a database arranged in this way is brought within the scope of big data processing, so that Hadoop MR can read data from the database directly in bulk, which improves the operating efficiency of Hadoop MR.
The above description is only an overview of the technical solution of the present invention. In order that the technical means of the invention may be understood more clearly and implemented in accordance with the content of the specification, and in order that the above and other objects, features and advantages of the invention may be more readily apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
Fig. 1 is a flowchart of the data storage method of an embodiment of the invention;
Fig. 2 is a schematic diagram of the Redis database of an embodiment of the invention;
Fig. 3 is a flowchart of the data reading method of an embodiment of the invention;
Fig. 4 is a schematic diagram of Hadoop MR reading data from a Redis database in an embodiment of the invention;
Fig. 5 is a structural diagram of the data storage device of an embodiment of the invention;
Fig. 6 is a structural diagram of the data reading device of an embodiment of the invention;
Fig. 7 is a structural diagram of the data storage and reading system of an embodiment of the invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
To solve the prior-art problem that Hadoop MR cannot read data directly from a Redis database in bulk, the invention provides a data storage and reading method, device and system, described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.
Method embodiment one
According to an embodiment of the invention, a data storage method is provided. Fig. 1 is a flowchart of the data storage method of the embodiment of the invention. As shown in Fig. 1, the data storage method according to the embodiment comprises the following processing:
Step 101: the keys in the database are divided into two levels according to granularity, wherein the value corresponding to a coarser-grained first-level key is the finer-grained second-level keys. Preferably, the granularity is the size of a time range. That is, a key stored at the first level covers a time range of one hour (the data of a given hour), and its value is the list of keys stored at the second level; a key stored at the second level covers a time range of one minute (the data within a given minute), and its value is the reported data.
Step 102: the data to be stored is saved as the value of a second-level key.
Preferably, in the embodiment of the invention, the database is a Redis database. That is, the technical solution of the embodiment uses two levels of associated key/value pairs in Redis to accommodate batch read operations. Fig. 2 is a schematic diagram of the Redis database of the embodiment. As shown in Fig. 2, the value under a key stored at the first level is the list of keys stored at the second level, such as key1, key2, ..., keyn. The value of a key stored at the second level is the reported data; for example, the value of key1 is the data group dataset1, and the value of key2 is the data group dataset2.
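The two-level layout of steps 101 and 102 can be illustrated with a minimal Python sketch. It is not the patent's implementation: plain dictionaries stand in for Redis so that the example runs on its own, and the class name `TwoLevelStore`, its method names, and the `hour:`/`minute:` key strings are all hypothetical.

```python
from collections import defaultdict


class TwoLevelStore:
    """In-memory stand-in for a Redis instance holding two-level keys (assumption)."""

    def __init__(self):
        # First-level key (e.g. one hour) -> ordered list of second-level keys.
        self.index = defaultdict(list)
        # Second-level key (e.g. one minute) -> reported data.
        self.data = {}

    def save(self, first_key, second_key, value):
        """Store value under second_key and list second_key under first_key."""
        if second_key not in self.data:
            self.index[first_key].append(second_key)
        self.data[second_key] = value

    def second_level_keys(self, first_key):
        """Return the second-level keys listed under first_key."""
        return list(self.index.get(first_key, []))


store = TwoLevelStore()
# Hypothetical key names: one hour-level key and two minute-level keys.
store.save("hour:2015111016", "minute:201511101501", "dataset1")
store.save("hour:2015111016", "minute:201511101502", "dataset2")
print(store.second_level_keys("hour:2015111016"))
```

Against a real Redis server, the first-level list could plausibly be maintained with the RPUSH and LRANGE commands and the second-level data with SET and GET; the source does not name the commands it uses, so that mapping is an assumption.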
The technical solution of the embodiment stores the data in Redis at two levels, so that Hadoop MR can read the data stored in Redis directly in batches and thus perform distributed computation on it. How Hadoop MR reads the stored data in Redis directly in batches is described in detail in method embodiment two.
Method embodiment two
According to an embodiment of the invention, a data reading method is provided, used by the Hadoop programming model MR to read data from a database. Fig. 3 is a flowchart of the data reading method of the embodiment of the invention. As shown in Fig. 3, the data reading method according to the embodiment comprises the following processing:
Step 301: the value corresponding to a first-level key in the database is obtained, yielding the second-level keys.
Step 302: the second-level keys are divided into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys.
In step 302, preferably, obtaining the parameter N specifically comprises: determining N according to the number of computing units that perform the concurrent computation.
Step 303: the computing units concurrently read the values corresponding to their respective second-level keys.
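Step 302 can be sketched as a small Python helper. The source only requires N parts with 1 ≤ N ≤ the number of second-level keys; the contiguous, near-equal partitioning and the clamping of an out-of-range N shown here are assumptions, and the function name `split_keys` is hypothetical.

```python
def split_keys(second_level_keys, n):
    """Split the keys into n contiguous, near-equal parts (1 <= n <= number of keys)."""
    total = len(second_level_keys)
    if total == 0:
        return []
    # Clamp N into the range the method requires (assumption: clamp rather than reject).
    n = max(1, min(n, total))
    base, extra = divmod(total, n)
    parts, start = [], 0
    for i in range(n):
        # The first `extra` parts receive one additional key.
        size = base + (1 if i < extra else 0)
        parts.append(second_level_keys[start:start + size])
        start += size
    return parts


keys = [f"key{i}" for i in range(1, 8)]   # key1 .. key7
parts = split_keys(keys, 3)               # one part per computing unit
```

Each element of `parts` would then serve as the input of one computing unit.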
How the Hadoop programming model MR reads data from a Redis database is described in detail below. Fig. 4 is a schematic diagram of Hadoop MR reading data from a Redis database in the embodiment of the invention. As shown in Fig. 4, the following processing is involved:
Step 1: In Hadoop MR, a first-level key is first assembled from the time to be read (for example, 2015111016 denotes 15:00 to 16:00 on 10 November 2015), the first-level storage in Redis is read, and all keys of the corresponding second-level storage are obtained.
Step 2: All second-level keys are divided into N parts according to the parameter N passed in by an external program, each part corresponding to the input of one computing unit (mapper). The maximum value of N equals the number of all second-level keys, and N specifies how many mapper programs perform the concurrent computation.
Step 3: Each Hadoop MR mapper reads the second-level key/value pairs assigned to it. Together, all the mappers read all the data stored in Redis for the hour from 15:00 to 16:00 on 10 November 2015.
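Steps 1 to 3 can be walked through end to end in a short Python sketch. It is a simulation under stated assumptions, not the patent's code: worker threads stand in for Hadoop mappers, the dictionaries `INDEX` and `DATA` stand in for the two Redis storage levels, and the minute-level key names and the round-robin split are hypothetical choices.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical database contents for the hour keyed 2015111016.
INDEX = {"2015111016": ["m01", "m02", "m03", "m04", "m05"]}
DATA = {"m01": "dataset1", "m02": "dataset2", "m03": "dataset3",
        "m04": "dataset4", "m05": "dataset5"}


def read_part(keys):
    """Work of one mapper: read the values of the second-level keys assigned to it."""
    return {key: DATA[key] for key in keys}


def read_hour(first_key, n):
    # Step 1: read the first-level entry to obtain all second-level keys.
    second_keys = INDEX[first_key]
    # Step 2: split the keys into N parts (round-robin here), with N clamped to [1, len].
    n = max(1, min(n, len(second_keys)))
    parts = [second_keys[i::n] for i in range(n)]
    # Step 3: each worker reads its own part concurrently.
    with ThreadPoolExecutor(max_workers=n) as pool:
        partial_results = list(pool.map(read_part, parts))
    # Taken together, the workers have read every record of the hour.
    merged = {}
    for partial in partial_results:
        merged.update(partial)
    return merged


records = read_hour("2015111016", 2)
```

In an actual Hadoop job, the splitting would more likely live in a custom `InputFormat` whose splits each carry one part of the second-level keys; the source does not say how the split is wired into Hadoop, so this is only one plausible arrangement.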
In summary, with the technical solution of the embodiment of the invention, arranging the database in two levels extends the data access scope of Hadoop MR and brings a database arranged in this way within the scope of big data processing, so that Hadoop MR can read data from the database directly in bulk, which improves the operating efficiency of Hadoop MR. Reading an in-memory database directly performs better than reading files, so less time is spent on data reading.
Device embodiment one
According to an embodiment of the invention, a data storage device is provided. Fig. 5 is a structural diagram of the data storage device of the embodiment of the invention. As shown in Fig. 5, the data storage device according to the embodiment comprises a level-division module 50 and a storage module 52. Each module of the embodiment is described in detail below.
The level-division module 50 is configured to divide the keys in the database into two levels according to granularity, wherein the value corresponding to a coarser-grained first-level key is the finer-grained second-level keys. Preferably, the granularity is the size of a time range. The database is a Redis database.
The storage module 52 is configured to save the data to be stored as the value of a second-level key.
The specific processing of each module of this embodiment has been described in detail in method embodiment one and is not repeated here.
Device embodiment two
According to an embodiment of the invention, a data reading device is provided, arranged in the Hadoop programming model MR. Fig. 6 is a structural diagram of the data reading device of the embodiment of the invention. As shown in Fig. 6, the data reading device according to the embodiment comprises an acquisition module 60, a splitting module 62 and a reading module 64. Each module of the embodiment is described in detail below.
The acquisition module 60 is configured to obtain the value corresponding to a first-level key in the database, yielding the second-level keys.
The splitting module 62 is configured to divide the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys. The splitting module 62 is specifically configured to determine the parameter N according to the number of computing units that perform the concurrent computation.
The reading module 64 is configured to concurrently read, through the computing units, the values corresponding to their respective second-level keys.
The specific processing of each module of this embodiment has been described in detail in method embodiment two and is not repeated here.
System embodiment
According to an embodiment of the invention, a data storage and reading system is provided. Fig. 7 is a structural diagram of the data storage and reading system of the embodiment of the invention. As shown in Fig. 7, the system according to the embodiment comprises the data storage device 70 of device embodiment one above and the data reading device 72 of device embodiment two above. These devices have been described in detail in the corresponding device embodiments and are not described again here.
In summary, with the technical solution of the embodiment of the invention, arranging the database in two levels extends the data access scope of Hadoop MR and brings a database arranged in this way within the scope of big data processing, so that Hadoop MR can read data from the database directly in bulk, which improves the operating efficiency of Hadoop MR. Reading an in-memory database directly performs better than reading files, so less time is spent on data reading.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is also intended to include them.
The algorithms and displays provided here are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the invention is not directed to any particular programming language. It should be understood that the content of the invention described here can be implemented in various programming languages, and that the description above of a specific language is given to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided here. It will be understood, however, that embodiments of the invention can be practised without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into it, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the client of an embodiment can be adaptively changed and arranged in one or more clients different from that embodiment. The modules in an embodiment can be combined into one module, and can furthermore be divided into multiple submodules, subunits or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or client so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described here include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the client loaded with sequenced network addresses according to the embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described here. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference symbols placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as names.

Claims (11)

1. A data storage method, characterized by comprising:
dividing the keys in a database into two levels according to granularity, wherein the value corresponding to a coarser-grained first-level key is the finer-grained second-level keys;
saving the data to be stored as the value of a second-level key.
2. The method of claim 1, characterized in that the granularity comprises: the size of a time range.
3. The method of claim 1, characterized in that the database is a Redis database.
4. A data reading method, used by the Hadoop programming model MR to read data from a database, characterized by comprising:
obtaining the value corresponding to a first-level key in the database to obtain the second-level keys;
dividing the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;
concurrently reading, by the computing units, the values corresponding to their respective second-level keys.
5. The method of claim 4, characterized in that obtaining the parameter N specifically comprises:
determining the parameter N according to the number of computing units that perform the concurrent computation.
6. A data storage device, characterized by comprising:
a level-division module, configured to divide the keys in a database into two levels according to granularity, wherein the value corresponding to a coarser-grained first-level key is the finer-grained second-level keys;
a storage module, configured to save the data to be stored as the value of a second-level key.
7. The device of claim 6, characterized in that the granularity comprises: the size of a time range.
8. The device of claim 6, characterized in that the database is a Redis database.
9. A data reading device, arranged in the Hadoop programming model MR, characterized by comprising:
an acquisition module, configured to obtain the value corresponding to a first-level key in the database to obtain the second-level keys;
a splitting module, configured to divide the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;
a reading module, configured to concurrently read, through the computing units, the values corresponding to their respective second-level keys.
10. The device of claim 9, characterized in that the splitting module is specifically configured to:
determine the parameter N according to the number of computing units that perform the concurrent computation.
11. A data storage and reading system, characterized by comprising the data storage device of any one of claims 6-8 and the data reading device of any one of claims 9-10.
CN201510853977.2A 2015-11-30 2015-11-30 Data storage and reading method, device and system Pending CN105512216A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510853977.2A CN105512216A (en) 2015-11-30 2015-11-30 Data storage and reading method, device and system

Publications (1)

Publication Number Publication Date
CN105512216A (en) 2016-04-20

Family

ID=55720198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510853977.2A Pending CN105512216A (en) 2015-11-30 2015-11-30 Data storage and reading method, device and system

Country Status (1)

Country Link
CN (1) CN105512216A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111372277A (en) * 2018-12-26 2020-07-03 中兴通讯股份有限公司 Data distribution method, device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102193917A (en) * 2010-03-01 2011-09-21 中国移动通信集团公司 Method and device for processing and querying data
US8375012B1 (en) * 2011-08-10 2013-02-12 Hewlett-Packard Development Company, L.P. Computer indexes with multiple representations
WO2013082507A1 (en) * 2011-11-30 2013-06-06 Decarta Systems and methods for performing geo-search and retrieval of electronic point-of-interest records using a big index
CN103309958A (en) * 2013-05-28 2013-09-18 中国人民大学 OLAP star connection query optimizing method under CPU and GPU mixing framework
CN103678520A (en) * 2013-11-29 2014-03-26 中国科学院计算技术研究所 Multi-dimensional interval query method and system based on cloud computing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111372277A (en) * 2018-12-26 2020-07-03 中兴通讯股份有限公司 Data distribution method, device and storage medium
CN111372277B (en) * 2018-12-26 2023-07-14 南京中兴新软件有限责任公司 Data distribution method, device and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160420