CN105512216A - Data storage and reading method, device and system

Info

Publication number: CN105512216A
Application number: CN201510853977.2A
Authority: CN
Inventors: 严峰
Original assignee: Qingdao Haier Intelligent Home Appliance Technology Co Ltd
Current assignee: Qingdao Haier Intelligent Home Appliance Technology Co Ltd
Priority date: 2015-11-30
Filing date: 2015-11-30
Publication date: 2016-04-20

Abstract

The invention discloses a data storage and reading method, device, and system. The data reading method comprises: obtaining the value corresponding to a first-level key in a database, thereby obtaining second-level keys; dividing the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys; and concurrently reading, through the computing units, the values corresponding to the respective second-level keys. According to this technical scheme, the data access scope of Hadoop MR is expanded, a database organized in a two-level mode is also brought into the scope of big data processing, Hadoop MR can read data directly from the database in bulk, and the operating efficiency of Hadoop MR is improved.

Description

Data storage and reading method, apparatus, and system

Technical field

The present invention relates to the technical field of databases, and in particular to a data storage and reading method, apparatus, and system.

Background art

Hadoop is currently the most widely used and most mature open-source big data storage and computing platform in the industry. It contains many components, among which MapReduce (MR) is the programming model on the Hadoop platform, suited to distributed computation over large data volumes. Redis is an open-source key/value in-memory database that supports storage of multiple data formats.

Hadoop MR provides read implementations for many data sources, such as files and the HBase database, but it does not support reading data in bulk directly from Redis well. Specifically, Hadoop MR is a powerful tool for distributed big data computation, and its input is generally HDFS files, whereas Redis is a key/value in-memory database in which a single value is generally read by a single key. Therefore, a way for Hadoop MR to read data directly from Redis in bulk is currently lacking.

Summary of the invention

In view of the above problems, the present invention is proposed to provide a data storage and reading method, apparatus, and system that overcome the above problems or at least partially solve them.

The invention provides a data storage method, comprising:

Dividing the keys in a database into two levels according to granularity, wherein the value corresponding to a first-level key, which has the coarser granularity, is the second-level keys, which have the finer granularity;

Saving the data to be stored as the values of the second-level keys.

The invention also provides a data reading method, used by the Hadoop programming model MR to read data from a database, comprising:

Obtaining the value corresponding to a first-level key in the database, thereby obtaining second-level keys;

Dividing the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;

Concurrently reading, through the computing units, the values corresponding to the respective second-level keys.

The invention also provides a data storage device, comprising:

A grading module, configured to divide the keys in a database into two levels according to granularity, wherein the value corresponding to a first-level key, which has the coarser granularity, is the second-level keys, which have the finer granularity;

A storage module, configured to save the data to be stored as the values of the second-level keys.

The invention also provides a data reading device, arranged in the Hadoop programming model MR, comprising:

An acquisition module, configured to obtain the value corresponding to a first-level key in the database, thereby obtaining second-level keys;

A splitting module, configured to divide the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;

A reading module, configured to concurrently read, through the computing units, the values corresponding to the respective second-level keys.

The invention also provides a data storage and reading system, comprising the above data storage device and the above data reading device.

The beneficial effects of the present invention are as follows:

By organizing the database in a two-level mode, the data access scope of Hadoop MR is expanded, and a database organized in the two-level mode is also brought into the scope of big data processing, so that Hadoop MR can read data directly from the database in bulk, which improves the operating efficiency of Hadoop MR.

The above description is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set forth below.

Brief description of the drawings

Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:

Fig. 1 is a flowchart of the data storage method of an embodiment of the present invention;

Fig. 2 is a schematic diagram of the Redis database of an embodiment of the present invention;

Fig. 3 is a flowchart of the data reading method of an embodiment of the present invention;

Fig. 4 is a schematic diagram of Hadoop MR reading data from a Redis database according to an embodiment of the present invention;

Fig. 5 is a structural diagram of the data storage device of an embodiment of the present invention;

Fig. 6 is a structural diagram of the data reading device of an embodiment of the present invention;

Fig. 7 is a structural diagram of the data storage and reading system of an embodiment of the present invention.

Detailed description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.

To solve the problem in the prior art that Hadoop MR cannot read data directly from a Redis database in bulk, the invention provides a data storage and reading method, apparatus, and system, which are described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.

Method embodiment one

According to an embodiment of the present invention, a data storage method is provided. Fig. 1 is a flowchart of the data storage method of the embodiment. As shown in Fig. 1, the data storage method according to the embodiment comprises the following processing:

Step 101: divide the keys in the database into two levels according to granularity, wherein the value corresponding to a first-level key, which has the coarser granularity, is the second-level keys, which have the finer granularity. Preferably, the granularity comprises the size of a time range. That is, a key stored at the first level covers a time range of one hour, for example the data of a certain hour, and its value is the list of keys stored at the second level; a key stored at the second level covers a time range of one minute, for example the data within a certain minute, and its value is the reported data.

Step 102: save the data to be stored as the values of the second-level keys.

Preferably, in this embodiment, the database is a Redis database. That is, the technical solution of this embodiment uses two levels of associated key/value pairs in Redis to accommodate batch read operations. Fig. 2 is a schematic diagram of the Redis database of the embodiment. As shown in Fig. 2, the value under a key stored at the first level is the list of keys stored at the second level, such as key1, key2, ..., keyn. The value of a key stored at the second level is the reported data; for example, the value of key1 is the data group dataset1, and the value of key2 is the data group dataset2.
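The two-level layout of Fig. 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: plain dictionaries stand in for Redis, and the class name TwoLevelStore and its methods save and second_keys are hypothetical names introduced only for illustration.

```python
from collections import defaultdict


class TwoLevelStore:
    """In-memory stand-in for a database laid out in two levels (assumption:
    dictionaries replace Redis so the sketch runs without a server)."""

    def __init__(self):
        # First level: coarse-grained key -> list of second-level keys.
        self.first_level = defaultdict(list)
        # Second level: fine-grained key -> reported data.
        self.second_level = {}

    def save(self, first_key, second_key, data):
        """Store data under second_key and list second_key under first_key."""
        if second_key not in self.second_level:
            self.first_level[first_key].append(second_key)
        self.second_level[second_key] = data

    def second_keys(self, first_key):
        """Return a copy of the second-level keys listed under first_key."""
        return list(self.first_level.get(first_key, []))


store = TwoLevelStore()
store.save("2015111016", "key1", "dataset1")
store.save("2015111016", "key2", "dataset2")
print(store.second_keys("2015111016"))  # ['key1', 'key2']
```

Keeping the list of second-level keys as the value of the first-level key is what lets a reader discover, with one lookup, every fine-grained entry that belongs to a given hour.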

The technical solution of this embodiment stores the data in Redis at two levels, so that Hadoop MR can read the data stored in Redis directly in batches and thus perform distributed computation on it. How Hadoop MR reads the data stored in Redis directly in batches is described in detail in method embodiment two.

Method embodiment two

According to an embodiment of the present invention, a data reading method is provided, used by the Hadoop programming model MR to read data from a database. Fig. 3 is a flowchart of the data reading method of the embodiment. As shown in Fig. 3, the data reading method according to the embodiment comprises the following processing:

Step 301: obtain the value corresponding to a first-level key in the database, thereby obtaining the second-level keys;

Step 302: divide the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;

In step 302, preferably, obtaining the parameter N specifically comprises: determining the parameter N according to the number of computing units that perform the concurrent computation.

Step 303: concurrently read, through the computing units, the values corresponding to the respective second-level keys.
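The division in step 302 can be sketched in Python as follows. The source only requires N parts with 1 ≤ N ≤ the number of second-level keys; the contiguous, near-equal split used here and the function name split_keys are assumptions made for illustration.

```python
def split_keys(second_level_keys, n):
    """Divide the second-level keys into n contiguous, near-equal parts."""
    total = len(second_level_keys)
    if not 1 <= n <= total:
        raise ValueError("N must satisfy 1 <= N <= number of second-level keys")
    base, extra = divmod(total, n)
    parts, start = [], 0
    for i in range(n):
        # The first `extra` parts each take one additional key.
        size = base + (1 if i < extra else 0)
        parts.append(second_level_keys[start:start + size])
        start += size
    return parts


parts = split_keys(["key1", "key2", "key3", "key4", "key5"], 2)
print(parts)  # [['key1', 'key2', 'key3'], ['key4', 'key5']]
```

Because N never exceeds the number of keys, every part is non-empty, so every computing unit receives at least one second-level key as input.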

The reading of data from a Redis database by the Hadoop programming model MR is described in detail below. Fig. 4 is a schematic diagram of Hadoop MR reading data from a Redis database according to the embodiment. As shown in Fig. 4, the process specifically comprises:

Step 1: in Hadoop MR, first assemble the first-level key from the time to be read (for example, 2015111016 represents 15:00 to 16:00 on November 10, 2015), read the first-level storage in Redis, and obtain all keys of the corresponding second-level storage.

Step 2: divide all second-level keys into N parts according to the parameter N passed in by an external program, each part being the input of one computing unit (mapper), wherein the maximum value of N equals the number of all second-level keys, and N specifies how many mapper programs perform the concurrent computation.

Step 3: each mapper of Hadoop MR reads the second-level key/value pairs assigned to it; taken together, all the mappers read all the data stored in Redis for the hour from 15:00 to 16:00 on November 10, 2015.
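Steps 1 to 3 can be sketched end to end in Python under loud assumptions: threads stand in for Hadoop mappers, two dictionaries stand in for the Redis first-level and second-level storage, and the names FIRST_LEVEL, SECOND_LEVEL, mapper, and read_hour are hypothetical. A round-robin split is used here; the source does not prescribe how the N parts are formed.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for Redis: one first-level key listing
# the second-level keys, and one entry per second-level key.
FIRST_LEVEL = {"2015111016": ["key1", "key2", "key3", "key4"]}
SECOND_LEVEL = {"key1": "dataset1", "key2": "dataset2",
                "key3": "dataset3", "key4": "dataset4"}


def mapper(assigned_keys):
    """Step 3: one computing unit reads the values of its assigned keys."""
    return {key: SECOND_LEVEL[key] for key in assigned_keys}


def read_hour(first_key, n):
    # Step 1: read the first-level entry to obtain all second-level keys.
    keys = FIRST_LEVEL[first_key]
    if not 1 <= n <= len(keys):
        raise ValueError("N must be between 1 and the number of second-level keys")
    # Step 2: deal the keys round-robin into N parts, one per mapper.
    parts = [keys[i::n] for i in range(n)]
    # Step 3: run the N mappers concurrently and merge what they read.
    merged = {}
    with ThreadPoolExecutor(max_workers=n) as pool:
        for partial in pool.map(mapper, parts):
            merged.update(partial)
    return merged


result = read_hour("2015111016", 2)
print(sorted(result.items()))
```

With N = 2, one worker reads key1 and key3 while the other reads key2 and key4; the merged result covers every second-level entry listed under the first-level key.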

In summary, with the technical solution of this embodiment, organizing the database in a two-level mode expands the data access scope of Hadoop MR and brings a database organized in the two-level mode into the scope of big data processing, so that Hadoop MR can read data directly from the database in bulk, which improves the operating efficiency of Hadoop MR. Reading an in-memory database directly performs better than reading files, so less time is spent on data reading.

Device embodiment one

According to an embodiment of the present invention, a data storage device is provided. Fig. 5 is a structural diagram of the data storage device of the embodiment. As shown in Fig. 5, the data storage device according to the embodiment comprises a grading module 50 and a storage module 52. Each module of the embodiment is described in detail below.

Grading module 50, configured to divide the keys in a database into two levels according to granularity, wherein the value corresponding to a first-level key, which has the coarser granularity, is the second-level keys, which have the finer granularity. Preferably, the granularity comprises the size of a time range. The database is a Redis database.

Storage module 52, configured to save the data to be stored as the values of the second-level keys.

The specific processing of each module of this embodiment has been described in detail in method embodiment one and is not repeated here.

Device embodiment two

According to an embodiment of the present invention, a data reading device is provided, arranged in the Hadoop programming model MR. Fig. 6 is a structural diagram of the data reading device of the embodiment. As shown in Fig. 6, the data reading device according to the embodiment comprises an acquisition module 60, a splitting module 62, and a reading module 64. Each module of the embodiment is described in detail below.

Acquisition module 60, configured to obtain the value corresponding to a first-level key in the database, thereby obtaining the second-level keys;

Splitting module 62, configured to divide the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys. The splitting module 62 is specifically configured to determine the parameter N according to the number of computing units that perform the concurrent computation.

Reading module 64, configured to concurrently read, through the computing units, the values corresponding to the respective second-level keys.
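How the splitting module might derive N from the number of computing units can be sketched as follows. The source states only that N is determined according to that number and lies between 1 and the number of second-level keys; taking the smaller of the two counts is an assumption, and choose_partition_count is a hypothetical name.

```python
def choose_partition_count(num_computing_units, num_second_level_keys):
    """Pick N from the number of computing units, kept within [1, key count]."""
    if num_second_level_keys < 1:
        raise ValueError("at least one second-level key is required")
    if num_computing_units < 1:
        raise ValueError("at least one computing unit is required")
    # N never exceeds the number of keys, so no computing unit gets empty input.
    return min(num_computing_units, num_second_level_keys)


print(choose_partition_count(8, 60))    # 8
print(choose_partition_count(100, 60))  # 60
```

For an hour of minute-granularity data there would be at most 60 second-level keys, so asking for more than 60 computing units would, under this assumption, still yield N = 60.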

System embodiment

According to an embodiment of the present invention, a data storage and reading system is provided. Fig. 7 is a structural diagram of the data storage and reading system of the embodiment. As shown in Fig. 7, the data storage and reading system according to the embodiment comprises the data storage device 70 of device embodiment one above and the data reading device 72 of device embodiment two above. These devices have been described in detail in the corresponding device embodiments and are not described again here.

Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is also intended to include them.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the invention is not directed to any particular programming language. It should be understood that various programming languages may be used to implement the invention described herein, and that the above description of a specific language is made to disclose the best mode of the invention.

In the specification provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.

Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.

Those skilled in the art will appreciate that the modules in the client of an embodiment may be adaptively changed and arranged in one or more clients different from that embodiment. The modules of an embodiment may be combined into one module, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or client so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.

In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the client loaded with sequence network addresses according to the embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference symbol placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.

Claims

1. A data storage method, characterized by comprising:

Saving the data to be stored as the values of the second-level keys.

2. The method of claim 1, characterized in that the granularity comprises the size of a time range.

3. The method of claim 1, characterized in that the database is a Redis database.

4. A data reading method, used by the Hadoop programming model MR to read data from a database, characterized by comprising:

Dividing the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;

5. The method of claim 4, characterized in that obtaining the parameter N specifically comprises:

Determining the parameter N according to the number of computing units that perform the concurrent computation.

6. A data storage device, characterized by comprising:

7. The device of claim 6, characterized in that the granularity comprises the size of a time range.

8. The device of claim 6, characterized in that the database is a Redis database.

9. A data reading device, arranged in the Hadoop programming model MR, characterized by comprising:

A splitting module, configured to divide the second-level keys into N parts according to an obtained parameter N, wherein each part of the second-level keys corresponds to the input of one computing unit, and N is greater than or equal to 1 and less than or equal to the number of second-level keys;

10. The device of claim 9, characterized in that the splitting module is specifically configured to:

11. A data storage and reading system, characterized by comprising the data storage device of any one of claims 6 to 8 and the data reading device of any one of claims 9 to 10.