CN106156125A - A replica strategy for a virtual identity management system based on different data organization forms - Google Patents
- Publication number
- CN106156125A (application CN201510163158.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- copy
- user
- virtual identity
- management system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The present invention discloses a replica strategy for a virtual identity management system based on different data organization forms. It mainly comprises the division of virtual identity data, the data organization of copy 1, the data organization of copy 2, copy distribution, and data query. The invention improves the replica strategy of the Cassandra database: the number of copies is set to 2, and after the consistent hashing algorithm of the Cassandra database has divided the virtual identity data, the copies of the data are reorganized. A division method that favors querying is selected to repartition the virtual identity data, and the copies are then placed subject to the rule that copies of the same data must not reside on the same physical machine. Because different copies use different data organization methods to serve different query requests, query time and network cost are reduced and system efficiency is maximized. The strategy is applicable to the data copy placement problem of virtual identity management systems.
Description
Technical field
The invention belongs to the field of Internet technology, and specifically relates to a replica strategy for a virtual identity management system based on different data organization forms.
Background technology
eID (electronic IDentity), in full the citizen network electronic identity, is an authoritative electronic information file that remotely proves an individual's true identity on the network. When an eID is used remotely on the network, the true identity is verified against the public security population database and the eID service platform. This validates the authenticity of a personal identity while protecting the privacy of the citizen's identity, and offers authority, security, traceability, and ease of use. On the Internet, a user has a one-to-many relationship with the virtual identities held under various applications and platforms. In an eID-based network environment, all of these relationships can be anchored to the eID as the unique identifier, and virtual identity data refers to all the data that an eID user has under different applications.
The consistent hashing algorithm in document [2] is a special hash algorithm: when the size of the hash table is adjusted, on average only K/n data items need to be remapped, where K is the number of data items and n is the number of buffers. By contrast, in most other hash tables, a change in the buffer array causes nearly all data to be remapped.
The distributed consistent hashing algorithm in document [3] adds virtual nodes on top of the consistent hashing algorithm. Its purpose is to distribute the hash results across all buffers as evenly as possible, so that all of the buffer space is utilized.
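The two background techniques above can be illustrated with a minimal sketch. It is not taken from the cited documents: the class name `HashRing`, the use of MD5 to spread positions, and the choice of 64 virtual nodes per physical node are all assumptions made for illustration.

```python
import bisect
import hashlib


def stable_hash(text: str) -> int:
    """Map a string to a ring position (MD5 is used only to spread values)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions of all virtual nodes
        self._owner = {}   # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node contributes `vnodes` positions on the ring.
        for i in range(self.vnodes):
            point = stable_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def lookup(self, key):
        # A key belongs to the first virtual node clockwise from its hash.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"eid-{i}" for i in range(1000)]
ring = HashRing(["node-1", "node-2", "node-3"])
before = {k: ring.lookup(k) for k in keys}
ring.add_node("node-4")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to `node-4`; keys that stayed put were not remapped at all, which is the property that keeps the remapped share near K/n instead of nearly all keys.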
Document [3] uses the Cassandra database to store virtual identity data, improves query efficiency by building external indexes, and uses the replica strategy that ships with Cassandra to back up the data. By making reasonable use of the techniques of documents [1] and [2], it stores massive virtual identity data with relatively high efficiency. However, this method improves query efficiency by building a large number of external indexes, which requires considerable storage space and makes the method fairly complex. For replication, it keeps the replica strategy that ships with the Cassandra database and treats copies only as storage redundancy, so the copies are not put to good use.
[1] JiaKui Zhao, PingFei Zhu, LiangHuai Yang. Effective Data Localization Using Consistent Hashing in Cloud Time-Series Databases[J]. Applied Mechanics and Materials, 2013, 347: 2246-2251.
[2] Improvements to consistent hashing [EB/OL]. http://blog.163.com/lin_guoqian@126/blog/static/1693687432012151010409/.
[3] Deng Lu. Research and implementation of key technologies for the storage management of massive virtual identity data, 2010.
Summary of the invention
In view of the above problems, the present invention follows the method of reference [3] in choosing the Cassandra database to store virtual identity data, and provides a replica strategy for a virtual identity management system based on different data organization forms, which is applicable to the data copy placement problem of virtual identity management systems.
The technical scheme is as follows:
A replica strategy for a virtual identity management system based on different data organization forms mainly comprises the following steps:
(1) Virtual identity data division: the column-oriented storage idea of the Cassandra column database is applied to the virtual identity data, which are divided horizontally and vertically; the horizontal division is by eID, and the vertical division is by application;
(2) Data organization of copy 1: all data objects of the same user are stored together; after all data of one user have been stored, the next user is stored; in addition, in the storage order of users, users in the same region are stored together and sorted by registration time;
(3) Data organization of copy 2;
(4) Copy distribution;
(5) Data query: when a client wants to query data, the query statement is first analyzed, and the instruction is then sent to copy 1 or copy 2 according to the analysis result.
Further, step (3) also comprises the following steps:
1) divide the data objects by application platform;
2) within each application platform, divide by the region where the user is located;
3) sort the users of one region within an application platform in the same way as in data copy 1;
4) if a user is not registered on a platform, skip that user.
Further, step (4) also comprises the following steps:
1) store copy 1 and copy 2 separately: copy 1 is stored on the odd-numbered nodes of the cluster, and copy 2 on the even-numbered nodes;
2) odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n;
3) set two different hash functions so that the hash values computed for the data objects are ordered according to the data organization methods of copy 1 and copy 2 respectively, then map the data to the nodes with the consistent hashing algorithm.
Further, the hash function described in step 3) maps one object to another object; the objects can first be sorted, and the hash function is then set according to the sort order of the objects.
Further, when the consistent hashing algorithm in step 3) computes the hash value of a data object, the basic unit of a data object is set to all the data of a given user under a given application, and an eID account together with an application name forms the primary key of a data object.
Further, step (5) also comprises the following steps:
1) analyze the query statement and determine whether it is a user-oriented query;
2) if it is a user-oriented query, send the instruction to copy 1 and perform the query;
3) if it is not a user-oriented query, analyze the query statement and determine whether it is an application-platform-oriented query;
4) if it is an application-platform-oriented query, send the instruction to copy 2 and perform the query;
5) if it is not an application-platform-oriented query, select a copy according to the current system load and perform the query.
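The decision flow of step (5) can be sketched as follows. The sketch assumes that query analysis has already reduced a statement to an optional eID and an optional application name; the `Query` class, the `choose_copy` function, and the load dictionary are hypothetical names, and the patent does not specify how load is measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    eid: Optional[str] = None  # set when the query targets one user
    app: Optional[str] = None  # set when the query targets one application platform


def choose_copy(query: Query, load: dict) -> str:
    """Return the name of the copy that should serve the query."""
    if query.eid is not None:   # sub-steps 1) and 2): user-oriented query
        return "copy1"
    if query.app is not None:   # sub-steps 3) and 4): platform-oriented query
        return "copy2"
    # sub-step 5): neither kind, so pick the less loaded copy
    return min(("copy1", "copy2"), key=lambda name: load[name])


load = {"copy1": 0.7, "copy2": 0.2}  # hypothetical load figures
by_user = choose_copy(Query(eid="eid-001"), load)
by_platform = choose_copy(Query(app="shop"), load)
by_load = choose_copy(Query(), load)
```

Because the user check comes first, a query that names both a user and a platform is sent to copy 1, which follows the order of the sub-steps above.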
The beneficial effects of the invention are as follows. A traditional replica strategy requires that copies of the same data not be placed on the same physical machine, but places no requirement on the placement of copies of different data. The invention mainly improves the replica strategy of the Cassandra database: the number of copies is set to 2, and after the consistent hashing algorithm of the Cassandra database has divided the virtual identity data, the copies of the data are reorganized, a division method that favors querying is selected to repartition the virtual identity data, and the copies are then placed subject to the rule that copies of the same data must not reside on the same physical machine. The organization of data copy 1 facilitates region-by-region management of eID user data. When data need to be read, the copy to operate on is selected according to the characteristics of the requested data, which improves data access efficiency. Data copies are therefore no longer merely redundant storage kept for disaster recovery; instead, different copies use different data organization methods to serve different query requests, which reduces query time and network cost and maximizes system efficiency. The strategy is applicable to the data copy placement problem of virtual identity management systems.
Brief description of the drawings
Fig. 1 is the data query flow chart of the present invention.
Fig. 2 is the copy distribution diagram of the present invention.
Fig. 3 is the virtual identity information diagram in Embodiment 1 of the present invention.
Fig. 4 is the organization diagram of copy 1 in Embodiment 1 of the present invention.
Fig. 5 is the virtual identity data distribution diagram in Embodiment 1 of the present invention.
Fig. 6 is the organization diagram of copy 2 in Embodiment 1 of the present invention.
Detailed description of the invention
To facilitate understanding of the present invention, it is further described below with reference to the accompanying drawings and an embodiment.
The present invention provides a replica strategy for a virtual identity management system based on different data organization forms, which mainly comprises the following steps:
(1) Virtual identity data division: the data model of the Cassandra column database is applied to the virtual identity data, which are divided horizontally and vertically; the horizontal division is by eID, and the vertical division is by application;
(2) Data organization of copy 1: all data objects of the same user are stored together; after all data of one user have been stored, the next user is stored; in addition, in the storage order of users, users in the same region are stored together and sorted in a fixed order;
(3) Data organization of copy 2;
(4) Copy distribution;
(5) Data query: when a client wants to query data, the query statement is first analyzed, and the instruction is then sent to copy 1 or copy 2 according to the analysis result.
Development environment of the invention: an x86 platform running the Linux operating system, JDK 1.7, written in the Java language. The data server needs Cassandra 1.0 or a later version installed to provide data support for the system.
Running environment of the invention: the server side runs on multiple machine nodes, each an x86 platform with the Linux operating system and JDK 1.7 or later installed; the client is an ordinary personal computer.
The following is an exemplary embodiment of the present invention:
Embodiment 1:
(1) Virtual identity data division: in the domain space, users register accounts on different application platforms according to their own needs; these platforms include e-commerce, social networks, online games, and so on. The virtual identity management system unifies this information through the eID: a user has a unique eID and holds different virtual accounts under different application platforms. These data differ partly in structure and in size, and the data volume is huge. The data model of the Cassandra column database is applied to the virtual identity data, as shown in Table 1:
Table 1 virtual identity data model
Based on the above storage model, the invention divides the virtual identity data horizontally and vertically: the horizontal division is by eID, and the vertical division is by application. When the consistent hashing algorithm computes the hash value of a data object, the unit of a data object is set to all the data of a given user under a given application, and an eID account together with an application name forms the "primary key" of a data object. The invention is not concerned with how data are organized inside a data object; it mainly addresses how data objects are organized across the different copies.
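As a minimal sketch of this division, the following groups raw columns into data objects keyed by eID and application name. The record layout, the sample values, and the names `ObjectKey` and `partition` are assumptions made for illustration; the actual Cassandra schema of Table 1 is not reproduced here.

```python
from collections import defaultdict
from typing import NamedTuple


class ObjectKey(NamedTuple):
    eid: str  # horizontal dimension: one eID per user
    app: str  # vertical dimension: one application platform


# Hypothetical raw records: (eID, application, column name, value).
records = [
    ("eid-001", "social", "nickname", "zhangsan"),
    ("eid-001", "shop", "address", "Beijing"),
    ("eid-001", "shop", "level", "gold"),
    ("eid-002", "social", "nickname", "lisi"),
]


def partition(rows):
    """Group columns into data objects keyed by (eID, application)."""
    objects = defaultdict(dict)
    for eid, app, column, value in rows:
        objects[ObjectKey(eid, app)][column] = value
    return dict(objects)


objects = partition(records)
```

Each value in `objects` is one data object, that is, all the data of one user under one application; how the columns inside it are laid out is left open, as in the text.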
(2) Data organization of copy 1: in actual data request operations, the virtual identity information of a given user under all application platforms is often requested. For example, if the eID data of user Zhang San show an anomaly and may have been stolen, all of Zhang San's virtual identity data need to be checked. Suppose Zhang San has registered on 8 application platforms in total, including social networks, e-commerce websites, and online games; all of his virtual identity information in the domain space is then as shown in Fig. 3, where each rectangle represents one data object.
If these data objects were distributed over the data nodes at random, completing the above query might span many different nodes, which not only slows the query but also consumes network bandwidth. Therefore, in the present invention, data copy 1 is organized by storing all data objects of the same user together; after all data of one user have been stored, the next user is stored, and so on. In the storage order of users, users in the same region are stored together, and the users of one region are sorted in a fixed order. This facilitates region-by-region management of eID user data. The final organization of copy 1 is shown in Fig. 4.
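The ordering of copy 1 can be expressed as a sort key. In this sketch the "fixed order" within a region is assumed to be a registration timestamp, and the `DataObject` fields and sample data are hypothetical.

```python
from typing import NamedTuple


class DataObject(NamedTuple):
    eid: str
    app: str
    region: str
    registered: int  # hypothetical registration timestamp of the eID user


def copy1_order(objects):
    """Region first, then the user's fixed order, then the user's applications."""
    return sorted(objects, key=lambda o: (o.region, o.registered, o.eid, o.app))


objects = [
    DataObject("eid-b", "shop", "north", 20),
    DataObject("eid-a", "social", "north", 10),
    DataObject("eid-c", "shop", "south", 5),
    DataObject("eid-a", "shop", "north", 10),
    DataObject("eid-b", "game", "north", 20),
]
layout = [(o.eid, o.app) for o in copy1_order(objects)]
```

In `layout`, all objects of one user are adjacent and users of the same region form one contiguous run, so a user-oriented query reads one contiguous range.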
(3) Organization of copy 2: in actual data request operations, the data of a particular application platform are also commonly operated on, for example to examine Taobao's users in each region. All virtual user data of the Taobao application then need to be obtained; the distribution of Taobao's virtual identity data is shown in Fig. 5.
If all data objects were distributed over the data nodes at random by consistent hashing, or distributed in the manner of copy 1, this operation would likewise span many nodes, consume bandwidth, and lower the efficiency of the cluster; moreover, data copy 2 would serve merely as redundancy for copy 1 and would not be used to full effect. Based on these considerations, the invention organizes data copy 2 as follows:
1) first divide the data objects by application platform;
2) within each application platform, divide by the region where the user is located;
3) sort the users of one region within an application platform in the same way as in data copy 1;
4) if a user is not registered on a platform, skip that user, as shown in Fig. 6.
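The four rules for copy 2 can be sketched as nested loops. The user table, the set of registrations, and the function name `copy2_order` are hypothetical; the per-region user order is again assumed to be a registration timestamp, matching the assumption used for copy 1.

```python
# Hypothetical users: (eID, region, registration timestamp).
users = [
    ("eid-a", "north", 10),
    ("eid-b", "north", 20),
    ("eid-c", "south", 5),
]
# Hypothetical registrations: which user holds an account on which platform.
registered = {
    ("eid-a", "shop"),
    ("eid-a", "social"),
    ("eid-b", "shop"),
    ("eid-c", "social"),
}
apps = ["social", "shop"]


def copy2_order(users, apps, registered):
    """Lay out data objects platform by platform, then by region and user order."""
    layout = []
    by_region = sorted(users, key=lambda u: (u[1], u[2], u[0]))  # rules 2) and 3)
    for app in sorted(apps):                                     # rule 1)
        for eid, region, _ in by_region:
            if (eid, app) in registered:                         # rule 4): skip unregistered
                layout.append((app, region, eid))
    return layout


layout = copy2_order(users, apps, registered)
```

All objects of one platform are adjacent in `layout`, so a platform-oriented query such as "all users of one shop in the north region" reads one contiguous range.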
(4) Copy distribution: the invention stores copy 1 and copy 2 separately, that is, copy 1 is stored on one half of the cluster's nodes and copy 2 on the other half; the distribution is shown in Fig. 2. Odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n. Suitable hash functions are selected so that the hash values of the data objects are ordered according to the organization of copy 1 and copy 2, and the data are then mapped to the nodes with the consistent hashing algorithm. At this point, all data copies have been distributed across the cluster according to the specified method.
The hash function maps one object to another object; the objects can first be sorted, and the hash function is then set according to the sort order of the objects.
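One way to read "sort first, then set the hash function according to the sort order" is a rank-based, order-preserving hash. The sketch below is an assumption, not the patent's actual function: the ring size `RING`, the evenly spaced node tokens, and the function `place` are hypothetical, and Cassandra's own partitioner is not modeled.

```python
import bisect

RING = 2 ** 16  # hypothetical size of the hash space


def place(layout, nodes):
    """Map each object of an ordered layout to a node, keeping neighbors together."""
    # Evenly spaced tokens: node i owns hash values below its token.
    tokens = [(i + 1) * RING // len(nodes) for i in range(len(nodes))]
    placement = {}
    for rank, key in enumerate(layout):
        h = rank * RING // len(layout)  # order-preserving hash derived from the rank
        placement[key] = nodes[bisect.bisect_right(tokens, h)]
    return placement


cluster = [f"node-{i}" for i in range(1, 5)]  # 2n = 4 nodes
odd_nodes, even_nodes = cluster[0::2], cluster[1::2]

# Copy 1 is ordered by user, copy 2 by application platform.
copy1_layout = [("eid-a", "shop"), ("eid-a", "social"), ("eid-b", "shop"), ("eid-b", "social")]
copy2_layout = [("eid-a", "shop"), ("eid-b", "shop"), ("eid-a", "social"), ("eid-b", "social")]

copy1_nodes = place(copy1_layout, odd_nodes)
copy2_nodes = place(copy2_layout, even_nodes)
```

In this small example each user's objects share one odd-numbered node and each platform's objects share one even-numbered node. Since the odd and even node sets are disjoint, the two copies of any object never land on the same node.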
(5) Data query: when a client wants to query data, the query statement is first analyzed, and the instruction is then sent to copy 1 or copy 2 according to the analysis result; the flow is shown in Fig. 1.
A traditional replica strategy requires that copies of the same data not be placed on the same physical machine, but places no requirement on the placement of copies of different data. The invention mainly improves the replica strategy of the Cassandra database: the number of copies is set to 2, and after the consistent hashing algorithm of the Cassandra database has divided the virtual identity data, the copies of the data are reorganized, a division method that favors querying is selected to repartition the virtual identity data, and the copies are then placed subject to the rule that copies of the same data must not reside on the same physical machine. The organization of data copy 1 facilitates region-by-region management of eID user data. When data need to be read, the copy to operate on is selected according to the characteristics of the requested data, which improves data access efficiency. Data copies are therefore no longer merely redundant storage kept for disaster recovery; instead, different copies use different data organization methods to serve different query requests, which reduces query time and network cost and maximizes system efficiency. The strategy is applicable to the data copy placement problem of virtual identity management systems.
The present invention has been described above by way of example. Clearly, its implementation is not limited to the manner described; any improvement made using the technical scheme of the invention, and any direct application of the concept and technical scheme of the invention to other settings without improvement, falls within the protection scope of the invention.
Claims (5)
1. A replica strategy for a virtual identity management system based on different data organization forms, characterized by comprising the following steps:
Step one: virtual identity data division: the column-oriented storage of the Cassandra column database is applied to the virtual identity data, which are divided horizontally and vertically; the horizontal division is by eID, and the vertical division is by application;
Step two: data organization of copy 1: all data objects of the same user are stored together; after all data of one user have been stored, the next user is stored; in addition, in the storage order of users, users in the same region are stored together and sorted by registration time;
Step three: data organization of copy 2;
Step four: copy distribution;
Step five: data query: when a client wants to query data, the query statement is first analyzed, and the instruction is then sent to copy 1 or copy 2 according to the analysis result.
2. The replica strategy for a virtual identity management system based on different data organization forms according to claim 1, characterized in that step three further comprises the following steps:
Step A: divide the data objects by application platform;
Step B: within each application platform, divide by the region where the user is located;
Step C: sort the users of one region within an application platform in the same way as in data copy 1;
Step D: if a user is not registered on a platform, skip that user.
3. The replica strategy for a virtual identity management system based on different data organization forms according to claim 1, characterized in that step four further comprises the following steps:
Step a: store copy 1 and copy 2 separately: copy 1 is stored on the odd-numbered nodes of the cluster, and copy 2 on the even-numbered nodes;
Step b: odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n;
Step c: set two different hash functions so that the hash values of the data objects are ordered according to the data organization methods of copy 1 and copy 2 respectively, then map the data to the nodes with the consistent hashing algorithm.
4. The replica strategy for a virtual identity management system based on different data organization forms according to claim 1, characterized in that step five further comprises the following steps:
Step 1: analyze the query statement and determine whether it is a user-oriented query;
Step 2: if it is a user-oriented query, send the instruction to copy 1 and perform the query;
Step 3: if it is not a user-oriented query, analyze the query statement and determine whether it is an application-platform-oriented query;
Step 4: if it is an application-platform-oriented query, send the instruction to copy 2 and perform the query;
Step 5: if it is not an application-platform-oriented query, select a copy according to the current system load and perform the query.
5. The replica strategy for a virtual identity management system based on different data organization forms according to claim 3, characterized in that, when the consistent hashing algorithm in step c computes the hash value of a data object, the basic unit of a data object is set to all the data of a given user under a given application, and an eID account together with an application name forms the primary key of a data object.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510163158.5A CN106156125B (en) | 2015-04-08 | 2015-04-08 | A method of the virtual identity management system copy based on different data organizational form |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510163158.5A CN106156125B (en) | 2015-04-08 | 2015-04-08 | A method of the virtual identity management system copy based on different data organizational form |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106156125A | 2016-11-23 |
CN106156125B | 2019-08-23 |
Family
ID=57336271
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510163158.5A (Active, CN106156125B) | A method of the virtual identity management system copy based on different data organizational form | 2015-04-08 | 2015-04-08 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106156125B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020065919A1 (en) * | 2000-11-30 | 2002-05-30 | Taylor Ian Lance | Peer-to-peer caching network for user data |
WO2004097667A2 (en) * | 2003-04-25 | 2004-11-11 | Sap Ag | Automated data mining runs |
CN102457571A (en) * | 2011-09-15 | 2012-05-16 | 中标软件有限公司 | Method for uniformly distributing data in cloud storage |
CN103425756A (en) * | 2013-07-31 | 2013-12-04 | 西安交通大学 | Copy management strategy for data blocks in HDFS |
CN103440303A (en) * | 2013-08-21 | 2013-12-11 | 曙光信息产业股份有限公司 | Heterogeneous cloud storage system and data processing method thereof |
CN103577407A (en) * | 2012-07-19 | 2014-02-12 | 国际商业机器公司 | Query method and query device for distributed database |
Non-Patent Citations (1)
Title |
---|
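Deng Lu et al.: "Research on eID-based virtual identity data storage", Proceedings of the 28th National Computer Security Academic Exchange Conference * |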
Also Published As
Publication number | Publication date |
---|---|
CN106156125B (en) | 2019-08-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||