CN106156125A - A virtual identity management system replica strategy based on different data organization forms - Google Patents

Info

Publication number
CN106156125A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510163158.5A
Other languages
Chinese (zh)
Other versions
CN106156125B (en)
Inventor
傅翔
朱伟辉
贾焰
韩伟红
李树栋
李爱平
周斌
杨树强
黄九鸣
全拥
邓璐
刘斐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN201510163158.5A
Publication of CN106156125A
Application granted
Publication of CN106156125B
Legal status: Active
Anticipated expiration

Abstract

The present invention discloses a replica strategy for a virtual identity management system based on different data organization forms. It mainly comprises partitioning the virtual identity data, organizing the data of replica 1, organizing the data of replica 2, distributing the replicas, and querying the data. The invention improves the replica strategy of the Cassandra database: the number of replicas is set to 2, and after the consistent hashing algorithm of the Cassandra database has partitioned the virtual identity data, the replicas are reorganized by repartitioning the virtual identity data in ways that favor queries. The replicas are then placed according to the rule that replicas of the same data must not reside on the same physical machine. Because the two replicas use different data organization methods to serve different kinds of query requests, query time and network overhead are reduced and system efficiency is maximized. The strategy is suited to the replica placement problem in virtual identity management systems.

Description

A virtual identity management system replica strategy based on different data organization forms
Technical field
The invention belongs to the field of Internet technology and specifically relates to a replica strategy for a virtual identity management system based on different data organization forms.
Background
eID (electronic IDentity), in full the citizen network electronic identity, is an authoritative electronic credential that remotely proves an individual's real identity over the network. When an eID is used remotely over the network, the real identity is verified against the public security population database and the eID service platform. This validates the authenticity of a personal identity while protecting the citizen's identity privacy, and it is authoritative, secure, traceable, and convenient to use. On the Internet, a user has a one-to-many relationship with the virtual identities held under various applications and platforms. In an eID-based network environment, all of these relationships can be anchored to the eID as the unique identifier, and virtual identity data refers to all the data that an eID user holds under the different applications.
The consistent hashing algorithm in document [2] is a special kind of hashing algorithm: when the hash table is resized, only K/n data items need to be remapped on average, where K is the amount of data (the number of keys) and n is the number of cache slots. By contrast, in most other hash tables a change in the slot array causes nearly all of the data to be remapped.
The distributed consistent hashing algorithm in document [3] adds virtual nodes on top of consistent hashing. Its purpose is to spread the hash results as evenly as possible across all caches, so that all of the cache space is used.
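The two ideas above, limited remapping and virtual nodes, can be illustrated with a minimal sketch. It is not the implementation used by Cassandra or by the cited documents; the class name `HashRing`, the choice of SHA-256, the default of 100 virtual nodes per physical node, and the helper `moved_fraction` are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring; each physical node owns `vnodes` virtual points."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each virtual node is just another point on the ring for this node.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def lookup(self, key):
        """Return the node owning the first ring point at or after hash(key)."""
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def moved_fraction(n_keys=10_000, n_nodes=4):
    """Fraction of keys that change owner when one node joins n_nodes others."""
    keys = [f"key-{i}" for i in range(n_keys)]
    ring = HashRing([f"node-{i}" for i in range(n_nodes)])
    before = {k: ring.lookup(k) for k in keys}
    ring.add_node(f"node-{n_nodes}")
    moved = sum(1 for k in keys if ring.lookup(k) != before[k])
    return moved / n_keys
```

Under these assumptions, `moved_fraction()` should come out near one fifth when a fifth node joins four existing ones, in line with the K/n estimate, whereas a plain `hash(key) % n` scheme would remap most keys.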
Document [3] uses a Cassandra database to store virtual identity data, improves query efficiency by building external indexes, and backs up data with Cassandra's built-in replica strategy. It makes sensible use of the techniques of documents [1] and [2] and stores massive virtual identity data with fairly high efficiency. However, the method improves query efficiency by building a large number of external indexes, which requires considerable storage space and makes the method rather complex. As for replicas, it keeps Cassandra's built-in replica strategy and treats replicas merely as storage redundancy, without making further use of them.
[1] JiaKui Zhao, PingFei Zhu, LiangHuai Yang. Effective Data Localization Using Consistent Hashing in Cloud Time-Series Databases [J]. Applied Mechanics and Materials, 2013, 347: 2246-2251.
[2] Improvements to consistent hashing [EB/OL]. http://blog.163.com/lin_guoqian@126/blog/static/1693687432012151010409/.
[3] Deng Lu. Research and implementation of key technologies for the storage management of massive virtual identity data, 2010.
Content of the invention
For problem above, the method selection Cassandra database of bibliography of the present invention [3] stores virtual Identity database, provides a kind of virtual identity management system replication policy based on different pieces of information organizational form, It is applicable to the data trnascription Placement Problems of virtual identity management system.
Technical scheme is as follows:
A kind of virtual identity management system replication policy based on different pieces of information organizational form, mainly includes following Step:
(1) virtual identity data divide: apply the thought by row storage of Cassandra column database In virtual identity data, horizontal division and vertical division are carried out to virtual identity data, horizontal direction according to EID divides, and vertical direction divides according to application program;
(2) the data tissue of copy 1: all data objects of same user are stored together, storage After all data of a complete user, the more next user of storage, meanwhile, in the storage order of user, By centrally stored for the user in same area and be ranked up by the sequencing of hour of log-on;
(3) the data tissue of copy 2;
(4) copy distribution;
(5) data query: when client data to be inquired about, first analyzes query statement, then basis Analysis result selects instruction to be sent to copy 1 or copy 2.
Further, step (3) comprises the following steps:
1) partition the data objects by application platform;
2) within each application platform, partition by the region where the user is located;
3) sort the users of one region within an application platform in the same way as in replica 1;
4) if a user is not registered on a platform, skip that user.
Further, step (4) comprises the following steps:
1) store replica 1 and replica 2 separately: replica 1 is stored on the odd-numbered nodes of the cluster and replica 2 on the even-numbered nodes;
2) odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n;
3) set two different hash functions so that the computed hash values of the data objects are ordered according to the data organization methods of replica 1 and replica 2 respectively, then map the data to the nodes with the consistent hashing algorithm.
Further, the hash function described in step 3) maps one object to another object. The objects can be ordered first, and the hash function is then set according to that ordering.
Further, when the consistent hashing algorithm in step 3) computes the hash value of a data object, the basic unit of a data object is all the data of one user under one application, and an eID account together with an application name forms the primary key of a data object.
Further, step (5) comprises the following steps:
1) analyze the query statement and determine whether it is a user-centric query;
2) if it is a user-centric query, send the instruction to replica 1 and execute the query;
3) if it is not a user-centric query, analyze the query statement and determine whether it is an application-platform-centric query;
4) if it is an application-platform-centric query, send the instruction to replica 2 and execute the query;
5) if it is not an application-platform-centric query, select a replica according to the current system load and execute the query.
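The routing steps above can be sketched as follows. The text does not specify how a query statement is analyzed, so this sketch assumes a hypothetical pre-parsed query represented as a dictionary with optional `eid` and `app` filters, and a hypothetical `load` mapping from replica number to current load.

```python
def choose_replica(query, load):
    """Pick replica 1 or 2 for a query.

    query: dict that may carry an 'eid' filter (user-centric query) and/or
           an 'app' filter (application-platform-centric query).
    load:  dict mapping replica number (1 or 2) to its current load.
    """
    # Steps 1-2: user-centric queries go to replica 1 (organized per user).
    if query.get("eid") is not None:
        return 1
    # Steps 3-4: platform-centric queries go to replica 2 (organized per app).
    if query.get("app") is not None:
        return 2
    # Step 5: otherwise choose the less loaded replica.
    return min((1, 2), key=lambda replica: load[replica])
```

For example, `choose_replica({"eid": "e1"}, {1: 0.9, 2: 0.1})` selects replica 1 even though it is busier, because the user check comes first in the flow.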
The beneficial effects of the invention are as follows. A traditional replica strategy requires that replicas of the same data not be placed on the same physical machine, but imposes no requirement on how replicas of different data are placed. The invention mainly improves the replica strategy of the Cassandra database: the number of replicas is set to 2, and after the consistent hashing algorithm of the Cassandra database has partitioned the virtual identity data, the replicas are reorganized by repartitioning the virtual identity data in ways that favor queries. The replicas are then placed according to the rule that replicas of the same data must not reside on the same physical machine. The organization of replica 1 makes it easier to manage eID user data region by region. When data needs to be read, the replica to operate on is chosen according to the characteristics of the requested data, which improves data access efficiency. A replica is therefore no longer just redundant storage kept for disaster recovery. Instead, different replicas use different data organization methods to serve different kinds of query requests, which reduces query time and network overhead and maximizes system efficiency. The strategy is suited to the replica placement problem in virtual identity management systems.
Brief description of the drawings
Fig. 1 is the data query flow chart of the invention.
Fig. 2 is the replica distribution diagram of the invention.
Fig. 3 is the virtual identity information diagram in Embodiment 1 of the invention.
Fig. 4 is the organization diagram of replica 1 in Embodiment 1 of the invention.
Fig. 5 is the virtual identity data distribution diagram in Embodiment 1 of the invention.
Fig. 6 is the organization diagram of replica 2 in Embodiment 1 of the invention.
Detailed description of the invention
To make the invention easier to understand, it is further described below with reference to the drawings and an embodiment.
The invention provides a replica strategy for a virtual identity management system based on different data organization forms, which mainly comprises the following steps:
(1) Virtual identity data partitioning: apply the data model of the Cassandra column database to the virtual identity data and partition the data both horizontally and vertically. The horizontal partition is by eID and the vertical partition is by application;
(2) Data organization of replica 1: all data objects of the same user are stored together. After all the data of one user has been stored, the next user is stored. In the storage order of users, users in the same region are stored together and sorted in a fixed order;
(3) Data organization of replica 2;
(4) Replica distribution;
(5) Data query: when a client wants to query data, the query statement is analyzed first, and the instruction is then sent to replica 1 or replica 2 according to the analysis result.
Development environment of the invention: an x86 platform running a Linux operating system, JDK 1.7, written in the Java language. The data servers need Cassandra 1.0 or a later version installed to provide data support for the system.
Runtime environment of the invention: the server side runs on multiple machine nodes, each an x86 platform with a Linux operating system and JDK 1.7 or later. The client is an ordinary personal computer.
An exemplary embodiment of the invention follows:
Embodiment 1:
(1) Virtual identity data partitioning: in cyberspace, users register accounts on different application platforms according to their own needs. These platforms include e-commerce, social networks, online games, and so on. The virtual identity management system unifies this information through the eID: a user has one unique eID, and holds different virtual accounts under different application platforms. These data differ in structure and size, and the data volume is huge. The data model of the Cassandra column database is applied to the virtual identity data, as shown in Table 1:
Table 1. Virtual identity data model
Based on the storage model above, the invention partitions the virtual identity data horizontally and vertically. The horizontal partition is by eID and the vertical partition is by application. When the consistent hashing algorithm computes the hash value of a data object, the unit of a data object is all the data of one user under one application, and an eID account together with an application name forms the "primary key" of a data object. The invention is not concerned with how data is organized inside a data object. It mainly addresses how data objects are organized in the different replicas.
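A minimal sketch of this partitioning follows. The record layout `(eid, app, field, value)` and the names `ObjectKey` and `partition` are assumptions made for illustration; the text only states that an eID account and an application name together key one data object.

```python
from collections import defaultdict
from typing import NamedTuple


class ObjectKey(NamedTuple):
    """Composite 'primary key' of one data object."""
    eid: str   # horizontal partition: the user's eID
    app: str   # vertical partition: the application name


def partition(records):
    """Group raw (eid, app, field, value) tuples into data objects.

    Each data object holds all the data of one user under one application.
    """
    objects = defaultdict(dict)
    for eid, app, field, value in records:
        objects[ObjectKey(eid, app)][field] = value
    return dict(objects)
```

The internal layout of each object (here a plain dictionary of fields) is deliberately left simple, since the strategy only depends on the object key.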
(2) Data organization of replica 1: in actual data request operations, the virtual identity information of one user under all application platforms is often requested. For example, if the eID data of user Zhang San shows anomalies and may have been stolen, all of Zhang San's virtual identity data needs to be checked. Suppose Zhang San has registered with 8 application platforms in total, including social networks, e-commerce websites, and online games. All of his virtual identity information in cyberspace is then as shown in Fig. 3, where each rectangle represents one data object.
If these data objects were distributed over the data nodes at random, completing the query above might span many different nodes, which not only slows the query but also consumes network bandwidth. In the present invention, replica 1 is therefore organized so that all data objects of the same user are stored together. After all the data of one user has been stored, the next user is stored, and so on. In the storage order of users, users in the same region are stored together, and the users of one region are sorted in a fixed order. This makes it easier to manage eID user data region by region. The final organization of replica 1 is shown in Fig. 4.
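The ordering of replica 1 can be sketched as a sort over object keys. The `User` record and its `registered` field are hypothetical; the embodiment only requires some fixed order within a region, and registration time is used here because the summary and claims name it.

```python
from typing import NamedTuple


class User(NamedTuple):
    eid: str
    region: str
    registered: int  # hypothetical registration timestamp


def replica1_order(users, objects):
    """Return (eid, app) keys in replica 1 order.

    Users are grouped by region, ordered by registration time within a
    region, and all objects of one user are kept contiguous.
    """
    by_eid = {u.eid: u for u in users}

    def sort_key(key):
        eid, app = key
        user = by_eid[eid]
        return (user.region, user.registered, user.eid, app)

    return sorted(objects, key=sort_key)
```

With this order, a query for everything belonging to one eID touches a single contiguous run of objects.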
(3) Data organization of replica 2: in actual data request operations, the data of one particular application platform is also commonly operated on, for example to check the users of Taobao in each region. All the virtual user data of the Taobao application must then be retrieved. The distribution of Taobao's virtual identity data is shown in Fig. 5.
If all the data objects were distributed over the data nodes at random by consistent hashing, or distributed in the same way as replica 1, this operation would likewise span many nodes, consuming bandwidth and lowering the efficiency of the cluster. Replica 2 would also serve only as a redundant copy of replica 1 and would not be used to full effect. For these reasons, the invention organizes replica 2 as follows: 1. first partition the data objects by application platform; 2. within each application platform, partition by the region where the user is located; 3. sort the users of one region within an application platform in the same way as in replica 1; 4. if a user is not registered on a platform, skip that user. The result is shown in Fig. 6.
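The four steps for replica 2 can be sketched as nested iteration over platforms and users. The input shapes are assumptions for illustration: `users` maps an eID to a hypothetical `(region, registered)` pair, and `objects` is the set of `(eid, app)` keys that actually exist.

```python
def replica2_order(users, objects):
    """Return (eid, app) keys in replica 2 order.

    users:   dict mapping eid -> (region, registered)
    objects: set of existing (eid, app) keys
    """
    # Step 1: partition by application platform.
    apps = sorted({app for _, app in objects})
    # Steps 2-3: group users by region, then use the replica 1 user order.
    ranked_users = sorted(users, key=lambda e: (users[e][0], users[e][1], e))
    ordered = []
    for app in apps:
        for eid in ranked_users:
            # Step 4: a user not registered on this platform is skipped.
            if (eid, app) in objects:
                ordered.append((eid, app))
    return ordered
```

With this order, a query for all users of one platform, optionally narrowed to one region, also touches a single contiguous run of objects.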
(4) Replica distribution: the invention stores replica 1 and replica 2 separately. Replica 1 is stored on one half of the cluster's nodes and replica 2 on the other half, as shown in Fig. 2. Odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n. Suitable hash functions are chosen so that the hash values of the data objects are ordered according to the organization of replica 1 and of replica 2, and the consistent hashing algorithm then maps the data to the nodes. At this point all replicas have been distributed across the cluster in the specified way.
The hash function maps one object to another object. The objects can be ordered first, and the hash function is then set according to that ordering.
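One possible reading of this step is sketched below. The text does not give the hash functions, so the sketch assumes a rank-based, order-preserving hash (an object's position in the replica's ordering, scaled onto the ring) and evenly spaced node tokens. The names `RING_SIZE`, `rank_hash`, `place`, and `distribute` are hypothetical.

```python
import bisect

RING_SIZE = 2 ** 32


def rank_hash(ordered_keys):
    """Order-preserving 'hash': spread keys over the ring by their rank."""
    step = RING_SIZE // max(len(ordered_keys), 1)
    return {key: rank * step for rank, key in enumerate(ordered_keys)}


def place(ordered_keys, nodes):
    """Assign each key to the first node token at or after its hash value."""
    # Evenly spaced tokens, one per node, with the last token closing the ring.
    tokens = [(i + 1) * RING_SIZE // len(nodes) - 1 for i in range(len(nodes))]
    placement = {}
    for key, value in rank_hash(ordered_keys).items():
        idx = bisect.bisect_left(tokens, value) % len(nodes)
        placement[key] = nodes[idx]
    return placement


def distribute(replica1_keys, replica2_keys, n):
    """Nodes are numbered 1..2n; odd nodes hold replica 1, even nodes replica 2."""
    odd = [i for i in range(1, 2 * n + 1) if i % 2 == 1]
    even = [i for i in range(1, 2 * n + 1) if i % 2 == 0]
    return place(replica1_keys, odd), place(replica2_keys, even)
```

Because the odd and even node sets are disjoint, the two replicas of any data object can never land on the same node, and because the hash preserves each replica's ordering, neighbouring objects in that ordering tend to share a node.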
(5) Data query: when a client wants to query data, the query statement is analyzed first, and the instruction is then sent to replica 1 or replica 2 according to the analysis result. The flow is shown in Fig. 1.
A traditional replica strategy requires that replicas of the same data not be placed on the same physical machine, but imposes no requirement on how replicas of different data are placed. The invention mainly improves the replica strategy of the Cassandra database: the number of replicas is set to 2, and after the consistent hashing algorithm of the Cassandra database has partitioned the virtual identity data, the replicas are reorganized by repartitioning the virtual identity data in ways that favor queries. The replicas are then placed according to the rule that replicas of the same data must not reside on the same physical machine. The organization of replica 1 makes it easier to manage eID user data region by region. When data needs to be read, the replica to operate on is chosen according to the characteristics of the requested data, which improves data access efficiency. A replica is therefore no longer just redundant storage kept for disaster recovery. Instead, different replicas use different data organization methods to serve different kinds of query requests, which reduces query time and network overhead and maximizes system efficiency. The strategy is suited to the replica placement problem in virtual identity management systems.
The invention has been described above by way of example. Its implementation is clearly not limited to the manner described: any improvement made using the technical solution of the invention, and any direct application of the concept and technical solution of the invention to other settings without improvement, falls within the scope of protection of the invention.

Claims (5)

1. A virtual identity management system replica strategy based on different data organization forms, characterized by comprising the following steps:
Step one: virtual identity data partitioning: apply the column-wise storage of the Cassandra column database to the virtual identity data and partition the data both horizontally and vertically, the horizontal partition being by eID and the vertical partition being by application;
Step two: data organization of replica 1: all data objects of the same user are stored together; after all the data of one user has been stored, the next user is stored; in the storage order of users, users in the same region are stored together and sorted by registration time;
Step three: data organization of replica 2;
Step four: replica distribution;
Step five: data query: when a client wants to query data, the query statement is analyzed first, and the instruction is then sent to replica 1 or replica 2 according to the analysis result.
2. The virtual identity management system replica strategy based on different data organization forms according to claim 1, characterized in that step three further comprises the following steps:
Step A: partition the data objects by application platform;
Step B: within each application platform, partition by the region where the user is located;
Step C: sort the users of one region within an application platform in the same way as in replica 1;
Step D: if a user is not registered on a platform, skip that user.
3. The virtual identity management system replica strategy based on different data organization forms according to claim 1, characterized in that step four further comprises the following steps:
Step a: store replica 1 and replica 2 separately, replica 1 on the odd-numbered nodes of the cluster and replica 2 on the even-numbered nodes of the cluster;
Step b: odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n;
Step c: set two different hash functions so that the hash values of the data objects are ordered according to the data organization methods of replica 1 and replica 2 respectively, then map the data to the nodes with the consistent hashing algorithm.
4. The virtual identity management system replica strategy based on different data organization forms according to claim 1, characterized in that step five further comprises the following steps:
Step 1: analyze the query statement and determine whether it is a user-centric query;
Step 2: if it is a user-centric query, send the instruction to replica 1 and execute the query;
Step 3: if it is not a user-centric query, analyze the query statement and determine whether it is an application-platform-centric query;
Step 4: if it is an application-platform-centric query, send the instruction to replica 2 and execute the query;
Step 5: if it is not an application-platform-centric query, select a replica according to the current system load and execute the query.
5. The virtual identity management system replica strategy based on different data organization forms according to claim 3, characterized in that, when the consistent hashing algorithm in step c computes the hash value of a data object, the basic unit of a data object is all the data of one user under one application, and an eID account together with an application name forms the primary key of a data object.
CN201510163158.5A 2015-04-08 2015-04-08 A method of the virtual identity management system copy based on different data organizational form Active CN106156125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510163158.5A CN106156125B (en) 2015-04-08 2015-04-08 A method of the virtual identity management system copy based on different data organizational form

Publications (2)

Publication Number Publication Date
CN106156125A true CN106156125A (en) 2016-11-23
CN106156125B CN106156125B (en) 2019-08-23

Family

ID=57336271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510163158.5A Active CN106156125B (en) 2015-04-08 2015-04-08 A method of the virtual identity management system copy based on different data organizational form

Country Status (1)

Country Link
CN (1) CN106156125B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020065919A1 (en) * 2000-11-30 2002-05-30 Taylor Ian Lance Peer-to-peer caching network for user data
WO2004097667A2 (en) * 2003-04-25 2004-11-11 Sap Ag Automated data mining runs
CN102457571A (en) * 2011-09-15 2012-05-16 中标软件有限公司 Method for uniformly distributing data in cloud storage
CN103425756A (en) * 2013-07-31 2013-12-04 西安交通大学 Copy management strategy for data blocks in HDFS
CN103440303A (en) * 2013-08-21 2013-12-11 曙光信息产业股份有限公司 Heterogeneous cloud storage system and data processing method thereof
CN103577407A (en) * 2012-07-19 2014-02-12 国际商业机器公司 Query method and query device for distributed database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
邓璐 等: ""基于eID虚拟身份数据存储的研究"", 《第28次全国计算机安全学术交流会》 *

Also Published As

Publication number Publication date
CN106156125B (en) 2019-08-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant