CN106156125B - A copy method for a virtual identity management system based on different data organization forms

Info

Publication number
CN106156125B
CN106156125B CN201510163158.5A CN201510163158A CN106156125B CN 106156125 B CN106156125 B CN 106156125B CN 201510163158 A CN201510163158 A CN 201510163158A CN 106156125 B CN106156125 B CN 106156125B
Authority
CN
China
Prior art keywords
data
copy
virtual identity
user
inquiry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510163158.5A
Other languages
Chinese (zh)
Other versions
CN106156125A (en)
Inventor
傅翔
朱伟辉
贾焰
韩伟红
李树栋
李爱平
周斌
杨树强
黄九鸣
全拥
邓璐
刘斐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The present invention discloses a copy method for a virtual identity management system based on different data organization forms. It mainly comprises virtual identity data division, the data organization of copy 1, the data organization of copy 2, copy distribution, and data query. The invention improves the replication strategy of the Cassandra database: the number of copies is set to 2, and after Cassandra has partitioned the virtual identity data with the consistent hashing algorithm, the copy of the data is reorganized by repartitioning the virtual identity data with a division method that favors queries, while still obeying the rule that copies of the same data are not placed on the same physical machine. Because the two copies use different data organization forms to serve different kinds of query requests, query time and network overhead are reduced and system efficiency is maximized. The method is suited to the data copy placement problem of virtual identity management systems.

Description

A copy method for a virtual identity management system based on different data organization forms
Technical field
The invention belongs to the field of Internet technology, and in particular relates to a copy method for a virtual identity management system based on different data organization forms.
Background
eID (electronic IDentity), in full the citizen's network electronic identity, is an authoritative electronic credential for remotely proving a person's real identity over the network. When an eID is used remotely, the real identity is verified over the network against the public security population database through the eID service platform. This establishes the authenticity and validity of a personal identity while protecting the citizen's identity privacy, and it is authoritative, secure, traceable, and easy to use. On the Internet, a user has a one-to-many relationship with the virtual identities held under various applications and platforms. In an eID-based network environment, all of these relationships are anchored to the eID as the unique identifier, and virtual identity data refers to all the data that an eID user holds under different applications.
The consistent hashing algorithm in document [2] is a special kind of hashing: when the hash table is resized, on average only K/n data items need to be remapped, where K is the number of data items and n is the number of buffers. By contrast, in most other hash tables a change in the number of buffers forces nearly all data items to be remapped.
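The remapping property described above can be illustrated with a minimal sketch. It is not the implementation of the cited document: the `HashRing` class, the use of MD5, and the node and key names are all assumptions chosen only to make the example deterministic and short.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Map a string to a point on the ring (MD5 is used only for determinism)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring: a key belongs to the first node clockwise."""

    def __init__(self, nodes):
        self._points = sorted((stable_hash(n), n) for n in nodes)

    def add_node(self, node: str) -> None:
        bisect.insort(self._points, (stable_hash(node), node))

    def node_for(self, key: str) -> str:
        hashes = [point for point, _ in self._points]
        idx = bisect.bisect_right(hashes, stable_hash(key)) % len(self._points)
        return self._points[idx][1]


# Hypothetical keys and nodes.
keys = [f"eid-{i}" for i in range(10_000)]
ring = HashRing([f"node-{i}" for i in range(4)])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-4")
after = {k: ring.node_for(k) for k in keys}

# Only keys on the arc taken over by the new node change owner.
moved = [k for k in keys if before[k] != after[k]]
moved_fraction = len(moved) / len(keys)
```

Every key in `moved` now belongs to `node-4`, and all other keys keep their previous owner. With a single ring point per node the moved fraction can deviate noticeably from the K/n average.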
The distributed consistent hashing algorithm in document [3] adds virtual nodes on top of consistent hashing. Its purpose is to spread the hash results as evenly as possible over all buffers so that all of the buffer space is used.
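A rough sketch of the virtual-node idea follows. The function names, the `#vn` suffix used to derive virtual-node labels, and the counts are illustrative assumptions, not details taken from the cited document.

```python
import bisect
import hashlib
from collections import Counter


def stable_hash(value: str) -> int:
    """Deterministic ring position for a string."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes, vnodes_per_node):
    """Return sorted (point, physical_node) pairs, one per virtual node."""
    return sorted(
        (stable_hash(f"{node}#vn{i}"), node)
        for node in nodes
        for i in range(vnodes_per_node)
    )


def load_per_node(nodes, vnodes_per_node, keys):
    """Count how many keys each physical node owns."""
    ring = build_ring(nodes, vnodes_per_node)
    points = [point for point, _ in ring]
    load = Counter({node: 0 for node in nodes})
    for key in keys:
        idx = bisect.bisect_right(points, stable_hash(key)) % len(ring)
        load[ring[idx][1]] += 1
    return load


nodes = [f"node-{i}" for i in range(4)]
keys = [f"eid-{i}" for i in range(20_000)]

plain = load_per_node(nodes, 1, keys)      # one ring point per physical node
virtual = load_per_node(nodes, 200, keys)  # 200 virtual nodes per physical node
```

Comparing `plain` with `virtual` shows how the per-node load changes once each physical node is represented by many ring points; in general, more virtual nodes tend to give a more even spread.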
Document [3] stores virtual identity data in a Cassandra database, improves query efficiency by building external indexes, and backs up data with Cassandra's built-in replication strategy. It makes sensible use of the techniques in documents [1] and [2] and stores massive virtual identity data efficiently. However, because it relies on a large number of external indexes to improve query efficiency, it needs considerable storage space and its algorithms are fairly complex. On the copy problem, it simply adopts Cassandra's built-in replication strategy and treats copies as nothing more than storage redundancy, so the copies are not put to good use.
[1] JiaKui Zhao, PingFei Zhu, LiangHuai Yang. Effective Data Localization Using Consistent Hashing in Cloud Time-Series Databases [J]. Applied Mechanics and Materials, 2013, 347: 2246-2251.
[2] Consistent hashing improvement [EB/OL]. http://blog.163.com/lin_guoqian@126/blog/static/1693687432012151010409/.
[3] Deng Lu. Research and Implementation of Key Technologies for Storage Management of Massive Virtual Identity Data, 2010.
Summary of the invention
In view of the above problems, the present invention follows the approach of reference [3] in choosing a Cassandra database to store the virtual identity data, and provides a copy method for a virtual identity management system based on different data organization forms, suited to the data copy placement problem of virtual identity management systems.
The technical solution is as follows:
A copy method for a virtual identity management system based on different data organization forms mainly comprises the following steps:
(1) Virtual identity data division: the column-oriented storage idea of the Cassandra column database is applied to the virtual identity data, which is divided both horizontally and vertically: horizontally by eID and vertically by application;
(2) Data organization of copy 1: all data objects of the same user are stored together, and only after all data of one user has been stored is the next user stored; in the user storage order, users of the same region are stored together and sorted by registration time;
(3) Data organization of copy 2;
(4) Copy distribution;
(5) Data query: when a client wants to query data, the query statement is analyzed first, and the instruction is then sent to copy 1 or copy 2 according to the analysis result.
Further, step (3) comprises the following steps:
1) the data objects are divided by application platform;
2) within each application platform, they are divided by the region where the user is located;
3) the users of one region within an application platform are sorted in the same way as in data copy 1;
4) if a user is not registered on a platform, that user is skipped.
Further, step (4) comprises the following steps:
1) copy 1 and copy 2 are stored separately: copy 1 on the odd-numbered nodes of the cluster and copy 2 on the even-numbered nodes;
2) odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n;
3) two different hash functions are set so that the computed hash values of the data objects are ordered according to the data organization forms of copy 1 and copy 2 respectively, and the data is then mapped to nodes with the consistent hashing algorithm.
Further, the hash function in step 3) maps one object to another object; the objects can first be sorted, and the hash function is then set according to that sort order.
Further, when the consistent hashing algorithm computes the hash value of a data object in step 3), the basic unit of a data object is all the data of one user under one application, and one eID account together with one application name forms the primary key of a data object.
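The copy-distribution steps above can be sketched roughly as follows. This is a simplification under stated assumptions: `ObjectKey`, `place`, and `distribute` are hypothetical names, each ordered key list is split into contiguous ranges instead of being mapped through a real consistent-hash ring, and the region and registration-time ordering is left out so that only the odd/even separation is shown.

```python
from typing import NamedTuple


class ObjectKey(NamedTuple):
    """Primary key of a data object: one eID account plus one application name."""
    eid: str
    app: str


def place(sorted_keys, node_ids):
    """Split an ordered key list into contiguous ranges, one range per node."""
    placement = {}
    per_node = -(-len(sorted_keys) // len(node_ids))  # ceiling division
    for rank, key in enumerate(sorted_keys):
        placement[key] = node_ids[rank // per_node]
    return placement


def distribute(keys, n):
    """Place copy 1 on odd nodes and copy 2 on even nodes of a 2n-node cluster."""
    odd_nodes = list(range(1, 2 * n, 2))       # 1, 3, ..., 2n-1
    even_nodes = list(range(2, 2 * n + 1, 2))  # 2, 4, ..., 2n
    copy1_order = sorted(keys, key=lambda k: (k.eid, k.app))  # user-major
    copy2_order = sorted(keys, key=lambda k: (k.app, k.eid))  # application-major
    return place(copy1_order, odd_nodes), place(copy2_order, even_nodes)


# Hypothetical data: 4 users, each registered on 3 applications.
keys = [
    ObjectKey(eid, app)
    for eid in ("eid-001", "eid-002", "eid-003", "eid-004")
    for app in ("mail", "shop", "game")
]
copy1, copy2 = distribute(keys, n=2)
```

Because the odd and even node sets are disjoint, the two copies of any data object always land on different nodes, which is the placement rule the method keeps from the traditional replication strategy.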
Further, step (5) comprises the following steps:
1) analyze the query statement and determine whether it is a user-based query;
2) if it is a user-based query, send the instruction to copy 1 and execute the query;
3) if it is not a user-based query, analyze the query statement further and determine whether it is an application-platform-based query;
4) if it is an application-platform-based query, send the instruction to copy 2 and execute the query;
5) if it is not an application-platform-based query, select a copy according to the current system load and execute the query.
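The routing decision in the five steps above might look like the following sketch. The `Query` class and the two load figures are hypothetical stand-ins for the result of query analysis and for whatever load metric the system exposes; the text does not specify either.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    """Hypothetical parsed query: which dimension, if any, it is restricted to."""
    eid: Optional[str] = None  # set for user-based queries
    app: Optional[str] = None  # set for application-platform-based queries


def choose_copy(query: Query, load_copy1: float, load_copy2: float) -> int:
    """Return 1 or 2, the copy that should serve the query."""
    if query.eid is not None:   # steps 1-2: user-based query -> copy 1
        return 1
    if query.app is not None:   # steps 3-4: platform-based query -> copy 2
        return 2
    # Step 5: neither kind, so pick the less loaded copy (assumed metric).
    return 1 if load_copy1 <= load_copy2 else 2


by_user = choose_copy(Query(eid="eid-001"), load_copy1=0.9, load_copy2=0.1)
by_app = choose_copy(Query(app="shop"), load_copy1=0.1, load_copy2=0.9)
by_load = choose_copy(Query(), load_copy1=0.7, load_copy2=0.2)
```

The user-based check comes first, as in the listed steps, so a query restricted to both a user and a platform is sent to copy 1 in this sketch.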
The beneficial effects of the invention are as follows. Traditional replication strategies require that copies of the same data not be placed on the same physical machine, but place no requirement on how copies of different data are placed relative to one another. The invention improves the replication strategy of the Cassandra database: the number of copies is set to 2, and after Cassandra has partitioned the virtual identity data with the consistent hashing algorithm, the copy of the data is reorganized by repartitioning the virtual identity data with a division method that favors queries, while still obeying the rule that copies of the same data are not placed on the same physical machine. The organization form of data copy 1 makes it easy to manage eID user data region by region. When data needs to be read, the copy to operate on is selected according to the characteristics of the requested data, which improves data access efficiency. Data copies are thus no longer merely redundant storage kept for disaster recovery; by using different data organization forms, the different copies serve different kinds of query requests, which reduces query time and network overhead and maximizes system efficiency. The method is suited to the data copy placement problem of virtual identity management systems.
Brief description of the drawings
Fig. 1 is the data query flow chart of the invention.
Fig. 2 is the copy distribution diagram of the invention.
Fig. 3 is the virtual identity information diagram in Embodiment 1 of the invention.
Fig. 4 is the organization form diagram of copy 1 in Embodiment 1 of the invention.
Fig. 5 is the virtual identity data distribution diagram in Embodiment 1 of the invention.
Fig. 6 is the organization form diagram of copy 2 in Embodiment 1 of the invention.
Detailed description of the embodiments
To facilitate understanding, the invention is further described below with reference to the drawings and embodiments.
The invention provides a copy method for a virtual identity management system based on different data organization forms, mainly comprising the following steps:
(1) Virtual identity data division: the data model of the Cassandra column database is applied to the virtual identity data, which is divided both horizontally and vertically: horizontally by eID and vertically by application;
(2) Data organization of copy 1: all data objects of the same user are stored together, and only after all data of one user has been stored is the next user stored; in the user storage order, users of the same region are stored together and sorted in a fixed order;
(3) Data organization of copy 2;
(4) Copy distribution;
(5) Data query: when a client wants to query data, the query statement is analyzed first, and the instruction is then sent to copy 1 or copy 2 according to the analysis result.
Development environment of the invention: an x86 platform running the (SuSE) Linux operating system with JDK 1.7; the code is written in Java. The data server needs Cassandra 1.0 or a later version installed, which provides data support for the system.
Runtime environment of the invention: the server side runs on multiple machine nodes, each an x86 platform with the (SuSE) Linux operating system and JDK 1.7 or later; the client is an ordinary personal computer.
The following is an exemplary embodiment of the invention:
Embodiment 1:
(1) Virtual identity data division: in the network domain, users register accounts on different application platforms according to their own needs; these platforms include e-commerce, social networks, online games, and so on. The virtual identity management system unifies this information through the eID: each user has a unique eID and, under it, different virtual accounts on different application platforms. These data differ partly in structure and in size, and the data volume is huge. The data model of the Cassandra column database is applied to the virtual identity data, as shown in Table 1:
Table 1. Virtual identity data model
Based on the storage model above, the invention divides the virtual identity data horizontally and vertically: horizontally by eID and vertically by application. When the consistent hashing algorithm computes the hash value of a data object, the data object unit is all the data of one user under one application, and one eID account together with one application name forms the "primary key" of a data object. The invention is not concerned with how data is organized inside a data object; it mainly addresses how data objects are organized across the different copies.
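As an illustration of the division just described, the sketch below groups raw records into data objects keyed by (eID, application). The record layout, attribute names, and values are invented for the example; the contents of Table 1 are not reproduced in this text.

```python
from collections import defaultdict

# Hypothetical raw records: (eID, application, attribute, value).
records = [
    ("eid-001", "shop", "nickname", "zhangsan88"),
    ("eid-001", "shop", "level", "gold"),
    ("eid-001", "mail", "address", "zs@example.com"),
    ("eid-002", "shop", "nickname", "lisi"),
]


def divide(records):
    """Group records into data objects keyed by (eID, application)."""
    objects = defaultdict(dict)
    for eid, app, attribute, value in records:
        objects[(eid, app)][attribute] = value
    return dict(objects)


objects = divide(records)

# Horizontal division: everything belonging to one eID.
row_for_user = {k: v for k, v in objects.items() if k[0] == "eid-001"}
# Vertical division: everything belonging to one application.
column_for_app = {k: v for k, v in objects.items() if k[1] == "shop"}
```

Each dictionary entry corresponds to one data object; the method treats the inside of an entry as opaque and only decides where whole objects are placed.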
(2) Data organization form of copy 1: in actual data request operations, a request is often for the virtual identity information of one user under all application platforms. For example, if the eID data of user Zhang San shows anomalies and may have been stolen, all of Zhang San's virtual identity data must be checked. Suppose Zhang San has registered on 8 application platforms in total, including social networks, e-commerce websites, and online games; all of his virtual identity information in the network domain is then as shown in Fig. 3, where each rectangle represents one data object.
If these data objects were distributed over the data nodes at random, the query above might span many different nodes, which would slow the query down and consume network bandwidth. In the invention, therefore, data copy 1 stores all data objects of the same user together: only after all data of one user has been stored is the next user stored, and so on. In the user storage order, users of the same region are stored together, and the users within one region are sorted in a fixed order. This makes it easy to manage eID user data region by region. The resulting organization form of copy 1 is shown in Fig. 4.
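One way to express the copy 1 layout is as a sort key, sketched below. The `DataObject` fields and sample values are hypothetical, and registration date is used as the "fixed order" within a region, which is one possible choice consistent with the summary of the method.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DataObject:
    eid: str
    region: str
    registered: date  # registration date of the eID user (assumed ordering field)
    app: str


def copy1_sort_key(obj: DataObject):
    # Region first, then users in a fixed order, then that user's applications.
    return (obj.region, obj.registered, obj.eid, obj.app)


objects = [
    DataObject("eid-002", "Beijing", date(2014, 5, 1), "shop"),
    DataObject("eid-001", "Changsha", date(2013, 1, 9), "mail"),
    DataObject("eid-003", "Beijing", date(2012, 7, 3), "game"),
    DataObject("eid-002", "Beijing", date(2014, 5, 1), "game"),
    DataObject("eid-001", "Changsha", date(2013, 1, 9), "shop"),
]

copy1_layout = sorted(objects, key=copy1_sort_key)
layout_keys = [(o.eid, o.app) for o in copy1_layout]
```

In the resulting order, all objects of one user are contiguous and users of the same region sit next to each other, so a query for everything one user owns touches a single contiguous range.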
(3) Organization form of copy 2: in actual data request operations, the data of one application platform is often operated on, for example to examine Taobao's users in each region. This requires all virtual user data of the Taobao application; the distribution of Taobao's virtual identity data is shown in Fig. 5.
If all data objects were scattered over the data nodes by consistent hashing, or laid out in the manner of copy 1, this operation would again span many nodes, consume bandwidth, and lower the efficiency of the cluster, and data copy 2 would serve only as redundancy for copy 1 without being used to full effect. For these reasons, the invention organizes data copy 2 as follows: 1. the data objects are first divided by application platform; 2. within each application platform, they are divided by the region where the user is located; 3. the users of one region within an application platform are sorted in the same way as in data copy 1; 4. if a user is not registered on a platform, that user is skipped, as shown in Fig. 6.
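The four rules for copy 2 can be sketched as a nested traversal. The `User` tuple, the `accounts` mapping, and the sample data are assumptions made for illustration, and registration date again stands in for the fixed user order within a region.

```python
from datetime import date
from typing import NamedTuple


class User(NamedTuple):
    eid: str
    region: str
    registered: date


users = [
    User("eid-001", "Changsha", date(2013, 1, 9)),
    User("eid-002", "Beijing", date(2014, 5, 1)),
    User("eid-003", "Beijing", date(2012, 7, 3)),
]
# Hypothetical registrations: which applications each eID has an account on.
accounts = {
    "eid-001": {"mail", "shop"},
    "eid-002": {"shop", "game"},
    "eid-003": {"game"},
}
apps = ["game", "mail", "shop"]


def copy2_layout(users, accounts, apps):
    """Order data objects application-major, then by region, then by user."""
    ordered_users = sorted(users, key=lambda u: (u.region, u.registered, u.eid))
    layout = []
    for app in sorted(apps):                  # rule 1: by application platform
        for user in ordered_users:            # rules 2-3: by region, then copy 1 order
            if app not in accounts.get(user.eid, set()):
                continue                      # rule 4: unregistered user is skipped
            layout.append((app, user.region, user.eid))
    return layout


layout = copy2_layout(users, accounts, apps)
```

All objects of one platform end up contiguous, so a per-platform query reads one range of copy 2 instead of scanning every user in copy 1.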
(4) Copy distribution: the invention stores copy 1 and copy 2 separately, that is, copy 1 is stored on one half of the nodes of a cluster and copy 2 on the other half, as shown in Fig. 2. Odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n. Suitable hash functions are chosen so that the hash values of the data objects are ordered according to the organization forms of copy 1 and copy 2, and the data is then mapped to nodes with the consistent hashing algorithm. At this point, all data copies have been distributed in the cluster by the specified method.
The hash function maps one object to another object; the objects can first be sorted, and the hash function is then set according to that sort order.
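One possible reading of "sort first, then set the hash function according to the sort order" is a rank-based, order-preserving mapping, sketched below. The ring size, the `rank_tokens` helper, and the sample keys are assumptions; the text does not specify how the hash functions are actually constructed.

```python
RING_SIZE = 2 ** 32  # assumed size of the token space


def rank_tokens(keys, sort_key):
    """Give each key a ring token proportional to its rank in the chosen order."""
    ordered = sorted(keys, key=sort_key)
    step = RING_SIZE // max(len(ordered), 1)
    return {key: rank * step for rank, key in enumerate(ordered)}


# Hypothetical primary keys: (eID, application).
keys = [
    ("eid-002", "shop"),
    ("eid-001", "shop"),
    ("eid-001", "mail"),
    ("eid-002", "game"),
]

# Two different "hash functions", one per copy.
copy1_tokens = rank_tokens(keys, sort_key=lambda k: (k[0], k[1]))  # user-major
copy2_tokens = rank_tokens(keys, sort_key=lambda k: (k[1], k[0]))  # application-major
```

Since token order equals sort order, objects that are neighbors in a copy's organization form receive neighboring tokens, so a ring-based mapping of tokens to nodes keeps them on the same or adjacent nodes.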
(5) Data query: when a client wants to query data, the query statement is analyzed first, and the instruction is then sent to copy 1 or copy 2 according to the analysis result; the flow is shown in Fig. 1.
Traditional replication strategies require that copies of the same data not be placed on the same physical machine, but place no requirement on how copies of different data are placed relative to one another. The invention improves the replication strategy of the Cassandra database: the number of copies is set to 2, and after Cassandra has partitioned the virtual identity data with the consistent hashing algorithm, the copy of the data is reorganized by repartitioning the virtual identity data with a division method that favors queries, while still obeying the rule that copies of the same data are not placed on the same physical machine. The organization form of data copy 1 makes it easy to manage eID user data region by region. When data needs to be read, the copy to operate on is selected according to the characteristics of the requested data, which improves data access efficiency. Data copies are thus no longer merely redundant storage kept for disaster recovery; by using different data organization forms, the different copies serve different kinds of query requests, which reduces query time and network overhead and maximizes system efficiency. The method is suited to the data copy placement problem of virtual identity management systems.
The above is an exemplary description of the invention. Clearly, implementation of the invention is not limited to the manner described above: any improvement made using the technical solution of the invention, and any direct application of its concept and technical solution to other settings without improvement, falls within the scope of protection of the invention.

Claims (4)

1. A copy method for a virtual identity management system based on different data organization forms, characterized by comprising the following steps:
Step 1: virtual identity data division: the column-oriented storage of the Cassandra column database is applied to the virtual identity data, which is divided horizontally and vertically: horizontally by eID and vertically by application;
Step 2: data organization of copy 1: all data objects of the same user are stored together, and only after all data of one user has been stored is the next user stored; in the user storage order, users of the same region are stored together and sorted by registration time;
Step 3: data organization of copy 2;
Step 4: copy distribution;
Step 5: data query: when a client wants to query data, the query statement is analyzed first, and the instruction is then sent to copy 1 or copy 2 according to the analysis result;
wherein Step 3 further comprises the following steps:
Step A: the data objects are divided by application platform;
Step B: within each application platform, they are divided by the region where the user is located;
Step C: the users of one region within an application platform are sorted in the same way as in data copy 1;
Step D: if a user is not registered on a platform, that user is skipped.
2. The copy method for a virtual identity management system based on different data organization forms according to claim 1, characterized in that Step 4 further comprises the following steps:
Step a: copy 1 and copy 2 are stored separately: copy 1 on the odd-numbered nodes of the cluster and copy 2 on the even-numbered nodes;
Step b: odd-numbered nodes are adjacent to even-numbered nodes, and node 1 is adjacent to node 2n;
Step c: two different hash functions are set so that the hash values of the data objects are ordered according to the data organization forms of copy 1 and copy 2 respectively, and the data is then mapped to nodes with the consistent hashing algorithm.
3. The copy method for a virtual identity management system based on different data organization forms according to claim 2, characterized in that, when the consistent hashing algorithm computes the hash value of a data object in Step c, the basic unit of a data object is all the data of one user under one application, and one eID account together with one application name forms the primary key of a data object.
4. The copy method for a virtual identity management system based on different data organization forms according to claim 1, characterized in that Step 5 further comprises the following steps:
Step 1: analyze the query statement and determine whether it is a user-based query;
Step 2: if it is a user-based query, send the instruction to copy 1 and execute the query;
Step 3: if it is not a user-based query, analyze the query statement further and determine whether it is an application-platform-based query;
Step 4: if it is an application-platform-based query, send the instruction to copy 2 and execute the query;
Step 5: if it is not an application-platform-based query, select a copy according to the current system load and execute the query.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510163158.5A CN106156125B (en) 2015-04-08 2015-04-08 A method of the virtual identity management system copy based on different data organizational form

Publications (2)

Publication Number Publication Date
CN106156125A (en) 2016-11-23
CN106156125B (en) 2019-08-23

Family

ID=57336271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510163158.5A Active CN106156125B (en) 2015-04-08 2015-04-08 A method of the virtual identity management system copy based on different data organizational form

Country Status (1)

Country Link
CN (1) CN106156125B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004097667A2 (en) * 2003-04-25 2004-11-11 Sap Ag Automated data mining runs
CN102457571A (en) * 2011-09-15 2012-05-16 中标软件有限公司 Method for uniformly distributing data in cloud storage
CN103425756A (en) * 2013-07-31 2013-12-04 西安交通大学 Copy management strategy for data blocks in HDFS
CN103440303A (en) * 2013-08-21 2013-12-11 曙光信息产业股份有限公司 Heterogeneous cloud storage system and data processing method thereof
CN103577407A (en) * 2012-07-19 2014-02-12 国际商业机器公司 Query method and query device for distributed database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020065919A1 (en) * 2000-11-30 2002-05-30 Taylor Ian Lance Peer-to-peer caching network for user data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于eID虚拟身份数据存储的研究";邓璐 等;《第28次全国计算机安全学术交流会》;20131031;第2章

Also Published As

Publication number Publication date
CN106156125A (en) 2016-11-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant