CN109902132A - Method and system for establishing a relational model for multidimensional intellectual property data - Google Patents
- Publication number
- CN109902132A, CN201910143405.3A
- Authority
- CN
- China
- Prior art keywords
- data
- center table
- center
- space
- dimension
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a method and system for establishing a relational model for multidimensional intellectual property data, in the technical field of data mining. It addresses two problems: specialized data warehouse software is generally complex, places high demands on hardware, and needs specialized personnel to maintain; and when a star schema or galaxy schema is used, the center table can become very large, so query and access efficiency is low. The method comprises the following steps: P101, establish the theme and dimensions; P102, build a dynamic hash ring; P103, adjust the hash ring; P104, locate data; P105, extract the data. By building the center table in a new way, so that multiple center tables of the same theme work together, the invention reduces the data volume of each single center table and improves center table access efficiency.
Description
Technical field
The present invention relates to the technical field of data mining, and more particularly to a method and system for establishing a relational model for multidimensional intellectual property data.
Background
In the data warehouse field, data is generally modeled in multidimensional form.
Referring to Fig. 1, take trademark sales data in the intellectual property field as an example. One may want to know both which international classification (the classification dimension) has the highest registration count and which month (the time dimension) has the highest registration count. During modeling, a theme is usually confirmed first; the theme is the content to be counted, for example trademark sales quantity. After the theme is established, the measures of the data, that is, the statistical dimensions, are established as well, for example three dimensions: international classification, month, and region. A relationship of this complexity is usually expressed with a data cube.
In a data warehouse the cube actually has n dimensions; only a three-dimensional cube is shown here for convenience. Each coordinate axis of the cube represents one statistical dimension. Common professional data warehouse software can represent this cube structure well.
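As a minimal sketch of the data cube idea, the snippet below keys each cell by one value per dimension and rolls the cube up along a single axis. The records, class names, and the `roll_up` helper are hypothetical illustrations and do not come from the patent.

```python
from collections import Counter

# Hypothetical trademark records: (international class, month, region).
records = [
    ("Class 9", "2018-01", "Shanghai"),
    ("Class 9", "2018-02", "Beijing"),
    ("Class 25", "2018-01", "Shanghai"),
    ("Class 9", "2018-01", "Beijing"),
]

# Each cell of the cube is keyed by one value per dimension.
cube = Counter(records)

def roll_up(cube, axis):
    """Sum the cube over every dimension except the one at position `axis`."""
    totals = Counter()
    for cell, count in cube.items():
        totals[cell[axis]] += count
    return totals

by_class = roll_up(cube, 0)   # counts per international class
by_month = roll_up(cube, 1)   # counts per month
top_class = by_class.most_common(1)[0][0]
top_month = by_month.most_common(1)[0][0]
```

Rolling up along axis 0 answers the "which classification" question and axis 1 answers the "which month" question from the same structure.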
Relational databases are a widely used technology within enterprises and are highly general, so building a data warehouse on a relational database has become a low-cost, general approach.
Traditional construction approaches include the star schema and the galaxy schema.
Referring to Fig. 2, a star schema mainly includes:
1) one center table, mainly used to store the keys of all data dimensions under a given theme;
2) multiple dimension tables, each mainly used to store the specific content of a single dimension.
To compute statistics over some dimension, the specific set of keys is first found and extracted from the dimension table, and the center table is then queried with those keys to extract the specific data to be counted.
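The two-step lookup can be sketched as follows. This is only an illustration with in-memory structures; the table contents, the column names (`time_key`, `class_key`, `region_key`, `number`), and the function `total_for_year` are assumptions made for the example.

```python
# Dimension table: key -> content of a single dimension (here, time).
time_dim = {1: "2018-01-01", 2: "2018-01-02", 3: "2017-12-31"}

# Center table: one row per fact, holding dimension keys plus the measure.
center_table = [
    {"time_key": 1, "class_key": 9, "region_key": 2, "number": 5},
    {"time_key": 2, "class_key": 25, "region_key": 1, "number": 3},
    {"time_key": 3, "class_key": 9, "region_key": 1, "number": 7},
]

def total_for_year(year):
    # Step 1: collect the matching keys from the dimension table.
    keys = {k for k, date in time_dim.items() if date.startswith(year)}
    # Step 2: scan the center table for rows carrying those keys.
    return sum(row["number"] for row in center_table if row["time_key"] in keys)

total_2018 = total_for_year("2018")
```

Step 2 scans the whole center table, which is why a single very large center table becomes the bottleneck discussed in this section.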
Referring to Fig. 3, a galaxy schema is a combination of multiple star schemas. It is mainly used when there are multiple themes, and dimension tables can be shared between different themes. For example, besides the trademark registration count theme, a second statistical theme, trademark sales volume, can be created with the same three statistical dimensions.
The prior art described above has the following deficiencies: specialized data warehouse software is generally complex, places high demands on hardware (a minicomputer is usually required), and needs specialized personnel to maintain it. Moreover, when a star schema or galaxy schema is used, the center table can become very large, so query and access efficiency is low. There is room for improvement.
Summary of the invention
The first object of the present invention is to provide a method for establishing a relational model for multidimensional intellectual property data. By building the center table in a new way, so that multiple center tables of the same theme work together, it reduces the data volume of each single center table and improves center table access efficiency.
This object is achieved through the following technical solution:
A method for establishing a relational model for multidimensional intellectual property data, including the following steps:
P101, establish the theme and dimensions;
P102, build a dynamic hash ring;
P103, adjust the hash ring;
P104, locate data;
P105, extract the data.
With this technical solution, establishing the theme and dimensions makes the subject of the query explicit. The dynamic hash ring is a kind of consistent hash ring: it arranges the data on a closed ring that wraps around. The hash ring is then adjusted, and data is later obtained by locating and extracting it. Multiple center tables of the same theme are built to work together, which reduces the data volume of each single center table and improves center table access efficiency.
The present invention is further arranged as follows. Step P101 includes the following steps:
P1011, determine the central theme to be counted;
P1012, determine the data dimensions: according to user needs, the data is examined from A different directions, giving A data dimensions, denoted D1, D2, D3, ..., DA; each dimension in turn contains specific dimension data.
With this technical solution, confirming the central theme to be counted fixes the overall direction of the statistics, while confirming the data dimensions matches the additional conditions to be applied and partitions the data by dimension for the user to choose from.
The present invention is further arranged as follows. Step P102 includes the following steps:
P1021, first choose the size of a hash space, denoted G, and choose the number of nodes, denoted N;
P1022, define the number of center tables to be the node count N from step P1021, and set up a vector V of length N, each component of which is a random natural number in the space G;
P1023, number the center tables with the component values of vector V; the single center table has now been split into N tables;
P1024, concatenate the data of a dimension table record, column by column, into one character string, then apply a numeric hash function to generate a unique number, denoted Num, and feed it into the following formula:
O = Num mod G;
where mod denotes the modulo operation and O is the output value; the record is thereby mapped into the space G;
P1025, the components of V divide G into half-open intervals that are closed on the left and open on the right; a value falling in such an interval is stored in the center table at the interval's left end. Repeating the same operation for the data in each dimension table D1, D2, D3, ..., DA distributes every record into one of the center tables.
With this technical solution, choosing a hash space together with a node count divides the hash ring into segments, and splitting the center table accordingly reduces the data volume in each center table, which improves query efficiency.
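The ring construction in steps P1021 to P1023 can be sketched as below. This is a sketch under stated assumptions: the function `build_ring`, the `seed` parameter, and the use of Python lists as stand-ins for center tables are all hypothetical; the space size 2^32 - 1 and N = 5 are the values used later in the embodiment.

```python
import random

G = 2**32 - 1   # hash space size used in the embodiment
N = 5           # number of center tables (nodes)

def build_ring(n, space, seed=None):
    """Return a sorted vector V of n distinct natural numbers below `space`."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no two tables share a node.
    return sorted(rng.sample(range(space), n))

V = build_ring(N, G, seed=42)
tables = {f"TC-{v}": [] for v in V}   # one empty center table per component
```

Drawing without replacement is a design choice of this sketch; the text only requires random natural numbers, but duplicate components would produce two tables with the same number.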
The present invention is further arranged as follows. Step P103 includes the following adjustment step:
P1031, add a new center table: take an arbitrary value in the space G for the new center table and insert it in order into the original vector V, producing a new vector V'; the data in the affected part of the original interval is then moved to the new node, that is, the preceding center table is split into two center tables;
step P1031 is run when the storage used by an original center table exceeds the highest storage space.
With this technical solution, the storage used by each center table is monitored and center tables are added automatically, which avoids the loss of efficiency caused by a center table holding too much data.
The present invention is further arranged as follows. Step P103 further includes the following adjustment step:
P1032, remove a center table: merge the contents of two adjacent original center tables, then remove the two original tables and generate a new center table from the merged contents;
step P1032 is run when the storage used by both adjacent original center tables is below the lowest storage space.
With this technical solution, removing center tables frees the space they occupied; consolidating tables that hold little data improves overall resource utilization.
The present invention is further arranged as follows. Step P104 includes the following steps:
P1041, to find which center table a piece of data is located in, repeat the calculation of step P102 to obtain the value to which the data maps in the space G, denoted N (in this step N is the mapped value, not the node count);
P1042, find the corresponding numeric interval, which satisfies the following inequality:
left node i ≤ N < right node j, (i, j ∈ G); the left node i of the interval is the node where the data is stored.
With this technical solution, data is looked up among the center tables with the same calculation that was used when it was stored, so lookups stay consistent with placement, which improves the stability of the overall operation.
The present invention is further arranged as follows. Step P105 includes the following steps:
P1051, first find the corresponding data record set C in the dimension table;
P1052, run the locating method of step P104 on the records in set C to find which center table each piece of data is located in;
P1053, after the specific center table is found, match the specific records in that center table according to the data dimension to be extracted, and obtain the data of each record;
P1054, accumulate the data of all records to obtain the result for the data dimension to be extracted.
With this technical solution, the corresponding records are obtained from the dimension table, located, and their data retrieved, so that the data for the requested dimension is matched and extracted.
The second object of the present invention is to provide a system for establishing a relational model for multidimensional intellectual property data. By building the center table in a new way, so that multiple center tables of the same theme work together, it reduces the data volume of each single center table and improves center table access efficiency.
This object is achieved through the following technical solution:
A system for establishing a relational model for multidimensional intellectual property data, comprising:
a main control module, used for data storage and data processing;
a dimension establishing module, connected to the main control module and used to establish the theme of the data and the statistical dimensions;
a dynamic hash ring building module, connected to the main control module and used to split the center table into multiple center tables and to coordinate them so that they work together;
a hash ring adjustment module, connected to the main control module and used to adjust the number of center tables in the hash ring;
a data locating module, connected to the main control module and used to find the center table to which data belongs and to output the numeric string;
a data extraction module, connected to the main control module and used to extract the required data;
a display module, connected to the main control module and used to receive and display the extracted data.
With this technical solution, the dimension establishing module confirms the central theme to be counted, which fixes the overall direction of the statistics. The dynamic hash ring building module and the hash ring adjustment module arrange the data on a closed ring that wraps around and then adjust that ring. The data locating module and the data extraction module locate and extract the data, and the display module finally displays it. Multiple center tables of the same theme are built to work together, which reduces the data volume of each single center table and improves center table access efficiency.
The present invention is further arranged as follows. The dynamic hash ring building module includes:
a node processing unit, which chooses the size of a hash space, denoted G, and chooses the number of nodes, denoted N;
a cutting unit, which defines the number of center tables to be the node count N from the node processing unit and sets up a vector V of length N, each component of which is a random natural number in the space G;
a splitting unit, which numbers the center tables with the component values of vector V, whereby the single center table is split into N tables;
a remainder unit, which concatenates the data of a dimension table record, column by column, into one character string, then applies a numeric hash function to generate a unique number, denoted Num, and feeds it into the formula O = Num mod G, where mod denotes the modulo operation and O is the output value, whereby the record is mapped into the space G;
an interval confirmation unit, which stores a value falling in a half-open interval (closed on the left, open on the right) in the center table at the interval's left end; repeating the same operation for the data in each dimension table D1, D2, D3, ..., DA distributes every record into one of the center tables.
With this technical solution, the cutting unit and the splitting unit partition the data, and the remainder unit together with the interval confirmation unit determines where each record belongs, so that the corresponding data can be obtained.
The present invention is further arranged as follows. The hash ring adjustment module includes:
a center table space detection unit, used to detect the data size of the current center table and output a space detection signal;
an adding unit, used to increase the number of center tables;
a removing unit, used to reduce the number of center tables;
a highest space reference signal and a lowest space reference signal are preset in the main control module, and the highest space reference signal is greater than the lowest space reference signal;
when the space detection signal is greater than the highest space reference signal, the adding unit splits the current center table into two center tables to increase the number of center tables; otherwise no table is added;
when the space detection signals of two adjacent center tables are both less than the lowest space reference signal, the removing unit merges the two adjacent center tables into one center table to reduce the number of center tables; otherwise no table is removed.
With this technical solution, the adding unit and the removing unit split and merge center tables, keeping storage organized, which improves overall operating efficiency while saving overall space.
In summary, the beneficial technical effect of the invention is: by building the center table in a new way, so that multiple center tables of the same theme work together, the data volume of each single center table is reduced and center table access efficiency is improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the data cube in the background art.
Fig. 2 is a diagram of the star schema in the background art.
Fig. 3 is a diagram of the galaxy schema in the background art.
Fig. 4 is a schematic diagram of the method for establishing a relational model for multidimensional intellectual property data.
Fig. 5 is a schematic diagram of the data content.
Fig. 6 is a schematic diagram of the system for establishing a relational model for multidimensional intellectual property data.
In the figures: 1, main control module; 2, dimension establishing module; 3, dynamic hash ring building module; 4, hash ring adjustment module; 5, data locating module; 6, data extraction module; 7, display module; 8, node processing unit; 9, cutting unit; 10, splitting unit; 11, remainder unit; 12, interval confirmation unit; 13, center table space detection unit; 14, adding unit; 15, removing unit.
Detailed description of the embodiments
The invention is described in further detail below with reference to the accompanying drawings.
Referring to Fig. 4, the method for establishing a relational model for multidimensional intellectual property data disclosed by the invention includes the following steps:
P101, establish the theme and dimensions;
P102, build a dynamic hash ring;
P103, adjust the hash ring;
P104, locate data;
P105, extract the data.
Referring to Fig. 5, step P101 includes the following steps:
P1011, determine the central theme to be counted;
P1012, determine the data dimensions: according to user needs, the data is examined from A different directions, giving A data dimensions, denoted D1, D2, D3, ..., DA; each dimension in turn contains specific dimension data.
When determining the central theme, assume the theme is the quantity of trademark sales. When determining the data dimensions, the business scenario calls for counting trademark sales by time, international classification, and region, so there are three data dimensions, denoted time D1, international classification D2, and region D3. Each dimension in turn contains specific dimension data; for example, D1 contains 2012-02-13, 2013-03-15, and so on.
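One way to hold the result of step P101 in code is a small pair of record types. The class names `Theme` and `Dimension` and their fields are hypothetical; only the theme, the three dimensions D1 to D3, and the two sample dates come from the text.

```python
from dataclasses import dataclass, field

@dataclass
class Dimension:
    name: str                                   # e.g. "D1"
    label: str                                  # what the dimension measures
    values: list = field(default_factory=list)  # specific dimension data

@dataclass
class Theme:
    name: str
    dimensions: list

theme = Theme(
    name="trademark sales quantity",
    dimensions=[
        Dimension("D1", "time", ["2012-02-13", "2013-03-15"]),
        Dimension("D2", "international classification"),
        Dimension("D3", "region"),
    ],
)
A = len(theme.dimensions)   # number of data dimensions
```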
A hash ring is a closed ring on which data positions wrap around. The dynamic hash ring is a data locating technique, used here to split the center table into multiple center tables that work together.
Step P102 includes the following steps, which build the dynamic hash ring:
P1021, first choose the size of a hash space, denoted G, and choose the number of nodes, denoted N.
In this embodiment, the space size is chosen as 2^32 - 1.
P1022, define the number of center tables to be the node count N from step P1021, and set up a vector V of length N, each component of which is a random natural number in the space G.
For example, when N is 5, vector V may take the values 40, 1010, 29392, 30000, 94039220. The values of V are chosen at random and then sorted in descending or ascending order; they must be natural numbers, so they are integers, and together they form the ring.
P1023, number the center tables with the component values of vector V; the single center table has now been split into N tables.
In this embodiment, the center tables are numbered with the component values of vector V as TC-40, TC-1010, TC-29392, TC-30000, TC-94039220; the single center table has thus been split into 5.
P1024, concatenate the data of a dimension table record, column by column, into one character string, then apply a numeric hash function to generate a unique number, denoted Num, and feed it into the following formula:
O = Num mod G;
where mod denotes the modulo operation and O is the output value; the record is thereby mapped into the space G.
In this embodiment, the numeric hash function is a function y = F(x) that maps an input parameter x to an output number y; the CRC32 function is preferred. It generates a number such as 39029311, in which case the formula is O = 39029311 mod (2^32 - 1).
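A minimal sketch of step P1024, assuming Python's standard `zlib.crc32` as the CRC32 implementation. The function name `map_record`, the sample column values, and the choice of UTF-8 encoding with no separator between columns are assumptions for illustration. Note also that CRC32 produces a 32-bit value, so distinct records can in principle collide.

```python
import zlib

G = 2**32 - 1   # hash space size from the embodiment

def map_record(columns, space=G):
    """Concatenate the columns, hash with CRC32, and reduce modulo `space`."""
    text = "".join(str(c) for c in columns)
    num = zlib.crc32(text.encode("utf-8"))   # Num: unsigned 32-bit integer
    return num % space                        # O = Num mod G

o = map_record(["2018-01-01", "Class 9", "Shanghai"])

# The number quoted in the text is smaller than G, so it maps to itself.
example_o = 39029311 % G
```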
P1025, the components of V divide G into half-open intervals that are closed on the left and open on the right; a value falling in such an interval is stored in the center table at the interval's left end. Repeating the same operation for the data in each dimension table D1, D2, D3, ..., DA distributes every record into one of the center tables.
In this embodiment, the remainder obtained in P1024 is placed into the intervals formed in step P1022 and the table at the start of its interval is found: a value falling in a half-open interval (closed on the left, open on the right) is stored in the center table TC-i at the left end. For example, 39029311 from step P1024 lies in the interval [TC-30000, TC-94039220), so the data is stored in center table TC-30000. Repeating the same operation for the data in each dimension table D1, D2, D3 distributes every record into one of the center tables TC-i. When locating data, the same operation is carried out to find which center table a given dimension table record is in.
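The interval rule can be sketched with a binary search over the sorted node values. The function `locate` is a hypothetical name. The text does not say what happens to a value below the smallest node or at or above the largest one; this sketch assumes ring behavior, so such values go to the last table.

```python
from bisect import bisect_right

V = [40, 1010, 29392, 30000, 94039220]   # node values from the embodiment

def locate(o, ring=V):
    """Return the center table whose interval [left, right) contains o."""
    i = bisect_right(ring, o) - 1
    # i == -1 means o is below the smallest node; ring[-1] then selects the
    # last node, which is the wrap-around behavior assumed in this sketch.
    return f"TC-{ring[i]}"

table = locate(39029311)
```

Because the left endpoint is closed, a value exactly equal to a node value is stored in that node's own table.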
Step P103 includes the following adjustment steps. When the number of center tables needs to change, mainly for performance reasons, the hash ring has to be adjusted. The rules for the adjustment are as follows:
P1031, add a new center table: take an arbitrary value in the space G for the new center table and insert it in order into the original vector V, producing a new vector V'; the data in the affected part of the original interval is then moved to the new node, that is, the preceding center table is split into two center tables;
step P1031 is run when the storage used by an original center table exceeds the highest storage space.
P1032, remove a center table: merge the contents of two adjacent original center tables, then remove the two original tables and generate a new center table from the merged contents;
step P1032 is run when the storage used by both adjacent original center tables is below the lowest storage space.
Step P1031 is the rule for adding a center table, and step P1032 is the rule for removing one. In this embodiment the highest storage space is 10,000,000 records and the lowest storage space is 1,000,000 records.
In step P1031, a new center table TC-i is added. An arbitrary value in the space 2^32 - 1 is taken for TC-i, for example TC-210202. The value is inserted in order into the original vector V, producing the new vector V' (40, 1010, 29392, 30000, 210202, 94039220). The data in the interval [TC-210202, TC-94039220), which was previously located in center table TC-30000, is then moved to node TC-210202; that is, the preceding center table TC-(i-1) is split.
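The split can be sketched as follows. For brevity each stored row is represented only by its mapped value O, which is an assumption of this sketch, as are the function name `split_table` and the sample rows. The sketch also assumes the new value lies above the smallest node and is not already in the ring.

```python
from bisect import bisect_right, insort

def split_table(ring, tables, new_value):
    """Insert new_value into the ring and move the rows that now belong to it."""
    left = ring[bisect_right(ring, new_value) - 1]   # table being split
    insort(ring, new_value)                          # V becomes V'
    old_rows = tables[left]
    # Rows whose mapped value is >= new_value fall in the new interval.
    tables[new_value] = [r for r in old_rows if r >= new_value]
    tables[left] = [r for r in old_rows if r < new_value]
    return left

ring = [40, 1010, 29392, 30000, 94039220]
tables = {v: [] for v in ring}
tables[30000] = [31000, 200000, 39029311]   # mapped values stored in TC-30000
split_from = split_table(ring, tables, 210202)
```

Only the table being split is touched; rows in every other center table stay where they are, which is the usual benefit of a consistent hash ring.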
In step P1032, center table TC-i is removed. The reverse of step P1031 is executed: the contents of TC-i are first merged into TC-(i-1), and center table TC-i is then removed.
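The removal can be sketched as the mirror image of the split. As before, rows are represented by their mapped values, and `remove_table` is a hypothetical name. Removing the first node would need a wrap-around merge, which the text does not describe, so this sketch rejects that case.

```python
def remove_table(ring, tables, value):
    """Merge table `value` into its left neighbour, then drop it from the ring."""
    i = ring.index(value)
    if i == 0:
        raise ValueError("sketch does not handle removing the first node")
    left = ring[i - 1]
    tables[left].extend(tables.pop(value))   # merge TC-i into TC-(i-1)
    ring.pop(i)                              # remove TC-i from the ring
    return left

ring = [40, 1010, 29392, 30000, 210202, 94039220]
tables = {v: [] for v in ring}
tables[30000] = [31000, 200000]
tables[210202] = [39029311]
merged_into = remove_table(ring, tables, 210202)
```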
Step P104 includes the following steps:
P1041, to find which center table a piece of data is located in, repeat the calculation of step P102 to obtain the value to which the data maps in the space G, denoted N (in this step N is the mapped value, not the node count);
P1042, find the corresponding numeric interval, which satisfies the following inequality:
left node i ≤ N < right node j, (i, j ∈ G); the left node i of the interval is the node where the data is stored.
When the value to which the data maps in the space 2^32 - 1, calculated by the method of step P102, is denoted N, the inequality is TC-i ≤ N < TC-j, (i, j ∈ 2^32 - 1), and the left node TC-i of the interval is the node where the data is stored.
Step P105 includes the following steps:
P1051, first find the corresponding data record set C in the dimension table;
P1052, run the locating method of step P104 on the records in set C to find which center table each piece of data is located in;
P1053, after the specific center table is found, match the specific records in that center table according to the data dimension to be extracted, and obtain the data of each record;
P1054, accumulate the data of all records to obtain the result for the data dimension to be extracted.
For example, to extract the sales volume for 2018, first find in the dimension table the data record set C whose time is 2018. Then run the locating method of step P104 on the records in C to find which center table TC-i each piece of data is located in. After the specific TC-i table is found, match the specific records in TC-i by the key time_key and obtain the sales volume Number-i of each record. Accumulating all Number-i gives the sales data for the whole of 2018, that is, Number-T = Number-1 + Number-2 + ... + Number-N.
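Steps P1051 to P1054 can be sketched end to end as below. All sample data is hypothetical, as are the field names `date`, `o`, and `number` and the functions `locate` and `extract_total`; `time_key` is the key named in the text. For simplicity each dimension row carries the value O it was mapped to, instead of recomputing the hash.

```python
from bisect import bisect_right

ring = [40, 1010, 29392, 30000, 94039220]

# Hypothetical dimension rows: a key, a date, and the mapped value O.
time_dim = [
    {"time_key": 1, "date": "2018-01-01", "o": 500},
    {"time_key": 2, "date": "2018-06-30", "o": 39029311},
    {"time_key": 3, "date": "2017-12-31", "o": 29999},
]
# Center tables keyed by node value; each row holds a key and a sales number.
center_tables = {
    40: [{"time_key": 1, "number": 12}],
    1010: [],
    29392: [{"time_key": 3, "number": 4}],
    30000: [{"time_key": 2, "number": 30}],
    94039220: [],
}

def locate(o):
    """Node value of the center table whose interval contains o."""
    return ring[bisect_right(ring, o) - 1]

def extract_total(year):
    record_set = [r for r in time_dim if r["date"].startswith(year)]  # P1051
    total = 0
    for rec in record_set:
        table = center_tables[locate(rec["o"])]                        # P1052
        total += sum(row["number"] for row in table
                     if row["time_key"] == rec["time_key"])            # P1053
    return total                                                       # P1054

number_t = extract_total("2018")
```

Each lookup touches only the one center table that holds the record, which is where the claimed access efficiency gain would come from.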
Referring to Fig. 6, based on the same inventive concept, an embodiment of the invention provides a system for establishing a relational model for multidimensional intellectual property data, including a main control module 1 and, connected to the main control module 1, a dimension establishing module 2, a dynamic hash ring building module 3, a hash ring adjustment module 4, a data locating module 5, a data extraction module 6, and a display module 7.
In this embodiment, the main control module 1 is a mainframe computer and the display module 7 is a display screen. The main control module 1 stores data and performs the processing, analysis, and computation on it. The display module 7 displays the extracted data.
The dimension establishing module 2 is used to establish the theme of the data and the statistical dimensions. The dynamic hash ring building module 3 is used to split the center table into multiple center tables and to coordinate them so that they work together. The hash ring adjustment module 4 is used to adjust the number of center tables in the hash ring. The data locating module 5 is used to find the center table to which data belongs and to output the numeric string. The data extraction module 6 is used to extract the required data.
The dimension establishing module 2 confirms the central theme. Assume the theme is trademark sales quantity, denoted TC. According to the business scenario, trademark sales data needs to be counted by time, international classification, and region, so there are three data dimensions, denoted time D1, international classification D2, and region D3. Each dimension in turn contains specific dimension data; for example, D1 contains 2018-01-01, 2018-01-02, and so on.
The dynamic hash ring building module 3 includes a node processing unit 8, a cutting unit 9, a splitting unit 10, a remainder unit 11, and an interval confirmation unit 12.
The node processing unit 8 is used to choose the size of a hash space, denoted G, and to choose the number of nodes, denoted N. In this embodiment, the space size is chosen as 2^32 - 1.
The cutting unit 9 defines the number of center tables to be the node count N from the node processing unit 8 and sets up a vector V of length N, each component of which is a random natural number in the space G. For example, when N is 5, vector V may take the values 40, 1010, 29392, 30000, 94039220. The values of V are chosen at random and then sorted in descending or ascending order; they must be natural numbers, so they are integers, and together they form the ring.
The splitting unit 10 numbers the center tables with the component values of vector V, whereby the single center table is split into N tables. In this embodiment, the center tables are numbered with the component values of vector V as TC-40, TC-1010, TC-29392, TC-30000, TC-94039220; the single center table has thus been split into 5.
The remainder unit 11 concatenates the data of a dimension table record, column by column, into one character string, then applies a numeric hash function to generate a unique number, denoted Num, and feeds it into the formula O = Num mod G, where mod denotes the modulo operation and O is the output value; the record is thereby mapped into the space G. In this embodiment, the numeric hash function is a function y = F(x) that maps an input parameter x to an output number y; the CRC32 function is preferred. It generates a number such as 39029311, in which case the formula is O = 39029311 mod (2^32 - 1).
The interval confirmation unit 12 stores a value falling in a half-open interval (closed on the left, open on the right) in the center table at the interval's left end; repeating the same operation for the data in each dimension table D1, D2, D3, ..., DA distributes every record into one of the center tables. In this embodiment, the remainder obtained by the remainder unit 11 is placed into the intervals formed by the cutting unit 9 and the table at the start of its interval is found: a value falling in a half-open interval is stored in the center table TC-i at the left end. For example, 39029311 lies in the interval [TC-30000, TC-94039220), so the data is stored in center table TC-30000. Repeating the same operation for the data in each dimension table D1, D2, D3 distributes every record into one of the center tables TC-i. When locating data, the same operation is carried out to find which center table a given dimension table record is in.
The hash ring adjustment module 4 includes a center table space detection unit 13, an adding unit 14, and a removing unit 15. The center table space detection unit 13 is used to detect the data size of the current center table and output a space detection signal; the adding unit 14 is used to increase the number of center tables; the removing unit 15 is used to reduce the number of center tables.
A highest space reference signal and a lowest space reference signal are preset in the main control module 1, and the highest space reference signal is greater than the lowest space reference signal.
When the space detection signal is greater than the highest space reference signal, the adding unit 14 splits the current center table into two center tables to increase the number of center tables; when the space detection signal is not greater than the highest space reference signal, the adding unit 14 does not split the current center table.
When the space detection signals of two adjacent center tables are both less than the lowest space reference signal, the removing unit 15 merges the two adjacent center tables into one center table to reduce the number of center tables; when neither signal is less than the lowest space reference signal, the removing unit 15 does not merge them. When only one of the two signals is less than the lowest space reference signal, no merge takes place either.
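The decision rules of the adding and removing units can be sketched as a single pass over the table sizes in ring order. The function `plan_adjustments` and its return format are hypothetical; the two thresholds are the record counts given for the method embodiment (10,000,000 and 1,000,000). How to pair up a run of three or more small tables is not specified in the text; this sketch pairs them greedily from the left.

```python
HIGHEST = 10_000_000   # records; highest storage space in the embodiment
LOWEST = 1_000_000     # records; lowest storage space in the embodiment

def plan_adjustments(sizes):
    """Given table sizes in ring order, return a list of (action, index) pairs."""
    actions = []
    i = 0
    while i < len(sizes):
        if sizes[i] > HIGHEST:
            actions.append(("split", i))
        elif i + 1 < len(sizes) and sizes[i] < LOWEST and sizes[i + 1] < LOWEST:
            actions.append(("merge", i))   # merge table i with table i + 1
            i += 1                         # the right neighbour is consumed
        i += 1
    return actions

plan = plan_adjustments([12_000_000, 500_000, 800_000, 300_000, 5_000_000])
```

In this run the fourth table is small but its only remaining neighbour is not, so it is left alone, matching the rule that a single small table triggers no merge.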
The data locating module 5 is used to query which center table holds a piece of data. To find which center table the data
is located in, the position of the data in the space 2^32 - 1 is computed with the calculation method of the dynamic Hash
ring construction module 3, giving a numeric string denoted N. The corresponding numeric interval is then found, satisfying
the inequality TC-i ≤ N < TC-j (i, j ∈ 2^32 - 1), and the left node TC-i of the interval is the node where the data is stored.
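A hedged Python sketch of the positioning step follows. The text only requires a numeric hash function; MD5 from the standard library is used here purely as a stand-in, and the names `ring_position` and `locate`, the sample row, and the two node values are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_right

SPACE = 2**32  # positions run from 0 to 2**32 - 1


def ring_position(record_fields):
    """Join a dimension-table row column by column and map it into the hash space."""
    joined = "".join(str(f) for f in record_fields)
    digest = hashlib.md5(joined.encode("utf-8")).hexdigest()  # stand-in hash function
    return int(digest, 16) % SPACE  # N = Num mod G


def locate(record_fields, nodes):
    """Return the node TC-i with TC-i <= N < TC-j for the row's position N."""
    n = ring_position(record_fields)
    idx = bisect_right(nodes, n) - 1
    # Assumption: positions below the smallest node wrap to the last node.
    return nodes[idx]


nodes = [0, 2**31]                    # hypothetical node vector, sorted
row = ["2018", "Q1", "patent"]        # hypothetical dimension-table row
print(ring_position(row), locate(row, nodes))
```

Because the same hash is used when a record is stored and when it is queried, the lookup returns the same center table both times as long as the node vector has not changed in between.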
The data extraction module 6 is used to extract data. For example, to extract the sales volume for 2018, the data record
set C whose time is 2018 is first found in the dimension table. The positioning method of the data locating module 5 is
then run on the records in C to find which center table TC-i holds the corresponding data. After the specific TC-i table
is found, the key time_key is used to match the actual records in center table TC-i, and the sales volume Number-i of each
record is obtained. Summing all Number-i gives the sales data for 2018.
That is, Number-T = Number-1 + Number-2 + ... + Number-N.
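The extraction flow can be sketched in Python as below. Only the key name time_key and the 2018 sales example come from the text; the row layout, the field names `year` and `sales`, the toy `locate` function, and the sample figures are invented for illustration.

```python
def total_sales(time_rows, center_tables, locate, year):
    """Sum the sales of all center-table rows matching the time dimension rows of `year`."""
    record_set_c = [row for row in time_rows if row["year"] == year]
    total = 0
    for row in record_set_c:
        table = center_tables[locate(row)]            # which TC-i holds this record
        for fact in table:
            if fact["time_key"] == row["time_key"]:   # match on the key time_key
                total += fact["sales"]                # Number-i
    return total                                      # Number-T


# Toy data: two center tables, addressed by a stand-in locate function.
time_rows = [
    {"time_key": 1, "year": 2018},
    {"time_key": 2, "year": 2018},
    {"time_key": 3, "year": 2017},
]
center_tables = {
    0: [{"time_key": 2, "sales": 70}],
    1: [{"time_key": 1, "sales": 100}, {"time_key": 3, "sales": 999}, {"time_key": 1, "sales": 5}],
}


def toy_locate(row):
    return row["time_key"] % 2  # placeholder for the Hash ring lookup


print(total_sales(time_rows, center_tables, toy_locate, 2018))
```

Only the center tables that actually hold matching records are scanned, which is the access-efficiency benefit the split into several center tables is meant to provide.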
Compared with traditional relational modeling, the present invention uses a distribution algorithm for distributed data to
split a huge center table into multiple cooperating center sub-tables, which can greatly improve scalability as the data volume grows.
The embodiments described here are preferred embodiments of the present invention and do not limit its scope of
protection; therefore, all equivalent changes made according to the structure, shape, and principle of the present
invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A relational model establishing method for intellectual property multidimensional data, characterized by comprising the following steps:
P101, establishing a theme and dimensions;
P102, constructing a dynamic Hash ring;
P103, adjusting the Hash ring;
P104, positioning data;
P105, completing data extraction and extracting the data.
2. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized
in that step P101 comprises the following steps:
P1011, determining the central theme to be counted;
P1012, determining the data dimensions: according to user demand, the data is determined from A different directions, giving A
data dimensions, denoted D1, D2, D3, ..., DA, and each dimension in turn contains specific dimension data.
3. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized
in that step P102 comprises the following steps:
P1021, first choosing a hash space of size G and choosing a number of nodes, denoted N;
P1022, defining the number of all center tables as the node number N of step P1021, and setting a vector V of length N,
each component of which corresponds to a random natural number in the space G;
P1023, numbering all center tables with the component values of the vector V, whereby the single center table is split into N center tables;
P1024, splicing the data in a dimension table into a character string column by column, then applying a numeric hash function
to generate a unique numeric string; the resulting number is denoted Num and is input to the following formula:
O = Num mod G;
where mod denotes the modulo operation and O is the output value, so that the record is mapped into the space G;
P1025, defining that a value in a half-open interval is stored in the center table on its left, the half-open interval being
closed on the left and open on the right; repeating the same operation for the data in each dimension table D1, D2, D3, ..., DA
distributes each piece of data to one of the different center tables.
4. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized
in that step P103 comprises the following adjustment step:
P1031, adding a new center table: for the former center table, an arbitrary value is taken in the space G and inserted in
order into the original vector V; according to the newly generated vector V', the data in the original interval is then
moved to the new node, i.e., the former center table is split into two new center tables;
step P1031 is run when the storage space of the former center table is greater than the highest storage space.
5. The relational model establishing method for intellectual property multidimensional data according to claim 4, characterized
in that step P103 further comprises the following adjustment step:
P1032, removing former center tables: the contents of adjacent former center tables are merged, then the two adjacent former
center tables are removed and a new center table is generated;
step P1032 is run when the storage spaces of the adjacent former center tables are both less than the lowest storage space.
6. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized
in that step P104 comprises the following steps:
P1041, when finding which center table data is located in, repeating the calculation method of step P102 to compute the
numeric string of the data in the space G, denoted N;
P1042, finding the corresponding numeric interval, which satisfies the following inequality:
left node i of the interval ≤ N < right node j of the interval, (i, j ∈ G), the left node i of the interval being
the node where the data is stored.
7. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized
in that step P105 comprises the following steps:
P1051, first finding the corresponding data record set C in the dimension table;
P1052, then running the positioning method of step P104 on the records in the data record set C to find which center table
the corresponding data is located in;
P1053, after the specific center table is found, matching the actual records in the center table according to the data
dimension to be extracted, and obtaining the data of those records;
P1054, accumulating the data of all records to obtain the data of the dimension to be extracted.
8. A relational model establishing system for intellectual property multidimensional data, characterized by comprising:
a main control module (1), used for data storage and data processing;
a dimension establishing module (2), connected to the main control module (1) and used to establish the theme of the data and the dimensions to be counted;
a dynamic Hash ring construction module (3), connected to the main control module (1) and used to split out multiple center
tables and control the multiple center tables to work simultaneously;
a Hash ring adjustment module (4), connected to the main control module (1) and used to adjust the number of center tables in the Hash ring;
a data locating module (5), connected to the main control module (1) and used to find the center table to which data belongs and output a numeric string;
a data extraction module (6), connected to the main control module (1) and used to extract the required data;
a display module (7), connected to the main control module (1) and used to receive and display the extracted data.
9. The relational model establishing system for intellectual property multidimensional data according to claim 8, characterized
in that the dynamic Hash ring construction module (3) comprises:
a node processing unit (8), which chooses a hash space of size G and chooses a number of nodes, denoted N;
a cutting unit (9), which defines the number of all center tables as the node number N of the node processing unit (8) and
sets a vector V of length N, each component of which corresponds to a random natural number in the space G;
a splitting unit (10), which numbers all center tables with the component values of the vector V, whereby the single center table is split into N center tables;
a remainder unit (11), which splices the data in a dimension table into a character string column by column, then applies a
numeric hash function to generate a unique numeric string; the resulting number is denoted Num and is input to the formula
O = Num mod G, where mod denotes the modulo operation and O is the output value, so that the record is mapped into the space G;
a section confirmation unit (12), which defines that a value in a half-open interval is stored in the center table on its
left, the half-open interval being closed on the left and open on the right; repeating the same operation for the data in
each dimension table D1, D2, D3, ..., DA distributes each piece of data to one of the different center tables.
10. The relational model establishing system for intellectual property multidimensional data according to claim 8, characterized
in that the Hash ring adjustment module (4) comprises:
a center table space detection unit (13), used to detect the data size of the current center table and output a space detection signal;
an adding unit (14), used to increase the number of center tables;
a removal unit (15), used to reduce the number of center tables;
a highest space reference signal and a lowest space reference signal are preset in the main control module (1), and the
highest space reference signal is greater than the lowest space reference signal;
when the space detection signal is greater than the highest space reference signal, the adding unit (14) splits the current
center table into two center tables to increase the number of center tables; otherwise, the number is not increased;
when the space detection signals of two adjacent center tables are both less than the lowest space reference signal, the
removal unit (15) merges the two adjacent center tables into one center table to reduce the number of center tables; otherwise, the number is not reduced.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910143405.3A CN109902132B (en) | 2019-02-26 | 2019-02-26 | Relation model establishing method and system for intellectual property multi-dimensional data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109902132A | 2019-06-18 |
CN109902132B | 2023-03-03 |
Family ID: 66945629
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910143405.3A (Active, CN109902132B) | 2019-02-26 | 2019-02-26 | Relation model establishing method and system for intellectual property multi-dimensional data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109902132B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101008954A (en) * | 2007-01-30 | 2007-08-01 | 金蝶软件(中国)有限公司 | Multidimensional expression data caching method and device in online analytical processing system |
CN101308496A (en) * | 2008-07-04 | 2008-11-19 | 沈阳格微软件有限责任公司 | Large scale text data external clustering method and system |
CN102063486A (en) * | 2010-12-28 | 2011-05-18 | 东北大学 | Multi-dimensional data management-oriented cloud computing query processing method |
KR20120047656A (en) * | 2010-11-04 | 2012-05-14 | 목포대학교산학협력단 | Vessle's data management system |
CN103942343A (en) * | 2014-05-12 | 2014-07-23 | 中国人民大学 | Data storage optimization method for hash joint |
CN104424229A (en) * | 2013-08-26 | 2015-03-18 | 腾讯科技(深圳)有限公司 | Calculating method and system for multi-dimensional division |
Also Published As
Publication number | Publication date |
---|---|
CN109902132B (en) | 2023-03-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105488231B (en) | A kind of big data processing method divided based on adaptive table dimension | |
US8335783B2 (en) | Collection of statistics for spatial columns or R-tree indexes | |
CN104915450A (en) | HBase-based big data storage and retrieval method and system | |
Lin et al. | An extendible hash for multi-precision similarity querying of image databases | |
WO2002089013A3 (en) | Method, system, program, and computer readable medium for indexing object oriented objects in an object oriented database | |
DE60130475D1 (en) | IMPLEMENTATION OF CALCULATIONS OF THE TABLE CALCULATION TYPE IN A DATABASE SYSTEM | |
CN106844324B (en) | Method for exporting variable column data into Excel format | |
CN106599127A (en) | Log storage and query method applied to standalone server | |
CN103198522A (en) | Three-dimensional scene model generation method | |
CN102867065B (en) | Based on Data Transform Device and the method for relevant database | |
CN103678550A (en) | Mass data real-time query method based on dynamic index structure | |
WO2021000500A1 (en) | Method and device for incremental building of cube model, server and storage medium | |
CN104615782A (en) | Address matching method based on sliding window maximum matching algorithm | |
WO2001088656A3 (en) | Apparatus and method for performing transformation-based indexing of high-dimensional data | |
Tzouramanis et al. | Multiversion linear quadtree for spatio-temporal data | |
CN109902132A (en) | A kind of relational model method for building up and its system for intellectual property multidimensional data | |
CN111737490B (en) | Knowledge graph ontology model generation method and device based on banking channel | |
CN117151922A (en) | Knowledge-graph-based part processing process route generation method and device | |
CN102236721B (en) | Method for extracting complex window space information in space data engine | |
Rejito et al. | Optimization CBIR using k-means clustering for image database | |
CN108052587B (en) | Big data analysis method based on decision tree | |
Yang et al. | Research on distributed Hilbert R tree spatial index based on BIRCH clustering | |
CN104978395A (en) | Vision dictionary construction and application method and apparatus | |
Dong et al. | The application of association rule mining to remotely sensed data | |
CN112148830A (en) | Semantic data storage and retrieval method and device based on maximum area grid |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: 518000 1901, block D, building 1, Section 1, Chuangzhi Yuncheng, Liuxian Avenue, Xili community, Xili street, Nanshan District, Shenzhen, Guangdong Province. Applicant after: Weizheng Intellectual Property Technology Co.,Ltd. Address before: 518000 Guangdong Province Shenzhen Nanshan District Xili Street Chaguang Road and Chuangke Road intersection of Botton Science Park B Vanke Yunchuang 20 stories 2001. Applicant before: WEIZHENG INTELLECTUAL PROPERTY SERVICES Ltd. |
GR01 | Patent grant | ||