CN109902132A - Method and system for establishing a relational model for multidimensional intellectual property data

Info

Publication number: CN109902132A
Authority: CN (China)
Prior art keywords: data, center table, center, space, dimension
Legal status: Granted (the legal status is an assumption and is not a legal conclusion)
Application number: CN201910143405.3A
Other languages: Chinese (zh)
Other versions: CN109902132B (en)
Inventor: 朱凯旋
Current Assignee: Wei Zheng Intellectual Property Service Co Ltd
Original Assignee: Wei Zheng Intellectual Property Service Co Ltd
Priority: CN201910143405.3A
Granted publication: CN109902132B
Landscapes: Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method and system for establishing a relational model for multidimensional intellectual property data, in the technical field of data mining. It addresses two problems: specialized data warehouse software is generally complex, places high demands on hardware, and needs specialized personnel to maintain it; and when a star schema or galaxy schema is used, the center table grows very large, which lowers query and access efficiency. The method comprises the following steps: P101, establishing the theme and dimensions; P102, building a dynamic hash ring; P103, adjusting the hash ring; P104, locating data; P105, extracting the data. Through a new way of constructing center tables, the invention builds multiple center tables for the same theme that work together, which reduces the data volume of any single center table and improves center table access efficiency.

Description

A method and system for establishing a relational model for multidimensional intellectual property data
Technical field
The present invention relates to the technical field of data mining, and more particularly to a method and system for establishing a relational model for multidimensional intellectual property data.
Background
In the data warehouse field, data is generally modeled in multidimensional form.
Referring to Fig. 1, take trademark sales data in intellectual property as an example: an analyst may want to know both which international classification (the classification dimension) has the highest registration count and which month (the time dimension) has the highest registration count. In general, modeling starts by fixing a theme, which is the content to be counted, for example the trademark sales quantity. Once the theme is fixed, the measures of the data, that is the statistical dimensions, are established, for example the three dimensions of international classification, month, and region. A relationship this complex is usually expressed as a data cube.
A data warehouse actually holds an n-dimensional cube; for ease of illustration only a three-dimensional cube is shown here. Each coordinate axis of the cube represents one statistical dimension. Common professional data warehouse software can represent this cube structure well.
Relational databases are widely used within enterprises and are highly general, so building a data warehouse on a relational database has become a low-cost, general-purpose approach.
Traditional construction approaches include the star schema and the galaxy schema.
Referring to Fig. 2, a star schema mainly includes:
1) one center table, which stores the keys of all data dimensions under a given theme;
2) multiple dimension tables, each of which stores the specific content of a single dimension.
To compute statistics over some dimension, the specific key set is first looked up and extracted from the dimension table, and the center table is then queried with those keys to extract the specific data for the statistics.
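The two-step lookup described above (keys from the dimension table first, then data from the center table) can be sketched with an in-memory SQLite database. This is only an illustration: the table names `dim_time` and `center`, their columns, and the sample rows are hypothetical and do not come from the patent.

```python
import sqlite3

# Hypothetical tables and rows, chosen only to illustrate the lookup order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, day TEXT);
    CREATE TABLE center (time_key INTEGER, class_key INTEGER, number INTEGER);
    INSERT INTO dim_time VALUES (1, '2017-12-31'), (2, '2018-01-01'), (3, '2018-01-02');
    INSERT INTO center VALUES (1, 10, 5), (2, 10, 7), (3, 11, 4);
""")


def total_for_year(year: str) -> int:
    # Step 1: extract the key set from the dimension table.
    keys = [row[0] for row in conn.execute(
        "SELECT time_key FROM dim_time WHERE day LIKE ?", (year + "-%",))]
    # Step 2: use those keys to pull the measures from the center table.
    placeholders = ",".join("?" * len(keys))
    row = conn.execute(
        "SELECT COALESCE(SUM(number), 0) FROM center "
        f"WHERE time_key IN ({placeholders})", keys).fetchone()
    return row[0]


print(total_for_year("2018"))
```

Every such query touches the single `center` table, which is why its size dominates query cost as the data grows.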
Referring to Fig. 3, a galaxy schema is a combination of multiple star schemas. It is mainly used when there are multiple themes, and dimension tables can be shared between themes. For example, in addition to the trademark registration count theme, a second statistical theme, trademark sales volume, can be created with the same three statistical dimensions.
The prior art above has the following deficiencies: specialized data warehouse software is generally complex, places high demands on hardware (usually requiring a minicomputer), and needs specialized personnel to maintain it. In addition, when a star schema or galaxy schema is used, the center table becomes very large, so query and access efficiency are low. There is room for improvement.
Summary of the invention
The first object of the present invention is to provide a method for establishing a relational model for multidimensional intellectual property data which, through a new way of constructing center tables, builds multiple center tables for the same theme that work together, reducing the data volume of any single center table and improving center table access efficiency.
This object of the invention is achieved by the following technical solution:
A method for establishing a relational model for multidimensional intellectual property data, comprising the following steps:
P101, establishing the theme and dimensions;
P102, building a dynamic hash ring;
P103, adjusting the hash ring;
P104, locating data;
P105, extracting the data.
With the above technical solution, establishing the dimensions and the theme makes the subject of the query explicit. The dynamic hash ring is a kind of consistent hash ring; it arranges the data on a closed ring that can be traversed cyclically, and the ring can then be adjusted. Data is later retrieved by locating and extracting it. Multiple center tables for the same theme work together, which reduces the data volume of any single center table and improves center table access efficiency.
The present invention is further arranged such that step P101 comprises the following steps:
P1011, determining the central theme to be counted;
P1012, determining the data dimensions: according to user requirements, the data is examined from A different directions, giving A data dimensions denoted D1, D2, D3 ... DA, and each dimension in turn contains specific dimension data.
With the above technical solution, confirming the central theme to be counted fixes the overall direction of the statistics, while confirming the data dimensions matches the additional conditions that need to be applied and partitions the data by dimension for the user to choose from.
The present invention is further arranged such that step P102 comprises the following steps:
P1021, first choosing a hash space of size G and choosing a node quantity, denoted N;
P1022, defining the number of center tables as the node quantity N from step P1021, and setting up a vector V of length N in which each component is a random natural number in the space G;
P1023, numbering the center tables by the component values of vector V, so that the single center table is split into N tables;
P1024, concatenating the data of a dimension table record into one character string column by column, then applying a numeric hash function to generate a unique number, denoted Num, which is entered into the following formula:
O = Num mod G;
where mod is the modulo operation and O is the output value; the record is thereby mapped into the space G;
P1025, storing each value in the center table at the left end of the half-open interval (closed on the left, open on the right) in which it falls; repeating the same operation for the data in each dimension table D1, D2, D3 ... DA distributes every record to one of the center tables.
With the above technical solution, choosing the hash space together with the node quantity partitions the hash ring into segments, and splitting the center table reduces the data volume in each center table, which improves query efficiency.
The present invention is further arranged such that step P103 includes the following adjustment step:
P1031, adding a new center table: a value is chosen arbitrarily in the space G for the new center table and inserted in order into the original vector V, producing a new vector V'; the data of the original interval that now belongs to the new node is moved to the new node, that is, the preceding center table is split into two new center tables;
step P1031 is executed when the storage used by an original center table exceeds the maximum storage space.
With the above technical solution, the space used by each center table is checked and center tables are added automatically, which avoids the loss of operating efficiency caused by a center table holding too much data.
The present invention is further arranged such that step P103 also includes the following adjustment step:
P1032, removing a center table: the contents of two adjacent original center tables are merged, the two original adjacent center tables are removed, and a new center table is generated;
step P1032 is executed when the storage used by each of two adjacent original center tables is below the minimum storage space.
With the above technical solution, removing center tables frees the space they occupied, and tracking the tables that hold little data improves overall resource utilization.
The present invention is further arranged such that step P104 comprises the following steps:
P1041, to find which center table a piece of data is in, repeating the calculation of step P102 to compute the number to which the data maps in the space G, denoted N (here N denotes the mapped value, not the node quantity of step P1021);
P1042, finding the corresponding numeric interval, which satisfies the following inequality:
left node i of the interval ≤ N < right node j of the interval (i, j ∈ G); the left node i of the interval is the node where the data is stored.
With the above technical solution, the data is looked up among the center tables in the same way it was placed when the model was built, so the corresponding data is found reliably, which improves the stability of overall operation and reduces system errors.
The present invention is further arranged such that step P105 comprises the following steps:
P1051, first finding the corresponding data record set C in the dimension table;
P1052, then applying the locating method of step P104 to the records in set C to find which center table holds the data for each record;
P1053, after the specific center table is found, matching the specific record in that center table according to the data dimension to be extracted, and obtaining the data of that record;
P1054, accumulating the data of all records to obtain the result for the data dimension to be extracted.
With the above technical solution, the corresponding records are obtained from the dimension table and located, the data of each record is retrieved, and the result is matched out by the data dimension to be extracted.
The second object of the present invention is to provide a system for establishing a relational model for multidimensional intellectual property data which, through a new way of constructing center tables, builds multiple center tables for the same theme that work together, reducing the data volume of any single center table and improving center table access efficiency.
This object of the invention is achieved by the following technical solution:
A system for establishing a relational model for multidimensional intellectual property data, comprising:
a main control module, used for data storage and data processing;
a dimension establishing module, connected to the main control module and used to establish the theme of the data and the statistical dimensions;
a dynamic hash ring building module, connected to the main control module and used to split the center table into multiple center tables and control them to work together;
a hash ring adjustment module, connected to the main control module and used to adjust the number of center tables on the hash ring;
a data locating module, connected to the main control module and used to find the center table to which data belongs and output the mapped number;
a data extraction module, connected to the main control module and used to extract the required data;
a display module, connected to the main control module and used to receive and display the extracted data.
With the above technical solution, the dimension establishing module confirms the central theme to be counted and so fixes the overall direction of the statistics. The dynamic hash ring building module and the hash ring adjustment module arrange the data on a closed ring that can be traversed cyclically and then adjust the ring. The data locating module and the data extraction module locate and extract the data, and the display module finally displays it. Multiple center tables for the same theme work together, which reduces the data volume of any single center table and improves center table access efficiency.
The present invention is further arranged such that the dynamic hash ring building module comprises:
a node processing unit, which chooses a hash space of size G and a node quantity, denoted N;
a cutting unit, which defines the number of center tables as the node quantity N from the node processing unit and sets up a vector V of length N in which each component is a random natural number in the space G;
a splitting unit, which numbers the center tables by the component values of vector V, so that the single center table is split into N tables;
a remainder unit, which concatenates the data of a dimension table record into one character string column by column, then applies a numeric hash function to generate a unique number, denoted Num, which is entered into the formula O = Num mod G, where mod is the modulo operation and O is the output value; the record is thereby mapped into the space G;
an interval confirmation unit, which stores each value in the center table at the left end of the half-open interval (closed on the left, open on the right) in which it falls; repeating the same operation for the data in each dimension table D1, D2, D3 ... DA distributes every record to one of the center tables.
With the above technical solution, the cutting unit and the splitting unit separate the data, and the remainder unit together with the interval confirmation unit places each record, so that the corresponding data can be retrieved.
The present invention is further arranged such that the hash ring adjustment module comprises:
a center table space detection unit, used to detect the data size of the current center table and output a space detection signal;
an adding unit, used to increase the number of center tables;
a removing unit, used to reduce the number of center tables;
a maximum space reference signal and a minimum space reference signal are preset in the main control module, and the maximum space reference signal is greater than the minimum space reference signal;
when the space detection signal is greater than the maximum space reference signal, the adding unit splits the current center table into two center tables to increase the number of center tables; otherwise the number is not increased;
when two adjacent space detection signals are both less than the minimum space reference signal, the removing unit merges the two adjacent center tables into one center table to reduce the number of center tables; otherwise the number is not reduced.
With the above technical solution, the adding unit and the removing unit split and merge center tables to keep storage organized, which improves overall operating efficiency and saves space.
In summary, the beneficial technical effect of the invention is: through a new way of constructing center tables, multiple center tables for the same theme are built to work together, which reduces the data volume of any single center table and improves center table access efficiency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the data cube in the background art.
Fig. 2 is a diagram of the star schema in the background art.
Fig. 3 is a diagram of the galaxy schema in the background art.
Fig. 4 is a schematic diagram of the method for establishing a relational model for multidimensional intellectual property data.
Fig. 5 is a schematic diagram of the data content.
Fig. 6 is a schematic diagram of the system for establishing a relational model for multidimensional intellectual property data.
In the figures: 1, main control module; 2, dimension establishing module; 3, dynamic hash ring building module; 4, hash ring adjustment module; 5, data locating module; 6, data extraction module; 7, display module; 8, node processing unit; 9, cutting unit; 10, splitting unit; 11, remainder unit; 12, interval confirmation unit; 13, center table space detection unit; 14, adding unit; 15, removing unit.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings.
Referring to Fig. 4, the invention discloses a method for establishing a relational model for multidimensional intellectual property data, comprising the following steps:
P101, establishing the theme and dimensions;
P102, building a dynamic hash ring;
P103, adjusting the hash ring;
P104, locating data;
P105, extracting the data.
Referring to Fig. 5, step P101 comprises the following steps:
P1011, determining the central theme to be counted;
P1012, determining the data dimensions: according to user requirements, the data is examined from A different directions, giving A data dimensions denoted D1, D2, D3 ... DA, and each dimension in turn contains specific dimension data.
When determining the central theme, assume the theme is the trademark sales quantity. When determining the data dimensions, the business scenario calls for counting trademark sales by time, international classification, and region, so there are 3 data dimensions: time D1, international classification D2, and region D3. Each dimension in turn contains specific dimension data; for example, D1 contains 2012-02-13, 2013-03-15, and so on.
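As a minimal sketch of what step P101 produces, the theme and its A dimensions can be held in two small data classes. The class names `Theme` and `Dimension` are hypothetical and chosen only for illustration; the sample dates are the ones given in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str
    values: list[str] = field(default_factory=list)  # specific dimension data


@dataclass
class Theme:
    name: str
    dimensions: list[Dimension]


# The example from the text: one theme counted along three dimensions.
theme = Theme(
    name="sales quantity",
    dimensions=[
        Dimension("time", ["2012-02-13", "2013-03-15"]),
        Dimension("international classification"),
        Dimension("region"),
    ],
)

A = len(theme.dimensions)                    # number of data dimensions
labels = [f"D{i}" for i in range(1, A + 1)]  # D1, D2, ... DA
print(dict(zip(labels, (d.name for d in theme.dimensions))))
```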
When building the dynamic hash ring, note that the hash ring is a ring on which the data wraps around cyclically. The dynamic hash ring is a data locating technique used to split the center table into multiple center tables that work together.
Step P102 comprises the following steps, which build the dynamic hash ring:
P1021, first choosing a hash space of size G and choosing a node quantity, denoted N.
In this embodiment, the space size is chosen as 2^32 - 1.
P1022, defining the number of center tables as the node quantity N from step P1021, and setting up a vector V of length N in which each component is a random natural number in the space G.
For example, when N is 5, vector V may take the values 40, 1010, 29392, 30000, 94039220. The values of V are chosen at random, arranged in descending or ascending order, and must be natural numbers; they are therefore integers and together form a ring of data.
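The construction of vector V can be sketched as follows. The function name `build_ring` and the `seed` parameter are hypothetical; drawing the values without replacement and sorting them in ascending order are assumptions that keep every node distinct and make later interval lookups simple.

```python
import random
from typing import List, Optional

G = 2**32 - 1  # hash space size used in this embodiment


def build_ring(n: int, space: int = G, seed: Optional[int] = None) -> List[int]:
    """Return n distinct random node values in [0, space), in ascending order."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no two center tables share a node.
    return sorted(rng.sample(range(space), n))


V = build_ring(5, seed=1)
print(V)
```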
P1023, numbering the center tables by the component values of vector V, so that the single center table is split into N tables.
In this embodiment, the center tables are numbered by the component values of vector V as TC-40, TC-1010, TC-29392, TC-30000, TC-94039220; the single center table has thus been split into 5.
P1024, concatenating the data of a dimension table record into one character string column by column, then applying a numeric hash function to generate a unique number, denoted Num, which is entered into the following formula:
O = Num mod G;
where mod is the modulo operation and O is the output value; the record is thereby mapped into the space G.
In this embodiment, the numeric hash function is a function y = F(x) that maps an input parameter x to an output number y; the CRC32 function is preferred. It generates a unique number, for example 39029311, in which case the formula is O = 39029311 mod (2^32 - 1).
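Step P1024 can be sketched with the CRC32 implementation in Python's standard `zlib` module. The function name `map_record` is hypothetical, and joining the columns with no separator is an assumption about how the columns are spliced.

```python
import zlib

G = 2**32 - 1  # hash space size used in this embodiment


def map_record(columns: list[str], space: int = G) -> int:
    """Map one dimension-table record into the hash space."""
    joined = "".join(columns)                 # splice the columns into one string
    num = zlib.crc32(joined.encode("utf-8"))  # Num, an unsigned 32-bit integer
    return num % space                        # O = Num mod G


print(map_record(["2018-01-01", "class 9", "Zhejiang"]))
print(39029311 % G)  # the example value from the text is below G, so it maps to itself
```

Two caveats on this sketch: CRC32 is a 32-bit checksum, so distinct records can collide, and joining columns with no separator maps ("ab", "c") and ("a", "bc") to the same string. A separator character or a wider hash would avoid both, at the cost of departing from the embodiment.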
P1025, storing each value in the center table at the left end of the half-open interval (closed on the left, open on the right) in which it falls; repeating the same operation for the data in each dimension table D1, D2, D3 ... DA distributes every record to one of the center tables.
In this embodiment, the remainder obtained in P1024 is placed into the intervals defined by vector V in P1022 and the table at the start of its interval is found: a value is stored in the center table TC-i at the left end of its half-open interval (closed on the left, open on the right). For example, 39029311 from step P1024 lies in the interval [TC-30000, TC-94039220), so the data is stored in center table TC-30000. Repeating the same operation for the data in each dimension table D1, D2, D3 distributes every record to one of the center tables TC-i. The same operation is carried out when locating data, so it is also possible to determine which center table holds the data for a given dimension table record.
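The interval rule of P1025 amounts to a binary search over the sorted node values. The sketch below uses `bisect_right` from the standard library; the function name `distribute` is hypothetical, and sending values smaller than the first node to the last table (wrapping around the ring) is an assumption, since the text does not spell out that case.

```python
from bisect import bisect_right
from collections import defaultdict

V = [40, 1010, 29392, 30000, 94039220]  # node values from this embodiment


def distribute(mapped_values, ring):
    """Group already-mapped values O by the center table that stores them."""
    tables = defaultdict(list)
    for o in mapped_values:
        # Index of the left node of the half-open interval containing o.
        # For o below the first node this is -1, which wraps to the last node.
        idx = bisect_right(ring, o) - 1
        tables[f"TC-{ring[idx]}"].append(o)
    return dict(tables)


tables = distribute([39029311, 500, 29999, 94039220], V)
print(tables)
```

With the example value from the text, 39029311 lands in TC-30000, matching the interval [TC-30000, TC-94039220).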
Step P103 includes the following adjustment steps. The number of center tables may need to change, mainly for performance reasons, so the hash ring has to be adjusted. The adjustment rules are as follows:
P1031, adding a new center table: a value is chosen arbitrarily in the space G for the new center table and inserted in order into the original vector V, producing a new vector V'; the data of the original interval that now belongs to the new node is moved to the new node, that is, the preceding center table is split into two new center tables;
step P1031 is executed when the storage used by an original center table exceeds the maximum storage space.
P1032, removing a center table: the contents of two adjacent original center tables are merged, the two original adjacent center tables are removed, and a new center table is generated;
step P1032 is executed when the storage used by each of two adjacent original center tables is below the minimum storage space.
Step P1031 is the rule for adding a center table and step P1032 is the rule for removing one. When the storage used by an original center table exceeds the maximum storage space, step P1031 is executed; when the storage used by each of two adjacent original center tables is below the minimum storage space, step P1032 is executed. In this embodiment the maximum storage space is 10 million records and the minimum storage space is 1 million records.
In step P1031, a new center table TC-i is added. A value is chosen arbitrarily in the space 2^32 - 1 for TC-i, for example TC-210202. The value is inserted in order into the original vector V, giving the new vector V' = (40, 1010, 29392, 30000, 210202, 94039220). The data in the interval [TC-210202, TC-94039220), which was previously in center table TC-30000, is then moved to node TC-210202; that is, the preceding center table TC-(i-1) is split.
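The split in step P1031 can be sketched as follows, using the node value 210202 from the vector V' above. The function name `split_table` and the representation of center tables as a dict from node value to a list of mapped values are hypothetical simplifications.

```python
from bisect import insort


def split_table(ring, tables, new_node):
    """Insert new_node into the sorted ring and move the rows it now owns."""
    if new_node in ring:
        raise ValueError("node already on the ring")
    insort(ring, new_node)                 # V becomes V'
    pos = ring.index(new_node)
    prev = ring[pos - 1]                   # preceding table; pos == 0 wraps (assumption)
    upper = ring[pos + 1] if pos + 1 < len(ring) else None
    moved, kept = [], []
    for o in tables.get(prev, []):
        # A row moves if it falls in the new interval [new_node, upper).
        in_new = o >= new_node and (upper is None or o < upper)
        (moved if in_new else kept).append(o)
    tables[prev] = kept
    tables[new_node] = moved


ring = [40, 1010, 29392, 30000, 94039220]
tables = {30000: [30500, 150000, 39029311]}
split_table(ring, tables, 210202)
print(ring)
print(tables)
```

Only the rows of the preceding table are examined; every other center table is untouched, which is the property that makes a consistent hash ring cheap to resize.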
In step P1032, center table TC-i is removed by performing the reverse of step P1031: the contents of TC-i are first merged into TC-(i-1), and center table TC-i is then removed.
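The removal in step P1032 can be sketched as the mirror image of a split. As before, the function name `remove_table` and the dict-of-lists representation of center tables are hypothetical.

```python
def remove_table(ring, tables, node):
    """Merge the rows of TC-node into the preceding table, then drop the node."""
    pos = ring.index(node)
    prev = ring[pos - 1]  # TC-(i-1); pos == 0 wraps to the last node (assumption)
    if prev == node:
        raise ValueError("cannot remove the only center table")
    # First merge the contents into the preceding table ...
    tables.setdefault(prev, []).extend(tables.pop(node, []))
    # ... then remove the node from the ring.
    ring.remove(node)


ring = [40, 1010, 29392, 30000, 210202, 94039220]
tables = {30000: [30500], 210202: [39029311]}
remove_table(ring, tables, 210202)
print(ring)
print(tables)
```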
Step P104 comprises the following steps:
P1041, to find which center table a piece of data is in, repeating the calculation of step P102 to compute the number to which the data maps in the space G, denoted N (here N denotes the mapped value, not the node quantity of step P1021);
P1042, finding the corresponding numeric interval, which satisfies the following inequality:
left node i of the interval ≤ N < right node j of the interval (i, j ∈ G); the left node i of the interval is the node where the data is stored.
In this embodiment, the number to which the data maps in the space 2^32 - 1 is computed according to the calculation of step P102 and denoted N. The inequality is then TC-i ≤ N < TC-j (i, j ∈ 2^32 - 1), and the left node TC-i of the interval is the node where the data is stored.
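The interval search of P1042 can be sketched so that it returns both ends of the interval, which makes the inequality explicit. The function name `find_interval` is hypothetical, and closing the ring by wrapping from the last node back to the first is an assumption for values outside the span of the node values.

```python
from bisect import bisect_right

V = [40, 1010, 29392, 30000, 94039220]  # node values from this embodiment


def find_interval(n: int, ring: list[int]) -> tuple[int, int]:
    """Return (left, right) node values of the interval that contains n."""
    pos = bisect_right(ring, n)
    if pos == 0:
        # n is below the first node: wrap around to the last node (assumption).
        return ring[-1], ring[0]
    left = ring[pos - 1]
    right = ring[pos] if pos < len(ring) else ring[0]  # wrap to close the ring
    return left, right


left, right = find_interval(39029311, V)
print(f"TC-{left} <= 39029311 < TC-{right}")  # data is stored in TC-{left}
```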
Step P105 comprises the following steps:
P1051, first finding the corresponding data record set C in the dimension table;
P1052, then applying the locating method of step P104 to the records in set C to find which center table holds the data for each record;
P1053, after the specific center table is found, matching the specific record in that center table according to the data dimension to be extracted, and obtaining the data of that record;
P1054, accumulating the data of all records to obtain the result for the data dimension to be extracted.
For example, to extract the sales volume for 2018, the data record set C whose time is in 2018 is first found in the dimension table. The locating method of step P104 is then applied to the records in C to find which center table TC-i holds the data for each record. After the specific table TC-i is found, the specific record in TC-i is matched by the key time_key and the sales quantity Number-i of that record is obtained. Accumulating all Number-i gives the sales data for the whole of 2018: Number-T = Number-1 + Number-2 + ... + Number-N.
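Steps P1051 to P1054 can be sketched end to end as below. The dimension rows, the sales figures, and the names `dim_time`, `sales`, `locate`, and `yearly_total` are all hypothetical illustration data; only the node values and the use of CRC32 come from the embodiment.

```python
import zlib
from bisect import bisect_right

G = 2**32 - 1
V = [40, 1010, 29392, 30000, 94039220]  # node values from this embodiment


def locate(record, ring):
    """P104: map a dimension record into the space and return its node value."""
    o = zlib.crc32("".join(record).encode("utf-8")) % G
    return ring[bisect_right(ring, o) - 1]  # index -1 wraps to the last node


# Hypothetical time dimension table: (time_key, day).
dim_time = [("t1", "2017-12-31"), ("t2", "2018-01-01"), ("t3", "2018-01-02")]
sales = {"t1": 5, "t2": 7, "t3": 4}  # hypothetical Number-i per time_key

# Load phase: each record's data goes into the center table chosen by locate().
center_tables = {node: {} for node in V}
for record in dim_time:
    center_tables[locate(record, V)][record[0]] = sales[record[0]]


def yearly_total(year: str) -> int:
    C = [r for r in dim_time if r[1].startswith(year)]  # P1051: record set C
    total = 0
    for record in C:
        table = center_tables[locate(record, V)]        # P1052: find TC-i
        total += table[record[0]]                       # P1053: match by time_key
    return total                                        # P1054: accumulated sum


print(yearly_total("2018"))
```

Because extraction reuses the same `locate` function as loading, each lookup reads exactly one center table instead of scanning all of them.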
Referring to Fig. 6, based on the same inventive concept, an embodiment of the present invention provides a system for establishing a relational model for multidimensional intellectual property data, comprising a main control module 1 and, connected to the main control module 1, a dimension establishing module 2, a dynamic hash ring building module 3, a hash ring adjustment module 4, a data locating module 5, a data extraction module 6, and a display module 7.
In this embodiment, the main control module 1 is a mainframe computer and the display module 7 is a display screen. The main control module 1 stores data and performs data processing and analysis. The display module 7 displays the extracted data.
The dimension establishing module 2 is used to establish the theme of the data and the statistical dimensions. The dynamic hash ring building module 3 is used to split the center table into multiple center tables and control them to work together. The hash ring adjustment module 4 is used to adjust the number of center tables on the hash ring. The data locating module 5 is used to find the center table to which data belongs and output the mapped number. The data extraction module 6 is used to extract the required data.
The dimension establishing module 2 confirms the central theme; assume the theme is the trademark sales quantity, denoted TC. The business scenario calls for counting trademark sales by time, international classification, and region, so there are three data dimensions, denoted time D1, international classification D2, and region D3. Each dimension in turn contains specific dimension data; for example, D1 contains 2018-01-01, 2018-01-02, and so on.
The dynamic hash ring building module 3 includes a node processing unit 8, a cutting unit 9, a splitting unit 10, a remainder unit 11, and an interval confirmation unit 12.
The node processing unit 8 chooses a hash space of size G and a node quantity, denoted N. In this embodiment, the space size is chosen as 2^32 - 1.
The cutting unit 9 defines the number of center tables as the node quantity N from the node processing unit 8 and sets up a vector V of length N in which each component is a random natural number in the space G. For example, when N is 5, vector V may take the values 40, 1010, 29392, 30000, 94039220. The values of V are chosen at random, arranged in descending or ascending order, and must be natural numbers; they are therefore integers and together form a ring of data.
The splitting unit 10 numbers the center tables by the component values of vector V, so that the single center table is split into N tables. In this embodiment, the center tables are numbered TC-40, TC-1010, TC-29392, TC-30000, TC-94039220; the single center table has thus been split into 5.
The remainder unit 11 concatenates the data of a dimension table record into one character string column by column, then applies a numeric hash function to generate a unique number, denoted Num, which is entered into the formula O = Num mod G, where mod is the modulo operation and O is the output value; the record is thereby mapped into the space G. In this embodiment, the numeric hash function is a function y = F(x) that maps an input parameter x to an output number y; the CRC32 function is preferred. It generates a unique number, for example 39029311, in which case the formula is O = 39029311 mod (2^32 - 1).
The interval confirmation unit 12 stores each value in the center table at the left end of the half-open interval (closed on the left, open on the right) in which it falls; repeating the same operation for the data in each dimension table D1, D2, D3 ... DA distributes every record to one of the center tables. In this embodiment, the remainder obtained in P1024 is placed into the intervals defined by vector V in P1022 and the table at the start of its interval is found: a value is stored in the center table TC-i at the left end of its half-open interval. For example, 39029311 from step P1024 lies in the interval [TC-30000, TC-94039220), so the data is stored in center table TC-30000. Repeating the same operation for the data in each dimension table D1, D2, D3 distributes every record to one of the center tables TC-i. The same operation is carried out when locating data, so it is also possible to determine which center table holds the data for a given dimension table record.
The hash ring adjustment module 4 includes a center table space detection unit 13, an adding unit 14, and a removal unit 15. The center table space detection unit 13 detects the data size of the current center table and outputs a space detection signal; the adding unit 14 increases the number of center tables; the removal unit 15 reduces the number of center tables.
A highest space reference signal and a lowest space reference signal are preset in the main control module 1, and the highest space reference signal is greater than the lowest space reference signal.
When the space detection signal is greater than the highest space reference signal, the adding unit 14 splits the current center table into two center tables to increase the number of center tables; when the space detection signal is not greater than the highest space reference signal, the adding unit 14 does not split the current center table.
When the space detection signals of two adjacent center tables are both less than the lowest space reference signal, the removal unit 15 merges the two adjacent center tables into one center table to reduce the number of center tables; when neither of the two adjacent space detection signals is less than the lowest space reference signal, the removal unit 15 does not merge them. When only one of the two detection signals is less than the lowest space reference signal, no merging is performed either.
The data locating module 5 is used to query which center table the data is located in. When the center table holding a piece of data needs to be found, the calculation method of the dynamic hash ring construction module 3 is used to compute the data's numeric string in the space 2^32 - 1, denoted N. The corresponding numeric interval is then found, satisfying the inequality TC-i ≤ N < TC-j (i, j ∈ 2^32 - 1), and the left node TC-i of the interval is the node where the data is stored.
The data extraction module 6 is used to extract data. For example, to extract the sales volume for 2018, the data record set C whose time is 2018 is first found in the dimension table. The positioning method of the data locating module 5 is then applied to the records in C to find which center table TC-i the corresponding data is located in. After the specific table TC-i is found, the actual record in the center table TC-i is matched by the key time_key, and the sales volume Number-i of that record is obtained. Summing all Number-i values gives the sales data for 2018.
That is, Number-T = Number-1 + Number-2 + ... + Number-N.
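The extraction flow can be sketched end to end. The dimension rows, sales figures and node values below are hypothetical, `zlib.crc32` again stands in for the hash function, and the smallest node is set to 0 so that no wrap-around case arises.

```python
import zlib
from bisect import bisect_right

G = 2**32 - 1
NODES = [0, 1_000_000_000, 2_000_000_000, 3_000_000_000]  # hypothetical vector V, sorted


def locate(row):
    """Hash a dimension row and return the left node of its half-open interval."""
    num = zlib.crc32("".join(map(str, row)).encode("utf-8")) % G
    return NODES[bisect_right(NODES, num) - 1]


# Hypothetical dimension table rows: (time_key, year)
dimension = [("t1", 2018), ("t2", 2018), ("t3", 2017)]
sales = {"t1": 120, "t2": 80, "t3": 999}  # hypothetical sales volumes

# Load step: each record lands in the center table its dimension row hashes to.
center_tables = {node: {} for node in NODES}
for row in dimension:
    center_tables[locate(row)][row[0]] = sales[row[0]]


def total_sales(year):
    records = [row for row in dimension if row[1] == year]              # record set C
    return sum(center_tables[locate(row)][row[0]] for row in records)   # sum of Number-i


print(total_sales(2018))  # 200
```

The total does not depend on which center tables the records land in, because extraction uses the same locating function as loading.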
Compared with traditional relational modeling, the present invention uses a distribution algorithm for distributed data to split a huge center table into multiple center sub-tables that work together, which can greatly improve scalability with respect to data volume.
The embodiments described here are preferred embodiments of the present invention and do not limit its scope of protection. Accordingly, all equivalent changes made according to the structure, shape, and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A relational model establishing method for intellectual property multidimensional data, characterized by comprising the following steps:
P101, establishing a theme and dimensions;
P102, constructing a dynamic hash ring;
P103, adjusting the hash ring;
P104, locating data;
P105, completing data extraction and extracting the data.
2. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized in that step P101 comprises the following steps:
P1011, determining the central theme to be counted;
P1012, determining the data dimensions: according to user requirements, the data is determined from A different directions, giving A data dimensions, denoted D1, D2, D3, ..., DA, each of which in turn contains specific dimension data.
3. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized in that step P102 comprises the following steps:
P1021, first choosing a hash space size, the space size being G, and choosing a number of nodes, denoted N;
P1022, defining the number of all center tables as the number of nodes N from step P1021, and setting a vector V of length N, each component of which corresponds to a random natural number in the space G;
P1023, numbering all center tables with the component values of the vector V, whereby the single center table is split into N tables;
P1024, concatenating the data in a dimension table, column by column, into one character string, then applying a digital hash function to generate a unique numeric string, the resulting number being denoted Num and input into the following formula:
O = Num mod (G);
where mod represents the modulo operation and O is the output value, whereby the record is mapped into the space G;
P1025, storing a value that falls within a half-open interval in the center table on the left, the half-open interval being closed on the left and open on the right, and repeating the same operation for the data in each dimension table D1, D2, D3, ..., DA, so that each piece of data is distributed to a different center table.
4. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized in that step P103 comprises the following adjustment step:
P1031, adding a new center table: an arbitrary value in the space G is taken for the original center table and inserted in order into the original vector V to generate a new vector V'; the data in the original interval is then moved to the corresponding nodes, that is, the former center table is split into two new center tables;
step P1031 is executed when the storage space of the original center table is greater than the highest storage space.
5. The relational model establishing method for intellectual property multidimensional data according to claim 4, characterized in that step P103 further comprises the following adjustment step:
P1032, removing original center tables: the contents of two adjacent original center tables are merged, the two adjacent original center tables are then removed, and a new center table is generated;
step P1032 is executed when the storage spaces of the adjacent original center tables are both less than the lowest storage space.
6. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized in that step P104 comprises the following steps:
P1041, when finding which center table the data is located in, repeating the calculation method of step P102 to compute the data's numeric string in the space G, denoted N;
P1042, finding the corresponding numeric interval, which satisfies the following inequality:
left node i of the corresponding interval ≤ N < right node j of the corresponding interval (i, j ∈ G), the left node i of the interval being the node where the data is stored.
7. The relational model establishing method for intellectual property multidimensional data according to claim 1, characterized in that step P105 comprises the following steps:
P1051, first finding the corresponding data record set C in the dimension table;
P1052, then applying the positioning method of step P104 to the records in the data record set C to find which center table the corresponding data is located in;
P1053, after the specific center table is found, matching the actual record in the center table according to the data dimension to be extracted, and obtaining the data of that record;
P1054, accumulating the data of all records to obtain the data of the dimension to be extracted.
8. A relational model establishing system for intellectual property multidimensional data, characterized by comprising:
a main control module (1), used for data storage and data processing;
a dimension establishing module (2), connected to the main control module (1) and used to establish the theme of the data and the dimensions of the statistics;
a dynamic hash ring construction module (3), connected to the main control module (1) and used to split out multiple center tables and control the multiple center tables to work simultaneously;
a hash ring adjustment module (4), connected to the main control module (1) and used to adjust the number of center tables located in the hash ring;
a data locating module (5), connected to the main control module (1) and used to find the center table to which data belongs and output a numeric string;
a data extraction module (6), connected to the main control module (1) and used to extract the data that needs to be obtained;
a display module (7), connected to the main control module (1) and used to receive and display the extracted data.
9. The relational model establishing system for intellectual property multidimensional data according to claim 8, characterized in that the dynamic hash ring construction module (3) comprises:
a node processing unit (8), which chooses a hash space size, the space size being G, and chooses a number of nodes, denoted N;
a cutting unit (9), which defines the number of all center tables as the number of nodes N from the node processing unit (8) and sets a vector V of length N, each component of which corresponds to a random natural number in the space G;
a splitting unit (10), which numbers all center tables with the component values of the vector V, whereby the single center table is split into N tables;
a remainder unit (11), which concatenates the data in a dimension table, column by column, into one character string, then applies a digital hash function to generate a unique numeric string, the resulting number being denoted Num and input into the formula O = Num mod (G), where mod represents the modulo operation and O is the output value, whereby the record is mapped into the space G;
an interval confirmation unit (12), which stores a value that falls within a half-open interval in the center table on the left, the half-open interval being closed on the left and open on the right, and repeats the same operation for the data in each dimension table D1, D2, D3, ..., DA, so that each piece of data is distributed to a different center table.
10. The relational model establishing system for intellectual property multidimensional data according to claim 8, characterized in that the hash ring adjustment module (4) comprises:
a center table space detection unit (13), used to detect the data size of the current center table and output a space detection signal;
an adding unit (14), used to increase the number of center tables;
a removal unit (15), used to reduce the number of center tables;
a highest space reference signal and a lowest space reference signal are preset in the main control module (1), and the highest space reference signal is greater than the lowest space reference signal;
when the space detection signal is greater than the highest space reference signal, the adding unit (14) splits the current center table into two center tables to increase the number of center tables; otherwise, the number is not increased;
when the space detection signals of two adjacent center tables are both less than the lowest space reference signal, the removal unit (15) merges the two adjacent center tables into one center table to reduce the number of center tables; otherwise, the number is not reduced.
CN201910143405.3A 2019-02-26 2019-02-26 Relation model establishing method and system for intellectual property multi-dimensional data Active CN109902132B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910143405.3A CN109902132B (en) 2019-02-26 2019-02-26 Relation model establishing method and system for intellectual property multi-dimensional data

Publications (2)

Publication Number Publication Date
CN109902132A true CN109902132A (en) 2019-06-18
CN109902132B CN109902132B (en) 2023-03-03

Family

ID=66945629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910143405.3A Active CN109902132B (en) 2019-02-26 2019-02-26 Relation model establishing method and system for intellectual property multi-dimensional data

Country Status (1)

Country Link
CN (1) CN109902132B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101008954A (en) * 2007-01-30 2007-08-01 金蝶软件(中国)有限公司 Multidimensional expression data caching method and device in online analytical processing system
CN101308496A (en) * 2008-07-04 2008-11-19 沈阳格微软件有限责任公司 Large scale text data external clustering method and system
CN102063486A (en) * 2010-12-28 2011-05-18 东北大学 Multi-dimensional data management-oriented cloud computing query processing method
KR20120047656A (en) * 2010-11-04 2012-05-14 목포대학교산학협력단 Vessle's data management system
CN103942343A (en) * 2014-05-12 2014-07-23 中国人民大学 Data storage optimization method for hash joint
CN104424229A (en) * 2013-08-26 2015-03-18 腾讯科技(深圳)有限公司 Calculating method and system for multi-dimensional division

Also Published As

Publication number Publication date
CN109902132B (en) 2023-03-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 1901, block D, building 1, Section 1, Chuangzhi Yuncheng, Liuxian Avenue, Xili community, Xili street, Nanshan District, Shenzhen, Guangdong Province

Applicant after: Weizheng Intellectual Property Technology Co.,Ltd.

Address before: 518000 Guangdong Province Shenzhen Nanshan District Xili Street Chaguang Road and Chuangke Road intersection of Botton Science Park B Vanke Yunchuang 20 stories 2001

Applicant before: WEIZHENG INTELLECTUAL PROPERTY SERVICES Ltd.

GR01 Patent grant