CN105447137A - Algorithm for retrieving same master-slave relation data from big data based on relational database - Google Patents
Algorithm for retrieving same master-slave relation data from big data based on relational database
- Publication number
- CN105447137A (application number CN201510810811.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- algorithm
- enterprise
- order
- retrieving
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
- G06F16/244—Grouping and aggregation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06395—Quality analysis or management
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Entrepreneurship & Innovation (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Strategic Management (AREA)
- Development Economics (AREA)
- Educational Administration (AREA)
- Economics (AREA)
- Computational Linguistics (AREA)
- Game Theory and Decision Science (AREA)
- Mathematical Physics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a relational-database-based algorithm for retrieving identical master-slave relation data from big data. It is an algorithm for comparing data within massive data sets: following the principle of "reduce large to small, surface first and then point", it uses group traversal, intermediate-table storage and similar techniques to narrow the comparison range step by step, and thereby retrieves identical records efficiently. The method quickly retrieves identical records from the massive master-slave structured data found in enterprise data, and it suits the various enterprise management and control scenarios that require retrieving identical master-slave structured data. It strengthens the enterprise's management and control capability, creates a better market environment for the enterprise, and improves its competitiveness.
Description
Technical field
The present invention relates to relational databases, and specifically to an algorithm for retrieving identical master-slave relation data from big data based on a relational database.
Background
In the big data era, developing through data, and thereby improving business decision-making and public service quality, has become the trend for enterprises. In the analysis of massive data, data types include structured, unstructured and semi-structured data, and structured data in turn includes simple structured data and complex structured data. Simple structured data, such as character and numeric data, can be analyzed statistically with database SQL directly; for example, a GROUP BY statement can group the rows and thereby find identical data. The data can also be compared in a loop inside a program to find completely identical data. With massive data, the performance of such simple-type comparison can be improved significantly by optimizing the database and the algorithm. For analyzing and comparing master-slave relation data, however, an efficient and convenient retrieval method is lacking.
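The GROUP BY approach for simple structured data mentioned above can be sketched as follows. This is a minimal illustration only: the `customer` table, its columns and the sample rows are hypothetical and do not come from the patent, and Python's built-in `sqlite3` module stands in for the relational database.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the idea.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, phone) VALUES (?, ?, ?)",
    [(1, "Li", "100"), (2, "Wang", "200"), (3, "Li", "100"), (4, "Zhao", "300")],
)

# Group on the compared columns; a group with more than one row
# is a set of identical records.
duplicates = conn.execute(
    """
    SELECT name, phone, COUNT(*) AS cnt
    FROM customer
    GROUP BY name, phone
    HAVING COUNT(*) > 1
    """
).fetchall()
print(duplicates)
```

A single GROUP BY is enough here because each record is one row. A master-slave record spans one master row and several slave rows, which is why this direct approach does not carry over.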
Summary of the invention
The technical task of the present invention is to address the deficiencies of the prior art by providing an algorithm for retrieving identical master-slave relation data from big data based on a relational database. For the massive master-slave structured data in enterprise data, it provides a method for quickly retrieving identical records, and thereby provides data support for the enterprise's management and control analysis.
The technical solution adopted by the present invention to solve this technical problem is as follows:
An algorithm for retrieving identical master-slave relation data from big data based on a relational database is an algorithm for comparing data within massive data sets. Following the principle of "reduce large to small, surface first and then point", it uses group traversal, intermediate-table storage and similar techniques to narrow the comparison range step by step and retrieve identical records efficiently.
The algorithm proceeds in three stages: extract the grouping criteria from the master and slave tables, determine the grouping order, and execute the grouping. While the grouping is executed, traversal and intermediate-table storage are combined to narrow the data range step by step.
Compared with the prior art, the algorithm of the present invention for retrieving identical master-slave relation data from big data based on a relational database has the following beneficial effects. Aimed at the massive master-slave structured data in enterprise data, the method for quickly retrieving identical records suits the various enterprise management and control scenarios that require retrieving identical master-slave structured data. Retrieval of identical order data can be applied to an enterprise's management of goods diversion (unauthorized reselling across sales channels). Goods diversion disrupts the market order of the enterprise's products, causes infighting in the market and price confusion, and seriously damages the manufacturer's reputation. Management and control analysis of goods diversion relies on analyzing orders. One form of analysis is to find identical orders among massive orders and then use them to judge whether diversion is being caused by malicious order brushing, false orders, or internal staff colluding to resell goods. Ultimately this strengthens the enterprise's management and control capability, builds a better market environment for the enterprise, and improves its competitiveness.
Brief description of the drawings
Fig. 1 is a step diagram of the algorithm.
Fig. 2 is the entity-relationship diagram of the master-slave relation data example, order data.
Fig. 3 is the step diagram of the algorithm for retrieving identical orders in the example.
Detailed description
The algorithm of the present invention for retrieving identical master-slave relation data from big data based on a relational database is described in detail below.
The algorithm for retrieving identical master-slave relation data from big data based on a relational database follows the principle of "reduce large to small, surface first and then point". It uses group traversal, intermediate-table storage and similar techniques to narrow the comparison range step by step and retrieve identical records efficiently.
The algorithm proceeds in three stages: extract the grouping criteria from the master and slave tables, determine the grouping order, and execute the grouping. While the grouping is executed, traversal and intermediate-table storage are combined to narrow the data range step by step.
1) The concrete steps are shown in Fig. 1.
For ease of exposition, order data, a kind of master-slave relation data common in enterprises, is taken as the example. Suppose the master table is named CO_MAIN and the slave table is named CO_SUB. The E-R diagram is shown in Fig. 2.
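Since Fig. 2 is not reproduced in the text, the following is only a plausible sketch of the two tables. The table names CO_MAIN and CO_SUB and the key CO_ID come from the text; the other columns (TOTAL_QTY, TOTAL_AMT, GOOD_ID, QTY, AMT) are assumed names chosen for illustration, and amounts are stored as integers (for example, cents) so that equality comparisons are exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Master table: one row per order (all columns except CO_ID are assumed).
    CREATE TABLE CO_MAIN (
        CO_ID     TEXT PRIMARY KEY,
        TOTAL_QTY INTEGER NOT NULL,   -- order total quantity
        TOTAL_AMT INTEGER NOT NULL    -- order total amount, in integer units
    );
    -- Slave table: one row per commodity in an order.
    CREATE TABLE CO_SUB (
        CO_ID   TEXT NOT NULL REFERENCES CO_MAIN (CO_ID),
        GOOD_ID TEXT NOT NULL,        -- commodity identifier
        QTY     INTEGER NOT NULL,     -- commodity quantity on this line
        AMT     INTEGER NOT NULL,     -- commodity amount on this line
        PRIMARY KEY (CO_ID, GOOD_ID)
    );
    """
)

# One order with two commodity lines.
conn.execute("INSERT INTO CO_MAIN VALUES ('A1', 3, 50)")
conn.executemany(
    "INSERT INTO CO_SUB VALUES (?, ?, ?, ?)",
    [("A1", "G1", 1, 10), ("A1", "G2", 2, 40)],
)
line_count = conn.execute(
    "SELECT COUNT(*) FROM CO_SUB WHERE CO_ID = 'A1'"
).fetchone()[0]
print(line_count)
```

The composite primary key on CO_SUB assumes that a commodity appears at most once per order, which keeps the later line-by-line comparison simple.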
Objective: find identical orders in massive order data, that is, orders whose commodities and commodity quantities are the same.
The algorithm steps are shown in Fig. 3.
1: Determine the grouping indicators:
Master-table indicators: order total amount, order total quantity.
Slave-table indicators: number of commodity types in the order, order commodity quantity, order commodity amount.
Final grouping criteria:
1) order total amount + order total quantity
2) order total amount + order total quantity + number of commodity types in the order
3) order commodity quantity + order commodity amount
2: Determine the grouping execution order:
1) order total amount + order total quantity
2) order total amount + order total quantity + number of commodity types in the order
3) order commodity quantity + order commodity amount
3: Execute the grouping comparison step by step, following the grouping order
A: Grouping by order total amount + order total quantity, then by order total amount + order total quantity + number of commodity types in the order
A two-level nested GROUP BY finds the orders whose total amount, total quantity and number of commodity types are all identical, and stores them in maysamelist (step B appears to refer to this list as maycolist).
Here CO_COUNT is the number of orders in a group, and CO_COUNT_NUM1 is the sequence number of an order within its group.
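The query for step A is not given in the text, so the following is one possible sketch under stated assumptions. The inner GROUP BY counts commodity lines per order (GOOD_COUNT); the outer GROUP BY keeps only the combinations of total amount, total quantity and GOOD_COUNT shared by more than one order. The columns TOTAL_AMT, TOTAL_QTY, GOOD_ID, QTY and AMT and the function name `build_maysamelist` are hypothetical; CO_COUNT_NUM1 is numbered in Python here rather than in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CO_MAIN (CO_ID TEXT PRIMARY KEY, TOTAL_QTY INTEGER, TOTAL_AMT INTEGER);
    CREATE TABLE CO_SUB (CO_ID TEXT, GOOD_ID TEXT, QTY INTEGER, AMT INTEGER);
    INSERT INTO CO_MAIN VALUES ('A1', 3, 50), ('A2', 3, 50), ('A3', 3, 50), ('A4', 1, 10);
    INSERT INTO CO_SUB VALUES
        ('A1', 'G1', 1, 10), ('A1', 'G2', 2, 40),
        ('A2', 'G1', 1, 10), ('A2', 'G2', 2, 40),
        ('A3', 'G1', 2, 30), ('A3', 'G3', 1, 20),
        ('A4', 'G1', 1, 10);
    """
)

CANDIDATE_SQL = """
WITH per_order AS (              -- inner grouping: commodity lines per order
    SELECT m.CO_ID, m.TOTAL_AMT, m.TOTAL_QTY, COUNT(s.GOOD_ID) AS GOOD_COUNT
    FROM CO_MAIN m JOIN CO_SUB s ON s.CO_ID = m.CO_ID
    GROUP BY m.CO_ID, m.TOTAL_AMT, m.TOTAL_QTY
),
dup_keys AS (                    -- outer grouping: keys shared by several orders
    SELECT TOTAL_AMT, TOTAL_QTY, GOOD_COUNT, COUNT(*) AS CO_COUNT
    FROM per_order
    GROUP BY TOTAL_AMT, TOTAL_QTY, GOOD_COUNT
    HAVING COUNT(*) > 1
)
SELECT p.CO_ID, p.GOOD_COUNT, k.CO_COUNT, p.TOTAL_AMT, p.TOTAL_QTY
FROM per_order p JOIN dup_keys k
  ON k.TOTAL_AMT = p.TOTAL_AMT AND k.TOTAL_QTY = p.TOTAL_QTY
 AND k.GOOD_COUNT = p.GOOD_COUNT
ORDER BY p.TOTAL_AMT, p.TOTAL_QTY, p.GOOD_COUNT, p.CO_ID
"""

def build_maysamelist(connection):
    """Return candidate orders, numbered 1..CO_COUNT within each group."""
    result, prev_key, num = [], None, 0
    for co_id, good_count, co_count, amt, qty in connection.execute(CANDIDATE_SQL):
        key = (amt, qty, good_count)
        num = num + 1 if key == prev_key else 1   # restart numbering per group
        prev_key = key
        result.append({"CO_ID": co_id, "GOOD_COUNT": good_count,
                       "CO_COUNT": co_count, "CO_COUNT_NUM1": num})
    return result

maysamelist = build_maysamelist(conn)
for row in maysamelist:
    print(row)
```

In the sample data, order A4 is eliminated at this stage because no other order shares its totals. Order A3 survives as a candidate even though its commodity lines differ, which is what the finer comparison in step B is for.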
B: Grouping by order commodity quantity + order commodity amount
Loop over maycolist and examine each subgroup submaycolist: by calling the common method that judges whether two orders are identical, determine whether the subgroup contains identical orders, and store identical orders in SAME_CO_MAIN and SAME_CO_SUB. The specific algorithm:
for (maycolist: cut out one subgroup, i.e. one group of possibly identical orders, according to CO_COUNT_NUM1; stop when the next 1 is reached)
{
1: obtain submaycolist, a list holding co_id = CO_ID and goodcount = GOOD_COUNT
2: pass submaycolist and goodcount to the method that judges whether a group of orders contains identical orders; internally it recursively calls the method that judges whether two orders are identical
3: for (submaycolist) {
3.1 call the method that judges whether two orders are identical:
twocossame(coid1, coid2, goodcount)
3.2 if the result is T, check whether the two orders are already in SAME_CO_SUB:
1) neither coid1 nor coid2 is present: store both in SAME_CO_MAIN and SAME_CO_SUB;
2) one is present: store the other in SAME_CO_SUB;
3) both are present: do nothing
}
}
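The loop above can be sketched in Python as follows. This is an illustration under assumptions, not the patent's implementation: the function names `split_subgroups` and `find_same_orders` are hypothetical, a Python set stands in for the SAME_CO_MAIN and SAME_CO_SUB tables, every pair within a subgroup is compared, and the comparison function is passed in as a parameter.

```python
from itertools import combinations

def split_subgroups(maycolist):
    """Cut the flat list into subgroups; CO_COUNT_NUM1 == 1 starts a new one."""
    subgroups = []
    for row in maycolist:
        if row["CO_COUNT_NUM1"] == 1 or not subgroups:
            subgroups.append([])
        subgroups[-1].append(row)
    return subgroups

def find_same_orders(maycolist, twocossame):
    """Return the IDs of all orders that have at least one identical partner."""
    same_co = set()  # stands in for SAME_CO_MAIN / SAME_CO_SUB
    for submaycolist in split_subgroups(maycolist):
        goodcount = submaycolist[0]["GOOD_COUNT"]
        for a, b in combinations(submaycolist, 2):
            if twocossame(a["CO_ID"], b["CO_ID"], goodcount):
                # A set ignores IDs it already holds, which covers the three
                # cases in the text (neither, one, or both already stored).
                same_co.update((a["CO_ID"], b["CO_ID"]))
    return same_co

# Sample data: two candidate groups produced by step A.
maycolist = [
    {"CO_ID": "A1", "GOOD_COUNT": 2, "CO_COUNT": 3, "CO_COUNT_NUM1": 1},
    {"CO_ID": "A2", "GOOD_COUNT": 2, "CO_COUNT": 3, "CO_COUNT_NUM1": 2},
    {"CO_ID": "A3", "GOOD_COUNT": 2, "CO_COUNT": 3, "CO_COUNT_NUM1": 3},
    {"CO_ID": "B1", "GOOD_COUNT": 1, "CO_COUNT": 2, "CO_COUNT_NUM1": 1},
    {"CO_ID": "B2", "GOOD_COUNT": 1, "CO_COUNT": 2, "CO_COUNT_NUM1": 2},
]
# In-memory commodity lines (commodity, quantity, amount), illustrative only.
items = {
    "A1": {("G1", 1, 10), ("G2", 2, 40)},
    "A2": {("G1", 1, 10), ("G2", 2, 40)},
    "A3": {("G1", 2, 30), ("G3", 1, 20)},
    "B1": {("G9", 1, 5)},
    "B2": {("G9", 2, 5)},
}
same_orders = find_same_orders(maycolist, lambda a, b, n: items[a] == items[b])
print(sorted(same_orders))
```

Because comparisons happen only inside a subgroup, the number of pairwise checks depends on the subgroup sizes rather than on the total number of orders, which is the point of narrowing the range before comparing.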
C: The method that judges whether two orders are identical: twocossame(coid1, coid2, goodcount)
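The body of twocossame is not given in the text. One possible reading, offered only as a sketch, is to count the commodity lines of the first order that have an exact match in the second order and compare that count with goodcount. The CO_SUB columns GOOD_ID, QTY and AMT are assumed names, and the extra `conn` parameter is an addition for this example; the sketch also assumes that a commodity appears at most once per order and that both orders are already known to have goodcount lines.

```python
import sqlite3

def twocossame(conn, coid1, coid2, goodcount):
    """Return True if every commodity line of coid1 has an exact match in coid2."""
    matched = conn.execute(
        """
        SELECT COUNT(*)
        FROM CO_SUB a JOIN CO_SUB b
          ON b.GOOD_ID = a.GOOD_ID AND b.QTY = a.QTY AND b.AMT = a.AMT
        WHERE a.CO_ID = ? AND b.CO_ID = ?
        """,
        (coid1, coid2),
    ).fetchone()[0]
    # Both orders have goodcount lines, so goodcount matches means all lines match.
    return matched == goodcount

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CO_SUB (CO_ID TEXT, GOOD_ID TEXT, QTY INTEGER, AMT INTEGER,
                         PRIMARY KEY (CO_ID, GOOD_ID));
    INSERT INTO CO_SUB VALUES
        ('A1', 'G1', 1, 10), ('A1', 'G2', 2, 40),
        ('A2', 'G1', 1, 10), ('A2', 'G2', 2, 40),
        ('A3', 'G1', 2, 30), ('A3', 'G3', 1, 20);
    """
)
same_12 = twocossame(conn, "A1", "A2", 2)
same_13 = twocossame(conn, "A1", "A3", 2)
print(same_12, same_13)
```

Passing goodcount in avoids recounting the lines of each order for every pair, since step A has already computed it for the whole subgroup.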
The retrieval of identical order data in this example can be applied to an enterprise's management of goods diversion (unauthorized reselling across sales channels). Goods diversion disrupts the market order of the enterprise's products, causes infighting in the market and price confusion, and seriously damages the manufacturer's reputation. Management and control analysis of goods diversion relies on analyzing orders. One form of analysis is to find identical orders among massive orders and then use them to judge whether diversion is being caused by malicious order brushing, false orders, or internal staff colluding to resell goods. Ultimately this strengthens the enterprise's management and control capability, builds a better market environment for the enterprise, and improves its competitiveness.
Claims (2)
1. An algorithm for retrieving identical master-slave relation data from big data based on a relational database, being an algorithm for comparing data within massive data sets, characterized in that it follows the principle of "reduce large to small, surface first and then point" and uses group traversal, intermediate-table storage and similar techniques to narrow the comparison range step by step and retrieve identical records efficiently.
2. The algorithm for retrieving identical master-slave relation data from big data based on a relational database according to claim 1, characterized in that the grouping criteria are extracted from the master and slave tables, the grouping order is determined, and the grouping is executed, and in that, while the grouping is executed, traversal and intermediate-table storage are combined to narrow the data range step by step.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510810811.2A CN105447137A (en) | 2015-11-23 | 2015-11-23 | Algorithm for retrieving same master-slave relation data from big data based on relational database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510810811.2A CN105447137A (en) | 2015-11-23 | 2015-11-23 | Algorithm for retrieving same master-slave relation data from big data based on relational database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105447137A (en) | 2016-03-30 |
Family
ID=55557314
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510810811.2A Pending CN105447137A (en) | 2015-11-23 | 2015-11-23 | Algorithm for retrieving same master-slave relation data from big data based on relational database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105447137A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106779126A (en) * | 2016-12-30 | 2017-05-31 | 中国民航信息网络股份有限公司 | Malice accounts for the processing method and system of an order |
CN107291908A (en) * | 2017-06-26 | 2017-10-24 | 浪潮软件股份有限公司 | Cross-database mass data comparison method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102479223B (en) | Data query method and system | |
US10162857B2 (en) | Optimized inequality join method | |
US9158812B2 (en) | Enhancing parallelism in evaluation ranking/cumulative window functions | |
CN106104525B (en) | Event processing system | |
Chai et al. | Crowdsourcing database systems: Overview and challenges | |
US10565201B2 (en) | Query processing management in a database management system | |
Liu et al. | Efficient distributed query processing in large RFID-enabled supply chains | |
US9390129B2 (en) | Scalable and adaptive evaluation of reporting window functions | |
CN102968420A (en) | Database query method and system | |
CN110222029A (en) | A kind of big data multidimensional analysis computational efficiency method for improving and system | |
CN103176974A (en) | Method and device used for optimizing access path in data base | |
US9135630B2 (en) | Systems and methods for large-scale link analysis | |
WO2021036452A1 (en) | Real-time data deduplication counting method and device | |
Giannakouris et al. | MuSQLE: Distributed SQL query execution over multiple engine environments | |
US10496645B1 (en) | System and method for analysis of a database proxy | |
CN112015741A (en) | Method and device for storing massive data in different databases and tables | |
US20180341679A1 (en) | Selectivity Estimation For Database Query Planning | |
Tank et al. | Speeding ETL processing in data warehouses using high-performance joins for changed data capture (cdc) | |
US11726975B2 (en) | Auto unload | |
KR20180077830A (en) | Processing method for a relational query in distributed stream processing engine based on shared-nothing architecture, recording medium and device for performing the method | |
CN105447137A (en) | Algorithm for retrieving same master-slave relation data from big data based on relational database | |
US8832157B1 (en) | System, method, and computer-readable medium that facilitates efficient processing of distinct counts on several columns in a parallel processing system | |
Wang et al. | A hybrid index for temporal big data | |
CN106339432A (en) | System and method for balancing load according to content to be inquired | |
CN115391424A (en) | Database query processing method, storage medium and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20160330 |