CN104601732A - Method for merging multichannel data quickly - Google Patents
- Publication number
- CN104601732A
- Authority
- CN
- China
- Prior art keywords
- data
- data set
- heap
- raw data
- multichannel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Traffic Control Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method for merging multichannel data quickly. The method includes step 1, collecting a multichannel original data set; step 2, merging the multichannel original data set according to a heapsort technology by a computer; step 3, outputting a merged data stream. By the method, merging efficiency of the multichannel ordered original data set can be improved.
Description
Technical field
The present invention relates to a data merging method. More specifically, it relates to a method for quickly merging multichannel data.
Background art
As portable devices and smart home devices of all kinds become increasingly widespread, they are gradually replacing the traditional PC and changing the way people live. The sources, types, and volume of data have all grown explosively, and information overload is increasingly common. How to retrieve the data a user wants from this mass of data quickly and effectively is a problem that urgently needs to be solved in the big data era.
Summary of the invention
One object of the present invention is to solve at least the problems described above and to provide at least the advantages described below.
A further object of the present invention is to merge multichannel data using heapsort techniques, which yields higher merging efficiency for ordered multichannel data.
To achieve these objects and other advantages, the present invention provides a method for quickly merging multichannel data, comprising:
Step 1: collecting multichannel raw data sets;
Step 2: merging the multichannel raw data sets by a computer using heapsort techniques;
Step 3: outputting the merged data stream.
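The three steps above can be sketched in a few lines of Python. This is only an illustrative baseline under stated assumptions, not the patented procedure: it assumes each channel is an ascending iterable of integers, and the names `merge_sorted_channels` and `_END` are hypothetical. It uses the standard `heapq` module to hold one head element per channel.

```python
import heapq
from typing import Iterable, Iterator, List

_END = object()  # sentinel marking an exhausted channel (hypothetical helper)


def merge_sorted_channels(channels: List[Iterable[int]]) -> Iterator[int]:
    """Merge ascending channels into one ascending stream using a min-heap."""
    # Step 1: collect the channels and read the first element of each.
    iterators = [iter(ch) for ch in channels]
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, _END)
        if first is not _END:
            heap.append((first, idx))
    heapq.heapify(heap)

    # Step 2: repeatedly take the smallest head and refill from its channel.
    while heap:
        value, idx = heap[0]
        yield value  # Step 3: emit the merged stream element by element.
        following = next(iterators[idx], _END)
        if following is _END:
            heapq.heappop(heap)  # this channel is finished
        else:
            heapq.heapreplace(heap, (following, idx))
```

For example, `list(merge_sorted_channels([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))` yields the values 1 through 9 in order. The heap never holds more than one entry per channel, so memory use depends on the number of channels rather than on the total data volume.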
Preferably, in the method, the elements in each raw data set in step 1 are arranged in order.
Preferably, in the method, when the elements in each channel's raw data set are in ascending order, the merge operation comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; otherwise, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a min-heap and output the heap-top element;
D. If the raw data set from which the heap-top element came has no remaining data, swap the heap-top element with the last node and decrement the total number of channels by 1. If that raw data set still has data, take the item that follows the heap-top element in that data set; if this item is smaller than both the left and right children of the heap-top element, output it directly, and continue in this way until all data in that channel has been taken or an item larger than the left or right child is found;
E. If an item larger than the left or right child is found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
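Steps A to F can be read as the following Python sketch. It is one possible interpretation of the text, not a definitive implementation: the function names `merge_ascending` and `_sift_down`, the node layout `[value, channel, position]`, the skipping of empty channels, and the handling of ties (an item equal to a child goes back through the heap) are all assumptions made for illustration.

```python
from typing import Iterator, List, Sequence


def _sift_down(heap: List[list], start: int) -> None:
    """Restore the min-heap property for the subtree rooted at `start`."""
    size, i = len(heap), start
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < size and heap[left][0] < heap[smallest][0]:
            smallest = left
        if right < size and heap[right][0] < heap[smallest][0]:
            smallest = right
        if smallest == i:
            return
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest


def merge_ascending(channels: Sequence[Sequence[int]]) -> Iterator[int]:
    # Step A: count the channels (empty channels are skipped; an assumption).
    live = [ch for ch in channels if len(ch) > 0]
    # Step B: a single channel is already the merged result.
    if len(live) == 1:
        yield from live[0]
        return
    # Step B (continued): one node per channel, holding its first element.
    heap = [[ch[0], ch, 0] for ch in live]
    # Step C: adjust the initial heap into a min-heap.
    for i in range(len(heap) // 2 - 1, -1, -1):
        _sift_down(heap, i)
    while heap:
        value, ch, pos = heap[0]
        yield value  # Step C: output the heap-top element.
        pos += 1
        # Step D: output following items directly while they are smaller
        # than both children of the heap top (no heap adjustment needed).
        while pos < len(ch) and all(
            ch[pos] < heap[c][0] for c in (1, 2) if c < len(heap)
        ):
            yield ch[pos]
            pos += 1
        if pos == len(ch):
            # Step D: channel exhausted, move the last node to the top.
            last = heap.pop()  # channel count decreases by 1 (step F ends at 0)
            if heap:
                heap[0] = last
                _sift_down(heap, 0)
        else:
            # Step E: the item goes to the heap top, then the heap is adjusted.
            heap[0] = [ch[pos], ch, pos]
            _sift_down(heap, 0)
```

Merging `[[1, 2, 3, 10], [4, 5, 6], [7, 8, 9, 11]]` with this sketch outputs 2 and 3 directly after 1, because both are smaller than the two children (4 and 7), and only 10 is pushed back through the heap. That direct-output path is where the comparisons are saved when channels contain long runs of consecutive values.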
Preferably, in the method, when the elements in each channel's raw data set are in descending order, the merge operation comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; otherwise, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a max-heap and output the heap-top element;
D. If the raw data set from which the heap-top element came has no remaining data, swap the heap-top element with the last node and decrement the total number of channels by 1. If that raw data set still has data, take the item that follows the heap-top element in that data set; if this item is larger than both the left and right children of the heap-top element, output it directly, and continue in this way until all data in that channel has been taken or an item smaller than the left or right child is found;
E. If an item smaller than the left or right child is found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
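The descending case mirrors the ascending one with a max-heap. As a quick way to see the expected behavior, the sketch below swaps in Python's standard `heapq.merge` with `reverse=True`, which is a library routine and not the procedure described in steps A to F. The wrapper name `merge_descending` is hypothetical, and each channel is assumed to be sorted from largest to smallest.

```python
import heapq
from typing import Iterable, List


def merge_descending(channels: Iterable[Iterable[int]]) -> List[int]:
    """Merge channels that are each sorted in descending order."""
    # reverse=True tells heapq.merge that the inputs run from largest to
    # smallest, so the largest remaining head is emitted first.
    return list(heapq.merge(*channels, reverse=True))
```

Calling `merge_descending([[9, 6, 3], [8, 5, 2], [7, 4, 1]])` returns the values 9 down to 1.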
Preferably, in the method, the raw data sets come from external files, other machines, and memory.
Preferably, in the method, for external sorting the raw data sets are obtained from external files; for distributed processing they are obtained from the network; and for in-memory sorting on a single machine they are obtained from memory.
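One way to treat these three kinds of source uniformly is to wrap each one as an iterator before merging. The sketch below is an assumption about how that could look, not something the text specifies: `from_text_stream` and `from_memory` are hypothetical helpers, the one-integer-per-line format is invented for illustration, and `io.StringIO` objects stand in for an external file and for a network stream.

```python
import heapq
import io
from typing import Iterable, Iterator, TextIO


def from_text_stream(stream: TextIO) -> Iterator[int]:
    """Yield one integer per non-empty line of a file-like object."""
    for line in stream:
        line = line.strip()
        if line:
            yield int(line)


def from_memory(values: Iterable[int]) -> Iterator[int]:
    """Expose an in-memory sorted collection as a channel."""
    return iter(values)


# Stand-ins for the three source types named in the text.
file_like = io.StringIO("1\n4\n7\n")     # external file (external sorting)
network_like = io.StringIO("2\n5\n8\n")  # data received over the network
in_memory = [3, 6, 9]                    # data already in memory

merged = list(
    heapq.merge(
        from_text_stream(file_like),
        from_text_stream(network_like),
        from_memory(in_memory),
    )
)
```

Because every channel is consumed lazily, only one pending element per channel needs to be held at a time, which is what makes the same merge usable for data too large to load at once.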
Preferably, in the method, the data ranges of the raw data sets of all channels do not overlap.
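When the ranges are disjoint, the merged stream is simply the channels placed end to end in order of their first elements. The sketch below illustrates that special case under the assumption of ascending, in-memory channels; the name `merge_disjoint` and the overlap check are hypothetical additions, not part of the described method.

```python
from itertools import chain
from typing import List, Sequence


def merge_disjoint(channels: Sequence[Sequence[int]]) -> List[int]:
    """Concatenate ascending channels whose value ranges do not overlap."""
    non_empty = [ch for ch in channels if len(ch) > 0]
    # Order the channels by their smallest (first) element.
    ordered = sorted(non_empty, key=lambda ch: ch[0])
    # Verify the non-overlap assumption before relying on it.
    for earlier, later in zip(ordered, ordered[1:]):
        if earlier[-1] > later[0]:
            raise ValueError("channel ranges overlap; use a heap-based merge")
    return list(chain.from_iterable(ordered))
```

Under the same assumption, the direct-output loop of step D would be expected to drain a whole channel before the heap is touched again, since every remaining item in the current channel is smaller than the heads of all other channels.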
The present invention provides at least the following benefit: because heapsort techniques are used to merge ordered multichannel data, the number of comparisons between elements of the different channels is effectively reduced and merging efficiency is significantly improved.
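The reduction in comparisons can be made concrete with a standard worst-case estimate. The symbols are introduced here for illustration and do not appear in the original text: n is the total number of elements across all channels and k is the number of channels. A linear scan over the channel heads is compared with a textbook binary heap, in which each adjustment from the top costs at most two comparisons per level.

```latex
% n = total number of elements, k = number of channels (illustrative symbols)
\begin{align*}
C_{\text{scan}} &\le n\,(k-1), \\
C_{\text{heap}} &\le 2\,n\,\lfloor \log_2 k \rfloor + O(k).
\end{align*}
% Example: for k = 1024 channels, the scan needs up to 1023 comparisons per
% output element, while the heap needs at most about 2 * 10 = 20.
```

The O(k) term accounts for building the initial heap. The direct-output path of step D can lower the heap figure further, since an item that is smaller than both children of the heap top costs only those two comparisons.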
Other advantages, objects, and features of the present invention will be set forth in part in the description below, and in part will be understood by those skilled in the art through study and practice of the invention.
Brief description of the drawings
Fig. 1 is an overall structure diagram of the merge operation;
Fig. 2 is a flowchart of multichannel data merging in the method of the present invention when the data in the raw data sets are in ascending order;
Fig. 3 is a flowchart of multichannel data merging in the method of the present invention when the data in the raw data sets are in descending order.
Embodiments
The present invention is described in further detail below with reference to the accompanying drawings, so that those skilled in the art can implement it by referring to the text of the specification.
Fig. 1 shows the overall structure of the merge operation. SD1, SD2, SD3, and SDN denote N ordered raw data sets. The data may reside on the disks or in the memory of one or more computers, so the method is suitable for both local and distributed use. Merge denotes the merge operation to be performed, and the arrows indicate the direction of data flow.
Fig. 2 shows the flowchart of multichannel data merging in the method when the data in the raw data sets are in ascending order. In the figure, Y means yes and N means no.
Fig. 3 shows the flowchart of multichannel data merging in the method when the data in the raw data sets are in descending order. In the figure, Y means yes and N means no.
As shown in Fig. 1, the method for quickly merging multichannel data of the present invention comprises:
Step 1: collecting multichannel raw data sets;
Step 2: merging the multichannel raw data sets by a computer using heapsort techniques;
Step 3: outputting the merged data stream. By using heapsort techniques, this technical solution can quickly merge data even when the data volume is large and the number of raw data channels is high.
As shown in Figs. 1, 2, and 3, the elements in each raw data set are arranged in order. When there are many channels, having the elements of each channel's raw data set in order rapidly reduces the number of comparisons between elements, which makes the method efficient.
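The effect of channel count on the number of comparisons can be observed with a small experiment. The sketch below is an illustration under stated assumptions rather than a measurement of the patented procedure: it uses Python's standard `heapq.merge` as the heap-based merge, compares it with a plain linear scan over the channel heads, and counts `<` comparisons through a hypothetical wrapper class `Counted`. The channel count `k` and channel length `n` are arbitrary choices.

```python
import heapq


class Counted:
    """Wraps a value and counts every `<` comparison made on it."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def count_heap_merge(channels):
    """Merge with heapq.merge and report how many comparisons it used."""
    Counted.comparisons = 0
    wrapped = [[Counted(v) for v in ch] for ch in channels]
    out = [item.value for item in heapq.merge(*wrapped)]
    return out, Counted.comparisons


def count_linear_scan(channels):
    """Merge by scanning all channel heads for the minimum each time."""
    Counted.comparisons = 0
    wrapped = [[Counted(v) for v in ch] for ch in channels]
    pos = [0] * len(wrapped)
    out = []
    while True:
        best = None
        for i, ch in enumerate(wrapped):
            if pos[i] < len(ch) and (
                best is None or ch[pos[i]] < wrapped[best][pos[best]]
            ):
                best = i
        if best is None:
            break
        out.append(wrapped[best][pos[best]].value)
        pos[best] += 1
    return out, Counted.comparisons


k, n = 64, 50  # 64 channels of 50 elements each (arbitrary sizes)
channels = [list(range(c, c + k * n, k)) for c in range(k)]
heap_out, heap_cmp = count_heap_merge(channels)
scan_out, scan_cmp = count_linear_scan(channels)
print(f"heap merge: {heap_cmp} comparisons, linear scan: {scan_cmp} comparisons")
```

Both functions produce the same merged output. With the interleaved channels used here, the heap-based merge needs far fewer comparisons than the scan, and the gap widens as `k` grows.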
As shown in Fig. 2, when the elements in each channel's raw data set are in ascending order, the merge operation comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; otherwise, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a min-heap and output the heap-top element;
D. If the raw data set from which the heap-top element came has no remaining data, swap the heap-top element with the last node and decrement the total number of channels by 1. If that raw data set still has data, take the item that follows the heap-top element in that data set; if this item is smaller than both the left and right children of the heap-top element, output it directly, and continue in this way until all data in that channel has been taken or an item larger than the left or right child is found;
E. If an item larger than the left or right child is found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0. This technical solution quickly outputs the elements of each channel's raw data set step by step in ascending order and merges them into a larger ascending data stream for the user.
As shown in Fig. 3, when the elements in each channel's raw data set are in descending order, the merge operation comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; otherwise, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a max-heap and output the heap-top element;
D. If the raw data set from which the heap-top element came has no remaining data, swap the heap-top element with the last node and decrement the total number of channels by 1. If that raw data set still has data, take the item that follows the heap-top element in that data set; if this item is larger than both the left and right children of the heap-top element, output it directly, and continue in this way until all data in that channel has been taken or an item smaller than the left or right child is found;
E. If an item smaller than the left or right child is found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0. This technical solution quickly outputs the elements of each channel's raw data set step by step in descending order and merges them into a larger descending data stream for the user.
The raw data sets come from external files, other machines, and memory. This technical solution obtains all data as completely as possible, so that the merged data stream is comprehensive.
For external sorting, the raw data sets are obtained from external files; for distributed processing, they are obtained from the network; for in-memory sorting on a single machine, they are obtained from memory. In this scheme, each type of raw data set is sorted before the merge operation, which improves the efficiency of the heapsort process.
The data ranges of the raw data sets of all channels do not overlap. This technical solution saves sorting time and makes the merge process faster.
The processing scale described here is used to simplify the explanation of the present invention. Applications, modifications, and variations of the multichannel data merging method of the present invention will be apparent to those skilled in the art.
As described above, according to the present invention, because a heapsort-based merge operation is performed on ordered multichannel raw data sets, fast merging is achieved.
Although embodiments of the present invention have been disclosed above, the invention is not limited to the applications listed in the specification and embodiments. It can be applied in any field suited to it, and those skilled in the art can readily make further modifications. Therefore, without departing from the general concept defined by the claims and their equivalents, the present invention is not limited to the specific details or to the illustrations shown and described here.
Claims (7)
1. A method for quickly merging multichannel data, characterized by comprising:
Step 1: collecting multichannel raw data sets;
Step 2: merging the multichannel raw data sets by a computer using heapsort techniques;
Step 3: outputting the merged data stream.
2. The method for quickly merging multichannel data according to claim 1, characterized in that, in step 1, the elements in each raw data set are arranged in order.
3. The method for quickly merging multichannel data according to claim 2, characterized in that, when the elements in each channel's raw data set are in ascending order, the merge operation comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; otherwise, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a min-heap and output the heap-top element;
D. If the raw data set from which the heap-top element came has no remaining data, swap the heap-top element with the last node and decrement the total number of channels by 1. If that raw data set still has data, take the item that follows the heap-top element in that data set; if this item is smaller than both the left and right children of the heap-top element, output it directly, and continue in this way until all data in that channel has been taken or an item larger than the left or right child is found;
E. If an item larger than the left or right child is found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
4. The method for quickly merging multichannel data according to claim 2, characterized in that, when the elements in each channel's raw data set are in descending order, the merge operation comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; otherwise, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a max-heap and output the heap-top element;
D. If the raw data set from which the heap-top element came has no remaining data, swap the heap-top element with the last node and decrement the total number of channels by 1. If that raw data set still has data, take the item that follows the heap-top element in that data set; if this item is larger than both the left and right children of the heap-top element, output it directly, and continue in this way until all data in that channel has been taken or an item smaller than the left or right child is found;
E. If an item smaller than the left or right child is found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
5. The method for quickly merging multichannel data according to claim 1, characterized in that the raw data sets come from external files, the network, and memory.
6. The method for quickly merging multichannel data according to claim 5, characterized in that:
for external sorting, the raw data sets are obtained from external files;
for distributed processing, the raw data sets are obtained from the network;
for in-memory sorting on a single machine, the raw data sets are obtained from memory.
7. The method for quickly merging multichannel data according to claim 1, characterized in that the data ranges of the raw data sets of all channels do not overlap.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510076043.2A CN104601732B (en) | 2015-02-12 | 2015-02-12 | A kind of quick method for realizing multichannel data merger |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510076043.2A CN104601732B (en) | 2015-02-12 | 2015-02-12 | A kind of quick method for realizing multichannel data merger |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104601732A (en) | 2015-05-06 |
CN104601732B (en) | 2018-01-23 |
Family
ID=53127225
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510076043.2A Active CN104601732B (en) | 2015-02-12 | 2015-02-12 | A kind of quick method for realizing multichannel data merger |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104601732B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104850618A (en) * | 2015-05-18 | 2015-08-19 | 北京京东尚科信息技术有限公司 | System and method for providing sorted data |
CN107908714A (en) * | 2017-11-10 | 2018-04-13 | 上海达梦数据库有限公司 | A kind of aggregation of data sort method and device |
CN110377642A (en) * | 2019-07-24 | 2019-10-25 | 杭州太尼科技有限公司 | A kind of device of quick obtaining ordered sequence data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070088699A1 (en) * | 2005-10-18 | 2007-04-19 | Edmondson James R | Multiple Pivot Sorting Algorithm |
CN102609490A (en) * | 2012-01-20 | 2012-07-25 | 东华大学 | Column-storage-oriented B+ tree index method for DWMS (data warehouse management system) |
CN102968496A (en) * | 2012-12-04 | 2013-03-13 | 天津神舟通用数据技术有限公司 | Parallel sequencing method based on task derivation and double buffering mechanism |
CN103716237A (en) * | 2013-12-25 | 2014-04-09 | 广东天拓资讯科技有限公司 | Path-finding method and device utilizing binary heap sorting |
- 2015-02-12: CN application CN201510076043.2A, patent CN104601732B, status: Active
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104850618A (en) * | 2015-05-18 | 2015-08-19 | 北京京东尚科信息技术有限公司 | System and method for providing sorted data |
CN104850618B (en) * | 2015-05-18 | 2018-06-01 | 北京京东尚科信息技术有限公司 | A kind of system and method that ordered data is provided |
CN107908714A (en) * | 2017-11-10 | 2018-04-13 | 上海达梦数据库有限公司 | A kind of aggregation of data sort method and device |
CN107908714B (en) * | 2017-11-10 | 2021-05-04 | 上海达梦数据库有限公司 | Data merging and sorting method and device |
CN110377642A (en) * | 2019-07-24 | 2019-10-25 | 杭州太尼科技有限公司 | A kind of device of quick obtaining ordered sequence data |
Also Published As
Publication number | Publication date |
---|---|
CN104601732B (en) | 2018-01-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7453143B2 (en) | Data storage and query methods and devices | |
CN1955958A (en) | Sort data storage and split catalog inquiry method based on catalog tree | |
CN105205105A (en) | Data ETL (Extract Transform Load) system based on storm and treatment method based on storm | |
CN104601732A (en) | Method for merging multichannel data quickly | |
CN105389402A (en) | Big-data-oriented ETL (Extraction-Transformation-Loading) method and device | |
JP2007531115A5 (en) | ||
TWI709049B (en) | Random walk, cluster-based random walk method, device and equipment | |
CN103942245A (en) | Data extracting method based on metadata | |
CN106033442B (en) | A kind of parallel breadth first search method based on shared drive architecture | |
CN101030230A (en) | Image searching method and system | |
CN102902742A (en) | Spatial data partitioning method in cloud environment | |
CN103064841A (en) | Retrieval device and retrieval method | |
CN106250110A (en) | Set up the method and device of model | |
CN103064991A (en) | Mass data clustering method | |
CN105005638B (en) | A kind of High Level Synthesis dispatching method based on linear delay model | |
WO2016206377A1 (en) | Data integration and processing method and device | |
CN104573002A (en) | Data organization model based on human, affair and object classification filing | |
CN104182208A (en) | Method and system utilizing cracking rule to crack password | |
CN104376055B (en) | A kind of large-sized model data comparing method based on allocation methods | |
CN103092630B (en) | Interface data output unit and interface data output intent | |
CN105447142A (en) | Dual-mode agricultural scientific and technical achievement classification method and system | |
CN106528739B (en) | A kind of method for building up in Digital Dyeing picture material big data warehouse | |
CN104573101A (en) | System and method for real-time data stream classification on basis of rule routes | |
CN104932982A (en) | Message access memory compiling method and related apparatus | |
CN105515818B (en) | The method and system of cyclic structure are split in a kind of network topology layout |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |