CN104601732A - Method for merging multichannel data quickly - Google Patents


Info

Publication number
CN104601732A
Authority
CN
China
Prior art keywords
data
data set
heap
raw data
multichannel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510076043.2A
Other languages
Chinese (zh)
Other versions
CN104601732B (en)
Inventor
杨爱民
张天祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jinher Software Co Ltd
Original Assignee
Beijing Jinher Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jinher Software Co Ltd
Priority to CN201510076043.2A
Publication of CN104601732A
Application granted
Publication of CN104601732B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for quickly merging multichannel data. The method includes: step 1, collecting multichannel raw data sets; step 2, merging the multichannel raw data sets by computer using a heap-sort technique; step 3, outputting the merged data stream. The method improves the efficiency of merging multichannel ordered raw data sets.

Description

A method for quickly merging multichannel data
Technical Field
The present invention relates to a data merging method. More particularly, the present invention relates to a method for quickly merging multichannel data.
Background
As portable devices and smart home devices of all kinds become increasingly widespread and gradually replace the traditional PC in shaping how people live, the sources, types, and volume of data have grown explosively, and information overload is increasingly common. How to quickly and effectively obtain the data a user wants from this mass of complex data is a problem that urgently needs solving in the big data era.
Summary of the Invention
One object of the present invention is to solve at least the above problems and to provide at least the advantages described below.
A further object of the present invention is to merge multichannel data using a heap-sort technique, which gives higher merging efficiency for multichannel ordered data.
To achieve these objects and other advantages of the present invention, a method for quickly merging multichannel data is provided, comprising:
Step 1: collect multichannel raw data sets;
Step 2: merge the multichannel raw data sets by computer using a heap-sort technique;
Step 3: output the merged data stream.
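As a point of reference, the three steps can be sketched with Python's standard-library `heapq.merge`, which performs a heap-based merge of inputs that are already sorted. This is only an illustration of the general idea and not the procedure claimed in the patent; the function name `merge_channels` and the sample data are assumptions made for the example.

```python
import heapq
from typing import Iterable, Iterator, List


def merge_channels(channels: List[Iterable[int]]) -> Iterator[int]:
    """Lazily merge already-sorted (ascending) channels into one sorted stream."""
    # heapq.merge keeps one pending element per channel in a heap and
    # repeatedly yields the smallest one.
    return heapq.merge(*channels)


# Step 1: collect the raw data sets (sample data, assumed for illustration).
raw_data_sets = [[1, 5, 9], [2, 6, 10], [3, 4, 11]]
# Step 2: merge them. Step 3: consume the merged stream.
merged = list(merge_channels(raw_data_sets))
print(merged)
```

Because the result is an iterator, the merged stream can be consumed element by element without first loading every channel into memory.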
Preferably, in the method for quickly merging multichannel data, the elements in each raw data set in step 1 are arranged in order.
Preferably, in the method for quickly merging multichannel data, when the elements in each channel's raw data set are in ascending order, the merge operation specifically comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; if the total number of channels is not 1, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a min-heap and output the heap-top element;
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrement the total number of channels by 1; if that raw data set still has data, take the data item that follows the heap-top element, and if it is smaller than both the left and right children of the heap-top element, output it directly; repeat in this way until all data in that channel has been taken or a data item larger than the left or right child is found;
E. If a data item larger than the left or right child was found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented by 1, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
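Steps A to F for ascending input can be sketched roughly as follows. This is one possible reading of the steps and not the patentee's implementation: the function name `merge_ascending` and the `_DONE` sentinel are hypothetical, Python's `heapq` stands in for the heap adjustment, and `heappop` plays the role of "swap the heap top with the last node and decrement the channel count".

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

_DONE = object()  # hypothetical sentinel marking an exhausted channel


def merge_ascending(channels: List[Iterable[int]]) -> Iterator[int]:
    """Merge ascending channels into one ascending stream."""
    iters = [iter(ch) for ch in channels]  # Step A: one iterator per channel.
    if len(iters) == 1:  # Step B: a single channel needs no heap.
        yield from iters[0]
        return
    # Step B: take the first element of every non-empty channel.
    heap: List[Tuple[int, int]] = []
    for idx, it in enumerate(iters):
        first = next(it, _DONE)
        if first is not _DONE:
            heap.append((first, idx))
    heapq.heapify(heap)  # Step C: adjust into a min-heap.
    while heap:  # Step F: loop until no channels remain.
        value, idx = heap[0]
        yield value  # Step C: output the heap top.
        while True:
            nxt = next(iters[idx], _DONE)
            if nxt is _DONE:
                heapq.heappop(heap)  # Step D: channel exhausted, drop it.
                break
            # heap[1:3] holds the root's children in the array layout.
            if all(nxt < child for child, _ in heap[1:3]):
                yield nxt  # Step D: output directly, heap left untouched.
                continue
            # Step E: the item becomes the new heap top and the heap is re-adjusted.
            heapq.heapreplace(heap, (nxt, idx))
            break


print(list(merge_ascending([[1, 4, 7], [2, 5, 8], [3, 6, 9]])))
```

The direct-output branch is safe because the smallest element outside the root is always one of the root's two children, so an item smaller than both children is smaller than everything else in the heap.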
Preferably, in the method for quickly merging multichannel data, when the elements in each channel's raw data set are in descending order, the merge operation specifically comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; if the total number of channels is not 1, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a max-heap and output the heap-top element;
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrement the total number of channels by 1; if that raw data set still has data, take the data item that follows the heap-top element, and if it is larger than both the left and right children of the heap-top element, output it directly; repeat in this way until all data in that channel has been taken or a data item smaller than the left or right child is found;
E. If a data item smaller than the left or right child was found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented by 1, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
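For descending input, a sketch with an explicit array-based max-heap stays closer to the wording of the steps, including the swap with the last node and the decrement of the channel count. The names `sift_down_max` and `merge_descending` and the node layout `[value, channel index, position]` are assumptions made for illustration; the single-channel case of step B falls out naturally, since a heap of size 1 has no children to compare against.

```python
from typing import Iterator, List, Sequence


def sift_down_max(heap: List[list], root: int, size: int) -> None:
    """Restore the max-heap property below `root` within heap[:size]."""
    while True:
        largest = root
        for child in (2 * root + 1, 2 * root + 2):
            if child < size and heap[child][0] > heap[largest][0]:
                largest = child
        if largest == root:
            return
        heap[root], heap[largest] = heap[largest], heap[root]
        root = largest


def merge_descending(channels: Sequence[Sequence[int]]) -> Iterator[int]:
    """Merge descending channels into one descending stream."""
    # Each heap node: [current value, channel index, position in channel].
    heap = [[ch[0], i, 0] for i, ch in enumerate(channels) if len(ch) > 0]
    ways = len(heap)  # Step A: number of non-empty channels.
    for root in range(ways // 2 - 1, -1, -1):  # Step C: build the max-heap.
        sift_down_max(heap, root, ways)
    while ways > 0:  # Step F: stop when no channels remain.
        value, idx, pos = heap[0]
        yield value  # Step C: output the heap top.
        ch = channels[idx]
        pos += 1
        # Step D: output following items directly while they beat both children.
        while pos < len(ch) and all(
            ch[pos] > heap[c][0] for c in (1, 2) if c < ways
        ):
            yield ch[pos]
            pos += 1
        if pos < len(ch):
            heap[0] = [ch[pos], idx, pos]  # Step E: new heap-top candidate.
        else:
            ways -= 1  # Step D: channel exhausted.
            heap[0], heap[ways] = heap[ways], heap[0]  # Swap with last node.
        sift_down_max(heap, 0, ways)  # Re-adjust before the next output.


print(list(merge_descending([[9, 6, 3], [8, 5, 2], [7, 4, 1]])))
```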
Preferably, in the method for quickly merging multichannel data, the raw data sets come from external files, other machines, and memory.
Preferably, in the method for quickly merging multichannel data, for external sorting the raw data sets are obtained from external files; for distributed processing they are obtained from the network; for in-memory sorting on a single machine they are obtained from memory.
Preferably, in the method for quickly merging multichannel data, the data ranges of the raw data sets of all channels do not overlap.
The present invention has at least the following beneficial effect: because a heap-sort technique is used to merge multichannel ordered data, the number of comparisons between the data elements of the channels is effectively reduced and merging efficiency is significantly improved.
Other advantages, objects, and features of the present invention will be apparent in part from the description below, and in part will be understood by those skilled in the art through study and practice of the present invention.
Brief Description of the Drawings
Fig. 1 is the overall structure diagram of the merge operation;
Fig. 2 is a flow chart of the multichannel data merge when the data in the raw data sets is in ascending order, in the method for quickly merging multichannel data of the present invention;
Fig. 3 is a flow chart of the multichannel data merge when the data in the raw data sets is in descending order, in the method for quickly merging multichannel data of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings, so that those skilled in the art can implement it by referring to the text of the specification.
Fig. 1 shows the overall structure diagram of the merge operation. SD1, SD2, SD3, and SDN represent N ordered raw data sets; the data may reside on disk or in memory on one or more computers, so the method suits both local and distributed use. Merge represents the merge operation to be performed, and the arrows indicate the direction of data flow.
Fig. 2 shows the flow chart of the multichannel data merge when the data in the raw data sets is in ascending order. In the figure, Y means yes and N means no.
Fig. 3 shows the flow chart of the multichannel data merge when the data in the raw data sets is in descending order. In the figure, Y means yes and N means no.
As shown in Fig. 1, the method for quickly merging multichannel data of the present invention comprises:
Step 1: collect multichannel raw data sets;
Step 2: merge the multichannel raw data sets by computer using a heap-sort technique;
Step 3: output the merged data stream. By using a heap-sort technique, this technical solution can quickly merge data even when the data volume is large and the raw data sets span many channels.
As shown in Figs. 1, 2, and 3, the elements in each raw data set are arranged in order. When there are many channels, having the elements of each channel's raw data set in order quickly reduces the number of comparisons between elements, which makes the merge efficient.
As shown in Fig. 2, when the elements in each channel's raw data set are in ascending order, the merge operation specifically comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; if the total number of channels is not 1, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a min-heap and output the heap-top element;
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrement the total number of channels by 1; if that raw data set still has data, take the data item that follows the heap-top element, and if it is smaller than both the left and right children of the heap-top element, output it directly; repeat in this way until all data in that channel has been taken or a data item larger than the left or right child is found;
E. If a data item larger than the left or right child was found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented by 1, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0. This technical solution quickly outputs the elements of each channel's raw data set step by step in ascending order, merging them into one larger ascending data stream for the user.
As shown in Fig. 3, when the elements in each channel's raw data set are in descending order, the merge operation specifically comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; if the total number of channels is not 1, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a max-heap and output the heap-top element;
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrement the total number of channels by 1; if that raw data set still has data, take the data item that follows the heap-top element, and if it is larger than both the left and right children of the heap-top element, output it directly; repeat in this way until all data in that channel has been taken or a data item smaller than the left or right child is found;
E. If a data item smaller than the left or right child was found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented by 1, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0. This technical solution quickly outputs the elements of each channel's raw data set step by step in descending order, merging them into one larger descending data stream for the user.
The raw data sets come from external files, other machines, and memory. This technical solution obtains the data as completely as possible, to ensure that the merged data stream is comprehensive.
For external sorting, the raw data sets are obtained from external files; for distributed processing, they are obtained from the network; for in-memory sorting on a single machine, they are obtained from memory. This scheme sorts each kind of raw data set before the merge operation, which improves the efficiency of the heap-sort process.
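One way to treat these different sources uniformly is to expose each channel as an iterator, so the merge step does not need to know where the data lives. The sketch below is an assumption-laden illustration: `read_channel` is a hypothetical helper that parses one integer per line, `io.StringIO` stands in for an external file or a network stream, and the standard-library `heapq.merge` stands in for the merge step.

```python
import heapq
import io
from typing import Iterable, Iterator


def read_channel(lines: Iterable[str]) -> Iterator[int]:
    """Lazily parse one integer per line from any line-based stream."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield int(line)


# Stand-in for an external file or network stream (assumed sample data);
# an open text file could be passed to read_channel in the same way.
file_like = io.StringIO("1\n4\n7\n")
# A channel that already lives in memory.
in_memory = [2, 3, 8]

merged = list(heapq.merge(read_channel(file_like), in_memory))
print(merged)
```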
The data ranges of the raw data sets of all channels do not overlap. This technical solution saves sorting time and makes the merge process faster.
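To see why non-overlapping ranges help, note that in this case every element of one channel precedes every element of the next, so the merged stream is just the channels placed end to end. The sketch below illustrates that limiting case for ascending channels; it is an interpretation rather than the procedure described in the patent, and the name `merge_disjoint` is hypothetical.

```python
from itertools import chain
from typing import Iterator, List, Sequence


def merge_disjoint(channels: Sequence[Sequence[int]]) -> Iterator[int]:
    """Merge ascending channels whose value ranges do not overlap."""
    non_empty: List[Sequence[int]] = [ch for ch in channels if len(ch) > 0]
    # Order whole channels by their first (smallest) element ...
    non_empty.sort(key=lambda ch: ch[0])
    # ... then emit them back to back; no element-level comparisons are needed.
    return chain.from_iterable(non_empty)


print(list(merge_disjoint([[7, 8, 9], [1, 2, 3], [], [4, 5, 6]])))
```

The function does not check that the ranges are really disjoint; with overlapping input the output would not be sorted.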
The processing scale described here is used to simplify the explanation of the present invention. Applications, modifications, and variations of the multichannel data merging method of the present invention will be readily apparent to those skilled in the art.
As described above, according to the present invention, because heap-sort merging is performed on multichannel ordered raw data sets, the merge is fast.
Although embodiments of the present invention are disclosed above, they are not limited to the uses listed in the specification and embodiments. The invention can be applied in any field to which it is suited, and those skilled in the art can readily make further modifications. Therefore, without departing from the general concept defined by the claims and their equivalents, the present invention is not limited to the specific details or to the illustrations shown and described here.

Claims (7)

1. A method for quickly merging multichannel data, characterized by comprising:
Step 1: collect multichannel raw data sets;
Step 2: merge the multichannel raw data sets by computer using a heap-sort technique;
Step 3: output the merged data stream.
2. The method for quickly merging multichannel data as claimed in claim 1, characterized in that, in step 1, the elements in each raw data set are arranged in order.
3. The method for quickly merging multichannel data as claimed in claim 2, characterized in that, when the elements in each channel's raw data set are in ascending order, the merge operation specifically comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; if the total number of channels is not 1, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a min-heap and output the heap-top element;
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrement the total number of channels by 1; if that raw data set still has data, take the data item that follows the heap-top element, and if it is smaller than both the left and right children of the heap-top element, output it directly; repeat in this way until all data in that channel has been taken or a data item larger than the left or right child is found;
E. If a data item larger than the left or right child was found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented by 1, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
4. The method for quickly merging multichannel data as claimed in claim 2, characterized in that, when the elements in each channel's raw data set are in descending order, the merge operation specifically comprises the following steps:
A. Obtain the total number of channels of the raw data sets;
B. If the total number of channels is 1, output all data in that channel's raw data set directly and end the merge operation; if the total number of channels is not 1, obtain the multichannel raw data sets and take the first element of each channel's raw data set to create an initial heap;
C. Adjust the initial heap into a max-heap and output the heap-top element;
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrement the total number of channels by 1; if that raw data set still has data, take the data item that follows the heap-top element, and if it is larger than both the left and right children of the heap-top element, output it directly; repeat in this way until all data in that channel has been taken or a data item smaller than the left or right child is found;
E. If a data item smaller than the left or right child was found in step D, assign it to the heap-top element and repeat steps C and D;
F. In step D, if the total number of channels is 0 after being decremented by 1, release the heap space and end the merge operation; if it is not 0, repeat steps C, D, E, and F until the total number of channels is 0.
5. The method for quickly merging multichannel data as claimed in claim 1, characterized in that the raw data sets come from external files, the network, and memory.
6. The method for quickly merging multichannel data as claimed in claim 5, characterized in that:
for external sorting, the raw data sets are obtained from external files;
for distributed processing, the raw data sets are obtained from the network;
for in-memory sorting on a single machine, the raw data sets are obtained from memory.
7. The method for quickly merging multichannel data as claimed in claim 1, characterized in that the data ranges of the raw data sets of all channels do not overlap.
CN201510076043.2A 2015-02-12 2015-02-12 A kind of quick method for realizing multichannel data merger Active CN104601732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510076043.2A CN104601732B (en) 2015-02-12 2015-02-12 A kind of quick method for realizing multichannel data merger

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510076043.2A CN104601732B (en) 2015-02-12 2015-02-12 A kind of quick method for realizing multichannel data merger

Publications (2)

Publication Number Publication Date
CN104601732A true CN104601732A (en) 2015-05-06
CN104601732B CN104601732B (en) 2018-01-23

Family

ID=53127225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510076043.2A Active CN104601732B (en) 2015-02-12 2015-02-12 A kind of quick method for realizing multichannel data merger

Country Status (1)

Country Link
CN (1) CN104601732B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850618A (en) * 2015-05-18 2015-08-19 北京京东尚科信息技术有限公司 System and method for providing sorted data
CN107908714A (en) * 2017-11-10 2018-04-13 上海达梦数据库有限公司 A kind of aggregation of data sort method and device
CN110377642A (en) * 2019-07-24 2019-10-25 杭州太尼科技有限公司 A kind of device of quick obtaining ordered sequence data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070088699A1 (en) * 2005-10-18 2007-04-19 Edmondson James R Multiple Pivot Sorting Algorithm
CN102609490A (en) * 2012-01-20 2012-07-25 东华大学 Column-storage-oriented B+ tree index method for DWMS (data warehouse management system)
CN102968496A (en) * 2012-12-04 2013-03-13 天津神舟通用数据技术有限公司 Parallel sequencing method based on task derivation and double buffering mechanism
CN103716237A (en) * 2013-12-25 2014-04-09 广东天拓资讯科技有限公司 Path-finding method and device utilizing binary heap sorting

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850618A (en) * 2015-05-18 2015-08-19 北京京东尚科信息技术有限公司 System and method for providing sorted data
CN104850618B (en) * 2015-05-18 2018-06-01 北京京东尚科信息技术有限公司 A kind of system and method that ordered data is provided
CN107908714A (en) * 2017-11-10 2018-04-13 上海达梦数据库有限公司 A kind of aggregation of data sort method and device
CN107908714B (en) * 2017-11-10 2021-05-04 上海达梦数据库有限公司 Data merging and sorting method and device
CN110377642A (en) * 2019-07-24 2019-10-25 杭州太尼科技有限公司 A kind of device of quick obtaining ordered sequence data

Also Published As

Publication number Publication date
CN104601732B (en) 2018-01-23

Similar Documents

Publication Publication Date Title
JP7453143B2 (en) Data storage and query methods and devices
CN1955958A (en) Sort data storage and split catalog inquiry method based on catalog tree
CN105205105A (en) Data ETL (Extract Transform Load) system based on storm and treatment method based on storm
CN104601732A (en) Method for merging multichannel data quickly
CN105389402A (en) Big-data-oriented ETL (Extraction-Transformation-Loading) method and device
JP2007531115A5 (en)
TWI709049B (en) Random walk, cluster-based random walk method, device and equipment
CN103942245A (en) Data extracting method based on metadata
CN106033442B (en) A kind of parallel breadth first search method based on shared drive architecture
CN101030230A (en) Image searching method and system
CN102902742A (en) Spatial data partitioning method in cloud environment
CN103064841A (en) Retrieval device and retrieval method
CN106250110A (en) Set up the method and device of model
CN103064991A (en) Mass data clustering method
CN105005638B (en) A kind of High Level Synthesis dispatching method based on linear delay model
WO2016206377A1 (en) Data integration and processing method and device
CN104573002A (en) Data organization model based on human, affair and object classification filing
CN104182208A (en) Method and system utilizing cracking rule to crack password
CN104376055B (en) A kind of large-sized model data comparing method based on allocation methods
CN103092630B (en) Interface data output unit and interface data output intent
CN105447142A (en) Dual-mode agricultural scientific and technical achievement classification method and system
CN106528739B (en) A kind of method for building up in Digital Dyeing picture material big data warehouse
CN104573101A (en) System and method for real-time data stream classification on basis of rule routes
CN104932982A (en) Message access memory compiling method and related apparatus
CN105515818B (en) The method and system of cyclic structure are split in a kind of network topology layout

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant