CN104601732B - A fast method for multi-way data merging - Google Patents

A fast method for multi-way data merging

Info

Publication number
CN104601732B
CN104601732B CN201510076043.2A CN201510076043A CN104601732B CN 104601732 B CN104601732 B CN 104601732B CN 201510076043 A CN201510076043 A CN 201510076043A CN 104601732 B CN104601732 B CN 104601732B
Authority
CN
China
Prior art keywords
data
data set
merge
raw data
heap
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510076043.2A
Other languages
Chinese (zh)
Other versions
CN104601732A (en)
Inventor
杨爱民
张天祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jinher Software Co Ltd
Original Assignee
Beijing Jinher Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jinher Software Co Ltd
Priority to CN201510076043.2A
Publication of CN104601732A
Application granted
Publication of CN104601732B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a fast method for multi-way data merging, comprising: Step 1, collecting multiple ways of raw data sets; Step 2, performing, by a computer, a merge operation on the raw data sets using heap-sort techniques; Step 3, outputting the merged data stream. The invention improves the efficiency of merging multiple ordered raw data sets.

Description

A fast method for multi-way data merging
Technical field
The invention relates to a data-integration method. More specifically, it relates to a fast method for multi-way data merging.
Background
As portable devices and smart home devices of all kinds become increasingly widespread, they are gradually replacing the traditional PC and changing the way people live. The sources, types, and volume of data have all grown explosively, and information overload is increasingly common. How to obtain the data a user wants from this mass of data quickly and effectively is a problem that urgently needs solving in the big-data era.
Summary of the invention
An object of the invention is to solve at least the above problems and to provide at least the advantages described later.
A further object of the invention is to merge multi-way data using heap-sort techniques, which makes merging more efficient when every way is already ordered.
To achieve these objects and other advantages, the invention provides a fast method for multi-way data merging, comprising:
Step 1: collect multiple ways of raw data sets;
Step 2: perform, by a computer, a merge operation on the raw data sets using heap-sort techniques;
Step 3: output the merged data stream.
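The three steps can be sketched with Python's standard library. This is only a minimal illustration, assuming each input (each "way" or channel) is an in-memory list that is already in ascending order; the function name `merge_ways` is hypothetical, and `heapq.merge` is a general-purpose heap-based merge, not the patent's specific procedure.

```python
import heapq

def merge_ways(ways):
    """Merge several ascending sequences into one ascending stream.

    Step 1: `ways` is the collection of raw data sets.
    Step 2: heapq.merge keeps one candidate per way in a heap.
    Step 3: the result is a lazy iterator over the merged data.
    """
    return heapq.merge(*ways)

ways = [[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]]
print(list(merge_ways(ways)))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Because the result is an iterator, the merged data can be consumed as a stream without holding the whole output in memory.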
Preferably, in the fast method for multi-way data merging, the elements in each raw data set in Step 1 are in sorted order.
Preferably, in the fast method for multi-way data merging, when the elements in each way's raw data set are in ascending order, the merge operation comprises the following steps:
A. Obtain the total number of ways of the raw data sets.
B. If the total number of ways is 1, output all data in that raw data set directly and end the merge operation. If the total number of ways is not 1, obtain the multiple raw data sets and take the first element of each to create the initial heap.
C. Adjust the initial heap into a min-heap and output the heap-top element.
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrease the total number of ways by 1. If that raw data set still has data, take the item that follows the heap-top element in that set; if this item is smaller than both the left and right children of the heap-top element, output it directly. Repeat until all data in that way have been taken or an item larger than a left or right child is found.
E. If an item larger than a left or right child is found in step D, assign that item to the heap-top element and repeat steps C and D.
F. In step D, if the total number of ways is 0 after being decreased by 1, release the heap space and end the merge operation. If it is not 0, repeat steps C, D, E, and F until the total number of ways is 0.
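Steps A to F can be read as the following sketch. It is one interpretation of the text rather than the applicant's implementation. It assumes in-memory ascending lists, skips empty inputs, and reads "an item larger than the children" as "an item that is not smaller than both children". The names `sift_down` and `merge_ascending` and the `[value, way index, position]` heap entries are hypothetical.

```python
def sift_down(heap, size, i=0):
    """Restore the min-heap property of heap[:size], starting at index i."""
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < size and heap[left][0] < heap[smallest][0]:
            smallest = left
        if right < size and heap[right][0] < heap[smallest][0]:
            smallest = right
        if smallest == i:
            return
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

def merge_ascending(ways):
    ways = [w for w in ways if w]             # assumption: ignore empty ways
    total = len(ways)                         # step A: number of ways
    if total == 0:
        return []
    if total == 1:                            # step B: a single way
        return list(ways[0])
    # Step B: one entry per way, [value, way index, position in that way].
    heap = [[w[0], k, 0] for k, w in enumerate(ways)]
    for i in range(total // 2 - 1, -1, -1):   # step C: build the min-heap
        sift_down(heap, total, i)
    out = []
    while total > 0:
        value, k, pos = heap[0]
        out.append(value)                     # step C: output the heap top
        way, pos = ways[k], pos + 1
        # Step D: keep emitting from the same way while the next item is
        # smaller than every existing child of the heap top.
        while pos < len(way) and all(
            way[pos] < heap[c][0] for c in (1, 2) if c < total
        ):
            out.append(way[pos])
            pos += 1
        if pos == len(way):                   # step D: this way is exhausted
            total -= 1
            heap[0] = heap[total]             # last node moves to the top
        else:                                 # step E: new heap-top element
            heap[0] = [way[pos], k, pos]
        sift_down(heap, total)                # back to step C
    return out                                # step F: all ways consumed

print(merge_ascending([[1, 2, 3, 50], [4, 5], [6, 60]]))
# prints [1, 2, 3, 4, 5, 6, 50, 60]
```

The inner loop in step D is what distinguishes this from a plain heap merge: a run such as 1, 2, 3 is emitted with two comparisons per item and no heap adjustment, because the children of the heap top bound every other way's current element from below.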
Preferably, in the fast method for multi-way data merging, when the elements in each way's raw data set are in descending order, the merge operation comprises the following steps:
A. Obtain the total number of ways of the raw data sets.
B. If the total number of ways is 1, output all data in that raw data set directly and end the merge operation. If the total number of ways is not 1, obtain the multiple raw data sets and take the first element of each to create the initial heap.
C. Adjust the initial heap into a max-heap and output the heap-top element.
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrease the total number of ways by 1. If that raw data set still has data, take the item that follows the heap-top element in that set; if this item is larger than both the left and right children of the heap-top element, output it directly. Repeat until all data in that way have been taken or an item smaller than a left or right child is found.
E. If an item smaller than a left or right child is found in step D, assign that item to the heap-top element and repeat steps C and D.
F. In step D, if the total number of ways is 0 after being decreased by 1, release the heap space and end the merge operation. If it is not 0, repeat steps C, D, E, and F until the total number of ways is 0.
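The descending case mirrors the ascending one with every comparison flipped. As a short, hedged illustration, Python's `heapq.merge` accepts `reverse=True` for inputs sorted from largest to smallest; it does not perform the run-output shortcut of step D, and the name `merge_descending` is hypothetical.

```python
import heapq

def merge_descending(ways):
    """Merge sequences that are each sorted in descending order."""
    # reverse=True tells heapq.merge that the inputs are descending and
    # that the output should be descending as well.
    return list(heapq.merge(*ways, reverse=True))

print(merge_descending([[9, 4, 1], [10, 3, 2], [8, 7, 6, 5]]))
# prints [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```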
Preferably, in the fast method for multi-way data merging, the raw data sets come from external files, other machines, and memory.
Preferably, in the fast method for multi-way data merging, the raw data sets are read from external files for external sorting, obtained over the network for distributed processing, and read from memory for single-machine in-memory sorting.
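One way to treat these sources uniformly is to expose each of them as an iterator, so the merge step does not need to know where the data live. The sketch below assumes a text file holding one ascending integer per line; the helper names `read_sorted_ints` and `merge_sources` and the file name `way1.txt` are hypothetical, and a network source would simply be another iterator.

```python
import heapq
import os
import tempfile

def read_sorted_ints(path):
    """Lazily yield integers from a text file with one value per line."""
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield int(line)

def merge_sources(*sources):
    """Merge any mix of ascending iterables: file readers, lists, streams."""
    return heapq.merge(*sources)

# Usage: one way is read from a file on disk, the other is held in memory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "way1.txt")
    with open(path, "w") as handle:
        handle.write("1\n5\n9\n")
    merged = list(merge_sources(read_sorted_ints(path), [2, 3, 10]))

print(merged)  # prints [1, 2, 3, 5, 9, 10]
```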
Preferably, in the fast method for multi-way data merging, the data ranges of the raw data sets of all ways do not overlap.
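The text does not spell out why non-overlapping ranges help. One plausible reading is that each way can then be emitted as a single uninterrupted run, so the merge degenerates into ordering the ways and concatenating them. The sketch below illustrates that reading only; it assumes ascending in-memory lists, and `merge_disjoint` is a hypothetical name.

```python
from itertools import chain

def merge_disjoint(ways):
    """Merge ascending ways whose value ranges do not overlap."""
    non_empty = [w for w in ways if w]
    # Order the ways by their first (smallest) element, then concatenate.
    non_empty.sort(key=lambda w: w[0])
    return list(chain.from_iterable(non_empty))

print(merge_disjoint([[20, 25, 30], [1, 2, 3], [10, 11]]))
# prints [1, 2, 3, 10, 11, 20, 25, 30]
```

Only one comparison per way is needed to order the ways; no element-by-element comparison between different ways takes place.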
The invention has at least the following beneficial effect: because heap-sort techniques are used to merge multiple ways of ordered data, the number of comparisons between elements of the different ways is effectively reduced and merging efficiency improves significantly.
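A rough worst-case count makes the claimed reduction concrete. The symbols are assumptions introduced here: n is the total number of elements and k the number of ways. Selecting each output element by scanning the current element of every way costs up to k - 1 comparisons, whereas re-adjusting a binary heap of k nodes from the top costs at most two comparisons per level. The one-off cost of building the heap, which is linear in k, is ignored.

```latex
% n: total number of elements, k: number of ways (assumed symbols)
\begin{align*}
C_{\text{scan}} &\le n\,(k-1) \\
C_{\text{heap}} &\le 2\,n\,\lfloor \log_2 k \rfloor \\
\text{e.g. } k = 1024,\ n = 10^{6}: \quad
C_{\text{scan}} &\le 1.023 \times 10^{9}, \qquad
C_{\text{heap}} \le 2 \times 10^{7}
\end{align*}
```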
Further advantages, objects, and features of the invention will be apparent in part from the following description, and in part will be understood by those skilled in the art through study and practice of the invention.
Brief description of the drawings
Fig. 1 is an overall structural diagram of the merge operation;
Fig. 2 is a flow chart of multi-way data merging in the fast method of the invention when the data in the raw data sets are in ascending order;
Fig. 3 is a flow chart of multi-way data merging in the fast method of the invention when the data in the raw data sets are in descending order.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings, so that those skilled in the art can implement it by following the text of the specification.
Fig. 1 shows the overall structure of the merge operation. SD1, SD2, SD3, and SDN denote N ordered raw data sets. These data may reside on disk or in memory on one or more computers, so the method suits both local and distributed use. Merge denotes the merge operation to be performed, and the arrows show the direction of data flow.
Fig. 2 shows the flow of multi-way data merging in the fast method when the data in the raw data sets are in ascending order. In the figure, Y means yes and N means no.
Fig. 3 shows the flow of multi-way data merging in the fast method when the data in the raw data sets are in descending order. In the figure, Y means yes and N means no.
As shown in Fig. 1, the fast method for multi-way data merging of the invention comprises:
Step 1: collect multiple ways of raw data sets;
Step 2: perform, by a computer, a merge operation on the raw data sets using heap-sort techniques;
Step 3: output the merged data stream. Using heap-sort techniques, this technical scheme can quickly merge data even when the data volume is large and the raw data sets span many ways.
As shown in Figs. 1, 2, and 3, the elements in each raw data set are in sorted order. When there are many ways, having the elements of every raw data set already sorted sharply reduces the number of comparisons between elements, which is what makes the method efficient.
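The effect on comparisons can be checked empirically. The sketch below counts `<` comparisons for two strategies on the same input: a baseline that scans the current element of every input each time, and a heap-based merge. Everything here is an illustrative assumption (the `Counted` wrapper, the 64 synthetic inputs, and the use of `heapq.merge` as the heap-based merge); it is not a measurement of the patented procedure.

```python
import heapq

class Counted:
    """Wrap a value and count every '<' comparison made on it."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value

def linear_scan_merge(ways):
    """Baseline: find the smallest current element by scanning all ways."""
    heads = [0] * len(ways)
    out = []
    while True:
        best = None
        for k, way in enumerate(ways):
            if heads[k] == len(way):
                continue                      # this way is exhausted
            if best is None or way[heads[k]] < ways[best][heads[best]]:
                best = k
        if best is None:
            return out
        out.append(ways[best][heads[best]])
        heads[best] += 1

def count(merge, ways):
    """Run `merge` on wrapped copies of `ways`; return (result, comparisons)."""
    Counted.comparisons = 0
    wrapped = [[Counted(v) for v in way] for way in ways]
    result = [c.value for c in merge(wrapped)]
    return result, Counted.comparisons

# 64 ascending inputs of 100 values each, interleaved across 0..6399.
ways = [list(range(k, 6400, 64)) for k in range(64)]
res_scan, n_scan = count(linear_scan_merge, ways)
res_heap, n_heap = count(lambda w: list(heapq.merge(*w)), ways)
print("linear scan:", n_scan, "comparisons; heap merge:", n_heap)
```

Both strategies produce the same merged output; the printed counts show how much cheaper the heap-based selection is when the number of inputs is large.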
As shown in Fig. 2, when the elements in each way's raw data set are in ascending order, the merge operation comprises the following steps:
A. Obtain the total number of ways of the raw data sets.
B. If the total number of ways is 1, output all data in that raw data set directly and end the merge operation. If the total number of ways is not 1, obtain the multiple raw data sets and take the first element of each to create the initial heap.
C. Adjust the initial heap into a min-heap and output the heap-top element.
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrease the total number of ways by 1. If that raw data set still has data, take the item that follows the heap-top element in that set; if this item is smaller than both the left and right children of the heap-top element, output it directly. Repeat until all data in that way have been taken or an item larger than a left or right child is found.
E. If an item larger than a left or right child is found in step D, assign that item to the heap-top element and repeat steps C and D.
F. In step D, if the total number of ways is 0 after being decreased by 1, release the heap space and end the merge operation. If it is not 0, repeat steps C, D, E, and F until the total number of ways is 0. This technical scheme outputs the elements of every raw data set quickly and in ascending order, merging them into one larger ascending data stream for the user.
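Since the result is described as a data stream supplied to the user, a generator is a natural shape for it in Python. The sketch below is a standard streaming heap merge built on `heapq.heapify`, `heapq.heapreplace`, and `heapq.heappop`. It assumes ascending iterables, omits the run-output shortcut of step D for brevity, and uses the hypothetical name `stream_merge`.

```python
import heapq

_DONE = object()  # sentinel marking an exhausted iterator

def stream_merge(ways):
    """Yield the elements of several ascending iterables in ascending order."""
    heap = []
    for index, way in enumerate(ways):
        iterator = iter(way)
        first = next(iterator, _DONE)
        if first is not _DONE:
            # The index breaks ties, so iterators are never compared.
            heap.append((first, index, iterator))
    heapq.heapify(heap)                      # build the min-heap
    while heap:
        value, index, iterator = heap[0]
        yield value                          # output the heap-top element
        following = next(iterator, _DONE)
        if following is _DONE:
            heapq.heappop(heap)              # this way is exhausted
        else:
            # Replace the top with the next item of the same way.
            heapq.heapreplace(heap, (following, index, iterator))

for item in stream_merge([iter([1, 4, 9]), [2, 3, 10]]):
    print(item, end=" ")                     # prints 1 2 3 4 9 10
print()
```

Only one element per input is held in memory at a time, so the inputs may be files or network streams far larger than RAM.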
As shown in Fig. 3, when the elements in each way's raw data set are in descending order, the merge operation comprises the following steps:
A. Obtain the total number of ways of the raw data sets.
B. If the total number of ways is 1, output all data in that raw data set directly and end the merge operation. If the total number of ways is not 1, obtain the multiple raw data sets and take the first element of each to create the initial heap.
C. Adjust the initial heap into a max-heap and output the heap-top element.
D. If the raw data set that the heap-top element came from has no more data, swap the heap-top element with the last node and decrease the total number of ways by 1. If that raw data set still has data, take the item that follows the heap-top element in that set; if this item is larger than both the left and right children of the heap-top element, output it directly. Repeat until all data in that way have been taken or an item smaller than a left or right child is found.
E. If an item smaller than a left or right child is found in step D, assign that item to the heap-top element and repeat steps C and D.
F. In step D, if the total number of ways is 0 after being decreased by 1, release the heap space and end the merge operation. If it is not 0, repeat steps C, D, E, and F until the total number of ways is 0. This technical scheme outputs the elements of every raw data set quickly and in descending order, merging them into one larger descending data stream for the user.
The raw data sets come from external files, other machines, and memory. This technical scheme obtains all data as completely as possible, which ensures that the merged data stream is comprehensive.
For external sorting, the raw data sets are read from external files; for distributed processing, they are obtained over the network; for single-machine in-memory sorting, they are read from memory. In this scheme every kind of raw data set is sorted before the merge operation, which improves the efficiency of the heap-sort process.
The data ranges of the raw data sets of all ways do not overlap. This technical scheme saves sorting time and makes the merge faster.
The processing scale described here serves to simplify the explanation of the invention. Applications, modifications, and variations of the multi-way data merging method of the invention will be readily apparent to those skilled in the art.
As described above, because the invention applies a heap-sort merge operation to multiple ordered raw data sets, it achieves fast merging.
Although embodiments of the invention have been disclosed above, the invention is not limited to the uses listed in the specification and embodiments. It can be applied in any field to which it is suited, and those skilled in the art can readily make further modifications. Therefore, without departing from the general concept defined by the claims and their equivalents, the invention is not limited to the specific details or to the illustrations shown and described here.

Claims (4)

  1. A fast method for multi-way data merging, characterized by comprising:
    Step 1: collecting multiple ways of raw data sets;
    Step 2: performing, by a computer, a merge operation on the raw data sets using heap-sort techniques;
    Step 3: outputting the merged data stream;
    wherein, in Step 1, the elements in each raw data set are in sorted order;
    and wherein,
    when the elements in each way's raw data set are in ascending order, the merge operation comprises the following steps:
    A. obtaining the total number of ways of the raw data sets;
    B. if the total number of ways is 1, outputting all data in that raw data set directly and ending the merge operation; if the total number of ways is not 1, obtaining the multiple raw data sets and taking the first element of each to create the initial heap;
    C. adjusting the initial heap into a min-heap and outputting the heap-top element;
    D. if the raw data set that the heap-top element came from has no more data, swapping the heap-top element with the last node and decreasing the total number of ways by 1; if that raw data set still has data, taking the item that follows the heap-top element in that set and, if this item is smaller than both the left and right children of the heap-top element, outputting it directly, repeating until all data in that way have been taken or an item larger than a left or right child is found;
    E. if an item larger than a left or right child is found in step D, assigning that item to the heap-top element and repeating steps C and D;
    F. in step D, if the total number of ways is 0 after being decreased by 1, releasing the heap space and ending the merge operation; if it is not 0, repeating steps C, D, E, and F until the total number of ways is 0;
    when the elements in each way's raw data set are in descending order, the merge operation comprises the following steps:
    A. obtaining the total number of ways of the raw data sets;
    B. if the total number of ways is 1, outputting all data in that raw data set directly and ending the merge operation; if the total number of ways is not 1, obtaining the multiple raw data sets and taking the first element of each to create the initial heap;
    C. adjusting the initial heap into a max-heap and outputting the heap-top element;
    D. if the raw data set that the heap-top element came from has no more data, swapping the heap-top element with the last node and decreasing the total number of ways by 1; if that raw data set still has data, taking the item that follows the heap-top element in that set and, if this item is larger than both the left and right children of the heap-top element, outputting it directly, repeating until all data in that way have been taken or an item smaller than a left or right child is found;
    E. if an item smaller than a left or right child is found in step D, assigning that item to the heap-top element and repeating steps C and D;
    F. in step D, if the total number of ways is 0 after being decreased by 1, releasing the heap space and ending the merge operation; if it is not 0, repeating steps C, D, E, and F until the total number of ways is 0.
  2. The fast method for multi-way data merging according to claim 1, characterized in that the raw data sets come from external files, the network, and memory.
  3. The fast method for multi-way data merging according to claim 2, characterized in that, for external sorting, the raw data sets are read from external files;
    for distributed processing, the raw data sets are obtained over the network;
    for single-machine in-memory sorting, the raw data sets are read from memory.
  4. The fast method for multi-way data merging according to claim 1, characterized in that the data ranges of the raw data sets of all ways do not overlap.
CN201510076043.2A 2015-02-12 2015-02-12 A fast method for multi-way data merging Active CN104601732B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510076043.2A CN104601732B (en) 2015-02-12 2015-02-12 A fast method for multi-way data merging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510076043.2A CN104601732B (en) 2015-02-12 2015-02-12 A fast method for multi-way data merging

Publications (2)

Publication Number Publication Date
CN104601732A (en) 2015-05-06
CN104601732B (en) 2018-01-23

Family

ID=53127225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510076043.2A Active CN104601732B (en) 2015-02-12 2015-02-12 A fast method for multi-way data merging

Country Status (1)

Country Link
CN (1) CN104601732B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850618B (en) * 2015-05-18 2018-06-01 北京京东尚科信息技术有限公司 A kind of system and method that ordered data is provided
CN107908714B (en) * 2017-11-10 2021-05-04 上海达梦数据库有限公司 Data merging and sorting method and device
CN110377642B (en) * 2019-07-24 2020-06-02 杭州太尼科技有限公司 Device for rapidly acquiring ordered sequence data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609490A (en) * 2012-01-20 2012-07-25 东华大学 Column-storage-oriented B+ tree index method for DWMS (data warehouse management system)
CN102968496A (en) * 2012-12-04 2013-03-13 天津神舟通用数据技术有限公司 Parallel sequencing method based on task derivation and double buffering mechanism
CN103716237A (en) * 2013-12-25 2014-04-09 广东天拓资讯科技有限公司 Path-finding method and device utilizing binary heap sorting

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070088699A1 (en) * 2005-10-18 2007-04-19 Edmondson James R Multiple Pivot Sorting Algorithm

Also Published As

Publication number Publication date
CN104601732A (en) 2015-05-06

Similar Documents

Publication Publication Date Title
CN104601732B (en) A fast method for multi-way data merging
JP7453143B2 (en) Data storage and query methods and devices
CN103425632B (en) A kind of method of serializing, device and processor
CN1877563A (en) Report form defining method and system
JP2005512237A5 (en)
WO2015165381A1 (en) Universal internet information data mining method
WO2006078912A3 (en) Automatic dynamic contextual data entry completion system
JP2011028749A5 (en)
WO2017101591A1 (en) Method for constructing knowledge base, and controller
WO2014008139A3 (en) Generating search results
CN102169491B (en) Dynamic detection method for multi-data concentrated and repeated records
CN1741026A (en) Method for fast generating logical circuit
CN1694090A (en) Multiple search engine guiding method
CN106557571A (en) A kind of data duplicate removal method and device based on K V storage engines
CN101030230A (en) Image searching method and system
CN109344468A (en) CAD diagram paper introduction method, system and computer readable storage medium
CN103064991A (en) Mass data clustering method
Matsui Challenge for manga processing: Sketch-based manga retrieval
CN104317836B (en) The method and device of Mass production data file
CN103092630B (en) Interface data output unit and interface data output intent
CN1160776C (en) Transistor optimizing method, integrated circuit distribution design method and device relating to same
CN104102480B (en) The method and apparatus for generating configuration file
JP2005165393A5 (en)
CN105447142A (en) Dual-mode agricultural scientific and technical achievement classification method and system
CN102385598A (en) Seamless query and update interface for relational data and extensible makeup language (XML) data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant