CN104850618A - System and method for providing sorted data - Google Patents

System and method for providing sorted data

Publication number
CN104850618A
Authority
CN
Grant status
Application
Patent type
Prior art keywords
data
sorting
link
corresponding
record
Prior art date
Application number
CN 201510250580
Other languages
Chinese (zh)
Inventor
张成远
田琪
季锡强
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30: Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286: Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386: Retrieval requests
    • G06F17/30424: Query processing
    • G06F17/30533: Other types of queries
    • G06F17/30545: Distributed queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30: Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286: Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30575: Replication, distribution or synchronisation of data between databases or within a distributed database; Distributed database system architectures therefor

Abstract

The present invention relates to a system and a method for providing sorted data. The system comprises a client, a plurality of data sources, a connection (link) part, and a sorting mechanism. The client generates a query request with a sorting requirement. Each data source generates sorted data records according to its corresponding sub-query request and transmits them over its corresponding connection. The connection part manages and allocates the connections to the data sources. The sorting mechanism processes the partially sorted data records received from the data sources over those connections and provides globally sorted data records to the client. It uses a limit mark on the buffer corresponding to each connection to control memory use, and repeatedly performs a heap sort over the connections to complete the global sort of the data.

Description

A System and Method for Providing Ordered Data

Technical Field

[0001] The present invention relates to a system and method for obtaining ordered data from multiple data sources.

Background

[0002] With the continued growth of the Internet, the volume of data on it is increasing sharply. Traditional single-machine databases already face obvious bottlenecks when processing large-scale data, and the major Internet companies have all begun studying distributed database implementations. These implementations fall into two categories. One is a client-side solution, which introduces a new client that shards the data. The other introduces database middleware that performs the sharding; the application only needs to access the middleware, and the whole access process is almost identical to accessing a native database.

[0003] Among database middleware solutions, those targeting MySQL are relatively numerous. A very important problem to solve when implementing MySQL middleware is how to merge and sort data obtained from multiple MySQL instances.

[0004] Some open-source database middleware, such as Cobar, does not support sorting data that spans multiple MySQL instances. Other middleware does support sorting, but does so by fetching all of the relevant data from every MySQL instance, holding it in the memory of the machine hosting the middleware or spilling it to disk, sorting it there, and only then sending the sorted result to the client.

[0005] The machine hosting the middleware has limited memory and cannot cope when the data volume is large. Spilling the data to disk makes the implementation more complex and severely degrades performance, while aggregating the data on a separate machine adds network I/O, which also degrades performance.

Summary of the Invention

[0006] The object of the present invention is to provide a system and method for efficiently obtaining ordered data from multiple data sources.

[0007] According to one aspect of the present invention, a system for providing ordered data is provided, comprising: a client, configured to generate a query request with a sorting requirement; a plurality of data sources, each of which generates sorted data records according to its corresponding sub-query request and passes them to its corresponding connection; a connection part, configured to manage and allocate the connections to the data sources; and a sorting mechanism, configured to perform the following steps: receiving the query request with a sorting requirement from the client, and parsing the query request to produce a sub-query request for each data source; obtaining the connection corresponding to each data source, creating a buffer for each connection, emptying it and marking it as not full, and sending each sub-query request over the corresponding connection to the corresponding data source, wherein the data source returns ordered data records over that connection; polling the connections according to a predetermined rule to determine which connections have data available to read, wherein, when a connection has data available and its memory buffer is not full, all readable data records on that connection are read and stored in its memory buffer, the buffer is marked as full when the amount of data in it exceeds a predetermined threshold, and the connection is marked as read-finished when all of the data to be read over it has been read; performing a heap sort of all connections; when the memory buffer of the connection at the top of the heap is not empty, removing the first record from that buffer and sending it to the client, and repeating the heap sort of all connections and the removal and sending of the first record in the buffer of the top connection until the buffer of the top connection is empty, wherein, each time a first record has been removed and sent to the client, it is determined whether the buffer is no longer full and, if so, the buffer's full mark is cleared; when the buffer of the top connection is empty and that connection is marked as read-finished, marking the connection as processed, and otherwise continuing the polling; and when all connections have been processed, sending an end marker to the client, and otherwise continuing the polling. The connections take values according to the following rules: if a connection is marked as processed, its value is positive infinity for ascending order and negative infinity for descending order; if a connection is not marked as processed and its buffer is empty, its value is negative infinity for ascending order and positive infinity for descending order; in all other cases, its value is the sort-field value of the first record in its buffer.

[0008] According to another aspect of the present invention, a method for providing ordered data is provided, comprising the steps of: receiving a query request with a sorting requirement from a client, and parsing the query request to produce a sub-query request for each data source; obtaining the connection corresponding to each data source, creating a buffer for each connection, emptying it and marking it as not full, and sending each sub-query request over the corresponding connection to the corresponding data source, wherein the data source returns ordered data records over that connection; polling the connections according to a predetermined rule to determine which connections have data available to read, wherein, when a connection has data available and its memory buffer is not full, all readable data records on that connection are read and stored in its memory buffer, the buffer is marked as full when the amount of data in it exceeds a predetermined threshold, and the connection is marked as read-finished when all of the data to be read over it has been read; performing a heap sort of all connections; when the memory buffer of the connection at the top of the heap is not empty, removing the first record from that buffer and sending it to the client, and repeating the heap sort of all connections and the removal and sending of the first record in the buffer of the top connection until the buffer of the top connection is empty, wherein, each time a first record has been removed and sent to the client, it is determined whether the buffer is no longer full and, if so, the buffer's full mark is cleared; when the buffer of the top connection is empty and that connection is marked as read-finished, marking the connection as processed, and otherwise continuing the polling; and when all connections have been processed, sending an end marker to the client, and otherwise continuing the polling. The connections take values according to the following rules: if a connection is marked as processed, its value is positive infinity for ascending order and negative infinity for descending order; if a connection is not marked as processed and its buffer is empty, its value is negative infinity for ascending order and positive infinity for descending order; in all other cases, its value is the sort-field value of the first record in its buffer.

Brief Description of the Drawings

[0009] Embodiments of the present invention are described in detail below with reference to the accompanying drawings, in which:

[0010] FIG. 1 is a schematic diagram of the composition of the system of the present invention for providing ordered data from multiple data sources, and of the data sorting;

[0011] FIG. 2 is a flowchart of the method of the present invention for providing ordered data from multiple data sources.

Detailed Description

[0012] According to an embodiment of the present invention, a system for providing ordered data is provided, comprising a data source part, a connection part, a sorting mechanism part, and a client. Each data source supplies locally ordered data to the sorting mechanism through the connection part; after processing by the sorting mechanism, globally ordered data is finally provided to the client.

[0013] Each data source in the data source part is, for example, a MySQL instance.

[0014] The connection part manages and allocates the connections to the data sources. These connections are the channels through which the sorting mechanism accesses each data source, so that it can obtain data from every one of them.

[0015] The data that the sorting mechanism obtains from each data source arrives in a prescribed order; that is, the data supplied by each data source is itself locally ordered, for example in ascending order. The sorting mechanism must then sort the data from the different data sources globally.

[0016] The sorting mechanism is, for example, database middleware. There is one TCP connection between the middleware and a given MySQL instance, and the data on that MySQL instance is sent continuously to the middleware over the connection.

[0017] The sending end and the receiving end of a TCP connection each have a buffer. When a MySQL instance sends data over the TCP connection, the data is first placed in the send buffer and then transmitted to the receiving end, which stores the received data in the receive buffer of the connection. All of these operations are carried out by the operating system itself.

[0018] When the middleware receives data from a MySQL instance over a TCP connection, it is reading the contents of the receive buffer of that connection.

[0019] Only while the receive buffer is not full can the MySQL instance continue to send data over the TCP connection; otherwise the sending of data from the MySQL instance is blocked.

[0020] According to an embodiment of the present invention, the overall sorting process is illustrated in FIG. 1, which likewise shows four parts:

[0021] MySQL instance set 101;

[0022] TCP connections 102 between the middleware and the MySQL instances;

[0023] middleware sorting mechanism 103; and

[0024] client 104.

[0025] According to an embodiment of the present invention, FIG. 1 shows one system configuration in which the data source part, MySQL instance set 101, contains six MySQL instances. The sorting middleware 103 has one connection to each MySQL instance, six connections in all: A, B, C, D, E, and F.

[0026] As an example, suppose each MySQL instance holds some data that is transmitted to the sorting middleware through the connection part. Connection A is to transmit 0, 6, and 12; connection B, 1, 7, and 13; connection C, 8 and 20; connection D, 9, 15, and 21; connection E, 4 and 10; and connection F, 5, 11, and 17. The data to be transmitted on each connection is already sorted, here assumed to be in ascending order.
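To make the goal of this example concrete, the following minimal sketch computes the globally ordered stream that the client should receive from the six locally ordered connections. It uses Python's standard `heapq.merge` as a stand-in for the sorting mechanism; the dictionary name `connections` is illustrative and does not come from the patent, and the sketch ignores buffering and polling entirely.

```python
import heapq

# Data each connection will deliver, as listed in paragraph [0026].
connections = {
    "A": [0, 6, 12],
    "B": [1, 7, 13],
    "C": [8, 20],
    "D": [9, 15, 21],
    "E": [4, 10],
    "F": [5, 11, 17],
}

# heapq.merge lazily merges already-sorted inputs into one sorted stream.
merged = list(heapq.merge(*connections.values()))
print(merged)
```

Because each input is already sorted, only the current head of every connection ever needs to be compared, which is the property the sorting mechanism described below relies on.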

[0027] The method for providing ordered data according to the present invention comprises the steps described below.

[0028] The sorting mechanism receives a query request with a sorting requirement from the client and parses it to produce a sub-query request for each data source.

[0029] The sorting mechanism then obtains the connection corresponding to each of those data sources, creates a buffer for each connection, empties it and marks it as not full, and sends each sub-query request over the corresponding connection to the corresponding data source. In response to its sub-query, each data source retrieves the matching data records from its database and returns them, sorted as required, over its connection.

[0030] The sorting mechanism 103 polls the connections according to a predetermined rule to determine which connections have data available to read. When the sorting mechanism determines that a connection has data available and that the connection's memory buffer is not full, it reads all readable data records on that connection and stores them in the connection's memory buffer. When the amount of data in the buffer exceeds a predetermined threshold, the buffer is marked as full. The sorting mechanism then determines whether all of the data to be read over the connection has been read, and if so marks the connection as read-finished.

[0031] The sorting mechanism 103 then performs a heap sort of all connections. The sort value of a connection is determined by the following rules: if the connection is marked as processed, its value is positive infinity for ascending order and negative infinity for descending order; if the connection is not marked as processed and its buffer is empty, its value is negative infinity for ascending order and positive infinity for descending order; in all other cases, its value is the sort-field value of the first record in its buffer.
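The three value rules above can be written as a small function. This is a minimal sketch under assumptions: the `Conn` class, its field names, and the use of `math.inf` for the "infinite" values are all illustrative choices, not part of the patent, and records are assumed to be plain numbers that serve as their own sort field.

```python
import math
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Conn:
    """Hypothetical per-connection state used only for this sketch."""
    name: str
    buffer: deque = field(default_factory=deque)  # records not yet sent to the client
    done: bool = False                            # the "processed" mark

def sort_value(conn, ascending=True):
    """Return the value a connection takes in the heap."""
    if conn.done:
        # A finished connection must sink to the bottom of the heap.
        return math.inf if ascending else -math.inf
    if not conn.buffer:
        # Data may still arrive, so the connection must rise to the top
        # and stop any output until its next record is known.
        return -math.inf if ascending else math.inf
    return conn.buffer[0]

waiting = Conn("A")
ready = Conn("B", deque([1, 7, 13]))
finished = Conn("C", done=True)
print([sort_value(c) for c in (waiting, ready, finished)])
```

Giving an empty, unfinished connection the extreme value that places it at the top of the heap is what prevents a record from being emitted before every connection's next candidate is known.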

[0032] After each heap sort, it is determined whether the buffer of the connection at the top of the heap is empty. When that buffer is not empty, the first record is removed from it and sent to the client 104; the heap sort of all connections and the removal and sending of the first record in the buffer of the top connection are then repeated until the buffer of the top connection is empty. Each time a first record has been removed from the buffer of the top connection and sent to the client, it is determined whether that buffer is no longer full; if so, the buffer's full mark is cleared.

[0033] When the buffer of the top connection is empty and that connection is marked as read-finished, the connection is marked as processed; otherwise the polling continues, in an attempt to read data from some connection.

[0034] When all connections have been processed, an end marker is sent to the client to indicate that the sorting process has finished; otherwise the polling continues, in an attempt to read data from some connection.

[0035] The detailed process of the method of the present invention for providing ordered data is described below with reference to FIG. 2.

[0036] The sorting middleware receives from the client a query request with a sorting requirement, for example an SQL request.

[0037] In step S2001 the sorting middleware receives and parses the query request. If the query request involves multiple MySQL instances, it is split to produce a sub-query request for each MySQL instance.

[0038] If the query request involves only a single MySQL instance, it does not need to be split, and the middleware only needs to receive the data sent by that instance.

[0039] In step S2003 the sorting middleware obtains the connections to the relevant MySQL instances. In step S2005 it creates a buffer for each of these connections to store data records, empties the buffers, and marks them as not full.

[0040] In step S2007 the sorting middleware sends the generated sub-SQL requests over these connections to the corresponding MySQL instances.

[0041] Each MySQL instance receives its sub-SQL request, retrieves data accordingly, sorts the data as required, and returns the ordered data to the sorting middleware over the corresponding connection.

[0042] In step S2009 the sorting middleware polls, according to a predetermined rule, to find which connections have data available to read. If no connection has readable data, it keeps waiting and polls again. For example, the connections may be polled periodically at a configured interval to determine whether any of them has data available to read.
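One possible way to implement the polling of step S2009 is with the operating system's readiness notification. The sketch below is an assumption-laden illustration, not the patent's implementation: it uses Python's standard `selectors` module, substitutes local socket pairs from `socket.socketpair()` for real TCP connections to MySQL instances, and the labels "A", "B", "C" and the helper name `readable_connections` are invented for the example.

```python
import selectors
import socket

def readable_connections(selector, timeout=0.1):
    """Return the labels of registered connections that have data to read."""
    return sorted(key.data for key, _ in selector.select(timeout))

selector = selectors.DefaultSelector()
pairs = {}
for label in ("A", "B", "C"):
    # source_end plays the data source, middleware_end plays the middleware.
    source_end, middleware_end = socket.socketpair()
    selector.register(middleware_end, selectors.EVENT_READ, data=label)
    pairs[label] = (source_end, middleware_end)

# Only the sources behind A and C have sent anything so far.
pairs["A"][0].sendall(b"0\n")
pairs["C"][0].sendall(b"8\n")

ready = readable_connections(selector)
print(ready)

# Release the sockets and the selector.
for source_end, middleware_end in pairs.values():
    selector.unregister(middleware_end)
    source_end.close()
    middleware_end.close()
selector.close()
```

The `timeout` argument corresponds loosely to the configured polling interval mentioned above: if nothing is readable, `select` returns an empty list after the timeout and the caller can poll again.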

[0043] How this determination is made depends on how the connection is implemented. For example, for a connection implemented at the TCP layer, the sending end and the receiving end of the TCP connection each have a buffer. When a database such as a MySQL instance sends data over the TCP connection, the data is first placed in the send buffer and then transmitted to the receiving end, which stores the received data in the receive buffer of the connection. These operations are carried out by the respective operating systems.

[0044] If step S2009 determines that there is a connection from which data can be read, step S2011 determines whether the memory buffer corresponding to that connection is full.

[0045] Whether a buffer is full here means whether the amount of data in the buffer has exceeded the predetermined threshold. When data is read from a connection, all of the data currently readable on that connection should be read. Therefore, even if the connection's buffer becomes full while the connection is being read, the remaining readable data is still read and stored in the buffer (for example, the buffer actually has spare space for holding this "overflow").

[0046] When step S2011 determines that the buffer is not full, the sorting middleware in step S2012 reads all readable data records on the connection and stores them in the connection's memory buffer. Step S2013 then determines whether the buffer is full, and if it is, step S2014 marks the buffer as full.

[0047] By setting a "full" mark on each connection's buffer in this way, the present invention controls memory use and achieves a balance between efficiency and availability.
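The "full" mark can be sketched as a soft limit on a per-connection queue. The class below is illustrative only: the name `ConnectionBuffer`, its methods, and the choice to measure the "amount of data" as a record count are assumptions made for the example. It stores a whole batch even when that pushes the buffer past the threshold, as the overflow remark above allows, and clears the mark once removals bring the buffer back to the threshold or below.

```python
from collections import deque

class ConnectionBuffer:
    """Per-connection record queue with a soft limit (hypothetical helper)."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed to be a number of records
        self.records = deque()
        self.full = False

    def store(self, batch):
        """Store every record read in one pass, then re-evaluate the mark."""
        self.records.extend(batch)
        if len(self.records) > self.threshold:
            self.full = True

    def take_first(self):
        """Remove and return the first record, clearing the mark if possible."""
        record = self.records.popleft()
        if len(self.records) <= self.threshold:
            self.full = False
        return record

buf = ConnectionBuffer(threshold=2)
buf.store([1, 7, 13])          # one read pass delivers three records
was_full = buf.full            # over the threshold, so reading pauses
first = buf.take_first()       # sending a record frees space
print(was_full, first, buf.full)
```

While the mark is set the middleware stops reading that connection, so unread data accumulates in the TCP receive buffer and the data source is eventually blocked from sending more, which is what keeps the middleware's memory use bounded.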

[0048] Next, step S2015 determines from the data read whether all of the data to be read over the connection has been read, that is, whether all of the data that the corresponding data source (MySQL instance) is to supply has been read. This determination can be made using techniques known in the art. If reading is complete, step S2017 marks the connection as read-finished.

[0049] Next, in step S2019 a heap sort is performed over all connections; in other words, all connections are heapified once.

[0050] Heapification is the key to the performance gain of the present invention, because handling the whole sorting process with the conventional merge algorithm alone would cause data on multiple connections to be compared repeatedly. The details of the "heapification" of the present invention are described separately below.

[0051] After all connections have been heapified, step S2021 determines whether the memory buffer of the connection at the top of the heap is empty.

[0052] If it is empty, data on some connection has not yet arrived, and the process again determines which connections have readable data.

[0053] If the memory buffer of the top connection contains data records, the first record in that buffer is the smallest of the records not yet output to the client (for ascending order; for descending order it is the largest). In step S2029 the first record is removed from the memory buffer of the top connection and sent to the client. "Removing" a record here means reading it from the memory buffer and then deleting it from the buffer, so that the former second record becomes the first, which means the connection's value has changed. Step S2030 then determines whether the connection's memory buffer is still full; if it is no longer full, step S2031 clears the buffer's full mark. The process then returns to step S2019 to heapify all connections again and, depending on whether the buffer of the top connection contains data, either continues the loop or goes back to receiving data.

[0054] When step S2021 determines that the buffer of the top connection is empty, the process goes to step S2022 to determine whether that connection is marked as read-finished. If it is not, the process returns to step S2009; otherwise step S2023 marks the connection as processed.

[0055] Step S2025 then determines whether all connections have been processed.

[0056] If all connections have been processed, the process goes to step S2027 and sends an end marker to the client. In the MySQL protocol, for example, the end marker is the EOF packet.

[0057] If some connections have not yet been processed, the process returns to step S2009 to determine again which connections have readable data.

[0058] According to the present invention, the limit on each connection's buffer ensures that the memory the middleware needs for sorting is bounded and controllable, while heapification keeps the number of comparisons to a minimum, ensuring that the time complexity of the whole sort is O(n lg n).
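The whole loop of FIG. 2 can be simulated in memory. The following is a minimal sketch under stated assumptions, not the patent's implementation: connections are replaced by lists of "chunks" (each chunk standing for the records that become readable at one poll), only ascending order is handled, the threshold counts records, and names such as `Conn`, `ordered_merge`, and `chunked_sources` are invented for the example. Python's `heapq` keeps the heap of connections up to date after each removal.

```python
import heapq
import math
from collections import deque

class Conn:
    """Simulated connection: `pending` holds chunks that have not arrived yet."""
    def __init__(self, chunks):
        self.pending = deque(chunks)
        self.buffer = deque()
        self.full = False       # buffer limit mark
        self.read_done = False  # everything has been read from the source
        self.done = False       # everything has been forwarded to the client

    def value(self):
        if self.done:
            return math.inf
        return self.buffer[0] if self.buffer else -math.inf

def ordered_merge(chunked_sources, threshold):
    conns = [Conn(chunks) for chunks in chunked_sources]
    output = []
    if not conns:
        return output
    while True:
        # Polling (S2009-S2017): read only connections whose buffer is not full.
        for conn in conns:
            if conn.read_done or conn.full:
                continue
            if conn.pending:
                conn.buffer.extend(conn.pending.popleft())
                conn.full = len(conn.buffer) > threshold
            if not conn.pending:
                conn.read_done = True
        # Heapify (S2019), then drain from the top (S2021-S2031).
        heap = [(conn.value(), i) for i, conn in enumerate(conns)]
        heapq.heapify(heap)
        while True:
            _, i = heap[0]
            top = conns[i]
            if top.buffer:
                output.append(top.buffer.popleft())  # "send to the client"
                if len(top.buffer) <= threshold:
                    top.full = False
            elif top.read_done:
                top.done = True
                if all(c.done for c in conns):
                    return output  # S2027: the end marker would be sent here
            else:
                break  # the top connection has no data yet: poll again
            heapq.heapreplace(heap, (top.value(), i))

sources = [
    [[0], [6, 12]], [[1, 7, 13]], [[8], [20]],
    [[9, 15], [21]], [[4, 10]], [[5], [11, 17]],
]
result = ordered_merge(sources, threshold=2)
print(result)
```

In this sketch at most one chunk per connection is read per polling round, so no buffer ever holds more than the threshold plus one chunk, regardless of how much data each source still has to send.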

[0059] 堆化处理 [0059] stack processing

[0060] The merge algorithm is a known sorting algorithm used to combine two or more ordered sequences into a single ordered sequence. Assume the sort order is ascending. When n ordered sequences are involved (n being an integer greater than 2), each round must compare all the sequences once to obtain the smallest element, which is then removed from its ordered sequence. Suppose the ordered sequences are labelled 1...k...n, and the first element of sequence k is the smallest. After the first element of sequence k is removed, the first elements of all the ordered sequences would have to be compared again, i.e. n(n-1)/2 comparisons. However, the other n-1 ordered sequences have already been compared with one another, so it suffices to compare the new first element of sequence k with the first elements of the other n-1 ordered sequences, i.e. n-1 further comparisons. In fact, because the first elements of the other n-1 ordered sequences have already been compared, if the results of the earlier comparisons have been recorded, the new first element of sequence k only needs to be compared with the smallest of the first elements of the other n-1 ordered sequences. In the ideal case, therefore, a single further comparison yields the next smallest element. The new first element of sequence k may, however, be larger than the first elements of all the other sequences, in which case it must again be compared with the other n-1 ordered sequences. But if the comparison results for the other n-1 sequences have all been recorded, and recorded in the form of a heap, only lg n comparisons are needed.
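As a rough illustration of the heap-based merge described above, the following Python sketch merges several ascending sequences using the standard-library `heapq` module. The function name `merge_sorted` and the `(value, sequence index, position)` tuple layout are assumptions made for this example and do not come from the specification.

```python
import heapq


def merge_sorted(sequences):
    """Merge already-ascending lists into one ascending list.

    Illustrative sketch only: the heap holds one entry per non-empty
    sequence, namely (current first element, sequence index, position).
    """
    heap = [(seq[0], idx, 0) for idx, seq in enumerate(sequences) if seq]
    heapq.heapify(heap)

    merged = []
    while heap:
        value, idx, pos = heap[0]          # smallest first element overall
        merged.append(value)
        if pos + 1 < len(sequences[idx]):
            # Replace the top with the next element of the same sequence;
            # restoring heap order costs O(log n) comparisons.
            heapq.heapreplace(heap, (sequences[idx][pos + 1], idx, pos + 1))
        else:
            # That sequence is exhausted, so drop it from the heap.
            heapq.heappop(heap)
    return merged
```

Each step touches only one root-to-leaf path of the heap, which is where the lg n comparison count in the paragraph above comes from.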

[0061] Robert W. Floyd and J. Williams jointly invented the well-known heap sort algorithm (Heap Sort) in 1964. Heap sort produces an ordered heap structure, which is a complete binary tree. Heaps are divided into max-heaps (also called big-root heaps or big-top heaps) and min-heaps (also called small-root heaps or small-top heaps). In a max-heap, the value of every node is no greater than the value of its parent node. In a min-heap, the value of a non-leaf node is less than the values of its child nodes.
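The heap property can be checked directly on the usual array layout of a complete binary tree, where the parent of index i sits at index (i - 1) // 2. The sketch below is illustrative only; the helper names `is_min_heap` and `is_max_heap` are assumptions, and the comparisons are non-strict so that equal values are accepted.

```python
import heapq


def is_min_heap(values):
    """True if every node is >= its parent (array-stored complete binary tree)."""
    return all(values[(i - 1) // 2] <= values[i] for i in range(1, len(values)))


def is_max_heap(values):
    """True if every node is <= its parent (array-stored complete binary tree)."""
    return all(values[(i - 1) // 2] >= values[i] for i in range(1, len(values)))


# heapq.heapify rearranges a list in place so that it satisfies the
# min-heap property; the smallest value ends up at index 0.
sample = [5, 1, 4, 2]
heapq.heapify(sample)
```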

[0062] Heapifying the connections means performing a heap sort on the connections. The node elements of the heap (binary tree) produced by the heapification of the present invention are connections, not the records within the connections.

[0063] For sorting in ascending order, a min-heap is used, i.e. the element at the top of the heap is the smallest, and the value of each connection is determined according to the following rules:

[0064] If the buffer corresponding to the connection is empty (no data), i.e. the data has not yet been sent from the MySQL instance, the connection takes the value infinitely small (this can be regarded as the system minimum, i.e. no value in the system can be smaller than this minimum);

[0065] If all the data to be transmitted over the connection has been processed, the connection takes the value infinitely large (this can be regarded as the system maximum, i.e. no value in the system can be larger than this maximum);

[0066] If neither of the above two cases applies, the value of the connection is the sort-field value of the first record in the buffer corresponding to the connection.
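The three rules above can be sketched as a small key function. This is a minimal illustration under stated assumptions: the `Connection` class, its `buffer` and `finished` fields, and the function name `connection_value` are hypothetical, records are represented directly by their numeric sort-field values, and `math.inf` stands in for the system maximum and minimum. The finished check comes first because a fully processed connection also has an empty buffer.

```python
import math
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Hypothetical stand-in for a connection and its memory buffer."""
    name: str
    buffer: deque = field(default_factory=deque)  # sort-field values, ascending
    finished: bool = False                        # all data has been processed


def connection_value(conn):
    """Value used to order a connection in a min-heap (ascending sort)."""
    if conn.finished:
        return math.inf       # fully processed: sinks to the bottom of the heap
    if not conn.buffer:
        return -math.inf      # data not yet arrived: rises to the top of the heap
    return conn.buffer[0]     # sort-field value of the first buffered record
```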

[0067] In the min-heap case, if a connection has not yet received any data, that empty connection is certain to be pushed to the top of the heap; if all the data of a connection has been processed, that connection will be pushed to the very bottom of the heap. Apart from these two cases, the connection whose first record is the smallest is always at the top of the heap.

[0068] The heapification of the present invention ensures that the connections, as the elements of the heap, all satisfy the condition that the value of the first record of a parent-node connection is less than the value of the first record of its child-node connections.
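To make the resulting heap order concrete, the sketch below heapifies a handful of connections by their values and reports which one ends up at the top. It is an illustration under assumptions: each connection is modelled as a name mapped to a pair of (buffered sort-field values, finished flag), and the helper names `value_of` and `top_connection` are hypothetical.

```python
import heapq
import math


def value_of(buffer, finished):
    """Connection value for an ascending sort (min-heap)."""
    if finished:
        return math.inf
    if not buffer:
        return -math.inf
    return buffer[0]


def top_connection(connections):
    """Return the name of the connection at the top of the min-heap.

    `connections` maps a name to (buffer list, finished flag).
    """
    heap = [(value_of(buf, done), name) for name, (buf, done) in connections.items()]
    heapq.heapify(heap)
    return heap[0][1]


# "b" has no data yet, so it is at the top and no record can be emitted yet.
waiting = {"a": ([5, 8], False), "b": ([], False), "c": ([], True)}
# Once "b" has data, the connection with the smallest first record is on top.
ready = {"a": ([5, 8], False), "b": ([6], False), "c": ([], True)}
```

With an empty connection at the top, the merge has to wait for more data, because that connection might still deliver a record smaller than any currently buffered one.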

[0069] If sorting in descending order is required, the present invention may use a max-heap, with a correspondingly adjusted comparison function used when comparing.
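One way to obtain max-heap behaviour is to keep a min-heap and adjust the comparison key. The sketch below negates numeric values for that purpose, since Python's `heapq` only provides a min-heap. This is a substitute technique chosen for the example, not necessarily the adjusted comparison function the specification has in mind; the function name `merge_sorted_desc` is an assumption.

```python
import heapq


def merge_sorted_desc(sequences):
    """Merge already-descending numeric lists into one descending list.

    Values are negated before entering the heap, so the min-heap's top
    entry corresponds to the largest remaining first element.
    """
    heap = [(-seq[0], idx, 0) for idx, seq in enumerate(sequences) if seq]
    heapq.heapify(heap)

    merged = []
    while heap:
        negated, idx, pos = heap[0]
        merged.append(-negated)
        if pos + 1 < len(sequences[idx]):
            heapq.heapreplace(heap, (-sequences[idx][pos + 1], idx, pos + 1))
        else:
            heapq.heappop(heap)
    return merged
```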

[0070] The embodiments shown in the specification and drawings are for explanation and illustration only and do not limit the scope of the present invention, which is defined by the claims.

Claims (8)

1. A method for providing ordered data, comprising the steps of:
receiving from a client a query request with a sorting requirement, and parsing the query request to generate sub-query requests corresponding to the respective data sources;
obtaining the connection corresponding to each data source, creating a buffer corresponding to each connection, clearing it and marking it as not full, and sending each generated sub-query request over the corresponding connection to the corresponding data source, wherein the data source returns ordered data records over the corresponding connection;
polling the connections according to a predetermined rule to determine which connections have data available to read, wherein, when it is determined that a connection has data available to read and the memory buffer corresponding to that connection is not full, all readable data records on the connection are read and stored in the memory buffer corresponding to that connection; when the amount of data in the buffer exceeds a predetermined threshold, the buffer is marked as full; and when it is determined that all the data to be read over the connection has been read, the connection is marked as read-finished;
performing a heap sort of all the connections;
when the memory buffer corresponding to the connection at the top of the heap is not empty, taking the first record out of that memory buffer and sending it to the client, and repeating the heap sort of all the connections and taking the first record out of the memory buffer corresponding to the connection at the top of the heap and sending it to the client, until the memory buffer corresponding to the connection at the top of the heap is empty, wherein, after the first record has been taken out of the memory buffer corresponding to the top of the heap and sent to the client, it is determined whether that memory buffer is not full, and if it is not full, the full mark of that memory buffer is cleared;
when the memory buffer corresponding to the connection at the top of the heap is empty and that connection is marked as read-finished, marking the connection as processed, and otherwise continuing the polling;
when all the connections have been processed, sending an end flag to the client, and otherwise continuing the polling,
wherein the connections take values according to the following rules:
if the connection is marked as processed, the connection takes the value infinitely large for an ascending sort and infinitely small for a descending sort;
if the connection is not marked as processed and the buffer corresponding to the connection is empty, the connection takes the value infinitely small for an ascending sort and infinitely large for a descending sort;
if neither of the above two cases applies, the value of the connection is the sort-field value of the first record in the buffer corresponding to the connection.
2. The method according to claim 1, wherein the data source is a MySQL instance.
3. The method according to claim 1, wherein polling the connections according to the predetermined rule consists of polling each connection periodically at a set time interval.
4. The method according to claim 1, wherein the connection is a connection implemented at the TCP layer.
5. A system for providing ordered data, comprising:
a client, configured to generate a query request with a sorting requirement;
a plurality of data sources, each of which generates sorted data records according to the corresponding sub-query request and passes them to the corresponding connection;
a connection part, configured to manage and allocate the connections to the data sources;
a sorting mechanism, configured to perform the following steps:
receiving from the client a query request with a sorting requirement, and parsing the query request to generate sub-query requests corresponding to the respective data sources;
obtaining the connection corresponding to each data source, creating a buffer corresponding to each connection, clearing it and marking it as not full, and sending each generated sub-query request over the corresponding connection to the corresponding data source, wherein the data source returns ordered data records over the corresponding connection;
polling the connections according to a predetermined rule to determine which connections have data available to read, wherein, when it is determined that a connection has data available to read and the memory buffer corresponding to that connection is not full, all readable data records on the connection are read and stored in the memory buffer corresponding to that connection; when the amount of data in the buffer exceeds a predetermined threshold, the buffer is marked as full; and when it is determined that all the data to be read over the connection has been read, the connection is marked as read-finished;
performing a heap sort of all the connections;
when the memory buffer corresponding to the connection at the top of the heap is not empty, taking the first record out of that memory buffer and sending it to the client, and repeating the heap sort of all the connections and taking the first record out of the memory buffer corresponding to the connection at the top of the heap and sending it to the client, until the memory buffer corresponding to the connection at the top of the heap is empty, wherein, after the first record has been taken out of the memory buffer corresponding to the top of the heap and sent to the client, it is determined whether that memory buffer is not full, and if it is not full, the full mark of that memory buffer is cleared;
when the memory buffer corresponding to the connection at the top of the heap is empty and that connection is marked as read-finished, marking the connection as processed, and otherwise continuing the polling;
when all the connections have been processed, sending an end flag to the client, and otherwise continuing the polling,
wherein the connections take values according to the following rules:
if the connection is marked as processed, the connection takes the value infinitely large for an ascending sort and infinitely small for a descending sort;
if the connection is not marked as processed and the buffer corresponding to the connection is empty, the connection takes the value infinitely small for an ascending sort and infinitely large for a descending sort;
if neither of the above two cases applies, the value of the connection is the sort-field value of the first record in the buffer corresponding to the connection.
6. The system according to claim 5, wherein the data source is a MySQL instance.
7. The system according to claim 5, wherein polling the connections according to the predetermined rule consists of polling each connection periodically at a set time interval.
8. The system according to claim 5, wherein the connection is a connection implemented at the TCP layer.
CN 201510250580 2015-05-18 2015-05-18 System and method for providing sorted data CN104850618A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201510250580 CN104850618A (en) 2015-05-18 2015-05-18 System and method for providing sorted data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201510250580 CN104850618A (en) 2015-05-18 2015-05-18 System and method for providing sorted data

Publications (1)

Publication Number Publication Date
CN104850618A (en) 2015-08-19

Family

ID=53850262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201510250580 CN104850618A (en) 2015-05-18 2015-05-18 System and method for providing sorted data

Country Status (1)

Country Link
CN (1) CN104850618A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020087692A1 (en) * 1997-12-01 2002-07-04 Netselector, Inc. Site access via intervening control layer
CN102968496A (en) * 2012-12-04 2013-03-13 天津神舟通用数据技术有限公司 Parallel sequencing method based on task derivation and double buffering mechanism
CN103116655A (en) * 2013-03-06 2013-05-22 亿赞普(北京)科技有限公司 Clustered data query method, client side and system
CN103399944A (en) * 2013-08-14 2013-11-20 曙光信息产业(北京)有限公司 Implementation method and implementation device for data duplication elimination query
CN104111936A (en) * 2013-04-18 2014-10-22 阿里巴巴集团控股有限公司 Method and system for querying data
CN104363277A (en) * 2014-11-13 2015-02-18 上海交通大学 Allocation management system and management method for bandwidth resources in cloud game distributed system
CN104601732A (en) * 2015-02-12 2015-05-06 北京金和软件股份有限公司 Method for merging multichannel data quickly

Similar Documents

Publication Publication Date Title
Gilbert et al. Algorithmic linear dimension reduction in the l_1 norm for sparse vectors
Logothetis et al. Stateful bulk processing for incremental analytics
Abu-Ghazaleh et al. Differential deserialization for optimized soap performance
US8229902B2 (en) Managing storage of individually accessible data units
US7523130B1 (en) Storing and retrieving objects on a computer network in a distributed database
Callahan et al. A census of cusped hyperbolic 3-manifolds
US20070100808A1 (en) High speed non-concurrency controlled database
US6978458B1 (en) Distributing data items to corresponding buckets for use in parallel operations
US6754799B2 (en) System and method for indexing and retrieving cached objects
US20050187946A1 (en) Data overlay, self-organized metadata overlay, and associated methods
Tang et al. Peersearch: Efficient information retrieval in peer-to-peer networks
US6377984B1 (en) Web crawler system using parallel queues for queing data sets having common address and concurrently downloading data associated with data set in each queue
US8042112B1 (en) Scheduler for search engine crawler
US20090287986A1 (en) Managing storage of individually accessible data units
US6735600B1 (en) Editing protocol for flexible search engines
US20040181523A1 (en) System and method for generating and processing results data in a distributed system
US6263364B1 (en) Web crawler system using plurality of parallel priority level queues having distinct associated download priority levels for prioritizing document downloading and maintaining document freshness
US6351755B1 (en) System and method for associating an extensible set of data with documents downloaded by a web crawler
US20120303622A1 (en) Efficient Indexing of Documents with Similar Content
US7702640B1 (en) Stratified unbalanced trees for indexing of data items within a computer system
Rajasekaran Efficient parallel hierarchical clustering algorithms
WO2005008524A1 (en) Distributed database system
Albers et al. Self-organizing data structures
US20120310917A1 (en) Accelerated Join Process in Relational Database Management System
Hambrusch et al. Query processing in broadcasted spatial index trees

Legal Events

Date Code Title Description
C06 Publication
EXSB Decision made by sipo to initiate substantive examination