CN106933855B - Object sorting method, device and system - Google Patents

Info

Publication number
CN106933855B
Authority
CN
China
Prior art keywords
processed
sorting
data
data information
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201511021334.8A
Other languages
Chinese (zh)
Other versions
CN106933855A (en
Inventor
陈友林
肖强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Beijing Software Services Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201511021334.8A priority Critical patent/CN106933855B/en
Publication of CN106933855A publication Critical patent/CN106933855A/en
Application granted granted Critical
Publication of CN106933855B publication Critical patent/CN106933855B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Abstract

The application provides an object sorting method, device and system. The method comprises: determining to-be-processed data information of an object to be processed; when the sorting mode is descending by data value, sending the to-be-processed data information to the sorting node only if the to-be-processed data value is larger than the current threshold; when the sorting mode is ascending by data value, sending the to-be-processed data information to the sorting node only if the to-be-processed data value is smaller than the current threshold. The current threshold is the data value in the last data information of the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode. The method and device take less time and less memory, thereby improving the sorting efficiency for mass data and reducing the network throughput.

Description

Object sorting method, device and system
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, and a system for object sorting.
Background
With the rapid development of science and technology, computer technology is beginning to be popularized in various industries. Along with the daily operation of various industries, the computers can generate mass data containing rich data information, and the data information has higher application value. In order to extract valuable data information in the mass data, data analysis can be performed on the mass data.
At present, mass data can be analyzed by technical means such as data sorting, modeling analysis and data statistics. Sorting the objects in mass data is a common way of obtaining valuable data information from it. Examples include ranking sellers on the Taobao network by order volume, ranking news on the Internet by click volume, and ranking cities across the country by PM2.5 pollution index.
When sorting the objects in mass data, if the objects are fixed and unchanging, the sorting result is also fixed and unchanging. In this case, the objects can be sorted by a simple sorting algorithm. However, mass data changes constantly as computers in various industries keep operating. That is, the total number of objects to be sorted changes, or the data value of each object changes. When the mass data changes constantly, it must be sorted in real time to keep the sorting result accurate.
At present, technical schemes for sorting mass data in real time generally rank every object to be sorted in real time and then obtain a final sorting result. However, the total number of objects in mass data is huge, and ranking every object makes each sort take a long time and a large amount of memory, which lowers the efficiency of the sorting process and increases the network throughput.
Therefore, an object sorting method is needed that takes less time and less memory, thereby improving the sorting efficiency for mass data and reducing the network throughput.
Disclosure of Invention
The inventor of the application finds out in the research process that:
objects in the mass data may be sorted based on an object sorting system. Referring to fig. 1, an object ranking system includes: a plurality of distribution nodes 110, a plurality of compute nodes 120, a plurality of first level sort nodes 130, and a second level sort node 140. The following describes the implementation of the object ranking system shown in fig. 1.
Due to the huge total amount of objects of mass data, the computing power of one computing node is not enough to compute the data values of all the objects. To compute the data values of all objects, the distribution node 110 may distribute all objects in the mass data to different compute nodes 120 so that each compute node 120 computes the data values of a portion of the objects.
Because the total number of objects in the mass data is huge, the sorting capability of one sorting node is not enough to sort all the objects, so two levels of sorting nodes can be used. Each first-level sorting node sorts part of the objects and outputs only its sorting result to the second-level sorting node; the second-level sorting node integrates the sorting results of all the first-level sorting nodes, so that a specified number of sorting results is output.
For example, taking the TOP 10 items as the output of the sort result (i.e., calculating TOP (10)), it is assumed that the total number of objects is 50, the number of the first-level sort nodes is 2, and the number of the second-level sort nodes is 1. One primary sorting node can sort 25 objects and output the TOP 10 of the 25 objects (TOP (10)); another level one sorting node may sort another 25 objects and output the TOP 10 of the other 25 objects (TOP (10)). The two first-level sorting nodes can output the sorting results of the calculated partial objects to the second-level sorting node, and the sorting results of the two partial objects are integrated by the second-level sorting node, so that a final sorting result, namely the top 10 of the total 50 objects, is obtained.
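The two-level scheme in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a round-robin split stands in for the distribution step, and data values 1 to 50 stand in for the 50 objects.

```python
import heapq

def top_n(values, n):
    """Return the n largest values in descending order."""
    return heapq.nlargest(n, values)

def two_level_top_n(values, n, num_first_level):
    """Split values across first-level sorters, then merge their partial results."""
    # Assumption: a round-robin partition stands in for the distribution step.
    partitions = [values[i::num_first_level] for i in range(num_first_level)]
    # Each first-level node outputs only its own TOP (n).
    partial_results = [top_n(part, n) for part in partitions]
    # The second-level node sees n values per first-level node and sorts again.
    merged = [v for part in partial_results for v in part]
    return top_n(merged, n)

values = list(range(1, 51))          # 50 objects with data values 1..50
result = two_level_top_n(values, 10, 2)
```

Every object is still sorted once on a first-level node, which is the cost the following paragraphs set out to remove.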
This sorting method uses two levels of sorting nodes to improve sorting efficiency: the first-level sorting nodes perform partial sorting, and the second-level sorting node then sorts again. Although hierarchical sorting can improve sorting efficiency to a certain extent, each object still needs to be sorted on a first-level sorting node, which occupies a large amount of memory. Moreover, the second-level sorting node needs to integrate multiple partial sorting results repeatedly, which increases the delay of the ranking results and the network throughput.
In view of this, the present application provides an object sorting method, device, and system, which may occupy less time and less memory, thereby improving the efficiency of sorting mass data and reducing the network throughput.
In order to achieve the above object, the present application provides the following technical means:
an object sorting method is applied to a computing node in a computing node group of an object sorting system, and the object sorting system also comprises a distribution node group and a sorting node which are connected with the computing node group; the method comprises the following steps:
determining to-be-processed data information of an object to be processed; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
when the sorting mode is that the data values are sorted in a descending order, the data information to be processed is sent to the sorting node only when the data value to be processed is larger than the current threshold value;
under the condition that the sorting mode is ascending sorting according to the data values, the data information to be processed is sent to the sorting node only under the condition that the data value to be processed is smaller than the current threshold value;
the current threshold value is a data value in the last data information of the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
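The send-or-filter decision in the steps above reduces to a single comparison. A minimal sketch, assuming hypothetical names (`SortOrder`, `should_send`) and assuming a current threshold is already known to the computing node:

```python
from enum import Enum

class SortOrder(Enum):
    DESCENDING = "descending"
    ASCENDING = "ascending"

def should_send(pending_value, current_threshold, order):
    """Decide whether a computing node forwards a record to the sorting node."""
    if order is SortOrder.DESCENDING:
        # Only values strictly larger than the last entry can change the result.
        return pending_value > current_threshold
    # Ascending: only values strictly smaller than the last entry can change it.
    return pending_value < current_threshold
```

A value equal to the threshold is filtered out in both modes, matching the strict "larger than" and "smaller than" wording of the method.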
Preferably, the determining the to-be-processed data information of the to-be-processed object includes:
receiving current data information of an object to be processed sent by the distribution node group, wherein the current data information comprises an identifier of the object to be processed and a current data value;
determining the to-be-processed object identification of the to-be-processed object in the current data information;
searching a historical data value to be processed corresponding to the identification of the object to be processed;
and determining the sum of the historical data value to be processed and the current data value as the data value to be processed.
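The accumulation step can be illustrated as follows. This is a sketch, not the patented implementation: the class name is hypothetical, an object with no history is assumed to start at 0, and the sum is assumed to be stored back as the new historical value.

```python
class ComputeNodeState:
    """Hypothetical per-node store of accumulated data values."""

    def __init__(self):
        self.history = {}  # object identifier -> historical data value

    def pending_value(self, object_id, current_value):
        """Return historical value + current value for the given object."""
        # Assumption: an object seen for the first time has a history of 0.
        historical = self.history.get(object_id, 0)
        pending = historical + current_value
        # Assumption: the sum becomes the historical value for the next record.
        self.history[object_id] = pending
        return pending

state = ComputeNodeState()
first = state.pending_value("seller-1", 5)
second = state.pending_value("seller-1", 3)
```

Here `first` is 5 and `second` is 8, so the data value compared with the threshold is the running total for the object rather than the latest increment.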
Preferably, the method further comprises the following steps:
after the sorting node updates the existing sorting result according to the to-be-processed data information to generate a current sorting result, receiving a latest threshold value sent by the sorting node; wherein the latest threshold is a data value in the last data information of the current sorting result;
updating the current threshold with the latest threshold.
An object sorting method is applied to a computing node in a computing node group of an object sorting system, and the object sorting system also comprises a distribution node group and a sorting node group which are connected with the computing node group; the method comprises the following steps:
determining to-be-processed data information of an object to be processed and a to-be-processed category identifier for representing a category to which the object to be processed belongs; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
when the sorting mode is that the data values are sorted in a descending order, the data information to be processed is sent to the sorting node corresponding to the category identifier to be processed only when the data value to be processed is larger than the current threshold corresponding to the category identifier to be processed;
when the sorting mode is that the data values are sorted in an ascending order, the data information to be processed is sent to the sorting node corresponding to the category identifier to be processed only when the data value to be processed is smaller than the current threshold corresponding to the category identifier to be processed;
the current threshold value is a data value in the last data information of the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
Preferably, the determining the to-be-processed data information of the to-be-processed object and the to-be-processed category identifier for indicating the category to which the to-be-processed object belongs includes:
receiving current data information of an object to be processed, which is sent by the distribution node group, wherein the data information comprises a category identifier to be processed, an object identifier to be processed and a current data value;
determining the class identifier to be processed and the object identifier to be processed of the object to be processed in the current data information;
searching a historical data value to be processed corresponding to the identification of the object to be processed in a historical data value set corresponding to the identification of the category to be processed;
and determining the sum of the historical data value to be processed and the current data value as the data value to be processed of the object to be processed.
Preferably, the method further comprises the following steps:
after the sorting node corresponding to the category identifier to be processed updates the existing sorting result according to the data information to be processed to generate a current sorting result, receiving the category identifier and the latest threshold value sent by the sorting node corresponding to the category identifier to be processed;
updating a current threshold corresponding to the category identification by using the latest threshold;
and the latest threshold is a data value in the last data information in the current sorting result, and the category identification is consistent with the category identification to be processed of the object to be processed.
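The category-based variant keeps one history set and one threshold per category identifier. The sketch below covers descending order only; the class and method names are hypothetical, and forwarding every record while no threshold is known for a category is an assumption, not something the text specifies.

```python
from collections import defaultdict

class CategoryComputeNode:
    """Hypothetical computing node for descending sorting with per-category thresholds."""

    def __init__(self):
        self.history = defaultdict(dict)  # category id -> {object id: accumulated value}
        self.thresholds = {}              # category id -> current threshold

    def process(self, category_id, object_id, current_value):
        """Accumulate the value; return the message to send, or None if filtered out."""
        per_category = self.history[category_id]
        pending = per_category.get(object_id, 0) + current_value
        per_category[object_id] = pending
        threshold = self.thresholds.get(category_id)
        # Assumption: with no threshold known yet, the record is always forwarded.
        if threshold is None or pending > threshold:
            return (category_id, object_id, pending)
        return None

    def on_threshold(self, category_id, latest_threshold):
        """Apply a (category id, latest threshold) message from a sorting node."""
        self.thresholds[category_id] = latest_threshold

node = CategoryComputeNode()
node.on_threshold("books", 100)
filtered = node.process("books", "a", 60)   # 60 <= 100, not sent
sent = node.process("books", "a", 50)       # 60 + 50 = 110 > 100, sent
```

The returned category identifier is what lets the message be routed to the sorting node responsible for that category.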
An object sorting method is applied to a sorting node in an object sorting system, and the sorting system also comprises a computing node group and a distribution node group connected with the computing node group; the method comprises the following steps:
receiving the data information to be processed of the object to be processed, which is sent by the computing node group according to the method; the data information to be processed comprises an object identifier to be processed of the object to be processed and a data value to be processed;
and updating the existing sorting result according to the to-be-processed data information to generate a current sorting result.
Preferably, the updating the existing sorting result according to the to-be-processed data information to generate the current sorting result includes:
if the sorting data queue corresponding to the existing sorting result does not contain the identification of the object to be processed, deleting the last data information of the existing sorting result from the sorting data queue;
adding the to-be-processed data information to the sorting data queue;
and generating the current sorting result after the sorting data queue is reordered.
Preferably, the updating the existing sorting result according to the to-be-processed data information to generate the current sorting result includes:
if the sorting data queue corresponding to the existing sorting result contains the identification of the object to be processed, updating, in the sorting data queue, the historical data value corresponding to the identification of the object to be processed with the data value to be processed;
and generating the current sorting result after the sorting data queue is reordered.
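Both update cases on the sorting node can be sketched together. The class below is hypothetical; in particular, deleting the last entry only when the queue already holds `capacity` entries is an assumption added so the example can start from an empty queue.

```python
class SortingNode:
    """Hypothetical sorting node holding at most `capacity` entries."""

    def __init__(self, capacity, descending=True):
        self.capacity = capacity
        self.descending = descending
        self.queue = []  # list of (object_id, data_value), kept sorted

    def update(self, object_id, value):
        """Apply one incoming record and return the current sorting result."""
        ids = [oid for oid, _ in self.queue]
        if object_id in ids:
            # Object already ranked: overwrite its historical data value.
            self.queue[ids.index(object_id)] = (object_id, value)
        else:
            # Assumption: the last entry is deleted only when the queue is full.
            if len(self.queue) >= self.capacity:
                self.queue.pop()
            self.queue.append((object_id, value))
        # Reorder the queue to produce the current sorting result.
        self.queue.sort(key=lambda item: item[1], reverse=self.descending)
        return list(self.queue)

node = SortingNode(capacity=3)
node.update("a", 10)
node.update("b", 30)
node.update("c", 20)
after_new = node.update("d", 25)      # "a" is dropped, "d" is inserted
after_update = node.update("c", 40)   # "c" is updated in place and moves up
```

Re-sorting a queue of N entries on every update is cheap because N is the size of the requested result, not the size of the mass data.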
Preferably, after the generating the current sorting result, the method further includes:
determining a data value in the last data information of the current sorting result as a latest threshold value; sending the latest threshold value to each computing node in the computing node group; or,
determining a data value in the last data information of the current sorting result as a latest threshold value; sending the latest threshold value and the category identification of the sorting node to each computing node in the computing node group; wherein the class identifier is consistent with the to-be-processed class identifier of the to-be-processed object.
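The threshold feedback in both alternatives can be illustrated with plain dictionaries standing in for computing nodes. The function names and the dictionary layout are hypothetical, and the direct assignment is an in-process stand-in for a network message.

```python
def latest_threshold(sorted_result):
    """Data value of the last entry in the current sorting result."""
    return sorted_result[-1][1]

def broadcast_threshold(sorted_result, compute_nodes, category_id=None):
    """Send the latest threshold (and optionally a category id) to every computing node."""
    threshold = latest_threshold(sorted_result)
    for node in compute_nodes:
        # Assumption: category_id None represents the single-sorting-node case.
        node["thresholds"][category_id] = threshold
    return threshold

compute_nodes = [{"thresholds": {}}, {"thresholds": {}}]
current_result = [("b", 30), ("d", 25), ("c", 20)]
plain = broadcast_threshold(current_result, compute_nodes)
by_category = broadcast_threshold(current_result, compute_nodes, "books")
```

After both calls each computing node holds a threshold of 20 under the key `None` and under the key `"books"`.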
An object sorting device is integrated in a computing node group of an object sorting system, and the object sorting system also comprises a distribution node group and a sorting node which are connected with the computing node group; the device comprises:
the first determining data information unit is used for determining the to-be-processed data information of the to-be-processed object; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
the first data information sending unit is used for, when the sorting mode is descending by data value, sending the data information to be processed to the sorting node only if the data value to be processed is larger than the current threshold value;
the second data information sending unit is used for, when the sorting mode is ascending by data value, sending the data information to be processed to the sorting node only if the data value to be processed is smaller than the current threshold value;
the current threshold value is a data value in the last data information of the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
Preferably, the first determining data information unit includes:
a first data information receiving unit, configured to receive current data information of an object to be processed sent by the distribution node group, where the current data information includes an identifier of the object to be processed and a current data value;
a first determining unit, configured to determine the to-be-processed object identifier of the to-be-processed object in the current data information;
the first searching unit is used for searching the historical data value to be processed corresponding to the identification of the object to be processed;
and the second determining unit is used for determining the sum of the historical data value to be processed and the current data value as the data value to be processed.
An object sorting device is applied to a computing node in a computing node group of an object sorting system, and the object sorting system also comprises a distribution node group and a sorting node group which are connected with the computing node group; the device comprises:
the second determination data information unit is used for determining to-be-processed data information of the to-be-processed object and a to-be-processed category identifier for representing the category to which the to-be-processed object belongs; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
a third data information sending unit, configured to, when the sorting mode is descending by data value, send the to-be-processed data information to the sorting node corresponding to the to-be-processed category identifier only if the to-be-processed data value is greater than the current threshold corresponding to the to-be-processed category identifier;
a fourth data information sending unit, configured to, when the sorting mode is ascending by data value, send the to-be-processed data information to the sorting node corresponding to the to-be-processed category identifier only if the to-be-processed data value is smaller than the current threshold corresponding to the to-be-processed category identifier;
the current threshold value is a data value in last data information in an existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
Preferably, the second determination data information unit includes:
a second data information receiving unit, configured to receive current data information of the to-be-processed object sent by the distribution node group, where the data information includes a to-be-processed category identifier, a to-be-processed object identifier, and a current data value;
a third determining unit, configured to determine, in the current data information, the to-be-processed category identifier and the to-be-processed object identifier of the to-be-processed object;
the second searching unit is used for searching the historical data value to be processed corresponding to the identification of the object to be processed in the historical data value set corresponding to the identification of the class to be processed;
and the fourth determining unit is used for determining the sum of the historical data value to be processed and the current data value as the data value to be processed of the object to be processed.
Preferably, the device further comprises:
the second receiving threshold unit is used for receiving the category identifier and the latest threshold sent by the sorting node corresponding to the to-be-processed category identifier after the sorting node corresponding to the to-be-processed category identifier updates the existing sorting result according to the to-be-processed data information to generate a current sorting result;
a second updating unit that updates the current threshold corresponding to the category identifier using the latest threshold;
and the latest threshold is a data value in the last data information in the current sorting result, and the category identification is consistent with the category identification to be processed of the object to be processed.
An object sorting device is applied to a sorting node in an object sorting system, and the sorting system also comprises a computing node group and a distribution node group connected with the computing node group; the device comprises:
a third data information receiving unit, configured to receive to-be-processed data information of an object to be processed, where the to-be-processed data information is sent by the computing node group; each computing node in the computing node group is integrated with the object sorting device, and the to-be-processed data information comprises to-be-processed object identification and to-be-processed data value of the to-be-processed object;
and the generating unit is used for updating the existing sorting result according to the to-be-processed data information and generating the current sorting result.
Preferably, the generating unit includes:
a deleting unit, configured to delete the last data information of the existing sorting result from the sorting data queue if the sorting data queue corresponding to the existing sorting result does not include the identifier of the object to be processed;
the adding unit is used for adding the to-be-processed data information to the sorting data queue;
and the first result generation unit is used for generating the current sorting result after the sorting data queue is reordered.
Preferably, the generating unit includes:
a third updating unit, configured to update, in the sorting data queue, a historical data value corresponding to the to-be-processed object identifier with the to-be-processed data value if the sorting data queue corresponding to the existing sorting result includes the to-be-processed object identifier;
and the second result generating unit is used for generating the current sorting result after the sorting data queue is reordered.
An object ranking system comprising:
a distribution node group, a computing node group and a sorting node;
the computing node group is used for executing the object sorting method;
the sorting node is used for executing the object sorting method.
An object ranking system comprising:
a distribution node group, a computing node group and a sorting node group;
the computing node group is used for executing the object sorting method;
the sorting nodes in the sorting node group are used for executing the object sorting method.
From the above, it can be seen that the present application has the following beneficial effects:
In the object sorting system provided by the application, a current threshold is set on each computing node, where the current threshold is the data value in the last data information of the existing sorting result. When the sorting mode is descending (or ascending), that is, when the largest TOP (N) (or the smallest TOP (N)) is calculated, the data value in the last data information of the existing sorting result is the minimum value (or maximum value) in the existing sorting result.
Therefore, when the to-be-processed data value is greater than (or smaller than) the current threshold, the to-be-processed data information may change the existing sorting result, so the computing node sends the to-be-processed data information of the object to the sorting node, and the sorting node updates the existing sorting result. When the to-be-processed data value is not greater than (or not smaller than) the current threshold, the to-be-processed data information cannot change the existing sorting result, so the computing node does not send it, and the sorting node does not need to process that object.
That is, in the present application, the computing node compares the to-be-processed data information of each object with the current threshold, filters out most of the objects that are useless for the sorting result, and sends only the objects that are useful for the sorting result to the sorting node for sorting. The sorting node therefore sorts only a small number of objects; because the number of objects to be sorted is sharply reduced, sorting efficiency improves and memory usage falls.
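The filtering effect can be checked with a small simulation of one computing node and one sorting node for a descending TOP (10). Everything here is an illustrative assumption: the function name, the random data, the rule that all records are forwarded until the result holds N entries, and the simplification that every object appears once (so no accumulation is needed).

```python
import random

def run_filtered_top_n(stream, n):
    """Simulate one computing node and one sorting node for descending TOP (n).

    Returns (final result, number of records forwarded to the sorting node).
    """
    result = []        # sorting node state: (object_id, value), descending
    threshold = None   # computing node's copy of the current threshold
    forwarded = 0
    for object_id, value in stream:
        # Computing node: forward only values that beat the threshold.
        # Assumption: until the result holds n entries, everything is forwarded.
        if threshold is not None and value <= threshold:
            continue
        forwarded += 1
        # Sorting node: drop the last entry if full, add the record, re-sort.
        if len(result) >= n:
            result.pop()
        result.append((object_id, value))
        result.sort(key=lambda item: item[1], reverse=True)
        # Sorting node feeds the latest threshold back once the result is full.
        if len(result) >= n:
            threshold = result[-1][1]
    return result, forwarded

rng = random.Random(0)
stream = [(f"obj-{i}", rng.randint(1, 1_000_000)) for i in range(10_000)]
result, forwarded = run_filtered_top_n(stream, 10)
expected = sorted(stream, key=lambda item: item[1], reverse=True)[:10]
```

Comparing the values in `result` with those in `expected` confirms that filtering does not change the final ranking, while `forwarded` shows how few of the 10,000 records reach the sorting node.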
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of an object sorting system disclosed in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of another object sorting system disclosed in the embodiments of the present application;
FIG. 3 is a flowchart of a method for object ranking according to an embodiment of the present application;
FIG. 4 is a flowchart of another object ranking method disclosed in the embodiments of the present application;
FIG. 5 is a flow chart of yet another object ranking method disclosed in an embodiment of the present application;
FIG. 6 is a schematic diagram of a structure of another object ranking system disclosed in an embodiment of the present application;
FIG. 7 is a flowchart of another object ranking method disclosed in the embodiments of the present application;
FIG. 8 is a flow chart of yet another object ranking method disclosed in an embodiment of the present application;
FIG. 9 is a flowchart of another object ranking method disclosed in an embodiment of the present application;
FIG. 10 is a flow chart of yet another object ranking method disclosed in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an object sorting apparatus disclosed in an embodiment of the present application;
FIG. 12 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 13 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 14 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 15 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 16 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 17 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 18 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 19 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiments of the present application;
FIG. 20 is a schematic structural diagram of another object sorting apparatus disclosed in the embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Before describing the embodiments of the present application, first, an embodiment of an object ranking system is described to facilitate those skilled in the art to more easily understand application scenarios of the embodiments of the present application. As shown in fig. 2, the object ranking system includes: a distribution node group 100, a compute node group 200, and a sort node 301.
Through research of the inventor of the application, the root cause of the prior art that the sequencing efficiency is low and the network throughput is high is that: the number of objects participating in the ranking is large. Therefore, for the purpose of improving the sorting efficiency and reducing the network throughput, the technical means provided by the application is to reduce the number of objects participating in sorting so that the sorting node can efficiently and quickly calculate top (n) of a plurality of objects.
TOP (N) means that the first N objects are output according to a sorting mode, where N is a non-zero natural number. When the sorting mode is ascending, TOP (N) means outputting the N objects with the smallest data values; for example, the 10 lowest pollution indices nationwide. When the sorting mode is descending, TOP (N) means outputting the N objects with the largest data values; for example, the 10 sellers with the highest order quantities.
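The two sorting modes can be illustrated with a minimal sketch. The function name `top_n` and the sample seller data are hypothetical and are not part of the present application; the sketch only shows what TOP (N) returns for ascending and descending order.

```python
import heapq

def top_n(values, n, descending=True):
    """Return the first n (object_id, data_value) pairs in the requested order.

    `values` maps an object identifier to its data value (hypothetical layout).
    """
    pick = heapq.nlargest if descending else heapq.nsmallest
    return pick(n, values.items(), key=lambda item: item[1])

# Hypothetical order quantities per seller.
orders = {"seller_a": 120, "seller_b": 45, "seller_c": 300, "seller_d": 80}

largest_two = top_n(orders, 2, descending=True)    # maximum TOP (2)
smallest_two = top_n(orders, 2, descending=False)  # minimum TOP (2)
```

In both modes the last entry of the result is the boundary value that a new object has to beat in order to enter the result.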
In order to reduce the number of objects participating in sorting, the computing node group is used as a filter: it filters out the objects to be processed in the mass data that are useless for the existing sorting result and retains only those that are useful for the sorting result, so that only objects useful for the sorting result are sent to the sorting node. Because the sorting node processes only objects that are useful for the sorting result and does not need to process objects that are useless for it, the number of objects processed by the sorting node can be greatly reduced.
In order to achieve the purpose that the computing node group serves as a filter, a dynamically-changed current threshold value is set for each computing node in the computing node group, and the current threshold value changes along with the change of the data value of the last data information of the sorting result in the sorting node. The compute node may compare the data information for the object to a current threshold to decide whether to filter the data information for the object.
The present application provides a first embodiment of an object sorting method, which is applied to a computing node in the computing node group of the object sorting system shown in fig. 2. All computing nodes in the computing node group follow the same processing procedure, and a computing node processes every object in the same way, so the present application describes in detail only how one computing node processes one object. For convenience of description, the object being processed in this embodiment is referred to as the object to be processed.
As shown in fig. 3, the object sorting method specifically includes the following steps:
Step S301: determine to-be-processed data information of an object to be processed; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed.
Since the data value to be processed of the object to be processed keeps changing, the current to-be-processed data information of the object to be processed is determined before the subsequent steps are executed. The data information to be processed comprises the identifier of the object to be processed and the data value to be processed of the object to be processed. The identifier of the object to be processed does not change, while the data value to be processed changes continuously.
The data value to be processed changes on the basis of the historical data value. Therefore, the sum of the historical data value and the currently received data value may be taken as the data value to be processed.
For example, taking the cargo storage amount as the data value of the object to be processed, the cargo storage amount gradually increases on the basis of the historical cargo amount, so the data value to be processed is calculated as the sum of the historical cargo amount and the current cargo amount.
The application provides an embodiment of determining to-be-processed data information for a change process of the to-be-processed data value. Referring to fig. 4, the method specifically includes the following steps:
Step S401: receive current data information of the object to be processed sent by the distribution node group, wherein the current data information comprises an identifier of the object to be processed and a current data value.
Since a plurality of objects can be processed on one computing node, the computing node receives the current data information of the object to be processed and determines from it the identifier of the object to be processed and the current data value, so that the object to be processed can be distinguished among the plurality of objects by its identifier.
Step S402: determine the identifier of the object to be processed in the current data information.
Step S403: search for the historical data value to be processed corresponding to the identifier of the object to be processed.
Historical data values of a plurality of objects are maintained on the computing node, and the historical data values correspond one-to-one to the object identifiers. After the identifier of the object to be processed is determined, the historical data value to be processed corresponding to that identifier is looked up in the storage space.
Step S404: determine the sum of the historical data value to be processed and the current data value as the data value to be processed.
The sum of the historical data value to be processed and the current data value is calculated to obtain the data value to be processed of the object to be processed.
For example, taking the cargo storage amount as the data value of the object to be processed, when cargo is put into the warehouse the cargo storage amount increases on the basis of the historical cargo amount (the historical data value), so the data value to be processed may be calculated as the sum of the historical cargo amount and the current cargo amount.
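Steps S401 to S404 can be sketched as follows. This is only an illustration under stated assumptions: the class name `ComputeNodeState`, the use of 0 as the historical value of an object seen for the first time, and writing the sum back as the new historical value are assumptions, not details given in the present application.

```python
class ComputeNodeState:
    """Hypothetical per-node store of historical data values."""

    def __init__(self):
        # object identifier -> historical data value (one-to-one)
        self.history = {}

    def determine_pending(self, object_id, current_value):
        # S403: look up the historical value; 0 for a new object is an assumption.
        historical = self.history.get(object_id, 0)
        # S404: the data value to be processed is the sum of both values.
        pending = historical + current_value
        # Assumption: the sum becomes the historical value for the next update.
        self.history[object_id] = pending
        return object_id, pending

# Hypothetical warehouse example: 50 units stored, then 30 more.
node = ComputeNodeState()
first = node.determine_pending("warehouse_1", 50)
second = node.determine_pending("warehouse_1", 30)
```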
The sorting modes include ascending sorting and descending sorting. The comparison between the data value to be processed and the current threshold differs slightly between the two modes and is explained in detail below.
Then, returning to fig. 3, the flow proceeds to step S302: when the sorting mode is descending order of data values, send the data information to be processed to the sorting node only if the data value to be processed is larger than the current threshold.
When the sorting mode is descending (that is, when the maximum TOP (N) is calculated), the data value in the last data information of the existing sorting result is the minimum data value of the existing sorting result. In the present application, the current threshold is kept equal to the data value of the last data information.
Therefore, if the data value of an object is greater than the current threshold, the data information of the object breaks the existing sorting result, that is, the object is useful for the sorting result; in this case, the computing node sends the data information of the object to the sorting node.
If the data value of an object is not greater than the current threshold, the data information of the object does not break the existing sorting result, that is, the object is useless for the sorting result; in this case, the computing node does not send the data information of the object to the sorting node.
Step S303: when the sorting mode is ascending order of data values, send the data information to be processed to the sorting node only if the data value to be processed is smaller than the current threshold.
When the sorting mode is ascending (that is, when the minimum TOP (N) is calculated), the data value in the last data information of the existing sorting result is the maximum data value of the existing sorting result. In the present application, the current threshold is kept equal to the data value of the last data information.
Therefore, if the data value of an object is smaller than the current threshold, the data information of the object breaks the existing sorting result, that is, the object is useful for the sorting result; in this case, the computing node sends the data information of the object to the sorting node.
If the data value of an object is not smaller than the current threshold, the data information of the object does not break the existing sorting result, that is, the object is useless for the sorting result; in this case, the computing node does not send the data information of the object to the sorting node.
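The comparison in steps S302 and S303 can be sketched as a single predicate. The function name `should_send` and its parameters are hypothetical; the sketch assumes a current threshold has already been set on the computing node.

```python
def should_send(pending_value, current_threshold, descending):
    """Decide whether the computing node forwards data information.

    descending=True  corresponds to S302 (maximum TOP (N)):
        send only if the value is strictly greater than the threshold.
    descending=False corresponds to S303 (minimum TOP (N)):
        send only if the value is strictly smaller than the threshold.
    """
    if descending:
        return pending_value > current_threshold
    return pending_value < current_threshold
```

A value equal to the threshold is filtered out in both modes, which matches the "not greater than" and "not smaller than" cases described above.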
It can be understood that, after the sorting node receives the to-be-processed data information, it may update the existing sorting result according to the to-be-processed data information to generate the current sorting result. In order to keep the current threshold consistent in real time with the data value of the last data information in the sorting result, the computing node may receive the latest threshold sent by the sorting node, where the latest threshold is the data value in the last data information of the current sorting result, and then update the current threshold with the latest threshold.
From the above technical content, it can be seen that this embodiment has the following beneficial effects:
In the object sorting system provided by the present application, a current threshold is set on the computing node, and the current threshold is the data value in the last data information of the existing sorting result. It can be understood that, when the sorting mode is descending (or ascending), that is, when the maximum TOP (N) (or the minimum TOP (N)) is calculated, the data value in the last data information of the existing sorting result is the minimum value (or maximum value) in the existing sorting result.
Therefore, when the data value to be processed is greater than (or smaller than) the current threshold, the data information to be processed may break the existing sorting result; in this case, the computing node sends the to-be-processed data information of the object to be processed to the sorting node, so that the sorting node updates the existing sorting result. When the data value to be processed is not greater than (or not smaller than) the current threshold, the data information to be processed does not change the existing sorting result; in this case, the computing node does not send the to-be-processed data information to the sorting node, and the sorting node does not need to process it.
That is, in the present application, the computing node filters out most of the objects to be processed that are useless for the sorting result by comparing their to-be-processed data information with the current threshold, and sends only the objects that are useful for the sorting result to the sorting node, which then sorts them. The sorting node therefore sorts only a small number of objects, and because the number of objects to be sorted drops sharply, sorting efficiency is improved and memory usage is reduced.
The object sorting system shown in fig. 2 has only one sorting node, and generally one sorting node sorts objects of one category; for example, only seller order quantities, or only atmospheric pollution indices. In order to sort objects of multiple categories, the present application provides a second embodiment of an object sorting system. Referring to fig. 5, the system specifically includes:
a distribution node group 100, a computing node group 200, and a sorting node group 300. The sorting node group 300 includes a plurality of sorting nodes, and each sorting node sorts the objects of one category.
Since there are multiple sorting nodes, the computing node needs to know to which sorting node an object to be processed should be sent once the object has been determined. For this purpose, a correspondence between the category identifiers of objects and the sorting node identifiers can be constructed on each computing node. For example, sorting node identifier 1 corresponds to category identifier 1 of the object to be processed.
A second embodiment of the object sorting method, based on the object sorting system shown in fig. 5, is described below. As shown in fig. 6, the method specifically includes the following steps:
Step S601: determine to-be-processed data information of an object to be processed and a to-be-processed category identifier for representing the category to which the object to be processed belongs; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed.
Since the data value to be processed of the object to be processed keeps changing, the current to-be-processed data information of the object to be processed is determined before the subsequent steps are executed. The to-be-processed data information comprises the to-be-processed category identifier of the object to be processed, the to-be-processed object identifier and the to-be-processed data value. The to-be-processed category identifier and the to-be-processed object identifier do not change; what changes continuously is the to-be-processed data value.
The data value to be processed changes on the basis of the historical data value.
The application provides an embodiment of determining to-be-processed data information for a change process of the to-be-processed data value. Referring to fig. 7, the method specifically includes the following steps:
Step S701: receive current data information of the object to be processed sent by the distribution node group, wherein the current data information comprises a category identifier to be processed, an object identifier to be processed and a current data value.
One computing node can process objects of multiple categories, and each category can contain multiple objects. Therefore, the computing node receives the current data information of the object to be processed and determines from it the category identifier to be processed, the object identifier to be processed and the current data value, so that the object to be processed can be distinguished among the plurality of objects by its category identifier and object identifier.
Step S702: determine the category identifier to be processed and the object identifier to be processed of the object to be processed in the current data information.
Step S703: search, in the historical data value set corresponding to the category identifier to be processed, for the historical data value to be processed corresponding to the object identifier to be processed.
The computing node maintains historical data value sets for objects of a plurality of categories, so the historical data value set corresponding to the category identifier to be processed is first found among these sets. This set holds the historical data values of a plurality of objects, and the historical data values correspond one-to-one to the object identifiers. Therefore, the historical data value to be processed corresponding to the object identifier to be processed can be looked up in this set.
Step S704: determine the sum of the historical data value to be processed and the current data value as the data value to be processed of the object to be processed.
The sum of the historical data value to be processed and the current data value is calculated to obtain the data value to be processed of the object to be processed.
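The two-level lookup of steps S701 to S704 can be sketched as follows. The class name `CategoryHistory`, the default of 0 for an object without a stored historical value, and writing the sum back as the new historical value are assumptions made for illustration only.

```python
from collections import defaultdict

class CategoryHistory:
    """Hypothetical store: one historical data value set per category."""

    def __init__(self):
        # category identifier -> {object identifier -> historical data value}
        self.sets = defaultdict(dict)

    def determine_pending(self, category_id, object_id, current_value):
        # S703: first select the set of the category, then look up the object.
        history_set = self.sets[category_id]
        historical = history_set.get(object_id, 0)  # 0 is an assumption
        # S704: sum of historical value and current value.
        pending = historical + current_value
        history_set[object_id] = pending  # assumption: keep the sum as history
        return category_id, object_id, pending
```

Because each category has its own set, the same object identifier can appear in two categories without the values interfering with each other.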
After determining the to-be-processed data of the to-be-processed object, in order to determine whether the to-be-processed object is an object useful for the sorting result, the to-be-processed data value needs to be compared with the current threshold corresponding to the to-be-processed category identifier. The current threshold value is a data value in the last data information in the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
The comparison between the data value to be processed and the current threshold differs slightly between the sorting modes and is explained below.
Subsequently, returning to fig. 6, the flow proceeds to step S602: when the sorting mode is descending order of data values, send the data information to be processed to the sorting node corresponding to the category identifier to be processed only if the data value to be processed is larger than the current threshold corresponding to the category identifier to be processed.
Step S603: when the sorting mode is ascending order of data values, send the data information to be processed to the sorting node corresponding to the category identifier to be processed only if the data value to be processed is smaller than the current threshold corresponding to the category identifier to be processed.
The current threshold value is a data value in last data information in an existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
The specific contents of step S602 and step S603 have already been described in step S302 and step S303 in detail and are not described here again. Different from step S302 and step S303, the current threshold in step S602 and step S603 is a threshold corresponding to the to-be-processed category identifier found in the correspondence between the category identifier and the threshold by the computing node, and the sorting node is a sorting node corresponding to the to-be-processed category identifier found in the correspondence between the category identifier and the sorting node identifier by the computing node.
It can be understood that, after the sorting node corresponding to the to-be-processed category identifier updates the existing sorting result according to the to-be-processed data information to generate the current sorting result, the computing node may receive the category identifier and the latest threshold value sent by the sorting node corresponding to the to-be-processed category identifier. The computing node may update a current threshold corresponding to the category identification with the most recent threshold.
For the purpose of processing multiple class objects, there are current thresholds on the compute nodes corresponding to multiple class identifications. To accurately update the current threshold corresponding to the sorting node, the sorting node may send the category identification and the latest threshold. Thus, after receiving the category identifier and the latest threshold value, the computing node knows that the current threshold value corresponding to the category identifier needs to be updated.
The category identifier sent by the sorting node is the category identifier of the objects that the sorting node processes. That is, if the sorting node processes objects of category identifier A, the sorting node sends category identifier A together with the latest threshold to the computing node, so that the computing node updates the current threshold corresponding to category identifier A.
From the above, it can be seen that this embodiment has the following beneficial effects: the computing node filters out most of the objects to be processed that are useless for the sorting result by comparing their to-be-processed data information with the current threshold, and sends only the objects that are useful for the sorting result to the sorting node, which then sorts them. The sorting node therefore sorts only a small number of objects, and because the number of objects to be sorted drops sharply, sorting efficiency is improved and memory usage is reduced.
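The per-category filtering, routing and threshold update described above can be sketched as follows. The class `CategoryFilter`, its method names and the sample identifiers are hypothetical; the sketch assumes that a threshold and a sorting node identifier are already known for every category.

```python
class CategoryFilter:
    """Hypothetical computing-node filter for multiple categories."""

    def __init__(self, descending, thresholds, sort_nodes):
        self.descending = descending
        self.thresholds = dict(thresholds)  # category id -> current threshold
        self.sort_nodes = dict(sort_nodes)  # category id -> sorting node id

    def route(self, category_id, object_id, pending_value):
        """Return (sorting node id, data information), or None if filtered out."""
        threshold = self.thresholds[category_id]
        if self.descending:
            useful = pending_value > threshold   # S602
        else:
            useful = pending_value < threshold   # S603
        if not useful:
            return None
        return self.sort_nodes[category_id], (object_id, pending_value)

    def on_latest_threshold(self, category_id, latest_threshold):
        # The category identifier tells the node which threshold to replace.
        self.thresholds[category_id] = latest_threshold

flt = CategoryFilter(True, {"A": 100, "B": 10}, {"A": "sort-1", "B": "sort-2"})
sent = flt.route("B", "obj_x", 50)     # 50 > 10, forwarded to sort-2
dropped = flt.route("A", "obj_x", 50)  # 50 <= 100, filtered out
```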
In addition, the present embodiment can also achieve the purpose of data sorting for a plurality of class objects, so that the application range of the present embodiment can be widened.
The foregoing describes the processing of a computing node; the following describes the processing of a sorting node. The present application provides a third embodiment of an object sorting method, which is applied to a sorting node in the object sorting system shown in fig. 2 or fig. 5. As shown in fig. 8, the method specifically includes the following steps:
Step S801: receive to-be-processed data information of an object to be processed sent by the computing node group; the to-be-processed data information comprises the identifier of the object to be processed and the data value to be processed.
The computing node may send the to-be-processed data information to the sorting node by using the object sorting method shown in fig. 3 or fig. 6. The sorting node may be the sorting node 301 shown in fig. 2, or the sorting node corresponding to the to-be-processed category identifier in the sorting node group 300 shown in fig. 5.
The sorting node 301, or the sorting node corresponding to the to-be-processed category identifier, may receive the to-be-processed data information.
Step S802: update the existing sorting result according to the to-be-processed data information to generate a current sorting result.
The to-be-processed data information sent by the computing node to the sorting node is the data information of a useful object, that is, data information that affects the sorting result, so the sorting node can update the existing sorting result according to the to-be-processed data information and generate the current sorting result.
The specific implementation process of step S802 is described in detail below.
It can be understood that the data sorting queue of the existing sorting result contains a plurality of objects. The object to be processed may or may not already exist in the data sorting queue of the existing sorting result. Therefore, two processing modes of step S802 are provided for these two cases:
The first processing mode applies when the data sorting queue corresponding to the existing sorting result does not contain the identifier of the object to be processed. As shown in fig. 9, the first processing mode specifically includes the following steps:
Step S901: if the data sorting queue corresponding to the existing sorting result does not contain the identifier of the object to be processed, delete the last data information of the existing sorting result from the data sorting queue.
Step S902: add the data information to be processed to the data sorting queue.
Step S903: reorder the data sorting queue to generate the current sorting result.
Because the data sorting queue of the existing sorting result does not contain the identifier of the object to be processed, the data sorting queue does not contain the object to be processed. Since the data value to be processed of the object to be processed is already greater (or smaller) than the data value of the last data information, the last data information is no longer useful in the existing sorting result. Therefore, the last data information is deleted from the data sorting queue, and the data information to be processed is then added to the data sorting queue. The data sorting queue is then reordered to generate the current sorting result.
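Steps S901 to S903 can be sketched as follows. The function name `insert_new_object` and the list-of-pairs representation of the data sorting queue are assumptions; the sketch also assumes the queue already holds N entries, so that dropping the last one keeps its length constant.

```python
def insert_new_object(queue, pending, descending):
    """First processing mode: the object is not yet in the data sorting queue.

    queue:   list of (object_id, data_value), already sorted in the given mode
    pending: (object_id, data_value) of the object to be processed
    Returns the current sorting result as a new list.
    """
    updated = queue[:-1]      # S901: delete the last data information
    updated.append(pending)   # S902: add the data information to be processed
    # S903: reorder the queue according to the sorting mode
    updated.sort(key=lambda item: item[1], reverse=descending)
    return updated

existing = [("a", 90), ("b", 70), ("c", 50)]  # hypothetical maximum TOP (3)
current = insert_new_object(existing, ("d", 80), descending=True)
```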
The second processing mode applies when the data sorting queue corresponding to the existing sorting result contains the identifier of the object to be processed. As shown in fig. 10, the second processing mode specifically includes the following steps:
Step S1001: if the data sorting queue corresponding to the existing sorting result contains the identifier of the object to be processed, update, in the data sorting queue, the historical data value corresponding to the identifier of the object to be processed with the data value to be processed.
Step S1002: reorder the data sorting queue to generate the current sorting result.
Because the data sorting queue of the existing sorting result contains the identifier of the object to be processed, the data sorting queue contains the object to be processed. However, what is stored in the data sorting queue is the historical data value of the object to be processed, so the historical data value is updated with the data value to be processed. Since the data value of the object to be processed has changed, the existing sorting result may change, and the data sorting queue therefore needs to be sorted again to generate the current sorting result.
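Steps S1001 and S1002 can be sketched as follows. The function name `update_existing_object` and the list-of-pairs representation of the data sorting queue are assumptions made for illustration.

```python
def update_existing_object(queue, pending, descending):
    """Second processing mode: the object is already in the data sorting queue.

    queue:   list of (object_id, data_value), already sorted in the given mode
    pending: (object_id, data_value) carrying the new data value
    Returns the current sorting result as a new list.
    """
    object_id, pending_value = pending
    # S1001: replace the stored historical value of the matching object.
    updated = [
        (oid, pending_value if oid == object_id else value)
        for oid, value in queue
    ]
    # S1002: reorder the queue according to the sorting mode.
    updated.sort(key=lambda item: item[1], reverse=descending)
    return updated

existing_queue = [("a", 90), ("b", 70), ("c", 50)]  # hypothetical maximum TOP (3)
reordered = update_existing_object(existing_queue, ("c", 95), descending=True)
```

The queue length does not change in this mode, because no entry is removed or added.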
After the current sorting result is generated by the update, the current threshold in the computing node may be updated in order to keep subsequent object sorting accurate.
The first implementation of updating the current threshold is as follows: determine the data value in the last data information of the current sorting result as the latest threshold, and send the latest threshold to each computing node in the computing node group.
When there is only one sorting node in the object sorting system, each computing node in the computing node group holds only one current threshold, so the sorting node can directly send only the latest threshold, and the computing node simply updates its current threshold.
The second implementation is as follows: determine the data value in the last data information of the current sorting result as the latest threshold, and send the latest threshold and the category identifier of the sorting node to each computing node in the computing node group, wherein the category identifier is consistent with the to-be-processed category identifier of the object to be processed.
When there are a plurality of sorting nodes in the object sorting system, each computing node in the computing node group holds a plurality of current thresholds. So that the computing node can determine which threshold needs to be updated, the sorting node sends the category identifier together with the latest threshold, and the computing node updates the current threshold corresponding to that category identifier.
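The two implementations of the threshold update can be sketched as follows. The message layout (a dictionary with the keys `latest_threshold` and `category_id`) and the function names are hypothetical; the present application does not specify a message format or transport.

```python
def build_threshold_message(current_result, category_id=None):
    """Build the update sent by a sorting node after re-sorting.

    current_result: sorted list of (object_id, data_value)
    category_id:    None for a single sorting node (first implementation),
                    the category of the sorting node otherwise (second one)
    """
    # The latest threshold is the data value of the last data information.
    message = {"latest_threshold": current_result[-1][1]}
    if category_id is not None:
        message["category_id"] = category_id
    return message

def broadcast(message, compute_nodes):
    """Pair the message with every computing node in the group (no real I/O)."""
    return [(node, message) for node in compute_nodes]

result = [("a", 90), ("d", 80), ("b", 70)]
single = build_threshold_message(result)
multi = build_threshold_message(result, category_id="A")
deliveries = broadcast(multi, ["compute-1", "compute-2"])
```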
Corresponding to the first embodiment of the object sorting method shown in fig. 3, the present application further provides a first embodiment of an object sorting apparatus, which is integrated in a computing node in the computing node group of the object sorting system shown in fig. 2. As shown in fig. 11, the apparatus includes:
a first determination data information unit 111 for determining to-be-processed data information of an object to be processed; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
a first data information sending unit 112, configured to send the to-be-processed data information to the sorting node only when the to-be-processed data value is greater than the current threshold value in a case that the sorting manner is a descending sorting according to the data values;
a second data information sending unit 113, configured to send the to-be-processed data information to the sorting node only when the to-be-processed data value is smaller than the current threshold value in a case where the sorting manner is that the to-be-processed data information is sorted in an ascending order of data values;
the current threshold value is a data value in last data information in an existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
As shown in fig. 12, the first determination data information unit 111 includes:
a first data information receiving unit 121, configured to receive current data information of an object to be processed sent by the distribution node group, where the current data information includes an identifier of the object to be processed and a current data value;
a first determining unit 122, configured to determine the to-be-processed object identifier of the to-be-processed object in the current data information;
a first searching unit 123, configured to search for a to-be-processed historical data value corresponding to the to-be-processed object identifier;
a second determining unit 124, configured to determine a sum of the historical data value to be processed and the current data value as the data value to be processed.
As shown in fig. 13, a first embodiment of the object sorting apparatus provided in the present application further includes:
a first receiving threshold unit 131, configured to receive a latest threshold sent by the sorting node after the sorting node updates the existing sorting result according to the to-be-processed data information to generate a current sorting result; wherein, the latest threshold is a data value in the last data information in the current sorting result;
a first updating unit 132, configured to update the current threshold with the latest threshold.
Corresponding to the object sorting method shown in fig. 6, the present application further provides a second embodiment of an object sorting apparatus, which is integrated in a computing node in the computing node group of the object sorting system shown in fig. 5. As shown in fig. 14, the apparatus includes:
a second determined data information unit 141, configured to determine to-be-processed data information of an object to be processed, and a to-be-processed category identifier indicating a category to which the object to be processed belongs; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
a third data information sending unit 142, configured to, when the sorting manner is in a descending order of data values, send the to-be-processed data information to the sorting node corresponding to the to-be-processed category identifier only when the to-be-processed data value is greater than the current threshold corresponding to the to-be-processed category identifier;
a fourth data information sending unit 143, configured to, when the sorting manner is in an ascending order of data values, send the to-be-processed data information to the sorting node corresponding to the to-be-processed category identifier only when the to-be-processed data value is smaller than the current threshold corresponding to the to-be-processed category identifier;
the current threshold value is a data value in last data information in an existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
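The sending rule of units 142 and 143 can be illustrated with a minimal sketch. The names `should_send`, `route` and `thresholds`, and the example categories, are hypothetical and do not come from the application. Sending everything while no threshold is known for a category is an assumption, since this passage does not state what happens before a threshold exists.

```python
from typing import Dict, Optional, Tuple

DataInfo = Tuple[str, float]  # (to-be-processed object identifier, to-be-processed data value)


def should_send(value: float, threshold: Optional[float], descending: bool) -> bool:
    """Decide whether the data information may change the existing sorting result."""
    if threshold is None:
        # Assumption: with no threshold known yet, nothing can be filtered out.
        return True
    # Strict comparison, matching "only when greater than" / "only when smaller than".
    return value > threshold if descending else value < threshold


def route(info: DataInfo, category: str, thresholds: Dict[str, float],
          descending: bool = True) -> Optional[str]:
    """Return the category whose sorting node should receive `info`, or None if filtered."""
    _, value = info
    if should_send(value, thresholds.get(category), descending):
        return category
    return None


thresholds = {"books": 50.0, "toys": 10.0}  # current threshold per category identifier
print(route(("item-1", 60.0), "books", thresholds))                  # books
print(route(("item-2", 50.0), "books", thresholds))                  # None
print(route(("item-3", 5.0), "toys", thresholds, descending=False))  # toys
```

Keeping one threshold per category identifier lets a single computing node filter for every sorting node in the sorting node group independently.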
As shown in fig. 15, the second determined data information unit 141 includes:
a second data information receiving unit 151, configured to receive current data information of the to-be-processed object sent by the distribution node group, where the current data information includes a to-be-processed category identifier, a to-be-processed object identifier, and a current data value;
a third determining unit 152, configured to determine, in the current data information, the to-be-processed category identifier and the to-be-processed object identifier of the to-be-processed object;
a second searching unit 153, configured to search, in the historical data value set corresponding to the category identifier to be processed, a historical data value to be processed corresponding to the object identifier to be processed;
a fourth determining unit 154, configured to determine a sum of the historical data value to be processed and the current data value as a data value to be processed of the object to be processed.
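The lookup and summation performed by units 151 to 154 can be sketched as follows. The class name `ValueAccumulator` is hypothetical. Two points are assumptions not stated in this passage: a missing historical data value is treated as 0, and the sum is stored back as the new historical data value.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Tuple


class ValueAccumulator:
    """Keeps one historical data value set per category identifier."""

    def __init__(self) -> None:
        # category identifier -> {object identifier -> historical data value}
        self._history: DefaultDict[str, Dict[str, float]] = defaultdict(dict)

    def accumulate(self, category: str, object_id: str, current: float) -> Tuple[str, float]:
        """Return (object identifier, to-be-processed data value)."""
        # Assumption: an object seen for the first time has a historical value of 0.
        historical = self._history[category].get(object_id, 0.0)
        pending = historical + current
        # Assumption: the sum becomes the historical value for the next message.
        self._history[category][object_id] = pending
        return object_id, pending


acc = ValueAccumulator()
print(acc.accumulate("books", "item-1", 3.0))  # ('item-1', 3.0)
print(acc.accumulate("books", "item-1", 4.5))  # ('item-1', 7.5)
print(acc.accumulate("toys", "item-1", 1.0))   # ('item-1', 1.0)
```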
As shown in fig. 16, a second embodiment of the object sorting apparatus provided in the present application further includes:
a second receiving threshold unit 161, configured to receive the category identifier and the latest threshold sent by the sorting node corresponding to the to-be-processed category identifier after the sorting node corresponding to the to-be-processed category identifier updates the existing sorting result according to the to-be-processed data information to generate a current sorting result;
a second updating unit 162, configured to update the current threshold corresponding to the category identifier with the latest threshold;
wherein the latest threshold is a data value in the last data information in the current sorting result, and the category identifier is consistent with the to-be-processed category identifier of the to-be-processed object.
Corresponding to the object sorting method shown in fig. 8, the present application also provides an object sorting apparatus, which is applied to a sorting node in the object sorting system shown in fig. 2 or fig. 5. As shown in fig. 17, the apparatus includes:
a third data information receiving unit 171, configured to receive to-be-processed data information of to-be-processed objects sent by the computing node group; the data information to be processed comprises an object identifier to be processed of the object to be processed and a data value to be processed; each computing node in the computing node group is integrated with the device shown in fig. 11 or fig. 14.
a generating unit 172, configured to update the existing sorting result according to the to-be-processed data information and generate a current sorting result.
As shown in fig. 18, the generating unit 172 includes:
a deleting unit 181, configured to delete the last data information in the existing sorting result from the data sorting queue if the data sorting queue corresponding to the existing sorting result does not include the to-be-processed object identifier;
an adding unit 182, configured to add the to-be-processed data information to the data sorting queue;
a first result generating unit 183, configured to generate the current sorting result after the data sorting queue is reordered.
As shown in fig. 19, the generating unit 172 includes:
a third updating unit 191, configured to update, in the data sorting queue, a historical data value corresponding to the identifier of the object to be processed by using the data value to be processed if the data sorting queue corresponding to the existing sorting result includes the identifier of the object to be processed;
a second result generating unit 192, configured to generate the current sorting result after the data sorting queue is reordered.
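The two cases handled by the generating unit 172 (figs. 18 and 19) can be sketched in one function. The function name `update_ranking` and the `capacity` parameter are hypothetical. Deleting the last entry only when the queue already holds `capacity` entries is an assumption; this passage describes only the deletion itself.

```python
from typing import List, Tuple

DataInfo = Tuple[str, float]  # (object identifier, data value)


def update_ranking(queue: List[DataInfo], info: DataInfo, capacity: int,
                   descending: bool = True) -> List[DataInfo]:
    """Return the current sorting result produced from the existing one and `info`."""
    object_id, value = info
    ids = [oid for oid, _ in queue]
    updated = list(queue)  # leave the existing sorting result untouched
    if object_id in ids:
        # Fig. 19 case: the object is already ranked, so overwrite its data value.
        updated[ids.index(object_id)] = (object_id, value)
    else:
        # Fig. 18 case: drop the last data information, then add the new one.
        if len(updated) >= capacity:  # assumption: only evict from a full queue
            updated.pop()
        updated.append(info)
    # Reorder the queue according to the sorting manner.
    updated.sort(key=lambda item: item[1], reverse=descending)
    return updated


existing = [("a", 90.0), ("b", 70.0), ("c", 50.0)]
print(update_ranking(existing, ("d", 60.0), capacity=3))
# [('a', 90.0), ('b', 70.0), ('d', 60.0)]
print(update_ranking(existing, ("c", 95.0), capacity=3))
# [('c', 95.0), ('a', 90.0), ('b', 70.0)]
```

A full re-sort mirrors the wording of the description; for a large queue, a heap or a sorted insertion would be a cheaper alternative, which this sketch does not attempt.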
As shown in fig. 20, the apparatus further includes the following units, which operate after the current sorting result is generated:
a fifth determining unit 201, configured to determine a data value in the last data information of the current sorting result as a latest threshold;
a first sending threshold unit 202, configured to send the latest threshold to each computing node in the computing node group; or
a second sending threshold unit 203, configured to send the latest threshold and the category identifier of the sorting node to each computing node in the computing node group; wherein the category identifier is consistent with the to-be-processed category identifier of the to-be-processed object.
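The threshold feedback between the sorting node and the computing nodes can be sketched as below. `ThresholdMessage`, `ComputeNodeState` and `broadcast` are hypothetical names, and the in-process method call stands in for whatever network transport a real system would use. A category of `None` is an assumption used here to model the system with a single sorting node.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

DataInfo = Tuple[str, float]  # (object identifier, data value)


@dataclass(frozen=True)
class ThresholdMessage:
    latest_threshold: float
    category: Optional[str] = None  # None: single sorting node, no category identifier


def latest_threshold(ranking: List[DataInfo]) -> float:
    """The data value in the last data information of the current sorting result."""
    return ranking[-1][1]


class ComputeNodeState:
    """Holds the current threshold per category identifier on one computing node."""

    def __init__(self) -> None:
        self.thresholds: Dict[Optional[str], float] = {}

    def on_threshold(self, msg: ThresholdMessage) -> None:
        # Replace the current threshold with the latest threshold.
        self.thresholds[msg.category] = msg.latest_threshold


def broadcast(ranking: List[DataInfo], nodes: List[ComputeNodeState],
              category: Optional[str] = None) -> ThresholdMessage:
    """Send the latest threshold to every computing node in the group."""
    msg = ThresholdMessage(latest_threshold(ranking), category)
    for node in nodes:
        node.on_threshold(msg)
    return msg


nodes = [ComputeNodeState(), ComputeNodeState()]
broadcast([("a", 90.0), ("b", 70.0), ("d", 60.0)], nodes, category="books")
print(nodes[0].thresholds)  # {'books': 60.0}
```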
As shown in fig. 2, the present application provides an object ranking system, comprising:
a distribution node group 100, a computing node group 200, and a sorting node 301;
the computing node group 200 is configured to execute the object sorting method shown in fig. 3.
The sorting node 301 is configured to execute the object sorting method shown in fig. 8.
The specific implementation process of the present system has been described in detail in the embodiments of fig. 3 and fig. 8, and is not described herein again.
As shown in fig. 5, the present application provides an object ranking system comprising:
a distribution node group 100, a compute node group 200, and a sort node group 300;
the computing node group 200 is configured to execute the object sorting method shown in fig. 6.
The sorting nodes in the sorting node group 300 are used to execute the object sorting method shown in fig. 8.
The specific implementation process of the present system has been described in detail in the embodiments of fig. 6 and fig. 8, and is not described herein again.
From the above, it can be seen that the present application has the following beneficial effects:
In the object sorting system provided by the present application, a current threshold is set on each computing node, where the current threshold is the data value in the last data information of the existing sorting result. When the sorting manner is descending order (that is, when the largest top N is computed), the data value in the last data information of the existing sorting result is the minimum value in that result; when the sorting manner is ascending order (that is, when the smallest top N is computed), it is the maximum value.
Therefore, when the to-be-processed data value is greater than the current threshold in descending order (or smaller than it in ascending order), the to-be-processed data information may change the existing sorting result, so the computing node sends it to the sorting node, which updates the existing sorting result. When the to-be-processed data value is not greater than the current threshold in descending order (or not smaller than it in ascending order), the to-be-processed data information cannot change the existing sorting result, so the computing node does not send it, and the sorting node does not need to process it.
That is, by comparing the to-be-processed data value of each to-be-processed object with the current threshold, the computing node filters out most of the objects that cannot affect the sorting result and sends only the objects that can affect it to the sorting node for sorting. The sorting node therefore sorts only a small number of objects; because the number of objects to be sorted drops sharply, sorting efficiency improves and memory usage decreases.
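The filtering effect can be checked with a small single-process simulation of one computing node and one sorting node in descending order. The function `top_n_with_filter` is a hypothetical illustration. It makes three assumptions: each object appears once (no accumulation), `n` is at least 1, and no threshold exists until the queue holds `n` entries.

```python
from typing import List, Optional, Tuple

DataInfo = Tuple[str, float]  # (object identifier, data value)


def top_n_with_filter(stream: List[DataInfo], n: int) -> Tuple[List[DataInfo], int]:
    """Return (final sorting result, number of messages sent to the sorting node)."""
    ranking: List[DataInfo] = []
    threshold: Optional[float] = None
    sent = 0
    for object_id, value in stream:
        # Computing node: drop data that cannot change the existing sorting result.
        if threshold is not None and value <= threshold:
            continue
        sent += 1
        # Sorting node: evict the last entry of a full queue, add, and reorder.
        if len(ranking) >= n:
            ranking.pop()
        ranking.append((object_id, value))
        ranking.sort(key=lambda item: item[1], reverse=True)
        # Feedback: the last data value becomes the latest threshold.
        if len(ranking) >= n:
            threshold = ranking[-1][1]
    return ranking, sent


# Deterministic test data: 100 objects with distinct values.
stream = [(f"obj-{i}", float((i * 37) % 101)) for i in range(100)]
ranking, sent = top_n_with_filter(stream, n=5)
expected = sorted((v for _, v in stream), reverse=True)[:5]
print([v for _, v in ranking] == expected)  # True
print(sent < len(stream))                   # True
```

The result matches a full sort of all values, while the sorting node receives only a fraction of the stream.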
The functions described in the method of the present embodiment, if implemented in the form of software functional units and sold or used as independent products, may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments of the present application that contributes to the prior art, or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (20)

1. An object sorting method is characterized in that the object sorting method is applied to one computing node in a computing node group of an object sorting system, and the object sorting system also comprises a distribution node group and a sorting node which are connected with the computing node group; the method comprises the following steps:
determining to-be-processed data information of an object to be processed; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
when the sorting mode is that the data values are sorted in a descending order, the data information to be processed is sent to the sorting node only when the data value to be processed is larger than the current threshold value;
under the condition that the sorting mode is ascending sorting according to the data values, the data information to be processed is sent to the sorting node only under the condition that the data value to be processed is smaller than the current threshold value;
the current threshold value is a data value in the last data information of the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
2. The method of claim 1, wherein the determining the to-be-processed data information of the to-be-processed object comprises:
receiving current data information of an object to be processed sent by the distribution node group, wherein the current data information comprises an identifier of the object to be processed and a current data value;
determining the to-be-processed object identification of the to-be-processed object in the current data information;
searching a historical data value to be processed corresponding to the identification of the object to be processed;
and determining the sum of the historical data value to be processed and the current data value as the data value to be processed.
3. The method of claim 1, further comprising:
after the sequencing node updates the existing sequencing result according to the to-be-processed data information to generate a current sequencing result, receiving a latest threshold value sent by the sequencing node; wherein the latest threshold is a data value in the last data information of the current sorting result;
updating the current threshold with the latest threshold.
4. An object sorting method is characterized in that the object sorting method is applied to one computing node in a computing node group of an object sorting system, and the object sorting system also comprises a distribution node group and a sorting node group which are connected with the computing node group; the method comprises the following steps:
determining to-be-processed data information of an object to be processed and a to-be-processed category identifier for representing a category to which the object to be processed belongs; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
when the sorting mode is that the data values are sorted in a descending order, the data information to be processed is sent to the sorting node corresponding to the category identifier to be processed only when the data value to be processed is larger than the current threshold corresponding to the category identifier to be processed;
when the sorting mode is that the data values are sorted in an ascending order, the data information to be processed is sent to the sorting node corresponding to the category identifier to be processed only when the data value to be processed is smaller than the current threshold corresponding to the category identifier to be processed;
the current threshold value is a data value in the last data information of the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
5. The method of claim 4, wherein the determining the to-be-processed data information of the to-be-processed object and the to-be-processed category identifier for indicating the category to which the to-be-processed object belongs comprises:
receiving current data information of an object to be processed, which is sent by the distribution node group, wherein the data information comprises a category identifier to be processed, an object identifier to be processed and a current data value;
determining the class identifier to be processed and the object identifier to be processed of the object to be processed in the current data information;
searching a historical data value to be processed corresponding to the identification of the object to be processed in a historical data value set corresponding to the identification of the category to be processed;
and determining the sum of the historical data value to be processed and the current data value as the data value to be processed of the object to be processed.
6. The method of claim 4, further comprising:
after the sorting node corresponding to the category identifier to be processed updates the existing sorting result according to the data information to be processed to generate a current sorting result, receiving the category identifier and the latest threshold value sent by the sorting node corresponding to the category identifier to be processed;
updating a current threshold corresponding to the category identification by using the latest threshold;
and the latest threshold is a data value in the last data information in the current sorting result, and the category identification is consistent with the category identification to be processed of the object to be processed.
7. An object sorting method is applied to a sorting node in an object sorting system, and the sorting system also comprises a computing node group and a distribution node group connected with the computing node group; the method comprises the following steps:
receiving the data information to be processed of the object to be processed sent by the computing node group according to the method of claim 1 or claim 4; the data information to be processed comprises an object identifier to be processed of the object to be processed and a data value to be processed;
and updating the existing sorting result according to the to-be-processed data information to generate a current sorting result.
8. The method as claimed in claim 7, wherein said updating the existing ranking result according to the to-be-processed data information to generate the current ranking result comprises:
if the data sorting queue corresponding to the existing sorting result does not contain the identification of the object to be processed, deleting the last data information in the existing sorting result in the sorting data queue;
adding the to-be-processed data information in the sequencing data queue;
and generating the current sorting result after the sorting data queue is reordered.
9. The method as claimed in claim 7, wherein said updating the existing ranking result according to the to-be-processed data information to generate the current ranking result comprises:
if the data sorting queue corresponding to the existing sorting result contains the identification of the object to be processed, updating a historical data value corresponding to the identification of the object to be processed by using the data value to be processed in the data sorting queue;
and generating the current sorting result after the sorting data queue is reordered.
10. The method of claim 7, after said generating the current ranking result, further comprising:
determining a data value in the last data information of the current sorting result as a latest threshold value; sending the latest threshold value to each computing node in the computing node group; or,
determining a data value in the last data information of the current sorting result as a latest threshold value; sending the latest threshold value and the category identification of the sorting node to each computing node in the computing node group; wherein the class identifier is consistent with the to-be-processed class identifier of the to-be-processed object.
11. An object sorting apparatus, characterized in that the apparatus is integrated into one computing node in a computing node group of an object sorting system, and the object sorting system further comprises a distribution node group and a sorting node which are connected with the computing node group; the device comprises:
the first determining data information unit is used for determining the to-be-processed data information of the to-be-processed object; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
the first data information sending unit is used for, when the sorting mode is descending sorting according to the data values, sending the data information to be processed to the sorting node only when the data value to be processed is larger than the current threshold value;
the second data information sending unit is used for sending the data information to be processed to the sorting node only under the condition that the data value to be processed is smaller than the current threshold value under the condition that the sorting mode is in ascending order according to the data value;
the current threshold value is a data value in the last data information of the existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
12. The apparatus of claim 11, wherein the first determining data information unit comprises:
a first data information receiving unit, configured to receive current data information of an object to be processed sent by the distribution node group, where the current data information includes an identifier of the object to be processed and a current data value;
a first determining unit, configured to determine the to-be-processed object identifier of the to-be-processed object in the current data information;
the first searching unit is used for searching the historical data value to be processed corresponding to the identification of the object to be processed;
and the second determining unit is used for determining the sum of the historical data value to be processed and the current data value as the data value to be processed.
13. An object sorting device is characterized in that the object sorting device is applied to one computing node in a computing node group of an object sorting system, and the object sorting system further comprises a distribution node group and a sorting node group which are connected with the computing node group; the device comprises:
the second determination data information unit is used for determining to-be-processed data information of the to-be-processed object and a to-be-processed category identifier for representing the category to which the to-be-processed object belongs; the data information to be processed comprises an object identifier to be processed for uniquely representing the object to be processed and a data value to be processed of the object to be processed;
a third data information sending unit, configured to, when the sorting manner is a descending order of data values, send the to-be-processed data information to the sorting node corresponding to the to-be-processed category identifier only when the to-be-processed data value is greater than the current threshold corresponding to the to-be-processed category identifier;
a fourth data information sending unit, configured to, when the sorting manner is an ascending order of data values, send the to-be-processed data information to the sorting node corresponding to the to-be-processed category identifier only when the to-be-processed data value is smaller than the current threshold corresponding to the to-be-processed category identifier;
the current threshold value is a data value in last data information in an existing sorting result of the sorting node, and the existing sorting result is generated after the sorting node sorts a plurality of data information according to the sorting mode.
14. The apparatus of claim 13, wherein the second determination data information unit comprises:
a second data information receiving unit, configured to receive current data information of the to-be-processed object sent by the distribution node group, where the data information includes a to-be-processed category identifier, a to-be-processed object identifier, and a current data value;
a third determining unit, configured to determine, in the current data information, the to-be-processed category identifier and the to-be-processed object identifier of the to-be-processed object;
the second searching unit is used for searching the historical data value to be processed corresponding to the identification of the object to be processed in the historical data value set corresponding to the identification of the class to be processed;
and the fourth determining unit is used for determining the sum of the historical data value to be processed and the current data value as the data value to be processed of the object to be processed.
15. The apparatus of claim 13, further comprising:
the second receiving threshold unit is used for receiving the category identifier and the latest threshold sent by the sorting node corresponding to the to-be-processed category identifier after the sorting node corresponding to the to-be-processed category identifier updates the existing sorting result according to the to-be-processed data information to generate a current sorting result;
a second updating unit, configured to update the current threshold corresponding to the category identifier using the latest threshold;
and the latest threshold is a data value in the last data information in the current sorting result, and the category identification is consistent with the category identification to be processed of the object to be processed.
16. An object sorting device is applied to a sorting node in an object sorting system, and the sorting system also comprises a computing node group and a distribution node group connected with the computing node group; the device comprises:
a third data information receiving unit, configured to receive to-be-processed data information of an object to be processed, where the to-be-processed data information is sent by the computing node group; wherein, each computing node in the computing node group is integrated with the device as claimed in claim 11 or 13, and the to-be-processed data information includes the to-be-processed object identifier and the to-be-processed data value of the to-be-processed object;
and the generating unit is used for updating the existing sorting result according to the to-be-processed data information and generating the current sorting result.
17. The apparatus of claim 16, wherein the generating unit comprises:
a deleting unit, configured to delete the last data information in the existing sorting result from the sorting data queue if the data sorting queue corresponding to the existing sorting result does not include the identifier of the object to be processed;
the adding unit is used for adding the to-be-processed data information in the sequencing data queue;
and the first result generation unit is used for generating the current sorting result after the sorting data queue is reordered.
18. The apparatus of claim 17, wherein the generating unit comprises:
a third updating unit, configured to update, in the data sorting queue, a historical data value corresponding to the to-be-processed object identifier with the to-be-processed data value if the data sorting queue corresponding to the existing sorting result includes the to-be-processed object identifier;
and the second result generating unit is used for generating the current sorting result after the sorting data queue is reordered.
19. An object ranking system, comprising:
distributing node groups, computing node groups and sequencing nodes;
the computing node group is configured to perform the object ordering method of any of claims 1-3;
the sorting node is adapted to perform the object sorting method of any of claims 7-11.
20. An object ranking system, comprising:
distributing node groups, computing node groups and sequencing node groups;
the computing node group is configured to perform the object ordering method of any of claims 4-6;
the sorting nodes of the group of sorting nodes are adapted to perform the object sorting method of claim 7, 8, 9 or 11.
CN201511021334.8A 2015-12-30 2015-12-30 Object sorting method, device and system Active CN106933855B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511021334.8A CN106933855B (en) 2015-12-30 2015-12-30 Object sorting method, device and system

Publications (2)

Publication Number Publication Date
CN106933855A CN106933855A (en) 2017-07-07
CN106933855B true CN106933855B (en) 2020-06-23

Family

ID=59442372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511021334.8A Active CN106933855B (en) 2015-12-30 2015-12-30 Object sorting method, device and system

Country Status (1)

Country Link
CN (1) CN106933855B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20211109
Address after: No. 301, floor 3, building 9, zone 4, Wangjing Dongyuan, Chaoyang District, Beijing
Patentee after: ALIBABA (BEIJING) SOFTWARE SERVICE Co.,Ltd.
Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands
Patentee before: ALIBABA GROUP HOLDING Ltd.