CN110046949B - Method, device and medium for integrating numbers - Google Patents


Publication number
CN110046949B
Authority
CN
China
Prior art keywords
hash
read
target
value
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810044824.7A
Other languages
Chinese (zh)
Other versions
CN110046949A (en)
Inventor
蔡畅奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tenpay Payment Technology Co Ltd
Original Assignee
Tenpay Payment Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tenpay Payment Technology Co Ltd
Priority to CN201810044824.7A
Publication of CN110046949A
Application granted
Publication of CN110046949B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to computer technology, and more particularly to a method, an apparatus, and a medium for performing a hash operation. The method works iteratively: after each hash operation, the next round combines the newly read target number with the target numbers already read and the hash values already obtained, and this continues until the hash operation is complete. This effectively reduces the execution complexity of the hash operation flow and greatly shortens its running time, so that computational efficiency is improved and system load is reduced when massive numbers of order records must be processed.

Description

Method, device and medium for integrating numbers
Technical Field
The present invention relates to the field of computer technologies, and in particular to a method, an apparatus, and a medium for performing a hash operation.
Background
At present, electronic commerce has been integrated into social life, and business operation modes of various companies are gradually tending to be digitalized.
In practical applications, bulk goods rounding is a very common e-commerce mode of operation: the merchant combines scattered orders placed by different users so that they add up to specified values for shipping, which saves operating cost.
For example, assume there are 1 million users, each user may need 1 to 99 items, and each box holds a fixed whole number of items such as 100, 200, 300, or 1000. Unpacking and repacking should then be minimized, that is, the quantities required by the users are combined into whole boxes before the goods are delivered.
Due to the popularity of electronic commerce, users' order records are typically stored as a massive data set. Within this data set, the goods quantities of the orders are added together to find combinations that reach the specified value, and this is usually done with a traditional algorithm. However, traditional algorithms generally have very high time complexity; as the amount of data increases, their complexity grows exponentially, resulting in low computational efficiency and a heavy operating load on the system.
When a large number of orders exist in the system, calculating bulk goods rounding with the existing SQL-JOIN hard computing method takes a long time and makes result queries slow, which seriously degrades the service quality of the system and places a heavy operating load on it.
Disclosure of Invention
The embodiment of the invention provides a method, a device and a medium for rounding, which are used for reducing the execution complexity of a rounding operation flow, thereby improving the calculation efficiency of a system and reducing the system load.
The technical scheme provided by the embodiment of the invention is as follows:
A method of rounding, comprising:
determining the number of targets indicated by each record, and determining a specified target integer, wherein the target integer represents the target value that needs to be reached when performing a hash operation on the target numbers;
reading the obtained target numbers one by one in ascending order, wherein each time a target number is read, a hash operation is performed on the currently read target number, the target numbers already read, and the hash values already obtained, and the hash result is recorded, the obtained hash values being the results of hash operations performed on the target numbers already read;
after all the target numbers have been read, extracting from the obtained hash values each hash value that matches the target integer as a target hash value, and outputting the target numbers used to generate the target hash value as the hash result.
A hash apparatus comprising:
a determining unit, configured to determine the number of targets indicated by each record and to determine a specified target integer, wherein the target integer represents the target value that needs to be reached when performing a hash operation on the target numbers;
a processing unit, configured to read the obtained target numbers one by one in ascending order, wherein each time a target number is read, a hash operation is performed on the currently read target number, the target numbers already read, and the hash values already obtained, and the hash result is recorded, the obtained hash values being the results of hash operations performed on the target numbers already read;
and an output unit, configured to extract, after all the target numbers have been read, each hash value that matches the target integer from the obtained hash values as a target hash value, and to output the target numbers used to generate the target hash value as the hash result.
A storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the hash method described above.
An apparatus comprising a processor and a memory having stored therein at least one instruction that is loaded and executed by the processor to implement the hash method described above.
The invention has the following beneficial effects:
In the embodiment of the invention, an iterative approach is adopted: after each hash operation, the next round of hash operations combines the most recently read target number with the target numbers already read and the hash values already obtained, until the hash operation is complete. This effectively reduces the execution complexity of the hash operation flow and greatly shortens its running time, so that computational efficiency is improved and system load is reduced when a huge number of order records must be processed.
Drawings
FIG. 1A is a schematic diagram of a system topology according to an embodiment of the present invention;
FIG. 1B is a schematic diagram of a hash algorithm implementation flow in an embodiment of the present invention;
FIG. 2A is a schematic diagram of a Hadoop cluster implementation of a hash algorithm in an embodiment of the present invention;
FIG. 2B is a schematic diagram of the implementation of the chained hash algorithm in accordance with an embodiment of the present invention;
FIGS. 3A and 3B are schematic functional diagrams of a rounding device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the structure of the device in the embodiment of the invention.
Detailed Description
The technical scheme provided by the embodiment of the invention can be applied to Hadoop clusters. The Hadoop cluster is a software platform for developing and running large-scale data, is used for providing the capacity of storing distributed small files and quickly searching data, is a preferred system for constructing a quick query system, and can realize distributed computation of mass data in a cluster formed by a large number of computers.
The technical scheme provided by the embodiment of the invention can be applied to bulk goods sorting and trading scenes, such as commodity group purchase sorting and box trading, stock sorting and trading, and other scenes needing sorting analysis and inquiry.
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
FIG. 1A illustrates the distributed topology of a Hadoop cluster. The master node schedules the data nodes and allocates tasks among them. Each data node performs specific processing subtasks; a data node may perform one or more processing subtasks, as indicated by the master node.
Referring to fig. 1B, in the embodiment of the present invention, a specific flow of performing a hash operation based on massive data by a cluster server (i.e., a master node+a data node) in a Hadoop cluster is as follows:
Step 100: the cluster server determines the number of targets indicated for each record and determines the specified target integer.
At the same time, the cluster server may also determine the number of records, i.e. the total number of all records.
In the embodiment of the invention, the record may be an order record, and the meanings of the order record and the number of targets differ between application scenarios. For example, in a group-purchase whole-box transaction for goods, the order record is a goods order record, and the number of targets is the quantity of goods indicated in each goods order record.
As another example, in a stock odd-lot transaction, the order record is a stock order record, and the number of targets is the number of shares indicated in each stock order record.
Of course, these two scenarios are only examples. When the method is applied to other scenarios, it can be described uniformly in terms of records and numbers of targets, which is not detailed here.
In the embodiment of the invention, the number of targets indicated by each record is a pieced object used in the hash operation, and the target integer is the target value that must be reached after the hash operation is performed on the pieced objects.
In the embodiment of the present invention, for convenience of description, a bulk goods rounding transaction is taken as an example. Referring to Table 1, assume that there are 3 users in total (User 1, User 2, and User 3) who together generated 6 order records; the number of targets indicated by each order record is shown in Table 1:
TABLE 1
As shown in Table 1, there are 6 order records, and the goods must be shipped in boxes with a capacity of 100; that is, n=6 and m=100, where n is the number of records and m is the target integer.
Step 110: the cluster server sorts the obtained target numbers in ascending order by value.
For example, the target numbers of the 6 order records in Table 1, together with the target integer, are sorted in ascending order by value; the results are shown in Table 2.
TABLE 2
1 ... 15 ... 25 ... 40 ... 45 ... 55 ... 60 ... 100
F ... T ... T ... T ... T ... T ... T ... T
Referring to Table 2, after the data shown in Table 1 are initialized, m=100, n=6, and the target numbers from small to large are (15, 25, 40, 45, 55, 60). The target integer 100 may be appended at the end of the table as a threshold or omitted; either way, the execution of the flow is not affected.
In practical applications, distributed processing of massive data in a Hadoop cluster can be implemented with multiple Maps (each of which can be understood as a processing subtask). In the embodiment of the invention, however, a single Map can be configured with several gigabytes of memory, so in most cases the hash algorithm for a bulk goods rounding transaction can be processed within one Map.
Naturally, to further increase the operation rate, the hash algorithm may also be executed in a distributed manner across multiple Maps.
In this embodiment, processing within a single Map is taken as the example.
Step 120: the cluster server reads the target numbers one by one in ascending order. Each time a target number is read, a hash operation is performed on the currently read target number, the target numbers already read, and the hash values already obtained, and the hash result is recorded; the obtained hash values are the results of hash operations performed on the target numbers already read.
Specifically, performing a hash operation on the currently read target number, the target numbers already read, and the hash values already obtained may include:
1) Combining the currently read target number with each target number already read and with each hash value already obtained, to obtain a plurality of combination results;
2) Performing a hash operation on the hash value and/or target numbers contained in each combination result.
That is, the values of the pieced objects (hash values and/or target numbers) contained in each combination result are added directly to obtain their sum, and it is determined whether the sum reaches the target integer. A combination result may contain the currently read target number and one target number already read, or the currently read target number and one hash value already obtained.
Optionally, the target numbers already read and the hash values already obtained are sorted in ascending order by value, and the currently read target number is combined with each of them in turn, in that ascending order, to obtain the corresponding combination results.
For example, still taking table 1 as an example, in performing step 120, the following operations may be performed:
a) Reading the number of targets in the first order record: 15;
Since only one target number has been read, no hash operation can be performed; 15 is initialized as T (True), and the subsequent steps continue.
b) Reading the number of targets in the second order record: 25;
At this point, the pieced objects are 15 and 25. The hash operation gives 15+25=40, so the new hash value is 40, and the data currently marked T are: 15, 25, and 40.
The underline marks the latest calculated value, and the subsequent process has the same marking function, and is not described herein.
c) Reading the number of targets in the third order record: 40;
At this point, the pieced objects are 15, 25, and 40 (×2), where ×2 indicates that the value can be used twice: one 40 is the hash value obtained in step b), and the other 40 is the target number read in step c).
The hash operation gives 15+40=55, 25+40=65, and 40+40=80, so the new hash values are 55, 65, and 80, and the data currently marked T are: 15, 25, 40 (×2), 55, 65, and 80.
d) Reading the number of targets in the fourth order record: 45;
The hash operation gives 15+45=60, 25+45=70, 40+45=85, 55+45=100, 65+45=110, and 80+45=125, so the new hash values are 60, 70, 85, 100, 110, and 125, and the data currently marked T are: 15, 25, 40 (×2), 45, 55, 60, 65, 70, 80, 85 (×2), and 100.
Since the two hash values 110 and 125 are greater than the target integer 100, they may optionally be deleted directly after calculation so that the hash result is clearer. Alternatively, they may be retained as log data; log data are not used in the subsequent hash operations.
e) Reading the number of targets in the fifth order record: 55;
The hash operation gives 15+55=70, 25+55=80, 40+55=95, and 45+55=100.
In practice, a hash value greater than the target integer has no practical meaning. Therefore, in the embodiment of the invention, the target numbers already read and the hash values already obtained may optionally be sorted in ascending order by value, and the currently read target number may be combined with them one at a time in that order, with a hash operation performed on each combination result to obtain a hash value. When the most recently obtained hash value is determined to be greater than the target integer, the hash operation on the currently read target number is stopped.
In the above example, once 45+55=100 is calculated, the target integer 100 has been reached; continuing with 55+55=110 would necessarily produce a hash value greater than the target integer 100. The currently read target number 55 therefore need not be combined with the remaining values 55, 60, 65, 70, 80, 85 (×2), and 100; the current round of hash operations stops, and the next target number, 60, is read.
The new hash values obtained are therefore 70, 80, 95 (×2), and 100, and the data currently marked T are: 15, 25, 40 (×2), 45, 55, 60, 65, 70 (×2), 80 (×2), 85 (×2), 95 (×2), and 100 (×2).
f) Reading the number of targets in the sixth order record: 60;
The hash operation gives 15+60=75, 25+60=85, and 40+60=100.
The new hash values obtained are 75, 85, and 100, and the data currently marked T are: 15, 25, 40 (×2), 45, 55, 60, 65, 70 (×2), 75, 80 (×2), 85 (×3), 95 (×2), and 100 (×4).
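The walkthrough in steps a) to f) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `hash_sums` and the use of a `Counter` to record how many combinations produce each value are choices made here, and the "hash operation" is taken to mean adding target numbers toward the target integer, as in the example. The six quantities are those used in the walkthrough.

```python
from collections import Counter


def hash_sums(quantities, target):
    """Count, for every reachable value <= target, how many combinations
    of the given quantities add up to it (hypothetical helper)."""
    ways = Counter()
    for q in sorted(quantities):          # read target numbers in ascending order
        new = Counter()
        if q <= target:
            new[q] += 1                   # the quantity on its own
        for s in sorted(ways):            # existing values, ascending
            if s + q > target:
                break                     # every later value overshoots as well
            new[s + q] += ways[s]         # extend each existing combination by q
        ways.update(new)                  # merge this round's results
    return ways


ways = hash_sums([45, 25, 15, 55, 60, 40], 100)
print(ways[100])  # number of combinations that reach the target integer
```

Breaking out of the inner loop as soon as a sum exceeds the target corresponds to the optional early stop described in step e); it relies on the existing values being visited in ascending order.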
Step 130: after the cluster server finishes reading all the target numbers, extracting a hash value matched with the target integer from the obtained hash values to serve as a target hash value, and outputting each target hash value and the corresponding target number used for generating the target hash value as a hash result.
After a target hash value is obtained, it must be traced back to determine the target numbers used in calculating it. Taking one of the four 100s as an example, the traceback works as follows: the hash algorithm records the recursive process in a tree structure, so when a value of 100 that satisfies the condition is encountered, the corresponding target numbers can be found along its path in the tree. In this way, 4 combinations satisfying the condition are found:
45+55=100;
60+40=100;
15+40+45=100;
15+25+60=100
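The tree-structured record and the traceback can be sketched as below. The `Node` class, its field names, and the two helper functions are hypothetical names introduced for illustration; the source only states that a tree structure is recorded and that the target numbers are found along the path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    total: int                 # value reached at this node
    quantity: int              # target number added to reach it
    parent: Optional["Node"]   # node this one was extended from (None for a root)


def build_nodes(quantities, target):
    """Create one node per combination whose sum does not exceed target."""
    nodes = []
    for q in sorted(quantities):
        fresh = [Node(q, q, None)] if q <= target else []
        for n in nodes:
            if n.total + q <= target:
                fresh.append(Node(n.total + q, q, n))
        nodes.extend(fresh)    # extend only after the scan, so q is used once per path
    return nodes


def backtrack(node):
    """Walk parent links to recover the target numbers behind a value."""
    parts = []
    while node is not None:
        parts.append(node.quantity)
        node = node.parent
    return parts[::-1]


nodes = build_nodes([45, 25, 15, 55, 60, 40], 100)
combos = sorted(backtrack(n) for n in nodes if n.total == 100)
print(combos)
```

Each node stores only its own quantity and a parent link, so a combination costs one small record regardless of how many target numbers it contains.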
Based on the tree-structured record, in the embodiment of the present invention, traceback may optionally also be applied to every hash value obtained during the hash operation, to determine the target numbers used to obtain that hash value; the mapping between each hash value and the target numbers used to obtain it is then recorded.
For example, the mapping relationship is shown in table 3:
TABLE 3
As shown in Table 3, in the embodiment of the present invention a prefixed Rowkey is used to construct the lookup table: each target number read and each hash value calculated is recorded in the Rowkey column, and the target numbers used to calculate each hash value are stored in a separate column. Given any Rowkey value, it can therefore be determined quickly whether a corresponding combination of target numbers exists.
The mapping relationships shown in Table 3 can be stored in HBase, so that all possible combinations computed by the hash operation for the specified target integer and the maximum number of records are available for lookup.
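A lookup table of this kind might be built as sketched below. A plain Python `dict` stands in for the HBase table, and keying it directly by the integer value is an assumption: the Rowkey layout of Table 3 is not reproduced in this text, so `build_lookup` and its key format are purely illustrative.

```python
def build_lookup(quantities, target):
    """Map every reachable value <= target to the combinations producing it.

    The dict key plays the role of the row key; the list of tuples plays
    the role of the column holding the target numbers used.
    """
    table = {}
    for q in sorted(quantities):
        fresh = {}
        if q <= target:
            fresh.setdefault(q, []).append((q,))
        for total, combos in table.items():
            if total + q <= target:
                fresh.setdefault(total + q, []).extend(c + (q,) for c in combos)
        for total, combos in fresh.items():   # merge after the scan is complete
            table.setdefault(total, []).extend(combos)
    return table


lookup = build_lookup([45, 25, 15, 55, 60, 40], 100)
print(sorted(lookup[100]))   # combinations stored under key 100
print(30 in lookup)          # membership test answers "is 30 reachable?"
```

Because every reachable value is a key, answering whether a given value can be pieced together is a single key lookup rather than a new computation.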
The above embodiment describes an implementation of the hash operation using a single Map as an example. In addition, the Hadoop MapReduce framework can be used in the Hadoop cluster to implement the hash operation in a chained manner. As noted, a Map is a processing subtask.
Specifically, referring to fig. 2A in conjunction with fig. 1A, a serial Map List may be set in the master node, and an execution sequence of each Map, the number of maps, and specific execution content corresponding to each Map may be set, and then, the master node allocates each Map to one or more data nodes to perform specific operations, that is, each data node is responsible for specific execution of the corresponding Map based on scheduling of the master node. Wherein, each data node can interact with each other, and the interaction process is controlled by the main control node.
For example, as shown in FIG. 2A, assume that Map1 is allocated to data node 1, Map2 is allocated to data node 2, and Map3 to MapN are all allocated to data node 3. Data node 2 then reads the output of Map1 on data node 1, as notified by the master node, and executes the subsequent operation; data node 3 reads the output of Map2 on data node 2, as notified by the master node, executes the subsequent operation, and completes the operations of Map3 to MapN internally, with the outputs of those Maps also passed along internally.
Of course, if the master node only designs one Map, the master node may also directly allocate the Map to one data node to complete a specific operation, which is not described herein again.
Therefore, in the embodiment of the present invention, the cluster server (master node + data nodes) may set a corresponding number of Maps according to the number of records (which may be the number of order records), set the execution order of the Maps, and read one target number through each Map that has been set. Any one Map (hereinafter referred to as Map x) performs the following operations:
if Map x is the first Map, Map x sends the currently read target number to the next Map according to the execution order;
if Map x is neither the first Map nor the last Map, Map x performs a hash operation on the currently read target number together with the read target numbers and obtained hash values received from the previous Map, and sends the newly obtained hash values, the currently read target number, and the read target numbers and obtained hash values received from the previous Map to the next Map according to the execution order;
if Map x is the last Map, Map x performs a hash operation on the currently read target number together with the read target numbers and obtained hash values received from the previous Map, and outputs, as the hash result, the target numbers used by the newly obtained hash values and the target numbers used by the obtained hash values received from the previous Map.
For example, referring to FIG. 2B and again taking Table 1 as an example, 6 Maps are set based on the 6 order records. Map1 reads 15 and sends it to Map2. Map2 reads 25, combines 15 and 25 to obtain 40, and sends 15, 25, and 40 to Map3. Map3 reads 40, combines it with 15, 25, and 40 to obtain 55, 65, and 80, and sends 15, 25, 40, 55, 65, and 80 to Map4, and so on. The input of each Map is the output of the previous Map plus the data it reads itself, and the hash algorithm inside each Map is the same as the derivation described in step 120, which is not repeated here.
Here, Map refers to a Map in the MapReduce framework and can be regarded as a distributed processing subtask.
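The chained hand-off between Maps can be imitated in a single process as follows. This sketch does not use Hadoop at all: `map_stage` is a hypothetical stand-in for one Map, and `functools.reduce` plays the role of the master node passing each stage's output to the next. Both copies of 40 are kept in the state, as in step c) of the walkthrough.

```python
from functools import reduce


def map_stage(state, quantity, target=100):
    """One chained stage: combine the newly read quantity with everything
    received from the previous stage, then pass the enlarged state on."""
    read, sums = state                        # target numbers read, values obtained
    pool = read + sums
    new_sums = [v + quantity for v in pool if v + quantity <= target]
    return read + [quantity], sums + new_sums


quantities = sorted([45, 25, 15, 55, 60, 40])

# State after the first three stages (what the fourth stage would receive).
read3, sums3 = reduce(map_stage, quantities[:3], ([], []))
print(sorted(read3 + sums3))

# State after all six stages.
read6, sums6 = reduce(map_stage, quantities, ([], []))
print(sums6.count(100))   # how many combinations reached the target integer
```

Each stage only needs the previous stage's output and its own quantity, which is what allows the stages to be placed on different data nodes.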
In this way, the computational complexity can be greatly reduced compared with the traditional algorithm.
For example, taking the conventional algorithm which is more common in the prior art as the SQL-JOIN hard computing method as an example, the specific comparison process is as follows:
taking the case shown in table 1 as an example, if a box with a loading capacity of 100 is shipped, the following cases can be obtained by means of SQL-JOIN hard calculation:
TABLE 4
As shown in table 4:
When the number of order combinations is 1 (i.e., checking whether any single order reaches 100), the database query returns nothing, since every order quantity is less than 100;
When the number of order combinations is 2 (i.e., checking whether any 2 orders can be pieced together to reach 100), the database query finds that the 45 items of order 100001 of User1 and the 55 items of order 100004 of User2 can be pieced together to reach 100, and that the 60 items of order 100005 of User3 and the 40 items of order 100006 can be pieced together to reach 100;
When the number of order combinations is 3 (i.e., checking whether any 3 orders can be pieced together to reach 100), the database query finds that the 25 items of order 100002 of User1, the 15 items of order 100003 of User2, and the 60 items of order 100005 of User3 can be pieced together to reach 100, and that the 45 items of order 100001 of User1, the 15 items of order 100003 of User2, and the 40 items of order 100006 of User3 can also be pieced together to reach 100;
When the number of order combinations is 4 (i.e., checking whether any 4 orders can be pieced together to reach 100), the database query returns nothing; that is, no 4 orders can be pieced together to reach 100.
……
Similarly, the situation that the number of order combinations is 6 can be calculated at the highest.
With the SQL-JOIN hard computing method, however, the algorithm complexity increases exponentially as the number of order combinations increases.
Taking the case where the number of order combinations is 6 as an example, in performing the SQL-JOIN hard calculation, it is necessary to copy table 1 (hereinafter referred to as t 1) to generate t2, t3, t4, t5, and t6.
Then, in the calculation process, one number needs to be selected from each table to be combined, which is specifically as follows:
select t1.item, t2.item, t3.item, ...
from t1 join t2 join t3 join t4 join t5 join t6
where t1.item + t2.item + t3.item + ... = m
Assume that there are n order records, that the number of order combinations is k, and that goods are shipped in boxes of m. Because one quantity is selected from each of the k joined tables, the calculation scale (i.e., the number of calculations) for an order combination number of k is $n^k$.
Table 1 contains 6 order records, so the number of order combinations may range from 1 to 6. With the SQL-JOIN hard computing method, the overall calculation scale is therefore:
$6^1 + 6^2 + 6^3 + 6^4 + 6^5 + 6^6 = 55986$.
If the number of order records further increases, the algorithm complexity further increases exponentially with the number of order records, as shown in table 5.
TABLE 5
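Order record number    SQL-JOIN hard computing (number of calculations)
10                     $1.11 \times 10^{10}$
20                     $1.10 \times 10^{26}$
30                     $2.13 \times 10^{44}$
40                     $1.24 \times 10^{64}$
50                     $9.06 \times 10^{84}$
60                     $4.97 \times 10^{106}$
70                     $1.46 \times 10^{129}$
80                     $1.79 \times 10^{152}$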
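The magnitudes in Table 5 can be checked with a few lines of Python. The sketch assumes, as the six-record example does, that the number of order combinations runs from 1 up to the number of order records n, so the overall scale is the sum of n^k for k = 1 to n; the function name `sql_join_scale` is introduced here only for illustration.

```python
def sql_join_scale(n: int) -> int:
    """Total number of calculations when joining 1, 2, ..., n copies
    of a table that holds n order records."""
    return sum(n ** k for k in range(1, n + 1))


for n in (6, 10, 20, 30, 40, 50, 60, 70, 80):
    print(n, f"{sql_join_scale(n):.2e}")
```

Python integers have arbitrary precision, so the sum is exact; only the printed form is rounded to three significant digits.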
By contrast with the existing SQL-JOIN hard computing method, the embodiment of the invention reads the target numbers one after another and performs the hash operation iteratively: after each hash operation, the obtained hash values and the target numbers already read are retained and combined with the next target number read, and so on until the hash result is output. The time complexity of the algorithm can thereby be reduced to O(m × n), which greatly shortens the operation time.
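One standard way to see an O(m × n) bound is the textbook subset-sum table below, which keeps a single True/False flag per value from 0 to m, much like the T/F row of Table 2. This is offered as an illustration of the bound only; it is an assumption that it mirrors the patented procedure, and unlike the walkthrough it records reachability without counting combinations.

```python
def can_reach(quantities, target):
    """Return a list where entry s is True if some combination of the
    quantities, each used at most once, adds up to exactly s."""
    reachable = [False] * (target + 1)
    reachable[0] = True                      # the empty combination
    for q in quantities:                     # n iterations
        # Scan downwards so each quantity is used at most once.
        for s in range(target, q - 1, -1):   # at most m iterations
            if reachable[s - q]:
                reachable[s] = True
    return reachable


flags = can_reach([15, 25, 40, 45, 55, 60], 100)
print(flags[100], flags[30])
```

The two nested loops run at most n × m times in total, independent of how many combinations exist.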
Referring to fig. 3A and 3B, in an embodiment of the present invention, a hash apparatus (which may be a cluster server) is provided, which at least includes a determining unit 30, a processing unit 31 and an output unit 32, where,
a determining unit 30, configured to determine the number of targets indicated by each record and to determine a specified target integer, where the target integer represents the target value that needs to be reached when performing a hash operation on the target numbers;
a processing unit 31, configured to read the obtained target numbers one by one in ascending order, where each time a target number is read, a hash operation is performed on the currently read target number, the target numbers already read, and the hash values already obtained, and the hash result is recorded, the obtained hash values being the results of hash operations performed on the target numbers already read;
and an output unit 32, configured to extract, after all the target numbers have been read, each hash value that matches the target integer from the obtained hash values as a target hash value, and to output the target numbers used to generate the target hash value as the hash result.
Optionally, the processing unit 31 is further configured to:
combining the number of the currently read target with each number of the targets already read and each hash value already obtained, respectively, to obtain corresponding combination results;
wherein, each time a combination result is obtained, a hash operation is performed on the hash values and/or the numbers of the targets contained in that combination result.
Optionally, the processing unit 31 is further configured to:
sorting the numbers of the targets already read and the hash values already obtained in ascending order of value;
and combining the number of the currently read target with each number of the targets already read and each hash value already obtained, in turn, following the ascending order, to obtain corresponding combination results.
Optionally, the processing unit 31 is further configured to:
In the process of performing the hash operation on the hash values and/or the numbers of the targets contained in a combination result, stopping the hash operation for the number of the currently read target when it is determined that the most recently obtained hash value is greater than the target integer.
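The ascending sort and the early stop work together: once one combination exceeds the target integer, every later combination in the sorted order must exceed it as well. The sketch below shows this under the same assumption as before, namely that the "hash operation" is modelled as addition; `extend_with_number` is a hypothetical name, not taken from the source.

```python
def extend_with_number(known_values, current, target):
    """Combine `current` with earlier values in ascending order.

    Assumption: the "hash operation" is modelled as plain addition.
    """
    new_values = []
    for value in sorted(known_values):
        candidate = value + current
        if candidate > target:
            # Every remaining value is at least as large, so stop here.
            break
        new_values.append(candidate)
    return new_values


extended = extend_with_number([50, 20, 30, 80], 30, 100)
```

Without the sort, the loop would have to examine every earlier value before it could be sure that no further combination fits under the target.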
Optionally, the processing unit 31 is further configured to:
In the process of performing the hash operation, each time a hash value is obtained, backtracking is performed on the one hash value, the number of the targets used for obtaining the hash value is determined, and the mapping relation between the one hash value and the number of the targets used for obtaining the hash value is recorded.
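One way to keep such a mapping is to record, for each obtained value, the earlier value it was built from and the number that was added, and then to walk that chain backwards. The following is a sketch only, assuming addition as the "hash operation"; the names `record` and `backtrack` and the `parents` dictionary are hypothetical.

```python
def record(parents, new_value, previous_value, number):
    """Remember how new_value was first obtained (keep the first path only)."""
    parents.setdefault(new_value, (previous_value, number))


def backtrack(parents, value):
    """Return the numbers used to obtain `value`, in the order they were added."""
    used = []
    while value is not None:
        previous_value, number = parents[value]
        used.append(number)
        value = previous_value
    return used[::-1]


parents = {}
record(parents, 20, None, 20)   # 20 was read directly
record(parents, 50, 20, 30)     # 50 = 20 + 30
record(parents, 100, 50, 50)    # 100 = 50 + 50
used_numbers = backtrack(parents, 100)
```

Storing one predecessor per value keeps the mapping small, at the cost of remembering only one of possibly several combinations for each value.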
Optionally, the processing unit 31 is further configured to:
acquiring the number of records, setting a corresponding number of processing subtasks according to the number of records, setting the execution order of the processing subtasks, and reading the number of one target through each processing subtask, wherein each processing subtask performs the following operations:
if the processing subtask is the first processing subtask, sending the number of the currently read target to the next processing subtask according to the execution order;
if the processing subtask is neither the first nor the last processing subtask, performing a hash operation on the number of the currently read target together with the numbers of the read targets and the obtained hash values received from the previous processing subtask, and sending the newly calculated hash values, the number of the currently read target, and the numbers of the read targets and the obtained hash values received from the previous processing subtask to the next processing subtask according to the execution order;
if the processing subtask is the last processing subtask, performing a hash operation on the number of the currently read target together with the numbers of the read targets and the obtained hash values received from the previous processing subtask, and outputting, as the hash result, the numbers of the targets used by the newly calculated hash values and the numbers of the targets used by the obtained hash values received from the previous processing subtask.
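The hand-off between processing subtasks can be sketched as a chain in which each step receives the state from the previous step, adds its own number, and passes the enlarged state on. This is an illustration under assumptions: the "hash operation" is modelled as addition, the state is a plain dictionary, and the subtasks run in one process rather than on a cluster. The names `run_subtask` and `run_chain` are hypothetical.

```python
def run_subtask(current, received, target):
    """One subtask: combine its own number with the state it received.

    `received` maps each value obtained so far to the numbers used for it;
    it is empty for the first subtask, which only forwards its own number.
    """
    forwarded = dict(received)                  # keep everything received
    if current <= target:
        forwarded.setdefault(current, (current,))
    for value, used in received.items():
        candidate = value + current             # assumed "hash operation"
        if candidate <= target:
            forwarded.setdefault(candidate, used + (current,))
    return forwarded                            # sent to the next subtask


def run_chain(numbers, target):
    """Run one subtask per record, in ascending order of number."""
    state = {}
    for current in sorted(numbers):
        state = run_subtask(current, state, target)
    return state.get(target)                    # output of the last subtask


chain_result = run_chain([40, 10, 60], 100)
```

Each subtask depends only on the state forwarded by its predecessor, which is what allows the steps to be placed on separate workers as long as the execution order is respected.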
Based on the same inventive concept, in an embodiment of the present invention, there is provided a storage medium having at least one instruction stored therein, the instruction being loaded and executed by a processor to perform the method steps in the above embodiment.
Based on the same inventive concept, referring to fig. 4, an apparatus is provided in an embodiment of the present invention, including one or more processors 40; and
One or more memories 41 having stored thereon instructions that, when executed by the one or more processors 40, cause the apparatus to perform the methods described in the various embodiments above.
The memory 41 may include a Read Only Memory (ROM) and a Random Access Memory (RAM), and provides instructions and data of a program to the processor 40.
In summary, in the embodiment of the present invention, the numbers of the targets indicated by the records are sorted in ascending order and then read one by one. Each time the number of one target is read, a hash operation is performed on the number of the currently read target together with the numbers of the targets already read and the hash values already obtained, and the hash result is recorded, where the obtained hash values are the hash results obtained by performing the hash operation on the numbers of the targets already read. After reading is completed, the target hash value matching the target integer and the numbers of the targets used in generating it are output as the hash result. In other words, the embodiment adopts an iterative approach: after each hash operation, the next round continues from the numbers already read and the hash values already obtained, combined with the most recently read number, until the hash is completed. This effectively reduces the execution complexity of the hash operation flow and greatly reduces the time it consumes, so that when massive order records are processed, the computing efficiency of the system is effectively improved and the system load is eased.
Specifically, the above technical solution may be applied in a distributed Hadoop MapReduce architecture, making full use of the resources of the distributed computing platform and optimizing the system execution complexity to O(m×n) inside the Map.
In addition, the technical solution can be ported to other computing platforms such as Spark, where distributed computing improves performance, and it can be conveniently integrated into a Hive data warehouse, which facilitates its use in enterprise big-data warehouses.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims and the equivalents thereof, the present invention is also intended to include such modifications and variations.

Claims (12)

1. A cluster server hash method, applied to a cluster server, comprising:
Determining the number of the target indicated by each record, and determining a specified target integer, wherein the target integer represents the target value that needs to be reached when performing a hash operation on the numbers of the targets, and wherein the cluster server comprises a main control node and a data node, sets a corresponding number of Maps according to the number of order records, sets the execution order of the Maps, and reads the number of one target through each Map so set;
Reading each of the obtained number of the targets according to an ascending order, wherein each reading of the number of the targets carries out a hash operation on the current read number of the target, the read number of the targets and the obtained hash values, and records a hash result, and the obtained hash values are hash results obtained by carrying out a hash operation on the read number of the targets; stopping the hash operation for the number of one target currently read when it is determined that the latest obtained hash value is greater than the target integer;
After the numbers of all targets have been read, extracting, from the obtained hash values, a hash value matching the target integer as the target hash value, and outputting, as the hash result, the numbers of the targets used in generating the target hash value.
2. The method of claim 1, wherein performing a hash operation on the number of the currently read target together with the numbers of the targets already read and the hash values already obtained comprises:
combining the number of the currently read target with each number of the targets already read and each hash value already obtained, respectively, to obtain corresponding combination results;
wherein, each time a combination result is obtained, a hash operation is performed on the hash values and/or the numbers of the targets contained in that combination result.
3. The method of claim 2, wherein combining the number of the currently read target with each number of the targets already read and each hash value already obtained, respectively, to obtain corresponding combination results comprises:
sorting the numbers of the targets already read and the hash values already obtained in ascending order of value;
and combining the number of the currently read target with each number of the targets already read and each hash value already obtained, in turn, following the ascending order, to obtain corresponding combination results.
4. The method as recited in claim 1, further comprising:
In the process of performing the hash operation, each time a hash value is obtained, backtracking is performed on the one hash value, the number of the targets used for obtaining the hash value is determined, and the mapping relation between the one hash value and the number of the targets used for obtaining the hash value is recorded.
5. The method of any of claims 1-4, wherein reading the numbers of the targets one by one, performing, each time the number of one target is read, a hash operation on the number of the currently read target together with the numbers of the targets already read and the hash values already obtained, and recording the hash result, further comprises:
acquiring the number of records, setting a corresponding number of processing subtasks according to the number of records, setting the execution order of the processing subtasks, and reading the number of one target through each processing subtask, wherein each processing subtask performs the following operations:
if the processing subtask is the first processing subtask, sending the number of the currently read target to the next processing subtask according to the execution order;
if the processing subtask is neither the first nor the last processing subtask, performing a hash operation on the number of the currently read target together with the numbers of the read targets and the obtained hash values received from the previous processing subtask, and sending the newly calculated hash values, the number of the currently read target, and the numbers of the read targets and the obtained hash values received from the previous processing subtask to the next processing subtask according to the execution order;
if the processing subtask is the last processing subtask, performing a hash operation on the number of the currently read target together with the numbers of the read targets and the obtained hash values received from the previous processing subtask, and outputting, as the hash result, the numbers of the targets used by the newly calculated hash values and the numbers of the targets used by the obtained hash values received from the previous processing subtask.
6. A cluster server hash apparatus, applied to a cluster server, comprising:
a determining unit, configured to determine the number of the target indicated by each record, and to determine a specified target integer, wherein the target integer represents the target value that needs to be reached when performing a hash operation on the numbers of the targets, and wherein the cluster server comprises a main control node and a data node, sets a corresponding number of Maps according to the number of order records, sets the execution order of the Maps, and reads the number of one target through each Map so set;
a processing unit, configured to read the obtained numbers of the targets one by one in ascending order, wherein each time the number of one target is read, a hash operation is performed on the number of the currently read target together with the numbers of the targets already read and the hash values already obtained, and the hash result is recorded, the obtained hash values being the hash results obtained by performing the hash operation on the numbers of the targets already read; and to stop the hash operation for the number of the currently read target when it is determined that the most recently obtained hash value is greater than the target integer;
And the output unit is used for extracting the hash value matched with the target integer from the obtained hash values after the reading of all the target numbers is finished as a target hash value, and outputting the number of the targets used for generating the target hash value as a hash result.
7. The apparatus of claim 6, wherein the processing unit is to:
combining the number of the currently read target with each number of the targets already read and each hash value already obtained, respectively, to obtain corresponding combination results;
wherein, each time a combination result is obtained, a hash operation is performed on the hash values and/or the numbers of the targets contained in that combination result.
8. The apparatus of claim 7, wherein the processing unit is to:
sorting the numbers of the targets already read and the hash values already obtained in ascending order of value;
and combining the number of the currently read target with each number of the targets already read and each hash value already obtained, in turn, following the ascending order, to obtain corresponding combination results.
9. The apparatus of claim 6, wherein the processing unit is to:
In the process of performing the hash operation, each time a hash value is obtained, backtracking is performed on the one hash value, the number of the targets used for obtaining the hash value is determined, and the mapping relation between the one hash value and the number of the targets used for obtaining the hash value is recorded.
10. The apparatus according to any of claims 6-9, wherein the processing unit is configured to:
acquiring the number of records, setting a corresponding number of processing subtasks according to the number of records, setting the execution order of the processing subtasks, and reading the number of one target through each processing subtask, wherein each processing subtask performs the following operations:
if the processing subtask is the first processing subtask, sending the number of the currently read target to the next processing subtask according to the execution order;
if the processing subtask is neither the first nor the last processing subtask, performing a hash operation on the number of the currently read target together with the numbers of the read targets and the obtained hash values received from the previous processing subtask, and sending the newly calculated hash values, the number of the currently read target, and the numbers of the read targets and the obtained hash values received from the previous processing subtask to the next processing subtask according to the execution order;
if the processing subtask is the last processing subtask, performing a hash operation on the number of the currently read target together with the numbers of the read targets and the obtained hash values received from the previous processing subtask, and outputting, as the hash result, the numbers of the targets used by the newly calculated hash values and the numbers of the targets used by the obtained hash values received from the previous processing subtask.
11. A storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the hash method of any of claims 1 to 5.
12. An apparatus comprising a processor and a memory having at least one instruction stored therein, the instruction being loaded and executed by the processor to implement the hash method of any of claims 1 to 5.
CN201810044824.7A 2018-01-17 2018-01-17 Method, device and medium for integrating numbers Active CN110046949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810044824.7A CN110046949B (en) 2018-01-17 2018-01-17 Method, device and medium for integrating numbers

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810044824.7A CN110046949B (en) 2018-01-17 2018-01-17 Method, device and medium for integrating numbers

Publications (2)

Publication Number Publication Date
CN110046949A CN110046949A (en) 2019-07-23
CN110046949B true CN110046949B (en) 2024-05-31

Family

ID=67273496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810044824.7A Active CN110046949B (en) 2018-01-17 2018-01-17 Method, device and medium for integrating numbers

Country Status (1)

Country Link
CN (1) CN110046949B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105184412A (en) * 2015-09-21 2015-12-23 北京农业信息技术研究中心 Logistics delivery route planning method and system based on geographic positions

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3596603B2 (en) * 2000-10-13 2004-12-02 日本電気株式会社 Scheduling system and scheduling method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
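A dynamic programming algorithm for the Knapsack Problem with Setup; Khalil Chebil et al.; Computers and Operations Research; vol. 64; pp. 40-50 *
背包问题的一种新算法:降维递归算法 [A new algorithm for the knapsack problem: a dimension-reduction recursive algorithm]; 钟海林; CNKI Outstanding Master's Theses Full-text Database; 2009 (No. 07); full text *
花栅 (ed.). 计算机科学数学 [Mathematics for Computer Science]. 哈尔滨船舶工程学院出版社 [Harbin Shipbuilding Engineering Institute Press], 1994, pp. 210-212. *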

Also Published As

Publication number Publication date
CN110046949A (en) 2019-07-23

Similar Documents

Publication Publication Date Title
US20210049209A1 (en) Distributed graph embedding method and apparatus, device, and system
US20170097853A1 (en) Realizing graph processing based on the mapreduce architecture
Vats et al. Performance evaluation of K-means clustering on Hadoop infrastructure
US20150032759A1 (en) System and method for analyzing result of clustering massive data
KR102134952B1 (en) Data processing method and system
CN107015853A (en) The implementation method and device of phased mission system
JP2015026372A (en) Computer-implemented method, storage medium and computer system for parallel tree based prediction
CN106557307B (en) Service data processing method and system
US20210011928A1 (en) Smart elastic scaling based on application scenarios
US20160125009A1 (en) Parallelized execution of window operator
CN103064955A (en) Inquiry planning method and device
US9384238B2 (en) Block partitioning for efficient record processing in parallel computing environment
CN106708875B (en) Feature screening method and system
CN109829678B (en) Rollback processing method and device and electronic equipment
CN109788013B (en) Method, device and equipment for distributing operation resources in distributed system
US11599540B2 (en) Query execution apparatus, method, and system for processing data, query containing a composite primitive
US20160125032A1 (en) Partition-aware distributed execution of window operator
CN110046949B (en) Method, device and medium for integrating numbers
CN110019357B (en) Database query script generation method and device
CN107203633B (en) Data table pushing processing method and device and electronic equipment
CN114581220A (en) Data processing method and device and distributed computing system
JP2015069518A (en) Parallelization apparatus for processing, parallelization method for processing, and program
US10402230B2 (en) System allocating links for data packets in an electronic system
CN105447183A (en) MPP framework database cluster sequence system and sequence management method
CN106708606B (en) Data processing method and device based on MapReduce

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant