CN109901931A - Method, apparatus, and system for determining the number of reduction functions

Info

Publication number
CN109901931A
Authority
CN
China
Prior art keywords
key
value pair
function
time
mapping function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910171361.5A
Other languages
Chinese (zh)
Other versions
CN109901931B (en)
Inventor
梁建煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201910171361.5A
Publication of CN109901931A
Application granted
Publication of CN109901931B
Legal status: Active

Abstract

This application provides a method, apparatus, and system for determining the number of reduction functions, based on a MapReduce server. The MapReduce server includes a mapping function, which computes input file blocks and outputs key-value pairs, and a reduction function, which performs reduction on the key-value pairs. The method first obtains the number of key-value pairs output by the mapping function within a first preset time period. It then determines the target time for the reduction function to process one key-value pair and, based on the target time, determines the number of key-value pairs the reduction function processes within a second preset time period. Finally, it determines the target number of reduction functions from the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes. Because the target number of reduction functions is derived from the number of key-value pairs output by the mapping function, resources are allocated reasonably and task processing efficiency improves, avoiding the wasted resources or stalling caused by too many or too few reduction functions.

Description

Method, apparatus, and system for determining the number of reduction functions
Technical field
This application relates to the technical field of data processing, and in particular to a method, apparatus, and system for determining the number of reduction functions.
Background
MapReduce is a programming model for parallel computation over large-scale data sets (larger than 1 TB). It consists of two parts, mapping (Map) and reduction (Reduce). Specifically, an input file is divided into multiple split blocks, and each split block is computed by one mapping function (Mapper). The Mapper outputs a new set of key-value pairs, such as <key, value> pairs, which are then sent to the reduction function (Reducer) for reduction.
Currently, the number of Reducers must be determined before key-value pairs are sent to them. The Mapper sorts the key-value pairs by the Reducer to which each belongs and, following that order, stores them contiguously on disk, for example on a SATA disk. This lets a Reducer read its key-value pairs from the Mapper in whole blocks, which improves read performance.
However, the inventor found that in this approach the number of Reducers is set by the user, and setting it too high or too low leads to wasted resources or stalled tasks. How to provide a method, apparatus, and system for determining the number of reduction functions, so as to improve resource utilization and task processing efficiency, is therefore a major technical problem that those skilled in the art urgently need to solve.
Summary of the invention
In view of this, embodiments of the present application provide a method, apparatus, and system for determining the number of reduction functions, which can improve resource utilization and task processing efficiency.
To achieve the above object, the embodiments of the present application provide the following technical solutions:
A method for determining the number of reduction functions, applied to a MapReduce server, where the MapReduce server includes a mapping function and a reduction function, the mapping function is used to compute input file blocks and output at least one key-value pair, and the reduction function is used to perform reduction on the key-value pairs. The method includes:
obtaining the number of key-value pairs output by the mapping function within a first preset time period;
determining the target time for the reduction function to process one key-value pair;
determining, based on the target time, the number of key-value pairs the reduction function processes within a second preset time period;
determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes.
Optionally, obtaining the number of key-value pairs output by the mapping function within the first preset time period includes:
obtaining, within a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
determining the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
obtaining the number of file blocks input to the mapping function within the first preset time period;
determining the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function within the first preset time period.
Optionally, determining the target time for the reduction function to process one key-value pair includes:
obtaining the attribute identifier of the key-value pair;
when the attribute identifier is a first-type attribute identifier, obtaining the historical processing times of the key-value pair within a fourth preset time period, and determining the average of the historical processing times as the target time of one key-value pair;
when the attribute identifier is a second-type attribute identifier, splitting the key-value pair into multiple pieces of sub-data, obtaining the processing time each reduction function takes for each piece of sub-data, and determining the sum of the processing times as the target time for the reduction function to process one key-value pair.
Optionally, determining, based on the target time, the number of key-value pairs the reduction function processes within the second preset time period includes:
determining the quotient of the second preset time period and the target time as the number of key-value pairs the reduction function processes within the second preset time period.
Optionally, determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes includes:
determining the quotient of the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes as the target number of reduction functions.
An apparatus for determining the number of reduction functions, applied to a MapReduce server, where the MapReduce server includes a mapping function and a reduction function, the mapping function is used to compute input file blocks and output at least one key-value pair, and the reduction function is used to perform reduction on the key-value pairs. The apparatus includes:
an obtaining module, configured to obtain the number of key-value pairs output by the mapping function within a first preset time period;
a first determining module, configured to determine the target time for the reduction function to process one key-value pair;
a second determining module, configured to determine, based on the target time, the number of key-value pairs the reduction function processes within a second preset time period;
a third determining module, configured to determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes.
Optionally, the obtaining module includes:
a first obtaining unit, configured to obtain, within a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
a first determining unit, configured to determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
a second obtaining unit, configured to obtain the number of file blocks input to the mapping function within the first preset time period;
a second determining unit, configured to determine the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function within the first preset time period.
Optionally, the first determining module includes:
a third obtaining unit, configured to obtain the attribute identifier of the key-value pair;
a third determining unit, configured to, when the attribute identifier is a first-type attribute identifier, obtain the historical processing times of the key-value pair within a fourth preset time period and determine the average of the historical processing times as the target time of one key-value pair;
a fourth determining unit, configured to, when the attribute identifier is a second-type attribute identifier, split the key-value pair into multiple pieces of sub-data, obtain the processing time each reduction function takes for each piece of sub-data, and determine the sum of the processing times as the target time for the reduction function to process one key-value pair.
Optionally, the second determining module includes a fifth determining unit, and the third determining module includes a sixth determining unit, where
the fifth determining unit is configured to determine the quotient of the second preset time period and the target time as the number of key-value pairs the reduction function processes within the second preset time period;
the sixth determining unit is configured to determine the quotient of the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes as the target number of reduction functions.
A system for determining the number of reduction functions, including any one of the apparatuses described above.
Based on the above technical solutions, this application provides a method for determining the number of reduction functions, based on a MapReduce server. The MapReduce server includes a mapping function and a reduction function; the mapping function is used to compute input file blocks and output at least one key-value pair, and the reduction function is used to perform reduction on the key-value pairs. The method first obtains the number of key-value pairs output by the mapping function within a first preset time period. It then determines the target time for the reduction function to process one key-value pair and, based on the target time, determines the number of key-value pairs the reduction function processes within a second preset time period. Finally, it determines the target number of reduction functions from the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes. Because the target number of reduction functions is derived from the number of key-value pairs output by the mapping function, resources are allocated reasonably and task processing efficiency improves, avoiding the wasted resources or stalling caused by too many or too few reduction functions.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structural block diagram of the system for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 2 is a flow chart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 3 is another flow chart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 4 is another flow chart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 5 is another flow chart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 6 is another flow chart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 7 is a structural schematic diagram of the apparatus for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 8 is a hardware structural diagram of a server provided by an embodiment of the present application.
Detailed description of embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
Fig. 1 is a structural block diagram of the system for determining the number of reduction functions provided by an embodiment of the present application. The system shown in the figure can be used to implement the method for determining the number of reduction functions provided by the embodiments of the present application. Referring to Fig. 1, the system includes a MapReduce server, and the MapReduce server includes a mapping function 101 and a reduction function 102.
An input file is divided into multiple file blocks (splits), and each file block is computed by one mapping function (Mapper). The Mapper outputs a new set of key-value pairs, such as <key, value> pairs, which are then sent to the reduction function (Reducer) for reduction.
Based on the system shown in Fig. 1, the method for determining the number of reduction functions provided by the present application is introduced below from the perspective of the MapReduce server. Fig. 2 is a flow chart of the method provided by an embodiment of the present application. The method is applied to a MapReduce server and may include:
S21: obtain the number of key-value pairs output by the mapping function within a first preset time period.
Specifically, as shown in Fig. 3, this embodiment provides a specific implementation of obtaining the number of key-value pairs output by the mapping function within the first preset time period, including:
S31: obtain, within a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
S32: determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
S33: obtain the number of file blocks input to the mapping function within the first preset time period;
S34: determine the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function within the first preset time period.
The third preset time period may be shorter than the first preset time period. For example, if the third preset time period is 1 hour and the first preset time period is 12 hours, this embodiment first obtains the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function within 1 hour. It then takes the number of key-value pairs output per input file block, that is, the quotient of the number of output key-value pairs and the number of input file blocks, as the input-output ratio of the mapping function. Finally, the product of the number of file blocks input to the mapping function over the 12 hours and the input-output ratio is taken as the number of key-value pairs output by the mapping function over the 12 hours.
In addition, the third preset time period may be a sub-period within the first preset time period. For example, if the first preset time period is 1:00-12:00, the third preset time period may be 1:00-2:00. That is, the data from one sub-period of the first preset time period (the number of file blocks input to the mapping function and the number of key-value pairs output by it) is used to calculate the number of key-value pairs output by the mapping function over the whole first preset time period.
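Steps S31 to S34 can be illustrated with a minimal Python sketch. The function name `estimate_mapper_output` and its parameters are hypothetical and chosen only for this illustration. The sketch assumes the input-output ratio is the number of key-value pairs output per input file block, which is the reading under which the product in S34 yields a key-value pair count.

```python
def estimate_mapper_output(sample_blocks: int, sample_pairs: int, window_blocks: int) -> float:
    """Estimate key-value pairs output in the first preset period (hypothetical helper).

    sample_blocks: file blocks input during the third preset period (S31)
    sample_pairs:  key-value pairs output during the third preset period (S31)
    window_blocks: file blocks input during the first preset period (S33)
    """
    if sample_blocks <= 0:
        raise ValueError("sample_blocks must be positive")
    # S32: input-output ratio, assumed here to be pairs output per input block
    ratio = sample_pairs / sample_blocks
    # S34: scale the ratio by the block count of the first preset period
    return ratio * window_blocks


# Example: 10 blocks produced 5000 pairs in the 1-hour sample;
# 120 blocks are expected over the 12-hour window.
print(estimate_mapper_output(10, 5000, 120))
```

The estimate is only as good as the assumption that the sampled hour is representative of the whole window.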
S22: determine the target time for the reduction function to process one key-value pair.
Specifically, as shown in Fig. 4, this embodiment provides a specific way of determining the target time for the reduction function to process one key-value pair, including:
S41: obtain the attribute identifier of the key-value pair;
S42: when the attribute identifier is a first-type attribute identifier, obtain the historical processing times of the key-value pair within a fourth preset time period, and determine the average of the historical processing times as the target time of one key-value pair;
S43: when the attribute identifier is a second-type attribute identifier, split the key-value pair into multiple pieces of sub-data, obtain the processing time each reduction function takes for each piece of sub-data, and determine the sum of the processing times as the target time for the reduction function to process one key-value pair.
In this embodiment, the first-type attribute identifier denotes a key-value pair that is not being run for the first time, for example a key-value pair that is rerun periodically. Its historical processing times can therefore be obtained, and the average over multiple runs is taken as the target time of one key-value pair.
The second-type attribute identifier denotes a key-value pair that is run for the first time. In this case, to compute the target time, the key-value pair can be split into multiple pieces of sub-data that are processed by multiple reduction functions, and the sum of the running times of the pieces of sub-data is taken as the target time of the key-value pair.
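The two branches of S42 and S43 can be sketched as follows. This is an illustrative sketch only: the function `target_time`, the boolean flag `is_first_run` standing in for the attribute identifier, and the list parameters are all assumptions, since the source does not specify how the identifier or the timing data is represented.

```python
from statistics import mean
from typing import Optional, Sequence


def target_time(is_first_run: bool,
                history_times: Optional[Sequence[float]] = None,
                subdata_times: Optional[Sequence[float]] = None) -> float:
    """Return the per-pair target time in seconds (hypothetical helper)."""
    if not is_first_run:
        # First-type identifier (S42): average the historical processing times
        if not history_times:
            raise ValueError("history_times is required for a recurring key-value pair")
        return mean(history_times)
    # Second-type identifier (S43): sum the measured times of the sub-data pieces
    if not subdata_times:
        raise ValueError("subdata_times is required for a first-run key-value pair")
    return sum(subdata_times)


print(target_time(False, history_times=[2.0, 4.0]))   # recurring pair
print(target_time(True, subdata_times=[0.5, 0.25]))   # first-run pair
```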
S23: determine, based on the target time, the number of key-value pairs the reduction function processes within a second preset time period.
Specifically, as shown in Fig. 5, this embodiment provides a specific implementation of determining, based on the target time, the number of key-value pairs the reduction function processes within the second preset time period, including:
S51: determine the quotient of the second preset time period and the target time as the number of key-value pairs the reduction function processes within the second preset time period.
The second preset time period is a set value. It may be an expected computation time set by the user: for example, if the user wants the total computation time to stay within 1 minute, the second preset time period can be set to 1 minute. Alternatively, a preferred computation time can be estimated automatically from historical computation times and used as the second preset time period. For example, if the last 3 runs over an equal number of key-value pairs took t1, t2, and t3, the average of t1, t2, and t3 is taken as the second preset time period.
The number of key-value pairs a reduction function processes characterizes its processing capacity. In this step, a preferred computation time, namely the second preset time period, can be determined empirically; then, from the target time for the reduction function to process one key-value pair: number of key-value pairs processed by the reduction function = second preset time period / target time.
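As a rough sketch of step S51 and of the history-based choice of the second preset time period, the following Python fragment may help. Both function names are hypothetical, and times are assumed to be in seconds.

```python
from statistics import mean
from typing import Sequence


def second_period_from_history(durations_s: Sequence[float]) -> float:
    """Average past run durations (t1, t2, t3, ...) to pick the second preset period."""
    return mean(durations_s)


def pairs_per_reducer(period_s: float, target_time_s: float) -> float:
    """S51: key-value pairs one reduction function can process in the period."""
    if target_time_s <= 0:
        raise ValueError("target_time_s must be positive")
    return period_s / target_time_s


period = second_period_from_history([50.0, 60.0, 70.0])
print(pairs_per_reducer(period, 0.5))
```

A user-supplied period (for example 60 seconds for a 1-minute budget) can be passed to `pairs_per_reducer` directly instead of the historical average.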
S24: determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes.
Specifically, as shown in Fig. 6, this embodiment provides a specific implementation of determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes, including:
S61: determine the quotient of the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes as the target number of reduction functions.
Since step S23 determined the number of key-value pairs each reduction function processes, the target number of reduction functions required = number of key-value pairs output by the mapping function / number of key-value pairs processed by each reduction function.
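The quotient in S61 can be sketched in a few lines of Python. The source only states that the target number is the quotient; rounding up to a whole number and enforcing a minimum of one reduction function are assumptions added here, because a reducer count must be a positive integer. The function name is hypothetical.

```python
import math


def target_reducer_count(mapper_pairs: float, pairs_per_reducer: float) -> int:
    """S61: target number of reduction functions (hypothetical helper)."""
    if pairs_per_reducer <= 0:
        raise ValueError("pairs_per_reducer must be positive")
    # Assumption: round the quotient up and keep at least one reduction function
    return max(1, math.ceil(mapper_pairs / pairs_per_reducer))


print(target_reducer_count(60000, 120))
```

Rounding up keeps the estimated run within the second preset time period; rounding down would trade a longer run for one fewer reduction function.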
As can be seen, this solution determines the target number of reduction functions from the number of key-value pairs output by the mapping function, so that resources are allocated reasonably and task processing efficiency improves, avoiding the wasted resources or stalling caused by too many or too few reduction functions.
The MapReduce server provided by the embodiments of the present application is introduced below. The MapReduce server described below and the method described above from the MapReduce server side may be referred to in correspondence with each other. Fig. 7 is a structural block diagram of the MapReduce server provided by an embodiment of the present application. Referring to Fig. 7, the MapReduce server may include:
an obtaining module 71, configured to obtain the number of key-value pairs output by the mapping function within a first preset time period;
a first determining module 72, configured to determine the target time for the reduction function to process one key-value pair;
a second determining module 73, configured to determine, based on the target time, the number of key-value pairs the reduction function processes within a second preset time period;
a third determining module 74, configured to determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes.
On the basis of the above embodiment, the obtaining module provided in this embodiment includes:
a first obtaining unit, configured to obtain, within a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function, the third preset time period being contained in the first preset time period;
a first determining unit, configured to determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
a second obtaining unit, configured to obtain the number of file blocks input to the mapping function within the first preset time period;
a second determining unit, configured to determine the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function within the first preset time period.
In addition, in the apparatus provided in this embodiment, the first determining module includes:
a third obtaining unit, configured to obtain the attribute identifier of the key-value pair;
a third determining unit, configured to, when the attribute identifier is a first-type attribute identifier, obtain the historical processing times of the key-value pair within a fourth preset time period and determine the average of the historical processing times as the target time of one key-value pair;
a fourth determining unit, configured to, when the attribute identifier is a second-type attribute identifier, split the key-value pair into multiple pieces of sub-data, obtain the processing time each reduction function takes for each piece of sub-data, and determine the sum of the processing times as the target time for the reduction function to process one key-value pair.
On the basis of the above embodiment, the second determining module includes a fifth determining unit, and the third determining module includes a sixth determining unit, where
the fifth determining unit is configured to determine the quotient of the second preset time period and the target time as the number of key-value pairs the reduction function processes within the second preset time period;
the sixth determining unit is configured to determine the quotient of the number of key-value pairs output by the mapping function and the number of key-value pairs the reduction function processes as the target number of reduction functions.
For the working principle of the MapReduce server, refer to the method embodiment above.
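One way to picture how modules 71 to 74 chain together is the following Python sketch. The class `ReducerCountDevice`, its method names, and its parameters are hypothetical and are not taken from the source. Only the recurring-job branch of the first determining module is sketched, and rounding the final quotient up to an integer of at least one is an added assumption.

```python
import math
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class ReducerCountDevice:
    """Hypothetical container mirroring modules 71-74 of Fig. 7."""
    second_period_s: float  # second preset time period, in seconds

    def obtain(self, window_blocks: int, pairs_per_block: float) -> float:
        # Obtaining module 71: blocks in the first period times the input-output ratio
        return window_blocks * pairs_per_block

    def first_determine(self, history_times_s: Sequence[float]) -> float:
        # First determining module 72: average historical time (recurring branch only)
        return mean(history_times_s)

    def second_determine(self, target_time_s: float) -> float:
        # Second determining module 73: pairs one reduction function handles in the period
        return self.second_period_s / target_time_s

    def third_determine(self, mapper_pairs: float, capacity: float) -> int:
        # Third determining module 74: quotient, rounded up (assumption), at least 1
        return max(1, math.ceil(mapper_pairs / capacity))

    def run(self, window_blocks: int, pairs_per_block: float,
            history_times_s: Sequence[float]) -> int:
        mapper_pairs = self.obtain(window_blocks, pairs_per_block)
        target_time_s = self.first_determine(history_times_s)
        capacity = self.second_determine(target_time_s)
        return self.third_determine(mapper_pairs, capacity)


device = ReducerCountDevice(second_period_s=60.0)
print(device.run(window_blocks=120, pairs_per_block=500.0, history_times_s=[0.5, 0.5]))
```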
The above describes the software functional module architecture of the MapReduce server. In terms of hardware structure, the server can implement the resource allocation scheme as follows:
Fig. 8 is a hardware structural block diagram of the server provided by an embodiment of the present application. Referring to Fig. 8, the server may include: a processor 111, a communication interface 112, a memory 113, and a communication bus 114;
The processor 111, the communication interface 112, and the memory 113 communicate with one another through the communication bus 114;
Optionally, the communication interface 112 may be an interface of a communication module, such as an interface of a GSM module;
The processor 111 is used to execute a program;
The memory 113 is used to store the program;
The program may include program code, and the program code includes computer operation instructions.
The processor 111 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
The memory 113 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory.
Wherein, program can be specifically used for:
Obtain the quantity of the key-value pair of the mapping function output in the first preset period of time;
Determine that the reduction function handles the object time of a key-value pair;
Based on the object time, the key-value pair processing number of the reduction function in the second preset period of time is determined Amount;
The quantity of key-value pair based on mapping function output and the key-value pair of the reduction function handle quantity, really Make the destination number of the reduction function.
Optionally, the quantity for obtaining the key-value pair of the mapping function output in the first preset period of time, comprising:
Obtain third preset period of time in, input the blocks of files of the mapping function quantity and the mapping function The quantity of the key-value pair of output, the third preset period of time are contained in first preset period of time;
The quantity of the key-value pair of quantity and mapping function output based on the blocks of files for inputting the mapping function, Determine the input and output ratio of the mapping function;
Obtain the quantity of the blocks of files of the input mapping function in first preset period of time;
Determine the quantity of the blocks of files of the input mapping function and the mapping letter in first preset period of time The product of several input and output ratio is the quantity of the key-value pair of the mapping function output in the first preset period of time.
Optionally, the determination reduction function handles the object time of a key-value pair, comprising:
Obtain the attribute-bit of the key-value pair;
When the attribute-bit is first kind attribute-bit, the key-value pair is gone through in the 4th preset period of time of acquisition History handles the time, determines that the average value of the history processing time is the object time of a key-value pair;
When the attribute-bit is that the second generic attribute identifies, the key-value pair is split into multiple subdatas, and obtain Each reduction function handles the processing time of every subdata, determines that the sum of described processing time is at the reduction function Manage the object time of a key-value pair.
Optionally, determining, based on the target time, the number of key-value pairs the reduction function processes in the second preset time period comprises:
Determining the quotient of the second preset time period and the target time as the number of key-value pairs the reduction function processes in the second preset time period.
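The quotient above is a one-line computation; the sketch below mainly makes the units explicit. It assumes both arguments are expressed in the same time unit, and the function name is hypothetical.

```python
def pairs_per_reducer(second_period: float, target_time: float) -> float:
    """Number of key-value pairs one reduction function can process in the
    second preset time period. Both arguments must use the same time unit."""
    if target_time <= 0:
        raise ValueError("target time must be positive")
    return second_period / target_time


# Invented figures: a 600-second period and 0.5 seconds per key-value pair.
print(pairs_per_reducer(second_period=600, target_time=0.5))
```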
Optionally, determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function comprises:
Determining the quotient of the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function as the target number of reduction functions.
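A short Python sketch of this final step follows. The text specifies only the quotient. Rounding up to a whole number and keeping at least one reduction function are assumptions added here so that the result is usable as a count. The sketch also assumes that both inputs describe comparable time windows. All names are hypothetical.

```python
import math


def target_reducer_count(map_output_pairs: float, pairs_per_reducer: float) -> int:
    """Target number of reduction functions: map output divided by the
    number of pairs a single reduction function can process."""
    if pairs_per_reducer <= 0:
        raise ValueError("pairs per reducer must be positive")
    quotient = map_output_pairs / pairs_per_reducer
    # Assumption (not in the source): round up so no key-value pair is left
    # unassigned, and never return fewer than one reduction function.
    return max(1, math.ceil(quotient))


# Invented figures: 60000 pairs output, 1200 pairs handled per reducer.
print(target_reducer_count(60000, 1200))
```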
For the working principle of the server, refer to the method embodiments above; it is not repeated here. The server can determine the target number of reduction functions from the number of key-value pairs output by the mapping function, so that resources are allocated reasonably, task processing efficiency improves, and the resource waste or stalling caused by too many or too few reduction functions is avoided.
In summary, this application provides a method, apparatus and system for determining the number of reduction functions, based on a MapReduce server that includes a mapping function and a reduction function. The method first obtains the number of key-value pairs output by the mapping function in a first preset time period. It then determines the target time for the reduction function to process one key-value pair and, based on the target time, determines the number of key-value pairs the reduction function processes in a second preset time period. Finally, it determines the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function. The scheme thus determines the target number of reduction functions from the number of key-value pairs output by the mapping function, so that resources are allocated reasonably, task processing efficiency improves, and the resource waste or stalling caused by too many or too few reduction functions is avoided.
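When the optional sub-steps are used, the whole scheme collapses into one expression. The symbols below are introduced here for illustration only and do not appear in the application: B_3 and K_3 are the file blocks input and key-value pairs output in the third preset time period, B_1 is the file blocks input in the first preset time period, t is the target time, T_2 is the length of the second preset time period, and R is the target number of reduction functions.

```latex
\begin{aligned}
r &= \frac{K_3}{B_3}, & N_{\text{map}} &= B_1 \cdot r, \\
n &= \frac{T_2}{t},   & R &= \frac{N_{\text{map}}}{n} = \frac{B_1 \, K_3 \, t}{B_3 \, T_2}.
\end{aligned}
```

As a worked example with invented figures, B_3 = 20, K_3 = 5000, B_1 = 240, t = 0.5 s and T_2 = 600 s give r = 250, N_map = 60000, n = 1200 and R = 50.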
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above description of the disclosed embodiments enables those skilled in the art to implement or use this application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for determining the number of reduction functions, characterized in that the method is applied to a MapReduce server, the MapReduce server comprises a mapping function and a reduction function, the mapping function is used to perform calculation on input file blocks and output at least one key-value pair, and the reduction function is used to perform reduction calculation on the key-value pairs, the method comprising:
obtaining the number of key-value pairs output by the mapping function in a first preset time period;
determining a target time for the reduction function to process one key-value pair;
determining, based on the target time, the number of key-value pairs processed by the reduction function in a second preset time period;
determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function.
2. The method for determining the number of reduction functions according to claim 1, characterized in that obtaining the number of key-value pairs output by the mapping function in the first preset time period comprises:
obtaining, for a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
determining the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
obtaining the number of file blocks input to the mapping function in the first preset time period;
determining the product of the number of file blocks input to the mapping function in the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function in the first preset time period.
3. The method for determining the number of reduction functions according to claim 1, characterized in that determining the target time for the reduction function to process one key-value pair comprises:
obtaining the attribute identifier of the key-value pair;
when the attribute identifier is a first-class attribute identifier, obtaining the historical processing times of the key-value pair in a fourth preset time period, and determining the average of the historical processing times as the target time of one key-value pair;
when the attribute identifier is a second-class attribute identifier, splitting the key-value pair into multiple pieces of sub-data, obtaining the time each reduction function takes to process each piece of sub-data, and determining the sum of these processing times as the target time for the reduction function to process one key-value pair.
4. The method for determining the number of reduction functions according to any one of claims 1-3, characterized in that determining, based on the target time, the number of key-value pairs processed by the reduction function in the second preset time period comprises:
determining the quotient of the second preset time period and the target time as the number of key-value pairs processed by the reduction function in the second preset time period.
5. The method for determining the number of reduction functions according to any one of claims 1-3, characterized in that determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function comprises:
determining the quotient of the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function as the target number of reduction functions.
6. An apparatus for determining the number of reduction functions, characterized in that the apparatus is applied to a MapReduce server, the MapReduce server comprises a mapping function and a reduction function, the mapping function is used to perform calculation on input file blocks and output at least one key-value pair, and the reduction function is used to perform reduction calculation on the key-value pairs, the apparatus comprising:
an obtaining module, used to obtain the number of key-value pairs output by the mapping function in a first preset time period;
a first determining module, used to determine a target time for the reduction function to process one key-value pair;
a second determining module, used to determine, based on the target time, the number of key-value pairs processed by the reduction function in a second preset time period;
a third determining module, used to determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function.
7. The apparatus for determining the number of reduction functions according to claim 6, characterized in that the obtaining module comprises:
a first obtaining unit, used to obtain, for a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
a first determining unit, used to determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
a second obtaining unit, used to obtain the number of file blocks input to the mapping function in the first preset time period;
a second determining unit, used to determine the product of the number of file blocks input to the mapping function in the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function in the first preset time period.
8. The apparatus for determining the number of reduction functions according to claim 6, characterized in that the first determining module comprises:
a third obtaining unit, used to obtain the attribute identifier of the key-value pair;
a third determining unit, used to obtain, when the attribute identifier is a first-class attribute identifier, the historical processing times of the key-value pair in a fourth preset time period, and to determine the average of the historical processing times as the target time of one key-value pair;
a fourth determining unit, used to split the key-value pair into multiple pieces of sub-data when the attribute identifier is a second-class attribute identifier, to obtain the time each reduction function takes to process each piece of sub-data, and to determine the sum of these processing times as the target time for the reduction function to process one key-value pair.
9. The apparatus for determining the number of reduction functions according to any one of claims 6-8, characterized in that the second determining module comprises a fifth determining unit and the third determining module comprises a sixth determining unit,
the fifth determining unit is used to determine the quotient of the second preset time period and the target time as the number of key-value pairs processed by the reduction function in the second preset time period;
the sixth determining unit is used to determine the quotient of the number of key-value pairs output by the mapping function and the number of key-value pairs processed by the reduction function as the target number of reduction functions.
10. A system for determining the number of reduction functions, characterized by comprising the apparatus for determining the number of reduction functions according to any one of claims 6-9.
CN201910171361.5A 2019-03-07 2019-03-07 Reduction function quantity determination method, device and system Active CN109901931B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910171361.5A CN109901931B (en) 2019-03-07 2019-03-07 Reduction function quantity determination method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910171361.5A CN109901931B (en) 2019-03-07 2019-03-07 Reduction function quantity determination method, device and system

Publications (2)

Publication Number Publication Date
CN109901931A true CN109901931A (en) 2019-06-18
CN109901931B CN109901931B (en) 2021-06-15

Family

ID=66946617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910171361.5A Active CN109901931B (en) 2019-03-07 2019-03-07 Reduction function quantity determination method, device and system

Country Status (1)

Country Link
CN (1) CN109901931B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722071A (en) * 2021-09-10 2021-11-30 拉卡拉支付股份有限公司 Data processing method, data processing apparatus, electronic device, storage medium, and program product

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799486A (en) * 2012-06-18 2012-11-28 北京大学 Data sampling and partitioning method for MapReduce system
CN104298550A (en) * 2014-10-09 2015-01-21 南通大学 Hadoop-oriented dynamic scheduling method
US20150039667A1 (en) * 2013-08-02 2015-02-05 Linkedin Corporation Incremental processing on data intensive distributed applications
JP2015191428A (en) * 2014-03-28 2015-11-02 日本電信電話株式会社 Distributed data processing apparatus, distributed data processing method, and distributed data processing program
CN105577438A (en) * 2015-12-22 2016-05-11 桂林电子科技大学 MapReduce-based network traffic ontology construction method
CN107038072A (en) * 2016-02-03 2017-08-11 博雅网络游戏开发(深圳)有限公司 Method for scheduling task and device based on Hadoop system
CN108595268A (en) * 2018-04-24 2018-09-28 咪咕文化科技有限公司 A kind of data distributing method, device and computer readable storage medium based on MapReduce
CN109324898A (en) * 2018-08-27 2019-02-12 北京奇虎科技有限公司 A kind of method for processing business and system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
_1990: "Hadoop-MapReduce", 《HTTPS://WWW.CNBLOGS.COM/WXD0108/P/7156223.HTML》 *
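Wang Xiaojian et al.: "Improvement of rainbow table technique based on reducing the number of reduction functions", Computer Engineering *
Tao Yongcai et al.: "MapReduce performance optimization in heterogeneous resource environments", Journal of Chinese Computer Systems *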

Also Published As

Publication number Publication date
CN109901931B (en) 2021-06-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant