CN109901931A - Method, apparatus and system for determining the number of reduction functions - Google Patents
- Publication number
- CN109901931A (CN201910171361.5A)
- Authority
- CN
- China
- Prior art keywords
- key
- value pair
- function
- time
- mapping function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
This application provides a method, apparatus and system for determining the number of reduction functions, based on a MapReduce server. The MapReduce server includes a mapping function, which processes input file blocks and outputs key-value pairs, and a reduction function, which performs reduction on the key-value pairs. The method first obtains the number of key-value pairs output by the mapping function within a first preset time period. It then determines the target time the reduction function needs to process one key-value pair and, based on that target time, determines the number of key-value pairs the reduction function can process within a second preset time period. Finally, it determines the target number of reduction functions from the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function. Because the target number of reduction functions is derived from the number of key-value pairs output by the mapping function, resources are allocated reasonably and task processing efficiency improves, which avoids the resource waste or stalling caused by too many or too few reduction functions.
Description
Technical field
This application relates to the technical field of data processing, and in particular to a method, apparatus and system for determining the number of reduction functions.
Background art
MapReduce is a programming model for parallel computation over large-scale data sets (larger than 1 TB). It consists of two parts, Map and Reduce. Specifically, the input file is divided into multiple splits, and each split is processed by one mapping function (Mapper). The Mapper outputs a new set of key-value pairs, for example <key, value> pairs, and the key-value pairs are then sent to a reduction function (Reducer), which performs the reduction.
At present, the number of Reducers must be determined before the key-value pairs are sent to the Reducers. The Mapper can then sort the key-value pairs by the Reducer they belong to and store them in that order in contiguous disk space, for example on a SATA disk. When a Reducer reads the key-value pairs from the Mapper, it can read them as one whole block, which improves read performance.
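The flow described above can be illustrated with a minimal Python sketch. It is only a toy word-count example, not the implementation of any particular MapReduce framework: the function names `mapper`, `partition`, `sort_by_reducer` and `reducer` are hypothetical, and assigning keys to Reducers by CRC32 hash is an assumption made for illustration.

```python
import zlib
from collections import defaultdict


def mapper(split):
    """Toy Mapper: emit one <word, 1> pair for every word in a split."""
    return [(word, 1) for word in split.split()]


def partition(key, num_reducers):
    """Pick the Reducer for a key (assumed hash partitioning, stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def sort_by_reducer(pairs, num_reducers):
    """Order Mapper output by owning Reducer so each Reducer's pairs are contiguous."""
    return sorted(pairs, key=lambda kv: (partition(kv[0], num_reducers), kv[0]))


def reducer(pairs):
    """Toy Reducer: sum the values of each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)
```

Note that `sort_by_reducer` needs `num_reducers` before it can lay out the Mapper output, which is why the number of Reducers has to be fixed in advance.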
However, the inventors found that in the above approach the number of Reducers is set by the user, and setting it too high or too low leads to wasted resources or stalled tasks. How to provide a method, apparatus and system for determining the number of reduction functions, so as to improve resource utilization and task processing efficiency, is therefore a major technical problem that those skilled in the art urgently need to solve.
Summary of the invention
In view of this, the embodiments of the present application provide a method, apparatus and system for determining the number of reduction functions, which can improve resource utilization and task processing efficiency.
To achieve the above object, the embodiments of the present application provide the following technical solutions:
A method for determining the number of reduction functions, applied to a MapReduce server. The MapReduce server includes a mapping function and a reduction function; the mapping function processes input file blocks and outputs at least one key-value pair, and the reduction function performs reduction on the key-value pairs. The method includes:
Obtaining the number of key-value pairs output by the mapping function within a first preset time period;
Determining the target time for the reduction function to process one key-value pair;
Determining, based on the target time, the key-value pair processing quantity of the reduction function within a second preset time period;
Determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function.
Optionally, obtaining the number of key-value pairs output by the mapping function within the first preset time period includes:
Obtaining, for a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
Determining the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
Obtaining the number of file blocks input to the mapping function within the first preset time period;
Determining that the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function is the number of key-value pairs output by the mapping function within the first preset time period.
Optionally, determining the target time for the reduction function to process one key-value pair includes:
Obtaining the attribute identifier of the key-value pair;
When the attribute identifier is a first-type attribute identifier, obtaining the historical processing times of the key-value pair within a fourth preset time period, and determining that the average of the historical processing times is the target time of one key-value pair;
When the attribute identifier is a second-type attribute identifier, splitting the key-value pair into multiple pieces of sub-data, obtaining the processing time each reduction function takes to process each piece of sub-data, and determining that the sum of the processing times is the target time for the reduction function to process one key-value pair.
Optionally, determining, based on the target time, the key-value pair processing quantity of the reduction function within the second preset time period includes:
Determining that the quotient of the second preset time period and the target time is the key-value pair processing quantity of the reduction function within the second preset time period.
Optionally, determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function includes:
Determining that the quotient of the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function is the target number of reduction functions.
An apparatus for determining the number of reduction functions, applied to a MapReduce server. The MapReduce server includes a mapping function and a reduction function; the mapping function processes input file blocks and outputs at least one key-value pair, and the reduction function performs reduction on the key-value pairs. The apparatus includes:
An obtaining module, configured to obtain the number of key-value pairs output by the mapping function within a first preset time period;
A first determining module, configured to determine the target time for the reduction function to process one key-value pair;
A second determining module, configured to determine, based on the target time, the key-value pair processing quantity of the reduction function within a second preset time period;
A third determining module, configured to determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function.
Optionally, the obtaining module includes:
A first obtaining unit, configured to obtain, for a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
A first determining unit, configured to determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
A second obtaining unit, configured to obtain the number of file blocks input to the mapping function within the first preset time period;
A second determining unit, configured to determine that the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function is the number of key-value pairs output by the mapping function within the first preset time period.
Optionally, the first determining module includes:
A third obtaining unit, configured to obtain the attribute identifier of the key-value pair;
A third determining unit, configured to, when the attribute identifier is a first-type attribute identifier, obtain the historical processing times of the key-value pair within a fourth preset time period and determine that the average of the historical processing times is the target time of one key-value pair;
A fourth determining unit, configured to, when the attribute identifier is a second-type attribute identifier, split the key-value pair into multiple pieces of sub-data, obtain the processing time each reduction function takes to process each piece of sub-data, and determine that the sum of the processing times is the target time for the reduction function to process one key-value pair.
Optionally, the second determining module includes a fifth determining unit, and the third determining module includes a sixth determining unit.
The fifth determining unit is configured to determine that the quotient of the second preset time period and the target time is the key-value pair processing quantity of the reduction function within the second preset time period;
The sixth determining unit is configured to determine that the quotient of the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function is the target number of reduction functions.
A system for determining the number of reduction functions, including any one of the above apparatuses for determining the number of reduction functions.
Based on the above technical solutions, this application provides a method for determining the number of reduction functions, based on a MapReduce server. The MapReduce server includes a mapping function and a reduction function; the mapping function processes input file blocks and outputs at least one key-value pair, and the reduction function performs reduction on the key-value pairs. The method first obtains the number of key-value pairs output by the mapping function within a first preset time period. It then determines the target time for the reduction function to process one key-value pair and, based on the target time, determines the key-value pair processing quantity of the reduction function within a second preset time period. Finally, it determines the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function. Because the target number of reduction functions is derived from the number of key-value pairs output by the mapping function, resources are allocated reasonably and task processing efficiency improves, which avoids the resource waste or stalling caused by too many or too few reduction functions.
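The chain of calculations in this solution can be written compactly as follows. The symbols are not part of the original text and are introduced here only for illustration: n_b3 and n_p3 are the numbers of file blocks input and key-value pairs output in the third preset time period, n_b1 is the number of file blocks input in the first preset time period, r is the input-output ratio (read here as key-value pairs per file block), t is the target time per key-value pair, T_2 is the second preset time period, C is the key-value pair processing quantity of one reduction function, and R is the target number of reduction functions.

```latex
\[
r = \frac{n_{p3}}{n_{b3}}, \qquad
N_{p1} = n_{b1} \cdot r, \qquad
C = \frac{T_2}{t}, \qquad
R = \frac{N_{p1}}{C} = \frac{N_{p1}\, t}{T_2}
\]
```

The last form shows that, for a fixed workload, the target number grows linearly with the per-pair target time and shrinks as the allowed time T_2 grows.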
Brief description of the drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present application; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a structural block diagram of the system for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 2 is a flowchart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 3 is another flowchart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 4 is another flowchart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 5 is another flowchart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 6 is another flowchart of the method for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 7 is a structural schematic diagram of the apparatus for determining the number of reduction functions provided by an embodiment of the present application;
Fig. 8 is a hardware structure diagram of a server provided by an embodiment of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
Fig. 1 is a structural block diagram of the system for determining the number of reduction functions provided by an embodiment of the present application. The system shown in the figure can be used to implement the method for determining the number of reduction functions provided by the embodiments of the present application. Referring to Fig. 1, the system includes a MapReduce server, and the MapReduce server includes a mapping function 101 and a reduction function 102.
The input file is divided into multiple file blocks (splits), and each file block is processed by one mapping function (Mapper). The Mapper outputs a new set of key-value pairs, for example <key, value> pairs, and the key-value pairs are then sent to the reduction function (Reducer), which performs the reduction.
Based on the system shown in Fig. 1, the method for determining the number of reduction functions provided by the present application is introduced below from the perspective of the MapReduce server. Fig. 2 is a flowchart of the method provided by an embodiment of the present application. The method is applied to a MapReduce server and may include:
S21: obtain the number of key-value pairs output by the mapping function within a first preset time period.
Specifically, as shown in Fig. 3, this embodiment provides a specific implementation of obtaining the number of key-value pairs output by the mapping function within the first preset time period, including:
S31: obtain, for a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
S32: determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
S33: obtain the number of file blocks input to the mapping function within the first preset time period;
S34: determine that the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function is the number of key-value pairs output by the mapping function within the first preset time period.
The third preset time period may be shorter than the first preset time period. For example, if the third preset time period is 1 hour and the first preset time period is 12 hours, this embodiment first obtains the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function within 1 hour. It then determines the input-output ratio of the mapping function as the quotient of these two quantities. For the product in S34 to yield a number of key-value pairs, the ratio is read here as the number of key-value pairs output per input file block. Finally, it determines that the product of the number of file blocks input to the mapping function within the 12 hours and the input-output ratio is the number of key-value pairs output by the mapping function for the 12 hours.
In addition, the third preset time period may be a sub-period of the first preset time period. For example, if the first preset time period is 1:00-12:00, the third preset time period may be 1:00-2:00. That is, the data of one sub-period of the first preset time period (the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function in the third preset time period) are used to calculate the number of key-value pairs output by the mapping function within the whole first preset time period.
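Steps S31 to S34 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `estimate_mapper_output` and its parameters are hypothetical, the input-output ratio is taken as key-value pairs per file block, and the figures in the usage note are made up.

```python
def estimate_mapper_output(sample_blocks, sample_pairs, period_blocks):
    """Estimate the key-value pairs output in the first preset time period.

    sample_blocks: file blocks input during the third (sampling) period.
    sample_pairs:  key-value pairs output during the third (sampling) period.
    period_blocks: file blocks input during the first preset time period.
    """
    if sample_blocks <= 0:
        raise ValueError("the sampling period must contain at least one file block")
    # S32: input-output ratio, assumed to mean key-value pairs per file block.
    ratio = sample_pairs / sample_blocks
    # S34: scale the ratio by the number of blocks in the first preset period.
    return period_blocks * ratio
```

For instance, with 100 file blocks and 5000 key-value pairs observed in a 1-hour sample, and 1200 file blocks in the 12-hour period, `estimate_mapper_output(100, 5000, 1200)` gives 60000.0 key-value pairs.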
S22: determine the target time for the reduction function to process one key-value pair.
Specifically, as shown in Fig. 4, this embodiment provides a specific way of determining the target time for the reduction function to process one key-value pair, including:
S41: obtain the attribute identifier of the key-value pair;
S42: when the attribute identifier is a first-type attribute identifier, obtain the historical processing times of the key-value pair within a fourth preset time period, and determine that the average of the historical processing times is the target time of one key-value pair;
S43: when the attribute identifier is a second-type attribute identifier, split the key-value pair into multiple pieces of sub-data, obtain the processing time each reduction function takes to process each piece of sub-data, and determine that the sum of the processing times is the target time for the reduction function to process one key-value pair.
In this embodiment, the first-type attribute identifier marks a key-value pair that is not being run for the first time, for example a key-value pair that is rerun periodically. The historical processing times of such a key-value pair can therefore be obtained, and their average is taken as the target time of one key-value pair.
The second-type attribute identifier marks a key-value pair that is being run for the first time. When calculating the target time, the key-value pair can be split into multiple pieces of sub-data, the pieces are processed by multiple reduction functions, and the sum of the running times of all pieces of sub-data is determined to be the target time of the key-value pair.
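The two branches of S42 and S43 can be sketched as follows. This is only an illustration: the constants `RERUN` and `FIRST_RUN` are hypothetical labels for the first-type and second-type attribute identifiers, and the function assumes the historical or per-sub-data processing times have already been measured elsewhere.

```python
from statistics import mean

RERUN = "rerun"          # hypothetical label for the first-type attribute identifier
FIRST_RUN = "first_run"  # hypothetical label for the second-type attribute identifier


def target_time(attribute_id, history_times=None, sub_data_times=None):
    """Return the target time for processing one key-value pair.

    history_times:  past processing times within the fourth preset time period.
    sub_data_times: measured processing time of each piece of sub-data.
    """
    if attribute_id == RERUN:
        if not history_times:
            raise ValueError("a rerun key-value pair needs historical processing times")
        # S42: average of the historical processing times.
        return mean(history_times)
    if attribute_id == FIRST_RUN:
        if not sub_data_times:
            raise ValueError("a first-run key-value pair needs sub-data processing times")
        # S43: sum of the processing times of all pieces of sub-data.
        return sum(sub_data_times)
    raise ValueError(f"unknown attribute identifier: {attribute_id!r}")
```

Splitting a first-run key-value pair lets the pieces be timed in parallel on several reduction functions, while the summed result still approximates the time one reduction function would need for the whole pair.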
S23: based on the target time, determine the key-value pair processing quantity of the reduction function within a second preset time period.
Specifically, as shown in Fig. 5, this embodiment provides a specific implementation of determining, based on the target time, the key-value pair processing quantity of the reduction function within the second preset time period, including:
S51: determine that the quotient of the second preset time period and the target time is the key-value pair processing quantity of the reduction function within the second preset time period.
The second preset time period is a set value. It may be the expected computation time set by the user: for example, if the user wants the total duration of the data computation to stay within 1 minute, the second preset time period can be set to 1 minute. Alternatively, a preferred computation time can be estimated automatically from historical computation times and used as the second preset time period. For example, if the times taken by the last three computations over an equal number of key-value pairs are t1, t2 and t3, the average of t1, t2 and t3 is determined to be the second preset time period.
The key-value pair processing quantity of the reduction function characterizes the capacity of the reduction function to process key-value pairs. In this step, a preferred computation time, namely the second preset time period, can first be determined empirically. Then, from the target time for the reduction function to process one key-value pair, the key-value pair processing quantity of the reduction function is determined as: second preset time period / target time.
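Step S51, together with the optional averaging of historical computation times, can be sketched as below. The function names `second_period_from_history` and `reducer_capacity` are hypothetical, and both values are assumed to be expressed in the same time unit.

```python
from statistics import mean


def second_period_from_history(durations):
    """Derive the second preset time period as the average of past computation times."""
    if not durations:
        raise ValueError("at least one historical computation time is required")
    return mean(durations)


def reducer_capacity(second_period, target_time):
    """S51: key-value pairs one reduction function can process in the second period."""
    if target_time <= 0:
        raise ValueError("the target time per key-value pair must be positive")
    return second_period / target_time
```

For instance, three past computations of 50, 60 and 70 seconds give a second preset time period of 60 seconds; with a target time of 0.5 seconds per key-value pair, one reduction function can process 120 key-value pairs in that period.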
S24: determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function.
Specifically, as shown in Fig. 6, this embodiment provides a specific implementation of determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function, including:
S61: determine that the quotient of the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function is the target number of reduction functions.
Step S23 has determined the key-value pair processing quantity of each reduction function, so the target number of reduction functions required = number of key-value pairs output by the mapping function / key-value pair processing quantity of the reduction function.
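Step S61 reduces to a single division, sketched below. The source only states that the quotient is taken; rounding the quotient up to a whole number and using at least one reduction function are assumptions added here so that the result is a usable count. The function name `target_reducer_count` is hypothetical.

```python
import math


def target_reducer_count(mapper_output_pairs, capacity_per_reducer):
    """S61: target number of reduction functions for the expected workload.

    mapper_output_pairs:  key-value pairs output by the mapping function (S21).
    capacity_per_reducer: key-value pairs one reduction function handles (S23).
    """
    if capacity_per_reducer <= 0:
        raise ValueError("the per-reducer capacity must be positive")
    quotient = mapper_output_pairs / capacity_per_reducer
    # Assumption: round up so no key-value pair is left unassigned, minimum of one.
    return max(1, math.ceil(quotient))
```

With the made-up figures of 60000 key-value pairs and a capacity of 120 pairs per reduction function, `target_reducer_count(60000, 120)` returns 500.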
It can be seen that this solution determines the target number of reduction functions from the number of key-value pairs output by the mapping function, so that resources are allocated reasonably and task processing efficiency improves, which avoids the resource waste or stalling caused by too many or too few reduction functions.
The MapReduce server provided by the embodiments of the present application is introduced below. The MapReduce server described below and the method for determining the number of reduction functions described above from the MapReduce server side may be read with reference to each other. Fig. 7 is a structural block diagram of the MapReduce server provided by an embodiment of the present application. Referring to Fig. 7, the MapReduce server may include:
An obtaining module 71, configured to obtain the number of key-value pairs output by the mapping function within a first preset time period;
A first determining module 72, configured to determine the target time for the reduction function to process one key-value pair;
A second determining module 73, configured to determine, based on the target time, the key-value pair processing quantity of the reduction function within a second preset time period;
A third determining module 74, configured to determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function.
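One way to picture how modules 71 to 74 cooperate is the small Python sketch below. It is an illustration only, not the structure of the patented apparatus: the class name `ReducerCountDeterminer` and its fields are hypothetical, and the data sources of modules 71 and 72 are modeled as injected callables so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReducerCountDeterminer:
    get_mapper_output: Callable[[], float]  # role of obtaining module 71
    get_target_time: Callable[[], float]    # role of first determining module 72
    second_period: float                    # second preset time period

    def capacity(self, target_time: float) -> float:
        """Role of second determining module 73: pairs handled per reduction function."""
        return self.second_period / target_time

    def determine(self) -> float:
        """Role of third determining module 74: quotient of output and capacity."""
        pairs = self.get_mapper_output()
        per_reducer = self.capacity(self.get_target_time())
        return pairs / per_reducer
```

A usage example with made-up values: `ReducerCountDeterminer(lambda: 60000, lambda: 0.5, 60.0).determine()` evaluates to 500.0.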
On the basis of the above embodiments, the obtaining module provided in this embodiment includes:
A first obtaining unit, configured to obtain, for a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function, where the third preset time period is contained in the first preset time period;
A first determining unit, configured to determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
A second obtaining unit, configured to obtain the number of file blocks input to the mapping function within the first preset time period;
A second determining unit, configured to determine that the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function is the number of key-value pairs output by the mapping function within the first preset time period.
In addition, in the apparatus for determining the number of reduction functions provided in this embodiment, the first determining module includes:
A third obtaining unit, configured to obtain the attribute identifier of the key-value pair;
A third determining unit, configured to, when the attribute identifier is a first-type attribute identifier, obtain the historical processing times of the key-value pair within a fourth preset time period and determine that the average of the historical processing times is the target time of one key-value pair;
A fourth determining unit, configured to, when the attribute identifier is a second-type attribute identifier, split the key-value pair into multiple pieces of sub-data, obtain the processing time each reduction function takes to process each piece of sub-data, and determine that the sum of the processing times is the target time for the reduction function to process one key-value pair.
On the basis of the above embodiments, the second determining module includes a fifth determining unit, and the third determining module includes a sixth determining unit.
The fifth determining unit is configured to determine that the quotient of the second preset time period and the target time is the key-value pair processing quantity of the reduction function within the second preset time period;
The sixth determining unit is configured to determine that the quotient of the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function is the target number of reduction functions.
For the working principle of the MapReduce server, refer to the method embodiments above.
The above describes the software function module architecture of the MapReduce server. In terms of hardware structure, the server can implement the resource allocation scheme as follows:
Fig. 8 is a hardware structure diagram of the server provided by an embodiment of the present application. Referring to Fig. 8, the server may include:
A processor 111, a communication interface 112, a memory 113 and a communication bus 114;
The processor 111, the communication interface 112 and the memory 113 communicate with one another through the communication bus 114;
Optionally, the communication interface 112 may be an interface of a communication module, such as an interface of a GSM module;
The processor 111 is configured to execute a program;
The memory 113 is configured to store the program;
The program may include program code, and the program code includes computer operation instructions.
The processor 111 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
The memory 113 may include high-speed RAM, and may also include non-volatile memory, for example at least one magnetic disk storage.
The program may be specifically used for:
Obtaining the number of key-value pairs output by the mapping function within a first preset time period;
Determining the target time for the reduction function to process one key-value pair;
Determining, based on the target time, the key-value pair processing quantity of the reduction function within a second preset time period;
Determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function.
Optionally, obtaining the number of key-value pairs output by the mapping function within the first preset time period includes:
Obtaining, for a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function, where the third preset time period is contained in the first preset time period;
Determining the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
Obtaining the number of file blocks input to the mapping function within the first preset time period;
Determining that the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function is the number of key-value pairs output by the mapping function within the first preset time period.
Optionally, determining the target time for the reduction function to process one key-value pair includes:
Obtaining the attribute identifier of the key-value pair;
When the attribute identifier is a first-type attribute identifier, obtaining the historical processing times of the key-value pair within a fourth preset time period, and determining that the average of the historical processing times is the target time of one key-value pair;
When the attribute identifier is a second-type attribute identifier, splitting the key-value pair into multiple pieces of sub-data, obtaining the processing time each reduction function takes to process each piece of sub-data, and determining that the sum of the processing times is the target time for the reduction function to process one key-value pair.
Optionally, determining, based on the target time, the key-value pair processing quantity of the reduction function within the second preset time period includes:
Determining that the quotient of the second preset time period and the target time is the key-value pair processing quantity of the reduction function within the second preset time period.
Optionally, determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function includes:
Determining that the quotient of the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function is the target number of reduction functions.
For the working principle of the server, refer to the method embodiments above; it is not repeated here. The server can determine the target number of reduction functions from the number of key-value pairs output by the mapping function, so that resources are allocated reasonably and task processing efficiency improves, which avoids the resource waste or stalling caused by too many or too few reduction functions.
In conclusion this application provides a kind of specification function numbers to determine method, apparatus and system, it is based on
MapReduce server, wherein MapReduce server includes mapping function and reduction function, which determines
Method obtains the quantity for the key-value pair that mapping function exports in the first preset period of time first.Then the processing of reduction function is determined
The object time of one key-value pair, and it is based on the object time, determine the key-value pair of reduction function in the second preset period of time
Handle quantity.The quantity of key-value pair later based on mapping function output and the key-value pair of reduction function handle quantity, determine
The destination number of reduction function out.As it can be seen that the quantity for the key-value pair that this programme is exported according to mapping function determines reduction function
Destination number so that resource reasonable distribution, and then improve task treatment effeciency, avoid the quantity mistake due to reduction function
The problems such as wasting of resources caused by more or very few or Caton.
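The overall flow summarized above can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the `Observation` fields, the use of seconds, and the final rounding up to at least one reduction function are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical measurements gathered by the MapReduce server."""
    sample_blocks: int       # file blocks read in the observation window
    sample_pairs: int        # key-value pairs output in the observation window
    window_blocks: int       # file blocks input in the first preset time period
    seconds_per_pair: float  # time to process one key-value pair
    window_seconds: float    # length of the second preset time period


def plan_reducers(obs: Observation) -> int:
    # Step 1: estimate the map output from the input-output ratio.
    ratio = obs.sample_pairs / obs.sample_blocks
    map_output = obs.window_blocks * ratio
    # Step 2: pairs a single reduction function can process in the window.
    capacity = obs.window_seconds / obs.seconds_per_pair
    # Step 3: quotient of the two, rounded up (rounding is an assumption).
    return max(1, math.ceil(map_output / capacity))


if __name__ == "__main__":
    obs = Observation(sample_blocks=10, sample_pairs=5000, window_blocks=4,
                      seconds_per_pair=0.5, window_seconds=60.0)
    print(plan_reducers(obs))
```

In the example, the ratio is 500 pairs per block, the estimated map output is 2000 pairs, one reduction function handles 120 pairs per window, and the quotient of about 16.67 is rounded up.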
The embodiments in this specification are described progressively: each embodiment focuses on how it differs from the others, and parts that are the same or similar can be cross-referenced between embodiments. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A reduction function quantity determination method, characterized in that it is applied to a MapReduce server, the MapReduce server comprising a mapping function and a reduction function, the mapping function being used to perform computation on input file blocks and output at least one key-value pair, and the reduction function being used to perform reduction computation on the key-value pair, the reduction function quantity determination method comprising:
obtaining the number of key-value pairs output by the mapping function within a first preset time period;
determining a target time for the reduction function to process one key-value pair;
determining, based on the target time, the key-value pair processing quantity of the reduction function within a second preset time period;
determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function.
2. The reduction function quantity determination method according to claim 1, characterized in that obtaining the number of key-value pairs output by the mapping function within the first preset time period comprises:
obtaining, within a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
determining the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
obtaining the number of file blocks input to the mapping function within the first preset time period;
determining the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function within the first preset time period.
3. The reduction function quantity determination method according to claim 1, characterized in that determining the target time for the reduction function to process one key-value pair comprises:
obtaining the attribute identifier of the key-value pair;
when the attribute identifier is a first-type attribute identifier, obtaining the historical processing times of the key-value pair within a fourth preset time period, and determining the average of the historical processing times as the target time for one key-value pair;
when the attribute identifier is a second-type attribute identifier, splitting the key-value pair into multiple pieces of sub-data, obtaining the time each reduction function takes to process each piece of sub-data, and determining the sum of those processing times as the target time for the reduction function to process one key-value pair.
4. The reduction function quantity determination method according to any one of claims 1-3, characterized in that determining, based on the target time, the key-value pair processing quantity of the reduction function within the second preset time period comprises:
determining the quotient of the second preset time period and the target time as the key-value pair processing quantity of the reduction function within the second preset time period.
5. The reduction function quantity determination method according to any one of claims 1-3, characterized in that determining the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function comprises:
determining the quotient of the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function as the target number of reduction functions.
6. A reduction function quantity determination device, characterized in that it is applied to a MapReduce server, the MapReduce server comprising a mapping function and a reduction function, the mapping function being used to perform computation on input file blocks and output at least one key-value pair, and the reduction function being used to perform reduction computation on the key-value pair, the reduction function quantity determination device comprising:
an obtaining module, used to obtain the number of key-value pairs output by the mapping function within a first preset time period;
a first determining module, used to determine a target time for the reduction function to process one key-value pair;
a second determining module, used to determine, based on the target time, the key-value pair processing quantity of the reduction function within a second preset time period;
a third determining module, used to determine the target number of reduction functions based on the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function.
7. The reduction function quantity determination device according to claim 6, characterized in that the obtaining module comprises:
a first obtaining unit, used to obtain, within a third preset time period, the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
a first determining unit, used to determine the input-output ratio of the mapping function based on the number of file blocks input to the mapping function and the number of key-value pairs output by the mapping function;
a second obtaining unit, used to obtain the number of file blocks input to the mapping function within the first preset time period;
a second determining unit, used to determine the product of the number of file blocks input to the mapping function within the first preset time period and the input-output ratio of the mapping function as the number of key-value pairs output by the mapping function within the first preset time period.
8. The reduction function quantity determination device according to claim 6, characterized in that the first determining module comprises:
a third obtaining unit, used to obtain the attribute identifier of the key-value pair;
a third determining unit, used to, when the attribute identifier is a first-type attribute identifier, obtain the historical processing times of the key-value pair within a fourth preset time period and determine the average of the historical processing times as the target time for one key-value pair;
a fourth determining unit, used to, when the attribute identifier is a second-type attribute identifier, split the key-value pair into multiple pieces of sub-data, obtain the time each reduction function takes to process each piece of sub-data, and determine the sum of those processing times as the target time for the reduction function to process one key-value pair.
9. The reduction function quantity determination device according to any one of claims 6-8, characterized in that the second determining module comprises a fifth determining unit and the third determining module comprises a sixth determining unit,
the fifth determining unit being used to determine the quotient of the second preset time period and the target time as the key-value pair processing quantity of the reduction function within the second preset time period;
the sixth determining unit being used to determine the quotient of the number of key-value pairs output by the mapping function and the key-value pair processing quantity of the reduction function as the target number of reduction functions.
10. A reduction function quantity determination system, characterized in that it comprises the reduction function quantity determination device according to any one of claims 6-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910171361.5A CN109901931B (en) | 2019-03-07 | 2019-03-07 | Reduction function quantity determination method, device and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109901931A true CN109901931A (en) | 2019-06-18 |
CN109901931B CN109901931B (en) | 2021-06-15 |
Family
ID=66946617
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910171361.5A Active CN109901931B (en) | 2019-03-07 | 2019-03-07 | Reduction function quantity determination method, device and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109901931B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||