CN110489301A - Method, apparatus, and device for analyzing MapReduce task performance - Google Patents
- Publication number
- CN110489301A (application CN201910776593.3A)
- Authority
- CN
- China
- Prior art keywords
- data volume
- reduce
- map
- current
- variable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
- G06F11/3072—Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
This application relates to a method, apparatus, and device for analyzing MapReduce task performance. The method includes: setting a time acquisition threshold that comprises a start time and an end time; obtaining a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time; obtaining a map skew variable according to a first preset algorithm, and a reduce skew variable according to a second preset algorithm, from the collected data volume processed by historical map/reduce tasks, the data volume that current map/reduce tasks have finished processing, and the data volume that current map/reduce tasks are processing; inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter; and obtaining an analysis result of MapReduce task performance from the data skew risk parameter and a preset task performance lookup table. With this method, an analysis result of MapReduce task performance can be obtained, so that the performance of task execution is known and the best-performing case can be identified from the analysis result.
Description
Technical field
This application relates to the field of big data technology, and in particular to a method, apparatus, and device for analyzing MapReduce task performance.
Background
With the rise of artificial intelligence, the mobile Internet, and the Internet of Things, big data technology is in a stage of rapid development, and offline big data computing is an important part of it. Offline big data computing is generally performed with Hive, a data warehouse tool that converts SQL statements into MapReduce tasks for execution. For a big data cluster, real-time monitoring and analysis of MapReduce task execution performance is very important. The efficiency and performance of MapReduce task execution follow the bucket effect (the shortest stave limits the whole), so they can be determined by analyzing the weakest points in the execution of a MapReduce task.
At present there is no method that effectively analyzes task performance and provides solutions. The best-performing case of task execution cannot be identified through analysis, and no suitable solution can be found when task execution performance is poor.
Summary of the invention
To overcome, at least to some extent, the problems in the related art, this application provides a method, apparatus, and device for analyzing MapReduce task performance.
According to a first aspect of this application, a method for analyzing MapReduce task performance is provided, comprising:
setting a time acquisition threshold, the time acquisition threshold comprising a start time and an end time;
obtaining a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time;
collecting the data volume processed by historical maps, the data volume that current maps have finished processing, the data volume that current maps are processing, the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing;
obtaining a map skew variable according to a first preset algorithm from the data volume processed by historical maps, the data volume that current maps have finished processing, and the data volume that current maps are processing;
obtaining a reduce skew variable according to a second preset algorithm from the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing;
inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter;
obtaining the analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table.
Optionally, obtaining the map skew variable according to the first preset algorithm from the data volume processed by historical maps, the data volume that current maps have finished processing, and the data volume that current maps are processing comprises:
obtaining a first historical mean from the data volume processed by historical maps;
obtaining a first current mean from the data volume that current maps have finished processing;
obtaining a first variance result by a variance algorithm from the data volume that current maps have finished processing and the first historical mean;
obtaining a second variance result by a variance algorithm from the data volume of the currently running maps and the first current mean;
performing a linear operation on the first variance result and the second variance result in a first preset ratio to obtain the map skew variable.
Optionally, obtaining the reduce skew variable according to the second preset algorithm from the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing comprises:
obtaining a second historical mean from the data volume processed by historical reduces;
obtaining a second current mean from the data volume that current reduces have finished processing;
obtaining a third variance result by a variance algorithm from the data volume that current reduces have finished processing and the second historical mean;
obtaining a fourth variance result by a variance algorithm from the data volume of the currently running reduces and the second current mean;
performing a linear operation on the third variance result and the fourth variance result in a second preset ratio to obtain the reduce skew variable.
Optionally, the pre-trained data skew model is given by a formula (not reproduced in this text) in which m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, c are preset constants.
Optionally, performance-influencing parameters are obtained from MapReduce run data and the analysis result; the performance-influencing parameters are the parameters that affect the MapReduce task performance.
The MapReduce run data include: task CPU time, GC time, average task execution duration, the percentage of each CPU spent on GC, and the average data volume processed per task.
According to a second aspect of this application, an apparatus for analyzing MapReduce task performance is provided, comprising:
a computing module, which sets a time acquisition threshold comprising a start time and an end time, and obtains a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time;
a collection module, which collects the data volume processed by historical maps, the data volume that current maps have finished processing, the data volume that current maps are processing, the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing;
a first preset algorithm module, which obtains a map skew variable according to a first preset algorithm from the data volume processed by historical maps, the data volume that current maps have finished processing, and the data volume that current maps are processing;
a second preset algorithm module, which obtains a reduce skew variable according to a second preset algorithm from the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing;
an input module, which inputs the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter;
an analysis module, which obtains the analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table.
Optionally, the first preset algorithm module comprises:
a first computing unit, which obtains a first historical mean from the data volume processed by historical maps;
a second computing unit, which obtains a first current mean from the data volume that current maps have finished processing;
a first variance result unit, which obtains a first variance result by a variance algorithm from the data volume that current maps have finished processing and the first historical mean;
a second variance result unit, which obtains a second variance result by a variance algorithm from the data volume of the currently running maps and the first current mean;
a first linear operation unit, which performs a linear operation on the first variance result and the second variance result in a first preset ratio to obtain the map skew variable.
Optionally, the second preset algorithm module comprises:
a third computing unit, which obtains a second historical mean from the data volume processed by historical reduces;
a fourth computing unit, which obtains a second current mean from the data volume that current reduces have finished processing;
a third variance result unit, which obtains a third variance result by a variance algorithm from the data volume that current reduces have finished processing and the second historical mean;
a fourth variance result unit, which obtains a fourth variance result by a variance algorithm from the data volume of the currently running reduces and the second current mean;
a second linear operation unit, which performs a linear operation on the third variance result and the fourth variance result in a second preset ratio to obtain the reduce skew variable.
Optionally, the input module includes the pre-trained data skew model, which is given by a formula (not reproduced in this text) in which m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, c are preset constants.
According to a third aspect of this application, a device for analyzing MapReduce task performance is provided, comprising:
a processor, and a memory connected to the processor;
the memory is used to store a computer program, and the computer program is at least used to execute the method for analyzing MapReduce task performance described in the first aspect of this application;
the processor is used to call and execute the computer program in the memory.
The technical solution provided by this application can have the following beneficial effects:
The method for analyzing MapReduce task performance includes: setting a time acquisition threshold divided into a start time and an end time, and obtaining a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time; collecting the data volume processed by historical map/reduce tasks, the data volume that current map/reduce tasks have finished processing, and the data volume that current map/reduce tasks are processing; obtaining a map skew variable according to a first preset algorithm from the historical, finished, and in-progress map data volumes; obtaining a reduce skew variable according to a second preset algorithm from the historical, finished, and in-progress reduce data volumes; inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter; and obtaining the analysis result of MapReduce task performance from the data skew risk parameter and a preset task performance lookup table. With this method, an analysis result of MapReduce task performance can be obtained, so that the performance of task execution is known and the best-performing case can be identified from the analysis result.
It should be understood that the general description above and the detailed description below are only exemplary and explanatory, and do not limit this application.
Brief description of the drawings
The drawings here are incorporated into and form part of this specification. They show embodiments consistent with this application and, together with the specification, serve to explain its principles.
Fig. 1 is a flow diagram of the MapReduce task performance analysis method provided by Embodiment 1 of this application.
Fig. 2 is a flow diagram of obtaining the map skew variable, provided by Embodiment 1 of this application.
Fig. 3 is a flow diagram of obtaining the reduce skew variable, provided by Embodiment 1 of this application.
Fig. 4 is a structural diagram of the MapReduce task performance analysis apparatus provided by Embodiment 2 of this application.
Fig. 5 is a structural diagram of the first preset algorithm module provided by Embodiment 2 of this application.
Fig. 6 is a structural diagram of the second preset algorithm module provided by Embodiment 2 of this application.
Fig. 7 is a structural diagram of the MapReduce task performance analysis device provided by Embodiment 3 of this application.
Detailed description
Example embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings indicate the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are only examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
Embodiment 1
Referring to Fig. 1, Fig. 1 is a flow diagram of the MapReduce task performance analysis method provided by Embodiment 1 of this application.
As shown in Fig. 1, the method for analyzing MapReduce task performance includes:
Step 101: set a time acquisition threshold comprising a start time and an end time, and obtain a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time.
Specifically, Hadoop provides a JMX API. Metrics data are captured periodically, and the difference between the reduce progress at the end time and the reduce progress at the start time, determined from the metric values, is taken as the progress analysis variable. The smaller this difference, the slower the execution progress.
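The difference described in step 101 can be sketched as follows. This is a minimal illustration, assuming the two reduce-progress samples have already been fetched from the metrics endpoint; the `ProgressSample` type and the function name are hypothetical and do not come from the patent or from Hadoop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgressSample:
    """Hypothetical container for one sampled metrics reading."""
    timestamp: float        # seconds since some reference point
    reduce_progress: float  # fraction of reduce work done, 0.0 to 1.0


def progress_analysis_variable(start: ProgressSample, end: ProgressSample) -> float:
    """Return end-time reduce progress minus start-time reduce progress.

    A smaller value means the reduce phase advanced less during the window.
    """
    if end.timestamp <= start.timestamp:
        raise ValueError("end sample must be taken after the start sample")
    return end.reduce_progress - start.reduce_progress


# Usage: two samples taken 60 seconds apart.
z = progress_analysis_variable(ProgressSample(0.0, 0.25), ProgressSample(60.0, 0.40))
```

The window length itself is not part of the result here; whether to normalize by elapsed time is a design choice the text leaves open.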
Step 102: collect the data volume processed by historical maps, the data volume that current maps have finished processing, the data volume that current maps are processing, the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing.
Step 103: obtain a map skew variable according to a first preset algorithm from the data volume processed by historical maps, the data volume that current maps have finished processing, and the data volume that current maps are processing.
Step 104: obtain a reduce skew variable according to a second preset algorithm from the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing.
Step 105: input the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter.
Specifically, the pre-trained data skew model is given by a formula (not reproduced in this text) in which m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, c are preset constants.
Further, the data skew model is obtained by training on historical data.
Data acquisition: map and reduce data are collected on YARN, including the data volume processed by map and reduce, the processing duration, the machine execution duration, the reduce processing progress, and log analysis on JobHistory, among other information.
Step 106: obtain the analysis result of MapReduce task performance from the data skew risk parameter and the preset task performance lookup table.
Specifically, the task performance lookup table maps the data skew risk parameter to a task performance grade, which gives the analysis result.
For example: when the data skew risk parameter m is in the range 0 to 0.2, the task performance grade is normal; when m is in the range 0.2 to 0.4, the grade is mediocre, and a DingTalk message is sent to inform the responsible person that task performance is mediocre; when m is in the range 0.4 to 0.6, the grade is abnormal, and an SMS is sent to inform the responsible person that task performance is abnormal; when m is in the range 0.6 to 0.8, task performance is seriously abnormal, and the responsible person is phoned and told so; when m is in the range 0.8 to 1.0, task performance is very seriously abnormal, and the responsible person's superior is phoned and told so.
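The example ranges above can be encoded as a small lookup. The sketch below is one possible encoding, not the patent's implementation: the text does not say which grade a boundary value such as 0.2 belongs to, so the code assumes each range includes its lower bound and that 1.0 belongs to the last range. The grade labels and the `classify` name are illustrative.

```python
from bisect import bisect_right

# Upper bounds separating the five example ranges (assumed half-open: [low, high)).
_BOUNDS = [0.2, 0.4, 0.6, 0.8]

# (grade, notification) pairs in the same order as the ranges in the text.
_GRADES = [
    ("normal", None),
    ("mediocre", "DingTalk message to the responsible person"),
    ("abnormal", "SMS to the responsible person"),
    ("seriously abnormal", "phone call to the responsible person"),
    ("very seriously abnormal", "phone call to the responsible person's superior"),
]


def classify(m: float):
    """Map a data skew risk parameter in [0, 1] to (grade, notification)."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("risk parameter must lie in [0, 1]")
    return _GRADES[bisect_right(_BOUNDS, m)]


grade, action = classify(0.5)  # falls in the 0.4 to 0.6 range
```

Keeping the bounds and grades in parallel lists makes it easy to retune the thresholds without touching the lookup logic.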
Performance-influencing parameters are obtained from MapReduce run data and the analysis result; the performance-influencing parameters are the parameters that affect the MapReduce task performance. The MapReduce run data include: task CPU time, GC time, average task execution duration, the percentage of each CPU spent on GC, and the average data volume processed per task.
Obtaining the performance-influencing parameters from the MapReduce run data and the analysis result relies on heuristic reasoning combined with practical experience. Different performance-influencing parameters point to different factors that should be investigated. For example, the resulting parameter may be an abnormal average task execution duration, or an abnormal average task execution duration together with an abnormal percentage of each CPU spent on GC. In this way, the analysis of MapReduce task performance helps find a solution to the problem quickly.
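The text leaves the heuristic itself to practical experience. Purely as an illustration, one simple heuristic is to compare each run metric with a baseline and flag those that deviate by more than a tolerance. Everything in this sketch (the metric keys, the baseline values, the 50% tolerance, the function name) is an assumption and is not taken from the patent.

```python
from typing import Dict, List


def suspect_parameters(
    current: Dict[str, float],
    baseline: Dict[str, float],
    tolerance: float = 0.5,
) -> List[str]:
    """Return metric names whose relative deviation from baseline exceeds tolerance.

    Metrics missing from `current`, or with a zero baseline, are skipped.
    """
    suspects = []
    for name, base in baseline.items():
        value = current.get(name)
        if value is None or base == 0:
            continue
        if abs(value - base) / abs(base) > tolerance:
            suspects.append(name)
    return suspects


# Hypothetical metric names standing in for the run data listed in the text.
baseline = {"avg_task_duration_s": 60.0, "gc_time_pct": 5.0}
current = {"avg_task_duration_s": 150.0, "gc_time_pct": 6.0}
flagged = suspect_parameters(current, baseline)
```

In this usage only the average task duration deviates by more than 50% from its baseline, so it is the only metric flagged for investigation.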
In summary, the method for analyzing MapReduce task performance includes: setting a time acquisition threshold divided into a start time and an end time, and obtaining a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time; collecting the data volume processed by historical map/reduce tasks, the data volume that current map/reduce tasks have finished processing, and the data volume that current map/reduce tasks are processing; obtaining a map skew variable according to a first preset algorithm from the historical, finished, and in-progress map data volumes; obtaining a reduce skew variable according to a second preset algorithm from the historical, finished, and in-progress reduce data volumes; inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter; and obtaining the analysis result of MapReduce task performance from the data skew risk parameter and a preset task performance lookup table. With this method, an analysis result of MapReduce task performance can be obtained, so that the performance of task execution is known and the best-performing case can be identified from the analysis result.
Step 103, obtaining the map skew variable according to the first preset algorithm from the data volume processed by historical maps, the data volume that current maps have finished processing, and the data volume that current maps are processing, includes the following process.
Referring to Fig. 2, Fig. 2 is a flow diagram of obtaining the map skew variable, provided by Embodiment 1 of this application.
As shown in Fig. 2, the process of obtaining the map skew variable in step 103 may include:
Step 201: obtain a first historical mean from the data volume processed by historical maps.
Specifically, when MapReduce is running, many maps process data at the same time. The historical map data may cover the previous day or the previous week; overall, the mean data volume processed by historical maps is approximately constant.
Step 202: obtain a first current mean from the data volume that current maps have finished processing.
Specifically, for example, 10,000 maps may currently be processing data, of which 1,000 have finished; the data volumes of these finished maps are averaged to obtain the first current mean.
Step 203: obtain a first variance result by a variance algorithm from the data volume that current maps have finished processing and the first historical mean.
Specifically, the larger the variance, the larger the fluctuation and the poorer the stability.
Step 204: obtain a second variance result by a variance algorithm from the data volume of the currently running maps and the first current mean.
Step 205: perform a linear operation on the first variance result and the second variance result in a first preset ratio to obtain the map skew variable.
Specifically, the first preset ratio is determined according to actual conditions; for example, the first variance result and the second variance result may be combined in a ratio of 1:1. The larger the map skew variable, the larger the fluctuation and the poorer the stability.
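Steps 201 to 205 can be sketched as follows. The text does not spell out the variance algorithm or the linear operation, so this sketch assumes a mean squared deviation about the given mean and a weighted sum with weights 1:1 by default; the function names are illustrative. The same function would apply unchanged to reduce data volumes in steps 301 to 305.

```python
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    if not values:
        raise ValueError("at least one data volume is required")
    return sum(values) / len(values)


def _variance_about(values: Sequence[float], center: float) -> float:
    """Mean squared deviation of `values` about a given center."""
    return _mean([(v - center) ** 2 for v in values])


def skew_variable(
    history: Sequence[float],    # data volumes processed by historical tasks
    completed: Sequence[float],  # data volumes of current tasks that have finished
    running: Sequence[float],    # data volumes of current tasks still running
    w_first: float = 1.0,        # assumed weights for the "preset ratio" (1:1)
    w_second: float = 1.0,
) -> float:
    historical_mean = _mean(history)                          # step 201
    current_mean = _mean(completed)                           # step 202
    first_var = _variance_about(completed, historical_mean)   # step 203
    second_var = _variance_about(running, current_mean)       # step 204
    return w_first * first_var + w_second * second_var        # step 205


# Usage with small illustrative data volumes (units are arbitrary, e.g. MB).
x = skew_variable(history=[100, 100, 100], completed=[90, 110], running=[100, 140])
```

With these inputs both means are 100, the first variance is 100, the second is 800, and the 1:1 weighted sum is 900; a single oversized running task dominates the result, which is the behavior a skew indicator should have.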
Step 104, obtaining the reduce skew variable according to the second preset algorithm from the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing, includes the following process.
Referring to Fig. 3, Fig. 3 is a flow diagram of obtaining the reduce skew variable, provided by Embodiment 1 of this application.
As shown in Fig. 3, the process of obtaining the reduce skew variable in step 104 may include:
Step 301: obtain a second historical mean from the data volume processed by historical reduces.
Step 302: obtain a second current mean from the data volume that current reduces have finished processing.
Step 303: obtain a third variance result by a variance algorithm from the data volume that current reduces have finished processing and the second historical mean.
Step 304: obtain a fourth variance result by a variance algorithm from the data volume of the currently running reduces and the second current mean.
Step 305: perform a linear operation on the third variance result and the fourth variance result in a second preset ratio to obtain the reduce skew variable.
Embodiment 2
Referring to Fig. 4, Fig. 4 is a structural diagram of the MapReduce task performance analysis apparatus provided by Embodiment 2 of this application.
As shown in Fig. 4, the MapReduce task performance analysis apparatus includes:
Computing module 401, which sets a time acquisition threshold comprising a start time and an end time, and obtains a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time.
Collection module 402, which collects the data volume processed by historical maps, the data volume that current maps have finished processing, the data volume that current maps are processing, the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing.
First preset algorithm module 403, which obtains a map skew variable according to a first preset algorithm from the data volume processed by historical maps, the data volume that current maps have finished processing, and the data volume that current maps are processing.
Second preset algorithm module 404, which obtains a reduce skew variable according to a second preset algorithm from the data volume processed by historical reduces, the data volume that current reduces have finished processing, and the data volume that current reduces are processing.
Input module 405, which inputs the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter.
Analysis module 406, which obtains the analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table.
The first preset algorithm module 403 may include the following units.
Referring to Fig. 5, Fig. 5 is a structural diagram of the first preset algorithm module provided by Embodiment 2 of this application.
As shown in Fig. 5, the first preset algorithm module includes:
First computing unit 501, which obtains a first historical mean from the data volume processed by historical maps.
Second computing unit 502, which obtains a first current mean from the data volume that current maps have finished processing.
First variance result unit 503, which obtains a first variance result by a variance algorithm from the data volume that current maps have finished processing and the first historical mean.
Second variance result unit 504, which obtains a second variance result by a variance algorithm from the data volume of the currently running maps and the first current mean.
First linear operation unit 505, which performs a linear operation on the first variance result and the second variance result in a first preset ratio to obtain the map skew variable.
The second preset algorithm module 404 may include the following units.
Referring to Fig. 6, Fig. 6 is a structural diagram of the second preset algorithm module provided by Embodiment 2 of this application.
As shown in Fig. 6, the second preset algorithm module includes:
Third computing unit 601, which obtains a second historical mean from the data volume processed by historical reduces.
Fourth computing unit 602, which obtains a second current mean from the data volume that current reduces have finished processing.
Third variance result unit 603, which obtains a third variance result by a variance algorithm from the data volume that current reduces have finished processing and the second historical mean.
Fourth variance result unit 604, which obtains a fourth variance result by a variance algorithm from the data volume of the currently running reduces and the second current mean.
Second linear operation unit 605, which performs a linear operation on the third variance result and the fourth variance result in a second preset ratio to obtain the reduce skew variable.
Embodiment 3
Referring to Fig. 7, Fig. 7 is a structural diagram of the MapReduce task performance analysis device provided by Embodiment 3 of this application.
As shown in Fig. 7, the MapReduce task performance analysis device includes:
Processor 701, and memory 702 connected to the processor 701;
the memory is used to store a computer program, and the computer program is at least used to execute the method for analyzing MapReduce task performance described in Embodiment 1 of this application;
the processor is used to call and execute the computer program in the memory.
Regarding the device in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method and will not be elaborated here.
It can be understood that the same or similar parts of the above embodiments may refer to each other, and content not described in detail in some embodiments may refer to the same or similar content in other embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and cannot be understood as indicating or implying relative importance. In addition, in the description of the present application, unless otherwise stated, "multiple" means at least two.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
It should be understood that each part of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), etc.
Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, each unit may exist physically alone, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software function module. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present application. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present application.
Claims (10)
1. An analysis method of mapreduce task performance, characterized by comprising:
Setting a time acquisition threshold, the time acquisition threshold including a start time and an end time;
Obtaining a progress analysis variable according to the reduce progress at the start time and the reduce progress at the end time;
Collecting the data volume processed by the history map, the data volume whose processing the current map has completed, the data volume the current map is processing, the data volume processed by the history reduce, the data volume whose processing the current reduce has completed, and the data volume the current reduce is processing;
Obtaining a map skew variable by a first preset algorithm, according to the data volume processed by the history map, the data volume whose processing the current map has completed, and the data volume the current map is processing;
Obtaining a reduce skew variable by a second preset algorithm, according to the data volume processed by the history reduce, the data volume whose processing the current reduce has completed, and the data volume the current reduce is processing;
Inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter;
Obtaining an analysis result of the mapreduce task performance according to the data skew risk parameter and a preset task performance comparison table.
2. The analysis method of mapreduce task performance according to claim 1, characterized in that obtaining the map skew variable by the first preset algorithm, according to the data volume processed by the history map, the data volume whose processing the current map has completed, and the data volume the current map is processing, comprises:
Obtaining a first history mean according to the data volume processed by the history map;
Obtaining a first current mean according to the data volume whose processing the current map has completed;
Obtaining a first variance result by a variance algorithm, according to the data volume processed by the current map and the first history mean;
Obtaining a second variance result by a variance algorithm, according to the data volume of the current map that is running and the first current mean;
Performing a linear operation on the first variance result and the second variance result according to a first preset ratio to obtain the map skew variable.
3. The analysis method of mapreduce task performance according to claim 1, characterized in that obtaining the reduce skew variable by the second preset algorithm, according to the data volume processed by the history reduce, the data volume whose processing the current reduce has completed, and the data volume the current reduce is processing, comprises:
Obtaining a second history mean according to the data volume processed by the history reduce;
Obtaining a second current mean according to the data volume whose processing the current reduce has completed;
Obtaining a third variance result by a variance algorithm, according to the data volume processed by the current reduce and the second history mean;
Obtaining a fourth variance result by a variance algorithm, according to the data volume of the current reduce that is running and the second current mean;
Performing a linear operation on the third variance result and the fourth variance result according to a second preset ratio to obtain the reduce skew variable.
4. The analysis method of mapreduce task performance according to claim 1, characterized in that the pre-trained data skew model is:
Wherein m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, c are preset constants.
5. The analysis method of mapreduce task performance according to claim 1, characterized by further comprising:
Obtaining performance-affecting parameters according to mapreduce operation data and the analysis result, the performance-affecting parameters being parameters that affect the mapreduce task performance;
The mapreduce operation data includes: the execution duration of the task, CPU time, GC time, the average percentage of CPU time each task spends on GC, and the average data volume processed by each task.
6. An analysis device of mapreduce task performance, characterized by comprising:
A computing module, which sets a time acquisition threshold, the time acquisition threshold including a start time and an end time, and obtains a progress analysis variable according to the reduce progress at the start time and the reduce progress at the end time;
A collection module, which collects the data volume processed by the history map, the data volume whose processing the current map has completed, the data volume the current map is processing, the data volume processed by the history reduce, the data volume whose processing the current reduce has completed, and the data volume the current reduce is processing;
A first preset algorithm module, which obtains a map skew variable by a first preset algorithm, according to the data volume processed by the history map, the data volume whose processing the current map has completed, and the data volume the current map is processing;
A second preset algorithm module, which obtains a reduce skew variable by a second preset algorithm, according to the data volume processed by the history reduce, the data volume whose processing the current reduce has completed, and the data volume the current reduce is processing;
An input module, which inputs the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter;
An analysis module, which obtains an analysis result of the mapreduce task performance according to the data skew risk parameter and a preset task performance comparison table.
7. The analysis device of mapreduce task performance according to claim 6, characterized in that the first preset algorithm module comprises:
A first computing unit, which obtains a first history mean according to the data volume processed by the history map;
A second computing unit, which obtains a first current mean according to the data volume whose processing the current map has completed;
A first variance result unit, which obtains a first variance result by a variance algorithm, according to the data volume processed by the current map and the first history mean;
A second variance result unit, which obtains a second variance result by a variance algorithm, according to the data volume of the current map that is running and the first current mean;
A first linear operation unit, which performs a linear operation on the first variance result and the second variance result according to a first preset ratio to obtain the map skew variable.
8. The analysis device of mapreduce task performance according to claim 6, characterized in that the second preset algorithm module comprises:
A third computing unit, which obtains a second history mean according to the data volume processed by the history reduce;
A fourth computing unit, which obtains a second current mean according to the data volume whose processing the current reduce has completed;
A third variance result unit, which obtains a third variance result by a variance algorithm, according to the data volume processed by the current reduce and the second history mean;
A fourth variance result unit, which obtains a fourth variance result by a variance algorithm, according to the data volume of the current reduce that is running and the second current mean;
A second linear operation unit, which performs a linear operation on the third variance result and the fourth variance result according to a second preset ratio to obtain the reduce skew variable.
9. The analysis device of mapreduce task performance according to claim 6, characterized in that the pre-trained data skew model is:
Wherein m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, c are preset constants.
10. An analysis device of mapreduce task performance, characterized by comprising:
A processor, and a memory connected to the processor;
The memory is used to store a computer program, and the computer program is at least used to execute the analysis method of mapreduce task performance according to claim 1;
The processor is used to call and execute the computer program in the memory.
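The overall flow of claim 1 can be sketched in Python as follows. This is an illustrative sketch only, under several assumptions not stated in the text: the progress analysis variable is taken to be the reduce progress gained per unit time within the acquisition window, the trained data skew model is passed in as an arbitrary callable because its formula is not reproduced here, and the comparison table with its thresholds and labels (`RISK_BOUNDS`, `RISK_LABELS`) is entirely hypothetical.

```python
from bisect import bisect_right


def progress_variable(progress_start, progress_end, t_start, t_end):
    """Assumed form: reduce progress gained per unit time in the window."""
    if t_end <= t_start:
        raise ValueError("end time must be later than start time")
    return (progress_end - progress_start) / (t_end - t_start)


# Hypothetical task performance comparison table: upper bounds of the
# risk bands and the analysis result attached to each band.
RISK_BOUNDS = [0.3, 0.7]
RISK_LABELS = ["no significant skew", "possible skew", "severe skew"]


def lookup_result(risk):
    """Map a data skew risk parameter to an analysis result via the table."""
    return RISK_LABELS[bisect_right(RISK_BOUNDS, risk)]


def analyze(z, x, y, model):
    """Feed the three variables to a trained model and look up the result.

    z     -- progress analysis variable
    x     -- map skew variable
    y     -- reduce skew variable
    model -- callable (z, x, y) -> risk; stands in for the trained model
    """
    risk = model(z, x, y)
    return risk, lookup_result(risk)
```

Passing the model as a callable keeps the sketch independent of the specific trained formula, so any model with the same three inputs could be substituted.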
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910776593.3A CN110489301B (en) | 2019-08-22 | 2019-08-22 | Mapreduce task performance analysis method, device and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110489301A (en) | 2019-11-22 |
CN110489301B (en) | 2023-03-10 |
Family
ID=68552729
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910776593.3A Active CN110489301B (en) | 2019-08-22 | 2019-08-22 | Mapreduce task performance analysis method, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110489301B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111290917A (en) * | 2020-02-26 | 2020-06-16 | 深圳市云智融科技有限公司 | YARN-based resource monitoring method and device and terminal equipment |
CN111651267A (en) * | 2020-05-06 | 2020-09-11 | 京东数字科技控股有限公司 | Method and device for performing performance consumption optimization analysis on parallel operation |
CN113778727A (en) * | 2020-06-19 | 2021-12-10 | 北京沃东天骏信息技术有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106201754A (en) * | 2016-07-06 | 2016-12-07 | 乐视控股(北京)有限公司 | Mission bit stream analyzes method and device |
US20170061305A1 (en) * | 2015-08-28 | 2017-03-02 | Jiangnan University | Fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression |
WO2017031961A1 (en) * | 2015-08-24 | 2017-03-02 | 华为技术有限公司 | Data processing method and apparatus |
CN107562532A (en) * | 2017-07-13 | 2018-01-09 | 华为技术有限公司 | A kind of method and device for the hardware resource utilization for predicting device clusters |
US20180081566A1 (en) * | 2016-09-16 | 2018-03-22 | International Business Machines Corporation | Data block processing |
US20180159774A1 (en) * | 2016-12-07 | 2018-06-07 | Oracle International Corporation | Application-level Dynamic Scheduling of Network Communication for Efficient Re-partitioning of Skewed Data |
CN108334596A (en) * | 2018-01-31 | 2018-07-27 | 华南师范大学 | A kind of massive relation data efficient concurrent migration method towards big data platform |
CN109827579A (en) * | 2019-03-08 | 2019-05-31 | 兰州交通大学 | The method and system of Filtering Model real time correction in a kind of integrated positioning |
Non-Patent Citations (8)
Title |
---|
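R. MARLET et al.: "Mapping software architectures to efficient implementations via partial evaluation", Proceedings 12th IEEE International Conference Automated Software Engineering * |
刘海龙; 宿宏毅: "Massive data cluster analysis using the Hadoop cloud computing platform" * |
刘肖琛: "Design and implementation of a big-data-based network malicious traffic analysis system", China Master's Theses Full-text Database * |
周世龙 et al.: "Performance analysis and prediction of Hadoop MapReduce job parameters based on a grey-box model", Journal of Sichuan University (Engineering Science Edition) * |
朱永利 et al.: "Big data storage and parallel processing method for power equipment monitoring on the ODPS platform", Transactions of China Electrotechnical Society * |
王卓 et al.: "MapReduce data balancing method based on an incremental partitioning strategy", Chinese Journal of Computers * |
秦军 et al.: "Research on load balancing strategy based on heterogeneous Hadoop clusters", Computer Technology and Development * |
罗永刚 et al.: "An accurate memory prediction method for MapReduce jobs", Journal of University of Electronic Science and Technology of China * |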
Also Published As
Publication number | Publication date |
---|---|
CN110489301B (en) | 2023-03-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||