CN110489301A - Method, apparatus and device for analyzing MapReduce task performance - Google Patents

Info

Publication number
CN110489301A
Authority
CN
China
Prior art keywords
data volume
reduce
map
current
variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910776593.3A
Other languages
Chinese (zh)
Other versions
CN110489301B (en
Inventor
朱友志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Tunji Network Technology Co Ltd
Original Assignee
Shanghai Tunji Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Tunji Network Technology Co Ltd
Priority to CN201910776593.3A
Publication of CN110489301A
Application granted
Publication of CN110489301B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present application relates to a method, apparatus, and device for analyzing MapReduce task performance. The method includes: setting a time acquisition threshold comprising a start time and an end time; obtaining a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time; obtaining a map skew variable by a first preset algorithm, and a reduce skew variable by a second preset algorithm, from the collected data volume processed by historical map/reduce tasks, the data volume that current map/reduce tasks have finished processing, and the data volume that current map/reduce tasks are processing; inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter; and obtaining an analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table. With this method, an analysis result of MapReduce task performance can be obtained, so that the performance of task execution is known and the conditions under which tasks perform best can be identified from the analysis result.

Description

Method, apparatus and device for analyzing MapReduce task performance
Technical field
The present application relates to the field of big data technology, and in particular to a method, apparatus, and device for analyzing MapReduce task performance.
Background
With the rise of artificial intelligence, the mobile Internet, and the Internet of Things, big data technology is developing rapidly, and offline big data computation is an important part of it. Offline big data computation is typically performed with Hive, a data warehouse tool that converts SQL statements into MapReduce tasks for execution. For a big data cluster, monitoring and analyzing the execution performance of MapReduce tasks in real time is very important. The efficiency and performance of MapReduce task execution follow the bucket effect, in which the weakest part limits the whole, so they can be determined by analyzing the weak points in the execution of a MapReduce task.
At present there is no method that effectively analyzes task performance and provides a solution: the conditions under which tasks perform best cannot be identified by analysis, and no suitable remedy can be found when task execution performance is poor.
Summary of the invention
To overcome, at least to some extent, the problems in the related art, the present application provides a method, apparatus, and device for analyzing MapReduce task performance.
According to a first aspect of the present application, a method for analyzing MapReduce task performance is provided, comprising:
setting a time acquisition threshold, the time acquisition threshold comprising a start time and an end time;
obtaining a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time;
collecting the data volume processed by historical map tasks, the data volume that current map tasks have finished processing, the data volume that current map tasks are processing, the data volume processed by historical reduce tasks, the data volume that current reduce tasks have finished processing, and the data volume that current reduce tasks are processing;
obtaining a map skew variable by a first preset algorithm from the data volume processed by the historical map tasks, the data volume that the current map tasks have finished processing, and the data volume that the current map tasks are processing;
obtaining a reduce skew variable by a second preset algorithm from the data volume processed by the historical reduce tasks, the data volume that the current reduce tasks have finished processing, and the data volume that the current reduce tasks are processing;
inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter; and
obtaining an analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table.
Optionally, obtaining the map skew variable by the first preset algorithm from the data volume processed by the historical map tasks, the data volume that the current map tasks have finished processing, and the data volume that the current map tasks are processing comprises:
obtaining a first historical mean from the data volume processed by the historical map tasks;
obtaining a first current mean from the data volume that the current map tasks have finished processing;
obtaining a first variance result by a variance algorithm from the data volume that the current map tasks are processing and the first historical mean;
obtaining a second variance result by a variance algorithm from the data volume that the current map tasks are processing and the first current mean; and
combining the first variance result and the second variance result linearly according to a first preset ratio to obtain the map skew variable.
Optionally, obtaining the reduce skew variable by the second preset algorithm from the data volume processed by the historical reduce tasks, the data volume that the current reduce tasks have finished processing, and the data volume that the current reduce tasks are processing comprises:
obtaining a second historical mean from the data volume processed by the historical reduce tasks;
obtaining a second current mean from the data volume that the current reduce tasks have finished processing;
obtaining a third variance result by a variance algorithm from the data volume that the current reduce tasks are processing and the second historical mean;
obtaining a fourth variance result by a variance algorithm from the data volume that the current reduce tasks are processing and the second current mean; and
combining the third variance result and the fourth variance result linearly according to a second preset ratio to obtain the reduce skew variable.
Optionally, the pre-trained data skew model is:
[formula omitted in this text]
where m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, and c are preset constants.
Optionally, performance-affecting parameters are obtained from MapReduce run data and the analysis result; the performance-affecting parameters are the parameters that affect the MapReduce task performance.
The MapReduce run data includes: task CPU time, GC time, average task execution duration, the percentage of each CPU's time spent on GC, and the average data volume processed per task.
According to a second aspect of the present application, an apparatus for analyzing MapReduce task performance is provided, comprising:
a computing module, configured to set a time acquisition threshold comprising a start time and an end time, and to obtain a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time;
a collection module, configured to collect the data volume processed by historical map tasks, the data volume that current map tasks have finished processing, the data volume that current map tasks are processing, the data volume processed by historical reduce tasks, the data volume that current reduce tasks have finished processing, and the data volume that current reduce tasks are processing;
a first preset algorithm module, configured to obtain a map skew variable by a first preset algorithm from the data volume processed by the historical map tasks, the data volume that the current map tasks have finished processing, and the data volume that the current map tasks are processing;
a second preset algorithm module, configured to obtain a reduce skew variable by a second preset algorithm from the data volume processed by the historical reduce tasks, the data volume that the current reduce tasks have finished processing, and the data volume that the current reduce tasks are processing;
an input module, configured to input the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter; and
an analysis module, configured to obtain an analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table.
Optionally, the first preset algorithm module comprises:
a first computing unit, configured to obtain a first historical mean from the data volume processed by the historical map tasks;
a second computing unit, configured to obtain a first current mean from the data volume that the current map tasks have finished processing;
a first variance result unit, configured to obtain a first variance result by a variance algorithm from the data volume that the current map tasks are processing and the first historical mean;
a second variance result unit, configured to obtain a second variance result by a variance algorithm from the data volume that the current map tasks are processing and the first current mean; and
a first linear operation unit, configured to combine the first variance result and the second variance result linearly according to a first preset ratio to obtain the map skew variable.
Optionally, the second preset algorithm module comprises:
a third computing unit, configured to obtain a second historical mean from the data volume processed by the historical reduce tasks;
a fourth computing unit, configured to obtain a second current mean from the data volume that the current reduce tasks have finished processing;
a third variance result unit, configured to obtain a third variance result by a variance algorithm from the data volume that the current reduce tasks are processing and the second historical mean;
a fourth variance result unit, configured to obtain a fourth variance result by a variance algorithm from the data volume that the current reduce tasks are processing and the second current mean; and
a second linear operation unit, configured to combine the third variance result and the fourth variance result linearly according to a second preset ratio to obtain the reduce skew variable.
Optionally, the pre-trained data skew model used by the input module is:
[formula omitted in this text]
where m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, and c are preset constants.
According to a third aspect of the present application, a device for analyzing MapReduce task performance is provided, comprising:
a processor, and a memory connected to the processor;
the memory is configured to store a computer program, the computer program being used at least to execute the method for analyzing MapReduce task performance described in the first aspect of the present application;
the processor is configured to call and execute the computer program in the memory.
The technical solution provided by the present application can have the following beneficial effects:
The method for analyzing MapReduce task performance includes setting a time acquisition threshold, consisting of a start time and an end time, and obtaining a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time; collecting the data volume processed by historical map/reduce tasks, the data volume that current map/reduce tasks have finished processing, and the data volume that current map/reduce tasks are processing; obtaining a map skew variable by a first preset algorithm from the historical, finished, and in-progress map data volumes; obtaining a reduce skew variable by a second preset algorithm from the historical, finished, and in-progress reduce data volumes; inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter; and obtaining an analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table. With this method, an analysis result of MapReduce task performance can be obtained, so that the performance of task execution is known and the conditions under which tasks perform best can be identified from the analysis result.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present application.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the specification, serve to explain the principles of the application.
Fig. 1 is a flow diagram of the MapReduce task performance analysis method provided by Embodiment one of the present application.
Fig. 2 is a flow diagram of obtaining the map skew variable, provided by Embodiment one of the present application.
Fig. 3 is a flow diagram of obtaining the reduce skew variable, provided by Embodiment one of the present application.
Fig. 4 is a structural diagram of the MapReduce task performance analysis apparatus provided by Embodiment two of the present application.
Fig. 5 is a structural diagram of the first preset algorithm module provided by Embodiment two of the present application.
Fig. 6 is a structural diagram of the second preset algorithm module provided by Embodiment two of the present application.
Fig. 7 is a structural diagram of the MapReduce task performance analysis device provided by Embodiment three of the present application.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are only examples of apparatuses and methods consistent with some aspects of the application, as detailed in the appended claims.
Embodiment one
Referring to Fig. 1, Fig. 1 is a flow diagram of the MapReduce task performance analysis method provided by Embodiment one of the present application.
As shown in Fig. 1, the method for analyzing MapReduce task performance includes:
Step 101: set a time acquisition threshold comprising a start time and an end time, and obtain a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time.
Specifically, Hadoop exposes a JMX API. Metrics data is captured at regular intervals, and from the metric values the difference between the reduce progress at the end time and the reduce progress at the start time is obtained as the progress analysis variable. The smaller this difference, the slower the execution progress.
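The difference described in step 101 can be sketched as follows. This is a minimal illustration, assuming reduce progress has already been sampled as a fraction between 0 and 1 at the two times; the function and parameter names are hypothetical and do not come from the application.

```python
def progress_analysis_variable(start_progress: float, end_progress: float) -> float:
    """Return the reduce progress gained between the start time and the end time.

    Both arguments are assumed to be fractions in [0, 1]. A smaller result
    means the reduce phase advanced more slowly during the sampling window.
    """
    for value in (start_progress, end_progress):
        if not 0.0 <= value <= 1.0:
            raise ValueError("progress must be a fraction between 0 and 1")
    return end_progress - start_progress


if __name__ == "__main__":
    # Hypothetical samples: 35% complete at the start time, 40% at the end time.
    print(progress_analysis_variable(0.35, 0.40))
```

How the two progress samples are read from the cluster metrics is left out of the sketch, since the application does not name the specific metric.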
Step 102: collect the data volume processed by historical map tasks, the data volume that current map tasks have finished processing, the data volume that current map tasks are processing, the data volume processed by historical reduce tasks, the data volume that current reduce tasks have finished processing, and the data volume that current reduce tasks are processing.
Step 103: obtain a map skew variable by a first preset algorithm from the data volume processed by historical map tasks, the data volume that current map tasks have finished processing, and the data volume that current map tasks are processing.
Step 104: obtain a reduce skew variable by a second preset algorithm from the data volume processed by historical reduce tasks, the data volume that current reduce tasks have finished processing, and the data volume that current reduce tasks are processing.
Step 105: input the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter.
Specifically, the pre-trained data skew model is:
[formula omitted in this text]
where m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, and c are preset constants.
Further, the data skew model is obtained by training on historical data.
Data acquisition: map and reduce data is collected on YARN, including the data volume processed by map and reduce tasks, the processing duration, the machine execution duration, the reduce processing progress, and information from log analysis on the JobHistory server.
Step 106: obtain an analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table.
Specifically, the task performance lookup table maps the data skew risk parameter to a task performance level, which gives the analysis result.
For example, when the data skew risk parameter m is in the range 0 to 0.2, the task performance level is normal. When m is in the range 0.2 to 0.4, the level is average, and a DingTalk message is sent to inform the responsible person that task performance is average. When m is in the range 0.4 to 0.6, the level is abnormal, and an SMS is sent to inform the responsible person that task performance is abnormal. When m is in the range 0.6 to 0.8, the task performance is seriously abnormal, and the responsible person is informed by phone call. When m is in the range 0.8 to 1.0, the task performance is very seriously abnormal, and the responsible person's superior is informed by phone call.
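The example lookup table above can be sketched as a small range lookup. The levels and notification channels come from the example in the text; the names `classify_risk`, `_BOUNDS`, and `_LEVELS`, and the choice to put a boundary value such as 0.2 into the higher range, are assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Tuple

# Upper bounds of the first four ranges from the example in the text.
# Assigning a boundary value (e.g. exactly 0.2) to the higher range is an
# assumption; the text does not say which side a boundary belongs to.
_BOUNDS = [0.2, 0.4, 0.6, 0.8]
_LEVELS = [
    ("normal", "no notification"),
    ("average", "DingTalk message to the responsible person"),
    ("abnormal", "SMS to the responsible person"),
    ("seriously abnormal", "phone call to the responsible person"),
    ("very seriously abnormal", "phone call to the responsible person's superior"),
]


def classify_risk(m: float) -> Tuple[str, str]:
    """Map a data skew risk parameter in [0, 1] to (performance level, action)."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("risk parameter must be between 0 and 1")
    return _LEVELS[bisect_right(_BOUNDS, m)]


if __name__ == "__main__":
    for m in (0.1, 0.35, 0.95):
        level, action = classify_risk(m)
        print(f"m={m}: {level} ({action})")
```

Keeping the bounds and levels in plain lists means the table can be changed without touching the lookup logic, which fits the idea of a preset table.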
Performance-affecting parameters are obtained from the MapReduce run data and the analysis result; they are the parameters that affect the MapReduce task performance. The MapReduce run data includes: task CPU time, GC time, average task execution duration, the percentage of each CPU's time spent on GC, and the average data volume processed per task.
Deriving the performance-affecting parameters from the MapReduce run data and the analysis result calls for heuristic reasoning combined with practical experience. Different performance-affecting parameters correspond to different factors to try to resolve. For example, the result may be that the average task execution duration is abnormal, or that both the average task execution duration and the percentage of each CPU's time spent on GC are abnormal. The analysis of MapReduce task performance thus helps locate a solution to the problem quickly.
Step 103, obtaining the map skew variable by the first preset algorithm from the data volume processed by historical map tasks, the data volume that current map tasks have finished processing, and the data volume that current map tasks are processing, includes the following process.
Referring to Fig. 2, Fig. 2 is a flow diagram of obtaining the map skew variable, provided by Embodiment one of the present application.
As shown in Fig. 2, the process of obtaining the map skew variable in step 103 may include:
Step 201: obtain a first historical mean from the data volume processed by historical map tasks.
Specifically, when MapReduce is running, many map tasks process data at the same time. The data volume processed by historical map tasks may be taken from the previous day or from the previous week; overall, the mean of the data volume processed by historical map tasks is approximately constant.
Step 202: obtain a first current mean from the data volume that current map tasks have finished processing.
Specifically, for example, if 10000 map tasks are currently processing data and 1000 of them have finished, the data volumes of the finished map tasks are averaged to obtain the first current mean.
Step 203: obtain a first variance result by a variance algorithm from the data volume that current map tasks are processing and the first historical mean.
Specifically, the larger the variance, the larger the fluctuation and the poorer the stability.
Step 204: obtain a second variance result by a variance algorithm from the data volume that current map tasks are processing and the first current mean.
Step 205: combine the first variance result and the second variance result linearly according to a first preset ratio to obtain the map skew variable.
Specifically, the first preset ratio is determined according to the actual situation; for example, the first variance result and the second variance result may be combined at a ratio of 1:1. The larger the map skew variable, the larger the fluctuation and the poorer the stability.
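Steps 201 to 205 can be sketched as follows. The application does not spell out the variance formula or the linear operation, so this sketch makes two assumptions: the "variance" is the mean squared deviation of the in-progress data volumes around the given mean, and the preset ratio is applied as normalized weights. All function and parameter names are hypothetical.

```python
from statistics import fmean
from typing import Sequence, Tuple


def mean_squared_deviation(values: Sequence[float], center: float) -> float:
    """Average squared deviation of the values around an externally supplied center."""
    return fmean((v - center) ** 2 for v in values)


def skew_variable(
    history: Sequence[float],
    finished: Sequence[float],
    in_progress: Sequence[float],
    ratio: Tuple[float, float] = (1.0, 1.0),
) -> float:
    """Combine two variance results into a skew variable (assumed reading of steps 201-205)."""
    history_mean = fmean(history)    # step 201: first historical mean
    current_mean = fmean(finished)   # step 202: first current mean
    var_history = mean_squared_deviation(in_progress, history_mean)  # step 203
    var_current = mean_squared_deviation(in_progress, current_mean)  # step 204
    w_history, w_current = ratio
    # Step 205: linear combination; normalizing by the ratio sum is an assumption.
    return (w_history * var_history + w_current * var_current) / (w_history + w_current)


if __name__ == "__main__":
    # Hypothetical per-task data volumes, e.g. in megabytes.
    print(skew_variable(history=[100, 100], finished=[120, 120], in_progress=[100, 140]))
```

The same function would apply to reduce data volumes with the second preset ratio, since steps 301 to 305 mirror steps 201 to 205.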
Step 104, obtaining the reduce skew variable by the second preset algorithm from the data volume processed by historical reduce tasks, the data volume that current reduce tasks have finished processing, and the data volume that current reduce tasks are processing, includes the following process.
Referring to Fig. 3, Fig. 3 is a flow diagram of obtaining the reduce skew variable, provided by Embodiment one of the present application.
As shown in Fig. 3, the process of obtaining the reduce skew variable in step 104 may include:
Step 301: obtain a second historical mean from the data volume processed by historical reduce tasks.
Step 302: obtain a second current mean from the data volume that current reduce tasks have finished processing.
Step 303: obtain a third variance result by a variance algorithm from the data volume that current reduce tasks are processing and the second historical mean.
Step 304: obtain a fourth variance result by a variance algorithm from the data volume that current reduce tasks are processing and the second current mean.
Step 305: combine the third variance result and the fourth variance result linearly according to a second preset ratio to obtain the reduce skew variable.
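One possible reading of steps 301 to 305 can be written out as formulas. This is an assumption, since the application does not give the variance formula: let $h_1,\dots,h_H$ be the data volumes processed by historical reduce tasks, $c_1,\dots,c_C$ the data volumes of the current reduce tasks that have finished, $r_1,\dots,r_N$ the data volumes of the current reduce tasks still processing, and $p:q$ the second preset ratio. The symbols other than $y$ are introduced here for illustration only.

```latex
\begin{aligned}
\mu_h &= \frac{1}{H}\sum_{i=1}^{H} h_i, &
\mu_c &= \frac{1}{C}\sum_{j=1}^{C} c_j, \\
\sigma_3^2 &= \frac{1}{N}\sum_{k=1}^{N} \left(r_k - \mu_h\right)^2, &
\sigma_4^2 &= \frac{1}{N}\sum_{k=1}^{N} \left(r_k - \mu_c\right)^2, \\
y &= p\,\sigma_3^2 + q\,\sigma_4^2 .
\end{aligned}
```

Here $\mu_h$ and $\mu_c$ are the second historical mean and the second current mean, $\sigma_3^2$ and $\sigma_4^2$ are the third and fourth variance results, and $y$ is the reduce skew variable.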
Embodiment two
Referring to Fig. 4, Fig. 4 is a structural diagram of the MapReduce task performance analysis apparatus provided by Embodiment two of the present application.
As shown in Fig. 4, the MapReduce task performance analysis apparatus includes:
A computing module 401, which sets a time acquisition threshold comprising a start time and an end time, and obtains a progress analysis variable from the reduce progress at the start time and the reduce progress at the end time.
A collection module 402, which collects the data volume processed by historical map tasks, the data volume that current map tasks have finished processing, the data volume that current map tasks are processing, the data volume processed by historical reduce tasks, the data volume that current reduce tasks have finished processing, and the data volume that current reduce tasks are processing.
A first preset algorithm module 403, which obtains a map skew variable by a first preset algorithm from the data volume processed by the historical map tasks, the data volume that the current map tasks have finished processing, and the data volume that the current map tasks are processing.
A second preset algorithm module 404, which obtains a reduce skew variable by a second preset algorithm from the data volume processed by the historical reduce tasks, the data volume that the current reduce tasks have finished processing, and the data volume that the current reduce tasks are processing.
An input module 405, which inputs the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter.
An analysis module 406, which obtains an analysis result of the MapReduce task performance from the data skew risk parameter and a preset task performance lookup table.
The first preset algorithm module 403 may include the following units.
Referring to Fig. 5, Fig. 5 is a structural diagram of the first preset algorithm module provided by Embodiment two of the present application.
As shown in Fig. 5, the first preset algorithm module includes:
A first computing unit 501, which obtains a first historical mean from the data volume processed by the historical map tasks.
A second computing unit 502, which obtains a first current mean from the data volume that the current map tasks have finished processing.
A first variance result unit 503, which obtains a first variance result by a variance algorithm from the data volume that the current map tasks are processing and the first historical mean.
A second variance result unit 504, which obtains a second variance result by a variance algorithm from the data volume that the current map tasks are processing and the first current mean.
A first linear operation unit 505, which combines the first variance result and the second variance result linearly according to a first preset ratio to obtain the map skew variable.
The second preset algorithm module 404 may include the following units.
Referring to Fig. 6, Fig. 6 is a structural diagram of the second preset algorithm module provided by Embodiment two of the present application.
As shown in Fig. 6, the second preset algorithm module includes:
A third computing unit 601, which obtains a second historical mean from the data volume processed by the historical reduce tasks.
A fourth computing unit 602, which obtains a second current mean from the data volume that the current reduce tasks have finished processing.
A third variance result unit 603, which obtains a third variance result by a variance algorithm from the data volume that the current reduce tasks are processing and the second historical mean.
A fourth variance result unit 604, which obtains a fourth variance result by a variance algorithm from the data volume that the current reduce tasks are processing and the second current mean.
A second linear operation unit 605, which combines the third variance result and the fourth variance result linearly according to a second preset ratio to obtain the reduce skew variable.
Embodiment 3
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of the mapreduce task performance analysis device provided in Embodiment 3 of the present application.
As shown in Fig. 7, the mapreduce task performance analysis device includes:
Processor 701, and memory 702 connected to the processor 701;
The memory is configured to store a computer program, the computer program being at least used to execute the mapreduce task performance analysis method described in Embodiment 1 of the present application;
The processor is configured to call and execute the computer program in the memory.
Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
It can be understood that the same or similar parts of the above embodiments may refer to one another; content not described in detail in one embodiment may be found in the same or similar content of other embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", and so on are used for descriptive purposes only and cannot be understood as indicating or implying relative importance. In addition, in the description of the present application, unless otherwise stated, "multiple" means at least two.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing a specific logical function or step of the process. The scope of the preferred embodiments of the present application includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
It should be understood that each part of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in each embodiment of the present application may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present application; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present application.

Claims (10)

1. A method for analyzing mapreduce task performance, characterized by comprising:
Setting a time acquisition threshold, the time acquisition threshold comprising a start time and an end time;
Obtaining a progress analysis variable according to the reduce progress at the start time and the reduce progress at the end time;
Collecting the data volume of historical map processing, the data volume the current map has finished processing, the data volume the current map is processing, the data volume of historical reduce processing, the data volume the current reduce has finished processing, and the data volume the current reduce is processing;
Obtaining a map skew variable by a first preset algorithm according to the data volume of historical map processing, the data volume the current map has finished processing, and the data volume the current map is processing;
Obtaining a reduce skew variable by a second preset algorithm according to the data volume of historical reduce processing, the data volume the current reduce has finished processing, and the data volume the current reduce is processing;
Inputting the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter;
Obtaining an analysis result of the mapreduce task performance according to the data skew risk parameter and a preset task performance comparison table.
2. The method for analyzing mapreduce task performance according to claim 1, characterized in that obtaining the map skew variable by the first preset algorithm according to the data volume of historical map processing, the data volume the current map has finished processing, and the data volume the current map is processing comprises:
Obtaining a first historical mean according to the data volume of historical map processing;
Obtaining a first current mean according to the data volume the current map has finished processing;
Obtaining a first variance result by a variance algorithm according to the data volume the current map has finished processing and the first historical mean;
Obtaining a second variance result by a variance algorithm according to the data volume the current map is still processing and the first current mean;
Performing a linear operation on the first variance result and the second variance result at a first preset ratio to obtain the map skew variable.
3. The method for analyzing mapreduce task performance according to claim 1, characterized in that obtaining the reduce skew variable by the second preset algorithm according to the data volume of historical reduce processing, the data volume the current reduce has finished processing, and the data volume the current reduce is processing comprises:
Obtaining a second historical mean according to the data volume of historical reduce processing;
Obtaining a second current mean according to the data volume the current reduce has finished processing;
Obtaining a third variance result by a variance algorithm according to the data volume the current reduce has finished processing and the second historical mean;
Obtaining a fourth variance result by a variance algorithm according to the data volume the current reduce is still processing and the second current mean;
Performing a linear operation on the third variance result and the fourth variance result at a second preset ratio to obtain the reduce skew variable.
4. The method for analyzing mapreduce task performance according to claim 1, characterized in that the pre-trained data skew model is:
where m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, and c are preset constants.
5. The method for analyzing mapreduce task performance according to claim 1, characterized by further comprising:
Obtaining performance-affecting parameters according to mapreduce running data and the analysis result, the performance-affecting parameters being parameters that affect the mapreduce task performance;
The mapreduce running data includes: the execution duration of the task, the CPU time, the GC time, the average percentage of CPU time spent on GC per task, and the average data volume processed per task.
6. An apparatus for analyzing mapreduce task performance, characterized by comprising:
A calculation module, configured to set a time acquisition threshold, the time acquisition threshold comprising a start time and an end time, and to obtain a progress analysis variable according to the reduce progress at the start time and the reduce progress at the end time;
A collection module, configured to collect the data volume of historical map processing, the data volume the current map has finished processing, the data volume the current map is processing, the data volume of historical reduce processing, the data volume the current reduce has finished processing, and the data volume the current reduce is processing;
A first preset algorithm module, configured to obtain a map skew variable by a first preset algorithm according to the data volume of historical map processing, the data volume the current map has finished processing, and the data volume the current map is processing;
A second preset algorithm module, configured to obtain a reduce skew variable by a second preset algorithm according to the data volume of historical reduce processing, the data volume the current reduce has finished processing, and the data volume the current reduce is processing;
An input module, configured to input the progress analysis variable, the map skew variable, and the reduce skew variable into a pre-trained data skew model to obtain a data skew risk parameter;
An analysis module, configured to obtain an analysis result of the mapreduce task performance according to the data skew risk parameter and a preset task performance comparison table.
7. The apparatus for analyzing mapreduce task performance according to claim 6, characterized in that the first preset algorithm module comprises:
A first calculation unit, configured to obtain a first historical mean according to the data volume of historical map processing;
A second calculation unit, configured to obtain a first current mean according to the data volume the current map has finished processing;
A first variance result unit, configured to obtain a first variance result by a variance algorithm according to the data volume the current map has finished processing and the first historical mean;
A second variance result unit, configured to obtain a second variance result by a variance algorithm according to the data volume the current map is still processing and the first current mean;
A first linear operation unit, configured to perform a linear operation on the first variance result and the second variance result at a first preset ratio to obtain the map skew variable.
8. The apparatus for analyzing mapreduce task performance according to claim 6, characterized in that the second preset algorithm module comprises:
A third calculation unit, configured to obtain a second historical mean according to the data volume of historical reduce processing;
A fourth calculation unit, configured to obtain a second current mean according to the data volume the current reduce has finished processing;
A third variance result unit, configured to obtain a third variance result by a variance algorithm according to the data volume the current reduce has finished processing and the second historical mean;
A fourth variance result unit, configured to obtain a fourth variance result by a variance algorithm according to the data volume the current reduce is still processing and the second current mean;
A second linear operation unit, configured to perform a linear operation on the third variance result and the fourth variance result at a second preset ratio to obtain the reduce skew variable.
9. The apparatus for analyzing mapreduce task performance according to claim 6, characterized in that the pre-trained data skew model is:
where m is the data skew variable index, z is the progress analysis variable, x is the map skew variable, y is the reduce skew variable, t is a time operator, and a, b, and c are preset constants.
10. A device for analyzing mapreduce task performance, characterized by comprising:
A processor, and a memory connected to the processor;
The memory is configured to store a computer program, the computer program being at least used to execute the method for analyzing mapreduce task performance according to claim 1;
The processor is configured to call and execute the computer program in the memory.
CN201910776593.3A 2019-08-22 2019-08-22 Mapreduce task performance analysis method, device and equipment Active CN110489301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910776593.3A CN110489301B (en) 2019-08-22 2019-08-22 Mapreduce task performance analysis method, device and equipment

Publications (2)

Publication Number Publication Date
CN110489301A (en) 2019-11-22
CN110489301B (en) 2023-03-10

Family

ID=68552729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910776593.3A Active CN110489301B (en) 2019-08-22 2019-08-22 Mapreduce task performance analysis method, device and equipment

Country Status (1)

Country Link
CN (1) CN110489301B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111290917A (en) * 2020-02-26 2020-06-16 深圳市云智融科技有限公司 YARN-based resource monitoring method and device and terminal equipment
CN111651267A (en) * 2020-05-06 2020-09-11 京东数字科技控股有限公司 Method and device for performing performance consumption optimization analysis on parallel operation
CN113778727A (en) * 2020-06-19 2021-12-10 北京沃东天骏信息技术有限公司 Data processing method and device, electronic equipment and computer readable storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106201754A (en) * 2016-07-06 2016-12-07 乐视控股(北京)有限公司 Mission bit stream analyzes method and device
US20170061305A1 (en) * 2015-08-28 2017-03-02 Jiangnan University Fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression
WO2017031961A1 (en) * 2015-08-24 2017-03-02 华为技术有限公司 Data processing method and apparatus
CN107562532A (en) * 2017-07-13 2018-01-09 华为技术有限公司 A kind of method and device for the hardware resource utilization for predicting device clusters
US20180081566A1 (en) * 2016-09-16 2018-03-22 International Business Machines Corporation Data block processing
US20180159774A1 (en) * 2016-12-07 2018-06-07 Oracle International Corporation Application-level Dynamic Scheduling of Network Communication for Efficient Re-partitioning of Skewed Data
CN108334596A (en) * 2018-01-31 2018-07-27 华南师范大学 A kind of massive relation data efficient concurrent migration method towards big data platform
CN109827579A (en) * 2019-03-08 2019-05-31 兰州交通大学 The method and system of Filtering Model real time correction in a kind of integrated positioning

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
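R. MARLET et al.: "Mapping software architectures to efficient implementations via partial evaluation", 《PROCEEDINGS 12TH IEEE INTERNATIONAL CONFERENCE AUTOMATED SOFTWARE ENGINEERING》 *
Liu Hailong; Su Hongyi: "Cluster analysis of massive data using the Hadoop cloud computing platform" *
Liu Xiaochen: "Design and implementation of a big-data-based network malicious traffic analysis system", 《China Masters' Theses Full-text Database》 *
Zhou Shilong et al.: "Gray-box-model-based performance analysis and prediction of Hadoop MapReduce job parameters", 《Journal of Sichuan University (Engineering Science Edition)》 *
Zhu Yongli et al.: "Big data storage and parallel processing method for power equipment monitoring on the ODPS platform", 《Transactions of China Electrotechnical Society》 *
Wang Zhuo et al.: "MapReduce data balancing method based on an incremental partitioning strategy", 《Chinese Journal of Computers》 *
Qin Jun et al.: "Research on load balancing strategy for heterogeneous Hadoop clusters", 《Computer Technology and Development》 *
Luo Yonggang et al.: "A method for accurately predicting Mapreduce job memory", 《Journal of University of Electronic Science and Technology of China》 *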

Also Published As

Publication number Publication date
CN110489301B (en) 2023-03-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant