CN106293949A - Resource scheduling strategy based on baseline analysis in a computing environment

Info

Publication number: CN106293949A
Application number: CN201610689781.9A
Authority: CN
Inventors: 左强
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: Inspur Electronic Information Industry Co Ltd
Priority date: 2016-08-19
Filing date: 2016-08-19
Publication date: 2017-01-04

Abstract

The present invention relates in particular to a resource scheduling strategy based on baseline analysis in a computing environment. The strategy establishes a lexical analysis system comprising a log analysis module, a data analysis module, and a resource scheduling module. The log analysis module comprises a log upload unit and a log parsing unit; the data analysis module comprises a data integration and analysis unit and an analysis result viewing unit; and the resource scheduling module comprises a resource scheduling strategy generation unit and a resource scheduling strategy execution unit. The lexical analysis system collects log data and analyzes it to generate scheduling strategies that help improve overall efficiency; executing these strategies raises the effective utilization of nodes and thereby improves overall efficiency.

Description

Resource scheduling strategy based on baseline analysis in a computing environment

Technical field

The present invention relates to the field of cloud computing technology, and in particular to a resource scheduling strategy based on baseline analysis in a computing environment.

Background art

The essence of the "cloud" is that the system itself is non-physical: cloud computing can be regarded as a virtual computing system customized for the user, so virtualization technology is the foundation on which cloud computing is built. Virtualization is a method of abstracting computer resources. Through virtualization technology, the abstracted resources can be accessed in the same way as the resources before abstraction, and this abstraction is not constrained by the implementation, geographic location, or physical configuration of the underlying resources.

With the maturation of virtualization technology and the rapid development of the Internet, major business information systems are increasingly moving to private or public clouds, and the data they generate is growing rapidly. People have recognized the importance of data to an enterprise: analyzing and mining data can provide guidance and information for enterprise decision-making. The data produced is typically unstructured or semi-structured, and loading it into a relational database for analysis would cost a great deal of time and money. Distributed computing frameworks such as Map/Reduce were created for this purpose, to analyze and mine large data sets.

Cloud computing uses system architecture technology to integrate thousands of servers, providing users with flexible resource allocation and task scheduling capabilities. Virtualization technology is one of the key technologies in cloud computing; it can turn one physical computer into multiple virtual computer systems. Virtualization technology abstracts underlying infrastructure such as physical resources, making the differences and compatibility issues between hardware devices transparent to upper-layer applications, and thereby enables unified management of all kinds of underlying resources. Virtualization technology is a method of allocating computing resources: it isolates the different layers of an application system (hardware, software, data, network, storage, etc.), breaking down the physical-device boundaries between data centers, networks, servers, data, applications, and storage. This enables unified management and dynamic use of physical and virtual resources, and improves the flexibility and elasticity of the system architecture.

Figure 1 is a schematic diagram of a virtualized system architecture. The virtual machine monitoring software in the figure is the VMM (Virtual Machine Monitor) of virtualization technology, also called the Hypervisor, which can access all hardware devices on the server. When the server starts and invokes the Hypervisor, it loads the operating systems of all virtual machine clients and allocates an appropriate amount of physical resources such as network, CPU, disk, and memory to each virtual machine. The Hypervisor is responsible for coordinating access to these hardware resources and also enforces security protection between the virtual machines.

In a large-scale cloud computing environment there are tens of thousands of physical hosts, and the resources they consume are enormous. Virtualization technology provides virtual hosts for cloud computing; virtual hosts are the main objects in a cloud computing environment and consume host resources such as CPU, memory, disk, and network cards. However, the physical resources a virtual machine occupies vary with its configuration, and the resources it consumes also vary with its purpose. In view of this, the present invention proposes a resource scheduling strategy based on baseline analysis in a computing environment.

Summary of the invention

To remedy the deficiencies of the prior art, the present invention provides a simple and efficient resource scheduling strategy based on baseline analysis in a computing environment.

The present invention is achieved through the following technical solutions:

A resource scheduling strategy based on baseline analysis in a computing environment, characterized in that: a lexical analysis system is established, the lexical analysis system comprising a log analysis module, a data analysis module, and a resource scheduling module; the log analysis module comprises a log upload unit and a log parsing unit, the data analysis module comprises a data integration and analysis unit and an analysis result viewing unit, and the resource scheduling module comprises a resource scheduling strategy generation unit and a resource scheduling strategy execution unit.

The log analysis module uses the Map/Reduce programming model: the Mapper reads and parses the logs and generates easy-to-read log objects that are handed to the Reducer; the Reducer merges all log data and outputs it to HDFS for storage.

The data analysis module first integrates the data and merges it into the required format; it then applies K-means to perform cluster analysis on the data. To improve computational capacity and efficiency, K-means is parallelized using the Map/Reduce programming framework.

The resource scheduling module organizes virtual machine association information according to the data integration results and generates corresponding scheduling strategies to improve node utilization; once a scheduling strategy has been generated, the resource scheduling strategy execution unit executes the scheduling tasks of the associated virtual machines, thereby improving overall efficiency.
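The text does not specify how a scheduling strategy is derived, so the following is only a minimal sketch under stated assumptions: the baseline is taken to be a single per-node load threshold, the strategy is a list of migration tasks, and the names `generate_strategy`, `apply_strategy`, `vm_load`, `placement`, and `baseline` are all hypothetical.

```python
def generate_strategy(vm_load, placement, baseline):
    """Hypothetical strategy generation unit.

    vm_load:   {vm_id: load} taken from the analysis results (assumed format)
    placement: {vm_id: node_id} giving the node each VM currently runs on
    baseline:  assumed per-node load threshold
    Returns a list of (vm_id, source_node, target_node) migration tasks.
    """
    # Total load per node, derived from the current placement.
    node_load = {node: 0.0 for node in sorted(set(placement.values()))}
    for vm, node in placement.items():
        node_load[node] += vm_load[vm]

    tasks = []
    # Consider the heaviest VMs first (an assumption, not from the source).
    for vm in sorted(vm_load, key=vm_load.get, reverse=True):
        src = placement[vm]
        if node_load[src] <= baseline:
            continue  # source node is within the baseline, nothing to do
        dst = min(node_load, key=node_load.get)  # least loaded node
        if dst != src and node_load[dst] + vm_load[vm] <= baseline:
            node_load[src] -= vm_load[vm]
            node_load[dst] += vm_load[vm]
            tasks.append((vm, src, dst))
    return tasks


def apply_strategy(placement, tasks):
    """Hypothetical execution unit: return the placement after all tasks run."""
    new_placement = dict(placement)
    for vm, _src, dst in tasks:
        new_placement[vm] = dst
    return new_placement


VM_LOAD = {"vm-01": 50.0, "vm-02": 40.0, "vm-03": 10.0}
PLACEMENT = {"vm-01": "node-a", "vm-02": "node-a", "vm-03": "node-b"}
TASKS = generate_strategy(VM_LOAD, PLACEMENT, baseline=70.0)
NEW_PLACEMENT = apply_strategy(PLACEMENT, TASKS)
```

In this example node-a starts above the assumed baseline of 70, so the sketch proposes moving its heaviest virtual machine to node-b. A real strategy would also have to weigh memory, disk, and network usage, which the sketch ignores.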

The beneficial effects of the present invention are as follows: the resource scheduling strategy based on baseline analysis in a computing environment proposes a lexical analysis system that collects log data and analyzes it to generate scheduling strategies that help improve overall efficiency; executing these strategies raises the effective utilization of nodes and thereby improves overall efficiency.

Brief description of the drawings

Figure 1 is a schematic diagram of a virtualized system architecture.

Figure 2 is a simplified schematic diagram of the Hadoop architecture.

Figure 3 is a schematic diagram of the architecture of the lexical analysis system of the present invention.

Detailed description of the invention

To make the technical problem to be solved, the technical solution, and the beneficial effects of the present invention clearer, the present invention is described in detail below with reference to the drawings and embodiments. It should be noted that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.

In this resource scheduling strategy based on baseline analysis in a computing environment, a lexical analysis system is established, comprising a log analysis module, a data analysis module, and a resource scheduling module. The log analysis module comprises a log upload unit and a log parsing unit; the data analysis module comprises a data integration and analysis unit and an analysis result viewing unit; and the resource scheduling module comprises a resource scheduling strategy generation unit and a resource scheduling strategy execution unit.

The log analysis module uses the Hadoop framework to parse unformatted log data and store it in a structured format. It uses the Map/Reduce programming framework to parse logs in parallel, extracts virtual machine performance data (CPU and memory usage) and virtual machine process usage data from the logs, and after aggregation stores them in a defined format. Specifically, the log analysis module uses the Map/Reduce programming model: the Mapper reads and parses the logs and generates easy-to-read log objects that are handed to the Reducer; the Reducer merges all log data and outputs it to HDFS for storage.
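The Mapper/Reducer division of labor described above can be sketched in plain Python without a Hadoop cluster. This is an illustrative sketch, not the patented implementation: the log line layout (`timestamp vm_id cpu=<percent> mem=<MB>`), the function names, and the in-process shuffle in `run_job` are all assumptions, and the result is returned rather than written to HDFS.

```python
from collections import defaultdict
from statistics import mean


def mapper(line):
    """Parse one raw log line and yield (vm_id, record); skip malformed lines."""
    parts = line.split()
    if len(parts) != 4:
        return
    timestamp, vm_id, cpu_field, mem_field = parts
    try:
        cpu = float(cpu_field.split("=", 1)[1])
        mem = float(mem_field.split("=", 1)[1])
    except (IndexError, ValueError):
        return
    yield vm_id, {"ts": timestamp, "cpu": cpu, "mem": mem}


def reducer(vm_id, records):
    """Merge all records of one virtual machine into a single summary row."""
    records = list(records)
    return {
        "vm_id": vm_id,
        "samples": len(records),
        "avg_cpu": mean(r["cpu"] for r in records),
        "avg_mem": mean(r["mem"] for r in records),
    }


def run_job(lines):
    """Stand-in for the framework: map, group by key (shuffle), then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return [reducer(key, values) for key, values in sorted(groups.items())]


LOG_LINES = [
    "2016-08-19T10:00:00 vm-01 cpu=20.0 mem=512",
    "2016-08-19T10:05:00 vm-01 cpu=40.0 mem=1024",
    "2016-08-19T10:00:00 vm-02 cpu=90.0 mem=2048",
    "corrupted line",
]
RESULT = run_job(LOG_LINES)
```

Here `RESULT` holds one summary row per virtual machine, and the malformed line is dropped by the mapper. On a real cluster the grouping step is performed by the framework between the map and reduce phases.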

The data analysis module first integrates the data and merges it into the required format; it then applies the K-means algorithm to perform cluster analysis on the data. To improve computational capacity and efficiency, K-means is parallelized using the Map/Reduce programming framework.
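One common way to parallelize K-means with Map/Reduce, shown here as a sketch and not necessarily the scheme used in the invention, is to let the map step assign each point to its nearest center and the reduce step average the points assigned to each center. The function names and the sample (CPU, memory) points below are assumptions; a single iteration is simulated in-process.

```python
import math
from collections import defaultdict


def kmeans_map(point, centroids):
    """Map step: emit (index of nearest centroid, (point, 1))."""
    idx = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
    return idx, (point, 1)


def kmeans_combine(a, b):
    """Add two (coordinate sums, count) partial results together."""
    (sum_a, n_a), (sum_b, n_b) = a, b
    return tuple(x + y for x, y in zip(sum_a, sum_b)), n_a + n_b


def kmeans_reduce(partials):
    """Reduce step: turn the partial sums of one cluster into its new centroid."""
    total = partials[0]
    for partial in partials[1:]:
        total = kmeans_combine(total, partial)
    sums, count = total
    return tuple(s / count for s in sums)


def kmeans_iteration(points, centroids):
    """One Map/Reduce round; a centroid with no points keeps its old position."""
    groups = defaultdict(list)
    for point in points:
        idx, partial = kmeans_map(point, centroids)
        groups[idx].append(partial)
    return [
        kmeans_reduce(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


# Hypothetical (CPU %, memory %) samples for four virtual machines.
POINTS = [(10.0, 20.0), (12.0, 22.0), (80.0, 70.0), (82.0, 72.0)]
CENTROIDS = [(0.0, 0.0), (100.0, 100.0)]
NEW_CENTROIDS = kmeans_iteration(POINTS, CENTROIDS)
```

Because partial sums and counts can be added in any order, `kmeans_combine` could also run as a combiner on each node to cut the data sent to the reducers. A driver would repeat `kmeans_iteration` until the centroids stop moving.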

1. About the Hadoop framework

The core of Hadoop consists of Map/Reduce and the Hadoop Distributed File System (HDFS). At the bottom is HDFS, which stores the files on all nodes of Hadoop and consists of the NameNode and DataNodes. Map/Reduce sits one layer above HDFS and consists of the JobTracker and TaskTrackers.

2. About the K-means algorithm

The K-means algorithm is one of the most classic partition-based clustering algorithms. Its greatest advantage is that it is simple and fast, and with minor modification it can be applied on a distributed computing framework, making it a tool for large-scale data analysis.

The inputs of the K-means algorithm are the value K and N data objects; the N data objects are partitioned into K clusters such that the resulting clusters satisfy the following conditions:

(1) within the same cluster, the similarity between data objects is high;

(2) between different clusters, the similarity between data objects is low.

Each cluster has a cluster center, i.e., the centroid of the cluster. The cluster center is generally computed as the mean of the data objects in the cluster, and it represents the characteristics of that cluster.
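As a worked formula, assuming the data objects are numeric feature vectors, and writing S_j for the set of data objects in cluster j and c_j for its center (both symbols are introduced here for illustration and do not appear in the original text), the mean-based cluster center can be written as:

```latex
c_j = \frac{1}{|S_j|} \sum_{x \in S_j} x, \qquad j = 1, \dots, K
```

For example, a cluster containing the two objects (10, 20) and (12, 22) has the center (11, 21).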

The basic idea of the K-means algorithm is: set K center points, compute the similarity between each object and the center points, assign each object to the class of the center point it is most similar to, and update the center points until a satisfactory clustering result is obtained.
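The basic idea can be sketched as a short serial loop. This is a minimal illustration under assumptions not stated in the text: Euclidean distance serves as the (inverse) similarity measure, the first K points are used as the initial centers, and iteration stops when the assignments no longer change.

```python
import math


def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; return (centroids, labels)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: take the first k points as the initial center points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign every object to its nearest (most similar) center point.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the result has converged
        labels = new_labels
        # Update each center point to the mean of the objects assigned to it.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


POINTS = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0)]
CENTROIDS, LABELS = kmeans(POINTS, k=2)
```

The outcome of K-means depends on the initial centers, so practical implementations usually choose them randomly or with a seeding heuristic and may run the algorithm several times.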

Claims

1. A resource scheduling strategy based on baseline analysis in a computing environment, characterized in that: a lexical analysis system is established, the lexical analysis system comprising a log analysis module, a data analysis module, and a resource scheduling module; the log analysis module comprises a log upload unit and a log parsing unit, the data analysis module comprises a data integration and analysis unit and an analysis result viewing unit, and the resource scheduling module comprises a resource scheduling strategy generation unit and a resource scheduling strategy execution unit.

2. The resource scheduling strategy based on baseline analysis in a computing environment according to claim 1, characterized in that: the log analysis module uses the Map/Reduce programming model; the Mapper reads and parses the logs and generates easy-to-read log objects that are handed to the Reducer; the Reducer merges all log data and outputs it to HDFS for storage.

3. The resource scheduling strategy based on baseline analysis in a computing environment according to claim 1, characterized in that: the data analysis module first integrates the data and merges it into the required format; it then applies K-means to perform cluster analysis on the data; to improve computational capacity and efficiency, K-means is parallelized using the Map/Reduce programming framework.

4. The resource scheduling strategy based on baseline analysis in a computing environment according to claim 1, characterized in that: the resource scheduling module organizes virtual machine association information according to the data integration results and generates corresponding scheduling strategies to improve node utilization; once a scheduling strategy has been generated, the resource scheduling strategy execution unit executes the scheduling tasks of the associated virtual machines, thereby improving overall efficiency.