CN108153657B - Method for dividing application roles of large-scale data center server

Info

Publication number: CN108153657B
Application number: CN201711405036.8A
Authority: CN
Inventors: 武志昊; 林友芳; 万怀宇
Original assignee: Beijing Jiaotong University
Current assignee: Beijing Jiaotong University
Priority date: 2017-12-22
Filing date: 2017-12-22
Publication date: 2020-10-20
Anticipated expiration: 2037-12-22
Also published as: CN108153657A

Abstract

The invention relates to the technical field of server operation and maintenance management, and in particular to a method for dividing the application roles of servers in a large-scale data center. The method overcomes the shortcomings of traditional operation and maintenance practice: it does not require large amounts of manpower and material resources to collect and tabulate data, and it does not require operation and maintenance personnel to have extensive prior knowledge of the system architecture of the applications in the data center. The role division is constructed automatically from server usage and log data, and effectively helps operation and maintenance personnel understand which roles the servers of the data center play.

Description

Method for dividing application roles of large-scale data center server

Technical Field

The invention relates to the technical field of server operation and maintenance management, in particular to a method for dividing application roles of a large-scale data center server.

Background

In recent years, the rapid growth in the number of servers in large data centers has put great pressure on operation and maintenance departments. It is increasingly difficult for operation and maintenance managers to keep track of how the servers inside the data center are actually used and, in particular, of the role that each server mainly plays.

With the rapid development of the internet and the arrival of the cloud computing and big data era, many enterprises have begun to build their own data centers and cloud computing platforms to support their large and complex business systems. However, the complexity and dynamics that come with growing system scale pose significant challenges for managing server resources in a data center. Many business systems with complicated interrelationships run on the servers of a data center, and different servers play different roles and perform different functions. Operation and maintenance managers nevertheless lack a clear picture of much of the data center. For example, they cannot know in detail which programs run on the servers, where the boundary of each business system lies, how the servers within a business system relate to one another, or how the business systems relate to each other. As a result, when a server fails, operation and maintenance personnel cannot accurately locate the cause. Recording information such as the usage status, function, and role type of servers only through administrative means such as registration causes many problems: such records are not timely, accurate, or comprehensive, and they lead to management inconvenience and misdirection. On the other hand, most operation and maintenance management in existing data centers focuses on real-time monitoring. Although data centers store massive amounts of historical operation records, these records are not effectively mined or used. How to discover the operating-pattern characteristics of servers from historical data, determine the role category of each server, and carry out targeted security monitoring and management remains an open problem.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides a method for dividing application roles of a large-scale data center server so as to realize effective operation and maintenance management of the data center server.

The invention provides the following scheme:

A method for dividing application roles of a large data center server comprises the following steps:

S1, analyzing the original log data of the large data center server over a certain time period, wherein the time period is measured in days, and performing denoising, serialization, conversion, decompression and other processing on the unstructured original log data;

extracting process information data from the original log data, wherein the process information data comprises at least seven items: the sampled server name, process ID, CPU utilization rate, virtual memory utilization rate, process name, path and parameters;

extracting log sampling time data of the server from the original log data, wherein the log sampling time data comprises at least two items: the server name and the number of CPU cores of the server;

S2, filtering out the processes loaded when the server starts, so that these processes do not influence the role division of the server, and extracting process features;

a process whose name begins with the "[" symbol, or whose virtual memory utilization rate is 0, is regarded as a process loaded at server start-up and is filtered out and not used further; otherwise, the process is retained for server role division;

extracting process information data of all servers from the process information data of S1, including at least the process name and virtual memory utilization rate of each process on every server; calculating the feature value TF-IDF of each process of the server from the process information data and the log sampling time data of the server;

the feature value TF-IDF of each process of the server is calculated as follows:

traversing the server processes retained under the current timestamp, counting the sampling frequency of each server, counting the frequency TF of each process on each server, and counting the frequency DF of each process across all samples; then:

TF-IDF = TF/DF

the larger the TF-IDF value, the greater the degree of distinction between servers;

S3, calculating the contribution degree of each process to server classification, and constructing a multi-dimensional feature matrix:

S3.1 first, counting, for each process, the frequency A with which it appears in the positive samples, the frequency B with which it appears in the negative samples, the frequency C with which it does not appear in the positive samples, the frequency D with which it does not appear in the negative samples, and the total number N of samples:

S3.1.1 traversing the training data to count the number N₁ of positive samples belonging to a given role and the total number of samples N;

S3.1.2 traversing the training data, collecting the processes contained on all servers, and forming a process set;

S3.1.3 traversing all processes in the process set, and counting the positive-sample occurrence frequency A with which the process occurs in the designated server role and the negative-sample occurrence frequency B with which the process occurs in the non-designated roles;

S3.1.4 obtaining C from the condition that the sum of A and C is 1, that is, every positive sample either contains the process or does not;

S3.1.5 obtaining D from the condition that the sum of B and D is 1, that is, every negative sample either contains the process or does not;

S3.2 calculating the contribution CHI of each process to the classification:

traversing each process, and calculating its CHI value by using the following formula:
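The formula itself is not reproduced in this text. Because Table 4 describes CHI as a chi-square value, one plausible reading is the standard chi-square statistic used for feature selection. The expression below is that standard form, given as an assumption rather than as the patent's own expression, with A, B, C and D taken as counts and N as the total number of samples:

```latex
\mathrm{CHI} = \frac{N\,(AD - BC)^{2}}{(A + C)\,(B + D)\,(A + B)\,(C + D)}
```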

the magnitude of the CHI value represents the contribution of the process to the classification: the larger the CHI, the greater the degree of classification distinction;

S3.3, constructing a feature matrix from the feature value TF-IDF and the contribution CHI of each process to the classification effect, as follows:

according to the calculated contribution CHI of each process to the classification effect, selecting the processes whose values are in the top 20% as the attributes used for classification; using the feature value TF-IDF calculated above as the value of each attribute; and constructing a high-dimensional feature matrix whose dimensionality is the number of selected attributes.

S4, training a support vector machine (SVM) classification model on the constructed feature matrix, wherein the constructed feature matrix of each server is used as the sample input and the server type with which the sample is labelled is used as the sample output, thereby training a classifier model suitable for server role division;

S5, on the test set, calculating the feature value TF-IDF of the sampled servers, and using the processes selected in S3 as the attributes of the feature matrix; and selecting the feature value TF-IDF of the corresponding process as the attribute value, thereby constructing the feature matrix of the test set.

S6, using the multi-dimensional feature matrix constructed on the test set as the input of the classifier, the trained support vector machine classifier model as the core, and the server category as the output, so as to perform role division on the servers of the unknown test set.

The invention has the following technical effects. The method for dividing the application roles of large-scale data center servers overcomes the shortcomings of traditional operation and maintenance practice: it does not require large amounts of manpower and material resources to collect and tabulate data, and it does not require operation and maintenance personnel to have extensive prior knowledge of the system architecture of the applications in the data center. The role division is constructed automatically from server usage and log data, and effectively helps operation and maintenance personnel understand which roles the servers of the data center play.

Drawings

Fig. 1 is a flowchart of a method for server application role division according to an embodiment of the present invention.

Detailed Description

To facilitate understanding of the embodiments of the present invention, several specific embodiments are further explained below with reference to the drawings; these embodiments are not to be construed as limiting the embodiments of the present invention.

To overcome the problems of the existing manually compiled service logic network of a data center, such as missing and outdated information, the embodiment of the invention provides a method for dividing application roles of a large-scale data center server.

The method for dividing application roles of a data center server provided by the embodiment of the invention comprises the following processing flow:

S1: The raw log data of the large-scale data center server over a certain time period is analyzed. The time period is measured in days; in practical applications, weeks, months, or other intervals may also be chosen. The unstructured raw log data is subjected to denoising, serialization, conversion, decompression, and other processing.

Process information data is extracted from the raw log data. It comprises at least seven items: the sampled server name, process ID, CPU utilization rate, virtual memory utilization rate, process name, path, and parameters.

The process information data includes fields shown in table 1 below;

TABLE 1

Log sampling time data of the server is extracted from the raw log data. It comprises at least the server name and the number of CPU cores of the server, and includes the fields shown in Table 2:

TABLE 2

No.	Name	Description
1	Server name	Name of the sampled server
2	Number of server CPU cores	Number of CPU cores in the server
3	Date	Sampling date
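The two record types extracted in S1 can be pictured as simple data structures. The sketch below is illustrative only: the class names, field names, and the tab-separated input format accepted by `parse_sampling_line` are hypothetical, chosen to mirror the seven process items and the Table 2 fields; the patent does not specify a storage format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProcessRecord:
    """One sampled process; fields mirror the seven items listed for S1."""
    server_name: str
    pid: int
    cpu_usage: float     # CPU utilization rate
    vmem_usage: float    # virtual memory utilization rate
    process_name: str
    path: str
    args: str            # command-line parameters


@dataclass(frozen=True)
class SamplingRecord:
    """One log-sampling entry; fields mirror Table 2."""
    server_name: str
    cpu_cores: int
    sample_date: date


def parse_sampling_line(line: str) -> SamplingRecord:
    """Parse an assumed tab-separated line: name, cores, YYYY-MM-DD."""
    name, cores, day = line.rstrip("\n").split("\t")
    return SamplingRecord(name, int(cores), date.fromisoformat(day))
```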

S2: The processes loaded when the server starts are filtered out, so that they do not influence the role division of the server. The filtering is performed as follows:

All process information data in the current time period is traversed, and the process name and virtual memory utilization rate of each process on each server are extracted.

A process whose name begins with the "[" symbol, or whose virtual memory utilization rate is 0, is regarded as a process loaded at server start-up and is filtered out and not used further; otherwise, the process is retained for server role division.
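The filtering rule can be sketched in a few lines. This is a minimal illustration that reads the rule as "name starts with `[` or virtual memory utilization equals 0"; the tuple layout and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical minimal record shape: (server_name, process_name, vmem_usage)
Proc = Tuple[str, str, float]


def is_startup_process(process_name: str, vmem_usage: float) -> bool:
    """True if the name begins with "[" or virtual memory utilization is 0."""
    return process_name.startswith("[") or vmem_usage == 0


def keep_for_role_division(procs: Iterable[Proc]) -> List[Proc]:
    """Drop start-up processes and keep the rest, preserving order."""
    return [p for p in procs if not is_startup_process(p[1], p[2])]
```

For example, a record named `[kworker/0:1]` or any record with zero virtual memory utilization would be dropped, while `nginx` with a non-zero value would be kept.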

After this filtering, the feature value TF-IDF of each retained process is extracted:

TF-IDF = TF/DF

To compute it, all log sampling time data under the current timestamp is traversed and the sampling frequency of each server is counted; the frequency TF of each process on each server is counted; and the frequency DF of each process across all samples is counted. The larger the TF-IDF value, the greater the degree of distinction between servers.
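One way to read the TF/DF ratio is sketched below. The normalisation is an assumption, since the text does not define it precisely: TF is taken as the share of samples of one server in which the process appears, and DF as the share of all samples in which it appears. The function name `tf_idf` and the input shape are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple

# (server_name, names of the retained processes seen in one sampling)
Sample = Tuple[str, Set[str]]


def tf_idf(samples: Iterable[Sample]) -> Dict[Tuple[str, str], float]:
    """Return {(server, process): TF / DF} under the assumptions above."""
    samples = list(samples)
    total = len(samples)
    per_server = Counter()            # sampling count of each server
    tf_count = defaultdict(Counter)   # server -> process -> samples containing it
    df_count = Counter()              # process -> samples containing it, all servers
    for server, procs in samples:
        per_server[server] += 1
        for p in procs:
            tf_count[server][p] += 1
            df_count[p] += 1
    scores: Dict[Tuple[str, str], float] = {}
    for server, counts in tf_count.items():
        for p, c in counts.items():
            tf = c / per_server[server]
            df = df_count[p] / total
            scores[(server, p)] = tf / df
    return scores
```

Under this reading, a process that runs on every server scores 1, while a process seen on only a few servers scores higher on those servers.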

The extracted TF-IDF feature values include at least the fields shown in table 3 below.

TABLE 3

No.	Name	Description
1	Server name	Name of the sampled server
2	Process name
3	DF	Frequency of occurrence of the process in all samples
4	TFIDF	Calculated process feature value
5	Date	Sampling date

S3: The degree CHI to which each process contributes to the classification is calculated.

For each process, the frequency of occurrence in positive samples (A), the frequency of occurrence in negative samples (B), the frequency of non-occurrence in positive samples (C), the frequency of non-occurrence in negative samples (D), and the total number of samples N are counted.

The CHI value of each process is then calculated for use as a classification attribute. Specifically: the training data is traversed to count the number N₁ of positive samples belonging to a specified role and the total number of samples N; the training data is traversed again to collect the processes contained on all servers into a process set; for every process in the process set, the frequency A with which it appears in the specified server role and the frequency B with which it appears in the non-specified roles are counted; C is obtained from the fact that A and C do not intersect and together account for all positive samples; similarly, D is obtained from the fact that B and D do not intersect and together account for all negative samples. Each process is then traversed, and its CHI value is calculated from these quantities using the formula. The magnitude of the CHI value represents the contribution of the process to the classification: the larger the CHI, the greater the degree of classification distinction.
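The counting described for S3 can be sketched as follows. This is a minimal illustration, not the patent's implementation. It assumes that each sample is one labelled server with its set of process names, and that A, B, C and D are counts, so that C = N₁ − A and D = (N − N₁) − B. It also assumes that CHI is the standard chi-square statistic, because the formula is not reproduced in this text. The name `chi_scores` and the input shape are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# (role label, process names found on that server)
LabeledServer = Tuple[str, Set[str]]


def chi_scores(servers: Iterable[LabeledServer], role: str) -> Dict[str, float]:
    """Chi-square score of every process with respect to one role."""
    servers = list(servers)
    n = len(servers)                                        # total samples N
    n1 = sum(1 for label, _ in servers if label == role)    # positive samples N1
    process_set = set().union(*(procs for _, procs in servers))
    scores: Dict[str, float] = {}
    for p in process_set:
        a = sum(1 for label, procs in servers if label == role and p in procs)
        b = sum(1 for label, procs in servers if label != role and p in procs)
        c = n1 - a             # positive samples without the process
        d = (n - n1) - b       # negative samples without the process
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        # Assumed formula: N * (AD - BC)^2 / ((A+C)(B+D)(A+B)(C+D))
        scores[p] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores
```

A process present on every server gets a score of 0 in this sketch, because it cannot separate the specified role from the others.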

The information after the calculation of the CHI feature value includes at least the fields shown in table 4 below.

TABLE 4

No.	Name	Description
1	Process name
2	CHI	Chi-square test value
3	Category	Category to which the process belongs

The feature matrix is then constructed. According to the calculated CHI value of each process, the processes whose values are in the top 20% are selected as the attributes used for classification; the TF-IDF value calculated above is used as the value of each attribute; and a high-dimensional feature matrix is constructed whose dimensionality is the number of selected attributes.
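The selection and matrix construction can be sketched as below. Several details are assumptions not stated in the text: the top 20% is rounded up and never falls below one process, ties are broken by process name, and a selected process that is absent from a server contributes 0. The function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def select_top_processes(chi: Dict[str, float], fraction: float = 0.2) -> List[str]:
    """Processes whose CHI lies in the top `fraction`, highest first."""
    if not chi:
        return []
    k = max(1, math.ceil(len(chi) * fraction))
    ranked = sorted(chi, key=lambda p: (-chi[p], p))   # ties broken by name
    return ranked[:k]


def build_feature_matrix(
    servers: List[str],
    attributes: List[str],
    tfidf: Dict[Tuple[str, str], float],
) -> List[List[float]]:
    """One row per server, one column per selected process (absent -> 0.0)."""
    return [[tfidf.get((s, p), 0.0) for p in attributes] for s in servers]
```

The number of columns equals the number of selected processes, which matches the statement that the dimensionality is the number of selected attributes.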

S4: A support vector machine (SVM) classifier is trained on the sample data. The constructed feature matrix of each server is used as the sample input, and the server type with which the sample is labelled is used as the sample output; a classifier model suitable for server role division is thereby trained.
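The text does not name a kernel, a library, or training parameters. As a rough, dependency-free illustration, the sketch below trains a linear soft-margin SVM by sub-gradient descent on the hinge loss and handles several roles with a one-vs-rest scheme. All of these choices, the hyperparameter values, and the function names are assumptions; a practical system would more likely rely on an established SVM library.

```python
from typing import Dict, List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weight vector w, bias b)


def _score(model: Model, x: Sequence[float]) -> float:
    """Decision value w.x + b."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_binary_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],      # +1 for the given role, -1 otherwise
    lam: float = 0.01,     # L2 regularisation strength (assumed value)
    lr: float = 0.1,       # step size (assumed value)
    epochs: int = 200,
) -> Model:
    """Linear soft-margin SVM fitted by sub-gradient descent on the hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * _score((w, b), xi) < 1:   # inside the margin: hinge term active
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:                             # outside the margin: only shrink w
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def train_one_vs_rest(
    X: Sequence[Sequence[float]], roles: Sequence[str]
) -> Dict[str, Model]:
    """Train one binary SVM per role label."""
    return {
        role: train_binary_svm(X, [1 if r == role else -1 for r in roles])
        for role in sorted(set(roles))
    }


def predict_role(models: Dict[str, Model], x: Sequence[float]) -> str:
    """Return the role whose SVM gives the largest decision value."""
    return max(sorted(models), key=lambda role: _score(models[role], x))
```

Here each row of `X` is the feature vector of one server and `roles` holds the labelled server types, matching the sample input and output described for S4.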

S5: On the test set, the feature value TF-IDF of the sampled servers is calculated, and the processes selected in step S3 are used as the attributes of the feature matrix; the feature value TF-IDF of the corresponding process is taken as the attribute value, thereby constructing the feature matrix of the test set.
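The key point of S5 is that test vectors must use the same attributes, in the same order, as the training matrix. A minimal sketch follows, assuming that processes not selected during training are ignored and that selected processes missing on a test server contribute 0. The function name and the nested-dictionary input shape are hypothetical.

```python
from typing import Dict, List


def project_onto_training_attributes(
    test_tfidf: Dict[str, Dict[str, float]],   # server -> {process: TF-IDF}
    selected: List[str],                       # attributes chosen on the training set
) -> Dict[str, List[float]]:
    """Map each test server to a vector ordered like the training attributes."""
    return {
        server: [scores.get(p, 0.0) for p in selected]
        for server, scores in test_tfidf.items()
    }
```

The resulting vectors can then be passed to the trained classifier as described in S6.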

S6: The high-dimensional feature matrix of the test set constructed in step S5 is used as input, the classifier model trained in step S4 as the core, and the server category as output; the test set is classified to obtain the classification result, completing the role division of the data center servers.

In summary, the server role division method provided in the embodiment of the present invention overcomes the above-mentioned shortcomings of conventional server role division: it does not require large amounts of manpower and material resources to collect and tabulate data, and it does not require operation and maintenance personnel to have extensive prior knowledge of the system architecture of the applications in the data center. The server role division is constructed automatically and effectively helps operation and maintenance personnel understand which roles the servers of the data center play.

The automatic role division constructed by the invention accurately reflects the roles played by the data center servers and assists operation and maintenance personnel in their management. The only input required is server log snapshot data from the data center; from this data the invention discovers the role division of the servers automatically. The result is accurate, little manual work is needed, and a large amount of manpower and material expense is saved.

Claims

1. A method for dividing application roles of a large data center server is characterized by comprising the following steps:

s1, analyzing the original log data of the large data center server in a certain time period, wherein the time period takes days as a unit, and denoising, serializing, converting and decompressing the unstructured original log data;

TF-IDF＝TF/DF

S3.1.4 obtaining C from the condition that the sum of A and C is 1;

S3.1.5 obtaining D from the condition that the sum of B and D is 1;

S3.2 calculating the contribution CHI of each process to the classification:

according to the calculated contribution CHI of each process to the classification effect, selecting the processes whose values are in the top 20% as the attributes used for classification; using the feature value TF-IDF calculated above as the value of each attribute; and constructing a high-dimensional feature matrix whose dimensionality is the number of selected attributes;

S4, training an SVM classification model on the constructed feature matrix, wherein the constructed feature matrix of each server is used as the sample input and the server type with which the sample is labelled is used as the sample output, thereby training a classifier model suitable for server role division;

S5, on the test set, calculating the feature value TF-IDF of the sampled servers, and using the processes selected in S3 as the attributes of the feature matrix; selecting the feature value TF-IDF of the corresponding process as the attribute value, and constructing the feature matrix of the test set from the attribute values TF-IDF;