CN110750376B

CN110750376B - Server system fault acquisition and processing method and device and storage medium

Info

Publication number: CN110750376B
Application number: CN201910811799.5A
Authority: CN
Inventors: 张旭芳; 匡志鹏
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2019-08-30
Filing date: 2019-08-30
Publication date: 2022-10-18
Anticipated expiration: 2039-08-30
Also published as: CN110750376A

Abstract

The invention relates to a server system fault acquisition and processing method, a device and a storage medium. The method comprises the following steps: S1: collecting data; S2: establishing a problem classification keyword library according to the classification of system technical problems, for matching keywords and identifying the problem type of a specific problem; S3: identifying the problem type; S4: recording the problem processing efficiency; S5: analyzing the problem processing efficiency; S6: displaying the problem processing efficiency.

Description

Server system fault acquisition and processing method and device and storage medium

Technical Field

The invention belongs to the technical field of server system fault processing, and particularly relates to a server system fault acquisition processing method, a server system fault acquisition processing device and a storage medium.

Background

A server is a device that provides computing services. Because a server must respond to and process service requests, it generally needs the capability to carry and guarantee those services.

A server consists of a processor, hard disks, memory, a system bus and the like, much like a general-purpose computer architecture, but because it must provide highly reliable services it places high demands on processing capacity, stability, reliability, security, scalability and manageability.

In a network environment, servers are divided into file servers, database servers, application servers, WEB servers and the like, according to the type of service they provide.

Operation and maintenance personnel and testers in customer machine rooms and server test laboratories constantly encounter various problems, ranging from severe ones such as crashes, blue screens and automatic restarts to minor ones such as errors found in logs and performance degradation. These problems are of complex types and are solved inefficiently, which affects the running of customer services and the efficiency of research, development and testing. This is a defect of the prior art.

In view of the above, the present invention provides a server system fault acquisition and processing method, a device and a storage medium to address the above-mentioned deficiencies in the prior art.

Disclosure of Invention

To address the problem in the prior art that server system problems are of complex types and are solved inefficiently, the invention provides a server system fault acquisition and processing method, a device and a storage medium.

In order to achieve the purpose, the invention provides the following technical scheme:

In a first aspect, the present invention provides a server system fault acquisition and processing method, including the following steps:

S1: data acquisition step:

problem data of technical problems occurring in a server system is collected, and the collected technical problems are classified according to the performance characteristics of the problem data;

S2: a problem classification keyword library is established according to the classification of system technical problems, and is used for matching keywords and identifying the problem type of a specific problem;

S3: problem type identification step: the problem classification keyword library is consulted, keyword identification is performed on the specific problem description, and the highest-level problem type identified is taken as the problem type of the problem;

For example, if the three keywords "crash", "severe error" and "system no response" are identified in the description of a specific problem, the first keyword "crash" already shows that the problem is a downtime problem, so the subsequent keywords need not be evaluated. As another example, if the two keywords "installation configuration" and "severe error" are identified, the first keyword indicates that the problem may be a technical consultation problem and the second indicates that it may be a system analysis problem; the higher of the two levels is taken, that is, the problem is a system analysis problem.
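The rule in step S3 can be illustrated with a minimal Python sketch. The numeric ranking in `TYPE_LEVEL`, the function name and the small keyword excerpt are assumptions made for illustration; the text only implies that downtime outranks system analysis, which in turn outranks technical consultation.

```python
from typing import Optional

# ASSUMPTION: numeric levels are illustrative; only their order follows the text.
TYPE_LEVEL = {
    "technical consultation": 1,
    "system analysis": 2,
    "downtime": 3,
}

# Small excerpt of a keyword library (keyword -> problem type), for illustration only.
KEYWORD_TYPE = {
    "crash": "downtime",
    "system no response": "downtime",
    "severe error": "system analysis",
    "installation configuration": "technical consultation",
}


def identify_problem_type(description: str) -> Optional[str]:
    """Return the highest-level problem type whose keyword occurs in the description."""
    text = description.lower()
    best = None
    for keyword, problem_type in KEYWORD_TYPE.items():
        if keyword not in text:
            continue
        if best is None or TYPE_LEVEL[problem_type] > TYPE_LEVEL[best]:
            best = problem_type
        if best == "downtime":
            break  # nothing outranks downtime, so later keywords are not evaluated
    return best
```

With these assumptions, a description containing both "installation configuration" and "severe error" resolves to "system analysis", matching the second example above.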

S4: problem processing efficiency recording step:

Historical data on the total processing time of all problems of the same type is stored by problem type, and the average, longest and shortest processing times of each problem type are calculated and updated in real time. These three values serve as reference data for evaluating problem processing efficiency.
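One way to keep the three reference values up to date is to update them each time a problem is closed. The sketch below is only an illustration under assumptions: the class names, the in-memory dictionary and the use of hours as the unit are not taken from the text.

```python
from dataclasses import dataclass


@dataclass
class TypeStats:
    """Running statistics for one problem type (durations in hours, an assumed unit)."""
    count: int = 0
    total: float = 0.0
    shortest: float = float("inf")
    longest: float = 0.0

    def add(self, duration: float) -> None:
        # Update all reference values as soon as a new duration arrives.
        self.count += 1
        self.total += duration
        self.shortest = min(self.shortest, duration)
        self.longest = max(self.longest, duration)

    @property
    def average(self) -> float:
        # Only meaningful once at least one duration has been recorded.
        return self.total / self.count


class EfficiencyHistory:
    """Hypothetical in-memory history library, keyed by problem type."""

    def __init__(self) -> None:
        self._stats: dict = {}

    def record(self, problem_type: str, duration: float) -> TypeStats:
        stats = self._stats.setdefault(problem_type, TypeStats())
        stats.add(duration)
        return stats
```

A persistent store could replace the dictionary without changing the update logic.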

S5: problem processing efficiency analysis step:

Problem data is obtained from the data acquisition step, and the actual processing time of the problem is calculated. The processing efficiency of the problem is then analyzed according to the identified problem type and the average, longest and shortest processing times of that problem type:

If actual processing time = average processing time, then problem processing efficiency = 1.

If shortest processing time < actual processing time < average processing time, then problem processing efficiency = 1 + ((actual processing time - shortest processing time) / average processing time).

If average processing time < actual processing time < longest processing time, then problem processing efficiency = 1 - ((longest processing time - actual processing time) / average processing time).

The problem processing efficiency data obtained by the analysis is stored for subsequent system function expansion and data analysis.
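The three cases of step S5 can be written as a small function. This is a sketch that follows the formulas exactly as stated; the function name is hypothetical, and returning `None` when the actual time equals the shortest or longest time, or falls outside that range, is an assumption, since the text does not define those cases.

```python
from typing import Optional


def processing_efficiency(actual: float, shortest: float,
                          average: float, longest: float) -> Optional[float]:
    """Problem processing efficiency according to the three stated cases."""
    if actual == average:
        return 1.0
    if shortest < actual < average:
        return 1 + (actual - shortest) / average
    if average < actual < longest:
        return 1 - (longest - actual) / average
    # ASSUMPTION: cases not covered by the text are reported as undefined.
    return None
```

For example, with a shortest time of 4 hours, an average of 10 hours and a longest time of 20 hours, an actual time of 15 hours gives 1 - (20 - 15) / 10 = 0.5.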

S6: problem processing efficiency display step:

This step is responsible for the system interface display and sends the problem processing efficiency data by e-mail to the problem handler, the handler's supervisor and other people who need the data, as a means of urging staff to improve their working efficiency and as a basis for assessing work results.
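The e-mail part of step S6 could be prepared with Python's standard `email.message.EmailMessage` class. The sketch below only builds the message; the sender address, the subject layout and the function name are hypothetical, and actual delivery (for example through an SMTP server) is left out because the text does not describe it.

```python
from email.message import EmailMessage


def build_efficiency_report(problem_id: str, problem_type: str,
                            efficiency: float, recipients: list) -> EmailMessage:
    """Build (but do not send) an e-mail reporting one problem's processing efficiency."""
    msg = EmailMessage()
    msg["Subject"] = f"Problem {problem_id}: processing efficiency {efficiency:.2f}"
    msg["From"] = "fault-tracker@example.com"  # hypothetical sender address
    msg["To"] = ", ".join(recipients)          # handler, supervisor, other recipients
    msg.set_content(
        f"Problem ID: {problem_id}\n"
        f"Problem type: {problem_type}\n"
        f"Processing efficiency: {efficiency:.2f}\n"
    )
    return msg
```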

Preferably, in step S1, the sources of data acquisition include existing problem processing systems and EXCEL tables that record the problem handling process; the data acquired includes the specific problem description, the time the problem was raised, the time processing started, the time of each round of replies, the reply content, the time the problem was closed and the final solution to the problem. Problem data is thus collected comprehensively, which improves the completeness and accuracy of problem data acquisition.
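The collected fields map naturally onto a record type. The following is a sketch with hypothetical field names. Measuring the actual processing time from the start of processing to the close of the problem is an assumption; the text says only that the actual processing time is calculated from the collected data.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ProblemRecord:
    """One collected problem, with the fields listed for step S1."""
    description: str
    raised_at: datetime
    processing_started_at: datetime
    closed_at: datetime
    final_solution: str
    # Each reply is stored as (reply time, reply content).
    replies: list = field(default_factory=list)

    def actual_processing_time(self) -> timedelta:
        # ASSUMPTION: processing time runs from the start of processing to the close.
        return self.closed_at - self.processing_started_at
```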

Preferably, in step S1, the technical problems are classified into three categories according to the performance characteristics of the problem data:

the first category, downtime problems, seriously affects customer service operation or project schedules, for example: the system cannot be installed or started, the system goes down, restarts automatically, crashes or shows a blue screen;

the second category, system analysis problems, covers cases where the system can operate normally but a system service fails to start, severe software- or hardware-related errors appear in the logs, or system performance degrades, posing potential threats and hidden dangers to service operation;

the third category, technical consultation problems, does not affect customer service operation or test progress and covers requests for help with confirmation and log analysis, for example: consultation on system tools and command usage, consultation on system installation and configuration, and technical support for kernel upgrades and driver compilation.

Preferably, in step S2, the keywords in the problem classification keyword library include, but are not limited to, the following:

downtime problems - downtime, automatic system restart, system installation failure, system cannot be installed, system installation error, system startup failure, system startup error, automatic machine restart, blue screen, crash, system no response;

system analysis problems - service startup failure, severe error, warning, performance degradation, performance substandard;

technical consultation problems - tool usage, command usage, installation configuration, kernel upgrade, driver compilation.

The technical problems of the system are thus divided in detail, so that they are classified and indexed more accurately.
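The keyword library can be held as a plain mapping from problem type to keywords. The sketch below uses the English renderings listed above and plain substring matching. The matching strategy and the names `KEYWORD_LIBRARY` and `match_keywords` are assumptions, since the text does not say how keywords are matched.

```python
# Keyword library: problem type -> keywords (English renderings from the list above).
KEYWORD_LIBRARY = {
    "downtime": [
        "downtime", "automatic system restart", "system installation failure",
        "system cannot be installed", "system installation error",
        "system startup failure", "system startup error",
        "automatic machine restart", "blue screen", "crash", "system no response",
    ],
    "system analysis": [
        "service startup failure", "severe error", "warning",
        "performance degradation", "performance substandard",
    ],
    "technical consultation": [
        "tool usage", "command usage", "installation configuration",
        "kernel upgrade", "driver compilation",
    ],
}


def match_keywords(description: str) -> list:
    """Return (keyword, problem type) pairs found in the description, in library order."""
    text = description.lower()
    return [
        (keyword, problem_type)
        for problem_type, keywords in KEYWORD_LIBRARY.items()
        for keyword in keywords
        if keyword in text  # ASSUMPTION: case-insensitive substring match
    ]
```

Because the library is not limited to these entries, new keywords can be appended to the lists without touching the matching function.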

In a second aspect, the present invention provides a server system fault collecting and processing apparatus, including:

the data acquisition module is used for acquiring problem data of technical problems appearing in the server system and classifying the acquired technical problems according to the performance characteristics of the problem data;

the keyword library establishing module is used for establishing a problem classification keyword library according to the classification of system technical problems; the problem classification keyword library interfaces with the problem type identification module to match keywords and identify the problem type of a specific problem;

the problem type identification module is used for interfacing with the problem classification keyword library and performing keyword identification on the specific problem description, the highest-level problem type identified being taken as the problem type of the problem;

for example, if the three keywords "crash", "severe error" and "system no response" are identified in the description of a specific problem, the first keyword "crash" already shows that the problem is a downtime problem, so the subsequent keywords need not be evaluated. As another example, if the two keywords "installation configuration" and "severe error" are identified, the first keyword indicates that the problem may be a technical consultation problem and the second indicates that it may be a system analysis problem; the higher of the two levels is taken, that is, the problem is a system analysis problem;

the problem processing efficiency history library module is used for storing, by problem type, historical data on the total processing time of all problems of the same type, and for calculating and updating in real time the average, longest and shortest processing times of each problem type; these three values serve as reference data for evaluating problem processing efficiency;

the problem processing efficiency analysis module is used for obtaining problem data from the data acquisition module and calculating the actual processing time of the problem, and for analyzing the processing efficiency of the problem according to the problem type identified by the problem type identification module and the average, longest and shortest processing times of that problem type obtained from the problem processing efficiency history library module;

If shortest processing time < actual processing time < average processing time, then problem processing efficiency = 1 + ((actual processing time - shortest processing time) / average processing time).

The problem processing efficiency data obtained by the analysis is stored in the problem processing efficiency history library module for subsequent system function expansion and data analysis;

the problem processing efficiency display module is responsible for the system interface display and sends the problem processing efficiency data by e-mail to the problem handler, the handler's supervisor and other people who need the data, as a means of urging staff to improve their working efficiency and as a basis for assessing work results.

Preferably, in the data acquisition module, the sources of data acquisition include existing problem processing systems and EXCEL tables that record the problem handling process; the data acquired includes the specific problem description, the time the problem was raised, the time processing started, the time of each round of replies, the reply content, the time the problem was closed and the final solution to the problem. Problem data is thus collected comprehensively, which improves the completeness and accuracy of problem data acquisition.

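Preferably, in the data acquisition module, the technical problems are classified according to the performance characteristics of the problem data into the same three categories as in the first aspect: downtime problems, system analysis problems and technical consultation problems.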

Preferably, in the keyword library establishing module, the keywords in the question classification keyword library include, but are not limited to, the following fields:

technical consultation problems - tool usage, command usage, installation configuration, kernel upgrade, driver compilation. The technical problems of the system are thus divided in detail, so that they are classified and indexed more accurately.

In a third aspect, the present invention provides a computer storage medium having stored therein instructions which, when run on a computer, cause the computer to perform the method of the first aspect described above.

In a fourth aspect, the present invention provides a terminal, comprising:

a processor and a memory, wherein

the memory is used for storing a computer program, and

the processor is configured to retrieve the computer program from the memory and execute it, so that the terminal performs the method according to the first aspect.

In a fifth aspect, the present invention provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect described above.

The beneficial effect of the invention is that the processing efficiency of system technical problems is evaluated objectively, which helps technical service personnel improve their working efficiency and provides an objective basis for assessing their work results.

In addition, the invention has reliable design principle, simple structure and very wide application prospect.

Drawings

In order to illustrate the embodiments of the present invention or the prior art solutions more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below; those skilled in the art can obviously obtain other drawings from these drawings without creative effort.

Fig. 1 is a flowchart of a server system fault collection processing method provided by the present invention.

Fig. 2 is a schematic block diagram of a server system fault collection and processing apparatus provided in the present invention.

In the figures: 1 - data acquisition module; 2 - keyword library establishing module; 3 - problem type identification module; 4 - problem processing efficiency history library module; 5 - problem processing efficiency analysis module; 6 - problem processing efficiency display module.

Detailed Description

In order to make those skilled in the art better understand the technical solution of the present invention, the technical solution in the embodiment of the present invention will be clearly and completely described below with reference to the drawings in the embodiment of the present invention, and it is obvious that the described embodiment is only a part of the embodiment of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1:

As shown in Fig. 1, the server system fault acquisition and processing method provided in this embodiment includes the following steps:

S1: data acquisition step:

The sources of data acquisition include existing problem processing systems and EXCEL tables that record the problem handling process; the data acquired includes the specific problem description, the time the problem was raised, the time processing started, the time of each round of replies, the reply content, the time the problem was closed and the final solution to the problem. Problem data is thus collected comprehensively, which improves the completeness and accuracy of problem data acquisition.
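As an illustration of acquiring data from a spreadsheet that records the problem handling process, the sketch below assumes the EXCEL table has first been exported to CSV. The column names and the timestamp format are hypothetical and are not specified by the embodiment.

```python
import csv
import io
from datetime import datetime

# ASSUMPTION: hypothetical timestamp format and column names for the exported table.
TIME_FORMAT = "%Y-%m-%d %H:%M"
TIME_COLUMNS = ("raised_at", "processing_started_at", "closed_at")


def load_problem_rows(csv_text: str) -> list:
    """Parse exported problem rows and convert the time columns to datetime objects."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in TIME_COLUMNS:
            row[column] = datetime.strptime(row[column], TIME_FORMAT)
        rows.append(row)
    return rows
```

Each parsed row then carries the description and timestamps needed by the later identification and efficiency analysis steps.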

The technical problems are divided into three categories according to the performance characteristics of the problem data, and the keywords in the problem classification keyword library include, but are not limited to, the keywords listed for each category in the first aspect above.

S3: problem type identification step: the problem classification keyword library is consulted, keyword identification is performed on the specific problem description, and the highest-level problem type identified is taken as the problem type of the problem;

S4: problem processing efficiency recording step, performed as described in the first aspect above;

S5: problem processing efficiency analysis step:

Problem data is obtained from the data acquisition step, and the actual processing time of the problem is calculated. The processing efficiency of the problem is then analyzed according to the identified problem type and the average, longest and shortest processing times of that problem type:

If average processing time < actual processing time < longest processing time, then problem processing efficiency = 1 - ((longest processing time - actual processing time) / average processing time).

The problem processing efficiency data obtained by the analysis is stored for subsequent system function expansion and data analysis.

S6: problem processing efficiency display step:

This step is responsible for the system interface display. The problem processing efficiency data is sent by e-mail to the problem handler, the handler's supervisor and other people who need the data, as a means of urging staff to improve their working efficiency and as a basis for assessing work results.

Example 2:

As shown in Fig. 2, the server system fault acquisition and processing device provided in this embodiment includes:

the data acquisition module 1 is used for acquiring problem data of technical problems appearing in the server system and classifying the acquired technical problems according to the performance characteristics of the problem data;

The technical problems are divided into three categories according to the performance characteristics of the problem data:

the second category, system analysis problems, covers cases where the system can operate normally but a system service fails to start, severe software- or hardware-related errors appear in the logs, or system performance degrades, posing potential threats and hidden dangers to service operation;

the third category, technical consultation problems, does not affect customer service operation or test progress and covers requests for help with confirmation and log analysis, for example: consultation on system tools and command usage, consultation on system installation and configuration, and technical support for kernel upgrades and driver compilation.

The keyword library establishing module 2 establishes a problem classification keyword library according to the classification of system technical problems; the problem classification keyword library interfaces with the problem type identification module to match keywords and identify the problem type of a specific problem;

The problem type identification module 3 is used for interfacing with the problem classification keyword library and performing keyword identification on the specific problem description, the highest-level problem type identified being taken as the problem type of the problem;

the problem processing efficiency history library module 4 is used for storing the historical data of the total processing time of all similar problems in a classified manner according to different problem types, and calculating and updating the average processing time, the longest processing time and the shortest processing time of different types of problems in real time; these 3 data are used as reference data for evaluating the efficiency of problem processing.

The problem processing efficiency analysis module 5 is used for obtaining problem data from the data acquisition module and calculating the actual processing time of the problem; analyzing the processing efficiency of the problem according to the problem type identified by the problem type identification module and three data of the average processing time, the longest processing time and the shortest processing time of the type of problem obtained from the problem processing efficiency historical library module;

The problem processing efficiency data obtained by the analysis is stored in the problem processing efficiency history library module for subsequent system function expansion and data analysis.

The problem processing efficiency display module 6 is responsible for the system interface display and sends the problem processing efficiency data by e-mail to the problem handler, the handler's supervisor and other people who need the data, as a means of urging staff to improve their working efficiency and as a basis for assessing work results.

Example 3:

the present embodiment provides a computer storage medium having stored therein instructions that, when run on a computer, cause the computer to perform the method of embodiment 1 described above.

Example 4:

the present embodiment provides a terminal, including:

a processor and a memory, wherein

the memory is used for storing a computer program, and

the processor is configured to call the computer program from the memory and run it, so that the terminal executes the method described in embodiment 1 above.

Example 5:

the present embodiment provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of embodiment 1 above.

Although the present invention has been described in detail with reference to the drawings and in connection with the preferred embodiments, the present invention is not limited thereto. Those skilled in the art can make various equivalent modifications or substitutions to the embodiments of the present invention without departing from the spirit and scope of the present invention, and such modifications or substitutions, as well as any changes or substitutions that a person skilled in the art can readily conceive of within the technical scope disclosed herein, fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A server system fault acquisition and processing method is characterized by comprising the following steps:

S1: data acquisition step:

S4: problem processing efficiency recording step:

according to different problem types, storing historical data of total processing time of all similar problems in a classified manner, and calculating and updating average processing time, longest processing time and shortest processing time of different types of problems in real time;

S5: problem processing efficiency analysis step:

obtaining problem data from the data acquisition step, calculating the actual processing time of the problem, and analyzing the processing efficiency of the problem according to the identified problem type and the average, longest and shortest processing times of that problem type, wherein

if actual processing time = average processing time, then problem processing efficiency = 1;

if shortest processing time < actual processing time < average processing time, then problem processing efficiency = 1 + ((actual processing time - shortest processing time) / average processing time);

if average processing time < actual processing time < longest processing time, then problem processing efficiency = 1 - ((longest processing time - actual processing time) / average processing time);

storing the problem processing efficiency data obtained by the analysis for subsequent system function expansion and data analysis;

S6: problem processing efficiency display step:

displaying the system interface, and sending the problem processing efficiency data by e-mail to the problem handler, the handler's supervisor and other people who need the data.

2. The server system fault acquisition and processing method according to claim 1, wherein in step S1, the sources of data acquisition include existing problem processing systems and EXCEL tables that record the problem handling process; the data acquired includes the specific problem description, the time the problem was raised, the time processing started, the time of each round of replies, the reply content, the time the problem was closed and the final solution to the problem.

3. The method for collecting and processing the fault of the server system according to claim 2, wherein in the step S2, the keywords in the keyword library of the problem classification include but are not limited to the following fields:

downtime problems - downtime, automatic system restart, system installation failure, system startup error, automatic machine restart, blue screen, crash, system no response;

system analysis problems - service startup failure, severe error, warning, performance degradation, performance substandard;

technical consultation problems - tool usage, command usage, installation configuration, kernel upgrade, driver compilation.

4. A server system fault acquisition and processing device is characterized by comprising:

the problem processing efficiency history library module is used for storing the historical data of the total processing time of all similar problems in a classified manner according to different problem types, and calculating and updating the average processing time, the longest processing time and the shortest processing time of different types of problems in real time;

the problem processing efficiency analysis module is used for acquiring problem data from the data acquisition module and calculating the actual processing time of the problem; analyzing the processing efficiency of the problem according to the problem type identified by the problem type identification module and three data of the average processing time, the longest processing time and the shortest processing time of the type of problem obtained from the problem processing efficiency historical library module;

the problem processing efficiency data obtained by the analysis is stored in the problem processing efficiency history library module for subsequent system function expansion and data analysis;

and the problem processing efficiency display module is responsible for displaying the system interface and sending the problem processing efficiency data by e-mail to the problem handler, the handler's supervisor and other people who need the data.

5. The server system fault acquisition and processing device according to claim 4, wherein in the data acquisition module, the sources of data acquisition include existing problem processing systems and EXCEL tables that record the problem handling process; the data acquired includes the specific problem description, the time the problem was raised, the time processing started, the time of each round of replies, the reply content, the time the problem was closed and the final solution to the problem.

6. The server system fault collection and processing device according to claim 5, wherein in the keyword library establishing module, the keywords in the problem classification keyword library include but are not limited to the following fields:

downtime problems - downtime, automatic system restart, system installation failure, system startup error, automatic machine restart, blue screen, crash, system no response;

system analysis problems - service startup failure, severe error, warning, performance degradation, performance substandard;

technical consultation problems - tool usage, command usage, installation configuration, kernel upgrade, driver compilation; the technical problems of the system are thereby divided in detail.

7. A computer storage medium having stored therein instructions that, when run on a computer, cause the computer to perform the method of any one of claims 1-3.