CN112948132A - Vectorization method of cloud service event and service level contract data - Google Patents


Publication number
CN112948132A
Authority
CN
China
Prior art keywords
violation
state
cloud service
event
instance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110372833.0A
Other languages
Chinese (zh)
Other versions
CN112948132B (en)
Inventor
李肖坚
张翠萍
杨昊澎
黄程灵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Normal University
Original Assignee
Guangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Normal University
Priority to CN202110372833.0A
Publication of CN112948132A
Application granted
Publication of CN112948132B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications

Abstract

The invention discloses a vectorization method for cloud service events and their service level contract data. Starting from discrete fragment data of cloud service events and from contract data, the method formalizes the events and constructs their state elements; formalizes the contracts and extracts the violation elements and their indexes; maps contact tuples between event conditions and service level contracts; quantizes the state elements and their indexes; and generates a sequence of condition-index vector samples for the cloud service events and service level contracts, which serves as vectorized trace data of suspected violations in cloud server logs. The condition-index vector samples obtained by the method can be used for deep neural network judgment of event violations such as entity states or entity contacts, and can also be used for deep-learning-based intrusion detection, event investigation, tracking and tracing.

Description

Vectorization method of cloud service event and service level contract data
Technical Field
The invention relates to the technical field of network security of cloud servers, and in particular to a vectorization method for cloud service events and their service level contract data.
Background
A cloud server is a physical or virtual infrastructure that executes application programs and processes and stores information. Virtualization software divides a physical server into multiple virtual servers, and an Infrastructure as a Service (IaaS) architecture is used to process workloads and store information, so that users can remotely access the functions of the virtual servers through an online interface. An Alibaba Cloud server (Elastic Compute Service, ECS) is a simple, efficient IaaS-level cloud computing service with elastically scalable processing capacity; Fig. 1 shows the structure of a cloud server. A cloud service is the behavior in which a process of a cloud server replies with information to a request according to a specific protocol. A cloud service log is the trace record of service behavior.
Network security is the ability to protect the hardware, software, and data of a network system from attacks, intrusions, interference, damage, unauthorized access, and other unexpected incidents, to keep the system in a stable and reliable operating state, and to ensure the confidentiality, integrity, authenticity, availability, and non-repudiation of network data. In the ECS architecture, a security group is essentially a virtual firewall that restricts access traffic through rules (i.e., which applications can be accessed); it provides stateful inspection and packet filtering and is used to partition security domains in the cloud. Security group rules can allow or deny access to and from ECS instances over both the public network and the intranet. That is, by configuring security group rules, the inbound and outbound traffic of ECS instances within a security group can be controlled. An ECS instance must belong to at least one security group. When an instance is created, a security group must be selected for network access control.
Big data computing service (MaxCompute) is a cloud computing service developed independently by Alibaba for processing structured and semi-structured big data. It adopts an abstract job processing framework to unify the computing tasks of different scenarios on one platform, shares security, storage, data management and resource scheduling, and provides a unified programming interface and user interface for data processing tasks arising from different user requirements. The system supports SQL processing compatible with standard syntax, an extended MapReduce programming framework, and the like.
A Service Level Agreement (SLA) is a contract formally negotiated between a service provider and a customer to guarantee an agreed level of service. It specifies the service availability level indexes and compensation schemes for the cloud services that Alibaba Cloud provides to customers. A violation is behavior in which the cloud service provider or the customer fails to meet the contract. A cloud service provider may be held liable if the computing performance (e.g., availability, security) of a cloud service it provides does not meet the service contract requirements.
Cloud service events are generally stored in the form of cloud service logs. The log data is large in volume, discrete and discontinuous, and also contains non-numeric character strings. Cloud service level indexes are mostly described in natural language or text, and there is a semantic gap between them and the cloud service logs. Neither the service log data nor the level index data can participate directly in computation, which particularly hinders deep neural network judgment of event violations such as entity states or entity contacts.
Existing data vectorization methods have the following four defects when processing cloud service events and their level contract data:
First, vectorization methods that rely on context or on a vocabulary are constrained by structured data and by the vocabulary. They can vectorize only a limited set of character strings, cannot process large numbers of arbitrary character strings, and are unsuitable for unstructured data. They also adapt poorly and are time-consuming, because the model must be retrained whenever different data is input.
Second, traditional vectorization methods designed for time series data are unsuitable for non-time-series data.
Third, existing index quantification methods consider only a single kind of index, such as time or resource usage, and have difficulty combining and quantifying multiple indexes such as time, number, operation and resource usage.
Fourth, no vectorization method is specific to service events and their contract data.
Therefore, existing data vectorization methods cannot process discrete non-time-series event fragment data, cannot process multi-index contract data at the same time, and lack a path for converting log data into violation semantics; an index for converting log data into violation semantics needs to be constructed.
Disclosure of Invention
The technical problem to be solved is that existing methods cannot process discrete non-time-series event fragment data and multi-index contract data at the same time. The invention provides a vectorization method for cloud service events and their level contract data. First, the cloud service event is formalized and the state elements of the event are constructed; second, the service level contract is formalized and the violation elements and their indexes are extracted; third, the relation between event status and service level is mapped to obtain "state element-violation element" contact tuples; fourth, "status-index" contact tuples are constructed from the "state element-violation element" contact tuples and the indexes; fifth, the "status-index" contact tuples are quantized according to quantization rules; finally, "condition-index" vector samples of the cloud service event are generated.
The vectorization method for cloud service events and their level contract data has the following advantages:
First, by formalizing cloud service events and their contracts and constructing state elements and violation elements respectively, the method provides a way to convert the semantics of events and their contract data into violation semantics.
Second, the method does not depend on whether the log data is complete; it needs only discontinuous fragment data to generate its output.
Third, compared with vectorization methods that use only one dimension of a high-dimensional vector to describe the data, the method produces fewer redundant dimensions and places less pressure on computation and storage.
Fourth, the method can combine multiple indexes such as time, number, operation and resource usage to measure events.
Fifth, the output of the method can be used for violation judgment of entity state-change events or of contact events between entities.
Sixth, the method can be extended to intrusion detection, event investigation, tracking and tracing.
Drawings
Fig. 1 is a structure diagram of the cloud server ECS.
Fig. 2 is a mapping diagram of cloud service events and their level contracts according to the present invention.
Fig. 3 is a flowchart of the vectorization method for cloud service events and their level contract data according to the present invention.
Fig. 4 is a flowchart of the quantization of state elements in the present invention.
Fig. 5 shows the accuracy, precision, and recall of applying KNN to judge long-tail violations.
Fig. 6 shows the false positive rate of applying KNN to judge long-tail violations.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The objects processed by the invention are cloud service logs and their cloud service level contract data (SLA). Each cloud service log originates from modules such as the MR job module and the SQL job module in the storage/computing layer of the big data computing service. The big data computing service is described on page 11 of the Big Data Computing Service Product Introduction for the Alibaba Cloud proprietary cloud Enterprise Edition, dated 11/18/2020, product version v3.12.0. The version of the big data computing service MaxCompute service level contract used has an effective date of February 1, 2018.
In the invention, the SLA constrains the responsibility of the cloud service, and users submit requests to the cloud service in units of jobs (JOB). Therefore, the job is the unit to which the SLA can be mapped most completely. If the request of a JOB happens to be carried out by a single TASK, the responsibility can be mapped to that TASK.
Formalizing cloud service events
The cloud service log records the execution status of the job. A log records the operating conditions of an instance. A job is composed of one or more tasks, and a task is composed of one or more instances.
In the present invention, a job is denoted JOB and a task is denoted TASK.
A JOB comprises multiple tasks. The task set is written in set form as $M_{TA}$, with
$M_{TA} = \{TASK_1, TASK_2, \ldots, TASK_i, \ldots, TASK_m\}$
$TASK_1$ denotes the first task belonging to the JOB.
$TASK_2$ denotes the second task belonging to the JOB.
$TASK_i$ denotes the i-th task belonging to the JOB.
$TASK_m$ denotes the last task belonging to the JOB.
For convenience of explanation, the subscript i denotes the identification number of a task, so $TASK_i$ also refers to any one task. The subscript m denotes the total number of tasks.
In the invention, an instance is denoted INST. Any task $TASK_i$ has multiple instances, and its instance set is written in set form as
$\{INST_1, INST_2, \ldots, INST_j, \ldots, INST_n\}$
$INST_1$ denotes the first instance belonging to task $TASK_i$.
$INST_2$ denotes the second instance belonging to task $TASK_i$.
$INST_j$ denotes the j-th instance belonging to task $TASK_i$.
$INST_n$ denotes the last instance belonging to task $TASK_i$.
For convenience of explanation, the subscript j denotes the identification number of an instance, so $INST_j$ also refers to any one instance under the task. The subscript n denotes the total number of instances.
Field content of a cloud service event
In the present invention, the record of any instance $INST_j$ is taken as one cloud service event. The instance $INST_j$ then contains the following cloud service event fields:
start_time represents the start time of the instance.
end_time represents the end time of the instance.
machine_id represents the cloud server identification.
task_name represents the task name.
job_name represents the job name.
inst_name represents the instance name.
seq_no represents the number of instance retries.
total_seq_no represents the total number of instance retries.
status represents the state of the instance.
cpu_avg represents the average CPU utilization of the instance.
cpu_max represents the maximum CPU utilization of the instance.
mem_avg represents the average memory usage of the instance.
mem_max represents the maximum memory usage of the instance.
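The field list above can be pictured as a typed record. The following is a minimal sketch, not the patent's implementation: the class name `CloudServiceEvent`, the helper `parse_event`, the sample values, and the choice of `int`/`float`/`str` types are all assumptions made for illustration, since the source names the fields but does not state their types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudServiceEvent:
    """One log record, i.e. one run of one instance (field names from the text)."""
    start_time: int      # start time of the instance
    end_time: int        # end time of the instance
    machine_id: str      # cloud server identification
    task_name: str
    job_name: str
    inst_name: str
    seq_no: int          # number of instance retries
    total_seq_no: int    # total number of instance retries
    status: str          # state of the instance
    cpu_avg: float
    cpu_max: float
    mem_avg: float
    mem_max: float


# Assumed converters per field; every field not listed here is kept as a string.
_CONVERTERS = {
    "start_time": int, "end_time": int, "seq_no": int, "total_seq_no": int,
    "cpu_avg": float, "cpu_max": float, "mem_avg": float, "mem_max": float,
}


def parse_event(row: dict) -> CloudServiceEvent:
    """Build a typed event from a string-valued log row (hypothetical helper)."""
    typed = {name: _CONVERTERS.get(name, str)(value) for name, value in row.items()}
    return CloudServiceEvent(**typed)


# Made-up sample row, for illustration only.
row = {
    "start_time": "100", "end_time": "130", "machine_id": "m_1",
    "task_name": "M1", "job_name": "j_1", "inst_name": "inst_1",
    "seq_no": "1", "total_seq_no": "1", "status": "Terminated",
    "cpu_avg": "50.0", "cpu_max": "80.0", "mem_avg": "0.2", "mem_max": "0.3",
}
event = parse_event(row)
print(event.end_time - event.start_time)  # running period of this instance
```

Keeping the record immutable (`frozen=True`) reflects that a log entry is a trace of something that already happened; later steps derive state elements from it rather than modifying it.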
In the invention, the cloud service event field content is used to construct the cloud service event state element set EVENT_STATUS and the cloud service event violation element set VIOLATION.
In the invention, if a state element in the state element set EVENT_STATUS is a factor of a violation element in the violation element set VIOLATION, then that state element of the cloud service event is the violation element.
State element of cloud service event
In the present invention, each field of a cloud service event is a constituent of a sentence. Sentence component analysis is applied, and a double vertical line separates the subject part of the sentence from the predicate part. The subject-predicate expression of one cloud service event is denoted SYS_EVENT, with SYS_EVENT = [time period] (specific) instance || [retry] state <load>.
Referring to the mapping diagram of cloud service events and their level contracts in Fig. 2, in the present invention the cloud service event state element set consists of seven state elements and is denoted EVENT_STATUS, with
EVENT_STATUS = {TIME, LOCATION, NUMBER, RETRY, OPERATION, CPU, MEM}
In the present invention, the seven state elements are:
The duration state element TIME describes the duration state of the cloud service event and is denoted TIME = {start_time, end_time}.
The location state element LOCATION describes the location state of the cloud service event and is denoted LOCATION = {machine_id, job_name, task_name}.
The number state element NUMBER describes the number state of the cloud service event and is denoted NUMBER = {inst_name}.
The retry state element RETRY describes the retry state of the cloud service event and is denoted RETRY = {seq_no, total_seq_no}.
The operation state element OPERATION describes the operation state of the cloud service event and is denoted OPERATION = {status}.
The CPU load state element CPU describes the CPU load state of the cloud service event and is denoted CPU = {cpu_avg, cpu_max}.
The memory load state element MEM describes the memory load state of the cloud service event and is denoted MEM = {mem_avg, mem_max}.
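The grouping of fields into the seven state elements can be sketched as a lookup table plus a projection function. This is an illustrative sketch only: the names `STATE_ELEMENTS` and `to_state_elements`, the dictionary representation of an event, and the sample values are assumptions, while the element names and their member fields follow the definitions above.

```python
# State element name -> fields that make it up (as listed in the text).
STATE_ELEMENTS = {
    "TIME": ("start_time", "end_time"),
    "LOCATION": ("machine_id", "job_name", "task_name"),
    "NUMBER": ("inst_name",),
    "RETRY": ("seq_no", "total_seq_no"),
    "OPERATION": ("status",),
    "CPU": ("cpu_avg", "cpu_max"),
    "MEM": ("mem_avg", "mem_max"),
}


def to_state_elements(event: dict) -> dict:
    """Project one event (a dict keyed by field name) onto the seven state elements."""
    return {
        element: {field: event[field] for field in fields}
        for element, fields in STATE_ELEMENTS.items()
    }


# Made-up sample event, for illustration only.
sample_event = {
    "start_time": 100, "end_time": 130, "machine_id": "m_1",
    "task_name": "M1", "job_name": "j_1", "inst_name": "inst_1",
    "seq_no": 1, "total_seq_no": 1, "status": "Terminated",
    "cpu_avg": 50.0, "cpu_max": 80.0, "mem_avg": 0.2, "mem_max": 0.3,
}
state = to_state_elements(sample_event)
print(state["LOCATION"])
```

Every one of the thirteen fields lands in exactly one state element, so the projection loses no information from the log record.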
Violation element of cloud service event
In the invention, the set of violation elements of the cloud service event is denoted VIOLATION, with
VIOLATION = {vf_longTail, vf_location, vf_number, vf_retry, vf_operation, vf_cpu, vf_mem}
vf_longTail represents the instance-level duration element-violation element.
vf_location represents the instance-level location element-violation element.
vf_number represents the job-level number element-violation element.
vf_retry represents the instance-level retry element-violation element.
vf_operation represents the instance-level operation element-violation element.
vf_cpu represents the instance-level CPU load element-violation element.
vf_mem represents the instance-level memory load element-violation element.
In the invention, a violation element is an element that violates the cloud service event-status specification SLAS. The violation elements extracted from SLAS are assembled into the cloud service event violation element set VIOLATION.
Formalizing service level contracts
Referring to the mapping diagram of cloud service events and their level contracts in Fig. 2, a violation of a single cloud service event is reflected in the predicate and predicate-complement components of the sentence SYS_EVENT = [time period] (specific) instance || [retry] state <load>. A number violation across multiple cloud service events is reflected in the count of sentences of the form SYS_EVENT = [time period] (specific) instance || [retry] state <load>, e.g. the number state element NUMBER.
In the invention, the cloud service event-status specification SLAS is set according to the big data computing service MaxCompute service level contract. SLAS includes 7 specifications, expressed as a set:
SLAS = {sla4inst_time, sla4inst_location, sla4job_number, sla4inst_retry, sla4inst_operation, sla4inst_cpu, sla4inst_mem}
sla4inst_time represents the instance-level duration element specification. sla4inst_location represents the instance-level location element specification. sla4job_number represents the job-level number element specification. sla4inst_retry represents the instance-level retry element specification. sla4inst_operation represents the instance-level operation element specification. sla4inst_cpu represents the instance-level CPU load specification. sla4inst_mem represents the instance-level memory load specification.
Instance-level duration element specification
The instance-level duration element specification is denoted sla4inst_time. sla4inst_time is the specification for the duration state element TIME. Its natural-language expression is: if the running period of an instance in a task is greater than or equal to 3 times the average running period of all instances, the instance has a long tail.
The natural-language instance-level duration element specification is formalized as formula (1):
[Formulas (1) and (2): equation images in the original publication, not reproduced in this text]
v denotes the condition of the predicate decision.
When the predicate decision result of v is true, formula (1) is violated; this is recorded as a violation of the instance-level duration element specification, i.e. the state element is the violation element vf_longTail.
When the predicate decision result of v is not true, the instance-level duration element specification is satisfied.
The predicate IsLongTail(·) indicates that the running period of instance $INST_j$ is greater than or equal to the long-tail index.
longTail_metric denotes the long-tail index of instance $INST_j$.
n denotes the total number of instances $INST_j$ of task $TASK_i$.
end_time denotes the end time of instance $INST_j$.
start_time denotes the start time of instance $INST_j$.
end_time − start_time denotes the running period of instance $INST_j$.
The instance-level duration violation element and its index are extracted from sla4inst_time.
In sla4inst_time, the element involved in a suspected violation is the running period of the instance, which is reflected in the duration state element. Therefore, the duration element-violation element vf_longTail is the running period of the instance, denoted vf_longTail = <end_time, start_time>.
The instance long-tail violation index longTail_metric is 3 times the average running period of all instances, expressed as
$longTail\_metric = 3 \times \frac{1}{n} \sum_{j=1}^{n} (end\_time_j - start\_time_j)$
where $end\_time_j$ and $start\_time_j$ are the end time and start time of the j-th instance.
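The long-tail rule stated in natural language above (running period at least 3 times the average running period of all instances of the task) can be sketched as follows. This is an illustration under stated assumptions, not the patent's formula (1): the function names, the `(inst_name, start_time, end_time)` tuple layout, and the sample durations are all hypothetical.

```python
def long_tail_metric(durations):
    """Long-tail index: 3 times the average running period of all instances."""
    return 3 * sum(durations) / len(durations)


def long_tail_instances(instances):
    """Return the names of instances whose running period reaches the index.

    `instances` is a list of (inst_name, start_time, end_time) tuples that
    all belong to the same task (assumed layout).
    """
    if not instances:
        return []
    durations = [end - start for _, start, end in instances]
    metric = long_tail_metric(durations)
    return [
        name
        for (name, _, _), duration in zip(instances, durations)
        if duration >= metric
    ]


# Made-up task: nine instances run for 10 time units, one runs for 100.
task_instances = [(f"inst_{k}", 0, 10) for k in range(9)] + [("inst_9", 0, 100)]
print(long_tail_instances(task_instances))
```

One consequence of the rule is worth noting: with positive running periods, a task needs at least four instances before any single instance can reach 3 times the average, because one instance's duration can never exceed the sum of all durations.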
Instance-level location element specification
The instance-level location element specification is denoted sla4inst_location. sla4inst_location is the specification for the location state element LOCATION. Its natural-language expression is: if an instance on a cloud server is in a failed state, the instance disables that machine. If the instance subsequently uses the disabled machine, the instance is deemed to have a location violation.
The natural-language instance-level location element specification is formalized as formula (3):
[Formulas (3) and (4): equation images in the original publication, not reproduced in this text]
v denotes the condition of the predicate decision.
When the predicate decision result of v is true, formula (3) is violated; this is recorded as a violation of the instance-level location element specification, i.e. the state element is the violation element vf_location.
When the predicate decision result of v is not true, the instance-level location element specification is satisfied.
The predicate IsUsed(·) indicates that the state of instance $INST_j$ is Used (whether the cloud server is in an available state).
The predicate IsFailed(·) indicates that the state of instance $INST_j$ is Failed.
forbidden_machine denotes the identification of the disabled cloud server.
machine_id denotes the identification of the cloud server running instance $INST_j$.
A further machine identification denotes another cloud server running instance $INST_j$.
status denotes the state of instance $INST_j$.
The instance-level location violation element and its index are extracted from sla4inst_location.
In sla4inst_location, the element involved in a violation is the identification of the cloud server on which the instance runs, which is reflected in the location state element. The job name and task name can uniquely identify the instance. Therefore, the location element-violation element vf_location is expressed as vf_location = {machine_id, job_name, task_name}.
The index location_metric of the location violation is the device id of the disabled machine, denoted location_metric = machine_id.
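One possible reading of the location rule can be sketched in code. The interpretation here is an assumption, not the patent's formula (3): a machine becomes forbidden for an instance once an attempt of that instance ends in the Failed state on it, and any later attempt of the same instance on a forbidden machine counts as a location violation. Keying an instance by (job_name, task_name, inst_name), ordering attempts by start_time, and the function name `location_violations` are also illustrative choices.

```python
def location_violations(attempts):
    """Return the attempts that run on a machine already disabled for that instance.

    `attempts` is a list of dicts with the keys job_name, task_name,
    inst_name, machine_id, status and start_time (assumed layout).
    """
    forbidden = {}   # instance key -> set of machine ids disabled for it
    violations = []
    for attempt in sorted(attempts, key=lambda a: a["start_time"]):
        key = (attempt["job_name"], attempt["task_name"], attempt["inst_name"])
        machines = forbidden.setdefault(key, set())
        if attempt["machine_id"] in machines:
            violations.append(attempt)          # reused a disabled machine
        if attempt["status"] == "Failed":
            machines.add(attempt["machine_id"])  # disable it from now on
    return violations


# Made-up attempts of one instance: it fails on m_1, runs on m_2, returns to m_1.
base = {"job_name": "j_1", "task_name": "M1", "inst_name": "inst_1"}
attempts = [
    {**base, "machine_id": "m_1", "status": "Failed", "start_time": 0},
    {**base, "machine_id": "m_2", "status": "Terminated", "start_time": 10},
    {**base, "machine_id": "m_1", "status": "Terminated", "start_time": 20},
]
violating = location_violations(attempts)
print([a["start_time"] for a in violating])
```

The disabled machine id recorded per instance plays the role that the text assigns to location_metric.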
Job-level number element specification
The job-level number element specification is denoted sla4job_number. sla4job_number is the specification for the number state element NUMBER. Its natural-language expression is: if the number of Reduce instances of a job exceeds 2000, or the number of Map instances of a job exceeds 8000, the number of instances of the job is over the limit.
The natural-language job-level number element specification is formalized as formula (5):
[Formulas (5) to (7): equation images in the original publication, not reproduced in this text]
rNumber_metric = 2000 (8)
mNumber_metric = 8000 (9)
v denotes the condition of the predicate decision.
When the predicate decision result of v is true, formula (5) is violated; this is recorded as a violation of the job-level number element specification, i.e. the state element is the violation element vf_number.
When the predicate decision result of v is not true, the job-level number element specification is satisfied.
The predicate IsOverRedNumber(·) indicates that the number of Reduce instances of a job exceeds its number index.
The predicate IsOverMapNumber(·) indicates that the number of Map instances of a job exceeds its number index.
The predicate IsReduceTask(·) indicates that the argument is a Reduce task.
The predicate IsMapTask(·) indicates that the argument is a Map task.
rInstNumberOfJob denotes the number of Reduce instances in a job.
mInstNumberOfJob denotes the number of Map instances in a job.
rNumber_metric denotes the index for the number of Reduce instances in a job.
mNumber_metric denotes the index for the number of Map instances in a job.
task_name denotes the name of the task to which the instance belongs.
The job-level number violation element and its index are extracted from sla4job_number.
In sla4job_number, the element involved in a suspected violation is the number of instances, which is reflected in the number state element. Therefore, the number element-violation element vf_number is expressed as vf_number = {inst_name}.
The number violation index states that the number of Reduce instances of the job exceeds 2000 or the number of Map instances of the job exceeds 8000, so the job-level number violation index number_metric is expressed as number_metric = {2000, 8000}.
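The job-level number rule can be sketched by counting distinct Reduce and Map instances per job and comparing them with the two thresholds. The thresholds 2000 and 8000 come from the text; everything else is assumed for illustration. In particular, the source does not say how IsReduceTask and IsMapTask recognize a task, so this sketch uses a hypothetical convention in which Reduce task names start with "R" and Map task names start with "M".

```python
from collections import defaultdict

R_NUMBER_METRIC = 2000  # rNumber_metric, formula (8)
M_NUMBER_METRIC = 8000  # mNumber_metric, formula (9)


def is_reduce_task(task_name: str) -> bool:
    # Assumption: Reduce task names start with "R"; not stated in the source.
    return task_name.startswith("R")


def is_map_task(task_name: str) -> bool:
    # Assumption: Map task names start with "M"; not stated in the source.
    return task_name.startswith("M")


def number_violation_jobs(events):
    """Return the sorted names of jobs whose instance counts exceed an index.

    `events` is a list of dicts with job_name, task_name and inst_name.
    Retries of the same instance are counted once.
    """
    reduce_insts = defaultdict(set)
    map_insts = defaultdict(set)
    for e in events:
        inst_key = (e["task_name"], e["inst_name"])
        if is_reduce_task(e["task_name"]):
            reduce_insts[e["job_name"]].add(inst_key)
        elif is_map_task(e["task_name"]):
            map_insts[e["job_name"]].add(inst_key)
    jobs = set(reduce_insts) | set(map_insts)
    return sorted(
        job for job in jobs
        if len(reduce_insts.get(job, ())) > R_NUMBER_METRIC
        or len(map_insts.get(job, ())) > M_NUMBER_METRIC
    )


# Made-up events: job j_1 has 2001 Reduce instances, job j_2 has 5 Map instances.
events = [
    {"job_name": "j_1", "task_name": "R2_1", "inst_name": f"inst_{k}"}
    for k in range(2001)
] + [
    {"job_name": "j_2", "task_name": "M1", "inst_name": f"inst_{k}"}
    for k in range(5)
]
print(number_violation_jobs(events))
```

Unlike the instance-level rules, this check needs all events of a job at once, which matches the text's description of the number violation as a property of multiple events.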
Instance-level retry element specification
The instance-level retry element specification is denoted sla4inst_retry. sla4inst_retry is the specification for the retry state element RETRY. Its natural-language expression is: if the number of retries exceeds 3 after a Map instance or Reduce instance fails, the retries of the instance are over the limit.
The natural-language instance-level retry element specification is formalized as formula (10):
[Formula (10): equation image in the original publication, not reproduced in this text]
v denotes the condition of the predicate decision.
When the predicate decision result of v is true, formula (10) is violated; this is recorded as a violation of the instance-level retry element specification, i.e. the state element is the violation element vf_retry.
When the predicate decision result of v is not true, the instance-level retry element specification is satisfied.
The predicate IsOverRetry(·) indicates that the number of retries or the total number of retries of an instance exceeds the retry violation index.
The predicate IsFailed(·) indicates that the state of instance $INST_j$ is Failed.
The predicate IsReduceTask(·) indicates that the argument is a Reduce task.
The predicate IsMapTask(·) indicates that the argument is a Map task.
seq_no and total_seq_no denote the number of retries and the total number of retries of instance $INST_j$.
retry_metric denotes the retry violation index.
The instance-level retry violation element and its index are extracted from sla4inst_retry.
In sla4inst_retry, the element involved in a suspected violation is the number of retries or the total number of retries of the instance, which is reflected in the retry state element. Therefore, the retry element-violation element vf_retry is expressed as vf_retry = {seq_no, total_seq_no}. Its retry violation index is retry_metric = 3.
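The retry rule can be sketched as a conjunction of three checks: the instance belongs to a Map or Reduce task, it is in the Failed state, and its retry count or total retry count exceeds the index 3. This is an illustrative sketch rather than the patent's formula (10). The function names and the sample events are hypothetical, and recognizing Map and Reduce tasks by an "M" or "R" name prefix is an assumption, since the source does not define IsMapTask and IsReduceTask concretely.

```python
RETRY_METRIC = 3  # retry violation index from the text


def is_over_retry(seq_no: int, total_seq_no: int, metric: int = RETRY_METRIC) -> bool:
    """True when the retry count or the total retry count exceeds the index."""
    return seq_no > metric or total_seq_no > metric


def is_retry_violation(event: dict) -> bool:
    """Check one event (a dict with task_name, status, seq_no, total_seq_no)."""
    # Assumption: Map/Reduce tasks are recognized by an "M" or "R" name prefix.
    is_map_or_reduce = event["task_name"].startswith(("M", "R"))
    return (
        is_map_or_reduce
        and event["status"] == "Failed"
        and is_over_retry(event["seq_no"], event["total_seq_no"])
    )


# Made-up events, for illustration only.
failed_often = {"task_name": "M1", "status": "Failed", "seq_no": 4, "total_seq_no": 4}
failed_once = {"task_name": "R2_1", "status": "Failed", "seq_no": 1, "total_seq_no": 1}
print(is_retry_violation(failed_often), is_retry_violation(failed_once))
```

The comparison is strict (`>`), following the wording "exceeds 3", so an instance with exactly three retries is not flagged.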
Instance-level operation element specification
The instance-level operation element specification is denoted sla4inst_operation. It is the specification of the operation state element OPERATION. Expressed in natural language: if an instance fails for reasons of its own (e.g., it exceeds the allowed number of restarts) and is in the Failed state, the instance operation is not in violation; if an instance is interrupted by the system because its CPU or memory load exceeds its limit and is in the Interrupted state, the instance operation is in violation.
The natural-language statement of the instance-level operation element specification is formalized as equation (12):
Figure BDA0003010018470000101
retry_metric=3 (13)
v denotes a condition of predicate decision.
When the predicate decision result of v is
Figure BDA0003010018470000102
then equation (12) is violated, which is recorded as a violation of the instance-level operation element specification; that is, the state element is the violation element vf_operation.
When the predicate decision result of v is not
Figure BDA0003010018470000111
then the instance-level operation element specification is satisfied.
The predicate IsFailed(·) indicates that the state of instance
Figure BDA0003010018470000112
is Failed.
The predicate IsOverRetry(·) indicates that the number of retries or the total number of retries of an instance exceeds the retry violation index.
The predicate IsReduceTask(·) indicates that the argument is a Reduce task.
The predicate IsMapTask(·) indicates that the argument is a Map task.
The predicate IsInterrupted(·) indicates that the state of instance
Figure BDA0003010018470000113
is Interrupted.
The predicate IsOverPlanCPU(·) indicates that instance
Figure BDA0003010018470000114
exceeds the planned CPU load limit.
The predicate IsOverPlanEM(·) indicates that instance
Figure BDA0003010018470000115
exceeds the planned memory load limit.
Figure BDA0003010018470000116
denotes, for the instance
Figure BDA0003010018470000117
its CPU load.
Figure BDA0003010018470000118
denotes, for the instance
Figure BDA0003010018470000119
its memory load.
Figure BDA00030100184700001110
denotes, for the instance
Figure BDA00030100184700001111
its number of retries and total number of retries.
The instance-level operation violation element and its index are extracted from sla4inst_operation.
In sla4inst_operation, the element suspected of a violation is the state of the instance, which is reflected in the operation state element. Therefore, the "operation element-violation element" vf_operation is expressed as vf_operation = {status}.
Among the instance states, Failed and Interrupted are the manifestations of an operation violation. The instance-level operation violation index is therefore operation_metric = {'Failed', 'Interrupted'}.
Instance-level CPU load element specification
The instance-level CPU load element specification is denoted sla4inst_cpu. It is the specification of the CPU load state element CPU. Expressed in natural language: if the CPU load of an instance exceeds its limit (e.g., the planned CPU load), the instance's CPU load is over limit.
The natural-language statement of the instance-level CPU load element specification is formalized as equation (14):
Figure BDA00030100184700001112
v denotes a condition of predicate decision.
When the predicate decision result of v is
Figure BDA00030100184700001113
then equation (14) is violated, which is recorded as a violation of the instance-level CPU load element specification; that is, the state element is the violation element vf_cpu.
When the predicate decision result of v is not
Figure BDA00030100184700001114
then the instance-level CPU load element specification is satisfied.
The predicate IsOverPlanCPU(·) indicates that instance
Figure BDA0003010018470000121
exceeds the planned CPU load limit.
The instance-level CPU load violation element and its index are extracted from sla4inst_cpu.
In sla4inst_cpu, the element suspected of a violation is the CPU load of the instance, which is reflected in the CPU load state element. Therefore, the "CPU load element-violation element" vf_cpu is expressed as vf_cpu = {cpu_avg, cpu_max}.
The index of a CPU load violation is the CPU load limit plan_cpu that the system allocates to the instance, so the instance-level CPU load violation index is cpu_metric = {plan_cpu}.
Instance-level memory load element specification
The instance-level memory load element specification is denoted sla4inst_mem. It is the specification of the memory load state element MEM. Expressed in natural language: if the memory load of an instance exceeds its limit (e.g., the planned memory load), the instance's memory load is over limit.
The natural-language statement of the instance-level memory load element specification is formalized as equation (15):
Figure BDA0003010018470000122
v denotes a condition of predicate decision.
When the predicate decision result of v is
Figure BDA0003010018470000123
then equation (15) is violated, which is recorded as a violation of the instance-level memory load element specification; that is, the state element is the violation element vf_mem.
When the predicate decision result of v is not
Figure BDA0003010018470000124
then the instance-level memory load element specification is satisfied.
The predicate IsOverPlanEM(·) indicates that instance
Figure BDA0003010018470000125
exceeds the planned memory load limit.
The instance-level memory load violation element and its index are extracted from sla4inst_mem.
In sla4inst_mem, the element suspected of a violation is the memory load of the instance, which is reflected in the memory load state element. Therefore, the "memory load element-violation element" vf_mem is expressed as vf_mem = {mem_avg, mem_max}.
The index of a memory load violation is the memory load limit plan_mem that the system allocates to the instance, so the instance-level memory load violation index is mem_metric = {plan_mem}.
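The CPU and memory load checks have the same shape, which the following minimal Python sketch illustrates. The formalized equations (14) and (15) are formula images, so the rule used here (a violation when either the average or the maximum load exceeds the planned limit) is an assumption, and the function names are hypothetical.

```python
def is_over_plan(load_avg: float, load_max: float, plan_limit: float) -> bool:
    """Return True when the average or maximum load exceeds the planned limit.

    Assumption: exceeding on either field counts as over limit.
    """
    return load_avg > plan_limit or load_max > plan_limit


def is_over_plan_cpu(cpu_avg: float, cpu_max: float, plan_cpu: float) -> bool:
    # Checks vf_cpu = {cpu_avg, cpu_max} against cpu_metric = {plan_cpu}.
    return is_over_plan(cpu_avg, cpu_max, plan_cpu)


def is_over_plan_mem(mem_avg: float, mem_max: float, plan_mem: float) -> bool:
    # Checks vf_mem = {mem_avg, mem_max} against mem_metric = {plan_mem}.
    return is_over_plan(mem_avg, mem_max, plan_mem)


# Hypothetical values, for illustration only.
print(is_over_plan_cpu(cpu_avg=40.0, cpu_max=120.0, plan_cpu=100.0))
print(is_over_plan_mem(mem_avg=0.2, mem_max=0.3, plan_mem=0.5))
```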
In the present invention, the cloud service log comes from the MR job module and the SQL job module in the storage/computation layer of the Alibaba Cloud big data computing service (MaxCompute). It records the job execution of the cloud servers in the Alibaba Cloud storage/computing cluster. The service level contract is the version that took effect on February 1, 2018.
Referring to Fig. 3, the vectorization method of cloud service events and their service level contract data according to the present invention includes the following steps:
Step one, formalizing the cloud service event;
Step 101, collecting cloud server logs;
Collect the log records of the jobs executed by the cloud server. A job in the log records is denoted JOB; a job contains multiple tasks, and the task set is expressed in set form as
Figure BDA0003010018470000126
Figure BDA0003010018470000127
contains multiple instances, and the instance set is expressed in set form as
Figure BDA0003010018470000128
The cloud service log records processed by the invention come from the Alibaba cluster trace v2018 data set.
Step 102, setting the field contents of the cloud service event;
Any instance under a task,
Figure BDA0003010018470000129
is recorded as one cloud service event, and the field content of that cloud service event is denoted
Figure BDA00030100184700001210
The above-mentioned
Figure BDA0003010018470000131
In
Figure BDA0003010018470000132
the subscript i denotes the identification number of the task, and the subscript j denotes the identification number of the instance.
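One cloud service event can be pictured as a record with the thirteen fields that this document names (start_time through mem_max). The following Python sketch is illustrative only: the class name, the field order, the field types, and the sample values are assumptions, since the field tuple itself is given as a formula image.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CloudServiceEvent:
    """One instance of a task, recorded as one cloud service event."""
    start_time: int      # start time of the instance
    end_time: int        # end time of the instance
    machine_id: str      # cloud server identifier
    task_name: str       # task name
    job_name: str        # job name
    inst_name: str       # instance name
    seq_no: int          # number of instance retries
    total_seq_no: int    # total number of instance retries
    status: str          # state of the instance
    cpu_avg: float       # average CPU utilization
    cpu_max: float       # maximum CPU utilization
    mem_avg: float       # average memory usage
    mem_max: float       # maximum memory usage


# Hypothetical values, not taken from the data set.
event = CloudServiceEvent(
    start_time=100, end_time=160, machine_id="m_1", task_name="M1",
    job_name="j_1", inst_name="inst_1", seq_no=1, total_seq_no=1,
    status="Failed", cpu_avg=40.0, cpu_max=90.0, mem_avg=0.2, mem_max=0.3,
)
print([f.name for f in fields(event)])
```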
Step two, constructing the state elements of the cloud service event;
Step 201, expressing the cloud service event as a subject-predicate sentence;
In the present invention, each field of the cloud service event
Figure BDA0003010018470000133
is a constituent element of a sentence. Applying sentence-structure constituent analysis, the subject part and the predicate part of the sentence are separated by a double vertical line. The subject-predicate expression of one cloud service event
Figure BDA0003010018470000134
is denoted SYS_EVENT, where SYS_EVENT = [time period](specific)instance || [retry]state<load>.
Step 202, representing the state of the cloud service event by state elements;
In the present invention, using the subject-predicate sentence pattern SYS_EVENT = [time period](specific)instance || [retry]state<load>, the field semantics of each instance in the instance set
Figure BDA0003010018470000135
are divided into sentence constituents, and the state element set of the cloud service event is constructed and denoted EVENT_STATUS, where:
Figure BDA0003010018470000136
In the present invention, the duration state element TIME of a cloud service event describes the duration state of the event: TIME = {start_time, end_time}.
In the present invention, the location state element LOCATION of a cloud service event describes the location state of the event: LOCATION = {machine_id, job_name, task_name}.
In the present invention, the number state element NUMBER of a cloud service event describes the number state of the event: NUMBER = {inst_name}.
In the present invention, the retry state element RETRY of a cloud service event describes the retry state of the event: RETRY = {seq_no, total_seq_no}.
In the present invention, the operation state element OPERATION of a cloud service event describes the operation state of the event: OPERATION = {status}.
In the present invention, the CPU load state element CPU of a cloud service event describes the CPU load state of the event: CPU = {cpu_avg, cpu_max}.
In the present invention, the memory load state element MEM of a cloud service event describes the memory load state of the event: MEM = {mem_avg, mem_max}.
For the field content of any cloud service event
Figure BDA0003010018470000137
the constructed cloud service event state element set
Figure BDA0003010018470000141
is:
Figure BDA0003010018470000142
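The grouping of event fields into the seven state elements can be sketched in Python as follows. The field groupings are those listed in the text; the dictionary layout, the function name, and the sample event are assumptions made for illustration.

```python
# Field groupings of the seven state elements, as listed in the text.
EVENT_STATUS_FIELDS = {
    "TIME": ("start_time", "end_time"),
    "LOCATION": ("machine_id", "job_name", "task_name"),
    "NUMBER": ("inst_name",),
    "RETRY": ("seq_no", "total_seq_no"),
    "OPERATION": ("status",),
    "CPU": ("cpu_avg", "cpu_max"),
    "MEM": ("mem_avg", "mem_max"),
}


def build_event_status(event: dict) -> dict:
    """Group the flat fields of one event into its state elements."""
    return {
        element: {name: event[name] for name in names}
        for element, names in EVENT_STATUS_FIELDS.items()
    }


# Hypothetical event, not taken from the data set.
sample_event = {
    "start_time": 100, "end_time": 160, "machine_id": "m_1",
    "job_name": "j_1", "task_name": "M1", "inst_name": "inst_1",
    "seq_no": 1, "total_seq_no": 1, "status": "Running",
    "cpu_avg": 40.0, "cpu_max": 90.0, "mem_avg": 0.2, "mem_max": 0.3,
}
print(build_event_status(sample_event)["RETRY"])
```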
Step three, formalizing the service level contract of the cloud service event;
In the present invention, based on the service level contract of the big data computing service MaxCompute and combined with the field content of the cloud service event
Figure BDA0003010018470000143
the cloud service event-situation specification SLAS is constructed, which comprises 7 specifications.
The above-mentioned
Figure BDA0003010018470000144
Instance-level duration element specification
Figure BDA0003010018470000145
Instance-level location element specification
Figure BDA0003010018470000146
Job-level number element specification
Figure BDA0003010018470000147
Instance-level retry element specification
Figure BDA0003010018470000148
Instance-level operation element specification
Figure BDA0003010018470000151
Instance-level CPU load element specification
Figure BDA0003010018470000152
Instance-level memory load element specification
Figure BDA0003010018470000153
Step four, extracting violation elements;
In the present invention, the elements appearing in the specifications formulated in step three are taken as violation elements.
In the present invention, a violation element is an element, referred to in the cloud service event-situation specification
Figure BDA0003010018470000154
that violates the specification. The violation elements extracted from SLAS are assembled into the cloud service event violation element set
Figure BDA0003010018470000155
In the present invention, the element that violates the instance-level duration element specification sla4inst_time is called the "duration element-violation element" vf_longTail: vf_longTail = <end_time, start_time>.
In the present invention, the element that violates the instance-level location element specification sla4inst_location is called the "location element-violation element" vf_location: vf_location = {machine_id, job_name, task_name}.
In the present invention, the element that violates the job-level number element specification sla4job_number is called the "number element-violation element" vf_number: vf_number = {inst_name}.
In the present invention, the element that violates the instance-level retry element specification sla4inst_retry is called the "retry element-violation element" vf_retry: vf_retry = {seq_no, total_seq_no}.
In the present invention, the element that violates the instance-level operation element specification sla4inst_operation is called the "operation element-violation element" vf_operation: vf_operation = {status}.
In the present invention, the element that violates the instance-level CPU load element specification sla4inst_cpu is called the "CPU load element-violation element" vf_cpu: vf_cpu = {cpu_avg, cpu_max}.
In the present invention, the element that violates the instance-level memory load element specification sla4inst_mem is called the "memory load element-violation element" vf_mem: vf_mem = {mem_avg, mem_max}.
In the present invention, a violation is any behavior that fails to meet the cloud service event-situation specification
Figure BDA0003010018470000161
An event violation means that if the cloud service event
Figure BDA0003010018470000162
violates the specification
Figure BDA0003010018470000163
then
Figure BDA0003010018470000164
is in violation.
In the present invention, a violation element is a factor that violates the specification
Figure BDA0003010018470000165
Violation elements reveal the nature of the violation of the cloud service event
Figure BDA0003010018470000166
Therefore, to generate the vector samples needed to determine violations accurately, the factors (i.e., violation elements) by which the cloud service event
Figure BDA0003010018470000167
violates the specification must be found; these become the violation elements of the cloud service event
Figure BDA0003010018470000168
To consider from multiple aspects the factors by which a cloud service event may be suspected of a violation, the invention constructs the cloud service event state element set
Figure BDA0003010018470000171
Step five, extracting indexes;
In the present invention, from the cloud service event-situation specification
Figure BDA0003010018470000172
and the cloud service event violation element set
Figure BDA0003010018470000173
the violation limit values are extracted as violation indexes, yielding the specification-index set METRIC.
The above-mentioned
Figure BDA0003010018470000174
In the present invention, from the instance-level duration element specification sla4inst_time and the "duration element-violation element" vf_longTail, the duration violation index longTail_metric is extracted:
Figure BDA0003010018470000175
In the present invention, from the instance-level location element specification sla4inst_location and the "location element-violation element" vf_location, the location violation index location_metric is extracted: location_metric = {machine_id}.
In the present invention, from the job-level number element specification sla4job_number and the "number element-violation element" vf_number, the number violation index number_metric is extracted: number_metric = {2000, 8000}.
In the present invention, from the instance-level retry element specification sla4inst_retry and the "retry element-violation element" vf_retry, the retry violation index retry_metric is extracted: retry_metric = {3}.
In the present invention, from the instance-level operation element specification sla4inst_operation and the "operation element-violation element" vf_operation, the operation violation index operation_metric is extracted: operation_metric = {'Failed', 'Interrupted'}.
In the present invention, from the instance-level CPU load element specification sla4inst_cpu and the "CPU load element-violation element" vf_cpu, the CPU load violation index cpu_metric is extracted: cpu_metric = {plan_cpu}.
In the present invention, from the instance-level memory load element specification sla4inst_mem and the "memory load element-violation element" vf_mem, the memory load violation index mem_metric is extracted: mem_metric = {plan_mem}.
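The seven violation indexes can be collected into one structure, as in the following Python sketch. The index values are those stated in the text; longTail_metric is defined by a formula image and is deliberately left unfilled. The dictionary layout, the helper resolve_metric, and the sample context values are assumptions for illustration.

```python
# Specification-index set METRIC. Strings such as "plan_cpu" are symbolic
# field names whose values come from the data; numbers are fixed limits.
METRIC = {
    "longTail_metric": None,  # defined by a formula image; left unfilled here
    "location_metric": ("machine_id",),
    "number_metric": (2000, 8000),
    "retry_metric": (3,),
    "operation_metric": ("Failed", "Interrupted"),
    "cpu_metric": ("plan_cpu",),
    "mem_metric": ("plan_mem",),
}


def resolve_metric(metric: dict, context: dict) -> dict:
    """Replace symbolic field names with the values found in context.

    Entries that are not keys of context (numbers, state names) are kept.
    """
    resolved = {}
    for name, entries in metric.items():
        if entries is None:
            resolved[name] = None
            continue
        resolved[name] = tuple(
            context.get(e, e) if isinstance(e, str) else e for e in entries
        )
    return resolved


# Hypothetical context values, not taken from the data set.
context = {"machine_id": "m_1", "plan_cpu": 100.0, "plan_mem": 0.5}
print(resolve_metric(METRIC, context)["cpu_metric"])
```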
Step six, mapping and constructing the "state element-violation element" relation tuples;
In the present invention, from the cloud service event state element set
Figure BDA0003010018470000181
and the cloud service event violation element set
Figure BDA0003010018470000182
the "state element-violation element" relation tuples are mapped, yielding the set of "state element-violation element" relation tuples, denoted PSV.
Figure BDA0003010018470000183
PSV_TIME denotes the "duration state element-duration violation element" relation tuple.
PSV_LOCATION denotes the "location state element-location violation element" relation tuple.
PSV_NUMBER denotes the "number state element-number violation element" relation tuple.
PSV_RETRY denotes the "retry state element-retry violation element" relation tuple.
PSV_OPERATION denotes the "operation state element-operation violation element" relation tuple.
PSV_CPU denotes the "CPU load state element-CPU load violation element" relation tuple.
PSV_MEM denotes the "memory load state element-memory load violation element" relation tuple.
In the present invention, from the duration state element TIME of an event and its violation element vf_longTail, the relation tuple PSV_TIME is mapped:
PSV_TIME = (end_time, start_time).
In the present invention, from the location state element LOCATION of an event and its violation element vf_location, the relation tuple PSV_LOCATION is mapped:
PSV_LOCATION = (machine_id, job_name, task_name).
In the present invention, from the number state element NUMBER of an event and its violation element vf_number, the relation tuple PSV_NUMBER is mapped:
PSV_NUMBER = (inst_name).
In the present invention, from the retry state element RETRY of an event and its violation element vf_retry, the relation tuple PSV_RETRY is mapped:
PSV_RETRY = (seq_no, total_seq_no).
In the present invention, from the operation state element OPERATION of an event and its violation element vf_operation, the relation tuple PSV_OPERATION is mapped:
PSV_OPERATION = (status).
In the present invention, from the CPU load state element CPU of an event and its violation element vf_cpu, the relation tuple PSV_CPU is mapped:
PSV_CPU = (cpu_avg, cpu_max).
In the present invention, from the memory load state element MEM of an event and its violation element vf_mem, the relation tuple PSV_MEM is mapped:
PSV_MEM = (mem_avg, mem_max).
In the present invention, if a state element of a cloud service event is a factor involved in violating the specification, that state element is a violation element.
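The mapping from state elements to violation elements can be sketched in Python. The field lists below are the ones given in the text; the rule that a relation tuple is formed only when every field of the violation element belongs to the matching state element is an interpretation of the last sentence above, and the names are hypothetical.

```python
# State elements and their fields, as listed in the text.
STATE_ELEMENTS = {
    "TIME": {"start_time", "end_time"},
    "LOCATION": {"machine_id", "job_name", "task_name"},
    "NUMBER": {"inst_name"},
    "RETRY": {"seq_no", "total_seq_no"},
    "OPERATION": {"status"},
    "CPU": {"cpu_avg", "cpu_max"},
    "MEM": {"mem_avg", "mem_max"},
}

# Violation elements, keyed by the state element they are paired with.
VIOLATION_ELEMENTS = {
    "TIME": ("end_time", "start_time"),                    # vf_longTail
    "LOCATION": ("machine_id", "job_name", "task_name"),   # vf_location
    "NUMBER": ("inst_name",),                              # vf_number
    "RETRY": ("seq_no", "total_seq_no"),                   # vf_retry
    "OPERATION": ("status",),                              # vf_operation
    "CPU": ("cpu_avg", "cpu_max"),                         # vf_cpu
    "MEM": ("mem_avg", "mem_max"),                         # vf_mem
}


def build_psv() -> dict:
    """Map each state element to its relation tuple PSV_<NAME>."""
    psv = {}
    for name, vf_fields in VIOLATION_ELEMENTS.items():
        # Assumed rule: keep the tuple only if its fields are state fields.
        if set(vf_fields) <= STATE_ELEMENTS[name]:
            psv["PSV_" + name] = vf_fields
    return psv


print(build_psv()["PSV_TIME"])
```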
Step seven, constructing the "state element-index" relation tuples;
Step 701, from the mapped set of "state element-violation element" relation tuples
Figure BDA0003010018470000191
and the extracted specification-index set
Figure BDA0003010018470000192
the set of "state element-index" relation tuples of the cloud service event is constructed, denoted PSM.
The above-mentioned
Figure BDA0003010018470000193
In the present invention, a duration state element-duration violation element index tuple
Figure BDA0003010018470000201
In the present invention, a location state element-location violation element index tuple
Figure BDA0003010018470000202
In the present invention, a number state element-number violation element index tuple
Figure BDA0003010018470000203
In the present invention, retry state element-retry violation element index tuple
Figure BDA0003010018470000204
In the present invention, an operation state element-operation violation element index tuple
Figure BDA0003010018470000205
In the invention, CPU load state element-CPU load violation element index tuple
Figure BDA0003010018470000206
In the present invention, the memory load state element-memory load violation element index tuple
Figure BDA0003010018470000207
Step 702, from the set of "state element-index" relation tuples
Figure BDA0003010018470000208
the Cartesian product of the state events and the indexes is taken to construct the "status-index" relation tuple of the cloud service event, denoted RSM:
RSM = EVENT_STATUS × METRIC
EVENT_STATUS denotes the instance state events.
METRIC denotes the event violation indexes.
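A Cartesian product of state events and indexes pairs every event state with every index record. The following Python sketch shows this pairing under the assumption that each element of RSM is one (event state, index) pair; the record contents are hypothetical and heavily abbreviated.

```python
from itertools import product

# Hypothetical, abbreviated records for illustration only.
event_statuses = [
    {"inst_name": "inst_1", "status": "Failed"},
    {"inst_name": "inst_2", "status": "Terminated"},
]
metrics = [
    {"retry_metric": 3, "operation_metric": ("Failed", "Interrupted")},
]

# RSM = EVENT_STATUS x METRIC: every state event paired with every index record.
rsm = list(product(event_statuses, metrics))

for status, metric in rsm:
    violated = status["status"] in metric["operation_metric"]
    print(status["inst_name"], violated)
```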
In the present invention, from the constructed "status-index" relation tuple of the cloud service event
Figure BDA0003010018470000211
and the relations among the event's state elements, the violation elements of the specification, and their indexes, it follows that a cloud service event is in violation if its state elements do not conform to, or exceed, the indexes.
Step eight, generating a status-index vectorization sample of the cloud service event;
The vectorization method is analogous to the word2vec method of natural language processing: cloud service events and their service level contract data are quantized into vectors.
Referring to the state element quantization flow shown in Fig. 4, the instance set
Figure BDA0003010018470000212
is read in, and each instance state event
Figure BDA0003010018470000213
is traversed. If, for any instance
Figure BDA0003010018470000214
the state element
Figure BDA0003010018470000215
is not empty, then: the numeric part is extracted from the values of the location state element and the number state element; the value of the operation state element is mapped to an integer (the "Terminated" state maps to 0, "Ready" to 1, "Running" to 2, "Terminating" to 3, "Interrupted" to 4, and "Failed" to 5); values of the duration, retry, CPU load, and memory load state elements that are numeric are saved as they are; and null values of the CPU load and memory load state elements are filled with 0.
If, for any instance
Figure BDA0003010018470000216
the state element
Figure BDA0003010018470000217
is empty, the traversal is finished, and for
Figure BDA0003010018470000221
the quantization of all state elements
Figure BDA0003010018470000222
is complete. Finally, the quantization results of the state elements are saved to a file.
Violation indexes are quantized in the same way as the event state elements, except for the operation violation index: an operation violation index maps to 0 unless it is the Failed or Interrupted state of an instance, which is quantized as in the event state elements (the "Interrupted" state maps to 4 and the "Failed" state maps to 5).
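The quantization rules can be sketched in Python as below. The status-to-integer mapping and the fill-with-0 rule come from the text; the way the numeric part is extracted (all digits of the identifier concatenated), the identifier formats, and the function names are assumptions for illustration.

```python
import re

# Integer codes of the instance states, as given in the text.
STATUS_CODE = {"Terminated": 0, "Ready": 1, "Running": 2,
               "Terminating": 3, "Interrupted": 4, "Failed": 5}


def numeric_part(value: str) -> int:
    """Extract the digits of an identifier such as 'm_1932' (format assumed)."""
    digits = re.sub(r"\D", "", value)
    return int(digits) if digits else 0


def quantize_state(event: dict) -> list:
    """Quantize the seven state elements of one event into rows of numbers."""
    def load(key):
        # Null CPU or memory load values are filled with 0.
        return 0 if event.get(key) is None else event[key]

    return [
        [event["start_time"], event["end_time"]],                # TIME
        [numeric_part(event["machine_id"]),
         numeric_part(event["job_name"]),
         numeric_part(event["task_name"])],                      # LOCATION
        [numeric_part(event["inst_name"])],                      # NUMBER
        [event["seq_no"], event["total_seq_no"]],                # RETRY
        [STATUS_CODE[event["status"]]],                          # OPERATION
        [load("cpu_avg"), load("cpu_max")],                      # CPU
        [load("mem_avg"), load("mem_max")],                      # MEM
    ]


def quantize_operation_metric(state: str) -> int:
    """Operation violation index: 4 or 5 for Interrupted or Failed, else 0."""
    return STATUS_CODE[state] if state in ("Interrupted", "Failed") else 0


# Hypothetical event, not taken from the data set.
example = {"start_time": 100, "end_time": 160, "machine_id": "m_1932",
           "job_name": "j_1527", "task_name": "M1", "inst_name": "inst_7",
           "seq_no": 1, "total_seq_no": 1, "status": "Failed",
           "cpu_avg": None, "cpu_max": None, "mem_avg": 0.2, "mem_max": 0.3}
print(quantize_state(example))
```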
This finally generates the "status-index" vector sample of the cloud service event.
In the present invention, one job from the Alibaba cluster trace v2018 data set is selected to generate "status-index" vector samples of cloud service events.
For example, a cloud service instance status event derived from the Alibaba cluster trace v2018 dataset
Figure BDA0003010018470000223
The above-mentioned
Figure BDA0003010018470000224
Vectorizing the event
Figure BDA0003010018470000225
with the method of the present invention yields the vector sample:
the above-mentioned
Figure BDA0003010018470000226
Row 1 in the left bracket represents the duration state element of the event.
Row 2 in the left bracket represents the location state element of the event.
Row 3 in the left bracket represents the number state element of the event.
Row 4 in the left bracket represents the retry state element of the event.
Row 5 in the left bracket represents the operation state element of the event.
Row 6 in the left bracket represents the CPU load state element of the event.
Row 7 in the left bracket represents the memory load state element of the event.
Row 1 in the right bracket represents the duration (long-tail) violation index of the event.
Row 2 in the right bracket represents the location violation index of the event.
Row 3 in the right bracket represents the number violation index of the event.
Row 4 in the right bracket represents the retry violation index of the event.
Row 5 in the right bracket represents the operation violation index of the event.
Row 6 in the right bracket represents the CPU load violation index of the event.
Row 7 in the right bracket represents the memory load violation index of the event.
Step nine, verification;
The vectorization method of cloud service events and service level contracts is embedded in a K-nearest-neighbor (KNN) model to form an improved KNN model. Cloud service events are randomly selected from the Alibaba cluster trace v2018 data set as the training set and test set of the model, which judges whether a cloud service event is in violation; a label of "1" indicates a violation and a label of "0" indicates no violation, as shown in Table 1.
Table 1. Input vector samples of the improved KNN model generated by the present method
Figure BDA0003010018470000231
Referring to Figs. 5 and 6, the experimental results show that, with the improved KNN model, the method of the present invention accurately determines violations of cloud service events: the misjudgment rate stays below 0.06%, and both the accuracy and the recall rate stay at 99% or above. Moreover, the method achieves this low misjudgment rate and high precision using only a small number of samples. The method can provide a basis for configuring the security group rules of the ECS structure for purposes such as violation detection, anomaly detection, and tracing, thereby achieving network access control and improving the state inspection and packet filtering capabilities of the virtual firewall.
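For orientation, a plain K-nearest-neighbor classifier over labeled vector samples can be sketched in a few lines of Python. This is a generic KNN with Euclidean distance and majority voting, not the improved KNN model of the invention, and the two-dimensional training samples are hypothetical stand-ins for flattened "status-index" vectors.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query vector by majority vote among its k nearest samples."""
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical samples: (vector, label), label 1 = violation, 0 = no violation.
train = [
    ([5.0, 4.0], 1), ([4.0, 4.0], 1), ([5.0, 5.0], 1),
    ([0.0, 0.0], 0), ([1.0, 0.0], 0), ([0.0, 1.0], 0),
]

print(knn_predict(train, [5.0, 3.0]))
print(knn_predict(train, [0.0, 0.5]))
```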

Claims (3)

1. A vectorization method of cloud service events and service level contract data is characterized by comprising the following steps:
step one, formalizing the cloud service event;
step 101, collecting cloud server logs;
collecting the log records of the jobs executed by the cloud server, wherein a job in the log records is denoted JOB and a job contains multiple tasks; any instance under a task,
Figure FDA0003010018460000011
is recorded as one cloud service event;
step 102, setting the field contents of the cloud service event;
for
Figure FDA0003010018460000012
the field content of the cloud service event is denoted
Figure FDA0003010018460000013
The above-mentioned
Figure FDA0003010018460000014
in
Figure FDA0003010018460000015
the subscript i denotes the identification number of the task, and the subscript j denotes the identification number of the instance;
start_time denotes the start time of the instance;
end_time denotes the end time of the instance;
machine_id denotes the cloud server identifier;
task_name denotes the task name;
job_name denotes the job name;
inst_name denotes the instance name;
seq_no denotes the number of instance retries;
total_seq_no denotes the total number of instance retries;
status denotes the state of the instance;
cpu_avg denotes the average CPU utilization of the instance;
cpu_max denotes the maximum CPU utilization of the instance;
mem_avg denotes the average memory usage of the instance;
mem_max denotes the maximum memory usage of the instance;
step two, constructing a state element of the cloud service event;
step 201, expressing the cloud service event as a subject-predicate sentence;
each field of the cloud service event
Figure FDA0003010018460000016
is a constituent element of a sentence; applying sentence-structure constituent analysis, the subject part and the predicate part of the sentence are separated by a double vertical line; the subject-predicate expression of one cloud service event
Figure FDA0003010018460000021
is denoted SYS_EVENT, where SYS_EVENT = [time period](specific)instance || [retry]state<load>;
step 202, representing the state of the cloud service event by state elements;
using the subject-predicate sentence pattern SYS_EVENT = [time period](specific)instance || [retry]state<load>, the field semantics of each instance in the instance set
Figure FDA0003010018460000022
are divided into sentence constituents, and the state element set of the cloud service event is constructed and denoted EVENT_STATUS, where:
Figure FDA0003010018460000023
the duration state element TIME of the cloud service event describes the duration state of the event, TIME = {start_time, end_time};
the location state element LOCATION of the cloud service event describes the location state of the event, LOCATION = {machine_id, job_name, task_name};
the number state element NUMBER of the cloud service event describes the number state of the event, NUMBER = {inst_name};
the retry state element RETRY of the cloud service event describes the retry state of the event, RETRY = {seq_no, total_seq_no};
the operation state element OPERATION of the cloud service event describes the operation state of the event, OPERATION = {status};
the CPU load state element CPU of the cloud service event describes the CPU load state of the event, CPU = {cpu_avg, cpu_max};
the memory load state element MEM of the cloud service event describes the memory load state of the event, MEM = {mem_avg, mem_max};
for the field content of any cloud service event
Figure FDA0003010018460000024
the constructed cloud service event state element set
Figure FDA0003010018460000025
is:
Figure FDA0003010018460000031
step three, formalizing the service level contract of the cloud service event;
based on the service level contract of the big data computing service MaxCompute and combined with the field content of the cloud service event
Figure FDA0003010018460000032
the cloud service event-situation specification SLAS is constructed;
the above-mentioned
Figure FDA0003010018460000033
Instance-level duration element specification
Figure FDA0003010018460000034
Instance-level location element specification
Figure FDA0003010018460000035
Job-level number element specification
Figure FDA0003010018460000036
Instance-level retry element specification
Figure FDA0003010018460000041
Instance-level operation element specification
Figure FDA0003010018460000042
Instance-level CPU load element specification
Figure FDA0003010018460000043
Instance-level memory load element specification
Figure FDA0003010018460000044
Step four, extracting violation elements;
the elements appearing in the specifications formulated in step three are taken as violation elements;
a violation element is an element, referred to in the cloud service event-situation specification
Figure FDA0003010018460000051
that violates the specification; the violation elements extracted from SLAS are assembled into the cloud service event violation element set
Figure FDA0003010018460000052
the element violating the instance-level duration element specification sla4inst_time is called the duration violation element vf_longTail:
vf_longTail = {end_time, start_time};
the element violating the instance-level location element specification sla4inst_location is called the location violation element vf_location:
vf_location = {machine_id, job_name, task_name};
the element violating the job-level number element specification sla4job_number is called the number violation element vf_number:
vf_number = {inst_name};
the element violating the instance-level retry element specification sla4inst_retry is called the retry violation element vf_retry:
vf_retry = {seq_no, total_seq_no};
the element violating the instance-level operation element specification sla4inst_operation is called the operation violation element vf_operation:
vf_operation = {status};
the element violating the instance-level CPU load element specification sla4inst_cpu is called the CPU load violation element vf_cpu:
vf_cpu = {cpu_avg, cpu_max};
the element violating the instance-level memory load element specification sla4inst_mem is called the memory load violation element vf_mem:
vf_mem = {mem_avg, mem_max};
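The violation elements listed above can be pictured as named groups of event fields. The following is only a minimal sketch under that reading: the dictionary name VIOLATION_ELEMENTS and the helper violation_fields are hypothetical and do not come from the claim, and the field spelling job_name follows the PSV tuples given later in this claim.

```python
# Violation elements as named groups of event fields (illustrative layout).
VIOLATION_ELEMENTS = {
    "vf_longTail": ("end_time", "start_time"),
    "vf_location": ("machine_id", "job_name", "task_name"),
    "vf_number": ("inst_name",),
    "vf_retry": ("seq_no", "total_seq_no"),
    "vf_operation": ("status",),
    "vf_cpu": ("cpu_avg", "cpu_max"),
    "vf_mem": ("mem_avg", "mem_max"),
}


def violation_fields(event: dict, element: str) -> dict:
    """Project one event record onto the fields of a single violation element.

    Fields that are absent from the record are returned as None.
    """
    return {name: event.get(name) for name in VIOLATION_ELEMENTS[element]}
```

For example, projecting a record onto vf_cpu keeps only its cpu_avg and cpu_max values, which is the part of the record that the CPU load specification would inspect.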
violation refers to behavior in which the cloud service event-situation specification
Figure FDA0003010018460000061
is not met; event violation means that, for a cloud service event
Figure FDA0003010018460000062
the specification
Figure FDA0003010018460000063
is violated;
violation element refers to a factor by which the specification
Figure FDA0003010018460000064
is violated; violation elements expose, for a cloud service event
Figure FDA0003010018460000065
the nature of its violation; therefore, to generate the vector samples needed to judge violations accurately, it is necessary to find, for the cloud service event
Figure FDA0003010018460000066
the factors that cause the violation (i.e., the violation elements), which become, for the cloud service event
Figure FDA0003010018460000067
its violation elements; to consider the possible violation factors of a cloud service event from multiple aspects, the cloud service event state element set is constructed
Figure FDA0003010018460000068
Step five, extracting metrics;
from the cloud service event-situation specification
Figure FDA0003010018460000071
and the cloud service event violation element set
Figure FDA0003010018460000072
the violation limit values are extracted as violation metrics to obtain the specification-metric set METRIC;
wherein
Figure FDA0003010018460000073
Step six, mapping and constructing the state element-violation element association tuples;
according to the cloud service event state element set
Figure FDA0003010018460000074
and its cloud service event violation element set
Figure FDA0003010018460000075
the state element-violation element association tuples are mapped to obtain the state element-violation element association tuple set, denoted PSV;
wherein
Figure FDA0003010018460000081
PSV_TIME represents the "duration state element-duration violation element" association tuple;
PSV_LOCATION represents the "location state element-location violation element" association tuple;
PSV_NUMBER represents the "number state element-number violation element" association tuple;
PSV_RETRY represents the "retry state element-retry violation element" association tuple;
PSV_OPERATION represents the "operation state element-operation violation element" association tuple;
PSV_CPU represents the "CPU load state element-CPU load violation element" association tuple;
PSV_MEM represents the "memory load state element-memory load violation element" association tuple;
according to the duration state element TIME of the event and its violation element vf_longTail, the "duration state element-duration violation element" association tuple PSV_TIME is mapped:
PSV_TIME=(end_time,start_time);
according to the location state element LOCATION of the event and its violation element vf_location, the "location state element-location violation element" association tuple PSV_LOCATION is mapped:
PSV_LOCATION=(machine_id,job_name,task_name);
according to the number state element NUMBER of the event and its violation element vf_number, the "number state element-number violation element" association tuple PSV_NUMBER is mapped:
PSV_NUMBER=(inst_name);
according to the retry state element RETRY of the event and its violation element vf_retry, the "retry state element-retry violation element" association tuple PSV_RETRY is mapped:
PSV_RETRY=(seq_no,total_seq_no);
according to the operation state element OPERATION of the event and its violation element vf_operation, the "operation state element-operation violation element" association tuple PSV_OPERATION is mapped:
PSV_OPERATION=(status);
according to the CPU load state element CPU of the event and its violation element vf_cpu, the "CPU load state element-CPU load violation element" association tuple PSV_CPU is mapped:
PSV_CPU=(cpu_avg,cpu_max);
according to the memory load state element MEM of the event and its violation element vf_mem, the "memory load state element-memory load violation element" association tuple PSV_MEM is mapped:
PSV_MEM=(mem_avg,mem_max);
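The mapping of this step can be sketched as pairing each state element with its violation element and keeping the fields they share. This is an illustration under stated assumptions, not the claimed implementation: the names STATE_ELEMENTS, VIOLATION_ELEMENTS, PAIRING and build_psv are hypothetical, and the field sets of TIME, LOCATION, NUMBER and RETRY are inferred from the PSV tuples above because their definitions are not reproduced in this part of the claim.

```python
# Field sets of the state elements. OPERATION, CPU and MEM are stated in the
# claim; TIME, LOCATION, NUMBER and RETRY are ASSUMED from the PSV tuples.
STATE_ELEMENTS = {
    "TIME": ("end_time", "start_time"),
    "LOCATION": ("machine_id", "job_name", "task_name"),
    "NUMBER": ("inst_name",),
    "RETRY": ("seq_no", "total_seq_no"),
    "OPERATION": ("status",),
    "CPU": ("cpu_avg", "cpu_max"),
    "MEM": ("mem_avg", "mem_max"),
}

# Field sets of the violation elements, as listed in step four.
VIOLATION_ELEMENTS = {
    "vf_longTail": ("end_time", "start_time"),
    "vf_location": ("machine_id", "job_name", "task_name"),
    "vf_number": ("inst_name",),
    "vf_retry": ("seq_no", "total_seq_no"),
    "vf_operation": ("status",),
    "vf_cpu": ("cpu_avg", "cpu_max"),
    "vf_mem": ("mem_avg", "mem_max"),
}

# Which state element is associated with which violation element.
PAIRING = {
    "PSV_TIME": ("TIME", "vf_longTail"),
    "PSV_LOCATION": ("LOCATION", "vf_location"),
    "PSV_NUMBER": ("NUMBER", "vf_number"),
    "PSV_RETRY": ("RETRY", "vf_retry"),
    "PSV_OPERATION": ("OPERATION", "vf_operation"),
    "PSV_CPU": ("CPU", "vf_cpu"),
    "PSV_MEM": ("MEM", "vf_mem"),
}


def build_psv() -> dict:
    """Return each association tuple as the state-element fields that also
    occur in the paired violation element, in state-element order."""
    psv = {}
    for name, (state, violation) in PAIRING.items():
        violating = set(VIOLATION_ELEMENTS[violation])
        psv[name] = tuple(f for f in STATE_ELEMENTS[state] if f in violating)
    return psv
```

With these field sets every association tuple coincides with the tuple written in the claim, for instance PSV_RETRY is (seq_no, total_seq_no).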
Step seven, constructing the state element-metric association tuples;
Step 701, according to the mapped state element-violation element association tuple set
Figure FDA0003010018460000091
and the extracted specification-metric set
Figure FDA0003010018460000092
the state element-metric tuple set of the cloud service event is constructed, denoted PSM;
wherein
Figure FDA0003010018460000093
PSM_TIME represents the duration state element-duration violation element metric tuple;
PSM_LOCATION represents the location state element-location violation element metric tuple;
PSM_NUMBER represents the number state element-number violation element metric tuple;
PSM_RETRY represents the retry state element-retry violation element metric tuple;
PSM_OPERATION represents the operation state element-operation violation element metric tuple;
PSM_CPU represents the CPU load state element-CPU load violation element metric tuple;
PSM_MEM represents the memory load state element-memory load violation element metric tuple;
Duration state element-duration violation element metric tuple:
Figure FDA0003010018460000101
Location state element-location violation element metric tuple:
Figure FDA0003010018460000102
Number state element-number violation element metric tuple:
Figure FDA0003010018460000103
Retry state element-retry violation element metric tuple:
Figure FDA0003010018460000104
Operation state element-operation violation element metric tuple:
Figure FDA0003010018460000105
CPU load state element-CPU load violation element metric tuple:
Figure FDA0003010018460000106
Memory load state element-memory load violation element metric tuple:
Figure FDA0003010018460000107
Step 702, according to the state element-metric tuple set
Figure FDA0003010018460000111
the Cartesian product of the status events and the metrics is taken to construct the status-metric association tuples of the cloud service event, denoted RSM;
RSM = EVENT_STATUS × METRIC
EVENT_STATUS represents the instance status events;
METRIC represents the event violation metrics;
the constructed cloud service event status-metric association tuples are:
Figure FDA0003010018460000112
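The Cartesian product of step 702 can be illustrated with the standard library. This is a minimal sketch: the function name build_rsm and the shape of the elements (small tuples) are assumptions made for illustration, since the concrete contents of EVENT_STATUS and METRIC are given only in the formula images.

```python
from itertools import product


def build_rsm(event_status, metric):
    """Pair every instance status event with every violation metric.

    Both arguments are arbitrary sequences; the result lists all pairs in
    row-major order (all metrics for the first event, then the second, ...).
    """
    return list(product(event_status, metric))
```

A set of n status events and m metrics therefore yields n × m association tuples, each of which can later be turned into one vector sample.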
Step eight, generating the status-metric vectorized samples of the cloud service event;
the vectorization method is analogous to the word2vec method used for natural language and quantizes cloud service events and their service level contract data into vectors;
the instance set
Figure FDA0003010018460000113
is read in, and the state elements of each instance status event are traversed
Figure FDA0003010018460000114
if, for any instance
Figure FDA0003010018460000121
its state element
Figure FDA0003010018460000122
is not empty, the numeric parts of the location state element values and the number state element values are extracted;
the Terminated state is mapped to a value of 0;
the Ready state is mapped to a numerical value of 1;
the Running state is mapped to a numerical value of 2;
the Terminating state is mapped to a value of 3;
the interleaved state is mapped to a value of 4;
the Failed state is mapped to a value of 5;
if the values of the duration state element, the retry state element, the CPU load state element and the memory load state element are numeric, they are saved as they are; if a value of the CPU load state element or the memory load state element is null, the null value is filled with 0;
if, for any instance
Figure FDA0003010018460000123
its state element
Figure FDA0003010018460000124
is empty, this indicates that the traversal is complete and that, for
Figure FDA0003010018460000125
all state elements
Figure FDA0003010018460000126
have been quantized; finally, the quantization results of the state elements are saved to a file.
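The quantization of step eight can be sketched as follows. This is an illustration under several assumptions rather than the claimed implementation: the function names, the field order of the output vector, the CSV output format, and the rule that the "numeric part" of an identifier is its first run of digits are all hypothetical. The status codes are the ones listed in the claim, with the state name "Interleaved" kept as it appears in the translated text.

```python
import csv
import re

# Status codes as listed in the claim (state names as in the translated text).
STATUS_CODES = {
    "Terminated": 0, "Ready": 1, "Running": 2,
    "Terminating": 3, "Interleaved": 4, "Failed": 5,
}
ID_FIELDS = ("machine_id", "job_name", "task_name", "inst_name")
NUMERIC_FIELDS = ("start_time", "end_time", "seq_no", "total_seq_no")
LOAD_FIELDS = ("cpu_avg", "cpu_max", "mem_avg", "mem_max")


def extract_number(text: str) -> int:
    """ASSUMPTION: the numeric part is the first run of digits; 0 if none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else 0


def quantize_instance(inst: dict) -> list:
    """Turn one instance record into a flat numeric vector."""
    vec = [extract_number(str(inst[f])) for f in ID_FIELDS]
    vec.append(STATUS_CODES[inst["status"]])
    vec.extend(float(inst[f]) for f in NUMERIC_FIELDS)
    for f in LOAD_FIELDS:
        value = inst.get(f)
        # Null CPU / memory load values are filled with 0, as in the claim.
        vec.append(0.0 if value in (None, "") else float(value))
    return vec


def save_vectors(instances, path: str) -> None:
    """Quantize every instance and write one vector per CSV row."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(quantize_instance(i) for i in instances)
```

Each record yields a vector of thirteen numbers: four identifier values, one status code, four duration and retry values, and four load values.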
2. The vectorization method of cloud service event and service level contract data according to claim 1, characterized in that each specification in the cloud service event-situation specification SLAS is defined as follows;
the instance-level duration element specification is formalized as equation (1):
Figure FDA0003010018460000131
Figure FDA0003010018460000132
v denotes the condition of the predicate decision;
when the predicate decision result of v is
Figure FDA0003010018460000133
equation (1) is violated, which is recorded as a violation of the instance-level duration element specification, i.e., the state element is the violation element vf_longTail;
when the predicate decision result of v is not
Figure FDA0003010018460000134
the instance-level duration element specification is satisfied;
the instance-level location element specification is formalized as equation (3):
Figure FDA0003010018460000135
v denotes the condition of the predicate decision;
when the predicate decision result of v is
Figure FDA0003010018460000136
equation (3) is violated, which is recorded as a violation of the instance-level location element specification, i.e., the state element is the violation element vf_location;
when the predicate decision result of v is not
Figure FDA0003010018460000137
the instance-level location element specification is satisfied;
the job-level number element specification is formalized as equation (5):
Figure FDA0003010018460000138
rNumber_metric=2000 (8)
mNumber_metric=8000 (9)
v denotes the condition of the predicate decision;
when the predicate decision result of v is
Figure FDA0003010018460000144
equation (5) is violated, which is recorded as a violation of the job-level number element specification, i.e., the state element is the violation element vf_number;
when the predicate decision result of v is not
Figure FDA0003010018460000145
the job-level number element specification is satisfied;
the instance-level retry element specification is formalized as equation (10):
Figure FDA0003010018460000141
retry_metric=3 (11)
v denotes the condition of the predicate decision;
Figure FDA0003010018460000142
when the predicate decision result of v is
Figure FDA0003010018460000143
equation (10) is violated, which is recorded as a violation of the instance-level retry element specification, i.e., the state element is the violation element vf_retry;
Figure FDA0003010018460000151
when the predicate decision result of v is not
Figure FDA0003010018460000152
the instance-level retry element specification is satisfied;
the instance-level operation element specification is formalized as equation (12):
Figure FDA0003010018460000153
retry_metric=3 (13)
v denotes the condition of the predicate decision;
when the predicate decision result of v is
Figure FDA0003010018460000161
equation (12) is violated, which is recorded as a violation of the instance-level operation element specification, i.e., the state element is the violation element vf_operation;
when the predicate decision result of v is not
Figure FDA0003010018460000162
the instance-level operation element specification is satisfied;
the instance-level CPU load element specification, expressed in Chinese, is formalized as equation (14):
Figure FDA0003010018460000163
v denotes the condition of the predicate decision;
when the predicate decision result of v is
Figure FDA0003010018460000164
equation (14) is violated, which is recorded as a violation of the instance-level CPU load element specification, i.e., the state element is the violation element vf_cpu;
when the predicate decision result of v is not
Figure FDA0003010018460000165
the instance-level CPU load element specification is satisfied;
the instance-level memory load element specification, expressed in Chinese, is formalized as equation (15):
Figure FDA0003010018460000171
v denotes the condition of the predicate decision;
when the predicate decision result of v is
Figure FDA0003010018460000172
equation (15) is violated, which is recorded as a violation of the instance-level memory load element specification, i.e., the state element is the violation element vf_mem;
when the predicate decision result of v is not
Figure FDA0003010018460000173
the instance-level memory load element specification is satisfied.
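Every specification in this claim follows the same pattern: a predicate v is evaluated on an event, and the corresponding violation element is reported only when the predicate indicates a violation. The sketch below illustrates that pattern only. The function names are hypothetical, and the retry comparison used as the example predicate is an assumption, because the actual predicate of equation (10) is contained in a formula image that is not reproduced here; only the threshold retry_metric = 3 is taken from equation (11).

```python
from typing import Callable, Optional

RETRY_METRIC = 3  # threshold from equation (11)


def retry_predicate(event: dict) -> bool:
    """HYPOTHETICAL predicate: treat an instance as violating when its total
    number of runs exceeds the retry threshold. The comparison actually used
    in equation (10) may differ."""
    return event["total_seq_no"] > RETRY_METRIC


def check_specification(
    event: dict,
    predicate: Callable[[dict], bool],
    violation_element: str,
) -> Optional[str]:
    """Return the violation element name when the predicate reports a
    violation, and None when the specification is satisfied."""
    return violation_element if predicate(event) else None
```

Under this assumed predicate, an instance with total_seq_no of 5 would be reported with vf_retry, while one with total_seq_no of 2 would satisfy the specification.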
3. The vectorization method of cloud service event and service level contract data according to claim 1, characterized in that the logs of the cloud server are taken from the Alibaba cluster trace v2018 data set.
CN202110372833.0A 2021-04-07 2021-04-07 Vectorization method of cloud service event and service level contract data Active CN112948132B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110372833.0A CN112948132B (en) 2021-04-07 2021-04-07 Vectorization method of cloud service event and service level contract data

Publications (2)

Publication Number Publication Date
CN112948132A true CN112948132A (en) 2021-06-11
CN112948132B CN112948132B (en) 2022-09-06

Family

ID=76230852

Country Status (1)

Country Link
CN (1) CN112948132B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant