CN114385436A - Server grouping method and device, electronic equipment and storage medium - Google Patents

Server grouping method and device, electronic equipment and storage medium

Info

Publication number
CN114385436A
Authority
CN
China
Prior art keywords
server
feature vector
asset information
grouping
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111481222.6A
Other languages
Chinese (zh)
Inventor
刘卓龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wangsu Science and Technology Co Ltd
Original Assignee
Wangsu Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wangsu Science and Technology Co Ltd
Priority to CN202111481222.6A
Publication of CN114385436A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of internet communication and discloses a server grouping method, a server grouping device, electronic equipment and a storage medium. The method comprises: acquiring at least one item of asset information on each server; for each server, determining a first feature vector of the server according to the acquired at least one item of asset information; and, according to a preset clustering algorithm, taking a plurality of first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtaining a clustering result of the first feature vectors, and determining the grouping of each server according to the clustering result. Asset information collected from the servers is used to generate first feature vectors that represent the service features of the servers, and the servers are grouped according to the clustering result obtained from the similarity between these vectors. Clustering the first feature vectors with a clustering algorithm groups the servers accurately and efficiently, which improves grouping efficiency while reducing grouping difficulty and cost.

Description

Server grouping method and device, electronic equipment and storage medium
Technical Field
Embodiments of the present application relate to the technical field of internet communication, and in particular to a server grouping method, a server grouping device, electronic equipment and a storage medium.
Background
With the continuous development of communication technology and the internet, servers have become one of the most important pieces of modern infrastructure for meeting users' needs. Enterprises in both traditional industries and high-tech industries require large numbers of servers for tasks such as data storage, computation and website hosting. Grouping servers according to the services deployed on them makes the servers easier for an enterprise to manage, and in the security industry grouping information also supports fine-grained protection.
Traditionally, server grouping is done manually: operation and maintenance personnel usually label each server with grouping information when the server is racked. This approach causes no notable problems when the number of servers is small. However, as the number of servers, the complexity of the services on them and the frequency of service changes keep increasing, manual grouping suffers from drawbacks such as high labor cost and difficulty of real-time maintenance.
Therefore, how to complete server grouping simply and efficiently, and thereby manage servers accurately, is a technical problem that urgently needs to be solved.
Disclosure of Invention
The main purpose of the embodiments of the present application is to provide a server grouping method, a server grouping device, electronic equipment and a storage medium that complete server grouping simply and efficiently, reduce the difficulty and cost of server grouping, and enable accurate management of servers.
To achieve the above object, an embodiment of the present application provides a server grouping method, including: acquiring at least one item of asset information on each server; for each server, determining a first feature vector of the server according to the acquired at least one item of asset information; and, according to a preset clustering algorithm, taking a plurality of first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtaining a clustering result of the first feature vectors, and determining the grouping of each server according to the clustering result.
To achieve the above object, an embodiment of the present application further provides a server grouping device, including: an acquisition module configured to acquire at least one item of asset information on each server; a determining module configured to determine, for each server, a first feature vector of the server according to the acquired at least one item of asset information; and a grouping module configured to take, according to a preset clustering algorithm, a plurality of first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtain a clustering result of the first feature vectors, and determine the grouping of each server according to the clustering result.
In order to achieve the above object, an embodiment of the present application further provides an electronic device, where the electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the server grouping method as described above.
To achieve the above object, the embodiment of the present application further provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the server grouping method as described above.
In the server grouping method provided by the embodiments of the present application, at least one item of asset information on each server is acquired, and a first feature vector of each server is determined from the acquired asset information. A clustering algorithm then takes first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, yielding a clustering result from which the grouping of the servers is determined. Because the first feature vector is generated from the server's asset information, it accurately represents the service features of the server. Because the clustering result, and hence the grouping result, is obtained by comparing the similarity between first feature vectors with a preset threshold, the servers are grouped accurately and efficiently, which improves grouping efficiency while reducing grouping difficulty and cost.
Drawings
One or more embodiments are illustrated by the corresponding figures in the drawings, which are not meant to be limiting.
FIG. 1 is a flow chart of a server grouping method in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a server grouping apparatus in another embodiment of the present application;
FIG. 3 is a schematic structural diagram of an electronic device in another embodiment of the present application.
Detailed Description
As noted in the background, current server grouping methods have many drawbacks, such as high labor cost and difficulty in maintaining server groups in real time, as the number of servers grows, the services on the servers become more complex, and the services change. Therefore, how to implement server grouping simply and efficiently, and thereby manage servers accurately, is a technical problem that urgently needs to be solved.
To solve the foregoing problem, some embodiments of the present application provide a server grouping method, including: acquiring at least one item of asset information on each server; for each server, determining a first feature vector of the server according to the acquired at least one item of asset information; and, according to a preset clustering algorithm, taking a plurality of first feature vectors whose similarity is greater than a preset threshold as one class of feature vectors, obtaining a clustering result of the first feature vectors, and determining the grouping of each server according to the clustering result.
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to give the reader a better understanding of the present application. However, the technical solutions claimed in the present application can be implemented without these technical details, and with various changes and modifications based on the following embodiments. The embodiments are divided for convenience of description only; the division should not limit the specific implementation of the present application, and the embodiments may be combined with and refer to one another where they do not contradict.
The implementation details of the server grouping method are described below with reference to specific embodiments. These details are provided only to aid understanding and are not required for implementing the present solution.
A first aspect of the embodiments of the present application provides a server grouping method, whose flow is shown in FIG. 1. In some embodiments, the method is applied to a terminal with communication and data analysis capabilities, for example an electronic device such as a computer or a management server. This embodiment is described using a computer as an example. The server grouping method includes the following steps:
Step 101: acquire at least one item of asset information on each server.
Specifically, when the computer groups a plurality of servers that are already online or about to go online, it acquires at least one item of asset information on each server to be grouped through an asset collection tool, such as a Host-based Intrusion Detection System (HIDS), and stores the acquired asset information separately, together with an identifier.
In one example, the server asset information obtained by the computer through the asset collection tool includes one or any combination of the following: processes, port binding information, system users, group users, scheduled tasks, startup items and environment variables. When collecting asset information, the computer may collect one or more specific items from each server according to actual needs and the specific classification requirements, for example the four items of process, port binding information, system user and scheduled task. The servers are then grouped using the acquired asset information. This embodiment does not limit the types or the number of asset information items selected.
It should be noted that the asset collection tool used by the computer may be a host-based intrusion detection system or another asset collection tool, for example Goby. The tool can be chosen according to factors such as computer performance and collection requirements.
Step 102: for each server, determine a first feature vector of the server according to the acquired at least one item of asset information.
Specifically, after the asset information of the servers to be grouped has been collected, the service features of each server are extracted from the collected asset information to facilitate subsequent grouping, and the first feature vector of the server is determined from them. Because the first feature vector is determined jointly from the feature extraction results of all collected asset information items, it can accurately represent the service features of each server.
In one example, determining the first feature vector of a server according to the acquired at least one item of asset information includes the following, where N items of asset information are acquired and N is an integer greater than 1: classifying all asset information of the servers to generate N types of asset information sets, and determining a keyword set corresponding to each type of asset information set; acquiring a target keyword set corresponding to each item of asset information of the server, the target keyword set being the keyword set corresponding to the asset information set to which the item belongs; generating a second feature vector for each item of asset information according to the number of occurrences, in that item, of each keyword in the corresponding target keyword set; and determining the first feature vector according to the N second feature vectors of the server. Specifically, before a first feature vector is generated for each server, the N items of asset information collected from each server are classified by asset information type to generate N types of asset information sets, and keywords are then extracted and summarized from each asset information set to generate the keyword set corresponding to that set.
When a first feature vector is generated for a server, the target keyword set for each item of asset information is obtained according to the asset information set to which the item belongs. Keyword detection is then performed on each item, and the order and number of occurrences of each keyword of the target keyword set in the item are counted. A second feature vector is generated for each item from these counts and orderings, giving one second feature vector per collected item. The first feature vector of the server is then determined from the resulting second feature vectors. Because the second feature vectors are derived from keyword occurrences in every item of asset information collected on the server, the first feature vector that combines them accurately represents the features of the services deployed on the server.
For example, a bag-of-words model obtained in advance is called, a text with time sequence information is generated for each item of asset information collected from each server, keyword detection is performed on each such text, information such as the number of occurrences of each keyword in the text is counted, and a second feature vector is generated for each item of asset information. Suppose the collected asset information includes process information and user information, and the servers to be grouped include server A and server B. The process information of server A is ['mysql', 'python', 'python'] and its user information is ['root', 'test']; the process information of server B is ['redis', 'java', 'tomcat'] and its user information is ['root', 'watch']. After keyword detection is performed on the process information with the bag-of-words model, 5 keywords are found in the keyword set corresponding to the process information, ordered as 'mysql', 'python', 'redis', 'java' and 'tomcat'. From this keyword set and the process information of server A, the second feature vector corresponding to the process information of server A is [1,2,0,0,0]; from the same keyword set and the process information of server B, the second feature vector corresponding to the process information of server B is [0,0,1,1,1]. Each value in these second feature vectors is the number of times the corresponding keyword 'mysql', 'python', 'redis', 'java' or 'tomcat' appears in the process information of the server.
After keyword detection is performed on the user information with the bag-of-words model, 3 keywords are found in the keyword set corresponding to the user information, ordered as 'root', 'test' and 'watch'. From this keyword set and the user information of server A, the second feature vector corresponding to the user information of server A is [1,1,0]; from the same keyword set and the user information of server B, the second feature vector corresponding to the user information of server B is [1,0,1]. Each value is the number of times the corresponding keyword 'root', 'test' or 'watch' appears in the user information. The asset information of the remaining servers to be grouped is processed in the same way to obtain the second feature vectors for each item of asset information of each server. Then, for any server to be grouped, the first feature vector is determined from the second feature vectors of its asset information items, for example by using each second feature vector as an element of a new vector. Using the bag-of-words model for feature extraction and vector conversion, and generating the first feature vector from the per-item second feature vectors, lets the first feature vector accurately represent the service features of the server and improves the accuracy of grouping based on it.
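The counting step of the worked example for server A and server B can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `count_vector` and the dictionary layout are hypothetical, and the keyword order is taken from the example.

```python
from collections import Counter

# Keyword sets, in the order used by the worked example.
PROCESS_KEYWORDS = ["mysql", "python", "redis", "java", "tomcat"]
USER_KEYWORDS = ["root", "test", "watch"]


def count_vector(items, keywords):
    """Return how often each keyword occurs in one item of asset information."""
    counts = Counter(items)
    return [counts[keyword] for keyword in keywords]


# Asset information of the two servers from the example (layout is hypothetical).
server_a = {"process": ["mysql", "python", "python"], "user": ["root", "test"]}
server_b = {"process": ["redis", "java", "tomcat"], "user": ["root", "watch"]}

# Second feature vectors: one per item of asset information and per server.
a_process = count_vector(server_a["process"], PROCESS_KEYWORDS)
a_user = count_vector(server_a["user"], USER_KEYWORDS)
b_process = count_vector(server_b["process"], PROCESS_KEYWORDS)
b_user = count_vector(server_b["user"], USER_KEYWORDS)

print(a_process, a_user)  # [1, 2, 0, 0, 0] [1, 1, 0]
print(b_process, b_user)  # [0, 0, 1, 1, 1] [1, 0, 1]
```

Using one fixed keyword order for all servers keeps the vector positions comparable between servers.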
It should be noted that, when the second feature vector is generated for the asset information, the keywords may be sorted according to the occurrence time, or sorted according to a preset priority or other conditions, which is not limited in this embodiment.
Further, determining the first feature vector according to the N second feature vectors of the server includes: for each second feature vector, acquiring the weight of each element in the second feature vector; generating a third feature vector corresponding to the second feature vector according to the weights of its elements; and, for each server, splicing the N generated third feature vectors in the same order to obtain the first feature vector. Specifically, so that the feature vectors represent the service features of the server more directly, the second feature vectors, which count the occurrences of different keywords, may be converted into third feature vectors, which represent the importance of different keywords. The weight of each element in a second feature vector is calculated from quantities such as the number of times the keyword corresponding to the element occurs in the corresponding asset information, the total number of keywords in the asset information set to which that asset information belongs, and the total number of occurrences of each element in the corresponding asset information. Each element of the second feature vector is then replaced with its calculated weight to generate the corresponding third feature vector. Once the third feature vectors for the N second feature vectors of each server have been obtained, the N third feature vectors of each server are spliced together in the same order to obtain the first feature vector of that server.
The splicing order may be determined by factors such as the importance of the asset information, the collection order or the feature vector generation time, which this embodiment does not limit. Because the weights of the elements of each second feature vector are analyzed, each second feature vector is converted into a third feature vector accordingly, and the third feature vectors of different servers are spliced in the same order, the resulting first feature vector represents the service features of a server in terms of the importance of its different services. Subsequent grouping is therefore based on the importance of the different services on each server, which further improves the accuracy of the grouping result.
Further, acquiring the weight of each element in the second feature vector includes: for each element in the second feature vector, determining the term frequency of the corresponding keyword according to the number of times the keyword appears in the asset information corresponding to the second feature vector and the total number of keywords in the keyword set corresponding to the asset information set to which that asset information belongs; determining the inverse document frequency of the corresponding keyword according to the number of asset information items that contain the keyword and the total number of asset information items of the servers; and determining the weight of the element according to the term frequency and the inverse document frequency of the corresponding keyword. Specifically, when acquiring the weights, the computer first obtains the asset information corresponding to the second feature vector and the asset information set to which it belongs. For each element, it determines the term frequency of the corresponding keyword as described above. It then gathers statistics over the asset information collected on all servers, performs keyword detection on each item of asset information, and determines the inverse document frequency of the keyword from the number of asset information items containing the keyword and the total number of asset information items of the servers.
Finally, it determines the weight of the element, that is, the importance of the keyword in the asset information, from the term frequency and the inverse document frequency of the keyword. Obtaining the weight of each element in the second feature vector from the term frequency and inverse document frequency of its keyword allows the servers to be grouped accurately later, based on first feature vectors that reflect the importance of different services.
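The weighting described above can be sketched as below. The text names the inputs of the term frequency and the inverse document frequency but gives no closed formula, so the logarithmic, smoothed IDF used here is an assumption, and the function name `tf_idf_weights` and its parameters are hypothetical.

```python
import math


def tf_idf_weights(counts, doc_frequency, total_docs):
    """Weight each element of a second feature vector by TF * IDF.

    counts:        the second feature vector (keyword occurrence counts)
    doc_frequency: for each keyword, how many asset information items contain it
    total_docs:    total number of asset information items of this type
    """
    keyword_total = len(counts)  # total number of keywords in the keyword set
    weights = []
    for count, df in zip(counts, doc_frequency):
        tf = count / keyword_total
        # Assumed smoothing: the +1 terms avoid division by zero and a zero IDF.
        idf = math.log((1 + total_docs) / (1 + df)) + 1
        weights.append(tf * idf)
    return weights


# User information of servers A and B from the worked example:
# 'root' occurs on both servers, 'test' and 'watch' on one server each.
user_a = tf_idf_weights([1, 1, 0], doc_frequency=[2, 1, 1], total_docs=2)
user_b = tf_idf_weights([1, 0, 1], doc_frequency=[2, 1, 1], total_docs=2)
print(user_a)
print(user_b)
```

Under these assumptions the shared keyword 'root' receives a lower weight than 'test' or 'watch', which each occur on only one server. The absolute values depend on the TF-IDF variant chosen, so they need not equal the figures quoted in the patent's example.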
Still further, after obtaining the weight of each element in the second feature vector, the method further includes: the weights of the elements are normalized according to the following formula:
$$\omega_{norm,i} = \frac{\omega_i}{\sqrt{\sum_{j=1}^{m} \omega_j^{2}}}$$
where $\omega_{norm,i}$ is the normalized weight of the i-th element in the second feature vector, $\omega_i$ is the weight of the i-th element, $\omega_j$ is the weight of the j-th element in the second feature vector, and m is the total number of elements in the second feature vector. In this case, generating the third feature vector corresponding to the second feature vector according to the weight of each element includes: generating the third feature vector according to the normalized weights of the elements of the second feature vector.
Specifically, after the weight of each element in the second feature vector is obtained, the initial weights are normalized so that they represent the importance of the service features on the server more accurately. Continuing the example above, the second feature vector corresponding to the process information of server A is [1,2,0,0,0] and that of server B is [0,0,1,1,1]; the second feature vector corresponding to the user information of server A is [1,1,0] and that of server B is [1,0,1]. After each second feature vector is processed with the term frequency-inverse document frequency (TF-IDF) algorithm, the weight vector of the second feature vector corresponding to the process information of server A is [0.3920, 0.7840, 0, 0, 0], and that of server B is [0, 0, 0.3920, 0.3920, 0.3920]. After normalization according to the formula, the weight vector for the process information of server A is [0.4472, 0.8944, 0, 0, 0], and that of server B is [0, 0, 0.5774, 0.5774, 0.5774]. The second feature vectors corresponding to the user information of server A and server B are normalized in the same way.
From the normalized weights, the third feature vector corresponding to the process information of server A is [0.4472, 0.8944, 0, 0, 0] and the third feature vector corresponding to its user information is [0.5797, 0.8148, 0]; the third feature vector corresponding to the process information of server B is [0, 0, 0.5774, 0.5774, 0.5774] and the third feature vector corresponding to its user information is [0.5797, 0, 0.8148]. Each element of a third feature vector represents the importance of the corresponding keyword to the server. Once the N third feature vectors of each server have been obtained in this way, they are spliced in the same preset order to obtain the first feature vector of each server. For example, if the third feature vector corresponding to the user information is spliced after the third feature vector corresponding to the process information, the first feature vector of server A is [0.4472, 0.8944, 0, 0, 0, 0.5797, 0.8148, 0] and the first feature vector of server B is [0, 0, 0.5774, 0.5774, 0.5774, 0.5797, 0, 0.8148].
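The normalization and splicing in this example can be checked with a short sketch. It assumes the normalization divides each weight by the Euclidean norm of the whole weight vector, which is consistent with the figures quoted in the example; the function name `l2_normalize` is hypothetical, and the user-information vectors are copied from the text rather than recomputed.

```python
import math


def l2_normalize(weights):
    """Divide each weight by the Euclidean norm of the whole weight vector."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0:
        return list(weights)  # an all-zero vector is returned unchanged
    return [w / norm for w in weights]


# TF-IDF weights of the process information, as quoted in the example.
process_a = l2_normalize([0.3920, 0.7840, 0, 0, 0])
process_b = l2_normalize([0, 0, 0.3920, 0.3920, 0.3920])

# Third feature vectors of the user information, copied from the example.
user_a = [0.5797, 0.8148, 0]
user_b = [0.5797, 0, 0.8148]

# Splice in the same order for every server: process first, then user.
first_a = [round(x, 4) for x in process_a + user_a]
first_b = [round(x, 4) for x in process_b + user_b]
print(first_a)
print(first_b)
```

Rounded to four decimal places, the two printed vectors match the first feature vectors of server A and server B given in the example.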
It should be noted that the asset information included in the first feature vector of the server, the splicing order and the obtaining manner of the third feature vector may be selected and changed according to actual needs, which is not limited in this embodiment.
Step 103, acquiring the clustering result of each first feature vector, and determining the grouping of each server according to the clustering result.
Specifically, after a first feature vector has been generated for each server to be grouped, first feature vectors whose similarity is greater than a preset threshold are treated as one class of feature vectors according to a preset clustering algorithm, yielding a clustering result for the first feature vectors of the servers. The grouping of the servers to be grouped is then determined from the servers that correspond to the feature vectors in each class of the clustering result. Because the first feature vectors, which represent the service features of the different servers, are clustered by a clustering algorithm and the servers are grouped according to the clustering result, grouping accuracy is preserved as far as possible while grouping efficiency is improved and the difficulty of grouping servers is reduced.
In one example, taking a plurality of first feature vectors with similarity greater than a preset threshold as a class of feature vectors according to a preset clustering algorithm, and acquiring a clustering result of each first feature vector, includes: acquiring target hyper-parameters of a preset density clustering algorithm, the target hyper-parameters comprising an interval threshold between feature vectors of one class and the minimum number of vectors in one class of feature vectors; and performing clustering iteration on the first feature vectors based on the target hyper-parameters to obtain the clustering result. Specifically, after the first feature vectors corresponding to the servers to be grouped are obtained, the target hyper-parameters of the preset density clustering algorithm are acquired; they may be preset, or may be entered by an administrator in real time in response to a prompt from the computer. The interval between two vectors represents the similarity between the first feature vectors, and the interval threshold is the preset similarity threshold: when the interval between two first feature vectors is smaller than the interval threshold, the two feature vectors are judged to belong to the same class. The minimum number of vectors is the number of feature vectors that one class must contain at least: when a set of mutually close feature vectors contains fewer feature vectors than the minimum number, the feature vectors in that set are judged not to form a class.
After the computer sets the hyper-parameters of the called density clustering algorithm to the target hyper-parameters, it performs clustering iteration on the first feature vectors of the servers using the density clustering algorithm to obtain the clustering result of the first feature vectors.
For example, the DBSCAN clustering algorithm is used to cluster the first feature vectors of 10 servers to be grouped, and the acquired target hyper-parameters are an interval threshold eps of 5 between feature vectors and a minimum number of vectors min-samples of 5 for one class of feature vectors. After clustering iteration, class A contains 5 first feature vectors, class B contains 4 first feature vectors, and the distances between one first feature vector I and all the other first feature vectors are greater than 5. Because class B contains fewer than 5 vectors, the clustering result output by the DBSCAN clustering algorithm is that the 5 first feature vectors in class A are one class of feature vectors and the remaining 5 feature vectors are discrete feature vectors. According to this clustering result, the computer then takes the servers corresponding to the 5 first feature vectors in class A as one group of servers and treats each of the remaining 5 servers individually as a discrete server, thereby obtaining the grouping result of the 10 servers to be grouped.
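The outcome of this example can be reproduced with a minimal pure-Python sketch of density clustering in the style of DBSCAN. The two-dimensional points are hypothetical stand-ins for first feature vectors, chosen only so that 5 points form class A, 4 points form class B, and one point is isolated. The function `dbscan`, the use of Euclidean distance, and the `<=` comparison against eps are assumptions of this sketch rather than details stated in the application.

```python
import math

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for discrete points."""
    n = len(points)
    # Neighbourhood of each point within eps (the point itself is included).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # not a core point; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep expanding
    return labels

# Hypothetical 2-D stand-ins for the first feature vectors of 10 servers.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5),  # class A: 5 vectors
          (20, 20), (21, 20), (20, 21), (21, 21),       # class B: 4 vectors
          (50, 50)]                                     # isolated vector I
labels = dbscan(points, eps=5, min_samples=5)

# Map the clustering result to server groups and discrete servers (by index).
groups, discrete = {}, []
for idx, label in enumerate(labels):
    if label == -1:
        discrete.append(idx)
    else:
        groups.setdefault(label, []).append(idx)
print(groups, discrete)
```

In this sketch the four points of class B never reach the minimum number of 5, so they are reported as discrete together with the isolated point, which mirrors the grouping result described above.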
It is worth mentioning that the servers to be grouped may be only servers that are about to go online, only servers that are already online, or a mix of both. The method therefore not only groups servers that are about to go online accurately, but also, by regrouping the already-online servers together with the servers about to go online, allows the grouping of all servers to be maintained dynamically, flexibly, and efficiently, which reduces the difficulty and cost of group maintenance.
In another example, after determining the grouping of the servers according to the clustering result, the method further includes: for each group, acquiring the key services of each server in the group; and marking the group with its server service according to the key services common to the servers in the group (the common key services). Specifically, after obtaining the grouping result of the servers to be grouped, the computer performs key service detection on each server in each group and determines the key services of each server in the same group. It then identifies the one or more key services common to the servers in the group and marks the core service of the servers in the current group according to those common key services, so that the core service of the group can be read off directly and the servers can be managed and maintained accurately.
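One simple way to find the key services shared by every server in a group is a set intersection. The following Python sketch illustrates the idea; the server names, the service names, and the function `common_key_services` are hypothetical and are not taken from the application.

```python
# Hypothetical key services detected on each server of one group.
key_services = {
    "server-1": {"nginx", "redis", "sshd"},
    "server-2": {"nginx", "redis", "crond"},
    "server-3": {"nginx", "redis"},
}

def common_key_services(services_by_server):
    """Return the key services present on every server in the group."""
    service_sets = list(services_by_server.values())
    if not service_sets:
        return set()
    return set.intersection(*service_sets)

# The sorted common services serve as the service mark of the group.
group_label = sorted(common_key_services(key_services))
print(group_label)
```

Here the group would be marked with the services found on all three servers, while services that appear on only some of them are left out of the mark.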
In addition, after the grouping result of the server to be grouped is obtained, the management personnel can be reminded to check the grouping result, and the grouping result can be further adjusted according to the adjustment instruction input by the management personnel, so that the grouping result can better meet the actual requirement.
In addition, it should be understood that the steps of the above methods are divided only for clarity of description. In implementation, steps may be combined into a single step, or a step may be split into multiple steps; as long as the same logical relationship is included, such variants fall within the protection scope of this patent. Adding insignificant modifications to an algorithm or process, or introducing insignificant design changes, without changing the core design of the algorithm or process also falls within the protection scope of this patent.
Another aspect of the embodiments of the present application further provides a server grouping apparatus, referring to fig. 2, including:
an obtaining module 201, configured to obtain at least one item of asset information on each server;
a query module 202, configured to determine, for each server, a first feature vector of the server according to the acquired at least one item of asset information; and
a sending module 203, configured to take a plurality of first feature vectors with similarity greater than a preset threshold as a class of feature vectors according to a preset clustering algorithm, obtain a clustering result of each first feature vector, and determine the grouping of each server according to the clustering result.
It should be understood that the present embodiment is an apparatus embodiment corresponding to the method embodiment, and the present embodiment can be implemented in cooperation with the method embodiment. The related technical details mentioned in the method embodiment remain valid in this embodiment and are not repeated here. Accordingly, the related technical details mentioned in the present embodiment can also be applied in the method embodiment.
It should be noted that all the modules involved in this embodiment are logic modules. In practical application, one logic module may be one physical unit, a part of one physical unit, or a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, units that are not closely related to solving the technical problem proposed by the present invention are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
Another aspect of the embodiments of the present application further provides an electronic device which, referring to fig. 3, includes: at least one processor 301; and a memory 302 communicatively coupled to the at least one processor 301. The memory 302 stores instructions executable by the at least one processor 301, and the instructions are executed by the at least one processor 301 so that the at least one processor 301 can perform the server grouping method described in any of the above method embodiments.
The memory 302 and the processor 301 are connected by a bus. The bus may comprise any number of interconnected buses and bridges, which couple together the various circuits of the one or more processors 301 and the memory 302. The bus may also connect various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor 301 is transmitted over a wireless medium via an antenna, which also receives data and passes it to the processor 301.
The processor 301 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 302 may be used to store data used by the processor 301 in performing operations.
Another aspect of the embodiments of the present application also provides a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the above embodiments may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the present application, and that various changes in form and details may be made therein without departing from the spirit and scope of the present application in practice.

Claims (11)

1. A server grouping method, comprising:
acquiring at least one item of asset information on each server;
for each server, determining a first feature vector of the server according to the acquired at least one item of asset information;
and according to a preset clustering algorithm, taking a plurality of first feature vectors with similarity greater than a preset threshold value as a class of feature vectors, acquiring a clustering result of each first feature vector, and determining the grouping of each server according to the clustering result.
2. The server grouping method according to claim 1, wherein the determining a first feature vector of the server according to the obtained at least one item of asset information comprises:
under the condition that the obtained asset information is N items, wherein N is an integer greater than 1;
classifying all the asset information of each server to generate N types of asset information sets, and determining a keyword set corresponding to each type of asset information set;
acquiring a target keyword set corresponding to each asset information of the server; the target keyword set is the keyword set corresponding to the asset information set to which each item of asset information belongs;
generating a second feature vector corresponding to each item of asset information according to the occurrence frequency of each keyword in the corresponding target keyword set in each item of asset information;
and determining the first feature vector according to the N second feature vectors of the server.
3. The server grouping method according to claim 2, wherein the determining the first eigenvector from the N second eigenvectors of the server comprises:
for each second feature vector, acquiring the weight of each element in the second feature vector;
generating a third feature vector corresponding to the second feature vector according to the weight of each element in the second feature vector;
and for each server, splicing the generated N third feature vectors according to the same sequence to obtain the first feature vector.
4. The server grouping method according to claim 3, wherein the obtaining the weight of each element in the second feature vector comprises:
for each element in the second feature vector, determining the word frequency of the corresponding keyword according to the number of times of the corresponding keyword appearing in the asset information corresponding to the second feature vector and the total number of keywords in the keyword set corresponding to the asset information set to which the asset information corresponding to the second feature vector belongs;
determining the inverse document frequency of the corresponding keyword according to the quantity of the asset information containing the corresponding keyword and the total quantity of the asset information of each server;
and determining the weight of the element according to the word frequency and the inverse document frequency of the corresponding keyword.
5. The server grouping method according to claim 3, further comprising, after the obtaining the weights of the elements in the second feature vector: the weights of the elements are normalized according to the following formula:
$$\omega_{\mathrm{norm},i} = \frac{\omega_i}{\sqrt{\sum_{j=1}^{m} \omega_j^{2}}}$$
wherein $\omega_{\mathrm{norm},i}$ is the normalized weight of the i-th element in the second feature vector, $\omega_i$ is the weight of the i-th element, $\omega_j$ is the weight of the j-th element in the second feature vector, and $m$ is the total number of elements in the second feature vector;
the generating a third feature vector corresponding to the second feature vector according to the weight of each element in the second feature vector comprises:
and generating a third feature vector corresponding to the second feature vector according to the weight of each element of the second feature vector after normalization processing.
6. The server grouping method according to claim 1, wherein the obtaining a clustering result of each first feature vector by using a plurality of first feature vectors with similarity greater than a preset threshold as a class of feature vectors according to a preset clustering algorithm comprises:
acquiring a target hyper-parameter of a preset density clustering algorithm; the target hyper-parameters comprise interval thresholds among the characteristic vectors of one type and the minimum number of the vectors of the characteristic vectors of one type;
and performing clustering iteration on each first feature vector based on the target hyper-parameter to obtain the clustering result.
7. The server grouping method according to any one of claims 1 to 6, wherein the asset information includes one of the following or any combination thereof: processes, port binding information, system users, group users, scheduled tasks, startup items, and environment variables.
8. The server grouping method according to any one of claims 1 to 6, further comprising, after the determining the grouping of the servers according to the clustering result:
for each group, acquiring key services of each server in the group;
and marking the group with its server service according to the key services common to the servers in the group.
9. A server grouping apparatus, comprising:
an acquisition module, configured to acquire at least one item of asset information on each server;
a determining module, configured to determine, for each server, a first feature vector of the server according to the acquired at least one item of asset information;
and a grouping module, configured to take a plurality of first feature vectors with similarity greater than a preset threshold as a class of feature vectors according to a preset clustering algorithm, acquire a clustering result of each first feature vector, and determine the grouping of each server according to the clustering result.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the server grouping method of any one of claims 1 to 8.
11. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the server grouping method of any one of claims 1 to 8.
CN202111481222.6A 2021-12-06 2021-12-06 Server grouping method and device, electronic equipment and storage medium Pending CN114385436A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111481222.6A CN114385436A (en) 2021-12-06 2021-12-06 Server grouping method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114385436A true CN114385436A (en) 2022-04-22

Family

ID=81195156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111481222.6A Pending CN114385436A (en) 2021-12-06 2021-12-06 Server grouping method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114385436A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115065600A (en) * 2022-06-13 2022-09-16 远景智能国际私人投资有限公司 Equipment grouping method, device, equipment and storage medium
CN115065600B (en) * 2022-06-13 2024-01-05 远景智能国际私人投资有限公司 Equipment grouping method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination