CN113760778B

CN113760778B - Word vector model-based micro-service interface division evaluation method

Info

Publication number: CN113760778B
Application number: CN202111316694.6A
Authority: CN
Inventors: 李莹; 夏轩轩; 张凌飞; 朱晓莉; 方燕翎; 毛义华
Original assignee: Tianjin Zhongyi Science And Technology Co ltd; Binhai Industrial Technology Research Institute of Zhejiang University
Current assignee: Tianjin Zhongyi Science And Technology Co ltd; Binhai Industrial Technology Research Institute of Zhejiang University
Priority date: 2021-11-09
Filing date: 2021-11-09
Publication date: 2022-02-08
Anticipated expiration: 2041-11-09
Also published as: CN113760778A

Abstract

The invention provides a word vector model-based micro-service interface division evaluation method, comprising the following steps. The server side constructs a micro-service cluster, and log data is collected to restore the distributed link calling process among the micro-service applications. Model training: each graph-shaped calling chain is split into linear calling subchains, interface names are extracted in calling order to form an interface character string array, and the manually divided micro-service interface division set Ω is obtained; a word vector model is then trained on the interface character string array to obtain a word vector for each interface name. Interface division evaluation: the number K of micro-service application categories in the current cluster is taken as the number of clusters, and the K-means algorithm yields a cluster division set; with this cluster division set as the reference, the Purity algorithm evaluates the rationality of the interface division set Ω. The method is based on the calling relations observed while the micro-service interfaces actually run; it uses mathematical methods to re-divide the interface set, compares the result with the manually divided micro-service interfaces, and guides the optimization of the existing micro-service architecture.

Description

Word vector model-based micro-service interface division evaluation method

Technical Field

The invention belongs to the field of micro service interfaces, and particularly relates to a micro service interface division evaluation method based on a word vector model.

Background

A traditional monolithic application architecture is generally based on Tomcat middleware. Such an architecture increases system complexity, makes cooperation among developers difficult, and hinders smooth continuous integration and continuous release. In actual operation, failures tend to cascade, and the architecture cannot keep up with the rapidly growing business scale of internet companies.

Compared with the traditional monolithic architecture, the micro-service architecture decomposes functions into discrete services, each of which is sufficiently cohesive. This reduces the coupling of the system; services can be scaled horizontally and vertically and deployed independently, a fault in one service does not paralyze the whole system, and the system is not tied to a particular technology stack for the long term. A project that adopts the micro-service architecture can achieve rapid iteration, frequent releases, and integrated development and operations.

Based on the above advantages, more and more companies split the monolithic application into the micro-service architecture, for example, patent document with publication number CN112988122A discloses a monolithic application splitting tool and method based on the correlation between functional characteristics and micro-services, and patent document with publication number CN111026468A discloses a backend splitting strategy based on micro-services.

However, when a monolithic application system has complex business logic, a huge code base and numerous modules coupled together, it is challenging to work out an ideal micro-service structure by manual decomposition. Unreasonable service interface division leads to more complex service dependencies, compounds the call delay among services, and sometimes makes even simple functions difficult to build. As a result, development slows down and migration becomes more difficult.

In order to better build a micro-service architecture and reduce the call delay between services, the rationality of micro-service interface division needs to be measured and objectively evaluated.

Disclosure of Invention

In view of the above, the present invention aims to provide a method for evaluating micro-service interface division based on a word vector model, so as to solve the problem of low efficiency caused by unreasonable interface division and complex inter-service dependency relationship.

In order to achieve the purpose, the technical scheme of the invention is realized as follows:

A micro-service interface division evaluation method based on a word vector model comprises the following steps:

S1, collecting data, specifically comprising the following steps:

S11, the server side constructs a micro-service cluster;

S12, collecting and restoring the distributed link calling process among the micro-service applications and forming a graph-shaped calling chain;

S2, setting a word vector model, inputting an interface character string array, and obtaining a word vector of each interface name, with the following specific steps:

S21, dividing the graph-shaped calling chain into m linear calling subchains by depth-first search (DFS), extracting interface names in calling order to form an interface character string array, and obtaining the manually divided micro-service interface division set Ω;

S22, training the word vector model on the interface character string array of step S21 to obtain a word vector for each interface name;

S3, interface division evaluation, comprising the following steps:

S31, taking the number K of micro-service application categories as the number of clusters, and clustering the word vectors of the interface names with the K-means algorithm to obtain the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm;

S32, taking the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm as the reference, evaluating the rationality of the manual micro-service interface division set Ω with the Purity algorithm.

Further, in step S11, the method for the server to construct the micro service cluster includes:

the server side constructs the micro-service cluster on the basis of Spring Cloud, the service discovery annotation @EnableDiscoveryClient and the Feign annotation @EnableFeignClients are enabled on the startup class of each micro-service application, and the micro-service applications call one another through the Feign client.

Further, in step S12, the method for collecting and restoring the distributed link calling process among the micro-service applications and forming a graph-shaped calling chain comprises:

adding the link tracing tool SOFATracer dependency, the Spring Cloud OpenFeign dependency and the data collection tool Zipkin dependency to the configuration file of each micro-service application, and instrumenting the Spring Cloud OpenFeign component with SOFATracer to obtain the link calling process of each micro-service application;

introducing the link collection and display tool Zipkin into each project, starting a Zipkin server to receive the link log data reported by SOFATracer, and having Zipkin clean the link log data to form a graph-shaped calling chain, thereby restoring the distributed link calling process.

Further, the parameters of the SOFATracer configuration include:

logging.path, which specifies the log file output directory;

com.alipay.sofa.tracer.zipkin.enabled, which enables SOFATracer to report data remotely to Zipkin;

com.alipay.sofa.tracer.zipkin.baseUrl, which gives the address of the Zipkin server to which data is reported.

The Spring Cloud OpenFeign summary log output by SOFATracer can be found in the log directory of the project, and each record in the log contains the following parameters:

app, representing the current microservice application name;

url, which represents the request interface address;

traceId, which represents the ID that identifies a unique request in SOFATracer;

spanId, which represents the level of the request within the whole call link;

the naming rule of the spanId is the parent spanId followed by the number of the child span, so the spanId carries the context of the calling chain, and collecting the spanIds that share the same traceId yields a complete link tree.
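The link tree described above can be rebuilt from the log records alone. The following is a minimal sketch, assuming dot-separated spanIds such as "0", "0.1" and "0.1.1" in which the parent id is everything before the last dot; that format, the function name build_link_trees and the sample records are illustrative assumptions, not taken from the source.

```python
from collections import defaultdict

def build_link_trees(records):
    """Group records by traceId and link every span to its parent span.

    Assumption: spanIds are dot-separated ("0", "0.1", "0.1.1"), so the
    parent id is the spanId with its last ".n" component removed.
    Returns {traceId: {parent_spanId: [child_spanId, ...]}}.
    """
    trees = defaultdict(dict)
    for rec in records:
        children = trees[rec["traceId"]]   # make sure every trace has an entry
        span_id = rec["spanId"]
        if "." in span_id:                 # the root span has no parent
            parent_id = span_id.rsplit(".", 1)[0]
            children.setdefault(parent_id, []).append(span_id)
    return dict(trees)

# Hypothetical log records carrying the four parameters listed above.
records = [
    {"traceId": "t1", "spanId": "0", "app": "gateway", "url": "/order/create"},
    {"traceId": "t1", "spanId": "0.1", "app": "order", "url": "/device/getInfo"},
    {"traceId": "t1", "spanId": "0.2", "app": "order", "url": "/user/get"},
    {"traceId": "t1", "spanId": "0.1.1", "app": "device", "url": "/stock/check"},
    {"traceId": "t2", "spanId": "0", "app": "gateway", "url": "/user/get"},
]
trees = build_link_trees(records)
```

A trace that contains only a root span, such as t2 here, maps to an empty child table.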

Further, in step S21, the method for extracting the interface names in calling order and forming the interface character string array comprises:

converting each linear calling subchain into an interface character string whose interface names are separated by spaces, so that the m linear calling subchains form an interface character string array of length m, each interface character string representing the interface calling process of one sub-request; the extracted interface granularity is the parent path in the interface address, which represents a resource type in the micro-service application;

deduplicating all extracted interface names and dividing them into k clusters according to the category of the micro-service application to which each interface name belongs, Ω = {ω₁, ω₂, ..., ω_k}, where Ω is the manual micro-service interface division set and k represents the number of categories of micro-service applications in the current cluster.
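As an illustration of this step, the sketch below turns linear calling subchains into the interface character string array and the manual division set Ω. It assumes that each element of a subchain is an (application name, interface name) pair and that the application name identifies the service that owns the interface; the pair layout, the function name and the sample data are hypothetical.

```python
def to_corpus_and_partition(subchains):
    """Build the interface string array and the manual partition from subchains.

    subchains: list of linear call chains, each a list of (app, api) pairs
               (an assumed layout for this sketch).
    Returns (corpus, omega) where corpus has one space-separated string per
    subchain and omega maps each application to its set of interface names.
    """
    corpus = [" ".join(api for _, api in chain) for chain in subchains]
    omega = {}
    for chain in subchains:
        for app, api in chain:
            # A set removes duplicate interface names automatically.
            omega.setdefault(app, set()).add(api)
    return corpus, omega

subchains = [
    [("order-app", "order"), ("device-app", "device"), ("stock-app", "stock")],
    [("order-app", "order"), ("user-app", "user")],
    [("order-app", "cart"), ("device-app", "device")],
]
corpus, omega = to_corpus_and_partition(subchains)
k = len(omega)  # number of micro-service application categories
```

Here the array has length m = 3 and Ω contains k = 4 clusters, one per application.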

Further, in step S22, the word vector model is the CBOW model among the word vector models provided by the Python gensim library;

the specific steps of training the word vector model are as follows:

setting the dimension S of the generated word vectors, the window size C, and the lowest word frequency min_count = 1;

inputting the interface character string array and establishing a sliding window of size C on each interface character string;

taking the central word of the sliding window as the target of the current training step and the remaining words in the window as the input nodes of the neural network, generating one piece of training data each time the window slides, and obtaining through repeated iterative training the word vector representation set V = {v₁, v₂, ..., v_n} of the interface names.
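The way the sliding window produces training data can be sketched in plain Python. The sketch assumes that the window size C is the total width of the window (center word included) and that windows are truncated at the ends of a string; both points, like the function name, are assumptions made for illustration.

```python
def cbow_training_pairs(corpus, window_size=5):
    """Return (context_words, target_word) pairs for CBOW training.

    Assumption: window_size is the total window width, so the context holds
    up to window_size // 2 words on each side of the center word.
    """
    half = window_size // 2
    pairs = []
    for line in corpus:
        words = line.split()
        for center, target in enumerate(words):
            left = words[max(0, center - half):center]
            right = words[center + 1:center + 1 + half]
            context = left + right
            if context:                    # skip strings with a single word
                pairs.append((context, target))
    return pairs

pairs = cbow_training_pairs(["sa sd sc sg sb sf"], window_size=5)
```

With gensim itself, a comparable model is usually created with the Word2Vec class using sg=0 to select CBOW and min_count=1; note that gensim's window parameter counts words on each side of the center word, which differs from the total-width window assumed in this sketch.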

Further, in step S31, the word vectors of the interface names are clustered with the K-means algorithm to obtain the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm, with the following specific steps:

taking the number k of micro-service application categories in the current cluster from step S21 as the number of clusters of the K-means algorithm, first randomly selecting k vectors {μ₁, μ₂, ..., μ_k} from the interface word vector set V as the initial mean vectors of the clusters c_i in the set C, and initializing each cluster c_i = {μ_i}, i ∈ {1, 2, ..., k};

computing the distance d_ji between each interface word vector v_j and each mean vector μ_i, where j ∈ {1, 2, ..., n}, and determining the cluster label of v_j from the nearest mean vector, λ_j = arg min_{i∈{1,2,...,k}} d_ji, where λ_j is the value of i at which the distance d_ji is smallest, i.e. λ_j ∈ {1, 2, ..., k}, and placing the interface word vector v_j into the corresponding cluster:

c_{λ_j} = c_{λ_j} ∪ {v_j}, for iterations t = 1, 2, 3, …, starting from the initial clusters;

after one iteration is finished, recalculating the center point of each cluster c_i ∈ {c₁, c₂, ..., c_k}:

μ'_i = (1/|c_i|) Σ_{v∈c_i} v;

updating the mean vector μ_i of the current cluster to μ'_i, and then searching again for the nearest center point for each interface word vector v_j;

repeating this loop until the set C no longer changes between two consecutive iterations, finally obtaining the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm.

Further, the specific method for calculating the distance d_ji between the interface word vector v_j and each mean vector μ_i comprises:

normalizing the interface word vector v_j and each mean vector μ_i into unit vectors;

taking the dot product of the normalized unit vectors of v_j and μ_i to obtain their inner product, i.e. the cosine of the angle between the two vectors in vector space, and taking this cosine value as the distance d_ji between the two vectors.

The cosine ranges over [-1, 1]; the closer the cosine between two vectors is to -1, the larger their semantic difference, and the closer it is to 1, the higher their semantic similarity.
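The clustering loop and the cosine computation described above can be sketched together in plain Python. Because a larger cosine means a closer pair of vectors, this sketch assumes that "nearest" is evaluated as the smallest value of 1 − cosine; that conversion, the fixed random seed, the max_iter cap and the handling of empty clusters are assumptions of the sketch rather than details stated in the source.

```python
import math
import random

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(a, b):
    # Dot product of the unit vectors is the cosine of their angle.
    # Assumption: use 1 - cosine so that a smaller value means "closer".
    return 1.0 - sum(x * y for x, y in zip(unit(a), unit(b)))

def kmeans(vectors, k, seed=0, max_iter=100):
    """Return one cluster label per vector."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)       # k randomly chosen initial mean vectors
    labels = None
    for _ in range(max_iter):
        # Assign every vector to its nearest mean vector.
        new_labels = [
            min(range(k), key=lambda i: cosine_distance(v, centers[i]))
            for v in vectors
        ]
        if new_labels == labels:           # partition unchanged: stop
            break
        labels = new_labels
        # Recompute each center as the mean of the vectors in its cluster.
        for i in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:                    # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = kmeans(vectors, k=2)
```

The two vectors that point mostly along the first axis end up in one cluster and the other two in the second cluster.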

Further, in step S32, the formula of the Purity algorithm is:

Purity(Ω, C) = (1/N) · Σ_{i=1..k} max_j |ω_i ∩ c_j|

where N denotes the total number of word vectors, Ω = {ω₁, ω₂, ..., ω_k} denotes the manual micro-service interface division set, and C = {c₁, c₂, ..., c_k} denotes the cluster division set of the K-means algorithm;

Purity ∈ [0, 1], and a value closer to 1 indicates a more reasonable division of the micro-service interfaces.

Each cluster ω_i is assigned a category j; the assignment rule is that j is the category whose interface word vectors v ∈ c_j occur most often in the cluster ω_i. The number of word vectors of category j in each cluster ω_i is counted, and these counts are summed and normalized to obtain the final score Purity.
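A short worked example may make the score concrete. The sketch below computes the standard purity measure, the sum over the clusters ω_i of the largest overlap with any c_j, divided by N; the interface names and both partitions are invented for illustration.

```python
def purity(omega, clusters):
    """Purity of the manual partition omega against the reference clusters.

    omega, clusters: lists of sets of interface names.
    """
    n = sum(len(w) for w in omega)         # total number of interface names
    if n == 0:
        return 0.0
    # For each manual cluster, count its largest overlap with a K-means cluster.
    matched = sum(max(len(w & c) for c in clusters) for w in omega)
    return matched / n

# Hypothetical manual division (one set per micro-service application).
omega = [{"order", "cart"}, {"device", "stock"}, {"user"}]
# Hypothetical K-means result over the same five interface names.
clusters = [{"order", "cart", "user"}, {"device"}, {"stock"}]
score = purity(omega, clusters)
```

The overlaps are 2, 1 and 1, so the score is 4/5 = 0.8.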

Compared with the prior art, the word vector model-based micro-service interface division evaluation method has the following beneficial effects:

the method is based on the calling relations observed while the micro-service interfaces actually run; it uses mathematical methods such as the word vector model, K-means clustering and the Purity algorithm to re-divide the interface set, compares the result with the manually divided micro-service interfaces, and calculates an evaluation score for the manual interface division; the score guides further optimization and adjustment of the existing micro-service architecture, so that it better conforms to the high-cohesion, low-coupling principle of micro-service architecture.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a flow chart of a micro-service interface partitioning evaluation method based on a word vector model according to the present invention;

FIG. 2 is a process diagram of a restore request call chain according to the present invention;

FIG. 3 is a diagram illustrating a word vector model according to the present invention;

FIG. 4 is a schematic diagram of the K-means clustering algorithm and the Purity algorithm according to the present invention.

Detailed Description

It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.

In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," etc. may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified.

In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art through specific situations.

The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.

As shown in fig. 1, the method for evaluating the division of the micro service interface based on the word vector model mainly includes a data collection stage S1, a model training stage S2 and an interface evaluation stage S3.

S1, data collection stage, comprising the following steps:

S11, the server side constructs a micro-service cluster, and each micro-service application independently collects its instrumentation logs.

S2, model training stage: setting a word vector model, inputting the preprocessed interface character string array, and obtaining the word vector representation of each interface name, specifically comprising the following steps:

S21, dividing the graph-shaped calling chain into m linear calling subchains by depth-first search (DFS), extracting interface names in calling order to form an interface character string array, generating the training data of the word vector model, and obtaining the manually divided micro-service interface set Ω;

S3, interface evaluation stage, comprising the following steps:

S31, taking the number K of micro-service application categories as the number of clusters, and clustering the word vectors of the interface names with the K-means algorithm to obtain the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm;

S32, taking the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm as the reference, evaluating the rationality of the manual micro-service interface division set Ω with the Purity algorithm.

In step S11, the method for the server to construct the micro service cluster includes:

the server side constructs the micro-service cluster on the basis of Spring Cloud; the SOFATracer dependency, the Spring Cloud OpenFeign dependency and the Zipkin dependency are added to the pom file of each project module, and the parameters required by the link tracing tool SOFATracer and the data collection tool Zipkin are added to the configuration file of each micro-service application, the parameters including:

logging.path, which specifies the log file output directory;

After the dependencies and parameters of each micro-service project are configured, the service discovery annotation @EnableDiscoveryClient and the Feign annotation @EnableFeignClients are enabled on the startup class of each micro-service application, and the micro-service applications call one another through the Feign client.

app, representing the current microservice application name;

url, which represents the request interface address;

traceId, which represents the ID that identifies a unique request in SOFATracer;

spanId, which represents the level of the request within the whole call link.

In step S12, the method for collecting and restoring the distributed link calling process among the micro-service applications and forming a graph-shaped calling chain is as follows:

a Zipkin server is started, and the SOFATracer component integrated into each micro-service application reports the Spring Cloud OpenFeign summary log to the Zipkin server; optionally, depending on the data volume, the Zipkin server is configured to persist the log data to a database such as MySQL or Elasticsearch.

As shown in fig. 2, the reported link log data is first extracted from the database. Data with the same traceId comes from the same request. The naming rule of the spanId parameter in each record is the parent spanId followed by the number of the child span, which carries the context of the calling chain, so the position of each record within the calling chain of its request is restored from the spanId. The format of the request url is "micro-service application address/micro-service resource class name/method name in the class", for example "http://122.224.64.250:8083/device/getInfo";

the resource class name in the url parameter, such as device, is extracted as the interface api of the record, and finally each request is restored to a graph-shaped calling chain. As shown in the first dotted box of fig. 2, A, B, …, G denote records with the same traceId in the database; traceId and spanId are parameters carried by the data, and api is a parameter generated by extraction rather than carried by the data.
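Extracting the api parameter from a request url can be sketched with the Python standard library. The sketch assumes the url layout given above, so the first path segment is the resource class name; the function name extract_api is hypothetical.

```python
from urllib.parse import urlparse

def extract_api(url):
    """Return the first path segment of a request url (the resource class name).

    Assumption: urls follow "application address/resource class/method".
    Returns an empty string when the url has no path segment.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[0] if segments else ""

api = extract_api("http://122.224.64.250:8083/device/getInfo")
```

For the example url from the description, the extracted interface name is device.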

In step S21, the method for generating word vector model training data includes:

The link data of each request is traversed by depth-first search (DFS), and all the graph-shaped calling chains are split into m linear calling subchains, as shown in the second dotted box of fig. 2. Each subchain is then traversed, the api parameter of each record is extracted in calling order, and each linear calling subchain is converted into an interface character string whose names are separated by spaces, such as "sa sd sc sg". Each interface character string represents the interface calling process of one sub-request, and the m linear calling subchains form an interface character string array of length m.
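The splitting step can be sketched as an iterative depth-first traversal that records every root-to-leaf path. The sketch assumes that each calling chain is a tree given as a parent-to-children table; the node labels and interface names below are invented and do not reproduce fig. 2.

```python
def split_into_subchains(children, root):
    """Return every root-to-leaf path of a call tree, found by depth-first search.

    children: dict mapping a node to the list of nodes it calls
              (an assumed representation of one calling chain).
    """
    paths = []
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        kids = children.get(node, [])
        if not kids:                       # a leaf closes one linear subchain
            paths.append(path)
        for kid in reversed(kids):         # reversed so the first child is visited first
            stack.append((kid, path + [kid]))
    return paths

# Hypothetical call tree and the api parameter of each record.
children = {"A": ["B", "C"], "B": ["D"], "C": ["E", "F"]}
api = {"A": "sa", "B": "sb", "C": "sc", "D": "sd", "E": "se", "F": "sf"}

subchains = split_into_subchains(children, "A")
corpus = [" ".join(api[n] for n in chain) for chain in subchains]
```

This tree yields m = 3 linear subchains and therefore an interface character string array of length 3.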

All extracted interface names sa, sb, sc and so on are deduplicated and divided into k clusters according to the micro-service application name (local), Ω = {ω₁, ω₂, ..., ω_k}, where Ω is the manual micro-service interface division set and k represents the number of categories of micro-service applications in the current cluster.

The interface character string array serves as the training corpus of the word vector model in step S22.

As shown in fig. 3, in step S22 the word vector model is the CBOW model among the word vector models provided by the Python gensim library; the CBOW model is a three-layer neural network comprising an input layer, a hidden layer and an output layer;

the specific steps of training the word vector model are as follows:

setting the training parameters of the word vector model: the dimension of the generated word vectors S = 100, the window size C = 5, and the lowest word frequency min_count = 1 (no interface that appears on a request link should be ignored);

The interface character string array is input, and a sliding window of size C is established on each interface character string; a1, a2, …, a6 in fig. 3 denote the interface names contained in one interface character string. The central word a3 of the sliding window is the target of the current training step, and the remaining words a1, a2, a4 and a5 in the window are the input nodes of the neural network. Each interface name is converted into an N-dimensional One-Hot code, where N is the number of interface names after extraction and deduplication. The One-Hot codes of the 4 input nodes are each multiplied by the shared input weight matrix W_{N×S} to obtain 4 vectors, which are weighted and averaged into an S-dimensional hidden layer vector. The hidden layer vector is multiplied by the output weight matrix W'_{N×S} to obtain an output vector, which is compared with the One-Hot code of the central word a3 in order to update the weight matrices W and W'. One piece of training data is generated each time the window slides. After repeated iterative training, the output weight matrix W'_{N×S} is taken as the interface word vector matrix, each row of which corresponds to an S-dimensional interface word vector, and finally the word vector representation set V = {v₁, v₂, ..., v_n} of the interface names is obtained. The distribution of the set V in space is shown in the first dashed box of fig. 4.
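One forward pass of the network described above can be sketched without any library. The sketch assumes a plain (unweighted) average of the context vectors and a softmax over the output scores as the quantity compared with the One-Hot code of the central word; both choices, the tiny weight matrices and the function name are illustrative assumptions.

```python
import math

def cbow_forward(context_ids, w_in, w_out):
    """Return the predicted probability of each word being the central word.

    w_in:  N x S input weight matrix (row i is the input vector of word i).
    w_out: N x S output weight matrix (row i scores word i).
    context_ids: indices of the context words, i.e. the positions of the
                 ones in their One-Hot codes.
    """
    s = len(w_in[0])
    # Multiplying a One-Hot code by w_in selects one row; average those rows.
    hidden = [sum(w_in[i][d] for i in context_ids) / len(context_ids)
              for d in range(s)]
    # Score every word against the hidden layer vector.
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in w_out]
    # Softmax (assumed here), computed stably by subtracting the peak score.
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example with N = 4 interface names and S = 2 dimensions.
w_in = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
w_out = [[0.2, 0.1], [0.4, 0.3], [0.6, 0.5], [0.8, 0.7]]
probs = cbow_forward([0, 1, 3], w_in, w_out)
```

Training would then adjust both matrices so that the probability of the true central word increases; that update step is omitted from this sketch.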

The interface word vectors with similar contexts in the call chain are close to each other in position in the space coordinate, and the interface word vectors with larger context difference are far away from each other.

In step S31, the word vectors of the interface names are clustered with the K-means algorithm to obtain the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm, with the following specific steps:

taking the number k of micro-service application categories in the current cluster from step S21 as the number of clusters of the K-means algorithm, first randomly selecting k vectors {μ₁, μ₂, ..., μ_k} from the interface word vector set V as the initial mean vectors of the clusters c_i in the set C, and initializing each cluster c_i = {μ_i}, i ∈ {1, 2, ..., k};

each interface word vector v_j is placed into the cluster of its nearest mean vector, c_{λ_j} = c_{λ_j} ∪ {v_j}, for iterations t = 1, 2, 3, …, starting from the initial clusters;

after one iteration is finished, the center point of each cluster c_i ∈ {c₁, c₂, ..., c_k} is recalculated as μ'_i = (1/|c_i|) Σ_{v∈c_i} v.

The specific method for calculating the distance d_ji between the interface word vector v_j and each mean vector μ_i comprises:

In step S32, the Purity algorithm formula is:

Purity(Ω, C) = (1/N) · Σ_{i=1..k} max_j |ω_i ∩ c_j|

where N denotes the total number of word vectors, Ω = {ω₁, ω₂, ..., ω_k} denotes the manual micro-service interface division set, and C = {c₁, c₂, ..., c_k} denotes the cluster division set of the K-means algorithm;

The process of the Purity algorithm is shown in fig. 4, where solid circles represent interface word vectors not yet classified by the K-means algorithm, and open circles, open triangles and open squares represent interface word vectors assigned to different classes by the K-means algorithm. The second dashed box in fig. 4 shows the distribution of the interface word vectors in the set C, and the third dashed box shows their distribution in the set Ω. The Purity formula assigns a category j to each cluster ω_i; the assignment rule is that j is the category whose interface word vectors v ∈ c_j occur most often in the cluster ω_i. The number of interface word vectors of category j in each cluster ω_i is counted, and these counts are summed and normalized to obtain the final score Purity.

Based on the calling relations observed while the micro-service interfaces actually run, the invention uses mathematical methods such as the word vector model, K-means clustering and the Purity algorithm to re-divide the interface set, compares the result with the manually divided micro-service interfaces, and calculates an evaluation score for the manual interface division. The score guides further optimization and adjustment of the existing micro-service architecture, so that it better conforms to the high-cohesion, low-coupling principle of micro-service architecture.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A micro service interface division evaluation method based on a word vector model is characterized by comprising the following steps:

S1, collecting data, specifically comprising the following steps:

S11, the server side constructs a micro-service cluster;

S22, inputting the interface character string array of step S21 into the set word vector model to obtain a word vector for each interface name;

S3, interface division evaluation, comprising the following steps:

S31, taking the number K of micro-service application categories in the current cluster as the number of clusters, and clustering the word vectors of the interface names with the K-means algorithm to obtain the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm;

S32, taking the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm as the reference, evaluating the rationality of the manual micro-service interface division set Ω with the Purity algorithm.

2. The method for dividing and evaluating the micro-service interface based on the word vector model according to claim 1, wherein in step S11, the method for the server to construct the micro-service cluster comprises:

3. The method for evaluating division of micro-service interfaces based on a word vector model according to claim 1, wherein in step S12, the method for collecting and restoring the distributed link calling process among the micro-service applications and forming a graph-shaped calling chain comprises:

introducing the link collection and display tool Zipkin into each project, starting a Zipkin server to receive the link log data reported by SOFATracer, and cleaning the link log data to form a graph-shaped calling chain, thereby restoring the distributed link calling process.

4. The micro service interface partition evaluation method based on the word vector model according to claim 3, wherein the parameters of the SOFATracer configuration include:

logging.path, which specifies the log file output directory;

app, representing the current microservice application name;

url, which represents the request interface address;

traceId, which represents the ID that identifies a unique request in SOFATracer;

spanId, which represents the level of the request within the whole call link;

5. The method for evaluating division of micro-service interfaces based on word vector models according to claim 1, wherein in step S21, the method for extracting the interface names according to the calling order and forming the interface character string array comprises:

converting each linear calling subchain into an interface character string whose interface names are separated by spaces, so that the m linear calling subchains form an interface character string array of length m, each interface character string representing the interface calling process of one sub-request; the extracted interface granularity is the parent path in the interface address, which represents a resource class name in the micro-service application;

deduplicating all extracted interface names and dividing them into k clusters according to the category of the micro-service application to which each interface name belongs, Ω = {ω₁, ω₂, ..., ω_k}, where Ω is the manual micro-service interface division set and k represents the number of categories of micro-service applications in the current cluster.

6. The micro-service interface division evaluation method based on the word vector model according to claim 1, wherein in step S22, the word vector model is the CBOW model among the word vector models provided by the Python gensim library;

the specific steps of training the word vector model are as follows:

7. The method for evaluating micro-service interface division based on a word vector model according to claim 5, wherein in step S31, the word vectors of the interface names are clustered with the K-means algorithm to obtain the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm, with the following specific steps:

taking the number k of micro-service application categories in the current cluster as the number of clusters of the K-means algorithm, first randomly selecting k vectors {μ₁, μ₂, ..., μ_k} from the interface word vector set V as the initial mean vectors of the clusters c_i in the set C, and initializing each cluster c_i = {μ_i}, i ∈ {1, 2, ..., k};

computing the distance d_ji between each interface word vector v_j and each mean vector μ_i, where j ∈ {1, 2, ..., n}, and determining the cluster label of v_j from the nearest mean vector, λ_j = arg min_{i∈{1,2,...,k}} d_ji, where λ_j is the value of i at which the distance d_ji is smallest, i.e. λ_j ∈ {1, 2, ..., k}, and placing the interface word vector v_j into the corresponding cluster:

c_{λ_j} = c_{λ_j} ∪ {v_j}, starting from the initial clusters;

after one iteration is finished, recalculating the center point of each cluster c_i ∈ {c₁, c₂, ..., c_k}:

μ'_i = (1/|c_i|) Σ_{v∈c_i} v;

repeating this loop until the set C no longer changes between two consecutive iterations, finally obtaining the cluster division set C = {c₁, c₂, ..., c_k} of the K-means algorithm.

8. The method for evaluating micro-service interface division based on a word vector model according to claim 7, wherein the specific method for calculating the distance d_ji between the interface word vector v_j and each mean vector μ_i comprises:

9. The method for evaluating division of micro-service interfaces based on word vector models according to claim 5, wherein in step S32, the Purity algorithm formula is:

Purity(Ω, C) = (1/N) · Σ_{i=1..k} max_j |ω_i ∩ c_j|

where N denotes the total number of word vectors, Ω = {ω₁, ω₂, ..., ω_k} denotes the manual micro-service interface division set, and C = {c₁, c₂, ..., c_k} denotes the cluster division set of the K-means algorithm;