CN115309634A - Micro-service extraction method, system, medium, equipment and information processing terminal - Google Patents

Info

Publication number
CN115309634A
Authority
CN
China
Prior art keywords
micro
service
microservice
class
source code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210835089.8A
Other languages
Chinese (zh)
Inventor
王宾
邓亚楠
贺小伟
吴昊
张渊辉
王师蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NORTHWEST UNIVERSITY
Priority to CN202210835089.8A
Publication of CN115309634A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Prevention of errors by analysis, debugging or testing of software
    • G06F11/3668 Testing of software
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Prevention of errors by analysis, debugging or testing of software
    • G06F11/3668 Testing of software
    • G06F11/3672 Test management
    • G06F11/3684 Test management for test design, e.g. generating new test cases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of microservice extraction and discloses a microservice extraction method, system, medium, device, and information processing terminal. The method divides the source code into layers; obtains, through reverse engineering, a sequence diagram for each method of the control layer and class diagrams for the entity layer, the database access layer, and the other layers; performs explicit business function modeling on the sequence diagrams; performs implicit business function modeling on the class diagrams; extracts candidate microservices from the source code through spectral clustering based on the business function model; evaluates the quality of the candidate microservices; and visualizes the candidate microservices as a graph structure, with an adjustment function for architects. Using the spectral clustering algorithm to extract microservices achieves high cohesion within microservices and low coupling between them. Extracting microservices after modeling the business functions of the source code greatly lowers the barrier to using the method and reduces the time and labor cost of migrating a monolithic architecture to microservices.

Description

A microservice extraction method, system, medium, device, and information processing terminal

Technical field

The invention belongs to the technical field of microservice extraction, and in particular relates to a microservice extraction method, system, medium, device, and information processing terminal.

Background

Migrating a system from a monolithic architecture to a microservice architecture can effectively address the problems that monolithic systems face. The main task of microservice extraction is to reasonably extract classes that share the same business function from the monolithic system and group them into a set of candidate microservices. Compared with traditional manual extraction and identification, existing microservice extraction methods identify microservices automatically or semi-automatically, but the key inputs that determine the quality of the candidate microservices still have to be produced by professionals, so the methods have a high barrier to use and automatic extraction remains difficult. Current automatic extraction methods fall into static and dynamic approaches, with the following characteristics:

1. Methods that extract microservices from program execution traces depend on the design of test cases, which must be produced by software designers familiar with the system. Legacy systems often suffer from complex business logic, high developer turnover, and incomplete documentation, so it is hard for test cases to cover all business functions, and incomplete test cases lead to low class coverage in the extracted microservices. These methods therefore have a high barrier to use.

2. Methods that extract microservices from source code generally rely on code similarity, which cannot fully reflect business relationships. The extracted microservices are of poor quality and poor usability, and because the quality of a microservice largely determines its scalability, using these methods carries considerable risk.

These problems make automatic microservice extraction difficult.

The above analysis shows that the prior art has the following problems and defects. Trace-based extraction depends on test cases that must be produced by software designers familiar with the system; because legacy systems often have complex business logic, high developer turnover, and incomplete documentation, the test cases rarely cover all business functions, which lowers the class coverage of the extracted microservices and raises the barrier to use. Source-code-based extraction generally relies on code similarity, which cannot fully reflect business relationships; the extracted microservices are of poor quality and usability, and since microservice quality largely determines scalability, the applicability of these methods is limited.

Summary of the invention

To address the problems in the prior art, the present invention provides a microservice extraction method, system, medium, device, and information processing terminal, and in particular a microservice extraction method, system, medium, device, and information processing terminal based on the source code of a monolithic system.

The present invention is realized as follows. A microservice extraction method comprises: generating class diagrams and sequence diagrams from the source code; extracting business functions based on the class diagrams and sequence diagrams; extracting microservices from the extracted business functions using a fast spectral clustering algorithm; and finally evaluating the quality of the generated candidate microservices and displaying them visually.

In more detail: the source code is divided into layers; a sequence diagram for each method of the control layer, and class diagrams for the entity layer, the database access layer, and the other layers, are obtained through reverse engineering; explicit business function modeling is performed on the sequence diagrams; implicit business function modeling is performed on the class diagrams; candidate microservices are extracted from the source code through spectral clustering based on the business function model; the quality of the candidate microservices is evaluated; and finally the candidate microservices are visualized as a graph structure, with an adjustment function for architects.

Further, the microservice extraction method comprises the following steps:

Step 1: divide the source code into layers, obtain the sequence diagrams of the control layer through reverse engineering, and obtain the class diagrams of the entity layer, the database access layer, and the other layers. This step is needed because source code that does not follow coding conventions affects microservice extraction.

Step 2: perform explicit business function modeling on the sequence diagrams and implicit business function modeling on the class diagrams. Sequence diagrams reflect the dynamic relationships between classes. Class diagrams are analyzed because domain-driven design is currently a popular approach to microservice decomposition: class diagram analysis can mine domain relationships and thereby supplement the business relationships that sequence diagrams cannot reflect.

Step 3: extract candidate microservices from the source code through spectral clustering based on the business function model. Spectral clustering treats microservice identification as graph clustering and extracts microservices with high cohesion and low coupling.

Step 4: evaluate the quality of the candidate microservices, visualize them as a graph structure, and provide an adjustment function. Improving a legacy system with poor-quality microservices yields poor scalability and cannot fully realize the advantages of a microservice architecture; evaluating microservice quality avoids this.

Further, in step 1, the source code of the monolithic legacy system is divided into layers, identifying the control layer, business layer, database access layer, entity layer, and other layers in the source code.

From the source code of the monolithic system, the sequence diagrams and class diagrams of the system are generated through reverse engineering:

(1) Use the SequenceDiagram tool to generate a sequence diagram for each method in the control layer of the source code, where the control layer refers to the classes that interact with the user interface display;

(2) Use the PlantUML Parser tool to generate, for each class file of the source code, the class diagrams of the database access layer, the entity layer, and the other layers.

Further, in step 2, explicit business function modeling of the sequence diagrams yields a mapping table of the call relationships between classes, specifically:

(1) From the set of sequence diagram files, read one file and parse it; when no sequence diagram file remains, end the explicit business function modeling process;

(2) Count the number of calls f_ij between two classes C_i and C_j: each time a call from C_i to C_j appears, increment f_ij by 1;

(3) Store the number of calls between C_i and C_j in a Map structure, where the key is the string formed by joining the two class names with "_" and the value is the call count f_ij;

(4) After all sequence diagram files have been parsed, output the model Map.
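Steps (1) to (4) above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each sequence diagram has already been parsed into a list of (caller, callee) class-name pairs, and the function name `build_call_map` is hypothetical. Parsing the diagram files themselves is out of scope here.

```python
from collections import defaultdict


def build_call_map(sequence_diagrams):
    """Count calls between classes across all parsed sequence diagrams.

    Assumption: each diagram is a list of (caller_class, callee_class)
    pairs, one pair per call arrow in the diagram.
    """
    call_map = defaultdict(int)
    for diagram in sequence_diagrams:      # loop ends when no diagram remains
        for caller, callee in diagram:
            # Key: the two class names joined with "_"; value: call count f_ij.
            call_map[f"{caller}_{callee}"] += 1
    return dict(call_map)


diagrams = [
    [("OrderController", "OrderService"), ("OrderService", "OrderDao")],
    [("OrderController", "OrderService")],
]
model_map = build_call_map(diagrams)
```

Because the key joins class names with "_", class names that themselves contain "_" would make keys ambiguous; a tuple key would avoid that, at the cost of departing from the format described above.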

Implicit business function modeling of the class diagrams yields a semantic similarity matrix between classes:

(1) From the set of class diagram files, read two class diagrams and parse them; when no class diagram file remains, end the implicit business function modeling process;

(2) Compute the semantic similarity S_ij between the two class diagrams C_i and C_j. Build a bag-of-words dictionary from the input text; match the words in the text against the keys of the dictionary to obtain the corpus; initialize a tf-idf transformation model to obtain the transformed corpus corpus_tfidf; train an Lsi model on corpus_tfidf and compute the sparse matrix similarity; convert the token list whose similarity is to be found into the corpus format doc_test_vec, and obtain the text similarity.
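The similarity computation can be illustrated with a simplified, standard-library-only sketch. It is a stand-in rather than the procedure above: it computes TF-IDF vectors and cosine similarity directly and omits the Lsi dimensionality-reduction step. The tokenization of each class diagram into words (for example, identifiers split into terms) is assumed to have happened already, and all function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: one token list per class. Returns one {term: weight} dict per class."""
    n = len(docs)
    # Document frequency: in how many classes each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed idf so that terms shared by every class keep a small weight.
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(docs):
    """Pairwise semantic similarity S_ij between all classes."""
    vecs = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vecs] for a in vecs]


docs = [
    ["order", "id", "customer"],
    ["order", "item", "price"],
    ["user", "login", "password"],
]
S = similarity_matrix(docs)
```

In this toy input the first two classes share the term "order" and therefore score higher with each other than with the third class, which shares no terms with them.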

Further, in step 3, after the business function model of the source code has been obtained, a spectral clustering algorithm is used to cluster it into candidate microservices.

(1) Construct the similarity matrix according to the following formula:

Figure BDA0003749592600000041

Here W_ij denotes the similarity matrix: if a call relationship exists between classes i and j, then w_ij = map_ij; otherwise w_ij = S_ij. D is a diagonal matrix whose diagonal entries are the sums of the corresponding rows (or columns) of W;

(2) Construct the Laplacian matrix L and normalize it, as follows:

L = D - W;

Figure BDA0003749592600000042

(3) Perform eigenvalue decomposition on the Laplacian matrix, using the Lanczos method to accelerate the decomposition, to obtain the k smallest eigenvalues and their eigenvectors, which together form the k-dimensional feature matrix F;

(4) Cluster the k-dimensional feature matrix F with K-means. Use AFK-MC² to select the initial cluster centers: randomly draw an initial center sample c_1; compute the proposal distribution q(x) over the data set; randomly draw a data point from q(x) and compute the distance d_x; use Markov chain Monte Carlo to sample a sequence of length m, and take the last k-1 samples as the center points C = {C_1, C_2, ..., C_k};

Figure BDA0003749592600000043

Figure BDA0003749592600000044

Figure BDA0003749592600000045

Use A-means to reduce the time the K-means algorithm spends assigning data points to the clusters C: compute the distance from each data point x_i to all cluster centers, select the nearest centroid C_k, and compute the equidistance index α_i of the point as follows:

α_i = abs(||x_i - μ_1||_2 - ||x_i - μ_2||_2);

Calculate the improvement threshold (Figure BDA0003749592600000051) according to the following formula:

Figure BDA0003749592600000052

When the condition in Figure BDA0003749592600000053 holds, x_i will not move in the current round of iteration; x_i is then fixedly assigned to cluster C_k and no longer takes part in distance calculation and reassignment.

After spectral clustering, a set of clusters is obtained, and each cluster is a candidate microservice.
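The construction of W, D, and the unnormalized Laplacian L = D - W from steps (1) and (2) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `build_laplacian` is hypothetical, calls in either direction are summed so that W is symmetric (the text does not say how direction is handled), and the normalization, eigendecomposition, and K-means stages are omitted because their formulas are not reproduced in the text.

```python
def build_laplacian(classes, call_map, sim):
    """Build W, D and L = D - W for the classes of the monolith.

    classes:  list of class names, fixing the row/column order
    call_map: {"Caller_Callee": count}, the explicit model Map
    sim:      semantic similarity matrix S, same order as `classes`
    """
    n = len(classes)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops
            # Assumption: treat calls as undirected by summing both directions.
            calls = (call_map.get(f"{classes[i]}_{classes[j]}", 0)
                     + call_map.get(f"{classes[j]}_{classes[i]}", 0))
            # w_ij = call count if a call relationship exists, else S_ij.
            W[i][j] = float(calls) if calls else float(sim[i][j])
    # D: diagonal matrix of row sums of W.
    D = [[sum(W[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    L = [[D[i][j] - W[i][j] for j in range(n)] for i in range(n)]
    return W, D, L


classes = ["A", "B", "C"]
call_map = {"A_B": 2}
sim = [[1.0, 0.1, 0.3], [0.1, 1.0, 0.5], [0.3, 0.5, 1.0]]
W, D, L = build_laplacian(classes, call_map, sim)
```

A quick sanity check on any graph Laplacian built this way is that every row of L sums to zero, since each diagonal entry equals the sum of the off-diagonal weights in its row.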

Further, the quality evaluation of the extracted candidate microservices in step 4 comprises:

Using a modularity evaluation: the higher the modularity value, the lower the coupling between microservices and the higher the cohesion within them. The formula is as follows:

Figure BDA0003749592600000054

Here N is the number of microservices, N_i is the number of classes in microservice i, W_i is the sum of the weights of the edges inside microservice i, and W_{i,j} is the sum of the weights of the edges between microservice i and microservice j.
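The quantities defined above can be computed from the weight matrix and a cluster assignment as sketched below. The sketch deliberately stops short of the final modularity score, because the formula image is not reproduced in the text. The function name `partition_weights` and the `labels` representation (one microservice index per class) are assumptions made for illustration.

```python
def partition_weights(W, labels):
    """Compute N, N_i, W_i and W_{i,j} for a candidate partition.

    W:      symmetric weight matrix between classes
    labels: labels[c] is the microservice index assigned to class c
    """
    services = sorted(set(labels))
    sizes = {s: labels.count(s) for s in services}   # N_i
    intra = {s: 0.0 for s in services}               # W_i
    inter = {}                                       # W_{i,j}, keyed by (i, j) with i < j
    n = len(labels)
    for a in range(n):
        for b in range(a + 1, n):                    # visit each undirected edge once
            sa, sb = labels[a], labels[b]
            if sa == sb:
                intra[sa] += W[a][b]
            else:
                key = (min(sa, sb), max(sa, sb))
                inter[key] = inter.get(key, 0.0) + W[a][b]
    return len(services), sizes, intra, inter


W = [[0.0, 2.0, 0.1], [2.0, 0.0, 0.5], [0.1, 0.5, 0.0]]
N, sizes, intra, inter = partition_weights(W, [0, 0, 1])
```

A partition with large intra-service sums and small inter-service sums corresponds to the high-cohesion, low-coupling outcome that the modularity evaluation is meant to reward.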

The extracted candidate microservices are visualized: based on Echarts, a graph structure shows which microservice each class file belongs to and the call relationships between classes, and the microservice a class belongs to can be adjusted manually by dragging its vertex.

Another object of the present invention is to provide a microservice extraction system applying the microservice extraction method, the system comprising:

a layer division module, configured to divide the source code into layers and, through reverse engineering, obtain the sequence diagrams of the control layer and the class diagrams of the entity layer, the database access layer, and the other layers;

a business function modeling module, configured to perform explicit business function modeling on the sequence diagrams and implicit business function modeling on the class diagrams;

a candidate microservice extraction module, configured to extract candidate microservices from the source code through spectral clustering based on the business function model;

a quality evaluation module, configured to evaluate the quality of the candidate microservices, visualize them as a graph structure, and provide an adjustment function.

Another object of the present invention is to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the microservice extraction method.

Another object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the microservice extraction method.

Another object of the present invention is to provide an information data processing terminal configured to implement the microservice extraction system.

In combination with the above technical solutions and the technical problems solved, the advantages and positive effects of the technical solution claimed by the present invention are analyzed from the following aspects:

First, in view of the technical problems in the prior art described above and the difficulty of solving them, and drawing on the claimed technical solution and the results and data obtained during development, the following describes how the technical solution solves these problems and the inventive technical effects that result:

The microservice extraction method provided by the present invention models business functions from the source code, which solves the poor modularity of methods that extract microservices directly from source code similarity. Using the spectral clustering algorithm to extract microservices better achieves high cohesion within microservices and low coupling between them. Extracting microservices after modeling the business functions of the source code greatly lowers the barrier to using the method: architects can obtain candidate microservices without understanding the legacy system, which automates microservice extraction and reduces the time and labor cost of migrating a monolithic architecture to microservices.

By proposing a method for modeling business functions from source code, the present invention solves the problem that generating the key inputs of microservice extraction requires professionals, which makes existing methods hard to use. With the invention, an architect extracting microservices from a monolithic legacy system only needs to generate the sequence diagrams and class diagrams automatically through reverse engineering to obtain candidate microservices quickly. No analysis of the legacy system's business functions is required, which lowers the barrier to using the extraction method. The invention also visualizes the extracted microservices as a graph, so that architects can inspect the extraction results directly and adjust them.

Second, considering the technical solution as a whole or from a product perspective, the technical effects and advantages of the claimed solution are as follows:

The microservice extraction method provided by the present invention uses the source code of a monolithic legacy system to obtain business function information automatically for microservice extraction, lowering the barrier to using the extraction method. The invention helps architects extract microservices from a monolithic legacy system quickly, without needing to understand its business information, makes it easier for technicians to obtain microservices with high cohesion and low coupling, and reduces the cost and risk of migrating from a monolithic architecture to a microservice architecture.

Third, as supporting evidence of the inventiveness of the claims, the following important aspects also apply:

(1) The technical solution of the present invention fills a technical gap in the industry at home and abroad:

Current mainstream microservice extraction techniques have the following problems. 1. Methods that extract microservices from program execution traces depend on the design of test cases, which must be produced by software designers familiar with the system. Legacy systems often have complex business logic, high developer turnover, and incomplete documentation, so test cases rarely cover all business functions, and incomplete test cases lead to low class coverage in the extracted microservices; these methods therefore have a high barrier to use. 2. Methods that extract microservices from source code generally rely on code similarity, which cannot fully reflect business relationships; the extracted microservices are of poor quality and usability, and because microservice quality largely determines scalability, using these methods carries considerable risk. These problems make automatic microservice extraction difficult.

The present invention proposes a technique for extracting business functions from source code: sequence diagrams are obtained from the source code by reverse engineering, yielding the call relationships that reflect the program's business functions. This improves microservice quality without requiring expert involvement and reduces the difficulty of automatic microservice extraction.

(2) The technical solution of the present invention solves a technical problem that people have long wanted to solve without success:

Research on microservice extraction has long sought a method that automatically splits a monolithic system into microservices with good functional modularity that are easy to extend, develop, and maintain, thereby solving the difficulty of maintaining and extending monolithic systems. Current microservice extraction methods, however, cannot solve all of these problems at once.

The source-code-based business function extraction method of the present invention solves the technical problem that obtaining a program's business functions automatically currently requires analyzing execution traces, which depends on experts designing test cases and therefore cannot be fully automated. It also overcomes the poor quality of the microservices extracted by traditional source-code-based methods.

Description of drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of the microservice extraction method provided by an embodiment of the present invention;

Fig. 2 is a flowchart of explicit business function modeling provided by an embodiment of the present invention;

Fig. 3 is a flowchart of implicit business function modeling provided by an embodiment of the present invention;

Fig. 4 is a result diagram of the visualization of candidate microservices provided by an embodiment of the present invention.

Detailed description

To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to embodiments. The specific embodiments described here are only intended to explain the present invention, not to limit it.

To address the problems in the prior art, the present invention provides a microservice extraction method, system, medium, device, and information processing terminal, described in detail below with reference to the drawings.

1. Explanatory embodiment. So that those skilled in the art can fully understand how the present invention is implemented, this part expands on the technical solution of the claims.

如图1所示,本发明实施例提供的微服务提取方法包括以下步骤:As shown in Figure 1, the microservice extraction method provided by the embodiment of the present invention includes the following steps:

S101: divide the source code into layers;

S102: obtain the sequence diagrams and class diagrams of the control layer through reverse engineering;

S103: perform explicit business function modeling on the sequence diagrams;

S104: perform implicit business function modeling on the class diagrams;

S105: extract candidate microservices from the source code through spectral clustering based on the business function models;

S106: evaluate the quality of the candidate microservices;

S107: visualize the candidate microservices as a graph structure and provide an adjustment function.

As a preferred embodiment, the microservice extraction method provided by an embodiment of the present invention specifically comprises the following steps:

S1: Divide the source code of the monolithic legacy system into layers, as shown in FIG. 2, identifying the control layer, service (business) layer, database access layer, entity layer, and other layers. In source code that follows common development conventions, classes of the same layer usually sit in the same package and share the same class-name suffix. The control layer is usually the package containing classes such as Controller, Action, or View that directly receive user requests and respond to users. The database access layer is usually a Dao package or another package that interacts directly with the database. The service layer is usually a package such as Service that interacts with both the control layer and the database access layer. Classes in the entity layer are generally used to encapsulate data for data transfer. The remaining classes are mainly project configuration classes and utility classes.
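The suffix and package conventions described in S1 can be sketched as a small rule-based classifier. This is a minimal illustration, not the patented implementation: the rule tables, the function name `classify_layer`, and the `com.example.blog` package names are all hypothetical.

```python
# Hypothetical rule tables; a real project would tune these to its own conventions.
SUFFIX_RULES = [
    ("control", ("Controller", "Action", "View")),
    ("service", ("Service", "ServiceImpl")),
    ("data_access", ("Dao", "Repository")),
]
PACKAGE_RULES = {
    "controllers": "control",
    "services": "service",
    "dao": "data_access",
    "repositories": "data_access",
    "models": "entity",
    "forms": "entity",
}


def classify_layer(qualified_name):
    """Guess the layer of a class from its name suffix, then from its package."""
    *packages, simple_name = qualified_name.split(".")
    for layer, suffixes in SUFFIX_RULES:
        if simple_name.endswith(suffixes):  # str.endswith accepts a tuple
            return layer
    for segment in reversed(packages):  # innermost package first
        if segment.lower() in PACKAGE_RULES:
            return PACKAGE_RULES[segment.lower()]
    return "other"  # configuration classes, utilities, and so on


layers = {
    name: classify_layer(name)
    for name in [
        "com.example.blog.controllers.PostController",
        "com.example.blog.repositories.PostRepository",
        "com.example.blog.models.Post",
        "com.example.blog.support.MarkdownUtil",
    ]
}
```

Checking the suffix before the package is a design choice of this sketch; projects that mix layers inside one package may need the opposite order.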

S2: Generate sequence diagrams and class diagrams through reverse engineering. Specifically, this step comprises the following sub-steps:

S21: Use the SequenceDiagram tool to generate a sequence diagram for each method in the control layer of the source code, where the control layer refers to the classes responsible for interacting with the user interface;

S22: Use the PlantUMLParser tool to generate a class diagram for each class file of the source code.

S3: Perform explicit business function modeling based on the sequence diagrams. FIG. 2 is the flowchart of explicit business function modeling provided by an embodiment of the present invention. The detailed process is as follows:

S31: From the set of sequence diagram files, read one file and parse it; when the sequence diagram file is empty, end the explicit business function modeling process;

S32: Count the number of calls $f_{ij}$ between two classes $C_i$ and $C_j$: each time a call from $C_i$ to $C_j$ appears, increment $f_{ij}$ by 1;

S33: Store the call count between the two classes $C_i$ and $C_j$ in a Map structure, where the key is the string formed by joining the two class names with "_" and the value is the call count $f_{ij}$;

S34: After all sequence diagram files have been parsed, output the model Map.
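Steps S31 to S34 amount to counting caller/callee pairs across all sequence diagrams. The sketch below assumes each diagram has already been parsed into a list of `(caller, callee)` pairs; that input shape and the name `build_call_map` are assumptions made for illustration.

```python
from collections import Counter


def build_call_map(sequence_diagrams):
    """Count calls per class pair, keyed by 'Caller_Callee' as in S33."""
    call_map = Counter()
    for calls in sequence_diagrams:      # S31: one parsed diagram at a time
        for caller, callee in calls:     # S32: each occurrence adds 1
            call_map[f"{caller}_{callee}"] += 1
    return dict(call_map)                # S34: the finished model Map


# Hypothetical parsed diagrams for two controller methods.
diagrams = [
    [("PostController", "PostService"), ("PostService", "PostRepository")],
    [("PostController", "PostService")],
]
model = build_call_map(diagrams)
```

Note that a "_"-joined key becomes ambiguous if class names themselves contain underscores; a tuple key would avoid that, at the cost of departing from the key format the text describes.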

S4: Perform implicit business function modeling based on the class diagrams. FIG. 3 is the flowchart of implicit business function modeling provided by an embodiment of the present invention. The detailed process is as follows:

S41: From the set of class diagram files, read two class diagrams and parse them; when the class diagram file is empty, end the implicit business function modeling process;

S42: Calculate the semantic similarity $S_{ij}$ between two class diagrams $C_i$ and $C_j$. First, build a bag-of-words dictionary from the input text; then match the words in the text against the keys of the dictionary to obtain the corpus (corpus); initialize a tf-idf transformation model to obtain the transformed corpus corpus_tfidf; train an Lsi model on corpus_tfidf; compute the sparse matrix similarity; and convert the token list whose similarity is sought into the corpus format doc_test_vec to obtain the text similarity.
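The idea behind S42 can be illustrated with a standard-library sketch that computes tf-idf vectors and cosine similarity between the text of class diagrams. It deliberately omits the Lsi projection step described above and uses a smoothed idf, so it is a simplified stand-in rather than the pipeline the text names; the tokenizer, the function names, and the sample class text are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split identifiers such as 'findByEmail' into lowercase words."""
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", text)
    return [w.lower() for w in words]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per tokenized document."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed idf keeps every weight positive (a choice of this sketch).
        vectors.append({t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical class-diagram text: class name followed by method names.
class_text = {
    "PostController": "PostController showPost createPost",
    "PostService": "PostService getPost savePost",
    "UserRepository": "UserRepository findByEmail",
}
names = list(class_text)
vectors = tfidf_vectors([tokenize(class_text[name]) for name in names])
S = [[cosine(u, v) for v in vectors] for u in vectors]  # S[i][j] plays the role of S_ij
```

In this toy input the two Post classes share the word "post" and therefore score higher with each other than either does with UserRepository, which shares no words with them.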

S5: Cluster with spectral clustering to obtain the candidate microservices. The detailed process is as follows:

S51: Construct the similarity matrix using the formula below, where $W$ is the similarity matrix with entries $w_{ij}$: if a call relationship exists between two classes i and j, then $w_{ij} = \mathrm{map}_{ij}$; otherwise $w_{ij} = S_{ij}$. $D$ is a diagonal matrix whose diagonal entries are the sums of the corresponding rows (or columns) of $W$.

$$w_{ij} = \begin{cases} \mathrm{map}_{ij}, & \text{if a call relationship exists between classes } i \text{ and } j \\ S_{ij}, & \text{otherwise} \end{cases} \qquad D_{ii} = \sum_{j} w_{ij}$$
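The construction in S51 can be sketched in plain Python as follows. Two points are assumptions of this sketch rather than statements from the text: calls in both directions are summed so that W is symmetric, and the diagonal of W is left at zero.

```python
def build_similarity_matrix(classes, call_map, semantic):
    """w_ij = call count if classes i and j call each other, else S_ij."""
    n = len(classes)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops (assumption of this sketch)
            # Sum both call directions so that W stays symmetric (assumption).
            calls = (call_map.get(f"{classes[i]}_{classes[j]}", 0)
                     + call_map.get(f"{classes[j]}_{classes[i]}", 0))
            W[i][j] = float(calls) if calls > 0 else semantic[i][j]
    return W


def degree_matrix(W):
    """Diagonal matrix whose entries are the row sums of W."""
    n = len(W)
    return [[sum(W[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical inputs: one call pair and a small semantic similarity matrix.
classes = ["A", "B", "C"]
call_map = {"A_B": 2}
semantic = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.1], [0.3, 0.1, 1.0]]
W = build_similarity_matrix(classes, call_map, semantic)
D = degree_matrix(W)
```

Call counts and semantic similarities live on different scales, so a real implementation would probably rescale one of them; the text does not say how, and this sketch does not attempt it.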

S52: Construct the Laplacian matrix L and normalize it, using the following formulas:

$L = D - W$

[Formula image: Figure BDA0003749592600000102]
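A minimal sketch of S52 follows. Because the normalization formula is only available as an image, the symmetric form $D^{-1/2} L D^{-1/2}$ used here is an assumption; the random-walk form $D^{-1} L$ would be an equally common choice.

```python
import math


def laplacian(W):
    """Unnormalized graph Laplacian L = D - W, plus the degree vector."""
    n = len(W)
    degree = [sum(row) for row in W]
    L = [[(degree[i] if i == j else 0.0) - W[i][j] for j in range(n)]
         for i in range(n)]
    return L, degree


def normalize_symmetric(L, degree):
    """Assumed normalization: D^{-1/2} L D^{-1/2} (isolated nodes map to 0)."""
    n = len(L)
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    return [[inv_sqrt[i] * L[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


# Hypothetical symmetric similarity matrix with a zero diagonal.
W = [[0.0, 2.0, 0.3],
     [2.0, 0.0, 0.1],
     [0.3, 0.1, 0.0]]
L, degree = laplacian(W)
L_sym = normalize_symmetric(L, degree)
```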

S53: Perform eigenvalue decomposition on the Laplacian matrix, using the Lanczos method to accelerate the decomposition, to obtain the k smallest eigenvalues and their corresponding eigenvectors, which together form the k-dimensional feature matrix F.
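Written out, S53 can be read as the following eigenproblem. The symbols $L_{\mathrm{norm}}$ (the normalized Laplacian from S52), $u_i$, $\lambda_i$, and $n$ (the number of classes) are notation introduced here for illustration, and the ordering assumes a symmetric W with nonnegative weights, for which all eigenvalues are real and nonnegative.

```latex
L_{\mathrm{norm}}\, u_i = \lambda_i\, u_i, \qquad
0 \le \lambda_1 \le \lambda_2 \le \dots \le \lambda_n, \qquad
F = \begin{bmatrix} u_1 & u_2 & \cdots & u_k \end{bmatrix} \in \mathbb{R}^{n \times k}
```

Under this reading, row i of F is the k-dimensional representation of class i that is passed on to the K-means step.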

S54: Cluster the k-dimensional feature matrix F using K-means. First, AFK-MC² is used to select the initial cluster centers: an initial center sample $c_1$ is drawn at random, the proposal distribution $q(x)$ is computed over the whole data set (Formula 5), and a data point is drawn at random from $q(x)$ and its distance $d_x$ is computed (Formula 6). Finally, Markov chain Monte Carlo sampling (Metropolis-Hastings) produces a sequence of length m, and the last k-1 samples are taken as center points, giving $C = \{C_1, C_2, \ldots, C_k\}$.

[Formula image: Figure BDA0003749592600000111]

[Formula image: Figure BDA0003749592600000112]

$d_x = d(x, C_{i-1})^2$
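The seeding step can be sketched as below. The sketch follows the AFK-MC² procedure as it is usually described in the literature (one Metropolis-Hastings chain of length m per additional center, with proposal $q(x) = \tfrac{1}{2}\, d(x,c_1)^2 / \sum_{x'} d(x',c_1)^2 + \tfrac{1}{2n}$). Since the formula images are not available, whether this matches the patent's Formulas 5 and 6 exactly is an assumption, and the function and variable names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def afk_mc2_seeding(points, k, m, rng=None):
    """Pick k initial centers; m is the Markov chain length per center."""
    rng = rng or random.Random()
    n = len(points)
    c1 = points[rng.randrange(n)]                 # first center: uniform draw
    d1 = [sq_dist(p, c1) for p in points]
    total = sum(d1)
    if total > 0:
        q = [0.5 * d / total + 0.5 / n for d in d1]   # assumed proposal q(x)
    else:
        q = [1.0 / n] * n                         # all points coincide with c1
    centers = [c1]
    indices = range(n)

    def dist_to_centers(idx):
        return min(sq_dist(points[idx], c) for c in centers)

    for _ in range(k - 1):
        x = rng.choices(indices, weights=q)[0]
        dx = dist_to_centers(x)                   # d_x = d(x, C_{i-1})^2
        for _ in range(m - 1):
            y = rng.choices(indices, weights=q)[0]
            dy = dist_to_centers(y)
            # Metropolis-Hastings acceptance; always leave a zero-distance state.
            if dx == 0 or dy * q[x] / (dx * q[y]) > rng.random():
                x, dx = y, dy
        centers.append(points[x])
    return centers


# Hypothetical 2-D data with three visible groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
centers = afk_mc2_seeding(points, k=3, m=20, rng=random.Random(0))
```

Longer chains (larger m) bring the seeding closer to exact k-means++ sampling at the cost of more distance computations.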

A-means is used to reduce the time the K-means algorithm spends assigning data points to clusters. Compute the distance from each data point $x_i$ to every cluster center, select the nearest centroid $C_k$, and compute the point's equidistance index $\alpha_i$ as follows:

$\alpha_i = \left| \lVert x_i - \mu_1 \rVert_2 - \lVert x_i - \mu_2 \rVert_2 \right|$

Compute the improvement threshold [Formula image: Figure BDA0003749592600000113] using the following formula:

[Formula image: Figure BDA0003749592600000114]

When [Formula image: Figure BDA0003749592600000115] holds, $x_i$ is considered not to move in this iteration; $x_i$ is permanently assigned to cluster $C_k$ and no longer takes part in distance calculation and reassignment.
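The assignment shortcut can be sketched as follows. Several things are assumptions here: $\mu_1$ and $\mu_2$ are taken to be the nearest and second-nearest centroids, the index is the difference of Euclidean distances, a point is fixed when its index exceeds the threshold, and the threshold is passed in as a plain parameter because its formula is only available as an image.

```python
import math


def euclid(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_with_equidistance(points, centroids, threshold):
    """Assign each point to its nearest centroid and flag stable points."""
    assignments, fixed = [], []
    for p in points:
        ranked = sorted((euclid(p, c), idx) for idx, c in enumerate(centroids))
        (d1, nearest), (d2, _) = ranked[0], ranked[1]
        alpha = abs(d1 - d2)              # equidistance index of this point
        assignments.append(nearest)
        # Assumed rule: a point far from the decision boundary stays put.
        fixed.append(alpha > threshold)
    return assignments, fixed


# Hypothetical data: the middle point is equidistant from both centroids.
points = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
centroids = [(0.0, 0.0), (10.0, 0.0)]
assignments, fixed = assign_with_equidistance(points, centroids, threshold=1.0)
```

Points flagged as fixed can be skipped in later iterations, which is where the time saving would come from.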

After the spectral clustering steps above, a set of clusters is obtained, and each cluster is a microservice candidate.

S6: Evaluate the quality of the extracted candidate microservices using modularity evaluation: the higher the modularity value, the lower the coupling between microservices and the higher the cohesion within each microservice. The formula is as follows:

[Formula image: Figure BDA0003749592600000116]

where N is the number of microservices, $N_i$ is the number of classes in microservice i, $W_i$ is the sum of the weights of the edges inside microservice i, and $W_{i,j}$ is the sum of the weights of the edges between microservice i and microservice j.
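The formula itself is only available as an image, so the sketch below assumes one common modularity-quality form that uses exactly these four quantities: $MQ = \frac{1}{N}\sum_i \frac{W_i}{N_i^2} - \frac{1}{N(N-1)/2}\sum_{i<j} \frac{W_{i,j}}{2 N_i N_j}$. Treat it as an illustration of how such a score behaves, not as the patent's exact definition; the function name and sample data are hypothetical.

```python
from itertools import combinations


def modularity_quality(W, membership):
    """Assumed MQ: mean internal density minus mean cross-service density."""
    services = sorted(set(membership))
    members = {s: [i for i, m in enumerate(membership) if m == s]
               for s in services}
    n_services = len(services)

    def internal_weight(nodes):
        # Each undirected edge inside the service is counted once.
        return sum(W[i][j] for i, j in combinations(nodes, 2))

    def cross_weight(a, b):
        return sum(W[i][j] for i in a for j in b)

    cohesion = sum(internal_weight(members[s]) / len(members[s]) ** 2
                   for s in services) / n_services
    if n_services < 2:
        return cohesion
    coupling = sum(cross_weight(members[s], members[t])
                   / (2 * len(members[s]) * len(members[t]))
                   for s, t in combinations(services, 2))
    coupling /= n_services * (n_services - 1) / 2
    return cohesion - coupling


# Hypothetical symmetric weights: edges 0-1 (2), 1-2 (1), 2-3 (3).
W = [[0, 2, 0, 0],
     [2, 0, 1, 0],
     [0, 1, 0, 3],
     [0, 0, 3, 0]]
mq_good = modularity_quality(W, [0, 0, 1, 1])  # heavy edges kept inside services
mq_bad = modularity_quality(W, [0, 1, 0, 1])   # heavy edges cut across services
```

A split that keeps heavily connected classes together scores higher than one that cuts through them, which matches the "higher is better" reading in the text.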

S7: Visualize the candidate microservices extracted by this method: using ECharts, display the microservice to which each class belongs as a graph structure. As shown in FIG. 4, each vertex represents a class file, the vertex color indicates the microservice it belongs to (vertices of the same color belong to the same microservice), and each edge represents a call relationship between classes.
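One way to feed such a view is to build an ECharts option object for a `graph` series and serialize it as JSON for the front end. The sketch below is an assumption about how the data could be shaped, not the patent's implementation; the helper name, the category labels, and the sample data are hypothetical.

```python
import json


def build_echarts_option(classes, membership, calls):
    """Build an option dict for an ECharts 'graph' series.

    classes: class names; membership: cluster id per class;
    calls: (caller, callee, count) tuples.
    """
    services = sorted(set(membership))
    categories = [{"name": f"microservice-{s}"} for s in services]
    nodes = [{"name": name, "category": services.index(m)}
             for name, m in zip(classes, membership)]
    links = [{"source": caller, "target": callee, "value": count}
             for caller, callee, count in calls]
    return {
        "legend": [{"data": [c["name"] for c in categories]}],
        "series": [{
            "type": "graph",
            "layout": "force",
            "roam": True,        # allow zoom and pan
            "draggable": True,   # allow nodes to be dragged
            "label": {"show": True},
            "categories": categories,  # one color per microservice
            "data": nodes,
            "links": links,
        }],
    }


option = build_echarts_option(
    ["PostController", "PostService", "UserService"],
    [0, 0, 1],
    [("PostController", "PostService", 2)],
)
option_json = json.dumps(option)  # hand this to the ECharts setOption call
```

Mapping each microservice to a category lets ECharts assign one color per category, which corresponds to the same-color-same-microservice convention described above.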

The microservice extraction system provided by an embodiment of the present invention comprises:

a layer division module, configured to divide the source code into layers and, through reverse engineering, obtain sequence diagrams of the control layer and class diagrams of the entity layer, the database access layer, and other layers;

a business function modeling module, configured to perform explicit business function modeling on the sequence diagrams and implicit business function modeling on the class diagrams;

a candidate microservice extraction module, configured to extract candidate microservices from the source code through spectral clustering based on the business function models;

a quality evaluation module, configured to evaluate the quality of the candidate microservices, visualize them as a graph structure, and provide an adjustment function.

2. Application embodiments. To demonstrate the inventiveness and technical value of the technical solution of the present invention, this part describes an application of the claimed technical solution to a specific product or related technology.

The technical solution of the present invention is described in detail below using microservice extraction from the SpringBlog application as an example.

Step 1: Pull the source code of the SpringBlog application from GitHub and analyze its layer structure, identifying the control layer, service (business) layer, database access layer, entity layer, and other layers. Control-layer classes end with the suffix Controller and are the classes that directly receive user requests and respond to users. The database access layer is located in the repositories package, which interacts with the database. The service layer is located in packages such as Service that interact with both the control layer and the database access layer. Entity-layer classes are used to encapsulate data for data transfer and are mainly located in the models and forms packages. The remaining classes are mainly project configuration classes and utility classes.

Step 2: Use the tools to obtain sequence diagrams for the control layer of SpringBlog and class diagrams for the other layers.

Step 3: Perform explicit business function modeling based on the sequence diagrams. From the set of sequence diagram files, read one file and parse it. Count the number of calls $f_{ij}$ between two classes $C_i$ and $C_j$: each time a call from $C_i$ to $C_j$ appears, increment $f_{ij}$ by 1. Store the call count between the two classes in a Map structure, where the key is the string formed by joining the two class names with "_" and the value is the call count $f_{ij}$. When the sequence diagram file is empty, end the explicit business function modeling process and output the model Map.

Step 4: Perform implicit business function modeling based on the class diagrams. Calculate the semantic similarity $S_{ij}$ between two class diagrams $C_i$ and $C_j$. First, build a bag-of-words dictionary from the input text; then match the words in the text against the keys of the dictionary to obtain the corpus (corpus); initialize a tf-idf transformation model to obtain the transformed corpus corpus_tfidf; train an Lsi model on corpus_tfidf; compute the sparse matrix similarity; and convert the token list whose similarity is sought into the corpus format doc_test_vec to obtain the text similarity.

Step 5: Cluster with spectral clustering to obtain the candidate microservices.

Step 6: Evaluate quality using modularity evaluation: the higher the modularity value, the lower the coupling between microservices and the higher the cohesion within each microservice. The formula is as follows:

[Formula image: Figure BDA0003749592600000131]

where N is the number of microservices, $N_i$ is the number of classes in microservice i, $W_i$ is the sum of the weights of the edges inside microservice i, and $W_{i,j}$ is the sum of the weights of the edges between microservice i and microservice j.

Step 7: Visualize the candidate microservices extracted by this method.

3. Evidence of the effects of the embodiments. The embodiments of the present invention achieved positive effects during development and use and offer advantages over the prior art; these are described below with reference to data and tables from the experiments.

Compared with a traditional source-code-based microservice extraction method, the method proposed here improves module quality. The table below shows the quality evaluation of the microservices that the two methods extracted from the SpringBlog application.

MEMSC is the MQ value of the microservices extracted by the traditional method, and OUR is the MQ value of the microservices obtained by the method proposed here. K is the splitting granularity: the SpringBlog application was split into 6, 7, 8, 9, and 10 microservices, and the module quality MQ was computed for each split. The MQ value represents the quality of the microservices: the higher the value, the better the modularity and the more closely the microservices meet the standard of high cohesion and low coupling.

As the table shows, the proposed method yields higher microservice quality than the traditional method at all five splitting granularities. The improvement is especially large at splitting granularity k=9.

[Table image: Figure BDA0003749592600000141, MQ values of MEMSC and OUR for K = 6 to 10]

It should be noted that embodiments of the present invention may be implemented in hardware, software, or a combination of the two. The hardware portion may be implemented with dedicated logic; the software portion may be stored in memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those of ordinary skill in the art will understand that the devices and methods described above may be implemented using computer-executable instructions and/or processor control code, with such code provided, for example, on a carrier medium such as a magnetic disk, CD, or DVD-ROM, on programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The device of the present invention and its modules may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of such hardware circuits and software, for example firmware.

The above is only a specific embodiment of the present invention, and the scope of protection of the invention is not limited to it. Any modification, equivalent replacement, or improvement made by a person skilled in the art within the technical scope disclosed herein and within the spirit and principles of the invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A microservice extraction method, characterized by comprising the following steps:
dividing source code into layers; obtaining, through reverse engineering, a sequence diagram for each method of the control layer and class diagrams for the entity layer, the database access layer, and other layers; performing explicit business function modeling on the sequence diagrams; performing implicit business function modeling on the class diagrams; extracting candidate microservices from the source code through spectral clustering based on the business function models; evaluating the quality of the candidate microservices; and finally visualizing the candidate microservices as a graph structure and providing an adjustment function for architects.
2. The microservice extraction method of claim 1, characterized by comprising the following steps:
step one, dividing the source code into layers, obtaining sequence diagrams of the control layer through reverse engineering, and obtaining class diagrams of the entity layer, the database access layer, and other layers;
step two, performing explicit business function modeling on the sequence diagrams and implicit business function modeling on the class diagrams;
step three, extracting candidate microservices from the source code through spectral clustering based on the business function models;
step four, evaluating the quality of the candidate microservices, visualizing the candidate microservices as a graph structure, and providing an adjustment function.
3. The microservice extraction method of claim 2, wherein in step one the source code of the monolithic legacy system is divided into layers, and the control layer, business layer, database access layer, entity layer, and other layers in the source code are identified;
sequence diagrams and class diagrams of the system are generated through reverse engineering from the source code of the monolithic system:
(1) a sequence diagram is generated for each method in the control layer of the source code using the SequenceDiagram tool, where the control layer refers to the classes responsible for interacting with the user interface;
(2) class diagrams for the database access layer, the entity layer, and other layers are generated for each class file of the source code using the PlantUML Parser tool.
4. The microservice extraction method of claim 2, wherein in step two, performing explicit business function modeling on the sequence diagrams to obtain a call relationship mapping table between classes specifically comprises:
(1) reading one sequence diagram file from the set of sequence diagram files and parsing it, and ending the business function modeling process when the sequence diagram file is empty;
(2) counting the number of calls $f_{ij}$ between two classes $C_i$ and $C_j$, incrementing $f_{ij}$ by 1 each time a call between $C_i$ and $C_j$ occurs;
(3) storing the call count between the two classes $C_i$ and $C_j$ in a Map structure, where the key is the string formed by joining the class names of the two classes with "_" and the value is the call count $f_{ij}$;
(4) after all sequence diagram files have been parsed, outputting the model Map;
performing implicit business function modeling on the class diagrams to obtain a semantic similarity matrix between classes comprises:
(1) reading two class diagrams from the set of class diagram files and parsing them, and ending the implicit business function modeling process when the class diagram file is empty;
(2) calculating the semantic similarity $S_{ij}$ between the two class diagrams $C_i$ and $C_j$; building a bag-of-words dictionary from the input text, and matching the words in the text against the keys of the dictionary to obtain the corpus; initializing a tf-idf transformation model to obtain the transformed corpus corpus_tfidf, training an Lsi model on corpus_tfidf, and computing the sparse matrix similarity; and converting the token list whose similarity is sought into the corpus format doc_test_vec to obtain the text similarity.
5. The microservice extraction method of claim 2, wherein in step three, after the business function model of the source code is obtained, clustering is performed with a spectral clustering algorithm to obtain candidate microservices:
(1) constructing a similarity matrix, with the following formula:
[Formula image: Figure FDA0003749592590000021]
where $W$ is the similarity matrix with entries $w_{ij}$; if a call relationship exists between two classes i and j, then $w_{ij} = \mathrm{map}_{ij}$, otherwise $w_{ij} = S_{ij}$; $D$ is a diagonal matrix whose diagonal entries are the sums of the corresponding rows or columns of $W$;
(2) constructing the Laplacian matrix L and normalizing it, with the following formulas:
$L = D - W$;
[Formula image: Figure FDA0003749592590000022]
(3) performing eigenvalue decomposition on the Laplacian matrix, using the Lanczos method to accelerate the decomposition, to obtain the k smallest eigenvalues and the corresponding eigenvectors, which form the k-dimensional feature matrix F;
(4) clustering the k-dimensional feature matrix F using K-means; selecting the initial cluster centers using AFK-MC², drawing an initial center sample $c_1$ at random; computing the proposal distribution $q(x)$ over the whole data set, drawing a data point at random from $q(x)$, and computing its distance $d_x$; sampling a sequence of length m using Markov chain Monte Carlo, and taking the last k-1 samples as center points, giving $C = \{C_1, C_2, \ldots, C_k\}$;
[Formula image: Figure FDA0003749592590000031]
[Formula image: Figure FDA0003749592590000032]
using A-means to reduce the time the K-means algorithm spends assigning data points to clusters; computing the distance from each data point $x_i$ to every cluster center, selecting the nearest centroid $C_k$, and computing the point's equidistance index $\alpha_i$ as follows:
$\alpha_i = \left| \lVert x_i - \mu_1 \rVert_2 - \lVert x_i - \mu_2 \rVert_2 \right|$;
computing the improvement threshold [Formula image: Figure FDA0003749592590000033] with the following formula:
[Formula image: Figure FDA0003749592590000034]
when [Formula image: Figure FDA0003749592590000035] holds, $x_i$ does not move in this iteration, and $x_i$ is permanently assigned to cluster $C_k$ and no longer takes part in distance calculation and reassignment;
after spectral clustering, a set of clusters is obtained, and each cluster is a microservice candidate.
6. The microservice extraction method of claim 2, wherein the quality evaluation of the extracted candidate microservices in step four comprises:
using modularity evaluation, where the higher the modularity value, the lower the coupling between microservices and the higher the cohesion within each microservice, calculated as follows:
[Formula image: Figure FDA0003749592590000036]
where N is the number of microservices, $N_i$ is the number of classes in microservice i, $W_i$ is the sum of the weights of the edges inside microservice i, and $W_{i,j}$ is the sum of the weights of the edges between microservice i and microservice j;
and visualizing the extracted candidate microservices: using ECharts, displaying as a graph structure the microservice to which each class file belongs and the call relationships between classes, where dragging a vertex manually adjusts the microservice to which a class belongs.
7. A microservice extraction system applying the microservice extraction method of any one of claims 1 to 6, the microservice extraction system comprising:
a layer division module, configured to divide the source code into layers, obtain sequence diagrams of the control layer through reverse engineering, and obtain class diagrams of the entity layer, the database access layer, and other layers;
a business function modeling module, configured to perform explicit business function modeling on the sequence diagrams and implicit business function modeling on the class diagrams;
a candidate microservice extraction module, configured to extract candidate microservices from the source code through spectral clustering based on the business function models;
a quality evaluation module, configured to evaluate the quality of the candidate microservices, visualize the candidate microservices as a graph structure, and provide an adjustment function.
8. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of the microservice extraction method of any one of claims 1 to 6.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to carry out the steps of the microservice extraction method of any one of claims 1 to 6.
10. An information data processing terminal, characterized by being configured to implement the microservice extraction system of claim 7.
CN202210835089.8A 2022-07-16 2022-07-16 Micro-service extraction method, system, medium, equipment and information processing terminal Pending CN115309634A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210835089.8A CN115309634A (en) 2022-07-16 2022-07-16 Micro-service extraction method, system, medium, equipment and information processing terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210835089.8A CN115309634A (en) 2022-07-16 2022-07-16 Micro-service extraction method, system, medium, equipment and information processing terminal

Publications (1)

Publication Number Publication Date
CN115309634A true CN115309634A (en) 2022-11-08

Family

ID=83857533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210835089.8A Pending CN115309634A (en) 2022-07-16 2022-07-16 Micro-service extraction method, system, medium, equipment and information processing terminal

Country Status (1)

Country Link
CN (1) CN115309634A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115759720A (en) * 2022-11-24 2023-03-07 北京中知智慧科技有限公司 Method and device for disassembling micro-service of software system
CN117311801A (en) * 2023-11-27 2023-12-29 湖南科技大学 Micro-service splitting method based on networking structural characteristics
CN117539433A (en) * 2023-11-02 2024-02-09 北京航空航天大学 Microservice design method based on model driven architecture

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115759720A (en) * 2022-11-24 2023-03-07 北京中知智慧科技有限公司 Method and device for disassembling micro-service of software system
CN117539433A (en) * 2023-11-02 2024-02-09 北京航空航天大学 Microservice design method based on model driven architecture
CN117311801A (en) * 2023-11-27 2023-12-29 湖南科技大学 Micro-service splitting method based on networking structural characteristics
CN117311801B (en) * 2023-11-27 2024-04-09 湖南科技大学 Micro-service splitting method based on networking structural characteristics


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination