CN117520209B

CN117520209B - Code review method, device, computer equipment and storage medium

Info

Publication number: CN117520209B
Application number: CN202410002350.5A
Authority: CN
Inventors: 王万里; 张晋铭
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2024-01-02
Filing date: 2024-01-02
Publication date: 2024-04-26
Anticipated expiration: 2044-01-02
Also published as: CN117520209A

Abstract

The application relates to a code review method, a code review device, computer equipment and a storage medium. The method comprises the following steps: acquiring a software code and a corresponding analysis model, wherein the analysis model comprises at least one of a standard analysis structure diagram corresponding to the software code and a standard analysis sequence diagram corresponding to a standard function of the software code; inputting the software codes and the analysis model into a code review model, and outputting review information corresponding to the software codes; wherein the review information is used for characterizing matching information of the software code and the analysis model; the code review model is obtained by training an initial natural language model based on a sample software code and a corresponding sample analysis model. The application can efficiently and accurately automatically realize the difference comparison of the software code and the analysis model, discover the problems in the software code and give detailed and comprehensive review information.

Description

Code review method, device, computer equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence technology, and in particular, to a code review method, apparatus, computer device, storage medium, and computer program product.

Background

In software engineering, analytical models are an important way to guide code implementation. After the analytical model is determined in the analytical modeling stage, the specific implementation of the software code needs to be performed according to the analytical model. However, many times, code implementations deviate from analytical models due to complexity and differences between analytical models and software code implementations.

In the related art, whether the analysis model is consistent with the software code is checked by manual review. Manual review has several disadvantages: it is time-consuming and inefficient; it is subjective and inconsistent because it depends on the experience and ability of the reviewer; and it has difficulty finding subtle differences and potential problems. Therefore, how to efficiently and accurately identify inconsistencies between the code and the analysis model is a technical problem to be solved.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a code review method, apparatus, computer device, computer-readable storage medium, and computer program product that can automatically, efficiently, and accurately compare software code against an analysis model.

In a first aspect, the present application provides a code review method. The method comprises the following steps:

acquiring a software code and a corresponding analysis model, wherein the analysis model comprises at least one of a standard analysis structure diagram corresponding to the software code and a standard analysis sequence diagram corresponding to a standard function of the software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures;

Inputting the software codes and the analysis model into a code review model, and outputting review information corresponding to the software codes; wherein the review information is used for characterizing matching information of the software code and the analysis model; the code review model is obtained by training an initial natural language model based on a sample software code and a corresponding sample analysis model.

In one embodiment, the code review model is obtained by:

acquiring a first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information;

Respectively inputting sample software codes and sample analysis models corresponding to various review tasks into an initial natural language model, and outputting a first prediction result;

and iteratively adjusting the initial natural language model based on the difference between the first prediction result and each piece of marked review information until the difference meets the preset requirement to obtain a code review model.
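The stopping rule in the training steps above (adjust the model until the difference meets a preset requirement) can be sketched as follows. This is only a toy illustration under stated assumptions: a one-parameter linear model stands in for the initial natural language model, the "difference" is assumed to be a mean squared error, and the function name `train_until_converged` and its threshold, learning rate and step limit are hypothetical, not taken from the application.

```python
from typing import List, Tuple


def train_until_converged(
    samples: List[Tuple[float, float]],
    threshold: float = 1e-4,
    learning_rate: float = 0.05,
    max_steps: int = 10_000,
) -> Tuple[float, float]:
    """Adjust the weight w until the mean squared difference is below threshold."""
    w = 0.0
    loss = float("inf")
    for _ in range(max_steps):
        # Difference between the prediction (w * x) and the labelled value y.
        loss = sum((w * x - y) ** 2 for x, y in samples) / len(samples)
        if loss < threshold:  # the assumed "preset requirement"
            break
        # Gradient of the mean squared difference with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= learning_rate * grad
    return w, loss


# Toy labelled samples (input, label) that follow y = 2 * x.
weight, final_loss = train_until_converged([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

In a real setting the same loop shape would wrap a forward pass, a loss over the labelled review information, and an optimizer step.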

In one embodiment, the acquiring the first set of samples includes:

acquiring initial sample software codes and initial sample analysis models corresponding to various review tasks;

performing code refactoring, comment addition, or variable renaming on the initial sample software code to obtain an expanded sample software code;

and obtaining a first sample set based on the initial sample software code, the initial sample analysis model and the extended sample software code.
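The sample expansion described above can be illustrated with a minimal sketch. It assumes the sample code is Python source and uses the standard `ast` module (`ast.unparse` needs Python 3.9 or later); the helper names `rename_variable` and `add_comment` are hypothetical. Both transformations leave the behavior of the code unchanged, so the review information marked on the initial sample can plausibly be reused for the expanded sample.

```python
import ast


class _Renamer(ast.NodeTransformer):
    """Rename every occurrence of one identifier (variables and arguments)."""

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename_variable(source: str, old: str, new: str) -> str:
    """Variable-name modification: returns behaviorally equivalent code."""
    tree = _Renamer(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


def add_comment(source: str, note: str) -> str:
    """Comment (annotation) addition: prepend a comment line."""
    return f"# {note}\n{source}"


initial_sample = "def total(price, count):\n    return price * count\n"
expanded_samples = [
    rename_variable(initial_sample, "count", "quantity"),
    add_comment(initial_sample, "computes the order total"),
]
```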

In one embodiment, the acquiring the first set of samples includes:

Acquiring candidate sample software codes corresponding to various review tasks and corresponding candidate sample analysis models;

Inputting the candidate sample software codes and the corresponding candidate sample analysis models into an initial natural language model, and outputting a third prediction result and corresponding prediction probability;

And determining candidate sample software codes and corresponding candidate sample analysis models based on probability distribution of the prediction probability of the third prediction result to obtain a first sample set.
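The text above does not state which property of the probability distribution drives the selection. One common choice, assumed here purely for illustration, is uncertainty sampling: keep the candidates whose predicted distribution has the highest entropy, since those are the ones the initial model is least sure about. The names `entropy`, `select_samples` and the candidate ids are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def entropy(probs: List[float]) -> float:
    """Shannon entropy of a probability distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_samples(
    candidates: Dict[str, List[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k candidates whose predicted distribution is most uncertain."""
    scored = [(name, entropy(p)) for name, p in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical candidate ids mapped to the model's prediction probabilities.
candidates = {
    "sample_a": [0.98, 0.01, 0.01],  # confident prediction
    "sample_b": [0.40, 0.35, 0.25],  # uncertain prediction
    "sample_c": [0.70, 0.20, 0.10],
}
first_sample_set = [name for name, _ in select_samples(candidates, k=2)]
```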

In a second aspect, the present application further provides a training method for a code review model, including:

Acquiring a first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information; the sample analysis model comprises at least one of a standard analysis structure diagram corresponding to the sample software code and a standard analysis sequence diagram of a standard function corresponding to the sample software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; wherein the review information is used to characterize matching information of the sample software code and the analysis model;

And iteratively adjusting the initial natural language model based on the difference between the prediction result and each piece of marked review information until the difference meets the preset requirement, so as to obtain a code review model.

In a third aspect, the present application also provides a code review device, including:

The first acquisition module is used for acquiring a software code and a corresponding analysis model, wherein the analysis model comprises at least one of a standard analysis structure diagram corresponding to the software code and a standard analysis sequence diagram corresponding to a standard function of the software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures;

The generation module is used for inputting the software codes and the analysis model into a code review model and outputting review information corresponding to the software codes; wherein the review information is used for characterizing matching information of the software code and the analysis model; the code review model is obtained by training an initial natural language model based on a sample software code and a corresponding sample analysis model.

In one embodiment, the static relationship includes a generalization relationship, and the first obtaining module is further configured to:

Acquiring each standard structure of a software system corresponding to a software code;

determining a sub-level standard structure from the standard structures and a parent level standard structure containing the sub-level standard structure;

And connecting each sub-level standard structure with the corresponding parent-level standard structure through a generalization relationship symbol to obtain a standard analysis structure diagram.
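As a rough illustration of these steps, the sketch below renders child/parent pairs as a textual class diagram. It assumes PlantUML notation, where `Parent <|-- Child` denotes generalization; the application itself does not prescribe a notation, and the structure names and the function `generalization_diagram` are hypothetical.

```python
from typing import Dict, List


def generalization_diagram(parent_of: Dict[str, str]) -> str:
    """Emit one generalization edge per (child, parent) pair."""
    lines: List[str] = ["@startuml"]
    for child, parent in sorted(parent_of.items()):
        # The arrow head points at the parent-level standard structure.
        lines.append(f"{parent} <|-- {child}")
    lines.append("@enduml")
    return "\n".join(lines)


# Hypothetical standard structures: child name -> parent name.
parent_of = {"Manager": "Employee", "Engineer": "Employee"}
diagram_text = generalization_diagram(parent_of)
```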

In one embodiment, the static relationship includes an association relationship or a dependency relationship, and the first obtaining module is further configured to:

determining a plurality of groups from each standard structure of a software system corresponding to the software code; wherein each group includes a first standard structure and a second standard structure having a static relationship;

determining, for each group, the number of first standard structures and a quantity identifier representing that number, and the number of second standard structures and a quantity identifier representing that number;

and connecting the first standard structure of each group with the corresponding second standard structure through static relation symbols, and marking corresponding number identifiers on the first standard structure and the second standard structure to obtain a standard analysis structure diagram.
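A minimal sketch of how one such group could be rendered as a labelled edge. It again assumes PlantUML notation (`--` for association, `..>` for dependency, quoted multiplicities next to each end); the `Group` type, its field names and the example structures are hypothetical.

```python
from dataclasses import dataclass

# PlantUML connectors, used here as an assumed notation for the relation symbols.
RELATION_SYMBOLS = {"association": "--", "dependency": "..>"}


@dataclass(frozen=True)
class Group:
    """One group: two standard structures joined by a static relationship."""
    first: str
    second: str
    first_count: str   # quantity identifier on the first structure, e.g. "1"
    second_count: str  # quantity identifier on the second structure, e.g. "0..*"
    relation: str      # "association" or "dependency"


def group_to_edge(group: Group) -> str:
    """Render one group as an edge annotated with both quantity identifiers."""
    symbol = RELATION_SYMBOLS[group.relation]
    return (
        f'{group.first} "{group.first_count}" {symbol} '
        f'"{group.second_count}" {group.second}'
    )


groups = [
    Group("Department", "Employee", "1", "0..*", "association"),
    Group("Payroll", "Employee", "1", "1..*", "dependency"),
]
edges = [group_to_edge(g) for g in groups]
```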

In one embodiment, the first acquisition module is further configured to:

Acquiring each standard structure of a system use case of a software system corresponding to a software code;

adding the textual description of the standard function of each standard structure to the corresponding position of the corresponding standard structure;

and connecting the standard structures that interact with each other through interaction relationship symbols to obtain a standard analysis sequence diagram.
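The sequence diagram construction can be sketched in the same spirit. The example assumes PlantUML sequence notation (`Caller -> Callee: text`), with the textual description of the standard function placed on the message; the participant names and the function `sequence_diagram` are hypothetical.

```python
from typing import List, Tuple

# (caller structure, callee structure, textual description of the function)
Interaction = Tuple[str, str, str]


def sequence_diagram(interactions: List[Interaction]) -> str:
    """Emit one message line per interaction, in call order."""
    lines = ["@startuml"]
    for caller, callee, description in interactions:
        lines.append(f"{caller} -> {callee}: {description}")
    lines.append("@enduml")
    return "\n".join(lines)


sequence_text = sequence_diagram([
    ("Editor", "Document", "open document"),
    ("Document", "Page", "load pages"),
])
```

Because the lines are emitted in list order, the text preserves the order of the steps, which is the information a sequence check needs.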

In one embodiment, the generating module is further configured to:

Extracting semantic information in the analysis model, wherein the semantic information comprises standard structures and static relations among the standard structures, and/or standard functions corresponding to the standard structures and interaction relations among the standard structures;

writing the semantic information into a text file to obtain a text file containing the semantic information;

and inputting the text file containing the semantic information and the software code into a code review model.
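One possible shape of this preprocessing is sketched below. It assumes the extracted semantic information is held in a dictionary and serialized as JSON text; the file name, the dictionary keys and the section labels in `build_model_input` are hypothetical, and the application does not specify a text format.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical semantic information extracted from an analysis model.
semantic_info = {
    "structures": ["Employee", "Department"],
    "static_relations": [
        {"from": "Department", "to": "Employee", "type": "association"}
    ],
}


def write_semantic_text(info: dict, path: Path) -> None:
    """Write the semantic information into a text file."""
    path.write_text(json.dumps(info, indent=2, ensure_ascii=False), encoding="utf-8")


def build_model_input(text_path: Path, software_code: str) -> str:
    """Combine the text file content and the code into one model input."""
    model_text = text_path.read_text(encoding="utf-8")
    return f"ANALYSIS MODEL:\n{model_text}\n\nSOFTWARE CODE:\n{software_code}"


with tempfile.TemporaryDirectory() as tmp:
    text_file = Path(tmp) / "analysis_model.txt"
    write_semantic_text(semantic_info, text_file)
    review_input = build_model_input(text_file, "class Employee:\n    pass\n")
```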

In one embodiment, the code information or the model information includes at least one of each standard structure, an attribute of the standard structure, a standard function corresponding to the standard structure, a static relationship between the standard structures, and an interaction relationship, the review information includes review information of a missing class, and the generating module is further configured to:

Outputting review information of the missing class corresponding to the software code; wherein the review information of the missing class includes at least one of:

code information described by the code segments of the software code is not recorded in the analytical model;

Model information described in the analytical model is implemented in the software code without the corresponding code segments.
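The two missing-class cases can be made concrete with a small rule-based comparison. This is not the method of the application, which obtains the review information from a trained code review model; it only illustrates what each case means. The sketch assumes Python source code and compares class names only; `classes_in_code` and `missing_review` are hypothetical names.

```python
import ast
from typing import Dict, List, Set


def classes_in_code(source: str) -> Set[str]:
    """Collect the names of all classes defined in the source."""
    return {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.ClassDef)
    }


def missing_review(source: str, model_structures: Set[str]) -> Dict[str, List[str]]:
    """Report structures present on one side but absent on the other."""
    code_structures = classes_in_code(source)
    return {
        # Code information that is not recorded in the analysis model.
        "in_code_not_in_model": sorted(code_structures - model_structures),
        # Model information with no corresponding code segment.
        "in_model_not_in_code": sorted(model_structures - code_structures),
    }


code = "class Employee:\n    pass\n\nclass Badge:\n    pass\n"
report = missing_review(code, {"Employee", "Department"})
```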

In one embodiment, the code information or the model information includes at least one of each standard structure, an attribute of the standard structure, a standard function corresponding to the standard structure, a static relationship between the standard structures, and an interactive relationship, the review information includes review information of a non-uniform class, and the generating module is further configured to:

outputting non-uniform class review information corresponding to the software code; wherein, the review information of the non-uniform class comprises:

Code information described by the code segments of the software code is inconsistent with model information recorded in the analysis model.
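A non-uniform (inconsistent) finding differs from a missing one in that the structure exists on both sides but its details disagree. The rule-based sketch below illustrates this for attributes only; it is an assumption-laden illustration rather than the trained model described in the application, and `inconsistent_review` and the example data are hypothetical.

```python
from typing import Dict, List, Set


def inconsistent_review(
    code_attrs: Dict[str, Set[str]], model_attrs: Dict[str, Set[str]]
) -> List[str]:
    """List structures found on both sides whose attribute sets differ."""
    findings = []
    for name in sorted(code_attrs.keys() & model_attrs.keys()):
        if code_attrs[name] != model_attrs[name]:
            findings.append(
                f"{name}: code has {sorted(code_attrs[name])}, "
                f"model has {sorted(model_attrs[name])}"
            )
    return findings


findings = inconsistent_review(
    {"Employee": {"name", "phone"}, "Badge": {"id"}},
    {"Employee": {"name", "telephone"}, "Department": {"title"}},
)
```

Structures that appear on only one side (`Badge`, `Department`) are deliberately ignored here, since they belong to the missing class rather than the non-uniform class.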

In one embodiment, the code review device further comprises:

A third acquisition module, configured to acquire a first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information;

the sample input module is used for respectively inputting sample software codes and sample analysis models corresponding to various review tasks into the initial natural language model and outputting a first prediction result;

and the first adjustment module is used for iteratively adjusting the initial natural language model based on the difference between the first prediction result and the marked review information until the difference meets the preset requirement to obtain a code review model.

In one embodiment, the initial natural language model includes an encoder network and a decoder network, and the sample input module is further configured to:

inputting sample software codes and sample analysis models corresponding to various review tasks into the encoder network, and outputting features; wherein the features include semantic features and structural features;

inputting the features, together with the features corresponding to the previous prediction result of the first prediction result, into the decoder network, and outputting the first prediction result;

iteratively adjusting the initial natural language model based on the difference between the first prediction result and the marked review information until the difference meets the preset requirement, including:

adjusting the encoder network and the decoder network based on the difference between the first prediction result and the marked review information, and based on the features corresponding to the first prediction result and the features corresponding to the sample software code and the sample analysis model, until the difference meets the preset requirement.

In one embodiment, the third obtaining module is further configured to:

generating a plurality of review rules based on at least one of the attribute of the standard structure, the standard function, and the static and interactive relationship between the standard structures in the analysis model;

determining a plurality of review tasks based on the plurality of review rules;

Sample software code and a sample analysis model matched with each review task are obtained.

In one embodiment, the code review device further comprises:

A fourth acquisition module, configured to acquire a second sample set; wherein the second sample set comprises unlabeled sample text;

the word segmentation module is used for carrying out word segmentation processing on the sample text to obtain words;

The word input module is used for sequentially inputting words into the initial network model according to the sequence of the words in the sample text, and outputting a second prediction result, wherein the second prediction result comprises the predicted next word of the words;

And the second adjusting module is used for iteratively adjusting the initial network model based on the difference between the second prediction result and the actual next word of the words until the difference meets the preset requirement to obtain an initial natural language model.
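The data preparation behind this next-word objective can be sketched as follows. The sketch assumes whitespace word segmentation for simplicity (practical systems usually use subword tokenizers); the function names `segment` and `next_word_pairs` are hypothetical. Each pair holds the words seen so far and the actual next word that the second prediction result is compared against.

```python
from typing import List, Tuple


def segment(text: str) -> List[str]:
    """Whitespace word segmentation (an assumption made for this sketch)."""
    return text.split()


def next_word_pairs(words: List[str]) -> List[Tuple[List[str], str]]:
    """Pair each prefix of the text, in order, with the actual next word."""
    return [(words[:i], words[i]) for i in range(1, len(words))]


pairs = next_word_pairs(segment("the review model reads code"))
```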

In one embodiment, the third obtaining module is further configured to:

In one embodiment, the code review device further comprises:

The first display module is used for displaying the software codes in a code display column and marking and displaying the code fragments which are not matched with the analysis model;

And the second display module is used for displaying the review information corresponding to the code segment in the review information display column.

In one embodiment, the second display module is further configured to:

displaying, in the review information display column, the review information corresponding to the code segment and an access address of the analysis model;

And in response to a click operation on the access address, displaying the analysis model.

In one embodiment, the second display module is further configured to:

Receiving associated question information input for the review information display column;

and inputting the associated question information, the software codes and the analysis model into the code review model again, and outputting and displaying reply information corresponding to the associated question information.

In a fourth aspect, the present application further provides a training device for a code review model, including:

The second acquisition module is used for acquiring the first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information; the sample analysis model comprises at least one of a standard analysis structure diagram corresponding to the sample software code and a standard analysis sequence diagram of a standard function corresponding to the sample software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; wherein the review information is used to characterize matching information of the sample software code and the analysis model;

The input module is used for respectively inputting sample software codes and sample analysis models corresponding to various review tasks into the initial natural language model and outputting a first prediction result;

and the processing module is used for iteratively adjusting the initial natural language model based on the difference between the prediction result and the marked review information until the difference meets the preset requirement to obtain a code review model.

In a fifth aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the code review method according to any of the embodiments of the present disclosure when the processor executes the computer program.

In a sixth aspect, the present application also provides a computer readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements a code review method as in any of the embodiments of the present disclosure.

In a seventh aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the code review method according to any of the embodiments of the present disclosure.

The code review method, the code review device, the computer equipment, the storage medium and the computer program product train the initial natural language model through the sample software codes and the sample analysis model to obtain a code review model. The code review model can automatically analyze and compare the difference between the software code and the analysis model, and can automatically find out the difference between the software code and the analysis model. Further, the analysis model comprises a standard analysis structure diagram and a standard analysis sequence diagram. The standard analysis structure diagram is used for representing the standard structure and the static relation between the standard structures, so that the code review model can discover various static relations between the standard structures in the software code by comparing the standard analysis structure diagram with the software code. The standard analysis sequence diagram is used for representing standard functions corresponding to the standard structure and interaction relations among the standard structures, so that the code review model can find out problems of function implementation errors, step implementation sequence errors and the like in the software code by comparing the standard analysis sequence diagram with the software code. Therefore, according to the embodiment of the disclosure, the difference comparison of the software code and the analysis model can be automatically realized efficiently and accurately, the problems in the software code are found, and detailed and comprehensive review information is given.

Drawings

FIG. 1 is a flow diagram of a code review method in one embodiment;

FIG. 2 is a schematic diagram of a standard structure in one embodiment;

FIG. 3 is a schematic diagram of a standard analysis structure diagram with a generalization relationship in one embodiment;

FIG. 4 is a schematic diagram of a standard analysis structure diagram with an association relationship in one embodiment;

FIG. 5 is a schematic diagram of a standard analysis structure diagram with a dependency relationship in one embodiment;

FIG. 6 is a schematic diagram of a standard analysis sequence diagram in one embodiment;

FIG. 7 is a flow diagram of a method of code review in one embodiment;

FIG. 8 is a diagram of the relationship among a software system, a standard analysis structure diagram, and a standard analysis sequence diagram in one embodiment;

FIG. 9 is a flow diagram of a code review method in one embodiment;

FIG. 10 is a schematic representation of a text representation in an analytical model in one embodiment;

FIG. 11 is a flow diagram of a method of code review in one embodiment;

FIG. 12 is a schematic diagram of an initial natural language model structure in one embodiment;

FIG. 13 is a flow diagram of a method of code review in one embodiment;

FIG. 14 is a visual interface diagram of review information in one embodiment;

FIG. 15 is a flow diagram of a code review method in one embodiment;

FIG. 16 is a block diagram of a code review method in one embodiment;

FIG. 17 is a block diagram of a code review device in one embodiment;

FIG. 18 is a block diagram of a training device for a code review model in one embodiment;

FIG. 19 is an internal block diagram of a computer device in one embodiment;

FIG. 20 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.

In order to facilitate understanding of the technical solutions provided by the embodiments of the present disclosure by those skilled in the art, a technical environment in which the technical solutions are implemented is described below.

Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technology, and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.

Artificial intelligence is a comprehensive discipline that covers a wide range of fields and includes both hardware-level and software-level technologies. Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, pre-training model technology, operation/interaction systems, mechatronics, and the like. A pre-trained model, also called a large model or foundation model, can be widely applied to downstream tasks in all major directions of artificial intelligence after fine-tuning. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Computer Vision (CV) is the science of how to make machines "see"; more specifically, it uses cameras and computers in place of human eyes to recognize, locate, and measure targets, and further performs graphics processing so that the result is an image better suited for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies the related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Large model technology has brought important change to the development of computer vision: pre-trained models in the vision field, such as Swin-Transformer, ViT, V-MOE, and MAE, can be quickly and widely applied to specific downstream tasks through fine-tuning. Computer vision technologies typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition technologies such as face recognition and fingerprint recognition.

The key technologies of speech technology (Speech Technology) are automatic speech recognition (ASR), speech synthesis (TTS), and voiceprint recognition. Enabling computers to listen, see, speak, and feel is the future direction of human-computer interaction, and speech is expected to become one of its most favored modes. Large model technology has also transformed speech technology: pre-trained models that use the Transformer architecture, such as WavLM and UniSpeech, have strong generalization and universality and perform well on speech processing tasks in all directions.

Natural language processing (Natural Language Processing, NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing concerns natural language, that is, the language people use every day, so it is closely related to linguistics. Pre-trained models grew out of large language models (Large Language Model) in the NLP field. Through fine-tuning, a large language model can be widely applied to downstream tasks. Natural language processing technologies typically include text processing, semantic understanding, machine translation, question answering by robots, knowledge graphs, and the like.

Machine learning (Machine Learning, ML) is an interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer simulates or implements human learning behavior to acquire new knowledge or skills and reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration. The pre-trained model is the latest development of deep learning and integrates the above techniques.

Autonomous driving technology means that a vehicle drives itself without operation by a driver. It typically includes high-precision maps, environment perception, computer vision, behavior decision-making, path planning, motion control, and the like. Autonomous driving has several development paths, such as single-vehicle intelligence, vehicle-road coordination, and networked cloud control. The technology has broad application prospects; it is currently applied in logistics, public transportation, taxis, and intelligent transportation, and will develop further in the future.

With the research and progress of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, drones, digital twins, virtual humans, robots, Artificial Intelligence Generated Content (AIGC), conversational interaction, smart healthcare, smart customer service, and game AI. It is believed that, as technology develops, artificial intelligence will be applied in more fields and become increasingly valuable. The scheme provided by the embodiments of the present application relates to artificial intelligence technologies such as computer vision, natural language processing, and machine learning/deep learning.

In one embodiment, as shown in FIG. 1, a code review method is provided. The method is described as applied to a terminal by way of example; it is understood that the method may also be applied to a server, or to a system including a terminal and a server and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

step S101, acquiring a software code and a corresponding analysis model, wherein the analysis model comprises at least one of a standard analysis structure diagram corresponding to the software code and a standard analysis sequence diagram corresponding to a standard function of the software code; the standard analysis structure diagram is used for representing a standard structure and static relations among the standard structures, and the standard analysis sequence diagram is used for representing standard functions corresponding to the standard structure and interactive relations among the standard structures.

The analysis model is a way of describing the core knowledge of a software system. The behavior and data encapsulated by a software system may be collectively referred to as the knowledge of the software system, and the knowledge that plays an important role in the functioning of the software system is referred to as core knowledge. A software system is an entity that encapsulates behavior and data and provides specified functions externally in the form of software code; for example, a document processing system encapsulates the behavior logic and data needed to process documents and provides document processing functions externally. The relationship between a software system and its analysis models may be one-to-many: each software system has its own analysis model or models, and each analysis model describes one software system. For example, the core knowledge of a document processing system includes documents, lines, pages, words, and the like, and the analysis model of the document processing system describes this core knowledge, including what types of core knowledge exist and the relationships between them.

In one embodiment, the standard analysis structure diagram is one kind of analysis model and is used to describe the core knowledge in a software system. In the standard analysis structure diagram, core knowledge can be described using the concept of a standard structure, for example, a class in a programming language. A class is a programming construct in software development: a blueprint that defines entities sharing the same attributes (data members) and behaviors (member functions), and a core concept of object-oriented programming languages (such as C++ and Java). In a particular implementation, the standard analysis structure diagram may include an analysis class diagram. For example, referring to FIG. 2, if "employees" appear in a personnel management system, they may be represented by an "employee class". The attributes of the employee in FIG. 2 include name, gender, telephone, and job position, indicating that the system cares about these attributes; other attributes that the system may not care about, such as height and weight, need not be listed.
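As a minimal sketch of how the employee class of FIG. 2 might appear in code, the following Python dataclass lists only the four attributes the system cares about. The class name and field names are illustrative assumptions and are not taken from the application.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    """Employee class from the analysis class diagram (names are assumed)."""
    name: str
    gender: str
    telephone: str
    position: str


# Attributes such as height or weight are deliberately absent because the
# analysis model does not list them.
attribute_names = [f.name for f in fields(Employee)]
```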

In the embodiment of the disclosure, the static relationships between standard structures in the standard analysis structure diagram describe the fixed state of the software system, that is, properties that hold for as long as the system exists. The static relationships may include generalization relationships, association relationships, and dependency relationships. A generalization relationship represents a set relationship: if standard structure B and standard structure C are generalized to standard structure A, then standard structure B and standard structure C are subsets of standard structure A, and, from the code design perspective, standard structure B and standard structure C inherit from standard structure A. For example, referring to FIG. 3, engineers and product managers are generalized to employees, so each should have a name attribute. In addition, engineers and product managers may have their own unique attributes, such as a technical type attribute for engineers and a responsible product attribute for product managers. An association relationship represents a correlation between individuals of standard structures: if two standard structures form an association relationship, individuals of the two standard structures are linked. For example, referring to fig. 4, an engineer has an association with a computer because an individual engineer is related to an individual computer (the engineer has an office computer). Associations are divided into one-way and two-way associations. For example, the engineer and the computer in fig. 4 are in a two-way association, which means that each engineer individual knows which computers belong to it, and each computer individual also records which engineer it belongs to. The computer and the mouse are in a one-way association, which means that the computer individual records the mouse that belongs to it, while the mouse individual does not record which computer it belongs to. The association relationship may be a one-to-many association; for example, in the association between the engineer and the computer in fig. 4, one engineer has a plurality of computers, and one computer is allocated to only one engineer.

The dependency relationship among the static relationships represents a correlation between standard structures that is neither a generalization relationship nor an association relationship. For example, FIG. 5 shows the correlation between the engineer class and the printer class: an engineer uses printers, but individual printers are not strongly tied to individual engineers, and there is no fixed one-to-one relationship. A software system typically contains a large number of dependency relationships, far more than association and generalization relationships. It should be noted that the static relationships are not limited to the above examples; other relationships, such as aggregation and composition relationships, may also be used. Those skilled in the art may make other modifications in light of the technical spirit of the present application, and such modifications fall within the protection scope of the present application as long as the functions and effects achieved are the same as or similar to those of the present application.
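The three static relationships described above can be illustrated in code. The following is a minimal sketch, assuming plain Python classes; every class, attribute, and method name is hypothetical and chosen only to mirror the engineer, computer, mouse, and printer examples of FIGS. 3 to 5.

```python
class Employee:
    def __init__(self, name):
        self.name = name


class Mouse:
    """Holds no reference back to a computer (target of a one-way association)."""


class Computer:
    def __init__(self, asset_id, mouse=None):
        self.asset_id = asset_id
        self.mouse = mouse   # one-way association: computer records its mouse
        self.owner = None    # two-way association: computer records its engineer


class Printer:
    def print_text(self, text):
        return "printed: " + text


class Engineer(Employee):    # generalization: Engineer inherits from Employee
    def __init__(self, name, skill_type):
        super().__init__(name)
        self.skill_type = skill_type
        self.computers = []  # one engineer, many computers

    def assign(self, computer):
        self.computers.append(computer)
        computer.owner = self            # keep both directions in sync

    def print_report(self, printer, text):
        # Dependency: the printer is used here but never stored on the engineer.
        return printer.print_text(text)
```

A two-way association needs both ends updated together, which is why `assign` sets `computer.owner` as well as appending to `computers`.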

In one embodiment, the standard analysis sequence diagram is one kind of analysis model and is used for representing the standard functions corresponding to the standard structures and the interaction relations among the standard structures. For example, referring to FIG. 6, the analysis sequence diagram corresponds to the function of creating a new page, and the participating standard structures are the product manager, the engineer, and the tester. The diagram shows what responsibility each standard structure assumes at each step, which standard structure triggers each responsibility, and the interactions between standard structures. For example, after the product manager adjusts the page, the task is sent to the engineer; after the engineer modifies the page, a new page to be tested is formed and the task is sent to the tester; after the tester tests the new page, the new page is returned to the engineer, and the engineer sends the task to the product manager so that the product manager can accept the new page. In an exemplary embodiment, the standard analysis sequence diagram may further include judgment logic, for example: when the acceptance result is a pass, the new page goes online; when the acceptance result is a failure, the page is modified.
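One way to picture the sequence diagram of FIG. 6 is as an ordered list of messages between standard structures plus a small piece of judgment logic. The sketch below assumes this representation; the `Message` type, the role names, and the action strings are illustrative and do not come from the application.

```python
from typing import NamedTuple


class Message(NamedTuple):
    sender: str
    receiver: str
    action: str


# Ordered interactions for the "create new page" function (assumed wording).
NEW_PAGE_SEQUENCE = [
    Message("ProductManager", "Engineer", "adjust page and hand over task"),
    Message("Engineer", "Tester", "modify page and submit new page for test"),
    Message("Tester", "Engineer", "test new page and return it"),
    Message("Engineer", "ProductManager", "submit new page for acceptance"),
]


def participants(sequence):
    """Return the participating standard structures in order of first appearance."""
    seen = []
    for message in sequence:
        for role in (message.sender, message.receiver):
            if role not in seen:
                seen.append(role)
    return seen


def after_acceptance(passed):
    """Judgment logic: go online on a pass, otherwise modify the page again."""
    return "publish new page" if passed else "modify page"
```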

Step S103, inputting the software codes and the analysis model into a code review model, and outputting review information corresponding to the software codes; wherein the review information is used for characterizing matching information of the software code and the analysis model; the code review model is obtained by training an initial natural language model based on a sample software code and a corresponding sample analysis model.

The review information characterizes how the software code matches the analysis model, for example, that certain code segments of the software code match the analysis model or that certain code segments do not. A mismatch between the software code and the analysis model may be reported as review information of the missing class or review information of the inconsistent class. In an exemplary embodiment, the review information of the missing class may indicate that code information described by code segments of the software code is not recorded in the analysis model, or that model information described in the analysis model has no corresponding code segment implementing it in the software code. In another exemplary embodiment, the review information of the inconsistent class may indicate that the code information described by the code segments of the software code is inconsistent with the model information described in the analysis model. In a specific embodiment, for example, software code 1 and analysis model 2 are input to the code review model, and the following review information for software code 1 is output: "Standard structure A and standard structure B in the analysis model are in a generalization relationship, so standard structure B should be a child-level standard structure of standard structure A in software code 1; however, standard structure A and standard structure B in software code 1 have no generalization relationship. Modification is suggested."

The initial natural language model can be obtained by pretraining an initial network model on large-scale text data. On the basis of the initial natural language model, fine-tuning is then performed using dedicated sample software code and the corresponding sample analysis models to obtain the code review model. The initial network model may include a recurrent neural network model, a long short-term memory (LSTM) network model, a Transformer model, and the like.

In a specific implementation, the initial network model may be trained by unsupervised learning on a large amount of unlabeled sample data, such as web pages, books, and articles. The training objective of an autoregressive language model (ARLM) may be used, for example, predicting the probability distribution of the (N+1)-th word given the first N words of a text sequence. Through this unsupervised learning, the initial natural language model acquires basic knowledge of natural language, such as vocabulary, grammar, and sentence structure. Further, for fine-tuning, sample analysis models that conform to software engineering discipline and software methodology, sample software code implementing the sample analysis models, and annotated review information are obtained, and the initial natural language model is trained on them in a supervised manner to obtain the code review model. The code review model thereby gains the ability to analyze analysis models and software code and to output review information corresponding to the software code.
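The autoregressive objective can be made concrete with a toy example. In the sketch below, a count-based model that looks only at the last word stands in for the neural network; this simplification, along with all function names, is an assumption for illustration. The loss is the negative log-likelihood of each actual next word given the words before it.

```python
import math
from collections import Counter, defaultdict


def fit_counts(corpus):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prefix):
    """Probability distribution of the (N+1)-th word given the first N words.

    Toy simplification: only the last word of the prefix is used.
    The last word must have been seen during fitting.
    """
    followers = counts[prefix[-1]]
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()}


def sequence_nll(counts, words):
    """Autoregressive loss: sum of -log p(next word | preceding words)."""
    loss = 0.0
    for n in range(1, len(words)):
        dist = next_word_distribution(counts, words[:n])
        loss -= math.log(dist[words[n]])
    return loss


corpus = [["class", "A", ":"], ["class", "B", ":"]]
counts = fit_counts(corpus)
loss = sequence_nll(counts, ["class", "A", ":"])
```

Pretraining lowers this loss over the whole corpus; a neural model would condition on the full prefix rather than on the last word alone.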

According to the code review method, the initial natural language model is trained with the sample software code and the sample analysis model to obtain the code review model. The code review model can automatically analyze and compare the software code and the analysis model and find the differences between them. Further, the analysis model comprises a standard analysis structure diagram and a standard analysis sequence diagram. The standard analysis structure diagram represents the standard structures and the static relations among them, so that, by comparing the standard analysis structure diagram with the software code, the code review model can discover problems in the various static relations between standard structures in the software code. The standard analysis sequence diagram represents the standard functions corresponding to the standard structures and the interaction relations among them, so that, by comparing the standard analysis sequence diagram with the software code, the code review model can find problems such as function implementation errors and errors in the order in which steps are implemented. Therefore, the embodiments of the disclosure can automatically, efficiently, and accurately compare the software code against the analysis model, find problems in the software code, and give detailed and comprehensive review information.

In one embodiment, the static relationship includes a generalization relationship, and the obtaining a standard analysis structure diagram corresponding to the software code includes:

Acquiring each standard structure of the software system corresponding to the software code.

Determining, from the standard structures, a child-level standard structure and a parent-level standard structure that contains the child-level standard structure.

Specifically, standard structures may be used to describe the core knowledge of a software system. For example, a personnel management system includes standard structures such as employee, engineer, product manager, computer, mouse, and printer. As another example, a document processing system includes standard structures such as document, line, page, and word. In an exemplary embodiment, a generalization relationship may exist between standard structures. A generalization relationship represents a set relationship: if standard structure B and standard structure C are generalized to standard structure A, then standard structure B and standard structure C are called child-level standard structures, and standard structure A is called the parent-level standard structure. In a specific embodiment, referring to FIG. 3, the child-level standard structures engineer and product manager are generalized to the parent-level standard structure employee, so each should have the employee name attribute. In addition, the child-level standard structures engineer and product manager may have their own unique attributes, such as a technical type attribute for the engineer and a responsible product attribute for the product manager. Optionally, as shown in FIG. 3, the child-level standard structure and the parent-level standard structure may be connected by a generalization relationship symbol, namely a hollow arrow whose arrowhead points to the parent-level standard structure. It should be noted that parent-level and child-level standard structures in the embodiments of the present disclosure are relative terms; for example, a child-level standard structure may itself serve as the parent-level standard structure of other standard structures, which is not limited in the present application.
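The determination of child-level and parent-level standard structures can be sketched as follows, assuming each standard structure is recorded together with the name of the structure it generalizes to (or `None`). The mapping and the `SeniorEngineer` entry are hypothetical; the latter is added only to show that a structure can be child-level and parent-level at once.

```python
# Hypothetical input: standard structure -> structure it is generalized to.
STRUCTURES = {
    "Employee": None,
    "Engineer": "Employee",
    "ProductManager": "Employee",
    "SeniorEngineer": "Engineer",  # assumed extra level, not in FIG. 3
}


def generalization_edges(structures):
    """Return (child, parent) pairs; each pair is one hollow-arrow edge."""
    return sorted((child, parent)
                  for child, parent in structures.items()
                  if parent is not None)


def levels(structures):
    """Return the sets of child-level and parent-level standard structures."""
    children = {child for child, parent in structures.items() if parent is not None}
    parents = {parent for parent in structures.values() if parent is not None}
    return children, parents
```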

In this embodiment, each standard structure in the software system corresponding to the software code is acquired, and the child-level and parent-level standard structures among them are connected by the generalization relationship symbol to obtain a standard analysis structure diagram with generalization relationships, so that the standard structures having generalization relationships can be clearly represented in the standard analysis structure diagram.

In one embodiment, referring to fig. 7, the static relationship includes an association relationship or a dependency relationship, and the obtaining a standard analysis structure diagram corresponding to the software code includes:

Step S701, determining a plurality of groups from the standard structures of the software system corresponding to the software code; wherein each group includes a first standard structure and a second standard structure having a static relationship.

In the embodiment of the disclosure, the static relationship may include an association relationship or a dependency relationship. An association relationship represents a correlation between individuals of standard structures: if two standard structures form an association relationship, individuals of the two standard structures are linked. For example, referring to fig. 4, an engineer has an association with a computer because an individual engineer is related to an individual computer (the engineer has an office computer). Associations are divided into one-way and two-way associations. For example, the engineer and the computer in fig. 4 are in a two-way association, which means that each engineer individual knows which computers belong to it, and each computer individual also records which engineer it belongs to. The computer and the mouse are in a one-way association, which means that the computer individual records the mouse that belongs to it, while the mouse individual does not record which computer it belongs to. The association relationship may be a one-to-many association; for example, in the association between the engineer and the computer in fig. 4, one engineer has a plurality of computers, and one computer is allocated to only one engineer.

The dependency relationship among the static relationships represents a correlation between standard structures that is neither a generalization relationship nor an association relationship. For example, FIG. 5 shows the correlation between the engineer class and the printer class: an engineer uses printers, but individual printers are not strongly tied to individual engineers, and there is no fixed one-to-one relationship. A software system typically contains a large number of dependency relationships, far more than association and generalization relationships.

Step S703, determining, for each group, the number of first standard structures and a number identifier characterizing that number, and the number of second standard structures and a number identifier characterizing that number.

In the embodiment of the disclosure, the groups divide the standard structures according to their static relationships, and it is understood that the same standard structure may belong to different groups. For example, in fig. 4, the engineer and the computer have an association relationship and may belong to one group; in fig. 5, the engineer and the printer have a dependency relationship and may belong to another group. In an exemplary embodiment, the number identifier may be used to represent, in the standard analysis structure diagram, the number of each standard structure in a group. For example, with the identifiers 1 and *, 1 represents one and * represents many: in fig. 4, the engineer and the computer have a one-to-many association, which means that one engineer has a plurality of computers and one computer is allocated to only one engineer. For another example, with the identifiers 1 and 0..*, 1 represents one and 0..* represents zero to many. The number identifiers have the following meanings: 1 means exactly one; 0 means zero; 0..1 means zero or one; 0..* means zero to many; 1..* means one to many; * means many (greater than 1). It should be noted that the number identifiers are not limited to the above examples. Those skilled in the art may make other modifications in light of the technical spirit of the present application, and such modifications fall within the protection scope of the present application as long as the functions and effects achieved are the same as or similar to those of the present application.
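The number identifiers listed above can be turned into numeric bounds with a few lines of code. This is a minimal sketch with hypothetical function names; it follows the meanings given in this description, in which * alone means more than one.

```python
from typing import Optional, Tuple


def parse_number_identifier(identifier: str) -> Tuple[int, Optional[int]]:
    """Return (minimum, maximum) for an identifier; maximum None means unbounded."""
    text = identifier.strip()
    if text == "*":
        # Per the description above: "many (greater than 1)".
        return (2, None)
    if ".." in text:
        low, high = text.split("..", 1)
        return (int(low), None if high == "*" else int(high))
    value = int(text)
    return (value, value)


def allows(identifier: str, count: int) -> bool:
    """Check whether an actual count satisfies the number identifier."""
    low, high = parse_number_identifier(identifier)
    return count >= low and (high is None or count <= high)
```

For instance, `allows("0..1", 2)` is false, which is the kind of violation a review could report when one computer ends up assigned to two engineers.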

Step S705, the first standard structure and the corresponding second standard structure of each group are connected through static relation symbol, and the corresponding number identifiers are marked on the first standard structure and the second standard structure, so as to obtain a standard analysis structure diagram.

In the disclosed embodiment, static relationship symbols are used to represent the categories of static relationships between standard structures. For example, in FIG. 4 the engineer and the computer are in a two-way association, which is indicated by a solid line. The computer and the mouse are in a one-way association, and a single arrow indicates that the computer is unidirectionally associated with the mouse. For another example, in fig. 5, the engineer and the printer have a dependency relationship, which is indicated by a dotted line. A standard analysis structure diagram contains many such standard structures and static relationships between standard structures.

In the above embodiment, a plurality of groups are determined from the standard structures of the software system, and, for each group, the number of first standard structures, the number of second standard structures, and the corresponding number identifiers are determined; the first standard structure of each group is connected with the corresponding second standard structure through a static relationship symbol, and the corresponding number identifiers are marked on them to obtain the standard analysis structure diagram, so that the standard structures having association relationships and dependency relationships can be clearly represented in the standard analysis structure diagram.

In one embodiment, obtaining a standard analysis sequence diagram corresponding to software code includes:

Acquiring each standard structure of a system use case of the software system corresponding to the software code.

Adding a textual description of the standard function of each standard structure at the corresponding position of that standard structure.

Specifically, a software system is an entity that encapsulates behavior and data and provides specified functions externally in the form of software code; for example, a document processing system encapsulates the behavior logic and data needed to process documents and provides document processing functions externally. In an exemplary embodiment, referring to FIG. 8, a software system has multiple functions, each referred to as a system use case; for example, an e-commerce system has a customer order use case. Thus, one software system may correspond to multiple standard analysis sequence diagrams.

In a specific implementation process, for each standard structure of the system use case, a textual description of its standard function is added at the corresponding position of that standard structure. For example, referring to FIG. 6, the analysis sequence diagram corresponds to the function of creating a new page, and the participating standard structures are the product manager, the engineer, and the tester. The standard functions of the product manager include designing new pages, adjusting pages, and accepting new pages; the standard functions of the engineer include modifying pages; the standard functions of the tester include testing new pages. Further, the standard structures between which interaction occurs are connected by interaction relationship symbols. For example, after the product manager adjusts the page, the page is sent to the engineer, so interaction occurs between the product manager and the engineer. For another example, the engineer sends the new page to the tester, so interaction occurs between the engineer and the tester. In an exemplary embodiment, the interaction behavior described above may be represented by interaction relationship symbols, such as the arrows between the product manager and the engineer. In an exemplary embodiment, the standard analysis sequence diagram may further include judgment logic, for example: when the acceptance result is a pass, the new page goes online; when the acceptance result is a failure, the page is modified.

According to the embodiment of the disclosure, the standard structures with interactive behaviors are connected through the interactive relational symbol by adding the text description of the standard functions of each standard structure to the corresponding positions of the standard structures, so that the standard structures in the standard analysis sequence diagram, the functions of the standard structures and the interactive relations of the standard structures can be clearly represented.

In one embodiment, referring to FIG. 9, the inputting the software code and the analysis model into a code review model includes:

Step S901, extracting semantic information in the analysis model, where the semantic information includes the standard structures and the static relationships between the standard structures, and/or the standard functions corresponding to the standard structures and the interactive relationships between the standard structures.

In the embodiment of the present disclosure, the standard structures in the semantic information and the static relationship between the standard structures, and/or the standard functions corresponding to the standard structures and the various concepts of the interaction relationship between the standard structures are the same as the description of the above embodiment, and are not repeated herein. In an exemplary embodiment, semantic information in the analysis model may be identified by the image processing model and converted to text content. In another exemplary embodiment, the analysis model may also be described directly in text, at which time there may be no need to make a text conversion of the analysis model.

Step S903, writing the semantic information into a text file, to obtain a text file containing the semantic information.

Step S905, inputting the text file containing the semantic information and the software code into a code review model.

In an exemplary embodiment, referring to fig. 10, the semantic information is written into a text file to obtain a text file containing the semantic information. The text file is, for example, a TXT file, an HTML file, or a JSON file, and the present application does not limit the format of the text file. In an exemplary embodiment, the analysis model may also include English names of the standard structures, such as Engineer for the engineer, Computer for the computer, and Mouse for the mouse. Adding English names to the standard structures of the analysis model means that, when a developer writes program code according to the analysis model, the standard structures in the program code have the same textual expression as the standard structures in the analysis model, which facilitates comparison of the analysis model with the software code. Finally, the text file containing the semantic information and the software code are input into the code review model.
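Steps S901 to S905 can be sketched as follows, assuming the semantic information has already been extracted into a dictionary and that JSON is chosen as the text format. The dictionary layout, the function names, and the section labels in the combined input are all illustrative assumptions; the application does not prescribe them.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical semantic information extracted from an analysis model.
semantic_info = {
    "structures": [
        {"name": "Employee", "attributes": ["name"]},
        {"name": "Engineer", "attributes": ["skill_type"]},
    ],
    "static_relations": [
        {"type": "generalization", "child": "Engineer", "parent": "Employee"},
    ],
}


def write_semantic_info(info, path):
    """Step S903: write the semantic information into a text (JSON) file."""
    Path(path).write_text(json.dumps(info, ensure_ascii=False, indent=2),
                          encoding="utf-8")


def build_review_input(model_path, source_code):
    """Step S905: combine the text file content and the code into one input."""
    model_text = Path(model_path).read_text(encoding="utf-8")
    return "ANALYSIS MODEL:\n" + model_text + "\n\nSOFTWARE CODE:\n" + source_code


with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "analysis_model.json"
    write_semantic_info(semantic_info, model_file)
    review_input = build_review_input(model_file, "class Engineer:\n    pass\n")
```

Using the same English names in the model file and in the code (here `Engineer`) is what lets the two parts of the input be matched by plain text.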

In the embodiment, the semantic information of the analysis model is extracted, the semantic information is written into the text file, and the text file containing the semantic information and the software code are input into the code review model, so that the network structure of the code review model is lighter, and the output efficiency of the code review model is improved.

In one embodiment, the code information or the model information includes at least one of each standard structure, an attribute of the standard structure, a standard function corresponding to the standard structure, a static relationship between the standard structures, and an interactive relationship, the review information includes review information of a missing class, and outputting the review information corresponding to the software code includes:

In an embodiment of the present disclosure, the code information includes: each standard structure, the attribute of the standard structure, the standard function corresponding to the standard structure, the static relation between the standard structures and the interactive relation. The model information includes: each standard structure, the attribute of the standard structure, the standard function corresponding to the standard structure, the static relation between the standard structures and the interactive relation. Various concepts related to the code information and the model information are the same as those described in the foregoing embodiments, and are not repeated herein.

In an embodiment of the present disclosure, the review information of the missing class covers code information that is described by code segments of the software code but is not recorded in the analysis model. For example: standard structure 1 of software code A is not defined in standard analysis structure diagram S, so standard structure 1 of software code A does not meet the requirements. The review information of the missing class also covers model information that is described in the analysis model but has no corresponding code segment implementing it in the software code. For example: a standard function of standard structure 2 in the standard analysis sequence diagram is not implemented in software code B, so the standard function of standard structure 2 needs to be added to software code B. For another example: in the standard analysis structure diagram, standard structure 3 and standard structure 4 are in an association relationship, but software code C does not record standard structure 4 as being associated with standard structure 3.
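The two directions of the missing class can be illustrated with a simple set comparison. This is only a rule-based sketch of what the category means, with hypothetical names; in the method described here the comparison is performed by the trained code review model, not by such rules.

```python
def find_missing(model_items, code_items):
    """Compare names recorded in the analysis model with names found in the code."""
    model_set, code_set = set(model_items), set(code_items)
    return {
        # Present in the code but not recorded in the analysis model.
        "in_code_not_in_model": sorted(code_set - model_set),
        # Described in the analysis model but not implemented in the code.
        "in_model_not_in_code": sorted(model_set - code_set),
    }


report = find_missing(
    model_items=["Employee", "Engineer", "Computer"],
    code_items=["Employee", "Engineer", "Badge"],
)
```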

In the above embodiment, the software code and the analysis model are input to the code review model, and the review information of the missing class is output. This helps detect code defects in which code information described by code segments of the software code is not recorded in the analysis model, or in which model information described in the analysis model has no corresponding code segment implementing it in the software code.

In one embodiment, the code information or the model information includes at least one of each standard structure, an attribute of the standard structure, a standard function corresponding to the standard structure, a static relationship between the standard structures, and an interactive relationship, the review information includes review information of an inconsistent class, and the outputting the review information corresponding to the software code includes:

The review information of the inconsistent class in the embodiments of the present disclosure indicates that code information described by the code segments of the software code is inconsistent with model information recorded in the analysis model. For example: the attributes of a standard structure in software code E are inconsistent with the attributes of the corresponding standard structure in the standard analysis structure diagram. For another example: the method implementing standard structure a in software code F is inconsistent with the function of standard structure a in the standard analysis sequence diagram. For another example: in software code G, a certain function is called by standard structure c, whereas in the standard analysis sequence diagram that function is called by standard structure e. For yet another example: the flow arrangement for realizing a certain function in software code J is inconsistent with the flow arrangement for that function in the standard analysis sequence diagram.
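Two of the inconsistency examples above, mismatched attributes and mismatched step order, can be illustrated with a short rule-based sketch. The function names and data layout are hypothetical, and the sketch only shows what the category means; the method described here relies on the trained code review model to make these comparisons.

```python
def inconsistent_attributes(model_attrs, code_attrs):
    """Report structures present on both sides whose attribute sets differ."""
    report = {}
    for name in sorted(model_attrs.keys() & code_attrs.keys()):
        if set(model_attrs[name]) != set(code_attrs[name]):
            report[name] = {
                "model": sorted(model_attrs[name]),
                "code": sorted(code_attrs[name]),
            }
    return report


def same_step_order(model_steps, code_steps):
    """True when the code performs the steps in the order the diagram gives."""
    return list(model_steps) == list(code_steps)


attr_report = inconsistent_attributes(
    {"Employee": ["name", "gender"], "Computer": ["asset_id"]},
    {"Employee": ["name", "age"], "Computer": ["asset_id"]},
)
order_ok = same_step_order(["adjust", "modify", "test", "accept"],
                           ["adjust", "test", "modify", "accept"])
```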

In the above embodiment, the software code and the analysis model are input to the code review model, and the review information of the inconsistent class is output. This helps detect various code defects in which the code information described by the code segments of the software code is inconsistent with the model information described in the analysis model.

In one embodiment, referring to fig. 11, the code review model is obtained by:

Step S1101, acquiring a first sample set; the first sample set comprises sample software codes and a sample analysis model corresponding to various review tasks, and the sample software codes are marked with review information.

In the embodiment of the disclosure, the review tasks correspond to the categories of review information; for example, a missing-class review task is formulated for review information of the missing class, and an inconsistent-class review task is formulated for review information of the inconsistent class. In an exemplary embodiment, the review information of the missing class or of the inconsistent class can be further subdivided into a plurality of specific categories, and a corresponding review task can be set for each subdivided category during model training.

In a specific embodiment, the sample software code, sample analysis model, and review information corresponding to review task Q are, for example, as follows. Standard structure A and standard structure B in the sample analysis model are in a generalization relationship, with standard structure B as a sub-level standard structure of standard structure A. Standard structure A and standard structure B have no generalization relationship in the sample software code. The annotated review information may include: standard structure A and standard structure B in the sample analysis model are in a generalization relationship, so standard structure B should serve as a sub-level standard structure of standard structure A in the software code; however, standard structure A and standard structure B in the sample software code are not in a generalization relationship, so modification is suggested. It will be appreciated that the larger the first sample set and the richer the review tasks it covers, the more accurate and comprehensive the review information for those review tasks.

Step S1103, respectively inputting the sample software codes and the sample analysis models corresponding to the plurality of review tasks to the initial natural language model, and outputting the first prediction result.

In an exemplary embodiment, the sample software codes and sample analysis models corresponding to the plurality of review tasks may be input into the initial natural language model together, and the first prediction result is output. In another exemplary embodiment, the sample software code and sample analysis model corresponding to each review task may be input to the initial natural language model one task at a time: the initial natural language model is adjusted, and once the preset requirement is met, the sample software code and sample analysis model corresponding to the next review task are input, until the sample software codes and sample analysis models corresponding to all review tasks have been input.

In the embodiment of the disclosure, the initial natural language model may be obtained by pre-training an initial network model with large-scale text data; fine-tuning is then performed on the basis of the initial natural language model using dedicated sample software codes and corresponding sample analysis models. The initial network model may include a recurrent neural network model, a long short-term memory network model, a Transformer model, and the like.

Step S1105, iteratively adjusting the initial natural language model based on the difference between the first prediction result and the annotated review information until the difference meets the preset requirement, to obtain the code review model.

In an exemplary embodiment, the sample software codes and sample analysis models corresponding to the plurality of review tasks are input into the initial natural language model together, and the first prediction result is output. For example, review task A corresponds to first prediction result A, and a difference a exists between first prediction result A and the review information of review task A; review task B corresponds to first prediction result B, and a difference b exists between first prediction result B and the review information of review task B; review task C corresponds to first prediction result C, and a difference c exists between first prediction result C and the review information of review task C. The loss function may be constructed based on the sum of the differences a, b, and c.

In another exemplary embodiment, the sample software code and sample analysis model corresponding to each review task may be input to the initial natural language model one task at a time. For example, the sample software code and sample analysis model corresponding to review task A are input to the initial natural language model to obtain first prediction result A. A difference a exists between first prediction result A and the review information of review task A. The parameters of the initial natural language model are adjusted based on the difference a, and after the preset requirement is met, the sample software code and sample analysis model corresponding to review task B are input into the initial natural language model to obtain first prediction result B. A difference b exists between first prediction result B and the review information of review task B. The parameters of the initial natural language model are adjusted based on the difference b, and after the preset requirement is met, the sample software code and sample analysis model corresponding to review task C are input, and so on until training on all review tasks is finished.
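The two training schedules described above can be sketched as follows. This is only a toy illustration under stated assumptions: a one-parameter model `y = w * x` stands in for the initial natural language model, squared error stands in for the "difference" between a prediction and the annotated review information, and the function names, learning rate, and threshold are hypothetical rather than taken from the disclosure.

```python
# Toy illustration only: a one-parameter "model" y = w * x stands in for the
# initial natural language model, and squared error stands in for the
# "difference" between a prediction and the annotated review information.

def difference(w, samples):
    """Mean squared error of the toy model on one review task's samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

def gradient(w, samples):
    """Derivative of difference() with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in samples) / len(samples)

def train_jointly(tasks, w=0.0, lr=0.01, threshold=1e-6, max_steps=10_000):
    """All tasks are fed together; the loss is the sum of per-task differences."""
    for _ in range(max_steps):
        loss = sum(difference(w, s) for s in tasks.values())
        if loss < threshold:  # the "preset requirement" (assumed form)
            break
        w -= lr * sum(gradient(w, s) for s in tasks.values())
    return w

def train_sequentially(tasks, w=0.0, lr=0.01, threshold=1e-6, max_steps=10_000):
    """Tasks are fed one at a time; move on once the current difference is small."""
    for samples in tasks.values():
        for _ in range(max_steps):
            if difference(w, samples) < threshold:
                break
            w -= lr * gradient(w, samples)
    return w

# Hypothetical samples for review tasks A, B, and C (all consistent with w = 2).
tasks = {
    "A": [(1.0, 2.0)],
    "B": [(2.0, 4.0)],
    "C": [(3.0, 6.0)],
}
w_joint = train_jointly(tasks)
w_seq = train_sequentially(tasks)
```

In this toy setting both schedules approach the same parameter value; with a real model the two schedules can behave differently, which is why the disclosure presents them as alternative embodiments.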

According to the above embodiment, the initial natural language model is trained in a supervised manner using the sample software codes and sample analysis models corresponding to the various review tasks, and the resulting code review model can detect and review mismatches between the software code and the analysis model in multiple aspects, thereby outputting more comprehensive review information.

In one embodiment, the initial natural language model includes an encoder network and a decoder network, and inputting the sample software codes and sample analysis models corresponding to the various review tasks to the initial natural language model and outputting the first prediction result includes:

Inputting the sample software codes and sample analysis models corresponding to the various review tasks to the encoder network, and outputting features; wherein the features include semantic features and structural features.

Inputting the features and the features corresponding to the previous prediction result of the first prediction result into the decoder network, and outputting the first prediction result.

In an embodiment of the present disclosure, the initial natural language model includes an encoder network and a decoder network. Take a Transformer network as an example. Referring to fig. 12, the Transformer network includes an encoder and a decoder. When the model runs, the expected input is the software code and the analysis model, and the expected output is the review information. In an exemplary embodiment, the sample software code and the sample analysis model may be converted into numerical vectors so that the Transformer network can process them. The encoder network then reads the input numerical vectors, extracts semantic features and structural features from them, and passes these features to the decoder network. The input to the decoder network has two sources: the first is the semantic and structural features output by the encoder, and the second is the first prediction result previously output by the decoder. This is similar to answering a question based not only on the question itself but also on the answers given before.

A self-attention mechanism is used in the Transformer network, and the model uses the self-attention mechanism to extract the semantic features and structural features of the input content. Take the text "How are you" as an example. First, the input text is encoded into numerical vectors: each word is converted into a word embedding vector that contains the semantic information of the word, and this conversion is completed by an independent neural network model. After this preliminary processing, "How are you" yields a word embedding vector for each word: $x_1$, $x_2$, $x_3$. Then, for each input word, the query (Q), key (K), and value (V) vectors are calculated by multiplying the word embedding vector by weight matrices. The weight matrices $W$ are shared globally across the model, and their parameters are part of model training; that is, they are initially random numbers and are continuously adjusted during training until their final values are determined. For "How":

$$Q_1 = x_1 W^{Q},\quad K_1 = x_1 W^{K},\quad V_1 = x_1 W^{V} \tag{1}$$

The Q, K, V of "are" and "you" ($Q_2$, $K_2$, $V_2$, $Q_3$, $K_3$, $V_3$) are obtained in the same way. After Q, K, V are obtained, the word-to-word correlation can be calculated:

$$\mathrm{softmax}\left(\frac{Q_1 K_2^{T}}{\sqrt{d_k}}\right) V_2 \tag{2}$$

The above formula is used to calculate the correlation between "How" and "are", and the output is also a matrix. $d_k$ is a fixed constant, and softmax is a normalization function that takes a vector (1×n) as input and yields fractions between 0 and 1. The Transformer network generates its output based on the word-to-word correlation information extracted by the attention mechanism and the word-to-word relative position information (the two are also referred to as semantic features and structural features, respectively).
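The self-attention computation described above can be sketched in plain Python. This is a minimal sketch, assuming the standard scaled dot-product form (scores divided by the square root of the key dimension, softmax taken over the scores of one word against all words); the 2-dimensional embeddings and the weight matrices below are hypothetical values chosen for illustration, not parameters from the disclosure.

```python
import math

def matvec(x, W):
    """Row vector x (length d) times matrix W (d x k) -> row vector (length k)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def softmax(scores):
    """Normalize a list of scores into weights between 0 and 1 that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(embeddings, W_q, W_k, W_v):
    """Return (attention weights, outputs) for a list of word embedding vectors."""
    Q = [matvec(x, W_q) for x in embeddings]
    K = [matvec(x, W_k) for x in embeddings]
    V = [matvec(x, W_v) for x in embeddings]
    d_k = len(K[0])
    weights, outputs = [], []
    for q in Q:
        # Correlation of this word with every word, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(row, V)) for j in range(len(V[0]))])
    return weights, outputs

# Hypothetical 2-dimensional embeddings for "How", "are", "you".
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical weight matrices; in a real model these are learned parameters.
W_q = [[1.0, 0.0], [0.0, 1.0]]
W_k = [[1.0, 0.0], [0.0, 1.0]]
W_v = [[0.5, 0.0], [0.0, 0.5]]

weights, outputs = self_attention(embeddings, W_q, W_k, W_v)
```

Here `weights[0][1]` is the normalized correlation of "How" with "are", and `outputs[0]` is the attention output for "How".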

According to the above embodiment, the initial natural language model includes an encoder network and a decoder network, and the input of the decoder network includes not only the output of the encoder network but also the features corresponding to the previous prediction result, so the prediction accuracy of the trained model is higher. In addition, the self-attention mechanism in the encoder network and the decoder network provides technical support for cross-fusion of the features and improves the prediction accuracy of the code review model.

In one embodiment, obtaining a first set of samples includes:

A plurality of review rules are generated based on at least one of the attributes of the standard structures, the standard functions, and the static and interactive relationships between the standard structures in the analytical model.

Based on the multiple review rules, multiple review tasks are determined, and sample software codes and sample analysis models matched with each review task are obtained.

Specifically, the concepts of the attributes of the standard structures, the standard functions, and the static relationships and interaction relationships between the standard structures in the analysis model are the same as those described in the above embodiments and are not repeated here. Accordingly, the review rules may include review rules of the missing class, such as: code information described by a code segment of the software code is not recorded in the analysis model; model information recorded in the analysis model has no corresponding code segment implementing it in the software code. The review rules may also include review rules of the inconsistent class, such as: code information described by a code segment of the software code is inconsistent with model information recorded in the analysis model. The code information and the model information have been described in the above embodiments and are not described again here. It should be noted that the manner of setting the review rules is not limited to the above examples; for example, new review rules may be added as the application scenario changes, and those skilled in the art may make other modifications in light of the technical spirit of the present application, but as long as the implemented functions and effects are the same as or similar to those of the present application, all such modifications should be covered by the protection scope of the present application.

In the above embodiment, the plurality of review rules are generated based on at least one of the attributes of the standard structures, the standard functions, and the static relationships and interaction relationships between the standard structures in the analysis model. A corresponding review task is set for each review rule. Sample software codes and sample analysis models matched with each review task are obtained, wherein the sample software codes and sample analysis models may include positive samples and negative samples. A code review model trained with the sample software codes and sample analysis models corresponding to the various review rules can comprehensively detect the matching information between the software code and the analysis model.
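To make the two rule categories concrete, the following minimal sketch checks them by hand on attribute sets. It is not the trained code review model: the dictionary representation, the `review` function, and the "Department" and "Payroll" structures are assumptions introduced for illustration, and a real review task would also cover functions and relationships.

```python
def review(model_classes, code_classes):
    """Compare attribute sets per standard structure and return review findings.

    Both arguments map a structure name to a set of attribute names.
    """
    findings = []
    # Missing class: recorded in the analysis model, no code segment implements it.
    for name in sorted(model_classes.keys() - code_classes.keys()):
        findings.append(("missing", f"'{name}' is in the analysis model but has no code segment"))
    # Missing class: described by the code, not recorded in the analysis model.
    for name in sorted(code_classes.keys() - model_classes.keys()):
        findings.append(("missing", f"'{name}' is in the code but not recorded in the analysis model"))
    # Inconsistent class: present on both sides, but the attributes differ.
    for name in sorted(model_classes.keys() & code_classes.keys()):
        if model_classes[name] != code_classes[name]:
            findings.append(("inconsistent", f"attributes of '{name}' differ between code and analysis model"))
    return findings

# Hypothetical analysis model and code summaries.
model_classes = {
    "Employee": {"name", "gender", "telephone", "position"},
    "Department": {"name"},
}
code_classes = {
    "Employee": {"name", "gender"},
    "Payroll": {"amount"},
}
findings = review(model_classes, code_classes)
```

Findings of this kind could serve as annotated review information for positive and negative samples of each review task, provided they are verified before use.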

In one embodiment, referring to fig. 13, the obtaining manner of the initial natural language model includes:

Step S1301, obtaining a second sample set; wherein the second sample set includes unlabeled sample text.

Step S1303, performing word segmentation processing on the sample text to obtain words.

In the embodiment of the disclosure, the second sample set includes a large amount of unlabeled sample text, such as web pages, books, and articles. In an exemplary embodiment, word segmentation is performed on the sample text to obtain words. For example, after word segmentation of "weather today is good", the words "today", "weather", and "good" are obtained.

Step S1305, sequentially inputting the words into the initial network model according to the order of the words in the sample text, and outputting a second prediction result, wherein the second prediction result comprises the predicted next word of the words.

Step S1307, based on the difference between the second prediction result and the actual word next to the word, performing iterative adjustment on the initial network model until the difference meets a preset requirement, to obtain an initial natural language model.

In the embodiment of the disclosure, after a word is input to the initial network model, probability values for the next word of the word are output; for example, the probability of the next word being "good" is 80% and the probability of the next word being "me" is 50%, and the initial network model selects "good", which has the highest probability, as the second prediction result.
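Next-word prediction can be illustrated with a much simpler stand-in than a neural network. The sketch below swaps in a bigram count model: it counts which word follows which in the sample text and predicts the most frequent follower. Whitespace splitting stands in for word segmentation, and the three-sentence corpus is hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count, for each word, which words follow it in the sample text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()  # stand-in for word segmentation
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return (most probable next word, its probability) for a given word."""
    counts = following.get(word)
    if not counts:
        return None, 0.0  # the word never appeared with a follower
    total = sum(counts.values())
    best, count = counts.most_common(1)[0]
    return best, count / total

# Hypothetical unlabeled sample text.
corpus = [
    "weather today is good",
    "weather today is fine",
    "weather today is good",
]
model = train_bigram(corpus)
word, prob = predict_next(model, "is")
```

In this corpus "good" follows "is" in two of three sentences, so it is selected as the prediction; a neural initial network model learns such probabilities from its parameters instead of from explicit counts.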

In an exemplary embodiment, an autoregressive training approach may also be employed in training the initial natural language model. An autoregressive model is a mathematical model that predicts future values from previous values in time series data; its output is based on the previously generated text and the question text. Using autoregression to calculate the probability of the next word can further improve the prediction accuracy of the initial natural language model.

In one embodiment, the acquiring the first set of samples includes:

Acquiring initial sample software codes and initial sample analysis models corresponding to the various review tasks. Performing code reconstruction, comment addition, or variable name modification on the initial sample software code to obtain expanded sample software code.

According to the embodiment of the disclosure, initial sample software codes and initial sample analysis models corresponding to the various review tasks are obtained. In an exemplary embodiment, code reconstruction may be performed on the initial sample software code, for example: deleting some code segments of the initial sample software code, exchanging code segments located at different positions in the initial sample software code, inserting code segments from other initial sample software code into the initial sample software code currently being processed, and the like. In another exemplary embodiment, comments may be added to the initial sample software code, for example, comments describing the static relationships between a standard structure and other standard structures. In another exemplary embodiment, variable names in the initial sample software code may be modified, for example, modifying the name of a standard structure or modifying the attributes of a standard structure. With the above modifications, expanded sample software code can be obtained.
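The three kinds of expansion can be sketched as small text transformations. This is a minimal sketch under assumptions: the sample code is written in Python, the helper names are hypothetical, and identifiers are renamed with a whole-word regular expression rather than a real parser. Whether the annotated review information still applies after each transformation would need to be checked separately.

```python
import re

def rename_identifier(code, old, new):
    """Rename an identifier everywhere it appears as a whole word."""
    return re.sub(rf"\b{re.escape(old)}\b", new, code)

def add_comment(code, comment):
    """Prepend a comment line; the behavior of the code is unchanged."""
    return f"# {comment}\n{code}"

def swap_fragments(code, i, j):
    """Exchange two lines (0-based indices) of the code."""
    lines = code.split("\n")
    lines[i], lines[j] = lines[j], lines[i]
    return "\n".join(lines)

# Hypothetical initial sample software code.
original = "class Employee:\n    name = ''\n    gender = ''"

augmented = [
    rename_identifier(original, "Employee", "Staff"),
    add_comment(original, "Employee is associated with Department"),
    swap_fragments(original, 1, 2),
]
```

Each entry of `augmented` is one expanded sample derived from the same initial sample.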

According to the above embodiment, the number of sample software codes can be expanded by performing code reconstruction, comment addition, and variable name modification on the initial sample software code, and a code review model trained with the expanded sample software code has good generalization capability and accuracy.

In one embodiment, the acquiring the first set of samples includes:

Acquiring candidate sample software codes corresponding to the various review tasks and the corresponding candidate sample analysis models; inputting the candidate sample software codes and the corresponding candidate sample analysis models into the initial natural language model, and outputting a third prediction result and a corresponding prediction probability.

In the embodiment of the disclosure, the candidate sample software codes corresponding to the various review tasks and the corresponding candidate sample analysis models are input to the initial natural language model, and a third prediction result and the corresponding prediction probability are output. Sample software codes and sample analysis models for the first sample set are then selected based on the probability distribution of the prediction probabilities corresponding to the third prediction result. For example, if the third prediction results obtained after inputting a candidate sample software code and candidate sample analysis model into the initial natural language model several times differ from one another, the candidate sample software code and candidate sample analysis model are used as a sample software code and sample analysis model in the first sample set. For another example, if, after inputting a candidate sample software code and candidate sample analysis model into the initial natural language model several times, the probability values of two or more candidate prediction results are relatively close, the candidate sample software code and candidate sample analysis model can be used as a sample software code and sample analysis model in the first sample set. It should be noted that the extension of the first sample set in the present application is not limited to the above embodiment; for example, transfer learning may also be used to improve the training efficiency of the code review model.

According to the embodiment of the disclosure, in view of the complexity of labeling the first sample set, candidate sample software codes and candidate sample analysis models are screened in advance according to the probability distribution of the third prediction result, and the sample software codes and sample analysis models with the highest information value are selected, which facilitates rapid convergence of the code review model.
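One way to read "probability values of two or more candidate prediction results are relatively close" is as a small gap between the two highest prediction probabilities. The sketch below selects candidates on that basis; the margin threshold of 0.1, the function names, and the probability values are assumptions made for illustration.

```python
def top_two_margin(probabilities):
    """Gap between the two most probable candidate prediction results."""
    ranked = sorted(probabilities, reverse=True)
    return ranked[0] - ranked[1]

def select_informative(candidates, max_margin=0.1):
    """Keep candidates whose top two prediction probabilities are close."""
    return [
        name
        for name, probs in candidates.items()
        if top_two_margin(probs) <= max_margin
    ]

# Hypothetical prediction probabilities for three candidate samples.
candidates = {
    "sample_1": [0.48, 0.45, 0.07],  # model is unsure: worth labeling
    "sample_2": [0.95, 0.03, 0.02],  # model is confident: less informative
    "sample_3": [0.40, 0.35, 0.25],  # model is unsure: worth labeling
}
selected = select_informative(candidates)
```

Only the selected candidates would then be labeled and added to the first sample set, which keeps the labeling effort focused on samples the model finds ambiguous.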

In one embodiment, after outputting the code lines that do not conform to the analysis model and the corresponding review information, the method further includes:

Displaying the software code in a code display column, and marking the code segments that do not match the analysis model.

Displaying the review information corresponding to the code segments in a review information display column.

Specifically, referring to fig. 14, the embodiment of the present disclosure presents the review information through a visual interface, in which the original content of the software code is displayed in the code display column. In an exemplary embodiment, when a code segment does not match the analysis model, the code display column may mark the code segment, for example with the box shown in FIG. 14. Alternatively, the font color or font size of the code segment may be changed to mark it. In an exemplary embodiment, the review information corresponding to the code segment may be presented in the review information display column.

In the above embodiment, the software code is displayed in the code display column with the code segments that do not match the analysis model marked, and the review information corresponding to the code segments is displayed in the review information display column. This helps code developers quickly find defects in the software code and learn the reasons for the defects from the review information.

In one embodiment, displaying the review information corresponding to the code segment in a review information display column includes:

Displaying the review information corresponding to the code segment and the acquisition address of the analysis model in the review information display column.

In the embodiment of the disclosure, the review information display column can display not only the review information corresponding to the code segment but also the acquisition address of the analysis model. Clicking the acquisition address displays the analysis model, which makes it convenient for developers to compare the analysis model with the software code and to verify the review information.

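In one embodiment, the method further includes the steps of: receiving associated question information for the review information; and inputting the associated question information, the software code, and the analysis model into the code review model, and outputting reply information corresponding to the associated question information.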

In embodiments of the present disclosure, the associated question information may include question information expressed in natural language. For example, the review information includes: in the standard analysis structure diagram, standard structure A and standard structure B are in a generalization relationship, so standard structure B should serve as a sub-level standard structure of standard structure A in the software code; however, standard structure A and standard structure B in the software code are not in a generalization relationship, so modification is suggested. The associated question information includes: please explain the generalization relationship between standard structure A and standard structure B in detail. At this time, the associated question information, the software code, and the analysis model may be input to the code review model again, and reply information corresponding to the associated question information is output. For example, the reply information includes: generalization represents a set relationship; standard structure B and standard structure C are generalized to standard structure A, that is, standard structure B and standard structure C are subsets of standard structure A. From the code design perspective, standard structure B and standard structure C inherit standard structure A. In an exemplary embodiment, the reply information may be questioned further, for example: standard structure C is also in a generalization relationship with standard structure A, and standard structure C is already the parent standard structure of standard structure A; if standard structure A is also made the parent standard structure of standard structure B, a multi-layer generalization relationship C-A-B is formed, which is inconvenient for code management, so the code was not written that way.
Accordingly, the code review model may perform the input operation again and give reply information, such as: the multi-level inheritance relationship you mention is indeed hard to manage in some programming languages, but according to the domain model, standard structure A and standard structure B are in a generalization relationship, so standard structure A needs to be written as the parent standard structure of standard structure B. If there is a dispute, it can be discussed further with the modeler.

According to the above embodiment, a function for inputting associated question information about the review information is provided, and the associated question information, the software code, and the analysis model are input into the code review model again, so that the code review model combines its previous reply with the associated question information to explain in more detail the code segments whose review is in question, which improves the accuracy of the review information.

In one embodiment, referring to fig. 15, a training method of a code review model is provided. The method is described by taking its application to a terminal as an example; it is understood that the method may also be applied to a server, or to a system including a terminal and a server and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

Step S1501, acquiring a first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information; the sample analysis model comprises at least one of a standard analysis structure diagram corresponding to the sample software code and a standard analysis sequence diagram of a standard function corresponding to the sample software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; wherein the review information is used to characterize matching information of the sample software code and the analytical model.

The review task may correspond to a review rule, and the review rule may be set according to model information in the analysis model. The review rules may include review rules of the missing class, such as: code information described by a code segment of the software code is not recorded in the analysis model; model information recorded in the analysis model has no corresponding code segment implementing it in the software code. The review rules may also include review rules of the inconsistent class, such as: code information described by a code segment of the software code is inconsistent with model information recorded in the analysis model.

The analysis model is a way of describing the core knowledge of a software system. The standard analysis structure diagram is one kind of analysis model and is used for describing core knowledge in the software system. In the standard analysis structure diagram, the core knowledge can be described using the concept of a standard structure. Static relationships exist between standard structures, and the static relationships may include generalization relationships, association relationships, and dependency relationships. A generalization relationship represents a set relationship: standard structure B and standard structure C are generalized to standard structure A, that is, standard structure B and standard structure C are subsets of standard structure A; from the code design perspective, standard structure B and standard structure C inherit standard structure A. An association relationship represents a correlation between the individuals of standard structures: if two standard structures form an association relationship, the individuals of the two standard structures are linked. A dependency relationship represents a correlation between standard structures that is neither a generalization relationship nor an association relationship.
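As a rough illustration of how a standard analysis structure diagram might be held in memory and compared with code, the sketch below records the three static relationships per standard structure and checks generalization against Python class inheritance. The `StandardStructure` dataclass, its field names, and `check_generalization` are hypothetical; the disclosure does not prescribe a data format, and the trained code review model performs this comparison without hand-written rules.

```python
from dataclasses import dataclass, field

@dataclass
class StandardStructure:
    """One standard structure (e.g. a class) in a standard analysis structure diagram."""
    name: str
    attributes: list = field(default_factory=list)
    generalizes_to: list = field(default_factory=list)   # parent structure names
    associated_with: list = field(default_factory=list)  # linked structure names
    depends_on: list = field(default_factory=list)       # other related structures

def check_generalization(structures, code_classes):
    """Report model generalizations that the code does not implement as inheritance."""
    problems = []
    for s in structures:
        for parent in s.generalizes_to:
            child_cls = code_classes.get(s.name)
            parent_cls = code_classes.get(parent)
            if child_cls is None or parent_cls is None or not issubclass(child_cls, parent_cls):
                problems.append(f"{s.name} should inherit {parent}")
    return problems

# Analysis model: B and C are generalized to A.
model = [
    StandardStructure("A"),
    StandardStructure("B", generalizes_to=["A"]),
    StandardStructure("C", generalizes_to=["A"]),
]

# Code under review: C inherits A, but B does not.
class A:
    pass

class B:
    pass

class C(A):
    pass

problems = check_generalization(model, {"A": A, "B": B, "C": C})
```

The single reported problem corresponds to review information of the inconsistent class: the generalization recorded in the analysis model is not reflected in the code.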

The standard analysis sequence diagram belongs to one of analysis models and is used for representing standard functions corresponding to the standard structure and interaction relations among the standard structures. For example, referring to FIG. 6, the analysis sequence diagram corresponds to a new page creation function, and the participating standard structures are product managers, engineers, and testers. The figure shows what kind of responsibilities each standard structure assumes at each step, which responsibilities are triggered by which standard structure, and the interactions between standard structures.

The review information is used to characterize the matching information between the software code and the analysis model, for example, that certain code segments of the software code match the analysis model or that certain code segments of the software code do not match the analysis model. A mismatch between a code segment and the analysis model may include review information of the missing class and review information of the inconsistent class. In an exemplary embodiment, the review information of the missing class may include: code information described by a code segment of the software code is not recorded in the analysis model, or model information recorded in the analysis model has no corresponding code segment implementing it in the software code. In another exemplary embodiment, the review information of the inconsistent class may include: code information described by a code segment of the software code is inconsistent with the model information recorded in the analysis model.

In step S1503, the sample software codes and the sample analysis models corresponding to the multiple review tasks are respectively input to the initial natural language model, and the first prediction result is output.

Step S1505, iteratively adjusting the initial natural language model based on the difference between the first prediction result and the annotated review information until the difference meets the preset requirement, to obtain the code review model.

The code review model obtained by the method can automatically analyze and compare the software code with the analysis model and automatically find the differences between them. Further, the analysis model includes a standard analysis structure diagram and a standard analysis sequence diagram. The standard analysis structure diagram represents the standard structures and the static relationships between them, so by comparing the standard analysis structure diagram with the software code, the code review model can discover problems with the various static relationships between the standard structures in the software code. The standard analysis sequence diagram represents the standard functions corresponding to the standard structures and the interaction relationships between the standard structures, so by comparing the standard analysis sequence diagram with the software code, the code review model can find problems such as function implementation errors and errors in the order in which steps are implemented. Therefore, according to the embodiment of the disclosure, the comparison of differences between the software code and the analysis model can be realized automatically, efficiently, and accurately, and problems in the software code can be found.

In a specific embodiment, the method of the application can be applied to various application scenarios such as personnel management systems, e-commerce systems, and document management systems. In the prior art, whether the sequence diagram and the standard analysis structure diagram are consistent with the code implementation is judged manually. Manual judgment has the following disadvantages: it is time-consuming and inefficient; it depends on the experience and ability of the reviewer, so it is subjective and inconsistent; and it has difficulty finding subtle differences and potential problems. The application provides a code review method that can efficiently, accurately, and automatically compare the differences between the software code and the analysis model, discover problems in the software code, and give detailed and comprehensive review information.

Referring to FIG. 16, the code review method of the present application includes a model training phase and a deployment usage phase.

Specifically, the deployment usage phase may further include a review information generation stage and a review information display stage. The model training phase may in turn include a training stage of an initial natural language model and a training stage of a code review model.

The review information generation stage comprises: obtaining the software code and the corresponding analysis model, inputting the software code and the analysis model into the code review model, and outputting the review information corresponding to the software code.
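By way of illustration only, the generation stage may be sketched as follows, with the semantic information of the analysis model written out as plain text and combined with the software code into a single model input. The names `Relation`, `AnalysisModel`, `to_text`, and `build_review_input`, as well as the text layout and section markers, are hypothetical and chosen only for this sketch; the application does not prescribe a particular format.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """A static relation between two standard structures (hypothetical format)."""
    source: str
    kind: str      # e.g. "generalization", "association", "dependency"
    target: str


@dataclass
class AnalysisModel:
    """Semantic content of an analysis model (illustrative only)."""
    structures: list = field(default_factory=list)
    relations: list = field(default_factory=list)


def to_text(model: AnalysisModel) -> str:
    """Write the semantic information of the model as plain text lines."""
    lines = [f"structure: {name}" for name in model.structures]
    lines += [f"relation: {r.source} -{r.kind}-> {r.target}" for r in model.relations]
    return "\n".join(lines)


def build_review_input(code: str, model: AnalysisModel) -> str:
    """Combine the model text and the software code into a single input string."""
    return f"[ANALYSIS MODEL]\n{to_text(model)}\n[SOFTWARE CODE]\n{code}"


model = AnalysisModel(
    structures=["Employee", "Manager"],
    relations=[Relation("Manager", "generalization", "Employee")],
)
review_input = build_review_input("class Manager: ...", model)
print(review_input)
```

The resulting string would then be passed to the code review model; how the model consumes it is outside the scope of this sketch.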

In one embodiment, the analysis model includes a standard analysis structure diagram of a software system corresponding to the software code, and a standard analysis sequence diagram of a system use case of the software system. In the standard analysis structure diagram, a standard structure describes a core concept of the system, for example, a class in a programming language. A class is a programming construct in software development that serves as a blueprint for entities with the same attributes (data members) and behaviors (member functions), and is a core concept of object-oriented programming languages (such as C++ and Java). In a particular implementation, the standard analysis structure diagram may include an analysis class diagram. For example, referring to FIG. 2, if "employees" appear in the personnel management system, they may be represented by an "employee class". The attributes of the employee in FIG. 2 include name, gender, telephone, and job position, indicating that these attributes are of interest to the system; other attributes, such as height and weight, may not be of interest to the system and need not be listed. In the embodiments of the disclosure, the static relations among the standard structures in the standard analysis structure diagram describe the fixed state of the software system, that is, properties that hold throughout the existence of the system. The static relations may include generalization relations, association relations, and dependency relations.
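To make the notion of a standard structure concrete, the employee class of FIG. 2 might be written as in the sketch below. The attribute types are assumed, and the `Manager` and `Department` classes are hypothetical additions, not taken from the figures, that illustrate a generalization relation and an association relation respectively.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    """Employee class with the attributes listed in FIG. 2 (types are assumed)."""
    name: str
    gender: str
    telephone: str
    position: str


@dataclass
class Manager(Employee):
    """Hypothetical subclass: generalization relation (a Manager is an Employee)."""
    reports: list = field(default_factory=list)


@dataclass
class Department:
    """Hypothetical class: association relation (a Department holds Employees)."""
    title: str
    members: list = field(default_factory=list)


alice = Manager("Alice", "female", "555-0100", "engineering manager")
dept = Department("R&D", members=[alice])
print(isinstance(alice, Employee), len(dept.members))
```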

In one embodiment, the standard analysis sequence diagram is one kind of analysis model and is used for representing the standard functions corresponding to the standard structures and the interaction relations among the standard structures. For example, referring to FIG. 6, the analysis sequence diagram corresponds to a new page creation function, and the participating standard structures are product managers, engineers, and testers.

In one embodiment, the review information is used to characterize the matching information of the software code and the analysis model, for example, that certain code segments of the software code match the analysis model or that certain code segments do not match it. Review information for a mismatch between a code segment and the analysis model may include review information of a missing class and review information of an inconsistent class. In an exemplary embodiment, the review information of the missing class may include: code information described by a code segment of the software code is not recorded in the analysis model, or model information described in the analysis model has no corresponding code segment implementing it in the software code. In another exemplary embodiment, the review information of the inconsistent class may include: the code information described by a code segment of the software code is inconsistent with the model information described in the analysis model. In a specific embodiment, for example, software code 1 and analysis model 2 are input to the code review model, and the following review information for software code 1 is output: "Standard structure A and standard structure B in the analysis model are in a generalization relation, so standard structure B should be a sub-level standard structure of standard structure A in software code 1; however, standard structure A and standard structure B in software code 1 have no generalization relation. Modification is suggested."
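The two kinds of mismatch can be illustrated with a simple rule-based comparison. The sketch below is not the trained code review model; it only uses Python's `ast` module to compare class names and base classes against a hypothetical dictionary standing in for the analysis model, in order to show what missing-class and inconsistent-class review information might look like. The function name `review` and the dictionary format are assumptions.

```python
import ast


def review(code: str, model: dict) -> list:
    """Compare classes in `code` with `model`, a {class_name: parent_or_None} map."""
    found = {}
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.ClassDef):
            bases = [b.id for b in node.bases if isinstance(b, ast.Name)]
            found[node.name] = bases[0] if bases else None

    findings = []
    # Missing class: recorded in the model but not implemented in the code.
    for name in sorted(model.keys() - found.keys()):
        findings.append(("missing", f"{name} is in the analysis model but not in the code"))
    # Missing class: implemented in the code but not recorded in the model.
    for name in sorted(found.keys() - model.keys()):
        findings.append(("missing", f"{name} is in the code but not in the analysis model"))
    # Inconsistent class: present in both, but the generalization relation differs.
    for name in sorted(model.keys() & found.keys()):
        if model[name] != found[name]:
            findings.append(
                ("inconsistent", f"{name}: model parent is {model[name]}, code parent is {found[name]}")
            )
    return findings


code = "class A:\n    pass\n\nclass B:\n    pass\n\nclass C:\n    pass\n"
model = {"A": None, "B": "A", "D": None}
for kind, message in review(code, model):
    print(kind, "-", message)
```

Here the finding for `B` corresponds to the example above, in which the analysis model records a generalization relation that the software code does not implement.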

In the review information display stage, the software code is displayed in a code display column, and code segments that do not match the analysis model are marked. The review information corresponding to each such code segment is displayed in a review information display column.

In one embodiment, referring to FIG. 14, the embodiment of the present disclosure presents the review information via a visual interface, in which the original content of the software code is displayed in the code display column. In an exemplary embodiment, when a code segment does not match the analysis model, the code display column may mark the code segment, for example with the box shown in FIG. 14. Alternatively, the font color or font size of the code segment may be changed to mark it. In an exemplary embodiment, the review information corresponding to the code segment may be presented in the review information display column.
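A plain-text approximation of the two display columns may look like the sketch below. It is only an illustration: the function `render`, the `>>` marker used in place of the box of FIG. 14, and the mapping from line numbers to review text are all assumptions.

```python
def render(code: str, findings: dict) -> str:
    """Render code with mismatched lines marked and their review text alongside.

    `findings` maps a 1-based line number to review text (hypothetical format).
    """
    rows = []
    for number, line in enumerate(code.splitlines(), start=1):
        marker = ">>" if number in findings else "  "
        note = findings.get(number, "")
        rows.append(f"{marker} {number:3d} | {line:<24} | {note}".rstrip())
    return "\n".join(rows)


code = "class A:\n    pass\nclass B:\n    pass"
print(render(code, {3: "B should inherit from A"}))
```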

In one embodiment, the review information display column can display not only the review information corresponding to the code segment but also the acquisition address of the analysis model. Clicking the acquisition address displays the analysis model.

In one embodiment, associated question information entered for the review information is received in the review information display column; the associated question information, the software code, and the analysis model are input into the code review model again, and reply information corresponding to the associated question information is output and displayed.
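The follow-up interaction may be sketched as below, under the assumption that the code review model can be treated as a function from an input string to a reply string. The names `ask_follow_up` and `stub_model` and the section markers are hypothetical; `stub_model` merely stands in for the trained model so that the sketch runs on its own.

```python
from typing import Callable


def ask_follow_up(
    question: str,
    code: str,
    model_text: str,
    review_model: Callable[[str], str],
) -> str:
    """Send the follow-up question, code, and analysis model text to the model again."""
    prompt = (
        f"[ANALYSIS MODEL]\n{model_text}\n"
        f"[SOFTWARE CODE]\n{code}\n"
        f"[QUESTION]\n{question}"
    )
    return review_model(prompt)


def stub_model(prompt: str) -> str:
    """Stand-in for the trained model: reports which sections it received."""
    sections = [line for line in prompt.splitlines() if line.startswith("[")]
    return "received " + ", ".join(sections)


reply = ask_follow_up(
    "How should B be changed?", "class B: ...", "B -generalization-> A", stub_model
)
print(reply)
```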

In the training stage of the initial natural language model, the training method comprises the following steps: acquiring a second sample set, wherein the second sample set comprises unlabeled sample text; performing word segmentation on the sample text to obtain words; inputting the words into an initial network model in the order in which they appear in the sample text, and outputting a second prediction result, wherein the second prediction result comprises the predicted next word for each of the words; and iteratively adjusting the initial network model based on the difference between the second prediction result and the actual next word until the difference meets a preset requirement, so as to obtain the initial natural language model.
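The next-word training loop can be illustrated with a deliberately small stand-in. The sketch below assumes a bigram predictor (it conditions only on the previous word, unlike a full network model), whitespace splitting in place of real word segmentation, cross-entropy as the measure of difference, and an arbitrary threshold as the preset requirement. The names `train_next_word` and `predict_next` are hypothetical.

```python
import math


def train_next_word(text: str, threshold: float = 0.5, lr: float = 1.0, max_steps: int = 1000):
    """Fit a bigram next-word predictor by gradient descent on cross-entropy."""
    words = text.split()  # whitespace split stands in for word segmentation
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    pairs = [(index[a], index[b]) for a, b in zip(words, words[1:])]
    size = len(vocab)
    logits = [[0.0] * size for _ in range(size)]  # logits[prev][next]

    loss = float("inf")
    for _ in range(max_steps):
        loss = 0.0
        grads = [[0.0] * size for _ in range(size)]
        for prev, actual in pairs:
            exps = [math.exp(v) for v in logits[prev]]
            total = sum(exps)
            probs = [e / total for e in exps]
            loss -= math.log(probs[actual])  # difference from the actual next word
            for j in range(size):
                grads[prev][j] += probs[j] - (1.0 if j == actual else 0.0)
        loss /= len(pairs)
        if loss < threshold:  # preset requirement met: stop adjusting
            break
        for i in range(size):
            for j in range(size):
                logits[i][j] -= lr * grads[i][j] / len(pairs)
    return vocab, logits, loss


def predict_next(word: str, vocab: list, logits: list) -> str:
    """Return the most likely next word after `word`."""
    row = logits[vocab.index(word)]
    return vocab[row.index(max(row))]


vocab, logits, loss = train_next_word("the model predicts the next word")
print(round(loss, 3), predict_next("model", vocab, logits))
```

No labels are needed because the actual next word in the text itself serves as the training target, which is why the second sample set can consist of unlabeled text.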

In the training stage of the code review model, a first sample set is acquired on the basis of the initial natural language model; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, and the sample software codes are annotated with review information. The sample software codes and sample analysis models corresponding to the various review tasks are respectively input into the initial natural language model, and a first prediction result is output. The initial natural language model is iteratively adjusted based on the difference between the first prediction result and the annotated review information until the difference meets a preset requirement, so as to obtain the code review model.
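One possible shape for the first sample set is sketched below. The field names of `ReviewSample` are assumptions, and the exact-match `mismatch_rate` is used only as a simple, readable measure of the difference between predictions and annotations; an actual fine-tuning procedure would more likely use a differentiable loss. The predictor `always_match` is a stand-in, not the model described in this application.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReviewSample:
    """One annotated sample of the first sample set (field names are assumed)."""
    task: str        # review task, e.g. checking generalization relations
    code: str        # sample software code
    model_text: str  # sample analysis model, serialized as text
    label: str       # annotated review information


def mismatch_rate(samples: list, predict: Callable[[str, str], str]) -> float:
    """Fraction of samples whose prediction differs from the annotated label."""
    wrong = sum(1 for s in samples if predict(s.code, s.model_text) != s.label)
    return wrong / len(samples)


samples = [
    ReviewSample("generalization", "class B: ...", "B -generalization-> A", "inconsistent"),
    ReviewSample("generalization", "class B(A): ...", "B -generalization-> A", "match"),
]


def always_match(code: str, model_text: str) -> str:
    """Stand-in predictor used only to exercise the metric."""
    return "match"


print(mismatch_rate(samples, always_match))
```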

It should be understood that, although the steps in the flowcharts of the embodiments described above are shown in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Based on the same inventive concept, the embodiment of the application also provides a code review device for implementing the code review method described above. The solution provided by the device is implemented in a manner similar to that described for the method, so for the specific limitations in the one or more code review device embodiments provided below, reference may be made to the limitations of the code review method above, which are not repeated here.

In one embodiment, as shown in FIG. 17, there is provided a code review apparatus comprising:

A first obtaining module 1701, configured to obtain a software code and a corresponding analysis model, where the analysis model includes at least one of a standard analysis structure diagram corresponding to the software code and a standard analysis sequence diagram corresponding to a standard function of the software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures;

The generating module 1703 is configured to input the software code and the analysis model to a code review model, and output review information corresponding to the software code; wherein the review information is used for characterizing matching information of the software code and the analysis model; the code review model is obtained by training an initial natural language model based on a sample software code and a corresponding sample analysis model.

In one embodiment, the first acquisition module is further configured to:

In one embodiment, the generating module is further configured to:

In one embodiment, the code review device further comprises:

In one embodiment, the third obtaining module is further configured to:

determining a plurality of review tasks based on the plurality of review rules;

In one embodiment, the code review device further comprises:

In one embodiment, the third obtaining module is further configured to:

In one embodiment, the code review device further comprises:

In one embodiment, the second display module is further configured to:

In one embodiment, referring to FIG. 18, a training apparatus 1800 for a code review model is provided, comprising:

A second acquiring module 1801, configured to acquire a first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information; the sample analysis model comprises at least one of a standard analysis structure diagram corresponding to the sample software code and a standard analysis sequence diagram of a standard function corresponding to the sample software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; wherein the review information is used to characterize matching information of the sample software code and the analysis model;

The input module 1803 is configured to input sample software codes and sample analysis models corresponding to multiple review tasks to an initial natural language model, and output a first prediction result;

A processing module 1805, configured to iteratively adjust the initial natural language model based on the difference between the first prediction result and the annotated review information until the difference meets a preset requirement, thereby obtaining a code review model.

The modules in the above code review device and in the training device of the code review model may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in or independent of a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor may call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 19. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing code review data. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a code review method.

In one embodiment, a computer device is provided, which may be a terminal, and an internal structure diagram thereof may be as shown in fig. 20. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input means. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface, the display unit and the input device are connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a code review method. The display unit of the computer equipment is used for forming a visual picture, and can be a display screen, a projection device or a virtual reality imaging device, wherein the display screen can be a liquid crystal display screen or an electronic ink display screen, the input device of the computer equipment can be a touch layer covered on the display screen, can also be a key, a track ball or a touch pad arranged on a shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.

It will be appreciated by those skilled in the art that the structure shown in FIG. 20 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, optical memory, high density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric random access memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in various forms such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), etc. The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.

The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this description.

The foregoing embodiments represent only a few implementations of the application. Although they are described specifically and in detail, they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the application shall be determined by the appended claims.

Claims

1. A code review method, comprising:

Acquiring a software code and a corresponding analysis model, wherein the analysis model comprises at least one of a standard analysis structure diagram of a software system corresponding to the software code and a standard analysis sequence diagram of a system use case of the software system corresponding to the software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; the standard analysis sequence diagram comprises a text description of standard functions of each standard structure; the static relationship comprises an association relationship or a dependency relationship, and the obtaining of the standard analysis structure diagram corresponding to the software code comprises the following steps: determining a plurality of groups from each standard structure of a software system corresponding to the software code; wherein each group includes a first standard structure and a second standard structure having a static relationship; determining the number of first standard structures in the group and a number identifier for representing the number of the first standard structures, the number of second standard structures and a number identifier for representing the number of the second standard structures respectively; connecting the first standard structure of each group with the corresponding second standard structure through static relation symbols, and marking corresponding number identifiers on the first standard structure and the second standard structure to obtain a standard analysis structure diagram;

Inputting the software codes and the analysis model into a code review model, and outputting review information corresponding to the software codes; wherein the review information is used for characterizing matching information of the software code and the analysis model; the code review model is obtained by training an initial natural language model based on sample software codes and corresponding sample analysis models corresponding to various review tasks, wherein the various review tasks are obtained based on various review rules; the inputting the software code and the analysis model into a code review model includes: extracting semantic information in the analysis model, wherein the semantic information comprises standard structures and static relations among the standard structures, and/or standard functions corresponding to the standard structures and interaction relations among the standard structures; writing the semantic information into a text file to obtain a text file containing the semantic information; and inputting the text file containing the semantic information and the software code into a code review model.

2. The method of claim 1, wherein the static relationship comprises a generalized relationship, and the obtaining a standard analysis structure corresponding to the software code comprises:

3. The method of claim 1, wherein obtaining a standard analysis sequence diagram corresponding to the software code comprises:

4. The method of claim 1, wherein the code information or the model information includes at least one of each standard structure, an attribute of the standard structure, a standard function corresponding to the standard structure, a static relationship between the standard structures, and an interactive relationship, the review information includes review information of a missing class, and the outputting the review information corresponding to the software code includes:

5. The method according to claim 1, wherein the code information or the model information includes at least one of respective standard structures, attributes of the standard structures, standard functions corresponding to the standard structures, static relationships between the standard structures, and interaction relationships, the review information includes review information of a missing class, the review information includes review information of a non-uniform class, and the outputting the review information corresponding to the software code includes:

6. The method of claim 1, wherein the code review model is obtained by:

7. The method of claim 6, wherein the initial natural language model includes an encoder network and a decoder network, wherein inputting the sample software codes and the sample analysis models corresponding to the plurality of review tasks to the initial natural language model, respectively, and outputting the first prediction result includes:

8. The method of claim 6, wherein obtaining the first set of samples comprises:

determining a plurality of review tasks based on the plurality of review rules;

9. The method of claim 6, wherein the initial natural language model is obtained by:

acquiring a second sample set; wherein the second sample set comprises unlabeled sample text;

word segmentation processing is carried out on the sample text to obtain words;

Inputting the words into an initial network model in sequence according to the appearance sequence of the words in the sample text, and outputting a second prediction result, wherein the second prediction result comprises the predicted next word of the words;

And iteratively adjusting the initial network model based on the difference between the second prediction result and the actual next word of the words until the difference meets the preset requirement, so as to obtain an initial natural language model.

10. The method of claim 1, further comprising, after said outputting the review information corresponding to the software code:

Displaying the software codes in a code display column and marking and displaying code fragments which are not matched with the analysis model;

11. The method of claim 10, wherein displaying the review information corresponding to the code segment in a review information display column comprises:

12. The method of claim 10, further comprising, after the review information display column displays the review information corresponding to the code segment:

13. A method of training a code review model, comprising:

Acquiring a first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information; the sample analysis model comprises at least one of a standard analysis structure diagram of a software system corresponding to the sample software code and a standard analysis sequence diagram of a system use case of the software system corresponding to the sample software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; the standard analysis sequence diagram comprises a text description of standard functions of each standard structure; wherein the review information is used to characterize matching information of the sample software code and the sample analysis model; the static relationship comprises an association relationship or a dependency relationship, and the obtaining of the standard analysis structure diagram corresponding to the sample software code comprises the following steps: determining a plurality of groups from each standard structure of the software system corresponding to the sample software code; wherein each group includes a first standard structure and a second standard structure having a static relationship; determining the number of first standard structures in the group and a number identifier for representing the number of the first standard structures, the number of second standard structures and a number identifier for representing the number of the second standard structures respectively; connecting the first standard structure of each group with the corresponding second standard structure through static relation symbols, and marking corresponding number identifiers on the first standard structure and the second standard structure to obtain a standard analysis structure diagram;

Respectively inputting sample software codes and sample analysis models corresponding to various review tasks into an initial natural language model, and outputting a first prediction result; wherein the plurality of review tasks are determined based on a plurality of review rules; the input of the sample software codes corresponding to the plurality of review tasks and the sample analysis model to the initial natural language model comprises: extracting semantic information in the sample analysis model, wherein the semantic information comprises standard structures and static relations among the standard structures, and/or standard functions corresponding to the standard structures and interaction relations among the standard structures; writing the semantic information into a text file to obtain a text file containing the semantic information; inputting a text file containing the semantic information and the sample software code into an initial natural language model;

14. A code review device, comprising:

A first acquisition module, configured to acquire a software code and a corresponding analysis model, wherein the analysis model comprises at least one of a standard analysis structure diagram of a software system corresponding to the software code and a standard analysis sequence diagram of a standard function of a system use case of the software system corresponding to the software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; the standard analysis sequence diagram comprises a text description of standard functions of each standard structure; the static relationship comprises an association relationship or a dependency relationship, and the obtaining of the standard analysis structure diagram corresponding to the software code comprises the following steps: determining a plurality of groups from each standard structure of a software system corresponding to the software code; wherein each group includes a first standard structure and a second standard structure having a static relationship; determining the number of first standard structures in the group and a number identifier for representing the number of the first standard structures, the number of second standard structures and a number identifier for representing the number of the second standard structures respectively; connecting the first standard structure of each group with the corresponding second standard structure through static relation symbols, and marking corresponding number identifiers on the first standard structure and the second standard structure to obtain a standard analysis structure diagram;

The generation module is used for inputting the software codes and the analysis model into a code review model and outputting review information corresponding to the software codes; wherein the review information is used for characterizing matching information of the software code and the analysis model; the code review model is obtained by training an initial natural language model based on sample software codes and corresponding sample analysis models corresponding to various review tasks, wherein the various review tasks are obtained based on various review rules; the inputting the software code and the analysis model into a code review model includes: extracting semantic information in the analysis model, wherein the semantic information comprises standard structures and static relations among the standard structures, and/or standard functions corresponding to the standard structures and interaction relations among the standard structures; writing the semantic information into a text file to obtain a text file containing the semantic information; and inputting the text file containing the semantic information and the software code into a code review model.

15. A training device for a code review model, comprising:

The second acquisition module is used for acquiring the first sample set; the first sample set comprises sample software codes and sample analysis models corresponding to various review tasks, wherein the sample software codes are marked with review information; the sample analysis model comprises at least one of a standard analysis structure diagram of a software system corresponding to the sample software code and a standard analysis sequence diagram of a system use case of the software system corresponding to the sample software code; the standard analysis structure diagram is used for representing a standard structure and a static relation among the standard structures, and the standard analysis sequence diagram is used for representing a standard function corresponding to the standard structure and an interactive relation among the standard structures; the standard analysis sequence diagram comprises a text description of standard functions of each standard structure; wherein the review information is used to characterize matching information of the sample software code and the sample analysis model; the static relationship comprises an association relationship or a dependency relationship, and the obtaining of the standard analysis structure diagram corresponding to the sample software code comprises the following steps: determining a plurality of groups from each standard structure of the software system corresponding to the sample software code; wherein each group includes a first standard structure and a second standard structure having a static relationship; determining the number of first standard structures in the group and a number identifier for representing the number of the first standard structures, the number of second standard structures and a number identifier for representing the number of the second standard structures respectively; connecting the first standard structure of each group with the corresponding second standard structure through static relation symbols, and marking corresponding number identifiers on the first standard structure and the second standard structure to obtain a standard analysis structure diagram;

The input module is used for respectively inputting sample software codes and sample analysis models corresponding to various review tasks into the initial natural language model and outputting a first prediction result; wherein the plurality of review tasks are determined based on a plurality of review rules; the input of the sample software codes corresponding to the plurality of review tasks and the sample analysis model to the initial natural language model comprises: extracting semantic information in the sample analysis model, wherein the semantic information comprises standard structures and static relations among the standard structures, and/or standard functions corresponding to the standard structures and interaction relations among the standard structures; writing the semantic information into a text file to obtain a text file containing the semantic information; inputting a text file containing the semantic information and the sample software code into an initial natural language model;

16. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 12 or the steps of the method of claim 13.

17. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any one of claims 1 to 12 or the steps of the method of claim 13.

18. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, realizes the steps of the method of any one of claims 1 to 12 or the steps of the method of claim 13.