WO2021246742A1

WO2021246742A1 - Artificial-intelligence-based program code evaluation system and method

Info

Publication number: WO2021246742A1
Application number: PCT/KR2021/006763
Authority: WO
Inventors: 한경식; 김우정; 임소영; 이상민; 이환희
Original assignee: 주식회사 코딩로봇연구소
Priority date: 2020-06-05
Filing date: 2021-05-31
Publication date: 2021-12-09
Also published as: KR102235690B1

Abstract

An artificial-intelligence-based program code evaluation method generates a program level estimation model for estimating the level of a program code by using a plurality of execution features related to the execution of the program code and a plurality of quality features related to the quality of the program code, determines a plurality of execution evaluation features and a plurality of quality evaluation features corresponding to the plurality of execution features and the plurality of quality features with respect to an evaluation program code, respectively, and determines the level of the evaluation program code on the basis of the program level estimation model and the plurality of execution evaluation features and the plurality of quality evaluation features of the evaluation program code.

Description

AI-based program code evaluation system and method

The present invention relates to a system and method for evaluating program code based on artificial intelligence (AI), and more particularly, to a system and method for determining the level of computer program code by evaluating the code based on artificial intelligence.

With the advance of the Fourth Industrial Revolution, modern society is rapidly changing from a hardware-oriented society to a software-oriented one. Accordingly, the importance of software education is growing considerably, with software courses being offered from elementary school onward.

For this reason, research on cloud-based systems that provide computer programming education is being actively conducted.

However, since the existing cloud-based computer programming education system evaluates the program code by simply comparing the written program code with a predetermined answer, there is a problem in that it is difficult to properly evaluate the program code written in various ways for each learner.

On the other hand, in some cases a person directly evaluates the program code in order to accurately evaluate program code written in various forms by each learner. However, when a person evaluates the program code directly, the time required for evaluation increases significantly. In addition, it is difficult to maintain consistency of evaluation because the evaluation criteria differ from evaluator to evaluator.

An object of the present invention, devised to solve the above problems, is to provide an artificial intelligence (AI)-based program code evaluation system and method that can evaluate program code from multiple angles and maintain consistency of evaluation by evaluating computer program code based on artificial intelligence.

In order to achieve the above object, an artificial intelligence-based program code evaluation method according to an embodiment of the present invention generates a programming level estimation model for estimating the level of a program code using a plurality of execution features related to the execution of the program code and a plurality of quality features related to the quality of the program code; determines, for an evaluation program code, a plurality of execution evaluation features and a plurality of quality evaluation features respectively corresponding to the plurality of execution features and the plurality of quality features; and determines the level of the evaluation program code based on the programming level estimation model and the plurality of execution evaluation features and the plurality of quality evaluation features of the evaluation program code.

In order to achieve the above object, an artificial intelligence-based program code evaluation system according to an embodiment of the present invention includes a first preprocessor, a second preprocessor, and a first machine learning module. The first preprocessor determines, for each of a plurality of learning program codes written by a plurality of learners for a plurality of program problems, a plurality of execution features related to the execution of the program code and a plurality of quality features related to the quality of the program code. The second preprocessor determines, for an evaluation program code, a plurality of execution evaluation features and a plurality of quality evaluation features respectively corresponding to the plurality of execution features and the plurality of quality features. The first machine learning module generates a programming level estimation model by performing learning to classify the plurality of execution features and the plurality of quality features of each of the plurality of learning program codes into one of first to n-th levels, classifies the plurality of execution evaluation features and the plurality of quality evaluation features of the evaluation program code into one of the first to n-th levels using the programming level estimation model, and determines the classified level as the level of the evaluation program code.

The artificial intelligence (AI)-based program code evaluation system and the artificial intelligence-based program code evaluation method according to embodiments of the present invention evaluate the program code using machine learning, so that the program code can be evaluated from multiple angles, the speed of evaluation can be improved, and the consistency of evaluation can be maintained.

FIG. 1 is a diagram illustrating an artificial intelligence (AI)-based program code evaluation system according to an embodiment of the present invention.

FIG. 2 is a flowchart illustrating an AI-based program code evaluation method according to an embodiment of the present invention.

FIG. 3 is a flowchart illustrating an example of a step of generating the programming level estimation model of FIG. 2.

FIG. 4 is a diagram illustrating an example of a first artificial neural network included in the first machine learning module of FIG. 1.

FIG. 5 is a flowchart illustrating another example of a step of generating the programming level estimation model of FIG. 2.

FIG. 6 is a diagram illustrating an AI-based program code evaluation system according to another embodiment of the present invention.

FIG. 7 is a flowchart illustrating an AI-based program code evaluation method according to another embodiment of the present invention.

FIG. 8 is a flowchart illustrating an example of a step of generating an abnormal code detection model of FIG. 7.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and repeated descriptions of the same components are omitted.

The artificial intelligence-based program code evaluation system 10 shown in FIG. 1 analyzes various characteristics of program code to determine a plurality of features, and generates a program evaluation model by performing machine learning on the plurality of features to learn to evaluate the level of the program code.

Thereafter, the artificial intelligence-based program code evaluation system 10 analyzes the program code to be evaluated, determines the plurality of features, and inputs the plurality of features to the program evaluation model.

The program evaluation model outputs an evaluation result of the program code based on the plurality of input features.

Accordingly, the artificial intelligence-based program code evaluation system 10 according to embodiments of the present invention can evaluate the program code in multiple ways based on various characteristics of the program code.

In addition, since the artificial intelligence-based program code evaluation system 10 evaluates program codes using an evaluation model learned through machine learning, it is possible to maintain the consistency of evaluation.

Referring to FIG. 2, in an artificial intelligence-based program code evaluation method according to an embodiment of the present invention, a programming level estimation model for estimating the level of a program code is generated using a plurality of execution features related to the execution of the program code and a plurality of quality features related to the quality of the program code, both of which are determined by analyzing the program code (step S100).

Meanwhile, the same analysis as that performed on the program code to generate the programming level estimation model is performed on the evaluation program code to be evaluated, so that a plurality of execution evaluation features and a plurality of quality evaluation features respectively corresponding to the plurality of execution features and the plurality of quality features are determined (step S200).

Thereafter, the level of the evaluation program code is determined based on the programming level estimation model and the plurality of execution evaluation features and the plurality of quality evaluation features of the evaluation program code (step S300).

In one embodiment, the AI-based program code evaluation method shown in FIG. 2 may generate the programming level estimation model by additionally using a plurality of readability features related to the readability of the program code (step S100).

For example, the artificial intelligence-based program code evaluation method according to an embodiment of the present invention may generate the programming level estimation model for estimating the level of the program code by comprehensively using the plurality of execution features, the plurality of readability features, and the plurality of quality features of the program code.

In this case, the same analysis as that performed on the program code to generate the programming level estimation model may be performed on the evaluation program code to additionally determine a plurality of readability evaluation features corresponding to the plurality of readability features (step S200).

Thereafter, the level of the evaluation program code may be determined based on the programming level estimation model and the plurality of execution evaluation features, the plurality of readability evaluation features, and the plurality of quality evaluation features of the evaluation program code (step S300).

The AI-based program code evaluation method shown in FIG. 2 may be performed through the AI-based program code evaluation system 10 of FIG. 1.

Hereinafter, with reference to FIGS. 1 and 2, the configuration and operation of the artificial intelligence-based program code evaluation system 10 and the artificial intelligence-based program code evaluation method performed by the system 10 will be described in detail.

Referring to FIG. 1, the artificial intelligence-based program code evaluation system 10 may include a learning data database (TRAINING DATA DB) 100, a first preprocessor (PREPROCESSOR1) 200, a first machine learning module (MACHINE LEARNING MODULE1) 300, and a second preprocessor (PREPROCESSOR2) 400.

The learning data database 100 may store a plurality of learning program codes T_CDs written by a plurality of learners for a plurality of program problems.

For example, at least some of the plurality of program problems may be provided to each of the plurality of learners, and the program codes written by each learner for the provided program problems may be stored in the learning data database 100 as the plurality of learning program codes (T_CDs).

FIG. 3 is a flowchart illustrating an example of the step of generating the programming level estimation model of FIG. 2 (S100).

Referring to FIGS. 1 to 3, the first preprocessor 200 sequentially reads the plurality of learning program codes (T_CDs) stored in the learning data database 100 and analyzes the characteristics of each read learning program code (T_CD) to determine a plurality of execution features (E_Fs) related to the execution of the read learning program code (T_CD), a plurality of readability features (R_Fs) related to its readability, and a plurality of quality features (Q_Fs) related to its quality (step S110).

In an embodiment, the first preprocessor 200 may determine the plurality of execution features E_Fs by analyzing the execution result of the corresponding learning program code T_CD.

In one embodiment, the plurality of execution features (E_Fs) may include a score feature determined based on the number of correct answers among a plurality of output values produced by executing the corresponding learning program code (T_CD) for a plurality of test cases.

For example, the plurality of test cases and a correct answer for each of the plurality of test cases may be predetermined for each of the plurality of program problems.

In an embodiment, the plurality of test cases predetermined for each of the plurality of program problems and the correct answer for each of the plurality of test cases may be stored in the learning data database 100 .

In this case, the first preprocessor 200 may read, from the learning data database 100, the plurality of test cases of the program problem corresponding to the learning program code (T_CD) and the correct answers to those test cases, execute the learning program code (T_CD) for the plurality of test cases, and determine the score feature by counting how many of the output values produced by the learning program code (T_CD) match the correct answers.

Thus, the score feature may represent a measure of the correctness of the corresponding learning program code (T_CD).
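
The counting step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the learner's program is modeled as a Python callable, and the names `score_feature` and `learner_program`, as well as the choice to treat a crashing run as a wrong answer, are assumptions.

```python
from typing import Callable, Sequence


def score_feature(program: Callable[[str], str],
                  test_inputs: Sequence[str],
                  correct_answers: Sequence[str]) -> int:
    """Count how many outputs match the predetermined correct answers."""
    if len(test_inputs) != len(correct_answers):
        raise ValueError("each test case needs exactly one correct answer")
    matches = 0
    for test_input, expected in zip(test_inputs, correct_answers):
        try:
            output = program(test_input)
        except Exception:
            # Assumption: a run that crashes counts as a wrong answer.
            continue
        if output.strip() == expected.strip():
            matches += 1
    return matches


# Hypothetical learner submission: doubles an integer read from its input.
def learner_program(stdin_text: str) -> str:
    return str(int(stdin_text) * 2)


# Two outputs match; the third input makes the program crash.
print(score_feature(learner_program, ["1", "2", "x"], ["2", "4", "0"]))  # 2
```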

In the above, it has been described that the first preprocessor 200 determines the score feature by executing the corresponding learning program code T_CD for the plurality of test cases, but the present invention is not limited thereto.

According to an embodiment, the score feature determined by executing each of the plurality of learning program codes (T_CDs) for the corresponding plurality of test cases may be stored in advance in the learning data database 100 in association with each of the plurality of learning program codes (T_CDs). In this case, the first preprocessor 200 may obtain the score feature of the corresponding learning program code (T_CD) by reading it from the learning data database 100.

In one embodiment, the plurality of execution features (E_Fs) may further include a runtime feature determined based on the time taken for the corresponding learning program code (T_CD) to be executed for the plurality of test cases and to output the plurality of output values.

For example, the first preprocessor 200 may determine, as the runtime feature, the average of the times required for the corresponding learning program code (T_CD) to be executed for each of the plurality of test cases and to output the plurality of output values.

Thus, the runtime feature may represent a measure of the efficiency of the corresponding learning program code (T_CD).
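
A minimal sketch of the averaging described above, assuming the learner's program can be called as a Python function. The name `runtime_feature` and the use of `time.perf_counter` as the clock are assumptions; the source states only that the per-test-case times are averaged.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def runtime_feature(program: Callable[[str], str],
                    test_inputs: Sequence[str]) -> float:
    """Average wall-clock time (in seconds) per test case."""
    if not test_inputs:
        raise ValueError("at least one test case is required")
    durations = []
    for test_input in test_inputs:
        start = time.perf_counter()
        program(test_input)
        durations.append(time.perf_counter() - start)
    return mean(durations)


# Hypothetical usage with a trivial stand-in program.
print(runtime_feature(lambda text: text.upper(), ["a", "b", "c"]))
```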

In one embodiment, the plurality of execution features (E_Fs) may further include a difficulty feature determined based on a statistical value of the score features of the learning program codes (T_CDs) written for the same program problem as the one corresponding to the learning program code (T_CD).

For example, the first preprocessor 200 may calculate the average value of the score features of the learning program codes (T_CDs) written for the same program problem and determine it as the difficulty feature of the learning program codes (T_CDs) corresponding to that program problem.

Accordingly, the difficulty feature may indicate the difficulty of the program problem corresponding to the corresponding learning program code T_CD.
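
The per-problem averaging can be illustrated as below. The function name `difficulty_features` and the `(problem_id, score)` input layout are assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def difficulty_features(
        submissions: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Map each problem to the mean score feature of its submissions."""
    scores_by_problem = defaultdict(list)
    for problem_id, score in submissions:
        scores_by_problem[problem_id].append(score)
    return {pid: mean(scores) for pid, scores in scores_by_problem.items()}


# Hypothetical data: two submissions for problem "p1", one for "p2".
print(difficulty_features([("p1", 4), ("p1", 2), ("p2", 5)]))
```

Every learning program code written for the same problem then receives the same difficulty value.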

In one embodiment, the first preprocessor 200 may determine the plurality of readability features (R_Fs) by analyzing the readability of the text included in the corresponding learning program code (T_CD), regardless of the programming language, such as C, Java, or Python, used in the corresponding learning program code (T_CD).

In one embodiment, the plurality of readability features R_Fs may include a first readability feature determined based on the average number of words per line and the average number of syllables per word of the corresponding learning program code T_CD.

In an embodiment, the plurality of readability features R_Fs may include a second readability feature determined based on a ratio of words with three or more syllables among words included in the corresponding learning program code T_CD.

In one embodiment, the plurality of readability features R_Fs may include a third readability feature determined based on a ratio of words not included in a predetermined standard word list among words included in the corresponding learning program code T_CD.

According to an embodiment, the first preprocessor 200 may also determine the plurality of readability features (R_Fs) by calculating the Flesch Reading Ease, the Flesch-Kincaid Grade Level, the Coleman-Liau Index, the Gunning Fog Index, the SMOG Index, or the like.

Accordingly, the plurality of readability features R_Fs may represent a measure of how easy the corresponding learning program code T_CD is to read.
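
The raw quantities behind the three readability features can be sketched as follows. The source does not say how words or syllables are extracted from code, so the regular expressions, the vowel-group syllable heuristic, and the names `count_syllables` and `readability_features` are all assumptions.

```python
import re
from typing import Dict, Set

WORD_RE = re.compile(r"[A-Za-z]+")
VOWEL_GROUP_RE = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of vowels, at least one."""
    return max(1, len(VOWEL_GROUP_RE.findall(word.lower())))


def readability_features(source: str,
                         standard_words: Set[str]) -> Dict[str, float]:
    lines = [line for line in source.splitlines() if line.strip()]
    words = WORD_RE.findall(source)
    if not lines or not words:
        return {"words_per_line": 0.0, "syllables_per_word": 0.0,
                "polysyllable_ratio": 0.0, "non_standard_ratio": 0.0}
    syllables = [count_syllables(word) for word in words]
    return {
        # Ingredients of the first readability feature.
        "words_per_line": len(words) / len(lines),
        "syllables_per_word": sum(syllables) / len(words),
        # Second feature: share of words with three or more syllables.
        "polysyllable_ratio": sum(s >= 3 for s in syllables) / len(words),
        # Third feature: share of words outside the standard word list.
        "non_standard_ratio": sum(
            word.lower() not in standard_words for word in words) / len(words),
    }


code = "int total = 0;\nreturn total;\n"
print(readability_features(code, {"int", "return"}))
```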

In an embodiment, the first preprocessor 200 may determine the plurality of quality features Q_Fs by analyzing the complexity and the potential for errors of the corresponding learning program code T_CD.

In an embodiment, the plurality of quality features Q_Fs may include a first quality feature determined based on the number of conditional branch points included in functions implemented in the corresponding learning program code T_CD.

Here, the conditional branching points may include statements such as if, while, and for.
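
As a rough illustration, branch points can be counted with a keyword search. This sketch is deliberately naive: it counts over the whole source text rather than per function, and it does not skip keywords that appear inside comments or string literals. The name `count_branch_points` is an assumption.

```python
import re

# Match if / while / for only as whole words.
BRANCH_RE = re.compile(r"\b(?:if|while|for)\b")


def count_branch_points(source: str) -> int:
    """Count occurrences of if, while, and for in the source text."""
    return len(BRANCH_RE.findall(source))


snippet = (
    "for (i = 0; i < n; i++) { if (a[i] > 0) s += a[i]; }\n"
    "while (s > 10) s /= 2;\n"
)
print(count_branch_points(snippet))  # 3
```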

In one embodiment, the plurality of quality features Q_Fs may include a second quality feature determined based on the number of lines included in the corresponding learning program code T_CD.

In one embodiment, the plurality of quality features (Q_Fs) may include a third quality feature determined based on the standard deviation of the number of spaces inserted at specific predetermined positions in the corresponding learning program code (T_CD).

Here, the specific positions may include at least one of the start of each line, the end of each line, immediately after the bracket indicating the start of a function, immediately before the bracket indicating the end of a function, immediately after the bracket indicating the start of a structure, and immediately before the bracket indicating the end of a structure.
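
For one of the listed positions, the start of each line, the standard deviation can be sketched as below. The use of the population standard deviation, the skipping of blank lines, and the name `indentation_spread` are assumptions.

```python
from statistics import pstdev


def indentation_spread(source: str) -> float:
    """Population standard deviation of leading-space counts per line."""
    counts = [len(line) - len(line.lstrip(" "))
              for line in source.splitlines() if line.strip()]
    return pstdev(counts) if counts else 0.0


sample = "a\n    b\n    c\na\n"
print(indentation_spread(sample))  # 2.0
```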

In one embodiment, the plurality of quality features (Q_Fs) may include a fourth quality feature determined based on the number of annotations (comments) included in the corresponding learning program code (T_CD) and the number of characters in each of the annotations.
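
A minimal sketch of such a comment-based feature for C-style comments. The regular expression is an assumption and is naive (it does not skip comment markers inside string literals); how the count and the character lengths are combined into a single feature is not stated in the source, so both are simply returned.

```python
import re
from typing import Dict

# Group 1: body of a // line comment; group 2: body of a /* */ block comment.
COMMENT_RE = re.compile(r"//([^\n]*)|/\*(.*?)\*/", re.DOTALL)


def comment_features(source: str) -> Dict[str, float]:
    """Return the number of comments and their mean body length."""
    bodies = [(line or block).strip()
              for line, block in COMMENT_RE.findall(source)]
    if not bodies:
        return {"count": 0, "mean_length": 0.0}
    return {"count": len(bodies),
            "mean_length": sum(len(b) for b in bodies) / len(bodies)}


c_code = "int x = 1; // counter\n/* sum */\nint y;\n"
print(comment_features(c_code))  # {'count': 2, 'mean_length': 5.0}
```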

In one embodiment, the plurality of quality features (Q_Fs) may include a fifth quality feature determined based on at least one of the number of empty conditional statements, the number of unused variables, duplicated code, and the number of duplicated conditional operators included in the corresponding learning program code (T_CD).

According to an embodiment, the first preprocessor 200 may also determine the plurality of quality features (Q_Fs) by calculating, for the corresponding learning program code (T_CD), cyclomatic complexity, Halstead metrics, and the like, which are commonly used as complexity indices of program code.

According to another embodiment, the first preprocessor 200 may analyze the corresponding learning program code (T_CD) using the OCLint tool, which is commonly used to analyze potential errors in program code, and determine the resulting indicators as the plurality of quality features (Q_Fs). Since the OCLint tool is well known, detailed descriptions of its operation and of the indicators it calculates are omitted.

Referring back to FIGS. 1 and 3, the first preprocessor 200 provides the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) of each of the plurality of learning program codes (T_CDs) to the first machine learning module 300, and the first machine learning module 300 may determine the programming level estimation model by performing learning to classify the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) of each of the plurality of learning program codes (T_CDs) into one of first to n-th (n is a positive integer) levels (LV1 to LVn) (step S140).

In an embodiment, the learning data database 100 may store in advance the programming career period CR of the learner who wrote each of the plurality of learning program codes T_CDs in association with the corresponding learning program code T_CD.

In this case, the first preprocessor 200 provides the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) of each of the plurality of learning program codes (T_CDs) to the first machine learning module 300 together with the corresponding programming career period (CR), and the first machine learning module 300 may determine the programming level estimation model by performing learning to classify, based on the programming career period (CR) associated with each of the plurality of learning program codes (T_CDs), the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) of that learning program code into one of the first to n-th levels (LV1 to LVn).

In an embodiment, the first machine learning module 300 may include a first artificial neural network.

In an embodiment, the first artificial neural network may generate result data encoded by a one-hot encoding method.

In this case, as shown in FIG. 4, the first artificial neural network may include an input layer (INPUT LAYER) including as many input nodes as the total number of the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs), at least one hidden layer (HIDDEN LAYER) including a plurality of nodes, and an output layer (OUTPUT LAYER) including n output nodes.

Each of a plurality of execution features (E_Fs), a plurality of readability features (R_Fs), and a plurality of quality features (Q_Fs) is input to each of the input nodes included in the input layer of the first artificial neural network, The first artificial neural network may indicate one of first to n-th levels LV1 to LVn through values output from the output nodes included in the output layer.

FIG. 4 shows, by way of example, the first artificial neural network including one hidden layer, but the present invention is not limited thereto, and according to an embodiment, the first artificial neural network may include a plurality of hidden layers.
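
The layer structure of FIG. 4 can be sketched as a small feed-forward pass in plain Python. The source specifies only the layer sizes; the tanh hidden activation, the softmax output, the random initial weights, and the names `init_layer`, `dense`, `softmax`, and `forward` are assumptions made for illustration.

```python
import math
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Create a layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(weights: List[List[float]], biases: List[float],
          inputs: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(values: List[float]) -> List[float]:
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features: List[float], hidden: Layer, output: Layer) -> List[float]:
    """One input node per feature, one hidden layer, n output nodes."""
    activations = [math.tanh(v) for v in dense(*hidden, features)]
    return softmax(dense(*output, activations))


rng = random.Random(0)
hidden_layer = init_layer(5, 4, rng)   # 5 features in total, 4 hidden nodes
output_layer = init_layer(4, 3, rng)   # n = 3 levels
print(forward([1.0, 0.5, 0.2, 0.0, 3.0], hidden_layer, output_layer))
```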

The first machine learning module 300 may train the first artificial neural network to classify the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) into one of the first to n-th levels (LV1 to LVn) based on the programming career period (CR) of the learner who wrote the learning program code corresponding to those features.

For example, the programming career periods of the plurality of learners may be divided into first to n-th sections at intervals of one year or more. When the programming career period (CR) of the learner who wrote the corresponding learning program code falls within the i-th section (i is a positive integer less than or equal to n), the first machine learning module 300 may train the first artificial neural network so that, when the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) are provided to the input nodes included in the input layer, the i-th output node included in the output layer outputs 1 and the remaining output nodes output 0. In this way, the first artificial neural network is trained to classify the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) into the i-th level.
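
The mapping from career period to a one-hot training target can be sketched as follows. The section boundaries (1, 3, and 5 years) and the names `career_to_level` and `one_hot_target` are hypothetical; the source requires only that sections span one year or more.

```python
from bisect import bisect_right
from typing import List, Sequence


def career_to_level(career_years: float, boundaries: Sequence[float]) -> int:
    """Return the 1-based index i of the section containing the career period.

    boundaries are the ascending lower bounds of sections 2..n.
    """
    return bisect_right(boundaries, career_years) + 1


def one_hot_target(career_years: float,
                   boundaries: Sequence[float]) -> List[int]:
    """Target vector: the i-th output node is 1, all others are 0."""
    n = len(boundaries) + 1
    level = career_to_level(career_years, boundaries)
    return [1 if i == level else 0 for i in range(1, n + 1)]


# Hypothetical sections: [0, 1), [1, 3), [3, 5), and 5 or more years.
print(one_hot_target(4.0, [1.0, 3.0, 5.0]))  # [0, 0, 1, 0]
```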

After completing the learning using the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) corresponding to the plurality of learning program codes (T_CDs), the first machine learning module 300 may determine the trained first artificial neural network as the programming level estimation model.

FIG. 5 is a flowchart illustrating another example of the step of generating the programming level estimation model of FIG. 2 (S100).

In an embodiment, when the plurality of learners write the plurality of learning program codes (T_CDs), a base code may be provided to the plurality of learners for each of the plurality of program problems, and the plurality of learners may write the plurality of learning program codes (T_CDs) based on the base code.

In this case, the plurality of readability features (R_Fs) and the plurality of quality features (Q_Fs) that the first preprocessor 200 generates for each of the plurality of learning program codes (T_CDs) may include the influence of the base code.

Accordingly, as shown in FIG. 5, the first preprocessor 200 determines the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) for each of the plurality of learning program codes (T_CDs) in the same manner as described above with reference to FIG. 3 (step S110), and may then further perform an operation of correcting the plurality of readability features (R_Fs) and the plurality of quality features (Q_Fs) (steps S120 and S130).

Specifically, the first preprocessor 200 determines a plurality of readability features (R_Fs) and a plurality of quality features (Q_Fs) of the base code provided to the plurality of learners for each of the plurality of program problems (step S120), and may correct the plurality of readability features (R_Fs) and the plurality of quality features (Q_Fs) of each of the plurality of learning program codes (T_CDs) by subtracting from them the plurality of readability features (R_Fs) and the plurality of quality features (Q_Fs) of the base code corresponding to that learning program code (step S130).

Accordingly, the corrected plurality of readability features R_Fs and the plurality of quality features Q_Fs may more clearly reflect the characteristics of the corresponding learning program code T_CD.
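
The subtraction in step S130 amounts to an element-wise difference between the learner's feature values and those of the base code. In this sketch the feature names (`lines`, `branches`) and the function name `correct_features` are hypothetical.

```python
from typing import Dict


def correct_features(learner_features: Dict[str, float],
                     base_features: Dict[str, float]) -> Dict[str, float]:
    """Subtract the base code's feature values from the learner's values."""
    return {name: value - base_features.get(name, 0.0)
            for name, value in learner_features.items()}


# Hypothetical values: the base code already had 25 lines and 2 branches.
print(correct_features({"lines": 40, "branches": 6},
                       {"lines": 25, "branches": 2}))
```

The corrected values then reflect only what the learner added on top of the base code.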

Thereafter, the first preprocessor 200 provides the plurality of execution features (E_Fs), the corrected plurality of readability features (R_Fs), and the corrected plurality of quality features (Q_Fs) of each of the plurality of learning program codes (T_CDs) to the first machine learning module 300, and the first machine learning module 300 may determine the programming level estimation model by performing learning to classify the plurality of execution features (E_Fs), the corrected plurality of readability features (R_Fs), and the corrected plurality of quality features (Q_Fs) of each of the plurality of learning program codes (T_CDs) into one of the first to n-th levels (LV1 to LVn) (step S140).

Referring back to FIGS. 1 and 2, after the artificial intelligence-based program code evaluation system 10 generates the programming level estimation model through the first preprocessor 200 and the first machine learning module 300, the second preprocessor 400 may receive the evaluation program code (E_CD) to be evaluated.

The second preprocessor 400 may analyze the characteristics of the evaluation program code (E_CD) to determine a plurality of execution evaluation features (EE_Fs), a plurality of readability evaluation features (RE_Fs), and a plurality of quality evaluation features (QE_Fs) of the evaluation program code (E_CD) (step S200).

In an embodiment, the second preprocessor 400 performs the same operation as that of the first preprocessor 200 described above with reference to FIG. 3 to obtain a plurality of execution evaluation features for the evaluation program code E_CD. (EE_Fs), a plurality of readability evaluation features (RE_Fs), and a plurality of quality evaluation features (QE_Fs) may be determined.

Accordingly, the plurality of execution evaluation features (EE_Fs), the plurality of readability evaluation features (RE_Fs), and the plurality of quality evaluation features (QE_Fs) for the evaluation program code (E_CD) may correspond respectively to the plurality of execution features (E_Fs), the plurality of readability features (R_Fs), and the plurality of quality features (Q_Fs) for the learning program code (T_CD).

In an embodiment, the second preprocessor 400 may receive from the outside, together with the evaluation program code (E_CD), a plurality of test cases for the program problem corresponding to the evaluation program code (E_CD) and the correct answers to the plurality of test cases.

In this case, the second preprocessor 400 may execute the evaluation program code (E_CD) for the plurality of test cases to determine the score feature and the runtime feature included in the plurality of execution evaluation features (EE_Fs).

In another embodiment, the second preprocessor 400 may receive the score feature and the runtime feature for the evaluation program code E_CD from the outside.

Meanwhile, according to an embodiment, the second preprocessor 400 may further receive the difficulty level DIFF_L of the program problem corresponding to the evaluation program code E_CD.

In this case, the second preprocessor 400 may determine the difficulty level DIFF_L received from the outside as the difficulty feature included in the plurality of execution evaluation features EE_Fs.

The second preprocessor 400 may provide the plurality of execution evaluation features (EE_Fs), the plurality of readability evaluation features (RE_Fs), and the plurality of quality evaluation features (QE_Fs) determined for the evaluation program code (E_CD) to the first machine learning module 300.

The first machine learning module 300 may determine the level of the evaluation program code (E_CD) based on the programming level estimation model and the plurality of execution evaluation features (EE_Fs), the plurality of readability evaluation features (RE_Fs), and the plurality of quality evaluation features (QE_Fs) (step S300).

In one embodiment, the first machine learning module 300 may use the programming level estimation model to classify the plurality of execution evaluation features (EE_Fs), the plurality of readability evaluation features (RE_Fs), and the plurality of quality evaluation features (QE_Fs) of the evaluation program code (E_CD) into one of the first to n-th levels (LV1 to LVn), and determine the classified level as the level of the evaluation program code (E_CD).

For example, when the plurality of execution evaluation features EE_Fs, the plurality of readability evaluation features RE_Fs, and the plurality of quality evaluation features QE_Fs of the evaluation program code E_CD are input to the programming level estimation model, the first machine learning module 300 may determine the level of the evaluation program code E_CD as one of the first to n-th levels LV1 to LVn based on the result output from the programming level estimation model.
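As a non-limiting illustration, step S300 may be sketched as follows. The sketch assumes that the programming level estimation model outputs one score per level and that the level with the highest score is selected; the embodiments do not specify the output format of the model. The names `estimate_level` and `toy_model` are hypothetical.

```python
from typing import Callable, List, Sequence


def estimate_level(
    model: Callable[[Sequence[float]], Sequence[float]],
    ee_fs: Sequence[float],
    re_fs: Sequence[float],
    qe_fs: Sequence[float],
) -> int:
    """Return a level index in 1..n from the model's n per-level scores."""
    # Concatenate the three feature groups into a single model input.
    features: List[float] = [*ee_fs, *re_fs, *qe_fs]
    scores = model(features)
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return best_index + 1  # levels LV1..LVn are 1-based


def toy_model(features: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the trained model, with n = 3 levels."""
    total = sum(features)
    return [1.0 - total, total, total / 2]


level = estimate_level(toy_model, [1.0, 0.2], [0.3], [0.1])
```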

As described above with reference to FIGS. 1 to 5, the artificial intelligence-based program code evaluation system 10 and the artificial intelligence-based program code evaluation method according to embodiments of the present invention analyze the program code in various ways to determine a plurality of execution features E_Fs related to the execution of the program code, a plurality of readability features R_Fs related to the readability of the program code, and a plurality of quality features Q_Fs related to the quality of the program code. Since the level of the program code is determined based on the plurality of execution features E_Fs, the plurality of readability features R_Fs, and the plurality of quality features Q_Fs, the program code can be evaluated from multiple perspectives.

In addition, because the artificial intelligence-based program code evaluation system 10 and the artificial intelligence-based program code evaluation method according to embodiments of the present invention evaluate the program code using the programming level estimation model trained through the first machine learning module 300, the speed of evaluation can be improved while the consistency of evaluation is maintained.

Referring to FIG. 6, the artificial intelligence-based program code evaluation system 20 may include a learning data database 100, a first preprocessor (PREPROCESSOR1) 200, a first machine learning module (MACHINE LEARNING MODULE1) 300, a second preprocessor (PREPROCESSOR2) 400, a third preprocessor (PREPROCESSOR3) 500, a second machine learning module (MACHINE LEARNING MODULE2) 600, and a fourth preprocessor (PREPROCESSOR4) 700.

The artificial intelligence-based program code evaluation system 20 shown in FIG. 6 may further include a third preprocessor 500, a second machine learning module 600, and a fourth preprocessor 700 in addition to the components of the artificial intelligence-based program code evaluation system 10 shown in FIG. 1.

The artificial intelligence-based program code evaluation method shown in FIG. 7 may further include steps S400, S500, and S600 in addition to the steps of the artificial intelligence-based program code evaluation method shown in FIG. 2.

The AI-based program code evaluation method shown in FIG. 7 may be performed through the AI-based program code evaluation system 20 of FIG. 6 .

The configuration and operation of the learning data database 100, the first preprocessor 200, the first machine learning module 300, and the second preprocessor 400 included in the artificial intelligence-based program code evaluation system 20 of FIG. 6 are the same as those of the learning data database 100, the first preprocessor 200, the first machine learning module 300, and the second preprocessor 400 included in the artificial intelligence-based program code evaluation system 10 of FIG. 1.

The configuration and operation (steps S100, S200, and S300) of the learning data database 100, the first preprocessor 200, the first machine learning module 300, and the second preprocessor 400 included in the artificial intelligence-based program code evaluation system 10 have been described above with reference to FIGS. 1 to 5. Accordingly, a redundant description of these components of the artificial intelligence-based program code evaluation system 20 is omitted here, and only the configuration and operation (steps S400, S500, and S600) of the third preprocessor 500, the second machine learning module 600, and the fourth preprocessor 700 included in the artificial intelligence-based program code evaluation system 20 will be described in detail.

In general, even if a program code is perfect in terms of accuracy because it outputs the correct answer for all given test cases, a program code written only to match the correct answers, without considering efficiency and reusability, is not good code.

The AI-based program code evaluation system 20 shown in FIG. 6 and the AI-based program code evaluation method shown in FIG. 7 may determine whether a specific program code is a normal code or an abnormal code based on whether the complexity of the specific program code is significantly higher than the average complexity of program codes written by other learners for the same program problem.

Specifically, referring to FIGS. 6 and 7, the third preprocessor 500 may receive, from the first preprocessor 200, the plurality of execution features E_Fs, the plurality of readability features R_Fs, and the plurality of quality features Q_Fs of each of the plurality of learning program codes T_CDs.

The third preprocessor 500 may extract, as perfect-score learning program codes, the learning program codes T_CDs whose score feature included in the plurality of execution features E_Fs has the maximum value, from among the plurality of learning program codes T_CDs (step S400).
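As a non-limiting illustration, step S400 may be sketched as follows. The sketch assumes that each learning program code is represented as a dictionary with a normalized `score` entry and that the maximum value of the score feature is 1.0. The record layout and the name `extract_perfect_score_codes` are hypothetical.

```python
from typing import Dict, List

# Assumption: the maximum attainable score (all test cases answered correctly).
MAX_SCORE = 1.0


def extract_perfect_score_codes(
    learning_codes: List[Dict], max_score: float = MAX_SCORE
) -> List[Dict]:
    """Keep only learning program codes whose score feature reaches the maximum."""
    return [code for code in learning_codes if code["score"] >= max_score]


learning_codes = [
    {"id": "a", "problem": "p1", "score": 1.0},
    {"id": "b", "problem": "p1", "score": 0.6},
    {"id": "c", "problem": "p2", "score": 1.0},
]
perfect_codes = extract_perfect_score_codes(learning_codes)
```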

Thereafter, the third preprocessor 500 and the second machine learning module 600 may generate an abnormal code detection model for estimating whether a program code is abnormal, using the plurality of execution features E_Fs, the plurality of readability features R_Fs, and the plurality of quality features Q_Fs of each of the perfect-score learning program codes (step S500).

FIG. 8 is a flowchart illustrating an example of step S500 of generating the abnormal code detection model of FIG. 7.

Referring to FIG. 8, the third preprocessor 500 may determine, as a plurality of modified quality features MQ_Fs, the quality features that remain after the quality features related to code complexity are excluded from the plurality of quality features Q_Fs of each of the perfect-score learning program codes (step S510).

For example, the third preprocessor 500 may determine, as the plurality of modified quality features MQ_Fs, the quality features that remain after the quality features corresponding to cyclomatic complexity and Halstead metrics are excluded from the plurality of quality features Q_Fs of each of the perfect-score learning program codes.
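As a non-limiting illustration, step S510 may be sketched as follows. The sketch assumes that the quality features are stored in a dictionary keyed by feature name. The key names in `COMPLEXITY_KEYS` are hypothetical, since the embodiments name only the two metric families.

```python
from typing import Dict, Tuple

# Hypothetical feature names for the complexity-related quality features.
COMPLEXITY_KEYS = frozenset({"cyclomatic_complexity", "halstead_volume"})


def split_quality_features(
    q_fs: Dict[str, float],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (modified quality features MQ_Fs, excluded complexity features)."""
    modified = {k: v for k, v in q_fs.items() if k not in COMPLEXITY_KEYS}
    excluded = {k: v for k, v in q_fs.items() if k in COMPLEXITY_KEYS}
    return modified, excluded


q_fs = {
    "cyclomatic_complexity": 7.0,
    "halstead_volume": 120.0,
    "line_count": 42.0,
    "comment_count": 3.0,
}
mq_fs, excluded = split_quality_features(q_fs)
```

The excluded features are kept rather than discarded because they are needed later to compute the average complexity and the abnormal flag.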

Also, the third preprocessor 500 may determine, as the average complexity of a program problem, the average of the excluded quality features of the perfect-score learning program codes written for that same program problem among the perfect-score learning program codes (step S520).
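As a non-limiting illustration, step S520 may be sketched as follows. The sketch assumes a single excluded complexity value per code and an arithmetic mean per program problem. The pair layout `(problem_id, complexity)` and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def average_complexity_by_problem(
    samples: Iterable[Tuple[str, float]],
) -> Dict[str, float]:
    """Average the excluded complexity feature over codes of the same problem."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for problem_id, complexity in samples:
        grouped[problem_id].append(complexity)
    return {problem_id: mean(values) for problem_id, values in grouped.items()}


averages = average_complexity_by_problem(
    [("p1", 4.0), ("p1", 6.0), ("p2", 10.0)]
)
```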

Thereafter, for each of the perfect-score learning program codes, the third preprocessor 500 may determine the abnormal flag ANF of the corresponding perfect-score learning program code by comparing the value of the excluded quality feature of that code with the average complexity of the program problem corresponding to that code (step S530).

For example, if the value of the excluded quality feature of the corresponding perfect-score learning program code is greater than the average complexity of the program problem corresponding to that code by a certain percentage or more, the third preprocessor 500 may determine the abnormal flag ANF of that code as a first value in order to classify the code as an abnormal code. If the value of the excluded quality feature is not greater than the average complexity of the program problem by the certain percentage or more, the third preprocessor 500 may determine the abnormal flag ANF of that code as a second value in order to classify the code as a normal code.
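As a non-limiting illustration, the determination of the abnormal flag ANF in step S530 may be sketched as follows. The encodings of the first and second values (1 and 0) and the default threshold of 50 percent are assumptions, since the embodiments state only "a certain percentage". The name `abnormal_flag` is hypothetical.

```python
ABNORMAL = 1  # assumed encoding of the first value of ANF
NORMAL = 0    # assumed encoding of the second value of ANF


def abnormal_flag(
    complexity: float, average_complexity: float, threshold_ratio: float = 0.5
) -> int:
    """Flag a code whose complexity exceeds the problem average by the ratio or more."""
    # Assumption: "by a certain percentage or more" is relative to the average.
    if complexity >= average_complexity * (1.0 + threshold_ratio):
        return ABNORMAL
    return NORMAL


flag_high = abnormal_flag(16.0, 10.0)  # 60 percent above the average
flag_low = abnormal_flag(12.0, 10.0)   # 20 percent above the average
```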

Thereafter, the third preprocessor 500 may provide the plurality of execution features E_Fs, the plurality of readability features R_Fs, the plurality of modified quality features MQ_Fs, and the abnormal flag ANF of each of the perfect-score learning program codes to the second machine learning module 600.

In an embodiment, the second machine learning module 600 may include a second artificial neural network.

In this case, the second machine learning module 600 may train the second artificial neural network, based on the abnormal flag ANF of each of the perfect-score learning program codes, to classify the plurality of execution features E_Fs, the plurality of readability features R_Fs, and the plurality of modified quality features MQ_Fs of each of the perfect-score learning program codes into one of a normal code NM and an abnormal code ANM (step S540).

For example, the second machine learning module 600 may train the second artificial neural network so that, when the abnormal flag ANF is the first value, the second artificial neural network classifies the plurality of execution features E_Fs, the plurality of readability features R_Fs, and the plurality of modified quality features MQ_Fs as an abnormal code ANM, and, when the abnormal flag ANF is the second value, the second artificial neural network classifies the plurality of execution features E_Fs, the plurality of readability features R_Fs, and the plurality of modified quality features MQ_Fs as a normal code NM.
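As a non-limiting illustration, the pairing of input features with training targets for step S540 may be sketched as follows. The record layout, the encodings of the first and second values, and the name `build_training_samples` are hypothetical.

```python
from typing import Dict, List, Tuple

FIRST_VALUE, SECOND_VALUE = 1, 0  # assumed encodings of the abnormal flag ANF


def build_training_samples(
    records: List[Dict],
) -> List[Tuple[List[float], str]]:
    """Pair each concatenated feature vector with its target class."""
    samples = []
    for record in records:
        # Input: E_Fs, R_Fs, and MQ_Fs concatenated in a fixed order.
        features = [*record["e_fs"], *record["r_fs"], *record["mq_fs"]]
        # Target: abnormal code ANM for the first value, normal code NM otherwise.
        target = "ANM" if record["anf"] == FIRST_VALUE else "NM"
        samples.append((features, target))
    return samples


records = [
    {"e_fs": [1.0, 0.2], "r_fs": [0.5], "mq_fs": [42.0], "anf": FIRST_VALUE},
    {"e_fs": [1.0, 0.1], "r_fs": [0.7], "mq_fs": [18.0], "anf": SECOND_VALUE},
]
samples = build_training_samples(records)
```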

The second machine learning module 600 may complete the training of the second artificial neural network using the plurality of execution features E_Fs, the plurality of readability features R_Fs, and the plurality of modified quality features MQ_Fs corresponding to the perfect-score learning program codes, and may then determine the trained second artificial neural network as the abnormal code detection model (step S550).
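As a non-limiting illustration, the training in steps S540 and S550 may be sketched as follows. The embodiments do not specify the architecture of the second artificial neural network, so this sketch substitutes the simplest possible stand-in, a single sigmoid unit trained by batch gradient descent, and uses toy data. The names `train_detector` and `is_abnormal`, the learning rate, and the number of epochs are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def train_detector(
    inputs: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 2000,
    learning_rate: float = 0.5,
) -> Tuple[List[float], float]:
    """Fit one sigmoid unit; label 1 means abnormal code, 0 means normal code."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(inputs, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-z)) - y  # cross-entropy gradient
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def is_abnormal(model: Tuple[List[float], float], x: Sequence[float]) -> bool:
    """Classify one feature vector with the trained unit."""
    weights, bias = model
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0.0


# Toy data: the first feature alone decides the label.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 0, 1, 1]
model = train_detector(X, y)
```

An actual embodiment would use a multi-layer network and real feature vectors; the sketch shows only how the abnormal flag serves as the supervised training target.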

Referring again to FIGS. 6 and 7, the fourth preprocessor 700 may receive, from the second preprocessor 400, the plurality of execution evaluation features EE_Fs, the plurality of readability evaluation features RE_Fs, and the plurality of quality evaluation features QE_Fs of the evaluation program code E_CD.

Thereafter, the fourth preprocessor 700 and the second machine learning module 600 may determine whether the evaluation program code E_CD is a normal code NM or an abnormal code ANM based on the abnormal code detection model and the plurality of execution evaluation features EE_Fs, the plurality of readability evaluation features RE_Fs, and the plurality of quality evaluation features QE_Fs of the evaluation program code E_CD (step S600).

In an embodiment, the fourth preprocessor 700 may determine, as a plurality of modified quality evaluation features MQE_Fs, the quality evaluation features that remain after the quality features related to code complexity are excluded from the plurality of quality evaluation features QE_Fs of the evaluation program code E_CD received from the second preprocessor 400.

For example, the fourth preprocessor 700 may determine, as the plurality of modified quality evaluation features MQE_Fs, the quality evaluation features that remain after the quality features corresponding to cyclomatic complexity and Halstead metrics are excluded from the plurality of quality evaluation features QE_Fs of the evaluation program code E_CD.

Thereafter, the fourth preprocessor 700 may provide the plurality of execution evaluation features EE_Fs, the plurality of readability evaluation features RE_Fs, and the plurality of modified quality evaluation features MQE_Fs of the evaluation program code E_CD to the second machine learning module 600.

The second machine learning module 600 may determine whether the evaluation program code E_CD is a normal code NM or an abnormal code ANM based on the abnormal code detection model and the plurality of execution evaluation features EE_Fs, the plurality of readability evaluation features RE_Fs, and the plurality of modified quality evaluation features MQE_Fs of the evaluation program code E_CD.

For example, the second machine learning module 600 may input the plurality of execution evaluation features EE_Fs, the plurality of readability evaluation features RE_Fs, and the plurality of modified quality evaluation features MQE_Fs of the evaluation program code E_CD to the abnormal code detection model, and may determine whether the evaluation program code E_CD is a normal code NM or an abnormal code ANM based on the result output from the abnormal code detection model.
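As a non-limiting illustration, step S600 may be sketched as follows. The sketch assumes that the abnormal code detection model is a callable returning the probability that a code is abnormal and that a threshold of 0.5 separates the two classes; neither is specified in the embodiments. The feature key names and the names `classify_evaluation_code` and `toy_model` are hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical names for the complexity-related quality evaluation features.
COMPLEXITY_KEYS = frozenset({"cyclomatic_complexity", "halstead_volume"})


def classify_evaluation_code(
    detection_model: Callable[[Sequence[float]], float],
    ee_fs: Sequence[float],
    re_fs: Sequence[float],
    qe_fs: Dict[str, float],
) -> str:
    """Return "ANM" (abnormal code) or "NM" (normal code) for one evaluation code."""
    # Drop complexity-related features to obtain MQE_Fs, in a stable key order.
    mqe_fs = [qe_fs[k] for k in sorted(qe_fs) if k not in COMPLEXITY_KEYS]
    model_input = [*ee_fs, *re_fs, *mqe_fs]
    # Assumption: the model outputs a probability that the code is abnormal.
    return "ANM" if detection_model(model_input) >= 0.5 else "NM"


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical stand-in for the trained second artificial neural network."""
    return 0.9 if features[-1] > 100.0 else 0.1


result = classify_evaluation_code(
    toy_model,
    [1.0, 0.3],
    [0.4],
    {"cyclomatic_complexity": 30.0, "line_count": 250.0},
)
```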

As described above with reference to FIGS. 6 and 7, the artificial intelligence-based program code evaluation system 20 and the artificial intelligence-based program code evaluation method according to embodiments of the present invention not only evaluate the program code in various ways based on machine learning to provide the level of the program code, but can also indicate whether a specific program code is a normal code or an abnormal code written only to match the correct answers without considering efficiency and reusability, based on whether the complexity of the specific program code is significantly higher than the average complexity of program codes written by other learners for the same program problem.

Accordingly, the AI-based program code evaluation system 20 and the AI-based program code evaluation method according to embodiments of the present invention can provide more accurate and detailed evaluation results for program codes.

The present invention can be used to improve the speed of evaluation and maintain the consistency of evaluation while evaluating computer program codes from multiple perspectives.

Although the present invention has been described above with reference to preferred embodiments, those of ordinary skill in the art will understand that various modifications and changes can be made to the present invention without departing from the spirit and scope of the present invention described in the claims below.

Claims

1. An artificial intelligence-based program code evaluation method comprising:

generating a programming level estimation model for estimating the level of a program code using a plurality of execution features related to the execution of the program code and a plurality of quality features related to the quality of the program code;

determining a plurality of execution evaluation features and a plurality of quality evaluation features respectively corresponding to the plurality of execution features and the plurality of quality features for an evaluation program code; and

determining the level of the evaluation program code based on the programming level estimation model and the plurality of execution evaluation features and the plurality of quality evaluation features of the evaluation program code.
2. The method of claim 1, wherein generating the programming level estimation model comprises generating the programming level estimation model by using the plurality of execution features and the plurality of quality features of the program code together with a plurality of readability features related to the readability of the program code,

wherein the method further comprises determining a plurality of readability evaluation features corresponding to the plurality of readability features for the evaluation program code, and

wherein determining the level of the evaluation program code comprises determining the level of the evaluation program code based on the programming level estimation model and the plurality of execution evaluation features, the plurality of readability evaluation features, and the plurality of quality evaluation features of the evaluation program code.
3. The method of claim 2, wherein generating the programming level estimation model comprises:

determining the plurality of execution features, the plurality of readability features, and the plurality of quality features for each of a plurality of learning program codes written by a plurality of learners for a plurality of program problems; and

determining the programming level estimation model by training a first machine learning module to classify the plurality of execution features, the plurality of readability features, and the plurality of quality features of each of the plurality of learning program codes into one of first to n-th levels (n is a positive integer).
4. The method of claim 3, wherein determining the programming level estimation model by training the first machine learning module to classify the plurality of execution features, the plurality of readability features, and the plurality of quality features of each of the plurality of learning program codes into one of the first to n-th levels comprises:

training a first artificial neural network included in the first machine learning module to classify the plurality of execution features, the plurality of readability features, and the plurality of quality features into one of the first to n-th levels, based on the programming career period of the learner who wrote the learning program code corresponding to the plurality of execution features, the plurality of readability features, and the plurality of quality features; and

determining the trained first artificial neural network as the programming level estimation model.
5. The method of claim 4, wherein training the first artificial neural network to classify the plurality of execution features, the plurality of readability features, and the plurality of quality features into one of the first to n-th levels, based on the programming career period of the learner who wrote the corresponding learning program code, comprises:

dividing the programming career period into first to n-th sections at intervals of one year or more, and, when the programming career period of the learner who wrote the corresponding learning program code falls within the i-th section (i is a positive integer less than or equal to n), training the first artificial neural network to classify the plurality of execution features, the plurality of readability features, and the plurality of quality features into the i-th level.
6. The method of claim 3, wherein the plurality of execution features of each of the plurality of learning program codes comprises a score feature determined based on the number of correct answers among a plurality of output values obtained by executing the corresponding learning program code on a plurality of test cases,

wherein the plurality of test cases are determined in advance for each of the plurality of program problems.
7. The method of claim 6, wherein the plurality of execution features of each of the plurality of learning program codes further comprises:

a runtime feature determined based on the time taken for the corresponding learning program code to be executed on the plurality of test cases and output the plurality of output values; and

a difficulty feature determined based on statistical values of the score features of learning program codes written for the same program problem as the program problem corresponding to the corresponding learning program code.
8. The method of claim 3, wherein the plurality of readability features of each of the plurality of learning program codes comprises a first readability feature determined based on the average number of words per line and the average number of syllables per word of the corresponding learning program code.
9. The method of claim 3, wherein the plurality of readability features of each of the plurality of learning program codes comprises a second readability feature determined based on the ratio of words with three or more syllables among the words included in the corresponding learning program code.
10. The method of claim 3, wherein the plurality of readability features of each of the plurality of learning program codes comprises a third readability feature determined based on the ratio of words not included in a predetermined standard word list among the words included in the corresponding learning program code.
11. The method of claim 3, wherein the plurality of quality features of each of the plurality of learning program codes comprises a first quality feature determined based on the number of conditional branch points included in functions implemented in the corresponding learning program code.
12. The method of claim 3, wherein the plurality of quality features of each of the plurality of learning program codes comprises a second quality feature determined based on the number of lines included in the corresponding learning program code.
13. The method of claim 3, wherein the plurality of quality features of each of the plurality of learning program codes comprises a third quality feature determined based on the standard deviation of the number of spaces inserted at specific predetermined positions in the corresponding learning program code.
14. The method of claim 13, wherein the specific predetermined positions comprise at least one of the start point of each line, the end point of each line, the position immediately after a parenthesis indicating the start of a function, and the position immediately before a parenthesis indicating the end of a function.
15. The method of claim 3, wherein the plurality of quality features of each of the plurality of learning program codes comprises a fourth quality feature determined based on the number of annotations included in the corresponding learning program code and the number of characters in each of the annotations.
16. The method of claim 3, wherein generating the programming level estimation model further comprises:

determining the plurality of readability features and the plurality of quality features for a base code provided to the plurality of learners for each of the plurality of program problems when the plurality of learners write the plurality of learning program codes; and

correcting the plurality of readability features and the plurality of quality features of each of the plurality of learning program codes by subtracting, from them, the plurality of readability features and the plurality of quality features of the base code corresponding to each of the plurality of learning program codes.
17. The method of claim 6, further comprising:

extracting, as perfect-score learning program codes, learning program codes having the maximum value of the score feature from among the plurality of learning program codes;

generating an abnormal code detection model for estimating whether a program code is abnormal using the plurality of execution features, the plurality of readability features, and the plurality of quality features of each of the perfect-score learning program codes; and

determining whether the evaluation program code is abnormal based on the abnormal code detection model and the plurality of execution evaluation features, the plurality of readability evaluation features, and the plurality of quality evaluation features of the evaluation program code.
18. The method of claim 17, wherein generating the abnormal code detection model for estimating whether a program code is abnormal using the plurality of execution features, the plurality of readability features, and the plurality of quality features of each of the perfect-score learning program codes comprises:

determining, as a plurality of modified quality features, the quality features that remain after a quality feature related to code complexity is excluded from the plurality of quality features of each of the perfect-score learning program codes;

determining, as the average complexity of a program problem, the average of the excluded quality features of the perfect-score learning program codes written for that same program problem among the perfect-score learning program codes;

determining an abnormal flag of the corresponding perfect-score learning program code by comparing the value of the excluded quality feature of the corresponding perfect-score learning program code with the average complexity of the program problem corresponding to the corresponding perfect-score learning program code;

training a second artificial neural network included in a second machine learning module, based on the abnormal flag of each of the perfect-score learning program codes, to classify the plurality of execution features, the plurality of readability features, and the plurality of modified quality features of each of the perfect-score learning program codes into one of a normal code and an abnormal code; and

determining the trained second artificial neural network as the abnormal code detection model.
19. The method of claim 18, wherein determining whether the evaluation program code is abnormal based on the abnormal code detection model and the plurality of execution evaluation features, the plurality of readability evaluation features, and the plurality of quality evaluation features of the evaluation program code comprises:

determining, as a plurality of modified quality evaluation features, the quality evaluation features that remain after a quality feature related to code complexity is excluded from the plurality of quality evaluation features of the evaluation program code;

inputting the plurality of execution evaluation features, the plurality of readability evaluation features, and the plurality of modified quality evaluation features of the evaluation program code into the abnormal code detection model; and

determining whether the evaluation program code is a normal code or an abnormal code based on the result output from the abnormal code detection model.
20. An artificial intelligence-based program code evaluation system comprising:

a first preprocessor configured to determine, for each of a plurality of learning program codes written by a plurality of learners for a plurality of program problems, a plurality of execution features related to the execution of the program code and a plurality of quality features related to the quality of the program code;

a second preprocessor configured to determine a plurality of execution evaluation features and a plurality of quality evaluation features respectively corresponding to the plurality of execution features and the plurality of quality features for an evaluation program code; and

a first machine learning module configured to generate a programming level estimation model by performing learning to classify the plurality of execution features and the plurality of quality features of each of the plurality of learning program codes into one of first to n-th levels, and to classify the plurality of execution evaluation features and the plurality of quality evaluation features of the evaluation program code into one of the first to n-th levels using the programming level estimation model.