CN114138320A

CN114138320A - Code workload statistical method, device and equipment

Info

Publication number: CN114138320A
Application number: CN202111464971.8A
Authority: CN
Inventors: 王勇; 李长鸿; 黄铮
Original assignee: Beijing Ziroom Information Technology Co Ltd
Current assignee: Beijing Ziroom Information Technology Co Ltd
Priority date: 2021-12-03
Filing date: 2021-12-03
Publication date: 2022-03-04

Abstract

The invention discloses a code workload statistical method, apparatus, and device. The method comprises the following steps: acquiring a target data list for a target time period, wherein the target data list comprises commit data submitted by users, and the commit data comprises at least a commit id and a submission time; deleting a first type of duplicate commit data from the target data list, wherein the commit id of the first type of duplicate commit data is not unique in the target data list, and its submission time is not the earliest among the commit data having the same commit id; and counting the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion. The technical scheme provided by the invention avoids counting code workload repeatedly and improves the accuracy of user code workload statistics.

Description

Code workload statistical method, device and equipment

Technical Field

The invention relates to the field of program development, in particular to a code workload statistical method, a device and equipment.

Background

In many companies, code quantity statistics are an important index for measuring the workload of an employee, a team, or a department. Counting the amount of code gives a team a reliable index for evaluating an employee's contribution, so that the employee's performance can be measured better. At present, most companies manage code through GitLab and compute code workload statistics through the GitLab interface: all projects visible to a user are first obtained by calling the GitLab interface, then the branch list of each project is obtained, the branches are traversed, the commit data of the code is obtained by branch name, and finally the code amount of the user corresponding to each commit id is obtained from the commit data, which yields per-person code statistics. However, this approach easily counts the same workload more than once. For example, suppose the code of user A has the code of user B merged into it. Before user A pushes to the online server, user B has already pushed user B's own code, and the unique commit id in its commit information has been counted under user B's name. When user A pushes afterwards, user B's earlier commit information is counted again, and because attribution follows the unique commit id, the repeated code is again counted under user B's name, doubling user B's code workload. How to avoid counting code repeatedly is therefore a problem that urgently needs to be solved.
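The double counting described above can be reproduced with a small sketch. The record layout, branch names, and line counts below are hypothetical illustrations, not data from the source:

```python
from collections import Counter

# Hypothetical commit records as a naive branch-by-branch traversal would see them.
branch_commits = {
    "feature-b": [
        {"commit_id": "c1", "user": "B", "lines": 120},
    ],
    "feature-a": [
        {"commit_id": "c1", "user": "B", "lines": 120},  # B's commit merged into A's branch
        {"commit_id": "c2", "user": "A", "lines": 80},
    ],
}

# Naive statistics: every commit seen on every branch is attributed by commit id.
naive_totals = Counter()
for commits in branch_commits.values():
    for commit in commits:
        naive_totals[commit["user"]] += commit["lines"]

print(naive_totals["B"])  # B's 120 lines are counted on both branches
```

Because commit `c1` appears on both branches, user B is credited twice for the same work, which is the situation the method sets out to prevent.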

Disclosure of Invention

In view of this, embodiments of the present invention provide a method, an apparatus, and a device for counting code workloads, so as to solve the problem that code workloads are repeatedly counted under a user name.

According to a first aspect, the invention provides a code workload statistical method, the method comprising: acquiring a target data list for a target time period, wherein the target data list comprises commit data submitted by users, and the commit data comprises at least a commit id and a submission time; deleting a first type of duplicate commit data from the target data list, wherein the commit id of the first type of duplicate commit data is not unique in the target data list, and its submission time is not the earliest among the commit data having the same commit id; and counting the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion.

Optionally, after deleting the first type of duplicate commit data from the target data list, the method further comprises: acquiring a second data list for a second time period, wherein the second time period precedes the target time period, and the second data list comprises commit data submitted by users; and deleting a second type of duplicate commit data from the target data list, wherein the commit id of the second type of duplicate commit data is present in the second data list.

Optionally, the commit data further includes commit information, and after deleting the second type of duplicate commit data from the target data list, the method further includes: deleting a third type of duplicate commit data from the target data list, wherein the commit information of the third type of duplicate commit data contains a keyword of a merge operation.

Optionally, after deleting the third type of duplicate commit data from the target data list, the method further comprises: judging whether the files operated on by the source code corresponding to each piece of commit data in the target data list include a code type file, wherein code type files comprise header files and implementation files that implement the methods declared in the header files; and deleting the current commit data from the target data list when the files operated on by the source code corresponding to the current commit data do not include a code type file.

Optionally, the method further comprises: when the operation file of the source code corresponding to the current commit data comprises a code type file, judging whether all the operation instructions of the code type file corresponding to the current commit data are deletion operations; and when all the operation instructions of the code type file corresponding to the current commit data are deleting operations, deleting the current commit data from the target data list.

Optionally, counting the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion includes: extracting the corresponding current code based on the current commit data in the target data list; determining, based on the current code, the code workload in the target time period of the user corresponding to the commit id in the current commit data; and traversing all the commit data in the target data list until the code workload statistics of each user in the target time period are complete.

Optionally, extracting the corresponding current code based on the current commit data in the target data list includes: extracting, through a GitLab interface, the effective code corresponding to the current commit data in the target data list, wherein the effective code is the source code that adds to or modifies code type files.

According to a second aspect, the present invention provides a code workload statistics apparatus, the apparatus comprising: an acquisition module configured to acquire a target data list for a target time period, wherein the target data list comprises commit data submitted by users, and the commit data comprises at least a commit id and a submission time; a deduplication module configured to delete a first type of duplicate commit data from the target data list, wherein the commit id of the first type of duplicate commit data is not unique in the target data list, and its submission time is not the earliest among the commit data having the same commit id; and a counting module configured to count the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion.

According to a third aspect, an embodiment of the present invention provides a code workload statistics apparatus, including: a memory and a processor, the memory and the processor being communicatively coupled to each other, the memory having stored therein computer instructions, and the processor performing the method of the first aspect, or any one of the optional embodiments of the first aspect, by executing the computer instructions.

According to a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer instructions for causing a computer to perform the method of the first aspect or any one of the optional implementations of the first aspect.

The technical scheme provided by the application has the following advantages:

According to the technical scheme provided by the application, when the code workload of each user in a target time period is counted, a target data list for the target time period is first acquired; the target data list comprises the commit data submitted by each user in the target time period. Before the code workload is counted, commit data whose commit id is not unique within the target time period and whose submission time is later are deleted, so that only one piece of commit data remains for each commit id in the target time period. The code workload is counted only after that, which prevents duplicate commit data from being counted within the target time period and improves the accuracy of user code workload statistics.

In addition, after the duplicate commit data within the target time period are deleted, a second data list covering a period of time before the target time period is also acquired, and commit data whose commit id appears in both the second data list and the target data list are deleted from the target data list. This covers the case in which some users submitted the same code before the target time period, and further avoids counting code workload repeatedly. Commit data whose commit information contains a keyword of a merge operation are then also deleted from the target data list, so that merge-operation data missed by the earlier deduplication do not remain in the target data list.

In addition, in practice the meaningful code workload consists mainly of additions and modifications to code type files. Besides removing duplicate information, the method therefore also judges whether the source code corresponding to each piece of submitted commit data operates on a code type file, and if not, deletes the corresponding commit data. It then deletes commit data that only perform deletion operations on code type files, calls a GitLab interface to extract from the remaining commit data the source code that adds to or modifies code type files, and finally counts the actual code workload of each user based on that code, which greatly improves the accuracy of code workload statistics.

Drawings

The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the invention in any way, and in which:

FIG. 1 is a diagram illustrating the steps of a code workload statistical method according to an embodiment of the present invention;

FIG. 2 is a flow diagram illustrating a code workload statistics methodology in accordance with an embodiment of the present invention;

FIG. 3 illustrates an exemplary diagram of a user submitting commits data in one embodiment of the invention;

FIG. 4 is a schematic diagram illustrating a code workload statistics apparatus according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a code workload statistics apparatus according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings of the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of the present invention.

Referring to fig. 1 and fig. 2, in an embodiment, a code workload statistical method specifically includes the following steps:

Step S101: acquiring a target data list for the target time period, wherein the target data list comprises the commit data submitted by users, and the commit data comprises at least a commit id and a submission time.

Step S102: deleting the first type of duplicate commit data from the target data list, wherein the commit id of the first type of duplicate commit data is not unique in the target data list, and its submission time is not the earliest among the commit data having the same commit id.

Step S103: counting the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion.

Specifically, the method is implemented on top of GitLab code management. Git hooks (customized script programs) are added to each project. Whenever code is pushed, the Git hooks automatically trigger an interface of a background server, and the key information corresponding to that push (such as the commit id, the commit information, the names of the files added, deleted, and modified, and the amount of code added, deleted, and modified) is recorded on the background server in real time. FIG. 3 is a schematic diagram of commit data submitted by a user. In this embodiment, before the code workload of each user over a period of time is counted, the target time period to be counted is first determined, and the target data list for that period is retrieved from the background server. The commit data with repeated commit ids are then screened out of the target data list; for each repeated commit id, the submission times of its commit data are compared, and the commit data with later submission times are deleted from the target data list. Each commit id that remains in the target data list is therefore represented only by its earliest submission within the target time period. The code workload of the corresponding user is then counted based on the commit ids of all remaining data, which yields accurate code workload statistics for all users and avoids repeated counting.
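Step S102 can be sketched as follows. This is a minimal illustration under assumptions: the field names `commit_id`, `user`, and `submitted_at` and the sample records are hypothetical, and the source does not prescribe any particular data structure.

```python
from datetime import datetime

def drop_later_duplicates(target_list):
    """For each commit id, keep only the record with the earliest submission time."""
    earliest = {}
    for record in target_list:
        kept = earliest.get(record["commit_id"])
        if kept is None or record["submitted_at"] < kept["submitted_at"]:
            earliest[record["commit_id"]] = record
    # Preserve the original order of the surviving records.
    return [r for r in target_list if earliest[r["commit_id"]] is r]

target_list = [
    {"commit_id": "c1", "user": "B", "submitted_at": datetime(2021, 11, 1, 9, 0)},
    {"commit_id": "c1", "user": "B", "submitted_at": datetime(2021, 11, 3, 15, 0)},
    {"commit_id": "c2", "user": "A", "submitted_at": datetime(2021, 11, 3, 15, 0)},
]
deduplicated = drop_later_duplicates(target_list)
```

If two records share both the commit id and the submission time, this sketch keeps the one that appears first in the list; the source does not say how such ties should be handled.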

Specifically, in an embodiment, after the step S102, the following steps are further included:

Step one: acquiring a second data list for a second time period, wherein the second time period precedes the target time period, and the second data list comprises the commit data submitted by users.

Step two: deleting the second type of duplicate commit data from the target data list, wherein the commit id of the second type of duplicate commit data is present in the second data list.

Specifically, in addition to the situation that the code duplicate statistics may occur in the target time period, there is a situation that a user submits a part of the code before the target time period, and the part of the code is merged and submitted by other users in the target time period, which also causes the duplicate statistics of the code workload. Therefore, in this embodiment, the second data list in the second time period before the target time period is obtained, and then whether the commit id of the commit data in the target data list appears in the second data list is checked, and if a commit id appears, it indicates that the commit data corresponding to the commit id and the code have been submitted before the target time period, so that the commit data corresponding to the commit id is deleted from the target data list, the code workload of the user is further ensured not to be repeatedly counted, and the accuracy of the user code workload statistics is improved.
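Steps one and two amount to a set-membership check against the earlier period. The sketch below assumes each record is a dictionary with a hypothetical `commit_id` field; how far back the second time period reaches is left open by the source.

```python
def drop_previously_submitted(target_list, second_list):
    """Remove records whose commit id already appears in the earlier (second) data list."""
    seen_before = {record["commit_id"] for record in second_list}
    return [record for record in target_list if record["commit_id"] not in seen_before]

second_list = [{"commit_id": "c0", "user": "B"}]  # submitted before the target period
target_list = [
    {"commit_id": "c0", "user": "B"},  # same commit, re-pushed via another user's merge
    {"commit_id": "c3", "user": "A"},
]
remaining = drop_previously_submitted(target_list, second_list)
```

Building the set once keeps the check linear in the combined size of the two lists.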

Specifically, in an embodiment, the commit data submitted by the user includes commit information, and after the step two, the method further includes the following steps:

Step three: deleting the third type of duplicate commit data from the target data list, wherein the commit information of the third type of duplicate commit data contains a keyword of a merge operation. Specifically, when a user performs a code merge operation, an operation keyword such as "merge" is recorded in the commit information. To catch data missed by the commit-id-based deduplication, deduplication is performed again based on the merge keyword in the commit information, which further prevents the commit data in the target data list from being counted repeatedly.
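A keyword filter of this kind might look like the sketch below. The `message` field name and the keyword tuple are assumptions; the source only says that a merge-operation keyword appears in the commit information.

```python
MERGE_KEYWORDS = ("merge",)  # assumed keyword list; extend to match local conventions

def is_merge_commit(record, keywords=MERGE_KEYWORDS):
    """Return True when the commit information contains any merge keyword."""
    message = record.get("message", "").lower()
    return any(keyword in message for keyword in keywords)

def drop_merge_commits(target_list):
    """Remove records whose commit information marks them as merge operations."""
    return [record for record in target_list if not is_merge_commit(record)]

target_list = [
    {"commit_id": "c4", "message": "Merge branch 'feature-b' into feature-a"},
    {"commit_id": "c5", "message": "Fix login timeout"},
]
remaining = drop_merge_commits(target_list)
```

Plain substring matching is deliberately simple and would also match unrelated words that contain "merge"; a stricter pattern may be preferable in practice.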

Specifically, in an embodiment, after the third step, the following steps are further included:

Step four: judging whether the files operated on by the source code corresponding to each piece of commit data in the target data list include a code type file, wherein code type files comprise header files and implementation files that implement the methods declared in the header files.

Step five: deleting the current commit data from the target data list when the files operated on by the source code corresponding to the current commit data do not include a code type file.

Specifically, in practice the problem is not limited to duplicate commit data. The source code corresponding to commit data also includes changes that operate on large numbers of picture resources, modify configuration files, or delete entire header files (e.g., .h files) and implementation files (e.g., .m files). Such changes are eventually counted into the change in code amount, but they are essentially meaningless operations and are not suitable as code statistics for measuring true code contribution. The source code that really matters is the source code that adds to or modifies header files and implementation files. To further improve the accuracy of user code workload statistics, after step three the remaining data in the target data list are traversed, and the files operated on by the source code corresponding to each piece of commit data are checked for code type files. If the files operated on by the source code corresponding to a piece of commit data include no code type file, that commit data is regarded as an essentially meaningless operation and is deleted.
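Steps four and five can be sketched as a file-extension check. The `.h` and `.m` suffixes come from the examples in the text; the `files` and `path` field names and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath

CODE_SUFFIXES = {".h", ".m"}  # header and implementation files, per the examples in the text

def is_code_file(path):
    """Return True when the path has a code type file extension."""
    return PurePosixPath(path).suffix.lower() in CODE_SUFFIXES

def drop_non_code_commits(target_list):
    """Keep only records that touch at least one code type file."""
    return [
        record for record in target_list
        if any(is_code_file(f["path"]) for f in record["files"])
    ]

target_list = [
    {"commit_id": "c1", "files": [{"path": "App/Login.m"}, {"path": "Assets/logo.png"}]},
    {"commit_id": "c2", "files": [{"path": "Assets/banner.png"}, {"path": "config/app.plist"}]},
]
kept = drop_non_code_commits(target_list)
```

A project in another language would substitute its own suffix set; the source names only header and implementation files.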

Specifically, in an embodiment, after the step four, the following step is further included:

Step six: when the files operated on by the source code corresponding to the current commit data include a code type file, judging whether all the operation instructions on the code type files corresponding to the current commit data are deletion operations.

Step seven: when all the operation instructions on the code type files corresponding to the current commit data are deletion operations, deleting the current commit data from the target data list.

Specifically, in this embodiment, the operations on code type files that are substantively meaningful are additions and modifications. Therefore, after the files operated on by the source code corresponding to a piece of commit data have been found to include a code type file, it is further judged whether all the operations that the source code performs on code type files are deletion operations. If they all are, the source code essentially performs meaningless operations and makes no substantial contribution to the code work. The commit data corresponding to such source code is therefore deleted from the target data list, which further improves the accuracy of user code workload statistics.
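Steps six and seven can be illustrated as below. The sketch assumes each touched file carries a hypothetical `op` field with the values "add", "modify", or "delete"; the source does not specify how operation instructions are represented.

```python
CODE_SUFFIXES = (".h", ".m")  # code type files, per the examples in the text

def only_deletes_code_files(record):
    """True when the record touches code type files and every such operation is a deletion."""
    code_ops = [
        f["op"] for f in record["files"]
        if f["path"].lower().endswith(CODE_SUFFIXES)
    ]
    return bool(code_ops) and all(op == "delete" for op in code_ops)

def drop_deletion_only_commits(target_list):
    """Remove records whose operations on code type files are all deletions."""
    return [record for record in target_list if not only_deletes_code_files(record)]

target_list = [
    {"commit_id": "c1", "files": [{"path": "Old.h", "op": "delete"},
                                  {"path": "Old.m", "op": "delete"}]},
    {"commit_id": "c2", "files": [{"path": "Old.h", "op": "delete"},
                                  {"path": "New.m", "op": "modify"}]},
]
kept = drop_deletion_only_commits(target_list)
```

A record that touches no code type file at all is left alone here, since that case is handled by the preceding step.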

Specifically, in an embodiment, the step S103 specifically includes the following steps:

Step eight: extracting the corresponding current code based on the current commit data in the target data list.

Step nine: determining, based on the current code, the code workload in the target time period of the user corresponding to the commit id in the current commit data.

Step ten: traversing all the commit data in the target data list until the code workload statistics of each user in the target time period are complete.

Specifically, after the duplicate and invalid commit data have been removed from the target data list, the commit data in the resulting list are traversed and the source code corresponding to each piece of commit data is obtained. The code workload is then counted under the unique user name corresponding to each commit id, which yields accurate user code workload statistics for the target time period.
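The traversal in steps eight to ten reduces to a per-user accumulation. In this sketch the `user` and `effective_lines` fields are hypothetical; it assumes the amount of code for each commit has already been determined.

```python
from collections import Counter

def count_workload(target_list):
    """Sum the code amount of every remaining commit under its user's name."""
    totals = Counter()
    for record in target_list:
        totals[record["user"]] += record["effective_lines"]
    return totals

target_list = [
    {"commit_id": "c1", "user": "B", "effective_lines": 120},
    {"commit_id": "c2", "user": "A", "effective_lines": 80},
    {"commit_id": "c6", "user": "A", "effective_lines": 15},
]
workload = count_workload(target_list)
```

Because each commit id appears only once after deduplication, every commit contributes to exactly one user's total.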

Specifically, in an embodiment, step eight specifically includes the following step:

Step eleven: extracting, through a GitLab interface, the effective code corresponding to the current commit data in the target data list, wherein the effective code is the source code that adds to or modifies code type files. Specifically, the preceding steps remove, to a great extent, duplicate data and work data without practical significance, such as picture operations and deletion operations. However, for the commit data that remain in the target data list, it still cannot be guaranteed that every operated file is a code type file, nor that every operation on a code type file is an addition or modification, so a small amount of data without statistical significance remains. Therefore, a GitLab interface is called to extract from the remaining data the source code that adds to or modifies code type files, and the code workload is counted under the uniquely corresponding user name according to the commit id of the extracted source code, which completes the code workload statistics with high precision and high reliability.
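One way to read "effective code" is the number of added lines in code type files, where a modified line shows up in a diff as a removal plus an addition. The sketch below works on per-file diff entries shaped like those returned by GitLab's commit diff endpoint (`new_path`, `deleted_file`, `diff`); that the method uses this endpoint is an assumption, since the source only refers to "a GitLab interface". No network call is made here, and the sample entries are hypothetical.

```python
CODE_SUFFIXES = (".h", ".m")  # code type files, per the examples in the text

def count_added_lines(diff_text):
    """Count '+' lines inside hunks; header lines before the first '@@' are ignored."""
    added = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("@@"):
            in_hunk = True
        elif in_hunk and line.startswith("+"):
            added += 1
    return added

def effective_lines(diff_entries):
    """Sum added lines over code type files that were not deleted outright."""
    total = 0
    for entry in diff_entries:
        if entry.get("deleted_file"):
            continue
        if not entry["new_path"].lower().endswith(CODE_SUFFIXES):
            continue
        total += count_added_lines(entry["diff"])
    return total

sample_entries = [
    {"new_path": "App/Login.m", "deleted_file": False,
     "diff": "@@ -1,2 +1,3 @@\n context\n-old line\n+new line\n+another line\n"},
    {"new_path": "Assets/logo.png", "deleted_file": False,
     "diff": "@@ -0,0 +1 @@\n+binary"},
    {"new_path": "App/Old.h", "deleted_file": True,
     "diff": "@@ -1,2 +0,0 @@\n-a\n-b\n"},
]
lines = effective_lines(sample_entries)
```

Only the two added lines in `App/Login.m` count: the picture is not a code type file, and the deleted header contributes nothing.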

Through the above steps, when the code workload of each user in the target time period is counted, the technical scheme provided by the application first acquires the target data list for the target time period, which comprises the commit data submitted by each user in that period. Before the code workload is counted, commit data whose commit id is not unique within the target time period and whose submission time is later are deleted, so that only one piece of commit data remains for each commit id in the target time period. The code workload is counted only after that, which prevents duplicate commit data from being counted within the target time period and improves the accuracy of user code workload statistics.

As shown in fig. 4, the present embodiment further provides a code workload statistics apparatus, which includes:

The acquisition module 101 is configured to acquire a target data list for a target time period, where the target data list includes commit data submitted by users, and the commit data includes at least a commit id and a submission time. For details, refer to the description of step S101 in the above method embodiment, which is not repeated here.

The deduplication module 102 is configured to delete a first type of duplicate commit data from the target data list, where the commit id of the first type of duplicate commit data is not unique in the target data list, and its submission time is not the earliest among the commit data having the same commit id. For details, refer to the description of step S102 in the above method embodiment, which is not repeated here.

The counting module 103 is configured to count the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion. For details, refer to the description of step S103 in the above method embodiment, which is not repeated here.

The code workload statistical apparatus provided in the embodiment of the present invention is configured to execute the code workload statistical method provided in the above embodiment, and the implementation manner and the principle thereof are the same, and details are referred to in the related description of the above method embodiment and are not described again.

Through the cooperation of the above components, when the code workload of each user in the target time period is counted, a target data list for the target time period is first acquired, which comprises the commit data submitted by each user in that period. Before the code workload is counted, commit data whose commit id is not unique within the target time period and whose submission time is later are deleted, so that only one piece of commit data remains for each commit id in the target time period. The code workload is counted only after that, which prevents duplicate commit data from being counted within the target time period and improves the accuracy of user code workload statistics.

Fig. 5 shows a code workload statistics apparatus according to an embodiment of the present invention, where the apparatus includes a processor 901 and a memory 902, which may be connected by a bus or by other means, and fig. 5 illustrates the connection by the bus as an example.

Processor 901 may be a Central Processing Unit (CPU). The Processor 901 may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, or combinations thereof.

The memory 902, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the methods in the above-described method embodiments. The processor 901 executes various functional applications and data processing of the processor by executing non-transitory software programs, instructions and modules stored in the memory 902, that is, implements the methods in the above-described method embodiments.

The memory 902 may include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created by the processor 901, and the like. Further, the memory 902 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 902 may optionally include memory located remotely from the processor 901, which may be connected to the processor 901 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

One or more modules are stored in the memory 902 and, when executed by the processor 901, perform the methods in the above-described method embodiments.

The specific details of the code workload statistics apparatus may be understood by referring to the corresponding related descriptions and effects in the foregoing method embodiments, and are not described herein again.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, and the implemented program can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.

Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims

1. A code workload statistical method, the method comprising:

acquiring a target data list in a target time period, wherein the target data list comprises commit data submitted by a user, and the commit data at least comprises a commit id and a submission time;

deleting a first type of duplicate commit data from the target data list, wherein the commit id of the first type of duplicate commit data is not unique in the target data list, and its submission time is not the earliest among the commit data having the same commit id;

and counting the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion.

2. The method of claim 1, wherein after said deleting the first type of duplicate commit data from the target data list, the method further comprises:

acquiring a second data list for a second time period, wherein the second time period precedes the target time period, and the second data list comprises commit data submitted by users;

deleting a second type of duplicate commit data from the target data list, wherein the commit id of the second type of duplicate commit data is present in the second data list.

3. The method of claim 2, wherein the commit data further includes commit information, and wherein after said deleting the second type of duplicate commit data from the target data list, the method further comprises:

deleting a third type of duplicate commit data from the target data list, wherein the commit information of the third type of duplicate commit data contains a keyword of a merge operation.

4. The method of claim 3, wherein after said deleting the third type of duplicate commit data from the target data list, the method further comprises:

judging whether the files operated on by the source code corresponding to each piece of commit data in the target data list include a code type file, wherein code type files comprise header files and implementation files that implement the methods declared in the header files;

and deleting the current commit data from the target data list when the files operated on by the source code corresponding to the current commit data do not include a code type file.

5. The method of claim 4, further comprising:

when the operation file of the source code corresponding to the current commit data comprises a code type file, judging whether all the operation instructions of the code type file corresponding to the current commit data are deletion operations;

and when all the operation instructions of the code type file corresponding to the current commit data are deleting operations, deleting the current commit data from the target data list.

6. The method of claim 5, wherein said counting the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion comprises:

extracting the corresponding current code based on the current commit data in the target data list;

determining, based on the current code, the code workload in the target time period of the user corresponding to the commit id in the current commit data;

and traversing all the commit data in the target data list until the code workload statistics of each user in the target time period are complete.

7. The method of claim 6, wherein extracting the corresponding current code based on current commit data in the target data list comprises:

and extracting effective codes corresponding to the current commit data in the target data list through a GitLab interface, wherein the effective codes are source codes for adding and modifying the code type file.

8. A code workload statistics apparatus, characterized in that the apparatus comprises:

an acquisition module configured to acquire a target data list for a target time period, wherein the target data list comprises commit data submitted by users, and the commit data comprises at least a commit id and a submission time;

a deduplication module configured to delete a first type of duplicate commit data from the target data list, wherein the commit id of the first type of duplicate commit data is not unique in the target data list, and its submission time is not the earliest among the commit data having the same commit id;

and a counting module configured to count the code workload of each user in the target time period based on the commit data remaining in the target data list after deletion.

9. A code workload statistics apparatus, comprising:

a memory and a processor communicatively coupled to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method of any of claims 1-7.

10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-7.