US20230168883A1 - Machine learning-based prediction of completion time of software code changes - Google Patents
- Publication number
- US20230168883A1 (application number US 17/536,654)
- Authority
- US
- United States
- Prior art keywords
- changes
- software code
- completion time
- events
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/71—Version control; Configuration management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Definitions
- the field relates generally to information processing systems and more particularly, to the processing of software code changes in such information processing systems.
- GitHub provides a software development platform that enables communication and collaboration among software developers.
- the software development platform provided by GitHub allows software developers to create new versions of software without disrupting a current version.
- Software development tasks often require coordination among a number of engineering teams that work on different portions of a larger software development project.
- a method comprises obtaining one or more events related to one or more changes to software code; applying the one or more events to a machine learning prediction model that predicts a completion time of the one or more changes to the software code, wherein the machine learning prediction model is trained using (i) a plurality of events for a plurality of historical software code changes and (ii) a completion time for each historical software code change; and performing one or more automated remedial actions based at least in part on the predicted completion time.
- the one or more automated remedial actions comprise one or more of generating at least one notification responsive to the predicted completion time, and adjusting an allocation of resources assigned to the completion of at least one of the one or more changes to the software code.
- the one or more changes to the software code can be monitored and a new event related to the one or more changes to the software code can be applied to the machine learning prediction model to obtain an updated predicted completion time for the one or more changes to the software code.
- a predicted completion time can be obtained for each of a plurality of sets of changes to the software code within a project and the predicted completion time can be aggregated for each set of changes to obtain a predicted completion time for the project.
- a project may comprise a plurality of sets of changes to the software code and a predicted completion time for each set of changes within the project can be obtained and one or more sets of changes can be identified having a corresponding predicted completion time that occurs after a specified completion time for the project.
- illustrative embodiments include, without limitation, apparatus, systems, methods and computer program products comprising processor-readable storage media.
- FIG. 1 illustrates an information processing system configured for machine learning-based prediction of completion time of software code changes in accordance with an illustrative embodiment
- FIGS. 2A and 2B illustrate, respectively, a training phase and a prediction phase of the machine learning prediction model of FIG. 1 in accordance with illustrative embodiments
- FIG. 3 illustrates a number of exemplary software development events that may be processed by the machine learning model of FIG. 1 in accordance with an illustrative embodiment
- FIG. 4 illustrates an exemplary product development tool dashboard in accordance with an illustrative embodiment
- FIG. 5 is a flow diagram illustrating an exemplary process for monitoring a request to review software code changes and for generating a predicted completion time for the monitored software code changes in accordance with an illustrative embodiment
- FIG. 6 is a flow diagram illustrating an exemplary implementation of a machine learning-based process for predicting completion times of software code changes in accordance with an illustrative embodiment
- FIG. 7 illustrates exemplary pseudo code for a training process for the machine learning prediction model of FIG. 1 in accordance with an illustrative embodiment
- FIG. 8 illustrates exemplary pseudo code for a data engineering process that preprocesses the data for the machine learning prediction model of FIG. 1 in accordance with an illustrative embodiment
- FIG. 9 shows an exemplary implementation of an exemplary long short-term memory (LSTM) network to predict software code change completion times in an illustrative embodiment
- FIG. 10 illustrates an exemplary processing platform that may be used to implement at least a portion of one or more embodiments of the disclosure comprising a cloud infrastructure
- FIG. 11 illustrates another exemplary processing platform that may be used to implement at least a portion of one or more embodiments of the disclosure.
- FIG. 1 shows a computer network (also referred to herein as an information processing system) 100 configured in accordance with an illustrative embodiment.
- the computer network 100 comprises a plurality of user devices 102-1 through 102-L, collectively referred to herein as user devices 102.
- the user devices 102 are coupled to a network 104 , where the network 104 in this embodiment is assumed to represent a sub-network or other related portion of the larger computer network 100 . Accordingly, elements 100 and 104 are both referred to herein as examples of “networks,” but the latter is assumed to be a component of the former in the context of the FIG. 1 embodiment.
- Also coupled to network 104 are one or more software development servers 110 and one or more project management servers 120 .
- the user devices 102 may comprise, for example, servers and/or portions of one or more server systems, as well as devices such as mobile telephones, laptop computers, tablet computers, desktop computers or other types of computing devices. Such devices are examples of what are more generally referred to herein as “processing devices.” Some of these processing devices are also generally referred to herein as “computers.”
- the user devices 102 in some embodiments comprise respective computers associated with a particular company, organization or other enterprise.
- at least portions of the computer network 100 may also be referred to herein as collectively comprising an “enterprise network.” Numerous other operating scenarios involving a wide variety of different types and arrangements of processing devices and networks are possible, as will be appreciated by those skilled in the art.
- Also associated with the user devices 102 are one or more input-output devices, which illustratively comprise keyboards, displays or other types of input-output devices in any combination.
- Such input-output devices can be used, for example, to support one or more user interfaces to the user devices 102 , as well as to support communication between the software development servers 110 , the project management servers 120 , and/or other related systems and devices not explicitly shown.
- the network 104 is assumed to comprise a portion of a global computer network such as the Internet, although other types of networks can be part of the computer network 100 , including a wide area network (WAN), a local area network (LAN), a satellite network, a telephone or cable network, a cellular network, a wireless network such as a Wi-Fi or WiMAX network, or various portions or combinations of these and other types of networks.
- the computer network 100 in some embodiments therefore comprises combinations of multiple different types of networks, each comprising processing devices configured to communicate using internet protocol (IP) or other related communication protocols.
- the software development servers 110 can have at least one associated database (not explicitly shown in FIG. 1 ) configured to store data pertaining to, for example, software code under development, events related to software code changes, reviewer information and/or reviewer comments.
- Each of the project management servers 120 can also have at least one associated database (not explicitly shown in FIG. 1 ) configured to store predicted and specified completion time data pertaining to, for example, software code changes being monitored by the project management servers 120 .
- the databases associated with the software development servers 110 and/or the project management servers 120 can be implemented using one or more corresponding storage systems.
- Such storage systems can comprise any of a variety of different types of storage including network-attached storage (NAS), storage area networks (SANs), direct-attached storage (DAS) and distributed DAS, as well as combinations of these and other storage types, including software-defined storage.
- each such processing device generally comprises at least one processor and an associated memory, and implements one or more functional modules for controlling certain features of the software development servers 110 and/or the project management servers 120 .
- the software development servers 110 and the project management servers 120 in this embodiment can each comprise a processor coupled to a memory and a network interface.
- the processor illustratively comprises a microprocessor, a microcontroller, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other type of processing circuitry, as well as portions or combinations of such circuitry elements.
- the memory illustratively comprises random access memory (RAM), read-only memory (ROM) or other types of memory, in any combination.
- the memory and other memories disclosed herein may be viewed as examples of what are more generally referred to as “processor-readable storage media” storing executable computer program code or other types of software programs.
- One or more embodiments include articles of manufacture, such as computer-readable storage media.
- articles of manufacture include, without limitation, a storage device such as a storage disk, a storage array or an integrated circuit containing memory, as well as a wide variety of other types of computer program products.
- the term “article of manufacture” as used herein should be understood to exclude transitory, propagating signals.
- the network interfaces allow for communication between the software development servers 110 , the project management servers 120 , and/or the user devices 102 over the network 104 , and each illustratively comprises one or more conventional transceivers.
- the software development servers 110 may be implemented, at least in part, using the GitHub software development platform.
- the software development servers 110 may comprise a software code repository 112 , a software code change processing module 114 , and an event messaging module 116 .
- the software code repository 112 comprises multiple versions of software, such as a current software version and one or more versions undergoing software development.
- the software code change processing module 114 may process changes to the software code, for example, using at least portions of the GitHub software development tool.
- the event messaging module 116 generates events related to the software code changes (as described in more detail, for example, in conjunction with FIG. 3), for example, using a messaging layer of a sequential message queue, such as a Kafka messaging layer, or a messaging layer of another enterprise service bus.
- the messages may also be stored in a database, such as a NoSQL database (e.g., a MongoDB).
- the project management servers 120 may be implemented, at least in part, using the Jira™ product development tool that allows a project manager to monitor the progress of software development tasks.
- Each of the project management servers 120 may include an event processing module 122 , a machine learning prediction model 124 , and a dashboard update module 126 .
- the event processing module 122 obtains and processes events corresponding to changes to software code that are generated by the event messaging module 116 of the software development server 110.
- the event processing module 122 may transform the events into formats that are digestible by the machine learning prediction model 124 , for example.
- the machine learning prediction model 124 generates a predicted completion time for one or more changes to software code.
- the generated predicted completion times for the one or more changes to software code may be presented to one or more users, for example, using the dashboard update module 126 (as described in more detail in conjunction with, for example, FIG. 4 ).
- the machine learning prediction model 124 is trained using (i) a plurality of events for a plurality of historical software code changes and (ii) a completion time (e.g., as a label) for each historical software code change.
- the particular arrangement of elements 112 - 116 illustrated in the software development server(s) 110 , and the particular arrangement of elements 122 - 126 in the project management server(s) 120 of the FIG. 1 embodiment is presented by way of example only, and alternative arrangements can be used in other embodiments.
- the functionality associated with the elements 112 - 116 and/or elements 122 - 126 in other embodiments can be combined into a single element, or separated across a larger number of elements.
- multiple distinct processors can be used to implement different ones of the elements 112 - 116 and/or elements 122 - 126 or portions thereof.
- At least portions of elements 112 - 116 and/or elements 122 - 126 may be implemented at least in part in the form of software that is stored in memory and executed by a processor.
- the particular set of elements shown in FIG. 1 for machine learning-based prediction of completion time of software code changes is presented by way of illustrative example only, and in other embodiments additional or alternative elements may be used.
- another embodiment includes additional or alternative systems, devices and other network entities, as well as different arrangements of modules and other components.
- one or more of the software development servers 110 and at least one associated database can be on and/or part of the same processing platform.
- An exemplary process utilizing elements 122-126 of an example project management server 120 in computer network 100 will be described in more detail with reference to, for example, FIGS. 4 through 6.
- FIG. 2 A illustrates an exemplary model training phase 210 of the machine learning prediction model 124 of FIG. 1 according to one or more embodiments.
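- during the model training phase 210, software development events 220 (e.g., as discussed further below in conjunction with FIG. 3) for historical changes to software code (e.g., from the software code repository 112), together with actual completion time labels 230 for the historical changes to the software code, are used to train the machine learning prediction model, yielding the trained machine learning prediction model 224.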
- the trained machine learning prediction model 224 has learned the lifetime of such software code changes based on the historical data and can predict the expected completion time with respect to such software code changes in order to provide project management teams with timely technical updates.
- the trained machine learning prediction model 224 may be implemented, for example, as an LSTM network.
- the training may employ supervised learning techniques with outcome labels (e.g., observed completion time) with respect to the historical data.
- the machine learning prediction model 224 can process single data points (e.g., an event), as well as sequences of data (e.g., events).
- FIG. 2 B illustrates an exemplary completion time prediction phase 250 of the machine learning prediction model 224 according to at least one embodiment.
- software development events 260 (e.g., as discussed further below in conjunction with FIG. 3) for current changes to the software code are applied to the machine learning prediction model 224, during the completion time prediction phase 250, in order to obtain the predicted completion time 270 of the current software code changes.
- a feedback path 280 is provided for a retraining of the machine learning prediction model 224 . In this manner, as the scope of the sample data increases, the machine learning prediction model 224 learns and becomes more accurate.
- Code branching provides a mechanism for working on different versions of a software code repository at one time. Branches can be used to experiment and to make changes before committing the changes to the main branch.
- a pull request (also referred to herein as a request to review software code changes) allows developers or other members of a development team to announce potential changes to software code that have been pushed to a branch in a repository on GitHub. Once such a pull request is opened, the potential changes can be evaluated, discussed and reviewed among collaborators before the potential changes are merged into a main code branch.
- a given pull request can show differences between two code branches, such as the main code branch and the proposed changes.
- In the GitHub software development platform, saved changes are referred to as commits. Each commit typically has an associated commit message, describing why a particular change was made. A merge request makes the approved software code changes available to the main branch.
- the project management servers 120 may be implemented, at least in part, using the Jira product development tool that allows a project manager to monitor the progress of software development tasks.
- a user story may comprise an explanation of a software feature written from the perspective of the end user.
- the purpose of a Jira story is to articulate how a software feature will provide value to the customer.
- a “sprint” is a time period in a development cycle where an engineering team completes work, in a known manner.
- One or more aspects of the disclosure recognize that there may be one or more pull requests to be reviewed or merged that are related to a Jira story and that experience unexpected delays. Such delays may require the program managers to adjust the timeline of the delayed pull requests themselves and/or the timeline of other implicated pull requests or tasks.
- machine learning-based techniques are provided for predicting completion times of software code changes that allow a project management team to dynamically track (and predict) the progress of pull requests and/or merge requests associated with outstanding user stories in a sprint, for example, using comments and/or activities in such pull/merge requests.
- FIG. 3 illustrates a number of exemplary software development events 300 that may be processed by the machine learning model of FIG. 1 , according to one or more embodiments of the disclosure.
- the exemplary software development events 300 comprise a create branch event, a code change event, a commit change event, a pull request event, a reviewer comment event (or a sequence of reviewer comments), a merge request event, a continuous integration/continuous deployment event and/or a code analysis tool event.
- the events 300 may comprise webhook events and may be obtained from a messaging layer.
- the software development events 300 that are applied to the machine learning prediction model 124 in the completion time prediction phase 250 comprise events occurring between an opening of a pull request related to a review of one or more software code changes and a completion of the pull request (e.g., when the one or more changes to the software code associated with the pull request are closed and merged into the main branch).
- the exemplary software development events 300 may be categorized into a first set of events from the GitHub software development platform and a second set of events from a code review platform, such as a continuous integration/continuous deployment (CICD) tool (e.g., the Jenkins open source automation server to modify and ensure the integrity of the code pipeline) and/or a code analysis tool (e.g., a code checker that performs code analysis to review source code lines prior to entering the production phase of a development project).
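- As a minimal sketch of how the event types and the two event sets described above might be represented, the following Python fragment defines an enumeration of the listed event types and a one-hot encoding. The names `EventType`, `DevEvent`, `event_source` and `one_hot` are hypothetical and do not appear in the disclosure; the encoding is one possible input format, not the format used by the described model.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class EventType(Enum):
    """Event types listed in the text (identifier names are illustrative)."""
    CREATE_BRANCH = "create_branch"
    CODE_CHANGE = "code_change"
    COMMIT_CHANGE = "commit_change"
    PULL_REQUEST = "pull_request"
    REVIEWER_COMMENT = "reviewer_comment"
    MERGE_REQUEST = "merge_request"
    CICD = "cicd"
    CODE_ANALYSIS = "code_analysis"


# Assumption: CI/CD and code analysis events form the second set of events;
# every other type is treated as coming from the software development platform.
CODE_REVIEW_PLATFORM_EVENTS = {EventType.CICD, EventType.CODE_ANALYSIS}


@dataclass(frozen=True)
class DevEvent:
    pull_request_id: str
    event_type: EventType
    timestamp: datetime


def event_source(event: DevEvent) -> str:
    """Return which of the two event sets a given event belongs to."""
    if event.event_type in CODE_REVIEW_PLATFORM_EVENTS:
        return "code_review_platform"
    return "software_development_platform"


def one_hot(event_type: EventType) -> list[int]:
    """Encode an event type as a one-hot vector in enum definition order."""
    return [1 if member is event_type else 0 for member in EventType]
```

A one-hot vector is only one way to make categorical event types digestible by a model; an integer index or a learned embedding would serve the same purpose.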
- FIG. 4 illustrates an exemplary product development tool dashboard 400 for a Jira story related to one or more software code changes, according to some embodiments of the disclosure.
- the dashboard 400 comprises an identifier for each pull request being monitored, as well as a corresponding predicted completion time flag (e.g., “on time” or “delayed”) and any automated actions to be performed based on the current predicted completion time of any pull request.
- pull request 3 has a predicted completion time flag of “delayed,” since the predicted completion time occurs after the defined sprint closure date.
- an automated action may comprise sending a notification to the appropriate sprint team software indicating that pull request 3 needs additional attention (or resources) and/or recommending an allocation of additional resources to a completion of pull request 3 .
- FIG. 5 is a flow diagram illustrating an exemplary process 500 for monitoring a pull request related to software code changes and for generating a predicted completion time for the monitored pull request, according to at least one embodiment.
- the process 500 initially monitors a given pull request in step 510 .
- A test is performed in step 520 to determine if a new event is received for the monitored pull request. If it is determined in step 520 that a new event is not received, then program control returns to step 510 to continue monitoring the pull request.
- Once it is determined in step 520 that a new event is received, the new event is applied in step 530 to the machine learning prediction model 124 and a new prediction of the completion time of the pull request is obtained in step 540.
- the disclosed machine learning-based prediction techniques predict the lifetime of outstanding software code changes (e.g., associated with pull/merge requests) based on the GitHub events. In this manner, as events happen on the monitored pull request, the estimated completion time can be dynamically updated.
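- The control flow of process 500 can be sketched roughly as follows. This is an illustration under stated assumptions: a standard-library `queue.Queue` stands in for the messaging layer, `predict` is a hypothetical callable standing in for the machine learning prediction model 124, and a `None` sentinel (also an assumption) signals that the pull request has closed.

```python
import queue
from typing import Callable, Optional


def monitor_pull_request(
    events: "queue.Queue[Optional[dict]]",
    predict: Callable[[list], float],
    on_update: Callable[[float], None],
) -> None:
    """Re-predict the completion time each time a new event arrives."""
    history: list = []
    while True:
        try:
            # Step 520: test whether a new event has been received.
            event = events.get(timeout=0.1)
        except queue.Empty:
            # No new event: return to step 510 and keep monitoring.
            continue
        if event is None:
            # Hypothetical sentinel: the pull request was closed or merged.
            break
        # Step 530: apply the new event (with prior events) to the model.
        history.append(event)
        # Step 540: obtain and publish the updated prediction.
        on_update(predict(history))
```

Passing the full event history to `predict` reflects the statement that the model can process sequences of events; a stateful model could instead consume only the newest event.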
- FIG. 6 is a flow diagram illustrating an exemplary implementation of a machine learning-based process 600 for predicting completion times of software code changes, according to at least some embodiments.
- the machine learning-based process 600 initially obtains one or more events in step 610 related to one or more changes to software code.
- the one or more events are applied to a machine learning prediction model that predicts a completion time of the one or more changes to the software code.
- the machine learning prediction model can be trained, for example, using (i) a plurality of events for a plurality of historical software code changes and (ii) a completion time for each historical software code change.
- in step 630, one or more automated remedial actions are performed based at least in part on the predicted completion time.
- the one or more events may comprise events occurring between an opening of a request to review the one or more changes to the software code and a completion of the one or more changes to the software code.
- the one or more events may be obtained, for example, from a messaging layer, such as a Kafka messaging layer.
- the one or more events may comprise, for example, a create branch event, a code change event, a commit change event, a pull request event, a reviewer comment event, a merge request event, a continuous integration/continuous deployment event and/or a code analysis tool event.
- the automated remedial actions performed in step 630 may comprise generating at least one notification responsive to the predicted completion time, and/or adjusting an allocation of resources assigned to the completion of the one or more changes to the software code.
- the one or more changes to the software code are monitored and a new event related to the one or more changes to the software code can be applied to the machine learning prediction model to obtain an updated predicted completion time for the one or more changes to the software code.
- a predicted completion time can be obtained for each of a plurality of sets of changes to the software code (e.g., defined tasks with respect to the software code, such as separate pull requests) within a project (e.g., a Jira story) and the predicted completion time for each set of changes can be aggregated to obtain a predicted completion time for the project.
- when a project comprises a plurality of sets of changes to the software code, a predicted completion time can be obtained for each set of changes within the project, and one or more sets of changes can be identified having a corresponding predicted completion time that occurs after a specified completion time for the project.
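- The project-level aggregation and the identification of late sets of changes might look like the sketch below. The text does not say how per-set predictions are aggregated; taking the latest predicted completion time is an assumption made here (a project is treated as complete when its last set of changes completes). The function names are hypothetical.

```python
from datetime import datetime


def project_completion(predictions: dict[str, datetime]) -> datetime:
    """Aggregate per-set predictions into a project-level prediction.

    Assumption: the project completes when its last set of changes does,
    so the aggregate is the maximum of the individual predictions.
    """
    return max(predictions.values())


def late_change_sets(predictions: dict[str, datetime], deadline: datetime) -> list[str]:
    """Return identifiers of sets predicted to finish after the deadline."""
    return sorted(pr_id for pr_id, when in predictions.items() if when > deadline)
```

For example, with predictions for three pull requests and a sprint closure date as the deadline, `late_change_sets` returns the pull requests that would be flagged as delayed.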
- FIG. 7 illustrates exemplary pseudo code for a training process 700 for the machine learning prediction model 124 of FIG. 1 in accordance with an illustrative embodiment.
- the example of FIG. 7 assumes that the machine learning prediction model 124 being trained is implemented as a two-layer LSTM model having an input layer and an output layer.
- additional hidden layers may also be employed, as would be apparent to a person of ordinary skill in the art.
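- For readers unfamiliar with LSTM networks, a single time step of a standard LSTM cell can be sketched in plain Python as follows. This shows the generic textbook cell update (input, forget and output gates plus a candidate state) and is not the pseudo code of FIG. 7; the function name `lstm_step`, the weight layout and the gate names are assumptions chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights, biases):
    """Compute one LSTM time step and return (h, c).

    weights[gate] holds one row per hidden unit; each row is applied to the
    concatenation of the input x and the previous hidden state h_prev.
    """
    z = list(x) + list(h_prev)

    def gate(name, activation):
        return [
            activation(sum(w * v for w, v in zip(row, z)) + b)
            for row, b in zip(weights[name], biases[name])
        ]

    i = gate("input", sigmoid)         # how much new information to admit
    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # proposed new cell content

    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c
```

In practice a deep learning framework would supply the cell and its training; the sketch only makes the gating arithmetic concrete.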
- data from the software development server(s) 110 is separated into a training dataset and a testing dataset, for example, with a random distribution of 80% for the training dataset and 20% for the testing dataset.
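- A random 80%/20% split of this kind can be sketched with the Python standard library alone. The function name `split_dataset` and the fixed seed are illustrative assumptions; a library helper could be used instead.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and testing sets."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```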
- the data from the software development server(s) 110 for the training process 700 may comprise, for example, the software development events 220 of FIG. 2A for software code changes, together with the corresponding completion time labels 230 of the software code changes.
- An input layer of the machine learning prediction model 124 can be trained with the training dataset and the training output of the input layer can be provided to the output layer of the machine learning prediction model 124 .
- the machine learning prediction model 124 is trained by the training process 700 using the training data until the machine learning prediction model 124 achieves, for example, an accuracy value of 90% on the test data.
- a maximum of 24 epochs (e.g., training iterations)
- the time difference between each subsequent pair of events in the training data is used to train the input layer of the machine learning prediction model 124 .
- the accuracy value on the test data satisfies the specified accuracy criteria (e.g., an accuracy value of 90% on the test data)
- a “pass” status is applied to the training process 700 and the training process 700 terminates for the current training dataset.
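The stopping rule described above (train for at most 24 epochs, stop once the accuracy on the test data reaches 90%) can be sketched as a simple loop. The callables `train_one_epoch` and `evaluate` are hypothetical placeholders for the actual model code, and the "fail" status is an assumption; the source only describes the "pass" case.

```python
def train_until_accurate(train_one_epoch, evaluate, target_accuracy=0.90, max_epochs=24):
    """Run training epochs until the accuracy criterion is met or epochs run out."""
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()        # one pass over the training dataset
        accuracy = evaluate()    # accuracy on the testing dataset
        if accuracy >= target_accuracy:
            return {"status": "pass", "epochs": epoch, "accuracy": accuracy}
    # Assumed outcome when the criterion is never satisfied.
    return {"status": "fail", "epochs": max_epochs, "accuracy": accuracy}


# Scripted accuracies stand in for a real model that improves each epoch.
scripted = iter([0.62, 0.75, 0.84, 0.91, 0.93])
result = train_until_accurate(lambda: None, lambda: next(scripted))
print(result)
```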
- FIG. 8 illustrates exemplary pseudo code for a data engineering process 800 that preprocesses the data for the machine learning prediction model of FIG. 1 in accordance with an illustrative embodiment.
- the exemplary data engineering process 800 initially collects data from one or more software development servers 110 of FIG. 1 (e.g., version control systems). The collected data is then explored, for example, using one or more Python libraries (e.g., associated with Python version 3.10.0). For example, the seaborn and/or matplotlib Python libraries may be employed in some embodiments for data exploration.
- the exemplary data engineering process 800 then preprocesses the collected data (e.g., to satisfy one or more data processing requirements of the machine learning prediction model 124 ).
- the preprocessing may comprise employing Principal Component Analysis (PCA) to explore relationships between the collected data.
- the results from the PCA analysis are then used to reduce a dimensionality of the available data, for example, by reducing features from the collected data.
- Null values (and/or other unimportant data records) that are not needed to train the machine learning prediction model 124 may be removed from the collected data in some embodiments.
- Ordinal encoding may then be applied to the remaining data (following removal of the unimportant data records), to transform the remaining data so that it can be processed by the machine learning prediction model 124 .
- binary or text values may be converted to categorical values.
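The removal of null records followed by ordinal encoding can be illustrated with a small standard-library sketch. The two feature names are taken from the feature list in this document; the sample values and the helper names are hypothetical, and a real pipeline would more likely rely on a data-processing library.

```python
def drop_null_records(records):
    """Keep only records in which every feature has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def fit_ordinal_encoder(records, column):
    """Map each distinct value of a column to an integer, in sorted order."""
    categories = sorted({r[column] for r in records})
    return {category: index for index, category in enumerate(categories)}


# Hypothetical raw records; the third one has a null value and is dropped.
raw = [
    {"pull_request_draft": "false", "pull_request_base_repository_owner_type": "Organization"},
    {"pull_request_draft": "true", "pull_request_base_repository_owner_type": "User"},
    {"pull_request_draft": None, "pull_request_base_repository_owner_type": "User"},
]

clean = drop_null_records(raw)
encoders = {col: fit_ordinal_encoder(clean, col) for col in clean[0]}
encoded = [{col: encoders[col][row[col]] for col in row} for row in clean]
print(encoded)
```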
- timestamp representations of each pull request (such as the difference between a pull request open time and a pull request closed time) may be encoded. For an exemplary feature “timestamp.format(pull request closed time - pull request open time),” where the open time is “24th November 12:30 PM,” and the closed time is “25th November 12:30 PM,” the timestamp representation may be expressed as 1440 minutes (24 hours multiplied by 60 minutes per hour).
- the time duration between each pair of subsequent events in each pull request is also identified (e.g., a delta time comprising the time duration between two events).
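The 1440-minute example and the per-event delta times can be reproduced with the sketch below. The year and the two intermediate event times are assumptions added for illustration; the source gives only the open and closed times.

```python
from datetime import datetime


def minutes_between(start, end):
    """Whole minutes elapsed between two timestamps."""
    return int((end - start).total_seconds() // 60)


# Open and closed times from the example above (the year is an assumption).
opened = datetime(2021, 11, 24, 12, 30)
closed = datetime(2021, 11, 25, 12, 30)
duration = minutes_between(opened, closed)

# Hypothetical intermediate events within the same pull request.
events = [opened, datetime(2021, 11, 24, 14, 0), datetime(2021, 11, 25, 9, 15), closed]
deltas = [minutes_between(a, b) for a, b in zip(events, events[1:])]
print(duration, deltas)
```

The delta times sum to the overall duration, which gives a quick consistency check on the encoded data.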
- FIG. 9 shows an exemplary implementation of a deep learning model 900 to predict completion times 950 of software code changes in accordance with an illustrative embodiment.
- the deep learning model 900 includes an LSTM architecture which is comprised of one or more LSTM memory blocks 925-1 through 925-M, collectively referred to herein as LSTM memory blocks 925, that can be connected through layers.
- Each LSTM memory block 925 comprises three gates that manage a state and output of the respective LSTM memory block 925 .
- a given LSTM memory block 925 operates upon an input sequence and each gate within a LSTM memory block 925 uses sigmoid activation units to control whether the respective LSTM memory block 925 is triggered.
- Each of the exemplary LSTM memory blocks 925 comprises a forget gate that conditionally decides what information to discard from the block; an input gate that conditionally decides which values from the input are used to update the memory state; and an output gate that conditionally decides what value to output based on the input and the memory of the respective LSTM memory block 925.
- each LSTM memory block 925 has weights (e.g., h_i, c_{i-1}, c_i) that are learned during the model training phase 210 of FIG. 2.
- a given LSTM memory block 925 remembers values over arbitrary time intervals, and the three gates of each LSTM memory block 925 regulate the flow of information into and out of the cell in connection with a given number of hidden units 910 , a given number of features 930 , discussed hereinafter, and a given number of time steps 940 .
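The gating behavior described above can be made concrete with one time step of a standard single-unit LSTM cell. This is a textbook formulation written with scalars for readability, not the model of FIG. 9; the weight layout and the all-zero weights in the usage lines are assumptions chosen so the arithmetic is easy to follow.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell.

    w maps a gate name to a (input weight, recurrent weight, bias) tuple.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate value to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value computed from the input
    c = f * c_prev + i * g            # updated memory (cell) state
    h = o * math.tanh(c)              # block output (hidden state)
    return h, c


# With all weights at zero every gate evaluates to 0.5 and the candidate to 0.
zero = {gate: (0.0, 0.0, 0.0) for gate in ("forget", "input", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=zero)
print(h, c)
```

In this toy case the forget gate halves the previous cell state (2.0 becomes 1.0) and nothing new is written, which shows how the gates regulate what is retained across time steps.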
- Such an example LSTM network is utilized in one or more embodiments to process and classify data, and generate one or more predictions 950 of completion times for software code changes based on the applied software development events 260 of FIG. 2 .
- input data associated with an initial state 920 can pertain to features extracted from the applied software development events 260 .
- the following features were used: pull_request_draft; pull_request_base_user_login; software_changes; pull_request_base_repository_size; sender_login; pull_request_base_repository_name; repository_full_name; pull_request_base_repository_owner_type; pull_request_deletions; pull_request_title; pull_request_base_repository_open_issues; pull_request_head_repository_name; repository_owner_login; repository_forks; pull_request_head_repository_watchers; pull_request_url; pull_request_head_repository_has_issues; pull_request_head_reference; repository_open_issues; headers_x_github_delivery; organization_url; pull_request_
- the exemplary deep learning model 900 outputs a predicted completion time of one or more software code changes. It is noted that the particular arrangement of elements shown in FIG. 9 is presented by way of illustrative example only, and a wide variety of alternative machine learning models can be used in other embodiments, as would be apparent to a person of ordinary skill in the art.
- One or more embodiments of the disclosure provide improved methods, apparatus and computer program products for machine learning-based prediction of completion time of software code changes.
- the foregoing applications and associated embodiments should be considered as illustrative only, and numerous other embodiments can be configured using the techniques disclosed herein, in a wide variety of different applications.
- processing modules or other components may therefore each run on a computer, storage device or other processing platform element.
- a given such element may be viewed as an example of what is more generally referred to herein as a “processing device.”
- illustrative embodiments disclosed herein can provide a number of significant advantages relative to conventional arrangements. It is to be appreciated that the particular advantages described above and elsewhere herein are associated with particular illustrative embodiments and need not be present in other embodiments. Also, the particular types of information processing system features and functionality as illustrated and described herein are exemplary only, and numerous other arrangements may be used in other embodiments.
- compute services and/or storage services can be offered to cloud infrastructure tenants or other system users as a Platform-as-a-Service (PaaS) model, an Infrastructure-as-a-Service (IaaS) model, a Storage-as-a-Service (STaaS) model and/or a Function-as-a-Service (FaaS) model, although numerous alternative arrangements are possible.
- illustrative embodiments can be implemented outside of the cloud infrastructure context, as in the case of a stand-alone computing and storage system implemented within a given enterprise.
- the cloud infrastructure further comprises sets of applications running on respective ones of the virtual machines under the control of the hypervisor. It is also possible to use multiple hypervisors each providing a set of virtual machines using at least one underlying physical machine. Different sets of virtual machines provided by one or more hypervisors may be utilized in configuring multiple instances of various components of the system.
- cloud infrastructure can be used to provide what is also referred to herein as a multi-tenant environment.
- One or more system components such as a cloud-based software code change completion time prediction engine, or portions thereof, are illustratively implemented for use by tenants of such a multi-tenant environment.
- Cloud infrastructure as disclosed herein can include cloud-based systems such as AWS, GCP and Microsoft Azure.
- Virtual machines provided in such systems can be used to implement at least portions of a cloud-based software code change completion time prediction platform in illustrative embodiments.
- the cloud-based systems can include object stores such as Amazon S3, GCP Cloud Storage, and Microsoft Azure Blob Storage.
- the cloud infrastructure additionally or alternatively comprises a plurality of containers implemented using container host devices.
- a given container of cloud infrastructure illustratively comprises a Docker container or other type of Linux Container (LXC).
- the containers may run on virtual machines in a multi-tenant environment, although other arrangements are possible.
- the containers may be utilized to implement a variety of different types of functionality within the storage devices.
- containers can be used to implement respective processing devices providing compute services of a cloud-based system.
- containers may be used in combination with other virtualization infrastructure such as virtual machines implemented using a hypervisor.
- processing platforms will now be described in greater detail with reference to FIGS. 10 and 11 . These platforms may also be used to implement at least portions of other information processing systems in other embodiments.
- FIG. 10 shows an example processing platform comprising cloud infrastructure 1000 .
- the cloud infrastructure 1000 comprises a combination of physical and virtual processing resources that may be utilized to implement at least a portion of the information processing system 110 .
- the cloud infrastructure 1000 comprises multiple virtual machines (VMs) and/or container sets 1002-1, 1002-2, . . . 1002-R implemented using virtualization infrastructure 1004.
- the virtualization infrastructure 1004 runs on physical infrastructure 1005 , and illustratively comprises one or more hypervisors and/or operating system level virtualization infrastructure.
- the operating system level virtualization infrastructure illustratively comprises kernel control groups of a Linux operating system or other type of operating system.
- the cloud infrastructure 1000 further comprises sets of applications 1010-1, 1010-2, . . . 1010-R running on respective ones of the VMs/container sets 1002-1, 1002-2, . . . 1002-R under the control of the virtualization infrastructure 1004.
- the VMs/container sets 1002 may comprise respective VMs, respective sets of one or more containers, or respective sets of one or more containers running in VMs.
- the VMs/container sets 1002 comprise respective VMs implemented using virtualization infrastructure 1004 that comprises at least one hypervisor.
- Such implementations can provide software code change completion time prediction functionality of the type described above for one or more processes running on a given one of the VMs.
- each of the VMs can implement machine learning-based prediction control logic and associated functionality for evaluating predicted completion times for one or more processes running on that particular VM.
- an example of a hypervisor platform that may be used to implement a hypervisor within the virtualization infrastructure 1004 is VMware® vSphere®, which may have an associated virtual infrastructure management system such as VMware® vCenter™.
- the underlying physical machines may comprise one or more distributed processing platforms that include one or more storage systems.
- the VMs/container sets 1002 comprise respective containers implemented using virtualization infrastructure 1004 that provides operating system level virtualization functionality, such as support for Docker containers running on bare metal hosts, or Docker containers running on VMs.
- the containers are illustratively implemented using respective kernel control groups of the operating system.
- Such implementations can provide machine learning-based prediction functionality of the type described above for one or more processes running on different ones of the containers.
- a container host device supporting multiple containers of one or more container sets can implement one or more instances of machine learning-based prediction control logic and associated functionality for evaluating predicted completion times.
- one or more of the processing modules or other components of system 110 may each run on a computer, server, storage device or other processing platform element.
- a given such element may be viewed as an example of what is more generally referred to herein as a “processing device.”
- the cloud infrastructure 1000 shown in FIG. 10 may represent at least a portion of one processing platform.
- processing platform 1100 shown in FIG. 11 is another example of such a processing platform.
- the processing platform 1100 in this embodiment comprises at least a portion of the given system and includes a plurality of processing devices, denoted 1102-1, 1102-2, 1102-3, . . . 1102-K, which communicate with one another over a network 1104.
- the network 1104 may comprise any type of network, such as a WAN, a LAN, a satellite network, a telephone or cable network, a cellular network, a wireless network such as WiFi or WiMAX, or various portions or combinations of these and other types of networks.
- the processing device 1102-1 in the processing platform 1100 comprises a processor 1110 coupled to a memory 1112.
- the processor 1110 may comprise a microprocessor, a microcontroller, an ASIC, an FPGA or other type of processing circuitry, as well as portions or combinations of such circuitry elements. The memory 1112 may be viewed as an example of what is referred to herein as “processor-readable storage media” storing executable program code of one or more software programs.
- Articles of manufacture comprising such processor-readable storage media are considered illustrative embodiments.
- a given such article of manufacture may comprise, for example, a storage array, a storage disk or an integrated circuit containing RAM, ROM or other electronic memory, or any of a wide variety of other types of computer program products.
- the term “article of manufacture” as used herein should be understood to exclude transitory, propagating signals. Numerous other types of computer program products comprising processor-readable storage media can be used.
- the processing device 1102-1 also includes network interface circuitry 1114, which is used to interface the processing device with the network 1104 and other system components, and which may comprise conventional transceivers.
- the other processing devices 1102 of the processing platform 1100 are assumed to be configured in a manner similar to that shown for processing device 1102 - 1 in the figure.
- processing platform 1100 shown in the figure is presented by way of example only, and the given system may include additional or alternative processing platforms, as well as numerous distinct processing platforms in any combination, with each such platform comprising one or more computers, storage devices or other processing devices.
- Multiple elements of an information processing system may be collectively implemented on a common processing platform of the type shown in FIG. 10 or 11 , or each such element may be implemented on a separate processing platform.
- processing platforms used to implement illustrative embodiments can comprise different types of virtualization infrastructure, in place of or in addition to virtualization infrastructure comprising virtual machines.
- virtualization infrastructure illustratively includes container-based virtualization infrastructure configured to provide Docker containers or other types of LXCs.
- portions of a given processing platform in some embodiments can comprise converged infrastructure such as VxRail™, VxRack™, VxBlock™, or Vblock® converged infrastructure commercially available from Dell Technologies.
- components of an information processing system as disclosed herein can be implemented at least in part in the form of one or more software programs stored in memory and executed by a processor of a processing device.
- for example, at least portions of the functionality shown in one or more of the figures are illustratively implemented in the form of software running on one or more processing devices.
Abstract
Description
- The field relates generally to information processing systems and more particularly, to the processing of software code changes in such information processing systems.
- A number of techniques exist for developing and making changes to software code. GitHub, for example, provides a software development platform that enables communication and collaboration among software developers. The software development platform provided by GitHub allows software developers to create new versions of software without disrupting a current version. Software development tasks often require coordination among a number of engineering teams that work on different portions of a larger software development project.
- In one embodiment, a method comprises obtaining one or more events related to one or more changes to software code; applying the one or more events to a machine learning prediction model that predicts a completion time of the one or more changes to the software code, wherein the machine learning prediction model is trained using (i) a plurality of events for a plurality of historical software code changes and (ii) a completion time for each historical software code change; and performing one or more automated remedial actions based at least in part on the predicted completion time.
- In one or more embodiments, the one or more automated remedial actions comprise one or more of generating at least one notification responsive to the predicted completion time, and adjusting an allocation of resources assigned to the completion of at least one of the one or more changes to the software code. The one or more changes to the software code can be monitored and a new event related to the one or more changes to the software code can be applied to the machine learning prediction model to obtain an updated predicted completion time for the one or more changes to the software code.
- In some embodiments, a predicted completion time can be obtained for each of a plurality of sets of changes to the software code within a project, and the predicted completion times for the sets of changes can be aggregated to obtain a predicted completion time for the project. In addition, where a project comprises a plurality of sets of changes to the software code, a predicted completion time can be obtained for each set of changes within the project, and one or more sets of changes can be identified having a corresponding predicted completion time that occurs after a specified completion time for the project.
- Other illustrative embodiments include, without limitation, apparatus, systems, methods and computer program products comprising processor-readable storage media.
- FIG. 1 illustrates an information processing system configured for machine learning-based prediction of completion time of software code changes in accordance with an illustrative embodiment;
- FIGS. 2A and 2B, respectively, illustrate a training phase and a prediction phase of the machine learning prediction model of FIG. 1 in accordance with illustrative embodiments;
- FIG. 3 illustrates a number of exemplary software development events that may be processed by the machine learning model of FIG. 1 in accordance with an illustrative embodiment;
- FIG. 4 illustrates an exemplary product development tool dashboard in accordance with an illustrative embodiment;
- FIG. 5 is a flow diagram illustrating an exemplary process for monitoring a request to review software code changes and for generating a predicted completion time for the monitored software code changes in accordance with an illustrative embodiment;
- FIG. 6 is a flow diagram illustrating an exemplary implementation of a machine learning-based process for predicting completion times of software code changes in accordance with an illustrative embodiment;
- FIG. 7 illustrates exemplary pseudo code for a training process for the machine learning prediction model of FIG. 1 in accordance with an illustrative embodiment;
- FIG. 8 illustrates exemplary pseudo code for a data engineering process that preprocesses the data for the machine learning prediction model of FIG. 1 in accordance with an illustrative embodiment;
- FIG. 9 shows an exemplary implementation of an exemplary long short-term memory (LSTM) network to predict software code change completion times in an illustrative embodiment;
- FIG. 10 illustrates an exemplary processing platform that may be used to implement at least a portion of one or more embodiments of the disclosure comprising a cloud infrastructure; and
- FIG. 11 illustrates another exemplary processing platform that may be used to implement at least a portion of one or more embodiments of the disclosure.
- Illustrative embodiments of the present disclosure will be described herein with reference to exemplary communication, storage and processing devices. It is to be appreciated, however, that the disclosure is not restricted to use with the particular illustrative configurations shown. One or more embodiments of the disclosure provide methods, apparatus and computer program products for machine learning-based prediction of completion time of software code changes.
- FIG. 1 shows a computer network (also referred to herein as an information processing system) 100 configured in accordance with an illustrative embodiment. The computer network 100 comprises a plurality of user devices 102-1 through 102-L, collectively referred to herein as user devices 102. The user devices 102 are coupled to a network 104, where the network 104 in this embodiment is assumed to represent a sub-network or other related portion of the larger computer network 100. Accordingly, elements FIG. 1 embodiment. Also coupled to network 104 are one or more software development servers 110 and one or more project management servers 120.
- The user devices 102 may comprise, for example, servers and/or portions of one or more server systems, as well as devices such as mobile telephones, laptop computers, tablet computers, desktop computers or other types of computing devices. Such devices are examples of what are more generally referred to herein as “processing devices.” Some of these processing devices are also generally referred to herein as “computers.”
- The user devices 102 in some embodiments comprise respective computers associated with a particular company, organization or other enterprise. In addition, at least portions of the computer network 100 may also be referred to herein as collectively comprising an “enterprise network.” Numerous other operating scenarios involving a wide variety of different types and arrangements of processing devices and networks are possible, as will be appreciated by those skilled in the art.
- Also, it is to be appreciated that the term “user” in this context and elsewhere herein is intended to be broadly construed so as to encompass, for example, human, hardware, software or firmware entities, as well as various combinations of such entities.
- Also associated with the user devices 102 are one or more input-output devices, which illustratively comprise keyboards, displays or other types of input-output devices in any combination. Such input-output devices can be used, for example, to support one or more user interfaces to the user devices 102, as well as to support communication between the software development servers 110, the project management servers 120, and/or other related systems and devices not explicitly shown.
- The network 104 is assumed to comprise a portion of a global computer network such as the Internet, although other types of networks can be part of the computer network 100, including a wide area network (WAN), a local area network (LAN), a satellite network, a telephone or cable network, a cellular network, a wireless network such as a Wi-Fi or WiMAX network, or various portions or combinations of these and other types of networks. The computer network 100 in some embodiments therefore comprises combinations of multiple different types of networks, each comprising processing devices configured to communicate using internet protocol (IP) or other related communication protocols.
- Additionally, the software development servers 110 can have at least one associated database (not explicitly shown in FIG. 1) configured to store data pertaining to, for example, software code under development, events related to software code changes, reviewer information and/or reviewer comments. Each of the project management servers 120 can also have at least one associated database (not explicitly shown in FIG. 1) configured to store predicted and specified completion time data pertaining to, for example, software code changes being monitored by the project management servers 120.
- The databases associated with the software development servers 110 and/or the project management servers 120 can be implemented using one or more corresponding storage systems. Such storage systems can comprise any of a variety of different types of storage including network-attached storage (NAS), storage area networks (SANs), direct-attached storage (DAS) and distributed DAS, as well as combinations of these and other storage types, including software-defined storage.
- Additionally, the software development servers 110 and the project management servers 120 in the FIG. 1 embodiment are assumed to be implemented using at least one processing device. Each such processing device generally comprises at least one processor and an associated memory, and implements one or more functional modules for controlling certain features of the software development servers 110 and/or the project management servers 120.
- More particularly, the software development servers 110 and the project management servers 120 in this embodiment can each comprise a processor coupled to a memory and a network interface.
- The processor illustratively comprises a microprocessor, a microcontroller, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other type of processing circuitry, as well as portions or combinations of such circuitry elements.
- The memory illustratively comprises random access memory (RAM), read-only memory (ROM) or other types of memory, in any combination. The memory and other memories disclosed herein may be viewed as examples of what are more generally referred to as “processor-readable storage media” storing executable computer program code or other types of software programs.
- One or more embodiments include articles of manufacture, such as computer-readable storage media. Examples of an article of manufacture include, without limitation, a storage device such as a storage disk, a storage array or an integrated circuit containing memory, as well as a wide variety of other types of computer program products. The term “article of manufacture” as used herein should be understood to exclude transitory, propagating signals. These and other references to “disks” herein are intended to refer generally to storage devices, including solid-state drives (SSDs), and should therefore not be viewed as limited in any way to spinning magnetic media.
- The network interfaces allow for communication between the software development servers 110, the project management servers 120, and/or the user devices 102 over the network 104, and each illustratively comprises one or more conventional transceivers.
- In the example of FIG. 1, the software development servers 110 may be implemented, at least in part, using the GitHub software development platform. The software development servers 110 may comprise a software code repository 112, a software code change processing module 114, and an event messaging module 116. Generally, the software code repository 112 comprises multiple versions of software, such as a current software version and one or more versions undergoing software development. The software code change processing module 114 may process changes to the software code, for example, using at least portions of the GitHub software development tool. In some embodiments, the event messaging module 116 generates events related to the software code changes (as described in more detail, for example, in conjunction with FIG. 3) and publishes the messages in a messaging layer of a sequential message queue, such as a Kafka messaging layer or a messaging layer of another enterprise service bus. The messages may also be stored in a database, such as a NoSQL database (e.g., MongoDB).
- Also, the project management servers 120 may be implemented, at least in part, using the Jira™ product development tool that allows a project manager to monitor the progress of software development tasks. Each of the project management servers 120 may include an event processing module 122, a machine learning prediction model 124, and a dashboard update module 126. Generally, the event processing module 122 obtains and processes events corresponding to changes to software code being generated by the event messaging module 116 of the software development server 110. The event processing module 122 may transform the events into formats that are digestible by the machine learning prediction model 124, for example. In some embodiments, the machine learning prediction model 124 generates a predicted completion time for one or more changes to software code. The generated predicted completion times for the one or more changes to software code may be presented to one or more users, for example, using the dashboard update module 126 (as described in more detail in conjunction with, for example, FIG. 4).
- In some embodiments, the machine learning prediction model 124 is trained using (i) a plurality of events for a plurality of historical software code changes and (ii) a completion time (e.g., as a label) for each historical software code change.
- It is to be appreciated that the particular arrangement of elements 112-116 illustrated in the software development server(s) 110, and the particular arrangement of elements 122-126 in the project management server(s) 120 of the FIG. 1 embodiment is presented by way of example only, and alternative arrangements can be used in other embodiments. For example, the functionality associated with the elements 112-116 and/or elements 122-126 in other embodiments can be combined into a single element, or separated across a larger number of elements. As another example, multiple distinct processors can be used to implement different ones of the elements 112-116 and/or elements 122-126 or portions thereof.
- At least portions of elements 112-116 and/or elements 122-126 may be implemented at least in part in the form of software that is stored in memory and executed by a processor.
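The flow of events from the event messaging module 116 to the event processing module 122 can be sketched as a publish/consume pair. The in-memory `queue.Queue` is a stand-in for a messaging layer such as Kafka, and the message fields, event type names, and function names are hypothetical; none of them come from the disclosure.

```python
import json
import queue

# In-memory stand-in for a sequential message queue (the source names Kafka as one option).
message_queue = queue.Queue()


def publish_event(change_id, event_type, timestamp):
    """Serialize one software development event and publish it to the queue."""
    message_queue.put(json.dumps(
        {"change_id": change_id, "event_type": event_type, "timestamp": timestamp}
    ))


def consume_events():
    """Drain the queue and deserialize every pending message."""
    events = []
    while not message_queue.empty():
        events.append(json.loads(message_queue.get()))
    return events


def to_event_sequences(events):
    """Group events per change and order them by timestamp for the model."""
    sequences = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        sequences.setdefault(event["change_id"], []).append(event["event_type"])
    return sequences


# Hypothetical events, deliberately published out of order.
publish_event("PR-101", "opened", "2021-11-24T12:30:00")
publish_event("PR-101", "closed", "2021-11-25T12:30:00")
publish_event("PR-101", "review_requested", "2021-11-24T14:00:00")
sequences = to_event_sequences(consume_events())
print(sequences)
```

Sorting by ISO 8601 timestamp strings works here because they share one format and time zone; a production consumer would more likely parse the timestamps explicitly.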
- It is to be understood that the particular set of elements shown in FIG. 1 for machine learning-based prediction of completion time of software code changes is presented by way of illustrative example only, and in other embodiments additional or alternative elements may be used. Thus, another embodiment includes additional or alternative systems, devices and other network entities, as well as different arrangements of modules and other components. For example, in at least one embodiment, one or more of the software development servers 110 and at least one associated database can be on and/or part of the same processing platform.
- An exemplary process utilizing elements 112-116 of an example software development server 110 in computer network 100 will be described in more detail with reference to, for example, FIGS. 3 and 6.
- An exemplary process utilizing elements 122-126 of an example project management server 120 in computer network 100 will be described in more detail with reference to, for example, FIGS. 4 through 6.
-
FIG. 2A illustrates an exemplary model training phase 210 of the machine learning prediction model 124 of FIG. 1 according to one or more embodiments. In the example of FIG. 2A, historical changes to software code (e.g., from the software code repository 112) are processed to train the machine learning prediction model 124 to generate a trained machine learning prediction model 224. In particular, software development events 220 (e.g., as discussed further below in conjunction with FIG. 3) for the historical changes to the software code, together with actual completion time labels 230 for the historical changes to the software code, are applied to a machine learning model training module 240 that trains one or more machine learning models to generate a trained machine learning prediction model 224.
- In this manner, the trained machine learning prediction model 224 has learned the lifetime of such software code changes based on the historical data and can predict the expected completion time with respect to such software code changes in order to provide project management teams with timely technical updates.
- The trained machine learning prediction model 224 may be implemented, for example, as an LSTM network. The training may employ supervised learning techniques with outcome labels (e.g., observed completion time) with respect to the historical data. In this manner, the machine learning prediction model 224 can process single data points (e.g., an event), as well as sequences of data (e.g., events).
- FIG. 2B illustrates an exemplary completion time prediction phase 250 of the machine learning prediction model 224 according to at least one embodiment. In the example of FIG. 2B, software development events 260 (e.g., as discussed further below in conjunction with FIG. 3) for current changes to the software code are applied to the machine learning prediction model 224, during the completion time prediction phase 250, in order to obtain the predicted completion time 270 of the current software code changes.
- In at least some embodiments, a feedback path 280 is provided for a retraining of the machine learning prediction model 224. In this manner, as the scope of the sample data increases, the machine learning prediction model 224 learns and becomes more accurate.
- Code branching provides a mechanism for working on different versions of a software code repository at one time. Branches can be used to experiment and to make changes before committing the changes to the main branch. In the GitHub software development platform, for example, a pull request (also referred to herein as a request to review software code changes) allows developers or other members of a development team to announce potential changes to software code that have been pushed to a branch in a repository on GitHub. Once such a pull request is opened, the potential changes can be evaluated, discussed and reviewed among collaborators before the potential changes are merged into a main code branch. In some embodiments, a given pull request can show differences between two code branches, such as the main code branch and the proposed changes.
- In the GitHub software development platform, saved changes are referred to as commits. Each commit typically has an associated commit message, describing why a particular change was made. A merge request makes the approved software code changes available to the main branch.
- As noted above, the
project management servers 120 may be implemented, at least in part, using the Jira product development tool that allows a project manager to monitor the progress of software development tasks. In the context of the Jira product development tool, a user story may comprise an explanation of a software feature written from the perspective of the end user. The purpose of a Jira story is to articulate how a software feature will provide value to the customer. A “sprint” is a time period in a development cycle where an engineering team completes work, in a known manner. - One or more aspects of the disclosure recognize that there may be one or more pull requests to be reviewed or merged that are related to a Jira story and that experience unexpected delays. Such delays may require the program managers to adjust the timeline of the delayed pull requests themselves and/or the timeline of other implicated pull requests or tasks.
- In one or more embodiments, machine learning-based techniques are provided for predicting completion times of software code changes that allow a project management team to dynamically track (and predict) the progress of pull requests and/or merge requests associated with outstanding user stories in a sprint, for example, using comments and/or activities in such pull/merge requests.
-
FIG. 3 illustrates a number of exemplary software development events 300 that may be processed by the machine learning model of FIG. 1, according to one or more embodiments of the disclosure. In the example of FIG. 3, the exemplary software development events 300 comprise a create branch event, a code change event, a commit change event, a pull request event, a reviewer comment event (or a sequence of reviewer comments), a merge request event, a continuous integration/continuous deployment event and/or a code analysis tool event. The events 300 may comprise webhook events and may be obtained from a messaging layer.
- In at least some embodiments, the software development events 300 that are applied to the machine learning prediction model 124 in the completion time prediction phase 250 comprise events occurring between an opening of a pull request related to a review of one or more software code changes and a completion of the pull request (e.g., when the one or more changes to the software code associated with the pull request are merged into the main branch and the pull request is closed).
- In one or more embodiments, the exemplary software development events 300 may be categorized into a first set of events from the GitHub software development platform and a second set of events from a code review platform, such as a continuous integration/continuous deployment (CICD) tool (e.g., the Jenkins open source automation server to modify and ensure the integrity of the code pipeline) and/or a code analysis tool (e.g., a code checker that performs code analysis to review source code lines prior to entering the production phase of a development project).
-
FIG. 4 illustrates an exemplary product development tool dashboard 400 for a Jira story related to one or more software code changes, according to some embodiments of the disclosure. In the example of FIG. 4, the dashboard 400 comprises an identifier for each pull request being monitored, as well as a corresponding predicted completion time flag (e.g., “on time” or “delayed”) and any automated actions to be performed based on the current predicted completion time of any pull request.
- For example, as shown in FIG. 4, pull request 3 has a predicted completion time flag of “delayed,” since the predicted completion time occurs after the defined sprint closure date. Thus, an automated action may comprise sending a notification to the appropriate sprint team software indicating that pull request 3 needs additional attention (or resources) and/or recommending an allocation of additional resources to a completion of pull request 3.
- FIG. 5 is a flow diagram illustrating an exemplary process 500 for monitoring a pull request related to software code changes and for generating a predicted completion time for the monitored pull request, according to at least one embodiment. In the example of FIG. 5, the process 500 initially monitors a given pull request in step 510.
- A test is performed in step 520 to determine if a new event is received for the monitored pull request. If it is determined in step 520 that a new event is not received, then program control returns to step 510 to continue monitoring the pull request.
- Once it is determined in step 520 that a new event is received, then the new event is applied in step 530 to the machine learning prediction model 124 and a new prediction of the completion time of the pull request is obtained in step 540.
- Thus, at least in some embodiments, the disclosed machine learning-based prediction techniques predict the lifetime of outstanding software code changes (e.g., associated with pull/merge requests) based on the GitHub events. In this manner, as events happen on the monitored pull request, the estimated completion time can be dynamically updated.
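- The monitoring loop of process 500 can be sketched roughly as follows. This is a minimal illustration and not the disclosed implementation: the `PullRequestMonitor` class, the `run` helper, the event dictionary layout and the `toy_predict` function are all hypothetical, and `toy_predict` is only a stand-in for the trained machine learning prediction model.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class PullRequestMonitor:
    """Tracks one pull request and re-predicts whenever a new event arrives."""
    predict: Callable[[List[dict]], float]  # hypothetical wrapper around the model
    events: List[dict] = field(default_factory=list)
    predictions: List[float] = field(default_factory=list)

    def on_event(self, event: dict) -> float:
        # Steps 530 and 540: apply the new event and obtain a new prediction.
        self.events.append(event)
        estimate = self.predict(self.events)
        self.predictions.append(estimate)
        return estimate


def toy_predict(events: List[dict]) -> float:
    """Placeholder predictor (hours). NOT a trained model, just a stand-in."""
    comments = sum(e["type"] == "reviewer_comment" for e in events)
    return 48.0 + 6.0 * comments


def run(monitor: PullRequestMonitor, stream: Iterable[dict]) -> List[float]:
    # Steps 510 and 520: keep monitoring; each received event triggers an update.
    for event in stream:
        monitor.on_event(event)
    return monitor.predictions
```

- In this sketch every incoming event produces a fresh estimate, so a dashboard could simply display the last entry of `predictions` for each monitored pull request.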
-
FIG. 6 is a flow diagram illustrating an exemplary implementation of a machine learning-based process 600 for predicting completion times of software code changes, according to at least some embodiments. In the example of FIG. 6, the machine learning-based process 600 initially obtains one or more events in step 610 related to one or more changes to software code. In step 620, the one or more events are applied to a machine learning prediction model that predicts a completion time of the one or more changes to the software code. The machine learning prediction model can be trained, for example, using (i) a plurality of events for a plurality of historical software code changes and (ii) a completion time for each historical software code change.
- In step 630, one or more automated remedial actions are performed based at least in part on the predicted completion time.
- In some embodiments, the one or more events may comprise events occurring between an opening of a request to review the one or more changes to the software code and a completion of the one or more changes to the software code. The one or more events may be obtained, for example, from a messaging layer, such as a Kafka messaging layer. The one or more events may comprise, for example, a create branch event, a code change event, a commit change event, a pull request event, a reviewer comment event, a merge request event, a continuous integration/continuous deployment event and/or a code analysis tool event.
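- As a rough illustration of selecting the events that fall between the opening of a review request and its completion, the sketch below filters a list of already-received events. The event type labels, the `type` and `at` field names and the `events_in_window` helper are assumptions made for this example; they do not reflect an actual webhook schema or messaging layer API.

```python
from datetime import datetime
from typing import List, Optional

# Hypothetical labels for the event kinds enumerated in the text.
EVENT_TYPES = {
    "create_branch", "code_change", "commit_change", "pull_request",
    "reviewer_comment", "merge_request", "ci_cd", "code_analysis",
}


def events_in_window(events: List[dict], opened_at: datetime,
                     completed_at: Optional[datetime] = None) -> List[dict]:
    """Return recognised events between opening and completion, oldest first.

    If completed_at is None the request is still open, so every recognised
    event at or after opened_at is kept.
    """
    kept = [
        e for e in events
        if e["type"] in EVENT_TYPES
        and e["at"] >= opened_at
        and (completed_at is None or e["at"] <= completed_at)
    ]
    return sorted(kept, key=lambda e: e["at"])
```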
- In one or more embodiments, the automated remedial actions performed in
step 630 may comprise generating at least one notification responsive to the predicted completion time, and/or adjusting an allocation of resources assigned to the completion of the one or more changes to the software code. - In at least one embodiment, the one or more changes to the software code are monitored and a new event related to the one or more changes to the software code can be applied to the machine learning prediction model to obtain an updated predicted completion time for the one or more changes to the software code.
- In addition, a predicted completion time can be obtained for each of a plurality of sets of changes to the software code (e.g., defined tasks with respect to the software code, such as separate pull requests) within a project (e.g., a Jira story) and the predicted completion time for each set of changes can be aggregated to obtain a predicted completion time for the project. When a project comprises a plurality of sets of changes to the software code, a predicted completion time can be obtained for each set of changes within the project and one or more sets of changes can be identified having a corresponding predicted completion time that occurs after a specified completion time for the project.
- For example, assume there are four pull/merge requests associated with a Jira Story, each having a corresponding predicted completion time. If any of the predicted completion times of the four pull/merge requests falls after a defined sprint closure date, such pending pull/merge requests can be consolidated and a summary notification can be provided to the sprint team. In this manner, an automated mechanism is provided as a triggering point to review the pull/merge requests that need attention.
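- The aggregation and flagging described above can be sketched as follows. The text does not specify how per-request predictions are aggregated, so this example assumes that a story is complete when its last pull/merge request completes (i.e., it takes the maximum). The `summarize_story` and `summary_notification` helpers and the request identifiers are hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple


def summarize_story(predicted: Dict[str, datetime],
                    sprint_close: datetime) -> Tuple[datetime, List[str]]:
    """Aggregate per-request predictions and list the requests that run late."""
    # Assumption: the story finishes when its last pull/merge request finishes.
    story_eta = max(predicted.values())
    # A request is flagged only if its prediction falls strictly after sprint close.
    delayed = sorted(pr for pr, eta in predicted.items() if eta > sprint_close)
    return story_eta, delayed


def summary_notification(story: str, delayed: List[str]) -> Optional[str]:
    """Build one consolidated message, or None when nothing is delayed."""
    if not delayed:
        return None
    return (f"{story}: {len(delayed)} pull/merge request(s) predicted to "
            f"complete after sprint closure: {', '.join(delayed)}")
```

- Consolidating the delayed requests into a single message, rather than sending one notification per request, mirrors the summary notification described in the example above.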
- The particular processing operations and other network functionality described in conjunction with the flow diagrams of FIGS. 5 and 6 are presented by way of illustrative example only, and should not be construed as limiting the scope of the disclosure in any way. Alternative embodiments can use other types of processing operations for machine learning-based prediction of completion time of software code changes. For example, the ordering of the process steps may be varied in other embodiments, or certain steps may be performed concurrently with one another rather than serially. In one aspect, the process can skip one or more of the actions. In other aspects, one or more of the actions are performed simultaneously. In some aspects, additional actions can be performed.
- FIG. 7 illustrates exemplary pseudo code for a training process 700 for the machine learning prediction model 124 of FIG. 1 in accordance with an illustrative embodiment. The example of FIG. 7 assumes that the machine learning prediction model 124 being trained is implemented as a two-layer LSTM model having an input layer and an output layer. In further embodiments, additional hidden layers may also be employed, as would be apparent to a person of ordinary skill in the art.
- In the embodiment of FIG. 7, data from the software development server(s) 110 is separated into a training dataset and a testing dataset, for example, with a random distribution of 80% for the training dataset and 20% for the testing dataset. The data from the software development server(s) 110 for the training process 700 may comprise, for example, the software development events 220 of FIG. 2A for software code changes, together with the corresponding completion time labels 230 of the software code changes.
- An input layer of the machine learning prediction model 124 can be trained with the training dataset and the training output of the input layer can be provided to the output layer of the machine learning prediction model 124. The machine learning prediction model 124 is trained by the training process 700 using the training data until the machine learning prediction model 124 achieves, for example, an accuracy value of 90% on the test data. In the example of FIG. 7, a maximum of 24 epochs (e.g., training iterations) is employed. For each epoch of the training process 700, the time difference between each pair of consecutive events in the training data is used to train the input layer of the machine learning prediction model 124. Once the accuracy value on the test data satisfies the specified accuracy criterion (e.g., an accuracy value of 90% on the test data), a “pass” status is applied to the training process 700 and the training process 700 terminates for the current training dataset.
-
FIG. 8 illustrates exemplary pseudo code for a data engineering process 800 that preprocesses the data for the machine learning prediction model of FIG. 1 in accordance with an illustrative embodiment. In the example of FIG. 8, the exemplary data engineering process 800 initially collects data from one or more software development servers 110 of FIG. 1 (e.g., version control systems). The collected data is then explored, for example, using one or more Python libraries (e.g., associated with Python version 3.10.0). For example, the seaborn and/or matplotlib Python libraries may be employed in some embodiments for data exploration.
- The exemplary data engineering process 800 then preprocesses the collected data (e.g., to satisfy one or more data processing requirements of the machine learning prediction model 124). In the example of FIG. 8, the preprocessing may comprise employing Principal Component Analysis (PCA) to explore relationships between the collected data. The results of the PCA are then used to reduce a dimensionality of the available data, for example, by reducing features from the collected data. Null values (and/or other unimportant data records) that are not needed to train the machine learning prediction model 124 may be removed from the collected data in some embodiments.
- Ordinal encoding may then be applied to the remaining data (following removal of the unimportant data records), to transform the remaining data so that it can be processed by the machine learning prediction model 124. For example, binary or text values may be converted to categorical values. In addition, timestamp representations of each pull request (such as the difference between a pull request open time and a pull request closed time) may be encoded. For an exemplary feature “timestamp.format(pull request closed time−pull request open time),” where the open time is “24th November 12:30 PM,” and the closed time is “25th November 12:30 PM,” the timestamp representation may be expressed as 1440 minutes (24 hours multiplied by 60 minutes per hour). The time duration between each pair of consecutive events in each pull request is also identified (e.g., a delta time comprising the time duration between two events).
- One or more embodiments include utilizing one or more artificial intelligence (AI) techniques (such as a deep learning algorithm or model) to predict completion times of software code changes. By way of example,
FIG. 9 shows an exemplary implementation of a deep learning model 900 to predict completion times 950 of software code changes in accordance with an illustrative embodiment. In the example embodiment depicted in FIG. 9, the deep learning model 900 includes an LSTM architecture which comprises one or more LSTM memory blocks 925-1 through 925-M, collectively referred to herein as LSTM memory blocks 925, that can be connected through layers. Each LSTM memory block 925 comprises three gates that manage a state and output of the respective LSTM memory block 925. In at least some embodiments, a given LSTM memory block 925 operates upon an input sequence and each gate within an LSTM memory block 925 uses sigmoid activation units to control whether the respective LSTM memory block 925 is triggered. Each of the exemplary LSTM memory blocks 925 comprises a forget gate that conditionally decides what information to discard from the block; an input gate that conditionally decides which values from the input update the memory state; and an output gate that conditionally decides what value to output based on the input and the memory of the respective LSTM memory block 925.
- In one or more embodiments, the gates of each LSTM memory block 925 have weights (e.g., h_i, c_{i-1}, c_i) that are learned during the model training phase 210 of FIG. 2A. A given LSTM memory block 925 remembers values over arbitrary time intervals, and the three gates of each LSTM memory block 925 regulate the flow of information into and out of the cell in connection with a given number of hidden units 910, a given number of features 930, discussed hereinafter, and a given number of time steps 940. Such an example LSTM network is utilized in one or more embodiments to process and classify data, and generate one or more predictions 950 of completion times for software code changes based on the applied software development events 260 of FIG. 2B.
- In such an example embodiment as depicted in
FIG. 9 , input data associated with aninitial state 920 can pertain to features extracted from the appliedsoftware development events 260. In one exemplary implementation, the following features were used: pull_request_draft; pull_request_base_user_login; software_changes; pull_request_base_repository_size; sender_login; pull_request_base_repository_name; repository_full_name; pull_request_base_repository_owner_type; pull_request_deletions; pull_request_title; pull_request_base_repository_open_issues; pull_request_head_repository_name; repository_owner_login; repository_forks; pull_request_head_repository_watchers; pull_request_url; pull_request_head_repository_has_issues; pull_request_head_reference; repository_open_issues; headers_x_github_delivery; organization_url; pull_request_assignee_url; comments; pull_request_mergeable_state; repository_private; pull_request_requested_teams; pull_request_assignee_login; pull_request_head_repository_owner_login; repository_watchers_count; pull_request_number; pull_request_author_association; repository_url; pull_request_changed_files; pull_request_merged_by_url; pull_request_head_repository_owner_url; pull_request_additions; pull_request_head_repository_forks_count; organization_login; action; pull_request_head_repository_url; pull_request_labels; pull_request_body; review_user_url; pull_request_comments; pull_request_user_url; headers_http_user_agent; pull_request_merged; pull_request_head_repository_open_issues; pull_request_user_login; pull_request_head_repository_watchers_count; review_state; pull_request_assignees; pull_request_head_user_type; pull_request_state; pull_request_base_repository_url; pull_request_head_repository_size; repository_owner_type; pull_request_base_repository_watchers; pull_request_base_repository_private; @timestamp; pull_request_base_label; review_author_association; pull_request_base_repository_owner_url; pull_request_base_repository_language; repository_open_issues_count; 
pull_request_base_repository_forks; pull_request_base_user_type; pull_request_head_repository_owner_type; pull_request_rebaseable; assignee; repository_size; pull_request_head_label; headers_x_github_event; repository_forks_count; pull_request_base_ref; pull_request_base_repository_default_branch; repository_language; pull_request_base_repository_watchers_count; pull_request_head_repository_private; pull_request_mergeable; pull_request_head_repository_full_name; pull_request_base_repository_full_name; pull_request_head_user_url; repository_description; pull_request_head_repository_description; pull_request_head_repository_default_branch; sender_url; repository_name; review_body; pull_request_base_repository_owner_login; review_user_login; pull_request_head_repository_fork; pull_request_head_user_login; pull_request_base_repository_forks_count; pull_request_head_repository_language; pull_request_head_repository_forks; pull_request_review_comments; pull_request_base_repository_open_issues_count; pull_request_base_user_url; headers_content_length; pull_request_merged_by_login; requested_team; requested_reviewer; repository_watchers; pull_request_commits; pull_request_requested_reviewers; repository_default_branch; pull_request_head_repository_open_issues_count; repository_owner_url; and pull_request_base_repository_description. - In the example embodiment of
FIG. 9, the exemplary deep learning model 900 outputs a predicted completion time of one or more software code changes. It is noted that the particular arrangement of elements shown in FIG. 9 is presented by way of illustrative example only, and a wide variety of alternative machine learning models can be used in other embodiments, as would be apparent to a person of ordinary skill in the art.
- One or more embodiments of the disclosure provide improved methods, apparatus and computer program products for machine learning-based prediction of completion time of software code changes. The foregoing applications and associated embodiments should be considered as illustrative only, and numerous other embodiments can be configured using the techniques disclosed herein, in a wide variety of different applications.
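- For reference, the gating behavior of an LSTM memory block of the kind discussed in conjunction with FIG. 9 is commonly written as shown below. This is the standard textbook formulation, not necessarily the exact variant used by the deep learning model 900. The symbols are assumptions introduced here: x_i is the input at step i, h_i and c_i are the hidden and cell states, W, U and b are learned parameters, σ is the sigmoid function and ⊙ is element-wise multiplication.

```latex
\begin{aligned}
f_i &= \sigma\left(W_f x_i + U_f h_{i-1} + b_f\right) && \text{(forget gate)}\\
a_i &= \sigma\left(W_a x_i + U_a h_{i-1} + b_a\right) && \text{(input gate)}\\
o_i &= \sigma\left(W_o x_i + U_o h_{i-1} + b_o\right) && \text{(output gate)}\\
\tilde{c}_i &= \tanh\left(W_c x_i + U_c h_{i-1} + b_c\right) && \text{(candidate state)}\\
c_i &= f_i \odot c_{i-1} + a_i \odot \tilde{c}_i\\
h_i &= o_i \odot \tanh\left(c_i\right)
\end{aligned}
```

- In this formulation the forget gate scales the previous cell state c_{i-1}, the input gate scales the candidate update, and the output gate determines how much of the updated cell state is exposed as h_i.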
- It should also be understood that the disclosed techniques for predicting completion times for changes to software code, as described herein, can be implemented at least in part in the form of one or more software programs stored in memory and executed by a processor of a processing device such as a computer. As mentioned previously, a memory or other storage device having such program code embodied therein is an example of what is more generally referred to herein as a “computer program product.”
- The disclosed techniques for machine learning-based prediction of completion time of software code changes may be implemented using one or more processing platforms. One or more of the processing modules or other components may therefore each run on a computer, storage device or other processing platform element. A given such element may be viewed as an example of what is more generally referred to herein as a “processing device.”
- As noted above, illustrative embodiments disclosed herein can provide a number of significant advantages relative to conventional arrangements. It is to be appreciated that the particular advantages described above and elsewhere herein are associated with particular illustrative embodiments and need not be present in other embodiments. Also, the particular types of information processing system features and functionality as illustrated and described herein are exemplary only, and numerous other arrangements may be used in other embodiments.
- In these and other embodiments, compute services and/or storage services can be offered to cloud infrastructure tenants or other system users as a Platform-as-a-Service (PaaS) model, an Infrastructure-as-a-Service (IaaS) model, a Storage-as-a-Service (STaaS) model and/or a Function-as-a-Service (FaaS) model, although numerous alternative arrangements are possible. Also, illustrative embodiments can be implemented outside of the cloud infrastructure context, as in the case of a stand-alone computing and storage system implemented within a given enterprise.
- Some illustrative embodiments of a processing platform that may be used to implement at least a portion of an information processing system comprise cloud infrastructure including virtual machines implemented using a hypervisor that runs on physical infrastructure. The cloud infrastructure further comprises sets of applications running on respective ones of the virtual machines under the control of the hypervisor. It is also possible to use multiple hypervisors each providing a set of virtual machines using at least one underlying physical machine. Different sets of virtual machines provided by one or more hypervisors may be utilized in configuring multiple instances of various components of the system.
- These and other types of cloud infrastructure can be used to provide what is also referred to herein as a multi-tenant environment. One or more system components such as a cloud-based software code change completion time prediction engine, or portions thereof, are illustratively implemented for use by tenants of such a multi-tenant environment.
- Cloud infrastructure as disclosed herein can include cloud-based systems such as AWS, GCP and Microsoft Azure. Virtual machines provided in such systems can be used to implement at least portions of a cloud-based software code change completion time prediction platform in illustrative embodiments. The cloud-based systems can include object stores such as Amazon S3, GCP Cloud Storage, and Microsoft Azure Blob Storage.
- In some embodiments, the cloud infrastructure additionally or alternatively comprises a plurality of containers implemented using container host devices. For example, a given container of cloud infrastructure illustratively comprises a Docker container or other type of Linux Container (LXC). The containers may run on virtual machines in a multi-tenant environment, although other arrangements are possible. The containers may be utilized to implement a variety of different types of functionality within the storage devices. For example, containers can be used to implement respective processing devices providing compute services of a cloud-based system. Again, containers may be used in combination with other virtualization infrastructure such as virtual machines implemented using a hypervisor.
- Illustrative embodiments of processing platforms will now be described in greater detail with reference to
FIGS. 10 and 11 . These platforms may also be used to implement at least portions of other information processing systems in other embodiments. -
FIG. 10 shows an example processing platform comprising cloud infrastructure 1000. The cloud infrastructure 1000 comprises a combination of physical and virtual processing resources that may be utilized to implement at least a portion of the information processing system 110. The cloud infrastructure 1000 comprises multiple virtual machines (VMs) and/or container sets 1002-1, 1002-2, . . . 1002-R implemented using virtualization infrastructure 1004. The virtualization infrastructure 1004 runs on physical infrastructure 1005, and illustratively comprises one or more hypervisors and/or operating system level virtualization infrastructure. The operating system level virtualization infrastructure illustratively comprises kernel control groups of a Linux operating system or other type of operating system.
- The cloud infrastructure 1000 further comprises sets of applications 1010-1, 1010-2, . . . 1010-R running on respective ones of the VMs/container sets 1002-1, 1002-2, . . . 1002-R under the control of the virtualization infrastructure 1004. The VMs/container sets 1002 may comprise respective VMs, respective sets of one or more containers, or respective sets of one or more containers running in VMs.
- In some implementations of the FIG. 10 embodiment, the VMs/container sets 1002 comprise respective VMs implemented using virtualization infrastructure 1004 that comprises at least one hypervisor. Such implementations can provide software code change completion time prediction functionality of the type described above for one or more processes running on a given one of the VMs. For example, each of the VMs can implement machine learning-based prediction control logic and associated functionality for evaluating predicted completion times for one or more processes running on that particular VM.
- An example of a hypervisor platform that may be used to implement a hypervisor within the virtualization infrastructure 1004 is VMware® vSphere®, which may have an associated virtual infrastructure management system such as the VMware® vCenter™. The underlying physical machines may comprise one or more distributed processing platforms that include one or more storage systems.
- In other implementations of the FIG. 10 embodiment, the VMs/container sets 1002 comprise respective containers implemented using virtualization infrastructure 1004 that provides operating system level virtualization functionality, such as support for Docker containers running on bare metal hosts, or Docker containers running on VMs. The containers are illustratively implemented using respective kernel control groups of the operating system. Such implementations can provide machine learning-based prediction functionality of the type described above for one or more processes running on different ones of the containers. For example, a container host device supporting multiple containers of one or more container sets can implement one or more instances of machine learning-based prediction control logic and associated functionality for evaluating predicted completion times.
- As is apparent from the above, one or more of the processing modules or other components of
system 110 may each run on a computer, server, storage device or other processing platform element. A given such element may be viewed as an example of what is more generally referred to herein as a “processing device.” Thecloud infrastructure 1000 shown inFIG. 10 may represent at least a portion of one processing platform. Another example of such a processing platform is processingplatform 1100 shown inFIG. 11 . - The
processing platform 1100 in this embodiment comprises at least a portion of the given system and includes a plurality of processing devices, denoted 1102-1, 1102-2, 1102-3, . . . 1102-K, which communicate with one another over anetwork 1104. Thenetwork 1104 may comprise any type of network, such as a WAN, a LAN, a satellite network, a telephone or cable network, a cellular network, a wireless network such as WiFi or WiMAX, or various portions or combinations of these and other types of networks. - The processing device 1102-1 in the
processing platform 1100 comprises aprocessor 1110 coupled to amemory 1112. Theprocessor 1110 may comprise a microprocessor, a microcontroller, an ASIC, an FPGA or other type of processing circuitry, as well as portions or combinations of such circuitry elements, and thememory 1112, which may be viewed as an example of a “processor-readable storage media” storing executable program code of one or more software programs. - Articles of manufacture comprising such processor-readable storage media are considered illustrative embodiments. A given such article of manufacture may comprise, for example, a storage array, a storage disk or an integrated circuit containing RAM, ROM or other electronic memory, or any of a wide variety of other types of computer program products. The term “article of manufacture” as used herein should be understood to exclude transitory, propagating signals. Numerous other types of computer program products comprising processor-readable storage media can be used.
- Also included in the processing device 1102-1 is
network interface circuitry 1114, which is used to interface the processing device with thenetwork 1104 and other system components, and may comprise conventional transceivers. - The
other processing devices 1102 of theprocessing platform 1100 are assumed to be configured in a manner similar to that shown for processing device 1102-1 in the figure. - Again, the
particular processing platform 1100 shown in the figure is presented by way of example only, and the given system may include additional or alternative processing platforms, as well as numerous distinct processing platforms in any combination, with each such platform comprising one or more computers, storage devices or other processing devices. - Multiple elements of an information processing system may be collectively implemented on a common processing platform of the type shown in
FIG. 10 or 11 , or each such element may be implemented on a separate processing platform. - For example, other processing platforms used to implement illustrative embodiments can comprise different types of virtualization infrastructure, in place of or in addition to virtualization infrastructure comprising virtual machines. Such virtualization infrastructure illustratively includes container-based virtualization infrastructure configured to provide Docker containers or other types of LXCs.
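- Purely by way of illustration, the arrangement of processing platform 1100 described above (processing devices 1102-1 through 1102-K, each with a processor, a memory and network interface circuitry, communicating over a network) might be pictured as the following minimal data model. This is a sketch only, not part of the disclosed embodiments: the class names, the `memory_gb` field, the `can_communicate` helper and the choice of a LAN with three devices are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ProcessingDevice:
    """One processing device (e.g., 1102-1): processor, memory, network interface."""
    name: str
    processor: str                       # e.g., "microprocessor", "ASIC", "FPGA"
    memory_gb: int                       # hypothetical capacity field, not from the source
    network_interface: str = "transceiver"


@dataclass
class ProcessingPlatform:
    """A set of processing devices that communicate over one network."""
    network: str                         # e.g., "LAN", "WAN", "cellular"
    devices: Dict[str, ProcessingDevice] = field(default_factory=dict)

    def add_device(self, device: ProcessingDevice) -> None:
        self.devices[device.name] = device

    def can_communicate(self, a: str, b: str) -> bool:
        # Assumption: any two devices registered on the platform share its network.
        return a in self.devices and b in self.devices


def build_platform(k: int) -> ProcessingPlatform:
    """Build a platform with K similarly configured devices, 1102-1 .. 1102-K."""
    platform = ProcessingPlatform(network="LAN")
    for i in range(1, k + 1):
        platform.add_device(ProcessingDevice(f"1102-{i}", "microprocessor", 16))
    return platform


platform = build_platform(3)
print(sorted(platform.devices))
```

In this sketch every device is configured identically, mirroring the statement that the other processing devices 1102 are assumed to be configured in a manner similar to processing device 1102-1; an actual platform could mix device types freely.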
- As another example, portions of a given processing platform in some embodiments can comprise converged infrastructure such as VxRail™, VxRack™, VxBlock™, or Vblock® converged infrastructure commercially available from Dell Technologies.
- It should therefore be understood that in other embodiments different arrangements of additional or alternative elements may be used. At least a subset of these elements may be collectively implemented on a common processing platform, or each such element may be implemented on a separate processing platform.
- Also, numerous other arrangements of computers, servers, storage devices or other components are possible in the information processing system. Such components can communicate with other elements of the information processing system over any type of network or other communication media.
- As indicated previously, components of an information processing system as disclosed herein can be implemented at least in part in the form of one or more software programs stored in memory and executed by a processor of a processing device. For example, at least portions of the functionality shown in one or more of the figures are illustratively implemented in the form of software running on one or more processing devices.
- It should again be emphasized that the above-described embodiments are presented for purposes of illustration only. Many variations and other alternative embodiments may be used. For example, the disclosed techniques are applicable to a wide variety of other types of information processing systems. Also, the particular configurations of system and device elements and associated processing operations illustratively shown in the drawings can be varied in other embodiments. Moreover, the various assumptions made above in the course of describing the illustrative embodiments should also be viewed as exemplary rather than as requirements or limitations of the disclosure. Numerous other alternative embodiments within the scope of the appended claims will be readily apparent to those skilled in the art.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/536,654 US20230168883A1 (en) | 2021-11-29 | 2021-11-29 | Machine learning-based prediction of completion time of software code changes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/536,654 US20230168883A1 (en) | 2021-11-29 | 2021-11-29 | Machine learning-based prediction of completion time of software code changes |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230168883A1 (en) | 2023-06-01 |
Family
ID=86500092
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/536,654 Pending US20230168883A1 (en) | 2021-11-29 | 2021-11-29 | Machine learning-based prediction of completion time of software code changes |
Country Status (1)
Country | Link |
---|---|
US (1) | US20230168883A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170364824A1 (en) * | 2016-06-21 | 2017-12-21 | International Business Machines Corporation | Contextual evaluation of process model for generation and extraction of project management artifacts |
US20210055995A1 (en) * | 2019-08-19 | 2021-02-25 | EMC IP Holding Company LLC | Method and Apparatus for Predicting Errors in To-Be-Developed Software Updates |
US20220350588A1 (en) * | 2021-04-30 | 2022-11-03 | Microsoft Technology Licensing, Llc | Intelligent generation and management of estimates for application of updates to a computing device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230162063A1 (en) | Interpretability-based machine learning adjustment during production | |
US20180240062A1 (en) | Collaborative algorithm development, deployment, and tuning platform | |
US10217054B2 (en) | Escalation prediction based on timed state machines | |
US11461679B2 (en) | Message management using machine learning techniques | |
US11301226B2 (en) | Enterprise deployment framework with artificial intelligence/machine learning | |
US11373131B1 (en) | Automatically identifying and correcting erroneous process actions using artificial intelligence techniques | |
US10771562B2 (en) | Analyzing device-related data to generate and/or suppress device-related alerts | |
JP7461696B2 (en) | Method, system, and program for evaluating resources in a distributed processing system | |
US20220327012A1 (en) | Software validation framework | |
US20220114401A1 (en) | Predicting performance of machine learning models | |
US11513925B2 (en) | Artificial intelligence-based redundancy management framework | |
US11392821B2 (en) | Detecting behavior patterns utilizing machine learning model trained with multi-modal time series analysis of diagnostic data | |
US20210044499A1 (en) | Automated operational data management dictated by quality of service criteria | |
AU2022224845B2 (en) | Intelligent vulnerability lifecycle management system | |
US11501155B2 (en) | Learning machine behavior related to install base information and determining event sequences based thereon | |
US20210135966A1 (en) | Streaming and event management infrastructure performance prediction | |
US10999393B2 (en) | Cloud broker for connecting with enterprise applications | |
US20230168883A1 (en) | Machine learning-based prediction of completion time of software code changes | |
US20230236923A1 (en) | Machine learning assisted remediation of networked computing failure patterns | |
US11900325B2 (en) | Utilizing a combination of machine learning models to determine a success probability for a software product | |
Sabharwal et al. | Hands-on AIOps | |
US11410121B2 (en) | Proactively predicting large orders and providing fulfillment support related thereto | |
Provatas et al. | Selis bda: Big data analytics for the logistics domain | |
Mohanty et al. | It operations and ai | |
US11971907B2 (en) | Component monitoring framework with predictive analytics |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: DELL PRODUCTS L.P., TEXAS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SREEDHARAN, VINEETH; GR, KISHORE; JANAKIRAMAN, TAMILARASAN; REEL/FRAME: 058228/0368. Effective date: 20211118 |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |