WO2013184952A1 - Method for automatic extraction of designs from standard source code - Google Patents


Info

Publication number
WO2013184952A1
Authority
WO
WIPO (PCT)
Prior art keywords
code
kernels
metadata
database
bufferinfo
Prior art date
Application number
PCT/US2013/044573
Other languages
French (fr)
Inventor
Kevin D. Howard
Original Assignee
Massively Parallel Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/490,345 external-priority patent/US8762946B2/en
Application filed by Massively Parallel Technologies, Inc. filed Critical Massively Parallel Technologies, Inc.
Publication of WO2013184952A1 publication Critical patent/WO2013184952A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/70Software maintenance or management
    • G06F8/74Reverse engineering; Extracting design information from source code

Definitions

  • each kernel file includes the source code file name concatenated with either the letter P (for process) or the letter C (for control), along with consecutive numbering. Examples of kernel file names are shown below:
  • sourceCodeFile_P1 (), sourceCodeFile_P2(), sourceCodeFile_PN() or
  • Groups of proto-process kernels that are linked together with control flows are considered algorithms.
  • Groups of algorithms that are linked together with control flows are also considered algorithms.
  • mptStartingAddressDetector() obtains the addresses, types and sizes of all variables for the data dictionary, described in the following section.
  • a condition contains logical mathematical expressions with variables and constants associated with a control flow.
  • a process transformation accepts, produces and transforms data.
  • a decomposition object function such as "Add keyword list” is selected in a drop-down box 1506, in response to which, a list 1507 of keywords (or other appropriate data) to be associated with the code block is entered in block 1508.
  • keywords or other appropriate data
  • the association between the entered information and the selected object is stored in keyword list 1507 in digital memory (e.g., in data and program storage area 190).
  • Loop values for a process can be set and viewed by selecting a loop symbol 1503, and I/O metadata in data flow can be set and viewed by selecting a corresponding arrow 1504.
  • Figure 16 is an exemplary diagram showing how this candidate list 1610 is generated.
  • a keyword search is performed for keyword matches (indicated by arrow 1605) between a transformation process 1601 (via keyword list 1508) and candidate code blocks 1610, to determine all possible matching code blocks [1601(1), 1601(2) ... 1601(n)], which are stored in a first list 1610.
  • FIG. 17 is an exemplary diagram illustrating the present method of determining which code blocks (in list 1610) have looping structures corresponding to a selected process, in order to shrink list 1610 (cull it) and also to determine if test procedures can be run against the various code blocks.
  • I/O and loop information 1704 for the selected transformation process is compared with information 1702 relating to the I/O and loops of the various code blocks in list 1610, as shown in Figure 17, and those code blocks that do not match are removed, leaving a group of remaining code blocks in list 1710.
  • the present system can serialize any input dataset properly and save the data in a cloud or other environment. This data can then be used by any design by selecting the correct file name with the correct keyword list 1901 and field names/types. Standard file calls are treated as if they were database queries.
  • step 1915 the developer enters the database schema for each selected database type, as shown in Table 17 below.
  • a set of queries is attached to any database so that the database can be tested for correctness.
  • An exemplary set of test queries is shown below in Table 19.
  • List 2110 is further culled, as shown in Figure 22, at step 1935 (Figure 19), by executing queries 2204 defined by the F-type data store against the remaining files/databases 2110, as indicated by arrow 2206. If the query return values are incorrect, then those files/databases are culled, to generate list 2210. If there is more than one file/database, then the one that best meets the developer's overall goals is selected.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

A system and method for automatic code-design and file/database-design association. Existing source code is analyzed for process and control elements. The control elements are encapsulated as augmented state machines and the process elements are encapsulated as kernels. The new elements can then have metadata attached (including a name, I/O method, and test procedures), allowing software code sharing and automatic code/file/database upgrading, as well as allowing sub-subroutine level code blocks to be accessed directly.

Description

METHOD FOR AUTOMATIC EXTRACTION OF DESIGNS FROM STANDARD SOURCE CODE
BACKGROUND
[0001] Software code sharing is important, and the current state of the art allows for the sharing of subroutines (sometimes called methods) and libraries of subroutines. The term "subroutine" in computer science typically refers to a named block of code which may have a parameter list and which may have a return value. This block of code can be accessed from within another code block via the use of its name and parameter list. There can be significant amounts of code within the subroutine. Sharing portions of a subroutine is not possible unless the to-be-shared code portion is itself a subroutine. Rather than requiring that the entire subroutine be shared, it is more efficient to share only the portion of the subroutine that is actually needed.
[0002] Furthermore, in prior art software development environments, code and software design quickly become disassociated, thus making difficult the task of maintaining code/design and file/database/design association.
SUMMARY
[0003] The introduction of any new technology requires a bridging mechanism between past solutions and new capability. The present method forms a bridge between conventional programming and an advanced programming method by analyzing existing source code for process and control elements, then encapsulating the control elements as augmented state machines and process elements as kernels. The new elements can then have metadata attached, allowing software code sharing at the sub-subroutine level and automatic code/file/database upgrading, thus transforming the older technology into advanced technology.
[0004] Automatic code-design and file/database-design association allows a developer to simply perform the design, while locating and associating code or files/databases becomes automatic. Contrast this with source-code sharing models that require the developer to first find, then analyze, and finally associate blocks of code or locate and verify files and databases. Once code/files/databases and design can be reliably associated, then new, better code/files/databases can also be automatically located and used to replace existing code blocks, effectively allowing automatic code/file/database upgrading.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] Figure 1 is a system diagram showing an exemplary environment in which the present method operates;
[0006] Figure 2 is an exemplary diagram showing branching structures binding code segments in a function;
[0007] Figure 3 is a flowchart of a high-level exemplary algorithm for automatically extracting designs from standard source code;
[0008] Figure 4 is a flowchart of a detailed exemplary algorithm for automatically extracting designs from standard source code;
[0009] Figure 5 shows an example of a simplified level 0.0 decomposition;
[0010] Figure 6 shows an example of a translation of pass-by-value to the present decomposition format;
[0011] Figure 7 and Figure 8 illustrate examples of functional decomposition in accordance with the present method;
[0012] Figure 9 is an exemplary decomposition diagram showing three decomposition levels;
[0013] Figure 10 and Figure 11 show exemplary relationships between control transforms, process transforms, data stores and terminators;
[0014] Figure 12 shows an exemplary decomposition carried to a McCabe code block level;
[0015] Figure 12A is a flowchart showing an exemplary set of high-level steps performed in sharing sub-subroutine level software;
[0016] Figure 13 is a flowchart showing an exemplary set of steps performed in associating code/files/databases and design;
[0017] Figure 14 is a computer screen display 1400 showing an example of how metadata can be associated with code blocks or kernels;
[0018] Figure 15 is an exemplary diagram showing an initial step in one method of associating metadata with a transformation process using a computer-implemented procedure;
[0019] Figure 16 is an exemplary diagram showing how a candidate list is generated;
[0020] Figure 17 is an exemplary diagram illustrating the present method of determining which code blocks have looping structures corresponding to a selected process;
[0021] Figure 18 is an exemplary diagram illustrating the present method of determining which code blocks (in list 1710) provide correct results executing specified test procedures;
[0022] Figure 19 is a flowchart 1900 showing an exemplary set of steps performed in automatically attaching files and databases to design elements;
[0023] Figures 20, 21, and 22 are exemplary diagrams showing a process of automatically associating databases and design elements.
DETAILED DESCRIPTION
Definitions
[0024] The following terms and concepts used herein are defined below.
[0025] Data transformation - A data transformation is a task that accepts data as input and transforms the data to generate output data.
[0026] Control transformation - A control transformation evaluates conditions and sends and receives control to/from other control transformations and/or data transformations.
[0027] Control bubble - A control bubble is a graphical indicator of a control transformation. A control bubble symbol indicates a structure that performs only transitions and does not perform processing.
[0028] Process bubble - A process bubble is a graphical indicator of a data transformation.
[0029] Control Kernel - A control kernel is a software routine or function that contains only the following types of computer language constructs: declaration statements, subroutine calls, looping statements (for, while, do, etc.), decision statements (if-else, etc.), arithmetic statements (including increment and decrement operators), relational operators, logical operators, type declarations and branching statements (goto, jump, continue, exit, etc.).
[0030] Process Kernel - A process kernel is a software routine or function that contains the following types of computer language constructs: assignment statements, looping statements, arithmetic operators (including increment and decrement operators), and type declaration statements. Information is passed to and from a process kernel via global memory using RAM.
[0031] Function - a software routine, or more simply an algorithm that performs one or more transformations.
[0032] Node - A node is a processing element comprised of a processing core, or processor, memory and communication capability.
[0033] Metadata - Metadata is information about an entity, rather than the entity itself.
[0034] MPT Algorithm - An MPT algorithm comprises control kernels, process kernels, and MPT algorithms.
[0035] MPT Data Transfer Model - The MPT data transfer model comprises a standard model for transferring information to/from a process kernel. The model includes a key, a starting address, a size, and a structure_index. The key is the current job number, the starting address is the information starting address, the size is the number of bytes the data construct uses, and the structure_index points to the struct definition that is used by the process kernel to interpret the memory locations accessed.
[0036] MPT State Machine - An MPT state machine is a two-dimensional matrix which links together all relevant control kernels into a single non-language construct that calls process kernels. Each row in an MPT state machine consists of an index, the subroutine to be called (or the symbol "NOP"), a conditional statement, an index to the next accessible row (when the condition is true, or an end-of-job symbol is encountered), and an index to the next accessible row (when the condition is false, or when an end-of-job symbol is encountered). Process kernels form the "states" of the state machine, while the activation of those states forms the state transitions. This eliminates the need for software linker-loaders.
[0037] State Machine Interpreter - for the purpose of the present document, a State Machine Interpreter is a method whereby the states and state transitions of a state machine are used as active software, rather than as documentation.
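The row layout described for the MPT state machine, and the idea of interpreting that table as active software, can be illustrated with a small sketch. The following C fragment is only one possible reading of the definitions above, not the implementation of the present system: the names mpt_state_row, mpt_run, MPT_END_OF_JOB and the demo kernels are hypothetical, a NULL kernel pointer stands in for the "NOP" symbol, and a NULL condition is assumed to mean "always true".

```c
#include <assert.h>
#include <stddef.h>

#define MPT_END_OF_JOB (-1)   /* hypothetical end-of-job symbol */

typedef void (*mpt_kernel_fn)(void *ctx);         /* process kernel; NULL means "NOP" */
typedef int (*mpt_condition_fn)(const void *ctx); /* conditional statement of a row */

/* One row of the two-dimensional matrix: index, subroutine, condition,
   next row when the condition is true, next row when it is false. */
typedef struct {
    int index;
    mpt_kernel_fn kernel;
    mpt_condition_fn condition;
    int next_if_true;
    int next_if_false;
} mpt_state_row;

/* Interprets the table starting at row 0 and returns the number of rows visited. */
int mpt_run(const mpt_state_row *rows, size_t row_count, void *ctx) {
    int current = 0;
    int visited = 0;
    assert(rows != NULL && row_count > 0);
    while (current != MPT_END_OF_JOB) {
        assert(current >= 0 && (size_t)current < row_count);
        const mpt_state_row *row = &rows[current];
        if (row->kernel != NULL) {
            row->kernel(ctx);                     /* activate the state */
        }
        int taken = (row->condition == NULL) || row->condition(ctx);
        current = taken ? row->next_if_true : row->next_if_false;
        visited++;
    }
    return visited;
}

/* Demo context and kernels used only to exercise the interpreter. */
typedef struct { int counter; int limit; } mpt_demo_ctx;

void mpt_demo_increment(void *ctx) {
    ((mpt_demo_ctx *)ctx)->counter++;
}

int mpt_demo_below_limit(const void *ctx) {
    const mpt_demo_ctx *c = ctx;
    return c->counter < c->limit;
}
```

In this sketch a loop is expressed purely as table data: a row whose true branch points back to itself repeats its kernel until the condition fails, with no loop statement in the kernel itself.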
Computing Environment
[0038] Figure 1 is an exemplary diagram of the computing environment in which the present system and method operates. As shown in Figure 1, system 100 includes a processor 101 which executes tasks and programs including a kernel management module 110, an algorithm management module 105, state machine 124, a kernel execution module 130, and an algorithm execution module 125. System 100 further includes storage 107, in which is stored data including libraries 115/120 which respectively store algorithms 117 and kernels 122. Storage 107 may be RAM, or a combination of RAM and other storage such as a disk drive. Module 102 performs a translation of a graphical input functional decomposition diagram to corresponding functions (ultimately, states in a state machine), and stores the translated functions in appropriate libraries in storage area 108. Module 103 generates appropriate finite state machines from the translated functions.
[0039] System 100 is coupled to a host management system 145, which provides management of system functions, and issues system requests. Algorithm execution module 125 initiates execution of kernels invoked by algorithms that are executed. Algorithm execution system 135 may comprise any computing system with multiple computing nodes 140 which can execute kernels stored in system 100. Management system 145 can be any external client computer system which requests services from the present system 100. These services include requesting that kernels or algorithms be added/changed/deleted from a respective library within the current system.
[0040] The software for system services that are indicated below as being initiated by various corresponding 'buttons' is stored in data and program storage area 190.
[0041] In addition, management system 145 can request that a kernel/algorithm be executed. It should be noted that the present system is not limited to the specific file names, formats and instructions presented herein. The methods described herein may be executed via system 100, or other systems compatible therewith.
Software functional structure
[0042] Standard software is constructed using functions (sometimes also called methods, routines, or algorithms) and code segments to instantiate application concepts. A code segment is comprised of one or more code statements. Functions typically contain code segments bound together with branching or looping structures, as illustrated in the exemplary diagram of Figure 2. As shown in Figure 2, code segment 0 (ref. no. 2010) has two branches, 201 and 202, which respectively branch to code segments 1 (2011) and 2 (2012). Code segment 3 (2013) includes a loop 203. In the Figure 2 example, code segment 1 and code segment 2 both transfer execution to linear code segment 4 (2014).
[0043] Table 1, below, shows the branching and looping commands used by the C language, for example.
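As a rough illustration of code segments bound together by branching and looping structures, the following C sketch marks five segments inside one function. It is only an assumed arrangement: the function name run_segments and the arithmetic are hypothetical, and the position of the looping segment relative to the others is a guess, since the text does not say how code segment 3 connects to the rest of Figure 2.

```c
#include <assert.h>

/* Hypothetical function whose body is split into code segments by control structures. */
int run_segments(int input) {
    int value;
    assert(input >= 0);

    /* code segment 0: straight-line code that ends in a two-way branch */
    value = input * 2;
    if (value > 10) {
        /* code segment 1: reached through the first branch */
        value -= 10;
    } else {
        /* code segment 2: reached through the second branch */
        value += 1;
    }

    /* code segment 4: linear code that both branches transfer to */
    value *= 3;

    /* code segment 3: a segment that contains a loop (placement assumed) */
    for (int i = 0; i < 2; i++) {
        value += i;
    }
    return value;
}
```

Each commented region contains only straight-line statements, so the control structures (the if/else and the for) are the boundaries at which the function would be cut into separate segments.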
TABLE 1
Figure imgf000007_0001
[0044] There are two additional types of statements in the C language: storage declaration and operator, as respectively shown in Table 2 and Table 3, below. Note that although C language code is shown in all examples, any programming language can be analyzed similarly.
TABLE 2
Figure imgf000008_0001
Figure imgf000009_0001
Register Unsigned Double 8
TABLE 3
Operator type              C-language operator
Unary Operator             *
Unary Operator             &
Unary Operator             -
Unary Operator             !
Unary Operator             ++ lvalue
Unary Operator             -- lvalue
Unary Operator             lvalue ++
Unary Operator             lvalue --
Unary Operator             return
Unary Operator             (type-name) expression
Unary Operator             sizeof expression
Unary Operator             sizeof (type-name)
Multiplicative Operator    expression * expression
Multiplicative Operator    expression / expression
Multiplicative Operator    expression % expression
Additive Operator          expression + expression
Additive Operator          expression - expression
Shift Operator             expression << expression
Shift Operator             expression >> expression
Relational Operator        expression < expression
Relational Operator        expression > expression
Relational Operator        expression <= expression
Relational Operator        expression >= expression
Equality Operator          expression == expression
Equality Operator          expression != expression
Bitwise Operator           expression & expression
Bitwise Operator           expression ^ expression
Bitwise Operator           expression | expression
Logical Operator           expression && expression
Logical Operator           expression || expression
Assignment Operator        lvalue = expression
Assignment Operator        lvalue += expression
Assignment Operator        lvalue -= expression
Assignment Operator        lvalue *= expression
Assignment Operator        lvalue /= expression
Assignment Operator        lvalue %= expression
Assignment Operator        lvalue >>= expression
Assignment Operator        lvalue <<= expression
Assignment Operator        lvalue &= expression
Assignment Operator        lvalue ^= expression
Assignment Operator        lvalue |= expression
[0045] Figure 3 is a high-level exemplary algorithm 300 showing the present method for automatically extracting designs from standard source code. As shown in Figure 3, at step 305, the branching and looping commands are identified in a code segment 200 of interest. With the branching and looping commands identified, the code segments are extracted as process kernels 322 without metadata, at step 310. Control kernels 331 are then extracted at step 315. At step 320, the control kernels 331 are then encapsulated as MPT state machines 321 and the process kernels are encapsulated as process kernels 322. The extracted information is treated as an 'MPT algorithm' 301.
[0046] At step 325, metadata 360 is then associated with these newly- created control and process design elements. The metadata can be used to associate the newly extracted design elements with code other than the original code used in the extraction process, as described further below.
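The first step of Figure 3, identifying branching and looping commands in a code segment, can be sketched as a simple keyword scan. The C fragment below is an assumption-laden illustration rather than the present system's parser: Table 1 is available only as an image, so the keyword list CONTROL_KEYWORDS is a guess based on standard C, the function name count_control_keywords is hypothetical, and the scan does not skip string literals or comments.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed list of C branching and looping keywords (Table 1 is not reproduced as text). */
static const char *const CONTROL_KEYWORDS[] = {
    "if", "else", "switch", "for", "while", "do",
    "goto", "break", "continue", "return"
};

static int is_ident_char(int c) {
    return isalnum(c) || c == '_';
}

/* Counts whole-word occurrences of control keywords in a source string. */
size_t count_control_keywords(const char *source) {
    size_t count = 0;
    const char *p = source;
    assert(source != NULL);
    while (*p != '\0') {
        if (is_ident_char((unsigned char)*p)) {
            /* Collect one identifier-like token and compare it against the list. */
            const char *start = p;
            while (is_ident_char((unsigned char)*p)) {
                p++;
            }
            size_t len = (size_t)(p - start);
            size_t n = sizeof(CONTROL_KEYWORDS) / sizeof(CONTROL_KEYWORDS[0]);
            for (size_t k = 0; k < n; k++) {
                if (strlen(CONTROL_KEYWORDS[k]) == len &&
                    strncmp(start, CONTROL_KEYWORDS[k], len) == 0) {
                    count++;
                    break;
                }
            }
        } else {
            p++;
        }
    }
    return count;
}
```

Matching whole tokens rather than substrings keeps identifiers such as do_work or iffy from being mistaken for control statements. A production extractor would need a real tokenizer or parser to locate segment boundaries reliably.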
Example source code for MPT algorithm
[0047] Figure 4 is a detailed exemplary algorithm 400 for automatically extracting designs from standard source code. As shown in Figure 4, initially, a system user locates the desired source code segment 401 in the file containing the computer program whose design is to be extracted. An example of a C language code segment 401 is shown below in Table 4. This example is used throughout the remainder of this document.
TABLE 4
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define BUFFERSIZE 1024*1024
typedef struct {
    unsigned int buffer1[BUFFERSIZE];
    unsigned int buffer2[BUFFERSIZE];
    char test[12];
} sample_buffer;
typedef struct {
    int test1;
    int test2;
    int test3;
} sample_buffer1;
typedef struct {
    sample_buffer *sample_buffer2;
    char test[12];
} buffer_info;
int main(int argc, char *argv[]) {
    unsigned int index;
    char test_string[10];
    buffer_info *bufferinfo;
    sample_buffer1 *sampleinfo;
    if ((bufferinfo = (buffer_info *) malloc(sizeof(buffer_info))) == NULL) {
        printf("ERROR ALLOCATING bufferinfo\n");
        goto cleanup2;
    }
    if ((bufferinfo->sample_buffer2 = (sample_buffer *) malloc(sizeof(sample_buffer))) == NULL) {
        printf("ERROR ALLOCATING bufferinfo->sample_buffer2\n");
        exit(EXIT_FAILURE);
    }
    if ((sampleinfo = (sample_buffer1 *) malloc(sizeof(sample_buffer1))) == NULL) {
        printf("ERROR ALLOCATING sampleinfo\n");
        goto cleanup1;
    }
    for (index = 0; index < sizeof(buffer_info); index++) {
        bufferinfo->sample_buffer2->buffer1[index] = index;
        bufferinfo->sample_buffer2->buffer2[index] = index + 1;
    }
    /* C arrays cannot be assigned with "="; copy the bytes into the fixed-size fields (not NUL-terminated) */
    memcpy(bufferinfo->sample_buffer2->test, "testtesttest", sizeof(bufferinfo->sample_buffer2->test));
    memcpy(bufferinfo->test, "testtesttest", sizeof(bufferinfo->test));
    sampleinfo->test1 = 1;
    sampleinfo->test2 = 2;
    sampleinfo->test3 = 3;
cleanup1:
    free(bufferinfo->sample_buffer2);
cleanup2:
    free(bufferinfo);
    return (0);
}
Extracting Subroutines
[0048] All procedural computer languages have the concept of a subroutine. A subroutine is a sequence of instructions for performing a particular task. This sequence can be called from multiple places within a computer program. Subroutines can call other subroutines, including themselves (called recursion). Subroutines that are called primarily for their return value are known as functions. In object-oriented programming, subroutines or functions with limited execution scope are called methods. Because programs can call subroutines which can call other subroutines, the hierarchical decomposition structure of a computer program is obtained by tracking the subroutine calls in that program. In the present system, a single linear transformation having no process flow is called a control kernel. Multiple process kernels connected via flow control are called algorithms. Algorithms can contain other algorithms as well as kernels. This means that an algorithm is equivalent to a subroutine.
[0049] As shown in Figure 4, at step 405 (in a C language program, for example), the "Main" routine (or other source code segment of interest) 401 is first searched for any user-defined subroutines (e.g., user-defined functions and methods). Next, each subroutine is placed in its own file (along with any required header files). Each subroutine file is then edited to have an ".AUG" extension to create a corresponding .AUG file 403. A tracking file (".TRK") 404 is then created to track the hierarchy of the subroutines. In one embodiment, the .TRK file has the following format:
Main
  Level 1 Subroutine Name
    Level 2 Subroutine Name
      ... Level N Subroutine Name
      ... Level N Subroutine Name
      ... Level N Subroutine Name
    Level 2 Subroutine Name
    Level 2 Subroutine Name
  Level 1 Subroutine Name
  Level 1 Subroutine Name
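A tracking file of this shape could be produced by walking the subroutine call tree depth first and indenting each name by its level. The C sketch below shows one way to do that under stated assumptions: the trk_node structure, the trk_format function and the two-space indentation are all hypothetical, and the output is written to a memory buffer rather than to an actual .TRK file so that it is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical call-tree node: a subroutine name and the subroutines it calls. */
typedef struct trk_node {
    const char *name;
    const struct trk_node *children;  /* array of callees, may be NULL */
    size_t child_count;
} trk_node;

/* Appends the node and its callees to out, indenting two spaces per level.
   Returns the new length, or (size_t)-1 if the buffer is too small. */
size_t trk_format(const trk_node *node, int depth, char *out, size_t cap, size_t used) {
    assert(node != NULL && out != NULL && used < cap);
    int n = snprintf(out + used, cap - used, "%*s%s\n", depth * 2, "", node->name);
    if (n < 0 || (size_t)n >= cap - used) {
        return (size_t)-1;
    }
    used += (size_t)n;
    for (size_t i = 0; i < node->child_count; i++) {
        used = trk_format(&node->children[i], depth + 1, out, cap, used);
        if (used == (size_t)-1) {
            return used;
        }
    }
    return used;
}
```

For a root named "Main" with callees "init" (which calls "parse_args") and "cleanup", the buffer would hold four lines whose indentation mirrors the level structure shown above.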
Extracting variables
[0050] Almost all control structures require accessing variables, pointers, and/or arrays. The control (looping) statement below is an example:
for (index = 0; index < sizeof(buffer_info); index++)
[0051] The statement above requires that the variable index be accessed. Accessing variables, pointers, and arrays requires determining their starting address and type. Therefore, at step 410, the starting address and type is determined for each of these entities.
[0052] In the case of "buffer_info", it also requires running "malloc()" and "sizeof()" prior to running the entire code segment to determine the number of bytes used by the "buffer_info" data structure.
[0053] In the C and C++ languages, the use of the following commands creates the required dynamic memory allocation: "malloc()", "calloc()", "realloc()", and "new type()". In addition, there are arrays that are dynamically allocated at runtime. All of these structures dynamically allocate heap space. Thus, for every command that dynamically allocates memory, the required dynamic memory allocation is created for each routine for each program thread. The C language also has the ability to take the address of any variable and write any value starting at that address.
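Because the starting address and size of a heap object are known only after the allocation has run, one way to capture them is to route allocations through a small recording wrapper. The following C sketch is a hypothetical illustration of that idea, not the mechanism of the present system: alloc_record, alloc_table, tracked_malloc and the fixed table capacity MAX_TRACKED are assumed names and choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TRACKED 16   /* arbitrary capacity chosen for the sketch */

/* One recorded allocation: variable name, starting address and size in bytes. */
typedef struct {
    const char *name;
    uintptr_t start;
    size_t size;
} alloc_record;

typedef struct {
    alloc_record records[MAX_TRACKED];
    size_t count;
} alloc_table;

/* Allocates size bytes and, on success, records where the block starts and how large it is. */
void *tracked_malloc(alloc_table *table, const char *name, size_t size) {
    assert(table != NULL && name != NULL);
    void *block = malloc(size);
    if (block != NULL && table->count < MAX_TRACKED) {
        alloc_record *r = &table->records[table->count++];
        r->name = name;
        r->start = (uintptr_t)block;
        r->size = size;
    }
    return block;
}
```

Storing the address as uintptr_t avoids truncating pointers on platforms where a pointer is wider than unsigned int.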
[0054] Table 5, below, shows the extracted variables, constants, structures, and #defines (all of which are highlighted) for the example code segment shown in Table 4. This table is known as the Variables and Constants Table or VCT 412.
TABLE 5
Figure imgf000014_0001
[0055] The variables, pointers, and arrays shown in Table 5 are constructed variables. Constructed variables are all possible variables that can be constructed using the structure definitions given. Not all constructed variables are used in the present sample code, but all are possible.
[0056] Before variables can be extracted, the "#defines" and "structs" are extracted by parsing these elements from the source code, at step 415, wherein the source code file is opened and scanned for any "#defines" or "structs". Any found items are placed into a file 402 with the same name as the source code file but with an ".ETR" file name extension. In Table 6, below, the found "#defines" and "structs" are indicated by italics.
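The scan for "#defines" and "structs" can be pictured as a line-oriented filter over the source text. The C sketch below is a simplified illustration under several assumptions: the function name extract_definitions is hypothetical, definitions are assumed to start in the first column, struct bodies are tracked only by counting braces, and the result is copied into a memory buffer instead of being written to an .ETR file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int starts_with(const char *line, size_t len, const char *prefix) {
    size_t n = strlen(prefix);
    return len >= n && strncmp(line, prefix, n) == 0;
}

/* Copies "#define" lines and struct definitions from source into out.
   Returns the number of bytes copied, not counting the terminator. */
size_t extract_definitions(const char *source, char *out, size_t cap) {
    size_t used = 0;
    int depth = 0;   /* greater than zero while inside a struct body */
    assert(source != NULL && out != NULL && cap > 0);
    while (*source != '\0') {
        const char *eol = strchr(source, '\n');
        size_t len = eol ? (size_t)(eol - source) + 1 : strlen(source);
        int keep = depth > 0
                || starts_with(source, len, "#define")
                || starts_with(source, len, "typedef struct")
                || starts_with(source, len, "struct ");
        if (keep) {
            /* Track braces so that every line of a struct body is kept. */
            for (size_t i = 0; i < len; i++) {
                if (source[i] == '{') {
                    depth++;
                } else if (source[i] == '}' && depth > 0) {
                    depth--;
                }
            }
            if (used + len < cap) {
                memcpy(out + used, source, len);
                used += len;
            }
        }
        source += len;
    }
    out[used] = '\0';
    return used;
}
```

Applied to a small file containing an include line, one "#define", one typedef'd struct and a main function, the filter keeps only the "#define" line and the struct definition.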
TABLE 6
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define BUFFERSIZE 1024*1024
typedef struct {
    unsigned int buffer1[BUFFERSIZE];
    unsigned int buffer2[BUFFERSIZE];
    char test[10];
} sample_buffer;
typedef struct {
    int test1;
    int test2;
    int test3;
} sample_buffer1;
typedef struct {
    sample_buffer *sample_buffer2;
    char test[10];
} buffer_info;
int main(int argc, char *argv[]) {
    unsigned int index;
    char test_string[10];
    buffer_info *bufferinfo;
    sample_buffer1 *sampleinfo;
    if ((bufferinfo = (buffer_info *) malloc(sizeof(buffer_info))) == NULL) {
        printf("ERROR ALLOCATING bufferinfo\n");
        goto cleanup2;
    }
    if ((bufferinfo->sample_buffer2 = (sample_buffer *) malloc(sizeof(sample_buffer))) == NULL) {
        printf("ERROR ALLOCATING bufferinfo->sample_buffer2\n");
        exit(EXIT_FAILURE);
    }
    if ((sampleinfo = (sample_buffer1 *) malloc(sizeof(sample_buffer1))) == NULL) {
        printf("ERROR ALLOCATING sampleinfo\n");
        goto cleanup1;
    }
    for (index = 0; index < sizeof(buffer_info); index++) {
        bufferinfo->sample_buffer2->buffer1[index] = index;
        bufferinfo->sample_buffer2->buffer2[index] = index + 1;
    }
    /* C arrays cannot be assigned with "="; copy the bytes into the fixed-size fields (not NUL-terminated) */
    memcpy(bufferinfo->sample_buffer2->test, "testtesttest", sizeof(bufferinfo->sample_buffer2->test));
    memcpy(bufferinfo->test, "testtesttest", sizeof(bufferinfo->test));
    sampleinfo->test1 = 1;
    sampleinfo->test2 = 2;
cleanup1:
    free(bufferinfo->sample_buffer2);
cleanup2:
    free(bufferinfo);
    return (0);
}
[0057] Table 7, below, shows the placement of a function that is used within the source code file of the example code to update the "ETR" file 402. In the present example, the function "mptStartingAddressDetector()" (or equivalent), highlighted in bold text below, is used to determine the starting address of the "malloc()"'ed variables. The starting addresses are then stored by the system. The newly augmented source code file 403 uses the same name as the source code segment file 401 with the file extension changed to ".AUG".
[0058] At step 425, control and memory allocation statements are separated by modifying the "if" control statements that contain "malloc()" commands, moving the "malloc()" call out of each "if" statement.
TABLE 7 AUGMENTED SOURCE CODE FILE
ttinclude <stdlib.h>
#include <stdio.h>
#define BUFFERSIZE 1024*1024
typedef struct {
unsigned int bufferl [BUFFERSIZE] ;
unsigned int buffer2 [BUFFERSIZE] ;
char test[10]
} sample_buffer ;
typedef struct {
int testl
int test2
int test3
} sample_bufferl ;
typedef struct {
sample_buffer *sample_buffer2 ;
char test [10] ;
} buffer_info;
int main (int argc, char *argv[]) {
char *fileName;
FILE *fileNamePointer ;
stropy (mptFile,argv[0] ) ;
strcat (mptFile , " .ETR") ;
mptStartingAddressStart (filename , fileNamePointer) ;
unsigned int index;
mptStartingAddressDetector (fileNamePointer , "index" , (uint) &index) ;
char test_string [10] ;
mptStartingAddressDetector (fileNamePointer ,
"test_str;Lng", (uint) &test_string) ;
buffer_info *bufferinfo;
sample_bufferl *sampleinfo;
    bufferinfo = (buffer_info *) malloc(sizeof(buffer_info));
    if (bufferinfo == NULL) {
        printf("ERROR ALLOCATING bufferinfo\n");
        goto cleanup2;
    }
    mptStartingAddressDetector(fileNamePointer,
                               "bufferinfo",
                               (uint) bufferinfo);
    mptStartingAddressDetector(fileNamePointer,
                               "bufferinfo->test",
                               (uint) bufferinfo->test);
    bufferinfo->sample_buffer2 = (sample_buffer *) malloc(sizeof(sample_buffer));
    if (bufferinfo->sample_buffer2 == NULL) {
        printf("ERROR ALLOCATING bufferinfo->sample_buffer2\n");
        mptStartingAddressEnd(fileNamePointer);
        exit(EXIT_FAILURE);
    }
    mptStartingAddressDetector(fileNamePointer,
                               "bufferinfo->sample_buffer2",
                               (uint) bufferinfo->sample_buffer2);
    mptStartingAddressDetector(fileNamePointer,
                               "bufferinfo->sample_buffer2->buffer1[]",
                               (uint) bufferinfo->sample_buffer2->buffer1);
    mptStartingAddressDetector(fileNamePointer,
                               "bufferinfo->sample_buffer2->buffer2[]",
                               (uint) bufferinfo->sample_buffer2->buffer2);
    mptStartingAddressDetector(fileNamePointer,
                               "bufferinfo->sample_buffer2->test",
                               (uint) bufferinfo->sample_buffer2->test);
    sampleinfo = (sample_buffer1 *) malloc(sizeof(sample_buffer1));
    mptStartingAddressDetector(fileNamePointer,
                               "sampleinfo",
                               (uint) sampleinfo);
    if (sampleinfo == NULL) {
        printf("ERROR ALLOCATING sampleinfo\n");
        goto cleanup1;
    }
    index = 0;
MPTForLoop1:
    if (index < sizeof(buffer_info)) {
        bufferinfo->sample_buffer2->buffer1[index] = index;
        bufferinfo->sample_buffer2->buffer2[index] = index + 1;
        index++;
        goto MPTForLoop1;
    }
    bufferinfo->sample_buffer2->test = "testtesttest";
    bufferinfo->test = "testtesttest";
    sampleinfo->test1 = 1;
    sampleinfo->test2 = 2;
cleanup1:
    free(bufferinfo->sample_buffer2);
cleanup2:
    free(bufferinfo);
    mptStartingAddressEnd(fileNamePointer);
    return (0);
}
int mptStartingAddressStart(char *fileName, FILE *mptFilePointer) {
    if (fileName == NULL) {
        printf("illegal file name");
        exit(10000);
    }
    else {
        if ((mptFilePointer = fopen(mptFile, "a")) == NULL) {
            printf("Cannot open file");
            exit(10001);
        }
    }
    return (0);
}
int mptStartingAddressDetector(FILE *fileNamePointer, char *variableName, uint address)
{
    fprintf(fileNamePointer, "variable Name: %s Address: %u", variableName, address);
    return (0);
}
void mptStartingAddressEnd(FILE *fileNamePointer) {
    fclose(fileNamePointer);
}
[0059] Next, "for loops" are converted into an "if... goto" form, at step 430. The "if... goto" form exposes the process kernel and a control vector.
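The conversion at step 430 can be illustrated with a minimal sketch. The function names sum_with_for, sum_with_if_goto and check_forms_agree are hypothetical and do not appear in the example code; only the label style (MPTForLoop1) follows Table 7. Both functions compute the same sum, and the second shows how the "if... goto" form separates the loop condition from the loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary "for" loop: sums the values 0 .. n-1. */
unsigned long sum_with_for(size_t n) {
    unsigned long total = 0;
    for (size_t index = 0; index < n; index++) {
        total += index;
    }
    return total;
}

/* The same computation rewritten in "if... goto" form. */
unsigned long sum_with_if_goto(size_t n) {
    unsigned long total = 0;
    size_t index = 0;            /* loop initialization */
MPTForLoop1:
    if (index < n) {             /* loop condition, now an explicit branch */
        total += index;          /* loop body, now a linear code segment */
        index++;
        goto MPTForLoop1;        /* back edge of the loop */
    }
    return total;
}

/* Self-check: both forms must agree for the given n. */
void check_forms_agree(size_t n) {
    assert(sum_with_for(n) == sum_with_if_goto(n));
}
```

In this sketch the condition inside the "if" plays the role of the control information, and the statements between the "if" and the "goto" play the role of the process kernel.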
[0060] At step 435, the function "mptStartingAddressStart()" is inserted at the beginning of the code segment 401. When "mptStartingAddressStart()" is called, it opens the ETR file with the same name as the source code file, but with the file extension set to "ETR". Prior to any program exit or return call, the "mptStartingAddressEnd()" function is called, which closes the ETR file. See Table 5. All language-defined functions/methods are treated as part of the language, rather than as user-defined functions or methods. In the case of the C language, this means that code segments are not extracted from the function types listed in Table 8, below, which shows the C language functions:
TABLE 8
Figure imgf000019_0001
Figure imgf000020_0001
Figure imgf000021_0001
Extracting Process and Control Kernels
[0061] At step 440, the present system accesses the ".AUG" file 403 and creates a set of kernel files. Each kernel file includes the source code file name concatenated with either the letter P (for process) or the letter C (for control), along with consecutive numbering. Examples of kernel file names are shown below:
sourceCodeFile_P1(), sourceCodeFile_P2(), ..., sourceCodeFile_PN() or
SCF_P1(), SCF_P2(), ..., SCF_PN()
sourceCodeFile_C1(), sourceCodeFile_C2(), ..., sourceCodeFile_CN() or
SCF_C1(), SCF_C2(), ..., SCF_CN()
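The naming scheme above can be sketched as a small helper. The function make_kernel_name and its parameters are assumptions made for illustration; the source describes only the resulting names (source code file name, underscore, P or C, consecutive number).

```c
#include <assert.h>
#include <stdio.h>

/* Writes "<sourceName>_<kind><number>" into out, for example "SCF_P1".
   kind is 'P' for a process kernel or 'C' for a control kernel.
   Returns 0 on success, or -1 if the name does not fit into out. */
int make_kernel_name(char *out, size_t outSize,
                     const char *sourceName, char kind, unsigned number) {
    assert(out != NULL && sourceName != NULL);
    assert(kind == 'P' || kind == 'C');
    int written = snprintf(out, outSize, "%s_%c%u", sourceName, kind, number);
    return (written >= 0 && (size_t)written < outSize) ? 0 : -1;
}
```

Calling make_kernel_name(name, sizeof name, "SCF", 'C', 2) would produce the string "SCF_C2" in this sketch.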
[0062] Each added kernel indicates that it has completed, using the MptReturn kernel tracking variable. In an exemplary embodiment, this tracking variable is a sixty-four bit integer variable that saves the same process number as is placed on the kernel file name. The kernel number is placed prior to exiting the kernel. The "MptReturn" kernel variable is used by the MPT state machine to perform linear kernel transitions. The structural difference between a kernel and a function (in the C language) occurs at the parameter level.
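One possible reading of the linear kernel transitions is sketched below. This is not the MPT state machine itself, whose implementation the source does not give; the shared variable "value", the kernel bodies, and the function run_kernels are hypothetical. Only the idea that each kernel stores its own number in MptReturn before exiting, and that the next kernel is chosen from that number, is taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Kernel tracking variable: a sixty-four bit integer, as in the text. */
static int64_t MptReturn = 0;

/* Hypothetical shared data that the kernels operate on. */
static int value = 0;

/* Each kernel stores its own number in MptReturn just before exiting. */
static void SCF_P1(void) { value = 1;   MptReturn = 1; }
static void SCF_P2(void) { value += 10; MptReturn = 2; }
static void SCF_P3(void) { value *= 2;  MptReturn = 3; }

/* Linear transitions: kernel N+1 runs only after kernel N has reported
   completion through MptReturn. Returns the final value. */
int run_kernels(void) {
    MptReturn = 0;
    value = 0;
    for (;;) {
        switch (MptReturn) {
        case 0: SCF_P1(); break;
        case 1: SCF_P2(); break;
        case 2: SCF_P3(); break;
        default:
            assert(MptReturn == 3);   /* last kernel has completed */
            return value;
        }
    }
}
```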
[0063] A function has a parameter list, that is, an ordered group of input/output variables used by other functions and the main program to communicate with the target function. The information is communicated using either pass-by-reference or pass-by-value techniques. The only difference between the two techniques is that a copy of the data is created and made accessible when the pass-by-value technique is used, while a pointer to the actual data location is used during pass-by-reference.
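The difference between the two techniques can be shown with a minimal C sketch. The function names are hypothetical. In C, pass-by-reference is expressed by passing a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Pass-by-value: the function receives a copy of the data, so the
   caller's variable is not changed. */
void increment_by_value(int n) {
    n = n + 1;      /* modifies only the local copy */
    (void)n;
}

/* Pass-by-reference: the function receives a pointer to the actual data
   location, so the caller's variable is changed. */
void increment_by_reference(int *n) {
    assert(n != NULL);
    *n = *n + 1;
}
```

After "int x = 5; increment_by_value(x);" the variable x still holds 5, whereas "increment_by_reference(&x);" changes it to 6.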
[0064] The ordered-list nature of the parameter list adds a barrier to using a particular function. A kernel uses a parameter set, not a parameter list, so the order of the parameters makes no difference. Before a kernel can be made, the functions that will become the kernels must be generated. These functions are called proto-process kernels, and the example in Table 9, below, shows how they are extracted.
TABLE 9
#include <stdlib.h>
#include <stdio.h>
#define BUFFERSIZE 1024*1024
typedef struct {
    unsigned int buffer1[BUFFERSIZE];
    unsigned int buffer2[BUFFERSIZE];
    char test[10];
} sample_buffer;
typedef struct {
    int test1;
    int test2;
    int test3;
} sample_buffer1;
typedef struct {
    sample_buffer *sample_buffer2;
    char test[10];
} buffer_info;
int_64 MptLastReturnedKernal = 0;
int main(int argc, char *argv[]) {
    unsigned int index;
    mptStartingAddressDetector(argv[0], ".ETR", "index", &index);
    char test_string[10];
    mptStartingAddressDetector(argv[0], ".ETR",
                               "test_string", &test_string);
    buffer_info *bufferinfo;
    sample_buffer1 *sampleinfo;
    if (MptReturn == 0) SCF_P1(bufferinfo);
    if (bufferinfo == NULL) {
        printf("ERROR ALLOCATING bufferinfo\n");
        goto cleanup2;
    }
    mptStartingAddressDetector(argv[0],
                               ".ETR", "bufferinfo",
                               bufferinfo);
    mptStartingAddressDetector(argv[0],
                               ".ETR",
                               "bufferinfo->test",
                               bufferinfo->test);
    if (MptReturn == 1) SCF_P2(bufferinfo->sample_buffer2);
    if (bufferinfo->sample_buffer2 == NULL) {
        printf("ERROR ALLOCATING bufferinfo->sample_buffer2\n");
        exit(EXIT_FAILURE);
    }
    mptStartingAddressDetector(argv[0],
                               ".ETR",
                               "bufferinfo->sample_buffer2",
                               bufferinfo->sample_buffer2);
    mptStartingAddressDetector(argv[0],
                               ".ETR",
                               "bufferinfo->sample_buffer2->buffer1[]",
                               bufferinfo->sample_buffer2->buffer1);
    mptStartingAddressDetector(argv[0],
                               ".ETR",
                               "bufferinfo->sample_buffer2->buffer2[]",
                               bufferinfo->sample_buffer2->buffer2);
    mptStartingAddressDetector(argv[0],
                               ".ETR",
                               "bufferinfo->sample_buffer2->test",
                               bufferinfo->sample_buffer2->test);
    if (MptReturn == 2) SCF_P3(sampleinfo);
    mptStartingAddressDetector(argv[0],
                               ".ETR",
                               "sampleinfo",
                               sampleinfo);
    if (sampleinfo == NULL) {
        printf("ERROR ALLOCATING sampleinfo\n");
        goto cleanup1;
    }
    if (MptReturn == 3) SCF_P4(index);
MPTForLoop1:
    if (index < sizeof(buffer_info)) {
        if (MptReturn == 4) SCF_P5(bufferinfo, index);
        goto MPTForLoop1;
    }
    if (MptReturn == 5) SCF_P6(bufferinfo, sampleinfo);
cleanup1:
    free(bufferinfo->sample_buffer2);
cleanup2:
    free(bufferinfo);
    return (0);
}
int SCF_P1(buffer_info *bufferinfo) {
    bufferinfo = (buffer_info *) malloc(sizeof(buffer_info));
    MptReturn = 1;
}
int SCF_P2(sample_buffer *bufferinfo->sample_buffer2) {
    bufferinfo->sample_buffer2 = (sample_buffer *) malloc(sizeof(sample_buffer));
    MptReturn = 2;
}
int SCF_P3(sample_buffer1 *sampleinfo) {
    sampleinfo = (sample_buffer1 *) malloc(sizeof(sample_buffer1));
    MptReturn = 3;
}
int SCF_P4(int index) {
    index = 0;
    MptReturn = 4;
}
int SCF_P5(buffer_info *bufferinfo, int index) {
    bufferinfo->sample_buffer2->buffer1[index] = index;
    bufferinfo->sample_buffer2->buffer2[index] = index + 1;
    index++;
    MptReturn = 5;
}
int SCF_P6(buffer_info *bufferinfo, sample_buffer1 *sampleinfo) {
    bufferinfo->sample_buffer2->test = "testtesttest";
    bufferinfo->test = "testtesttest";
    sampleinfo->test1 = 1;
    sampleinfo->test2 = 2;
    MptReturn = 6;
}
[0065] Once the proto-process kernels are identified, their parameter lists are transformed into a parameter set, completing the kernel extraction process.
[0066] The proto-process kernel parameter lists are converted into parameter sets as follows:
[0067] 0) The proto-kernel is named as follows. If the proto-kernel is a subroutine or method, then the proto-kernel name is the subroutine or method name. If the proto-kernel is equivalent to a McCabe code block, then the name given is a concatenation of the source code file name, an underscore, a P (for process) and a number representing the order in which the kernel was created.
[0068] 1) All pass-by-value and pass-by-reference parameters are converted to input parameters and assigned to an input dataflow associated with the proto-kernel name.
[0069] 2) All pass-by-reference parameters are converted to output parameters and assigned to an output dataflow associated with the proto-kernel name.
[0070] 3) All non-parametric pass-by-reference variables are converted to input parameters and assigned to an input dataflow associated with the proto-kernel name.
[0071] 4) All non-parametric pass-by-reference variables are also converted to output parameters and assigned to an output dataflow associated with the proto-kernel name.
[0072] 5) Any branch statement is associated with an input control flow whose name is composed of the letter "C" concatenated with a number representing the order that the control flow was named.
[0073] 6) The conditional portion of the control statement becomes the transfer condition of the control flow.
[0074] 7) A "goto" statement consists of a branch and a target code block starting position. When the system encounters a "goto" statement, an "after process xxx" condition is placed on the control flow of the code block represented by the target code block starting position.
[0075] Groups of proto-process kernels that are linked together with control flows are considered algorithms. Groups of algorithms that are linked together with control flows are also considered algorithms.
[0076] All parameters are now associated with input and output dataflows. All input and output data-flows are associated with kernels and algorithms.
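Rules 1 and 2 of the conversion can be sketched as follows. The types Parameter and Dataflow, the capacity limit, and the function build_dataflows are assumptions made for illustration; the source does not specify how dataflows are stored. The sketch places every parameter on the input dataflow and additionally places pass-by-reference parameters on the output dataflow.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FLOW_PARAMS 16   /* hypothetical capacity of one dataflow */

typedef enum { PASS_BY_VALUE, PASS_BY_REFERENCE } PassMode;

typedef struct {
    const char *name;
    PassMode mode;
} Parameter;

typedef struct {
    const char *names[MAX_FLOW_PARAMS];
    size_t count;
} Dataflow;

/* Rule 1: every parameter goes on the input dataflow.
   Rule 2: pass-by-reference parameters also go on the output dataflow.
   Returns 0 on success, or -1 if there are too many parameters. */
int build_dataflows(const Parameter *params, size_t paramCount,
                    Dataflow *input, Dataflow *output) {
    assert(params != NULL && input != NULL && output != NULL);
    input->count = 0;
    output->count = 0;
    for (size_t i = 0; i < paramCount; i++) {
        if (input->count == MAX_FLOW_PARAMS) {
            return -1;
        }
        input->names[input->count++] = params[i].name;
        if (params[i].mode == PASS_BY_REFERENCE) {
            /* output never holds more entries than input, so this fits */
            output->names[output->count++] = params[i].name;
        }
    }
    return 0;
}
```

Because the result is a set of named entries on each dataflow, the original order of the parameter list no longer matters to a consumer that looks parameters up by name.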
[0077] At step 445, kernels are transformed into kernel processes (they do not decompose) and, at step 450, algorithms are transformed into algorithm type processes (they do decompose). These processes are used to generate a high level design, such as that shown in the graph in Figure 7 (described below). All kernels and algorithms are now associated with processes.
[0078] At step 455, kernel and algorithm code is extracted and saved as components each comprising separately executable code 460 and associated metadata 360 (e.g., keyword list 1407 (Fig. 14), etc.), if any. This separately executable code 460 can be accessed by matching its input/output parameter types, and keyword lists to design processes with the input/output parameter types and keyword list 1507 (Fig. 15). The extracted kernel and algorithm code are called code components or more simply components.
[0079] If a parameter resolves to an address, then that parameter represents a pass-by-reference. In the "C" programming language this is indicated by an asterisk in the parameter definition. Since a pass-by-reference requires that the data be copied to separate data store variables, the mptStartingAddressDetector() function obtains the addresses, types and sizes of all variables for the data dictionary, described in the following section.
[0080] Figure 5 is an example of a simplified level 0.0 decomposition, generated as the context level of Table 9. In the "C" programming language, the "Main" program always represents the program as a whole and its starting point, that is the context level of a decomposition diagram. As shown in Figure 5, a command line instruction (terminator 505) invokes process 'Main 0' 504, receives argc & argv 502 data, and returns any allocation errors 503.
[0081] Figure 6 shows the example code of Table 4 translated into decomposition diagrams. In the present high level design model, pass-by-reference is equivalent to a parameter simultaneously appearing on an input and an output dataflow. Figure 6 represents the decomposition of Main, that is, decomposition of all of the code blocks and user subroutines which occur within the scope of Main. All of the Figure 6 data and control flows come from the parameters and conditions found in Main. The data stores originate as data structures within Table 9. As shown in Figure 6, the tuple numbers found on the processes always start with a zero on decomposition level 0. When the level 0 bubble is opened, the bubble shows the contents at level 0.0. Level 0.0 contains the following process and control elements: 0.0 control bubble, 1.0 process bubble, 2.0 process bubble, etc. When one of those level 0.0 process bubbles is opened, the decomposition continues with 1.1.0, 2.1.0, etc., until all levels are accessed.
[0082] All of the interface, data movement, data storage, and control found in the original software are represented in the example decomposition diagrams. As can be seen, the example 0.0 decomposition shown in Figure 6 is visually complex. Part of that visual complexity is the fact that all variables are shown on each data/control flow. Next, the data/control flows are assigned a simple name, with the variable names associated with that flow name.
[0083] Figure 7 and Figure 8 illustrate examples of functional decomposition in accordance with the present method. Substituting aliases for flow names gives the simplified graphic view shown in the example of Figure 7. The purpose of the simplified graphic view is to decrease the visual complexity of the graph, making it more understandable while retaining all relevant information.
[0084] If an input/output parameter uses the pass-by-value technique, the receiving routine has an additional kernel attached called, for example, "MPTCopyValue", which performs the pass-by-value copy, as shown in the decomposition example 800 of Figure 8. Note that the double bubble shown in Figure 8 for "MptCopyValue" means that this is shared code. Similarly, the double lines on the "mptReturn" store mean that the store is global in nature. Although the transformation may appear more complex, it is not; what is shown more accurately describes what actually occurs when pass-by-value is performed.
Sharing Sub-Subroutine Level Software
[0085] If a system design is functionally decomposed until it reaches the point where the lowest decomposition level consists of only the "Basic Blocks" (herein called McCabe code blocks) of a program as defined in McCabe's cyclomatic complexity analysis, and as described above with respect to Figure 4, then it becomes possible to add metadata (including, e.g., a name, I/O method, and test procedures) to those code blocks allowing them to be accessed directly. Since these code blocks do not have parameters, the associated variables must be accessed directly.
Decomposition to McCabe Code Blocks
[0086] Figure 9 is an exemplary decomposition diagram showing three decomposition levels 901, 902, 903, and including terminators (T1, T2), control transformations (dashed circles), process transformations (solid circles), and data stores.
[0087] The following are decomposition rules of the present method, which are used to generate the Figure 9 diagram:
[0088] - A control transformation evaluates conditions, sends invocations and receives returns from those invocations.
[0089] - A condition contains logical mathematical expressions with variables and constants associated with a control flow.
[0090] - Control transformations contain non-event control items which are conditions that change the sequence of the execution of a program (if-then- else, go to, function calls, function returns) and event control items which are interrupts.
[0091] - Variables used by a control transformation can only be used in a condition.
[0092] - A control transformation can have only one selection condition per transformation.
[0093] - There can be, at most, one control transformation per decomposition level.
[0094] - A process transformation accepts, produces and transforms data.
[0095] - Process transformations decompose (analogous to functional decomposition diagrams) into less complex transformations.
[0096] - A process transformation cannot directly call another process transformation on the same or higher decomposition level.
[0097] - Data can only be passed to a process transformation using a data store, not directly.
[0098] - The direct return from a transformation can be used as a condition.
[0099] - Terminators represent extra-system activity; typically a terminator symbol represents a display screen or another separate system.
[0100] Figure 10 and Figure 11 show exemplary relationships between control transforms, process transforms, data stores and terminators, in accordance with the above decomposition rules. In Figures 10 and 11, control transforms 1001 are indicated by a dashed circle, process transforms are indicated by a non-dashed circle, and terminators are indicated by a rectangle.
[0101] Figure 12 shows an exemplary decomposition carried to a McCabe code block level. When a transformation can no longer decompose, then that lowest-level process transformation can be associated with a code block (linear code with no control structure, equivalent to a McCabe code block), e.g., bubble 1.2 in Figure 12. Decomposition terminates when an attempt at decomposition results in a single process (transformation) at the next lower decomposition level, as indicated by arrow 1209. In Figure 12, completed decompositions are indicated by arrows 1202.
[0102] Figure 12A is a flowchart 1200 showing an exemplary set of high-level steps performed in sharing sub-subroutine level software. As shown in Figure 12A, at step 1205, decomposition to McCabe code blocks ensures that all possible code blocks are accessible. At step 1207, data from the data flows (solid arrows) entering and exiting a McCabe code block is then saved as associated metadata which describes the input/output parameters that are used to match design processes with kernel and/or algorithm code. Thus the data in these data flows behaves as metadata. At step 1210, a unique name is added to the process transformation at the McCabe code block level (called an MPT process), and at step 1215, the input/output data flow information for that code block is associated with the code block, allowing all code blocks, i.e., sub-subroutines, to be shared (step 1220), eliminating the overhead of only sharing entire subroutines.
Automatic Code/File/Database Search/Test/Design Association Metadata
[0103] For automatic association of code with database search/test/design in accordance with the present method, code-associated metadata comprises a keyword list 1407 for each McCabe code block and a list of all inputs and outputs to/from the code block. Similarly, in an exemplary embodiment, each decomposition design element (process bubble) also has an associated keyword list, input/output list (from the design), and associated test procedures.
[0104] Figure 13 is a flowchart 1300 showing an exemplary set of steps performed by the present system in associating code/files/databases and corresponding design. Operation of the present system is best understood by viewing Figures 14 - 18 (described below) in conjunction with Figure 13. Figure 14 is a computer screen display 1400 (generated, e.g., by processor 101) showing an example of how metadata can be associated with code blocks or kernels. As shown in Figure 14, exemplary screen display 1400 includes user-selectable buttons that invoke functions (executed on processor 101, for example) including browsing of code blocks 1410 (via 'browse code blocks' button 1408), allowing entry and viewing of keywords (via 'keywords' button 1406), setting and viewing loop values (via 'loop values' button 1404), and viewing kernel I/O parameters (via button 1402). As shown in Figure 13, at step 1305, keyword metadata is associated with a code block. In one example, a 'keywords' button 1406 is selected, which causes a keyword drop-down box 1405 to be displayed, in response to which, a list 1407 of keywords (or other appropriate data) and optional test procedures, to be associated with the selected code block, is entered in box 1405. Keyword list 1407 thus provides the correspondence between code blocks and keywords, and may be stored in storage area 190.
[0105] Figure 15 is an exemplary diagram showing an initial step in one method of associating metadata with a transformation process using a computer-implemented procedure. Block 1501 shows a legend indicating exemplary types of graphical indicators used by the present system to indicate decomposition objects. After a transformation process of interest (such as the process indicated by bubble 1502 in Figure 15) is located and selected, keyword metadata is associated with the transformation process through a graphically-displayed list 1506 (on screen 1500) of keywords and test procedures, at step 1310 (Figure 13).
[0106] Once a code block has been displayed on screen 1500 in block 1509, a decomposition object function, such as "Add keyword list", is selected in a drop-down box 1506, in response to which, a list 1507 of keywords (or other appropriate data) to be associated with the code block is entered in block 1508. When the user has completed entering the desired information (such as a group of keywords), the association between the entered information and the selected object is stored in keyword list 1507 in digital memory (e.g., in data and program storage area 190). Loop values for a process can be set and viewed by selecting a loop symbol 1503, and I/O metadata in data flow can be set and viewed by selecting a corresponding arrow 1504.
[0107] With both the code block and the transformation process having associated keyword lists 1407 and 1507, respectively, a list of candidate code blocks may be created for any particular transformation process. Figure 16 is an exemplary diagram showing how this candidate list 1610 is generated. As shown in Figure 16, at step 1315 (Figure 13), a keyword search is performed for keyword matches (indicated by arrow 1605) between a transformation process 1601 (via keyword list 1508) and candidate code blocks 1610, to determine all possible matching code blocks [1601(1), 1601(2) ... 1601(n)], which are stored in a first list 1610.
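The keyword search of step 1315 can be sketched as below. The CodeBlock type, the function names, and the matching criterion (a code block is a candidate if it shares at least one keyword with the transformation process) are assumptions; the source does not state how many keywords must match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for a code block and its keyword list. */
typedef struct {
    const char *name;
    const char *const *keywords;
    size_t keywordCount;
} CodeBlock;

/* Returns 1 if wanted appears in the keyword list, otherwise 0. */
static int contains_keyword(const char *const *keywords, size_t count,
                            const char *wanted) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(keywords[i], wanted) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Stores in candidateIndexes the index of every block that shares at
   least one keyword with the process. candidateIndexes must have room
   for blockCount entries. Returns the number of candidates found. */
size_t find_candidates(const char *const *processKeywords, size_t processCount,
                       const CodeBlock *blocks, size_t blockCount,
                       size_t *candidateIndexes) {
    assert(blocks != NULL && candidateIndexes != NULL);
    size_t found = 0;
    for (size_t b = 0; b < blockCount; b++) {
        for (size_t k = 0; k < processCount; k++) {
            if (contains_keyword(blocks[b].keywords, blocks[b].keywordCount,
                                 processKeywords[k])) {
                candidateIndexes[found++] = b;
                break;   /* one shared keyword is enough in this sketch */
            }
        }
    }
    return found;
}
```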
[0108] List 1610 is normally too long, as only one code block name is normally required. Figure 17 is an exemplary diagram illustrating the present method of determining which code blocks (in list 1610) have looping structures corresponding to a selected process, in order to shrink list 1610 (cull it) and also to determine if test procedures can be run against the various code blocks. Thus, at step 1320, I/O and loop information 1704 for the selected transformation process is compared with information 1702 relating to the I/O and loops of the various code blocks in list 1610, as shown in Figure 17, and those code blocks that do not match are removed, leaving a group of remaining code blocks in list 1710.
[0109] Unlike traditional systems, the present method does not associate test procedures with code, but with transformation processes instead. Associating test procedures with design allows one test procedure to be run against all remaining code blocks. Since a test procedure consists of input and associated expected outputs, one can determine which code blocks generate the correct answers and which do not. Figure 18 is an exemplary diagram illustrating the present method of determining which code blocks (in list 1710) provide correct results executing specified test procedures. As shown in Figure 18, at step 1325 (Figure 13), the remaining code blocks in list 1710 are executed (arrow 1807) using test procedure data 1803, and those code blocks that generate an incorrect answer are culled, at step 1330, leaving a group of remaining code blocks 1810. For example, using an interactive display program, a user may specify input and output variables and their expected (correct) results when applied to the selected code block. Comparing expected values 1804 to the received values (execution results) 1802 allows the system to cull those code blocks that do not produce the expected values.
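The culling of steps 1325 and 1330 can be sketched as follows, under the simplifying assumption that every candidate code block takes one integer and returns one integer. The types CodeBlockFn and TestCase, the function cull_by_test, and the two sample candidates are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical uniform signature for a candidate code block. */
typedef int (*CodeBlockFn)(int);

/* One entry of a test procedure: an input and its expected output. */
typedef struct {
    int input;
    int expected;
} TestCase;

/* Runs every test against every candidate and keeps, in place and in
   their original order, only the candidates that produce the expected
   output for all tests. Returns the number of candidates kept. */
size_t cull_by_test(CodeBlockFn *candidates, size_t candidateCount,
                    const TestCase *tests, size_t testCount) {
    assert(candidates != NULL);
    assert(tests != NULL || testCount == 0);
    size_t kept = 0;
    for (size_t i = 0; i < candidateCount; i++) {
        int passed = 1;
        for (size_t t = 0; t < testCount && passed; t++) {
            if (candidates[i](tests[t].input) != tests[t].expected) {
                passed = 0;   /* incorrect answer: cull this candidate */
            }
        }
        if (passed) {
            candidates[kept++] = candidates[i];
        }
    }
    return kept;
}

/* Two sample candidates that agree on some inputs but not all. */
int square(int x) { return x * x; }
int twice(int x)  { return 2 * x; }
```

Because the test procedure belongs to the design process rather than to any one code block, the same TestCase array is applied unchanged to every candidate.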
[0110] After step 1330, there are typically only a few code blocks left. To further decrease the number of code blocks to a single one, an additional step may be performed, in which developer goals are evaluated. Here, the developer defines the overall goal to be achieved with the design. This goal is defined by a list of possible goals, examples of which are shown in Table 11 below.
TABLE 11
Figure imgf000032_0001
[0111] A developer can mix and match goals to produce a desired result. At step 1335, the code block that best meets the selected goals is selected, via a comparison of developer goals, such as those shown in Table 12 below, with metadata for the remaining code blocks 1710.
TABLE 12
Figure imgf000032_0002
[0112] The final selection criteria indicated by the developer are compared against candidate code blocks 1710 to yield the code block closest to the developer's goals. Automatically associating a code block with a design element means that code and design can no longer drift apart. Not being able to associate a design element with a code block means either the code must be rewritten or the design must be further decomposed.
Data Store Extension
[0113] A data store is equivalent to a "C" or "C++" language data structure. What is still desired is a method for attaching FILES and DATABASES to processes. Attaching files and databases to processes is accomplished via a new data store type, the "F" (file) type. An example of an F-type object symbol is shown below:
F
F-Type Data Store Definition
[0114] A file definition list, such as that shown in Table 13, below, may be displayed in response to a user request.
TABLE 13
Flat File
Database
Flat File Selection
[0115] Figure 19 is a flowchart 1900 showing an exemplary set of steps performed in automatically attaching files and databases to design elements. As shown in Figure 19, at step 1905, a developer associates a 'flat' file with one or more keywords. Selection of a 'flat file' or equivalent button allows the developer to define the file format association, as shown in Table 14 below.
TABLE 14
Figure imgf000033_0001
[0116] Once the flat file has been defined, the present system can serialize any input dataset properly and save the data in a cloud or other environment. This data can then be used by any design by selecting the correct file name with the correct keyword list 1901 and field names/types. Standard file calls are treated as if they were database queries.
Database Selection
[0117] At step 1910, a developer associates a database file with one or more keywords. Selection of a 'database' or equivalent button causes the database information description to be displayed as shown in Table 15 below.
TABLE 15
Database Type
Database Name
Database Description
Keyword list
Schema
Queries
Select Database Type
[0118] Selecting the Database Type option causes a list of supported database types to be shown. An example of this list is shown in Table 16 below.
TABLE 16
Figure imgf000034_0001
Schema
[0119] At step 1915, the developer enters the database schema for each selected database type, as shown in Table 17 below.
TABLE 17
Figure imgf000035_0001
[0120] The first time a table is defined it is placed into the selected database using, for example, the SQL CREATE TABLE command (for SQL databases) or a similar command for noSQL databases. Adding data to an existing database table is performed using the SQL UPDATE command (for SQL databases) or a similar command for noSQL databases. Changing the SQL schema is accomplished using an ALTER, DROP, DELETE, or TRUNCATE command for SQL databases.
Queries
[0121] At step 1920, selection of 'queries' allows the developer to enter a numbered list of queries to access the current database. A query can be accessed from the program by selecting the query number corresponding to the required query as a dataflow into the database, with the return value returning on the return data flow, as shown in Table 18 below.
TABLE 18
Figure imgf000035_0002
[0122] The first time data is placed into the selected database, a SQL CREATE TABLE command (for SQL databases) or a similar command for noSQL databases is generated. Adding data to an existing database causes a SQL UPDATE command (for SQL databases) or a similar command for noSQL databases to be generated. Changing the schema causes an ALTER command to be generated for SQL databases.
[0123] A set of queries is attached to any database so that the database can be tested for correctness. An exemplary set of test queries is shown below in Table 19.
TABLE 19
Figure imgf000036_0001
[0124] An exemplary set of file 'queries' is shown in Table 20 below.
TABLE 20
Figure imgf000036_0002
Automatic Attachment of Databases to Design Element
[0125] Since a file or a database can exist outside of a program it is very useful to be able to locate the proper file or database. Consider that the file format (for flat files) and schemas (for SQL databases) and keys (for key-value type noSQL databases) all define how to access the data. These data access methods can be used to find the correct file or database as well.
[0126] Figures 20 through 22 are exemplary diagrams showing the present method for automatically associating databases with design elements (control and process kernels or McCabe code blocks). It is initially determined whether the keyword search is against files or databases, and if against databases, whether the database is SQL or noSQL. As shown in Figure 20, at step 1925 (Figure 19), a search (indicated by arrow 2006) is then performed by comparing the selected F-type data store keyword list 2004 against the database keyword list 1911 for all databases and files to create a list 2010 of potential file/databases.
[0127] As shown in Figure 21, at step 1930 (Figure 19), list 2010 is then culled by comparing the data access method 2104 defined by the F-type data store against the data access methods 2102 for the listed file/databases to create a list 2110 of matches, as indicated by arrow 2106.
[0128] List 2110 is further culled, as shown in Figure 22, at step 1935 (Figure 19), by executing queries 2204 defined by the F-type data store against the remaining files/databases 2110, as indicated by arrow 2206. If the query return values are incorrect, then those files/databases are culled, to generate list 2210. If more than one file/database remains, then the one that best meets the developer's overall goals is selected.
[0129] Having described the invention in detail and by reference to specific embodiments thereof, it will be apparent that modifications and variations are possible without departing from the scope of the invention defined in the appended claims. More specifically, it is contemplated that the present system is not limited to the specifically-disclosed aspects thereof.

Claims

What is claimed is:
1. A computer-implemented method for automatically extracting system designs from source code by functionally decomposing the source code comprising:
identifying branching and looping commands in a segment of the source code;
extracting, from the segment of the source code, process kernel code segments connected by one of the branching and looping commands;
extracting control kernels from the segment of the source code;
encapsulating the control kernels as state machines; and
encapsulating the process kernel code segments as process kernels; wherein each state machine and associated process kernel constitutes an algorithm.
2. The method of claim 1, further including:
storing, in a keyword list, identifying metadata associated with the extracted process kernels and the control kernels, and
accessing the extracted process kernels and the control kernels by matching associated said metadata in the keyword list to a design process with corresponding keywords;
wherein the extracted process and control kernels comprise separately executable code segments at a sub-subroutine level.
3. The method of claim 2, wherein the metadata for each lowest-level code block in the segment includes a name and an I/O method, to allow the sub- subroutine level code block to be accessed directly.
4. The method of claim 2, wherein each said code block is accessed by also matching its input/output parameter types to a design process with corresponding said input/output parameter types, to enable the sharing of sub- subroutine level software.
5. The method of claim 1 , further including associating metadata with the control kernels and process kernels, wherein the metadata is used to associate the extracted process kernels and the extracted control kernels with code other than the source code used in the extracting steps.
6. The method of claim 1 , further including generating a high level design using a plurality of said algorithms.
7. The method of claim 1 , further including:
functionally decomposing a segment of the source code until the lowest decomposition level consists of only McCabe code blocks;
adding metadata including a name, I/O method, and associated test procedure(s) to each of the code blocks, allowing the blocks to be accessed directly; and
associating the input/output data flow parameters for each of the McCabe code blocks with a said design process code block having corresponding data flow parameters, to allow the code blocks to be used as sub-subroutines.
8. A system for sharing sub-subroutine level software comprising a computer executing software instructions to perform the steps of:
analyzing a section of source code for process and control elements;
encapsulating the control elements as state machines and the process elements as the process kernels;
associating identifying metadata to the process kernels and to the state machines; and
using the associated metadata to identify sub-subroutines to provide software code sharing at a sub-subroutine level.
9. The method of claim 1, further comprising:
performing decomposition, of a selected segment of the source code, to McCabe code blocks, each consisting of a transformation process;
generating, for each of the code blocks, and also for a selected said transformation process, associated metadata comprising a keyword list including a corresponding database type, database schema, and at least one database test query;
performing a keyword search for keyword matches between keywords associated with a selected one of the code blocks and the keyword list for the selected transformation process to determine matching ones of said code blocks matching said database type and said database schema of the selected one of the code blocks; and
executing the test query on the matching code blocks to determine at least one database that can be used with the selected one of the code blocks.
10. The method of claim 9, wherein said keyword search is performedmatically upgrade a database.
PCT/US2013/044573 2012-06-06 2013-06-06 Method for automatic extraction of designs from standard source code WO2013184952A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/490,345 US8762946B2 (en) 2012-03-20 2012-06-06 Method for automatic extraction of designs from standard source code
US13/490,345 2012-06-06

Publications (1)

Publication Number Publication Date
WO2013184952A1 true WO2013184952A1 (en) 2013-12-12

Family

ID=49712648

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/044573 WO2013184952A1 (en) 2012-06-06 2013-06-06 Method for automatic extraction of designs from standard source code

Country Status (1)

Country Link
WO (1) WO2013184952A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030079188A1 (en) * 2001-10-17 2003-04-24 Analog Design Automation Inc. Method of multi-topology optimization
US20030149968A1 (en) * 2002-02-04 2003-08-07 Kabushiki Kaisha Toshiba Source program processing method
US20040015775A1 (en) * 2002-07-19 2004-01-22 Simske Steven J. Systems and methods for improved accuracy of extracted digital content
US20060136850A1 (en) * 2004-12-17 2006-06-22 Lsi Logic Corporation Method of parasitic extraction from a previously calculated capacitance solution

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9851949B2 (en) 2014-10-07 2017-12-26 Kevin D. Howard System and method for automatic software application creation
US10496514B2 (en) 2014-11-20 2019-12-03 Kevin D. Howard System and method for parallel processing prediction
US11520560B2 (en) 2018-12-31 2022-12-06 Kevin D. Howard Computer processing and outcome prediction systems and methods
US11687328B2 (en) 2021-08-12 2023-06-27 C Squared Ip Holdings Llc Method and system for software enhancement and management
US11861336B2 (en) 2021-08-12 2024-01-02 C Squared Ip Holdings Llc Software systems and methods for multiple TALP family enhancement and management
CN113986889A (en) * 2021-12-28 2022-01-28 天津南大通用数据技术股份有限公司 Method and system for realizing intelligent expansion of database function
CN113986889B (en) * 2021-12-28 2022-04-05 天津南大通用数据技术股份有限公司 Method and system for realizing intelligent expansion of database function

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13799893

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 12/02/2015)

122 Ep: pct application non-entry in european phase

Ref document number: 13799893

Country of ref document: EP

Kind code of ref document: A1