WO2017001900A1

WO2017001900A1 - A data processing method

Info

Publication number: WO2017001900A1
Application number: PCT/IB2015/058497
Authority: WO
Inventors: Grigory Victorovich DEMCHENKO
Original assignee: Yandex Europe Ag; Yandex Llc; Yandex Inc.
Priority date: 2015-06-30
Filing date: 2015-11-03
Publication date: 2017-01-05
Also published as: RU2666334C2; RU2015125822A; RU2015125822A3

Abstract

A data processing method for processing intermediate data is disclosed. The method comprises interrupting a process effecting a processing of a data set such that the data set comprises a first part of amended or processed data and a second part of non-amended or non-processed data. The non-processed data is then transmitted within a block of memory to a second instance of the process for processing. On completion of that processing, a block of memory corresponding to the second part is then returned and combined with the first part.

Description

A DATA PROCESSING METHOD

Cross-Referenced Application

The present application claims convention priority to Russian Patent Application No. 2015125822, filed June 30, 2015, entitled "A DATA PROCESSING METHOD" which is incorporated by reference herein in its entirety.

Field

The present technology teaches a method of providing intermediate data generated during computer program execution from a first process to a second process.

Background

There are many instances where it can be desirable to transfer intermediate data encapsulated within a program object from one instance of a computer program to another. For example, it may be desirable to move data processing from one computer to another to balance load among a group or cluster of computers. Alternatively, having processed a request from a client across a network, a server, for example, a database server, might wish to transfer a program object back to the client to allow the client to continue program object processing.

Program objects can comprise a combination of executable code as well as data and so can either perform processing themselves and/or provide data for other program objects to process. For example, a program object such as a function or sub-routine might be instantiated by a parent object or even a main program to perform some processing on behalf of the calling parent object or main program. Object data can comprise a variety of types comprising, for example: primitive data types such as integers, real numbers, Boolean, characters; and structured or abstract data types such as arrays or lists and user-defined data types, each including multiple instances of data or combinations of data types. Program data can either be stored directly within a program object memory space or be indexed by pointers which contain addresses of other program objects or data. Handling the transfer of program objects including pointers can be problematic, in particular, when attempting to move a program object from a program or process running within one memory address space on a computer to another program or process potentially using another memory address space.

D. M. Dhamdhere, "Operating Systems: A Concept-based Approach", ISBN: 0070611947, 2006 discloses a general approach to allocating memory at an operating system (OS) level of a computer.

US 8,458,433 "Management of persistent memory in a multi-node computer system" discusses the creation and use of persistent memory in a multi-node computing system. A persistent memory manager uses persistent memory to load applications to preserve data from one application to the next. US 5,987,495 discusses how to restore the context of a user program, including program status word (PSW) and CPU register contents, following an asynchronous interrupt.

There are known solutions for switching data from absolute addressing to relative addressing to avoid problems when providing data from a first computer (or process) to another computer (or process). However, such solutions can require memory management tools capable of analyzing memory content and associating it with a process. This analysis can be onerous especially if there is no a priori knowledge of the nature of the object to be handled.

US 8,566,536 discloses direct access sharing of physical memory between processes. Memory address space is mapped to each of the processes by populating a first entry in a top level virtual address table for each of the processes. As each address space is being created and mapped to a given process, a master kernel commences generation of a master list of the entry in the top level virtual address table of each address space for each process. Thus, the address space of each of the processes is cross-mapped into each of the processes by populating one or more subsequent entries of the top level virtual address table with the first entry in the top level virtual address table from other processes. This technique is, however, reliant on functionality being provided by a processor core. There therefore continues to exist a need to provide for efficient processing of data.

Summary

In accordance with a first broad aspect of the present technology, there is provided a data processing method for processing intermediate data generated during computer program execution. More particularly the present teaching relates to a data processing method for transferring intermediate data generated during execution of a computer program from a first process or application to a second process or application. Within the context of the present teaching a process may be considered as including an execution context of an application such that a first process is, for example, an execution context of a first application and a second process is an execution context of a second application that may be running on the same or a different computer.

The intermediate data may result from an incomplete processing of a data set arising for example from a pausing or termination of a computer program prior to complete processing of a data set. Prior to initiation of a computer program by a first application, sufficient memory space is allocated to facilitate complete processing of the data set. If the execution is interrupted prior to complete processing, a first part of the allocated memory space has amended memory addresses reflective of the processed data within that memory space whereas a second part of the allocated memory has un-amended memory addresses. In accordance with the present teaching, the first part of the memory is retained by the first application and only the second part of the memory is allocated to a second application for subsequent processing. On completion of that subsequent processing the amended second part of the memory is returned to the first application to complete the processing routine. By only passing the second part of the memory between the first and second applications, the amount of memory that has to be transferred between applications is reduced and/or the load on processing power is also reduced.

In order to efficiently transfer an object from a first process to a second process the present teaching provides the initial allocated memory space as a contiguous memory block within a first memory address space. On interruption of the processing, the contiguous memory block is copied to persistent memory. That block is analyzed and split into amended and un-amended portions. The first part of the memory, as retained by the first application, occupies a first contiguous portion of the first memory address space, and a copy of the un-amended second part of the memory, which occupies a second contiguous portion of the first memory address space, is transferred to a second process and a second memory address for subsequent processing. On completion of that processing, the second contiguous portion now comprises amended memory space and is returned to the first process to replace the un-amended second contiguous portion of the first memory space. As the first process has a first memory address space and the second process has a second memory address space, to ensure compatibility between the memory portion that is transferred between the first process and the second process, for each memory location of the first and second parts, respective values are stored and are used to ensure that the memory space that is used by each of the first and second processes is compatible. This may include storing heap variables in a program heap and using the heap variables, or indexed versions of same, for each of the first process and the second process.
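The passage above does not say how the stored values keep a transferred block usable in a different address space. One common technique, shown here purely as an illustrative assumption and not as the claimed method, is to store links as offsets from the start of the block rather than as absolute pointers. The names `Node`, `kNull`, `push_node` and `sum_list` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical node layout: links are byte offsets from the start of the
// block, not absolute addresses, so they stay valid after the block is copied.
struct Node {
    std::int32_t value;
    std::uint32_t next;  // offset of the next node, or kNull
};

constexpr std::uint32_t kNull = 0xFFFFFFFFu;

// Append a node at the end of the block and return its offset.
std::uint32_t push_node(std::vector<unsigned char>& block, std::int32_t value,
                        std::uint32_t next) {
    const std::size_t offset = block.size();
    const Node n{value, next};
    block.resize(offset + sizeof(Node));
    std::memcpy(block.data() + offset, &n, sizeof(Node));
    return static_cast<std::uint32_t>(offset);
}

// Walk the list from `head`, resolving every offset against `base`.
long sum_list(const unsigned char* base, std::size_t size, std::uint32_t head) {
    long total = 0;
    std::uint32_t off = head;
    while (off != kNull) {
        assert(off + sizeof(Node) <= size);
        Node n;
        std::memcpy(&n, base + off, sizeof(Node));
        total += n.value;
        off = n.next;
    }
    return total;
}
```

Because every link is resolved against whatever base address the block currently has, a byte-for-byte copy of the block can be traversed without any pointer fix-up.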

Some embodiments are implemented with a serializing function performing the above method steps in response to a call by a parent object to transfer said second part. In some cases, said first process is instantiated on a first computing apparatus and the second process is instantiated on a second different computing apparatus.

Alternatively, said first process is instantiated on a first computing apparatus and the second process is instantiated on the first computing apparatus at a later time.

Each of said first and second memory address spaces can be virtual memory address spaces.

In accordance with a second broad aspect of the present technology, there is provided a data processing method for transferring an object from a first process to a second process, the first process having a first memory address space and the second process having a second memory address space. The method is executable by a processor of a computing apparatus executing the second process and comprises obtaining a copy of a contiguous region of un-amended memory including data not yet processed by a first process, effecting a processing of that data so as to provide a contiguous region of amended memory and returning a copy of the contiguous region of amended memory to the first process.

Some embodiments comprise obtaining a size of said contiguous region of un-amended memory and allocating a contiguous portion of memory of at least said size with said second memory address space. Some embodiments are implemented with a de-serializing function performing the above method steps in response to a call by a parent object to receive said object.

In another aspect there is provided a computer program product comprising executable instructions stored on a computer readable medium which when executed on a computing apparatus are arranged to perform the above methods. In a still further aspect, there is provided a data processing system comprising a first computing apparatus connected to a second computing apparatus, each arranged to perform the above methods.

Accordingly there is provided a method as defined in each of the independent claims. A system and a computer program product are also provided. Advantageous features are provided in the dependent claims.

Brief Description of the Drawings

Various embodiments will now be described, by way of example, with reference to the accompanying drawings, in which:

Figure 1 illustrates schematically a system for providing intermediate data generated during computer program execution from a first process to a second process, the system being implemented in accordance with non-limiting embodiments of the present technology;

Figure 2 illustrates a method of processing data operable within the system of Figure 1; and

Figure 3 illustrates a method of processing data operable within the system of Figure 1.

Description of the Embodiments

Referring to Figure 1, there is shown a diagram of a system 100 including a first computing apparatus 20 and a second computing apparatus 201. It is to be expressly understood that the system 100 is merely one possible implementation of the present technology. Thus, the description thereof that follows is intended to be only a description of illustrative examples of the present technology. This description is not intended to define the scope or set forth the bounds of the present technology. In some cases, what are believed to be helpful examples of modifications to system 100 may also be set forth below. This is done merely as an aid to understanding, and, again, not to define the scope or set forth the bounds of the present technology. These modifications are not an exhaustive list, and, as a person skilled in the art would understand, other modifications are likely possible. Further, where this has not been done (i.e. where no examples of modifications have been set forth), it should not be interpreted that no modifications are possible and/or that what is described is the sole manner of implementing that element of the present technology. As a person skilled in the art would understand, this is likely not the case. In addition it is to be understood that the system 100 may provide in certain instances a simple implementation of the present technology, and that where such is the case they have been presented in this manner as an aid to understanding. As persons skilled in the art would understand, various implementations of the present technology may be of a greater complexity.

In a first embodiment, the first computing apparatus 20 is communicatively coupled with a data storage device 30 which stores program code 10 for a program. The data storage device 30 can be a memory device such as a hard disk integrated with the first computing apparatus 20 or the data storage device 30 can be connected to the first computing apparatus 20 via a network (not depicted) or indeed any suitable wired or wireless connection.

In the context of the present specification, unless expressly provided otherwise, "computing apparatus" is any computer hardware that is capable of running software appropriate to the relevant task at hand. Thus, some (non-limiting) examples of electronic devices include general purpose personal computers (desktops, laptops, netbooks, etc.), mobile computing devices, smartphones, and tablets, as well as network equipment such as routers, switches, and gateways. It should be noted that a device acting as a computing apparatus in the present context is not precluded from acting as a server to other electronic devices. The use of the expression "a computing apparatus" does not preclude multiple electronic devices being used in receiving/sending, carrying out or causing to be carried out any task or request, or the consequences of any task or request, or steps of any method described herein.

An instance 101 of the program 10 is loaded within an address space in memory 16 of the first computing apparatus 20. The memory 16 can comprise a virtual memory system with physical storage separate from the address space, but for the purposes of simplicity, the memory is shown in Figure 1 as a simple block 16. This block of memory 16 can be considered a first memory address space.

The instance 101 of the program is configured to allocate a first contiguous portion 12 of memory within the memory 16 for storing program heap variables, that is, portions of the memory heap. The instance 101 is then configured to process data from a data set so as to store heap variables within the allocated first contiguous portion of the memory associated with the specific program. This instance 101 of the program may be considered a first process or application and in accordance with the present teaching there is provided a method whereby intermediate data resultant from an incomplete processing of the data set by the first process is transferred to a second instance 1001 of the computer program. This second instance 1001 may be considered a second process or application.

The intermediate data may result from an incomplete processing of a data set arising for example from a pausing or termination of a computer program prior to complete processing of a data set. As detailed above, prior to initiation of a computer program by a first application, sufficient memory space is allocated to facilitate complete processing of the data set. If the execution is interrupted prior to complete processing, for example in response to the first instance 101 of the computer program ceasing data processing, the present teaching provides a transfer of non-processed data to the second instance 1001 of the computer program. In order to efficiently transfer the non-processed data from the first instance 101 to the second instance 1001 (from a first process to a second process), the present teaching provides the initial allocated memory space 12 as a contiguous memory block within a first memory address space 16. On interruption of the processing, the contiguous memory block is copied to persistent memory. This persistent memory may be a separate memory block or the same as the previously allocated memory space. That memory block is analyzed and split into a first part comprising an amended portion 12-1 of memory and a second part comprising an un-amended portion 12-2 of memory. The first part 12-1 of the allocated memory space 12 has amended memory addresses reflective of the processed data within that memory space whereas the second part 12-2 of the allocated memory has un-amended memory addresses reflective of the fact that there is non-processed data.

In this way the first part of the memory, which will be retained by the first application, comprises the amended portion 12-1 which occupies a first contiguous portion of the first memory address space. A copy of the second part, comprising the un-amended portion 12-2 of the memory which occupies a second contiguous portion of the first memory address space, is transferred to a second process as provided by the second instance 1001 and a second memory address 1201 within a second memory block 1601 for subsequent processing.

On completion of that processing, the second memory address 1201, which also defines a contiguous block of memory, now comprises a block of memory defining an amended memory address space, and a copy of that portion 1201 is returned to the first instance 101 or first process to replace the un-amended second contiguous portion 12-2 of the first memory space 12. In this way, as part of the processing of intermediate data, the first part 12-1 of the memory is retained by the first instance 101 and only the second part 12-2 of the memory is allocated to the second instance 1001 or second application for subsequent processing. On completion of that subsequent processing the amended second part 1201 of the memory is returned to the first application to complete the processing routine. By only passing the second part of the memory between the first and second applications, the amount of memory that has to be transferred between applications is reduced and/or the load on processing power is also reduced.
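The split, hand-off and recombination described above can be sketched as follows. This is a minimal illustration that assumes the boundary between the amended part 12-1 and the un-amended part 12-2 is a single byte offset; the `Block` type and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of a partially processed contiguous block:
// bytes [0, processed) are amended, bytes [processed, size) are not.
struct Block {
    std::vector<unsigned char> bytes;
    std::size_t processed = 0;
};

// Copy out only the un-amended second part for hand-off.
std::vector<unsigned char> take_unprocessed(const Block& b) {
    assert(b.processed <= b.bytes.size());
    const auto start = b.bytes.begin() + static_cast<std::ptrdiff_t>(b.processed);
    return std::vector<unsigned char>(start, b.bytes.end());
}

// Overwrite the un-amended second part with the returned, amended copy.
void merge_processed(Block& b, const std::vector<unsigned char>& returned) {
    assert(b.processed + returned.size() == b.bytes.size());
    std::copy(returned.begin(), returned.end(),
              b.bytes.begin() + static_cast<std::ptrdiff_t>(b.processed));
    b.processed = b.bytes.size();
}
```

Only `bytes.size() - processed` bytes cross the process boundary in each direction, which is the saving the passage refers to.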

As part of this transfer of the intermediate data between the first and second instances of the program 10, the data can be written to a buffer 60 which can then be either stored in non-volatile storage such as the data storage device 30 or transmitted across a network connection 40 for subsequent use by the second program instance 1001. The processing can be effected by passing the data object either directly, i.e. by passing the entire object, or by reference, i.e. using a pointer to the data object. It will be appreciated that the other program instance can be a second instance 1001 of the program 10 running on another computing apparatus 201; the instance can be a later instance of the program 10 running on the same computing apparatus 20 sometime after the first instance 101 has ceased processing; or indeed the second instance can be an instance of a different program than the program 10. In Figure 1, the second program instance 1001 is for simplicity shown as a separate program running on a second computing apparatus 201 connected to the first computing apparatus 20 but it is not intended to limit the present teaching to such a configuration.
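As an illustration of writing the un-amended part to a buffer such as buffer 60, the sketch below prefixes the raw bytes with their length so that a receiver can size its own contiguous allocation before copying. The 8-byte little-endian length prefix and the function names are assumptions made for this example; the specification does not define a buffer layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical buffer layout: 8-byte little-endian length, then the raw bytes.
std::vector<unsigned char> write_to_buffer(const unsigned char* part,
                                           std::size_t size) {
    std::vector<unsigned char> buffer;
    buffer.reserve(8 + size);
    const std::uint64_t n = size;
    for (std::size_t i = 0; i < 8; ++i) {
        buffer.push_back(static_cast<unsigned char>((n >> (8 * i)) & 0xFF));
    }
    buffer.insert(buffer.end(), part, part + size);
    return buffer;
}

// Read the length prefix, then copy the payload into a fresh contiguous region.
std::vector<unsigned char> read_from_buffer(const std::vector<unsigned char>& buffer) {
    assert(buffer.size() >= 8);
    std::uint64_t n = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        n |= static_cast<std::uint64_t>(buffer[i]) << (8 * i);
    }
    assert(buffer.size() - 8 == n);
    return std::vector<unsigned char>(buffer.begin() + 8, buffer.end());
}
```

The same byte sequence can be written to a file on the data storage device or sent over a socket; neither transport is shown here.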

Figure 2 illustrates steps involved in processing a data set according to the present teaching. For the purpose of this example, the data process described is one of image processing, and the intermediate data may result from rendering a data set defining an image while applying an image filter such as "sharpening" to the image in a graphical editor application. A user may run a first editor application (the first instance 101 of the computer program described above with reference to Figure 1), and the first application may start applying the filter to the image. The application allocates enough memory space for the execution. The user may pause or stop the procedure during execution, for example at a moment in time when the first application has applied the image filter to only half of the picture. In this way the data set is only partially processed.

In applying the image filter to the image, the first application makes changes to only half of the memory addresses of the image files (the first part of the respective memory). The execution context may then be transferred to another instance 1001, for example a second editor application (and/or the filter may continue to be applied on another computer), which may pick up the execution data from the first editor application and continue the execution. The first application, based on the user intent, may divide the image memory into two parts: a first part with amended memory addresses, being the memory portions with the image filter applied (unchangeable for another device), and a second part (changeable for another device) with memory addresses having no amendments. It will be appreciated that this allocation of first and second memory portions may be effected after the interruption of the program execution since, if there is no interruption, there may be no need for this processing. The first application divides the image memory based on whether the memory portions were amended or not amended while applying the image filter.
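A minimal sketch of an interruptible filter pass follows. It assumes a greyscale image stored row by row in one contiguous buffer and uses a simple per-pixel inversion as a stand-in for "sharpening"; `filter_pixel`, `apply_filter` and the `should_stop` callback are hypothetical names. The returned row index marks the boundary between the amended first part and the un-amended second part.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-pixel filter standing in for the real one.
inline unsigned char filter_pixel(unsigned char p) {
    return static_cast<unsigned char>(255 - p);
}

// Apply the filter row by row, starting at `first_row`, until the image is
// finished or `should_stop(row)` returns true. Returns the index of the first
// row that was NOT processed.
template <typename StopFn>
std::size_t apply_filter(std::vector<unsigned char>& image, std::size_t width,
                         std::size_t first_row, StopFn should_stop) {
    assert(width > 0 && image.size() % width == 0);
    const std::size_t rows = image.size() / width;
    std::size_t row = first_row;
    for (; row < rows && !should_stop(row); ++row) {
        for (std::size_t x = 0; x < width; ++x) {
            unsigned char& p = image[row * width + x];
            p = filter_pixel(p);
        }
    }
    return row;
}
```

Calling the function again with the returned row as `first_row`, in the same or another instance, continues the work without touching rows that were already filtered.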

While it is known to interrupt program execution and transfer the entirety of a memory block defining the entire data set to another instance of a computer program for execution, in accordance with the present teaching only the second part of memory is transferred to a second instance, which may be on a second computer, while the first part of memory is left on the first computer. On receipt of this subset of the entire memory originally allocated to the processing of the data set, the second computer executes program code for the second application (which in this example is a second image editor). The second application uses the execution context of the first application and applies an image filter to the second part of memory addresses. The second application may finalize applying the filter and send the execution context (including the now amended addresses of the second memory portion) to the first application.

It will be further appreciated that this cascading of processing between instances of computer programs may be effected in a number of iterations such that two or more instances are used in the processing of a single data set, with each instance processing only a subset of data from the original data set. In such arrangements it will be appreciated that there are certain instances where multiple sequential or parallel processing steps may be beneficial, but there are other instances where the reconstitution of the original allocated data set requires more effort than the benefit achieved from segmented processing. For example, there are benefits to be achieved when the memory is cut into two parts, each part representing 50% of the initial data set that required processing. Having processed both of the two parts, the recombination of the processed memory blocks requires a certain level of processing, but that recombination requires less time than the processing of either one of the two parts. In another scenario where the memory was cut into 100 parts, each representing 1% of the initial data set, the recombination of the processed parts is a complex and cumbersome activity that negates any advantage in separating the processing.

It will be appreciated, however, that the capacity to separate processing between individual processing machines or elements may be used to take advantage of the characteristics of the individual machines. For example, if a particular processing machine or server is particularly suited for processing data by application of black-to-white filters to the image and another machine or server is only capable of edge profiling an image, then a process per the present teaching can be used to allocate processing of particular tasks to particular machines as appropriate. Once each of the machines has completed its individual task, then per the present teaching the memory block of the entire processed data set can be reconstituted.

It will be further appreciated that while described as a series of sequential processing steps, a method per the present teaching can be effected using parallel execution of individual processing tasks. Parallel execution is particularly suited for data processing examples where the sequence of processing is not critical in the context of the overall processing being effected. For example, if the desired processing of an image data set requires application of a first filter and, based on the applied filter, a subsequent change of the colours defined by the image and a further subsequent cropping of the image, then segmentation of the three processing steps into three parallel activities will not achieve the desired effect and should not be deployed.

The amended second memory portion addresses are then returned to the first instance 101 and combined with the retained first memory portion addresses so as to allow the first instance 101 to continue working on the now complete image file portion within the memory 12 of the computer program.

In a first step 200 performed by a first instance 101 of an application, a first contiguous portion of memory for storing program heap variables (the portions of memory heap) is allocated to the data set to be processed. It will be appreciated that the allocation of memory is typically a routine program step and as such will typically be effected as a pre-processing step, prior to the interruption of the processing steps. This involves bespoke program code within the first instance 101 processing data to the extent required before it is to be transferred to another program for further processing. So for example, where the program instance 101 is an image filtering program, this preparation may identify available space within the memory 16 of the computer on which the program is executing and allocate a subset 12 of that memory 16 for this data set.

As was discussed in the introduction to Figure 2, the present teaching may be utilized with any one of a number of different types of data sets. In this context, the application of the techniques heretofore described should not be limited to the processing of image data. For example, the present teaching may be usefully employed in applications that require the completion of data base structures. In such an application, a first instance of an application may initiate filling-in or populating elements of a data structure as defined by a data base. At some period in time subsequent to the initial population, the process is terminated or interrupted. At that time, and similar to that described with reference to the processing of image data, the memory block may be interrogated and separated into processed and non-processed parts. The non-processed parts may be transferred as a contiguous block to another instance of the program for processing. On completion of that process the now processed part is returned, assimilated with the initially processed data and the process completed. In this way, the second instance of the application or process may populate the data base with data, and subsequent to that population return the data to the first instance. The manner of data population may vary. For example, the data within the data base may be varied or modified by different processing operations such as amending rows and lines, providing calculations, summing numerics from a row "A" with numerics from a row "B", importing data from other data structures, etc.

In this way it will be appreciated that the data set, which may for example be image data awaiting image processing or a data structure awaiting population, may be considered a program object. As will be appreciated, program objects can either be stored in a program heap or within a program stack. The heap is a region of computer memory whose allocation is not managed automatically by the operating system hosting a program. It is a free-floating region of memory, typically larger than the stack. To allocate memory for variables on the heap in a C language program, the standard library functions malloc() or calloc() are used. In C++, the new operator serves the equivalent purpose, with delete as its counterpart for de-allocation, and other programming languages use similar facilities.

For variables allocated in heap memory, the C function free() or the C++ delete operator can be used to de-allocate that memory once it is no longer needed. Failing to do this results in a memory leak, where memory on the heap will still be set aside and will not be available to other processes.
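The allocation and de-allocation pairs mentioned above can be illustrated as follows. The function names are hypothetical; the point is only that calloc() pairs with free(), new[] pairs with delete[], and an owning smart pointer releases the memory automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Allocate `n` zero-initialised ints with calloc(); the caller must call free().
int* make_counts_c(std::size_t n) {
    assert(n > 0);
    return static_cast<int*>(std::calloc(n, sizeof(int)));
}

// The same allocation with new[]; the caller must call delete[].
int* make_counts_cpp(std::size_t n) {
    return new int[n]();
}

// Owning wrapper: delete[] runs automatically when the pointer is destroyed,
// which avoids the leak described above.
std::unique_ptr<int[]> make_counts_owned(std::size_t n) {
    return std::unique_ptr<int[]>(new int[n]());
}
```

Mixing the pairs, for example releasing new[] memory with free(), is undefined behaviour and should be avoided.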

Because of the possibly many allocations and de-allocations of heap memory at program run time, in virtual memory systems, heap memory variables may be stored in non-contiguous portions of virtual memory as well as physical memory.

Thus, unless provision has been made otherwise, it cannot be assumed that any program object, on being allocated to a memory space, has been passed (either directly or by reference) for storage in a contiguous portion of memory. In order to ensure that the data set is allocated contiguous memory it is possible, for example per the teaching of Russian Patent Application No. 2014139545 filed 30 September 2014 (Reference: 2014-0107-SE-PD1-RU / 34055-401 / B37-1479-01), to replace a global default memory allocator with a custom memory allocator. The custom memory allocator ensures that data to be processed is allocated within respective contiguous portions of the memory which has been allocated for the function at step 200.

For programs written in C or C++, controlling the allocation of the program heap can be achieved by overriding the malloc() and calloc() functions and overloading the new operator so that as new variables are declared and allocated at program execution time, they are written to a contiguous portion of memory rather than being distributed across non-contiguous memory locations - both in virtual and physical memory. Equivalent techniques can be employed for programs written in other languages; or indeed other techniques for achieving the same result can be employed according to the operating system environment of the program.
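One way to keep heap variables within a single contiguous region is a bump ("arena") allocator combined with an overload of operator new that takes the arena as an argument. The sketch below is an assumption-laden illustration, not the allocator of the referenced application: the `Arena` class is hypothetical, individual objects are never freed, and the global allocator is left untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical bump allocator: hands out consecutive slices of one
// pre-allocated contiguous block.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : base_(static_cast<unsigned char*>(::operator new(capacity))),
          capacity_(capacity) {}
    ~Arena() { ::operator delete(base_); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(std::size_t size, std::size_t align) {
        assert(align > 0);
        // Round the current offset up to the requested alignment.
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start + size > capacity_) throw std::bad_alloc();
        used_ = start + size;
        return base_ + start;
    }

    bool contains(const void* p) const {
        const unsigned char* q = static_cast<const unsigned char*>(p);
        return q >= base_ && q < base_ + capacity_;
    }
    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_ = 0;
};

// Overload of operator new that draws from the arena instead of the global
// heap. Memory is reclaimed only when the whole arena is destroyed.
void* operator new(std::size_t size, Arena& arena) {
    return arena.allocate(size, alignof(std::max_align_t));
}

// Matching placement delete, invoked only if a constructor throws.
void operator delete(void*, Arena&) noexcept {}
```

Objects are then created with `new (arena) T(args...)`, and everything allocated this way lies between the arena's base address and base plus `used()`, so it can be copied out as one block. Types with non-trivial destructors would need their destructors called explicitly before the arena is released.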

In certain instances a memory allocator allocating objects to contiguous portions of memory is used by default within the program instance 101, such that two copies of the data set, one within a contiguous portion and one not, do not have to be used.

Having allocated memory, the program instance 101 initiates processing of the data set, step 202. At some period after this initiation, the program is interrupted, step 204. Per the present teaching this interruption is effected prior to complete processing of the data set. After interruption, the memory is interrogated (step 206) to ascertain portions which pertain to processed data, i.e. amended memory, or portions which pertain to un-amended memory, i.e. non-processed data.

The un-amended memory, either a copy of it or pointers to same, is transferred as a contiguous block to the second instance 1001 of the program, step 208.

At some subsequent time, the first instance 101 receives a copy of amended memory, step 210, that corresponds to the originally provided un-amended part 12-2. As the returned memory block is also defined by a contiguous portion of memory that corresponds with the originally provided un-amended contiguous portion, the first instance 101 is operable to replace the original un-amended portion with the now amended version so as to provide a complete memory block that comprises amended or processed data. This data set can then be subsequently processed (step 212) to complete the action required of the application.

Figure 3 shows an example process flow, from the perspective of the second instance 1001 reflecting processing resultant from receipt of an un-amended block of memory. In step 300, the second instance 1001 receives a block of memory for processing. The second instance allocates a contiguous portion of memory for that processing, step 302. The data within that memory is then processed, step 304. On completion a block of contiguous memory is returned to the first instance for subsequent processing, step 306.
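Steps 300 to 306 can be pictured with the short sketch below. It assumes the received block is a byte vector and that the processing itself is supplied as a callback; the names Block and handle_block are hypothetical, and the transport between the two instances is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Block = std::vector<unsigned char>;

// Second-instance view of the flow: take the received block (step 300),
// place it in one contiguous allocation (step 302), process it in place
// (step 304) and hand back a contiguous block of the same length (step 306).
Block handle_block(
        const Block& received,
        const std::function<void(unsigned char*, std::size_t)>& process) {
    Block local(received);                // contiguous local copy
    process(local.data(), local.size());  // amend the data in place
    return local;                         // returned to the first instance
}
```

Keeping the returned block the same length as the received one is what lets the first instance substitute it directly for the portion it originally sent.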

In the context of the present specification, unless expressly provided otherwise, the expression "data" includes information of any nature or kind whatsoever capable of being stored, for example, in a database, or transmitted electronically, for example, in a stream. Thus data includes, but is not limited to, audio-visual works (images, movies, sound recordings, presentations, etc.), location data, numerical data, text (opinions, comments, questions, messages, etc.), documents, spreadsheets, etc.

In the context of the present specification, unless expressly provided otherwise, a "database" is any structured collection of data, irrespective of its particular structure, the database management software, or the computer hardware on which the data is stored, implemented or otherwise rendered available for use. A database may reside on the same hardware as the process that stores or makes use of the information stored in the database or it may reside on separate hardware, such as a dedicated server or plurality of servers.

It will be appreciated that while the above method has been described for exemplary purposes with a specific sequence of steps, the various steps can be rearranged where possible to achieve the same effect.

Embodiments of the present invention find particular utility in, for example: the distribution of program objects between devices; facilitating processing in virtual machines where, for example, processing and/or decision making can be moved between a remote server and local processors; backup and virtualization systems; and compilers and code-executing applications.

In the context of the present specification, unless expressly provided otherwise, a "server" is a computer program that is running on appropriate hardware and is capable of receiving requests (e.g. from computing apparatus) over a network, and carrying out those requests, or causing those requests to be carried out. The hardware may be one physical computer or one physical computer system, but neither is required to be the case with respect to the present technology. In the present context, the use of the expression a "server" is not intended to mean that every task (e.g. received instructions or requests) or any particular task will have been received, carried out, or caused to be carried out, by the same server (i.e. the same software and/or hardware); it is intended to mean that any number of software elements or hardware devices may be involved in receiving/sending, carrying out or causing to be carried out any task or request, or the consequences of any task or request; and all of this software and hardware may be one server or multiple servers, both of which are included within the expression "at least one server".

In the context of the present specification, unless expressly provided otherwise, the expression "computer usable information storage medium" is intended to include media of any nature and kind whatsoever, including RAM, ROM, disks (CD-ROMs, DVDs, floppy disks, hard drives, etc.), USB keys, solid-state drives, tape drives, etc.

In the context of the present specification, unless expressly provided otherwise, the words "first", "second", "third", etc. have been used as adjectives only for the purpose of allowing for distinction between the nouns that they modify from one another, and not for the purpose of describing any particular relationship between those nouns. Thus, for example, it should be understood that the use of the terms "first apparatus" and "third apparatus" is not intended to imply any particular order, type, chronology, hierarchy or ranking (for example) of/between the apparatus, nor is their use (by itself) intended to imply that any "second apparatus" must necessarily exist in any given situation. Further, as is discussed herein in other contexts, reference to a "first" element and a "second" element does not preclude the two elements from being the same actual real-world element. Thus, for example, in some instances, a "first" apparatus and a "second" apparatus may be the same software and/or hardware; in other cases they may be different software and/or hardware.

Implementations of the present technology each have at least one of the above-mentioned object and/or aspects, but do not necessarily have all of them. It should be understood that some aspects of the present technology that have resulted from attempting to attain the above-mentioned object may not satisfy this object and/or may satisfy other objects not specifically recited herein.

Additional and/or alternative features, aspects and advantages of implementations of the present technology will become apparent from the following description, the accompanying drawings and the appended claims.

One skilled in the art will appreciate that, when the instant description refers to "receiving data" from a user, the computing apparatus executing receiving of the data from the user may receive an electronic (or other) signal from the user. One skilled in the art will further appreciate that displaying data to the user via a graphical user interface (such as the screen of the computing apparatus and the like) may involve transmitting a signal to the graphical user interface, the signal containing data, which data can be manipulated and at least a portion of the data can be displayed to the user using the graphical user interface. Some of these steps and signal sending-receiving are well known in the art and, as such, have been omitted in certain portions of this description for the sake of simplicity. The signals can be sent-received using optical means (such as an optical connection), electronic means (such as using wired or wireless connection), and mechanical means (such as pressure-based, temperature-based or any other suitable physical-parameter-based means).

Modifications and improvements to the above-described implementations of the present technology may become apparent to those skilled in the art. The foregoing description is intended to be exemplary rather than limiting. The scope of the present technology is therefore intended to be limited solely by the scope of the appended claims.

The present teaching may also be extended to the features of one or more of the following numbered clauses:

1. A data processing method for processing intermediate data generated during computer program processing of a data set, the method comprising:

Allocating memory space (12) within a computing apparatus (20) for complete processing of the data set;

Processing of the data set using a first instance (101) of a computer program (10);

Interrupting processing prior to complete processing of the data set;

Identifying a first portion (12-1) of allocated memory comprising amended memory address space reflecting a processing of a first part of the data set and a second portion (12-2) of allocated memory comprising non-amended memory address space reflecting a non-processing of a second part of the data set;

Transferring the second portion (12-2) to a second instance (1001) of the computer program (10) for processing of the second portion;

Receiving an amended memory address space (1201) corresponding to a processing of the second part of the data set;

Combining the received amended memory address space (1201) with the first portion (12-1) for subsequent processing by the first instance of the computer program.

2. The method of clause 1 wherein the first portion (12-1) and the second portion (12-2) are each allocated contiguous memory portions within the allocated memory.

3. The method of clause 1 or 2 wherein allocating memory space comprises allocating a first contiguous portion of memory heap for storing heap variables.

4. The method of clause 3 wherein processing of the data set using a first instance (101) of a computer program (10) comprises storing heap variables in the memory heap.

5. The method of any preceding clause comprising, responsive to interrupting processing prior to complete processing of the data set, copying the first contiguous portion (12-1) to persistent memory.

6. The method of any preceding clause wherein the identifying comprises analyzing a block of memory within persistent memory to split the block into amended (12-1) and un-amended (12-2) portions.

7. The method of clause 6 wherein the amended portion reflects a processing of a first part of the data set and occupies a first contiguous portion of the allocated memory and wherein the un-amended portion reflects a non-processing of a second part of the data set and occupies a second contiguous portion of the allocated memory, optionally wherein the transferring the second portion to a second instance of the computer program for processing of the second portion comprises transferring a copy of the second contiguous portion and further optionally wherein combining the received amended memory address space with the first portion for subsequent processing by the first instance of the computer program comprises replacing the second contiguous portion (12-2) with the received amended memory address space (1201).

8. The method of any preceding clause comprising storing heap variables in a program heap and using the heap variables, or indexed versions of same, for each of the processing by the first instance and the second instance.

9. The method of any preceding clause wherein said first instance (101) is instantiated on a first computing apparatus (20) and the second instance (1001) is instantiated on a second different computing apparatus (201) or wherein said first instance (101) is instantiated on a first computing apparatus (20) and the second instance (1001) is instantiated on the first computing apparatus (20) at a later time.

10. The method of any preceding clause wherein the allocated memory space (12) comprises virtual memory address spaces.

11. A method according to clause 7 wherein transferring a copy of the second contiguous portion (12-2) comprises providing said copy of said second contiguous portion (12-2) and an index identifying locations within said second contiguous portion relative to said first contiguous portion and optionally further comprising writing said copy and said index of locations to persistent memory, and optionally wherein the persistent memory comprises one of computer memory or non-volatile memory accessible to each of said first and second instances.

12. A method according to clause 11 wherein said providing said copy of said second contiguous portion (12-2) and an index identifying locations within said second contiguous portion relative to said first contiguous portion (12-1) comprises transmitting said copy and said index of locations to a computing apparatus across a network link (40).

13. A method executable by a processor of a computing apparatus (20, 201) comprising: obtaining a copy of a contiguous region (12-2) of un-amended memory including data not yet processed by a first process (10), the data being a subset of a data set, effecting a processing of that data so as to provide a contiguous region of amended memory (1201), and returning a copy of the contiguous region of amended memory to the first process (10).

14. The method of clause 13 comprising allocating memory within the second computing apparatus (201) to the copy of a contiguous region of un-amended memory.

15. A computer program which when executed on a computing apparatus (20, 201) is configured to carry out the method of any one of clauses 1 to 14.

16. A data processing system comprising a first computing apparatus (20) connected to a second computing apparatus (201), said first computing apparatus (20) being arranged to perform the steps of any one of clauses 1 to 12 and the second computing apparatus (201) being arranged to perform the steps of clause 14, optionally wherein said first and second computing apparatus comprise different apparatus.

Claims

1. A data processing method for processing intermediate data generated during computer program processing of a data set, the method comprising:

Allocating memory space within a computing apparatus for complete processing of the data set;

Processing of the data set using a first instance of a computer program;

Interrupting processing prior to complete processing of the data set;

Identifying a first portion of allocated memory comprising amended memory address space reflecting a processing of a first part of the data set and a second portion of allocated memory comprising non-amended memory address space reflecting a non-processing of a second part of the data set;

Transferring the second portion to a second instance of the computer program for processing of the second portion;

Receiving an amended memory address space corresponding to a processing of the second part of the data set;

Combining the received amended memory address space with the first portion for subsequent processing by the first instance of the computer program.

2. The method of claim 1 wherein the first portion and the second portion are each allocated contiguous memory portions within the allocated memory.

3. The method of claim 1 wherein allocating memory space comprises allocating a first contiguous portion of memory heap for storing heap variables.

4. The method of claim 3 wherein processing of the data set using a first instance of a computer program comprises storing heap variables in the memory heap.

5. The method of claim 4 comprising, responsive to interrupting processing prior to complete processing of the data set, copying the first contiguous portion to persistent memory.

6. The method of claim 1 wherein the identifying comprises analyzing a block of memory within persistent memory to split the block into amended and un-amended portions.

7. The method of claim 6 wherein the amended portion reflects a processing of a first part of the data set and occupies a first contiguous portion of the allocated memory.

8. The method of claim 7 wherein the un-amended portion reflects a non-processing of a second part of the data set and occupies a second contiguous portion of the allocated memory.

9. The method of claim 8 wherein the transferring the second portion to a second instance of the computer program for processing of the second portion comprises transferring a copy of the second contiguous portion.

10. The method of claim 9 wherein combining the received amended memory address space with the first portion for subsequent processing by the first instance of the computer program comprises replacing the second contiguous portion with the received amended memory space.

11. The method of claim 1 comprising storing heap variables in a program heap and using the heap variables, or indexed versions of same, for each of the processing by the first instance and the second instance.

12. The method of claim 1 wherein said first instance is instantiated on a first computing apparatus and the second instance is instantiated on a second different computing apparatus.

13. The method of claim 1 wherein said first instance is instantiated on a first computing apparatus and the second instance is instantiated on the first computing apparatus at a later time.

14. The method of claim 1 wherein the allocated memory space comprises virtual memory address spaces.

15. A method according to claim 9 wherein transferring a copy of the second contiguous portion comprises providing said copy of said second contiguous portion and an index identifying locations within said second contiguous portion relative to said first contiguous portion.

16. The method of claim 15 comprising writing said copy and said index of locations to persistent memory.

17. A method according to claim 16 wherein the persistent memory comprises one of computer memory or non-volatile memory accessible to each of said first and second instances.

18. A method according to claim 15 wherein said providing said copy of said second contiguous portion and an index identifying locations within said second contiguous portion relative to said first contiguous portion comprises transmitting said copy and said index of locations to a computing apparatus across a network link.

19. A method executable by a processor of a computing apparatus comprising: obtaining a copy of a contiguous region of un-amended memory including data not yet processed by a first process, the data being a subset of a data set, effecting a processing of that data so as to provide a contiguous region of amended memory and returning a copy of the contiguous region of amended memory to the first process.

20. The method of claim 19 comprising allocating memory within the second computing apparatus to the copy of a contiguous region of un-amended memory.

21. A computer program product comprising executable instructions stored on a computer readable medium which when executed on a computing apparatus are arranged to perform the method of claim 1.

22. A data processing system comprising a first computing apparatus connected to a second computing apparatus, said first computing apparatus being arranged to perform the steps of claim 1 and the second computing apparatus being arranged to perform the steps of claim 19.

23. A system according to claim 22 wherein said first and second computing apparatus comprise different apparatus.