Native Code Process-Originated Migration in a Heterogeneous Environment by Charles M. Shub Computer Science Department University of Colorado at Colorado Springs 1. ABSTRACT Dynamic migration has been investigated in several research efforts as a vehicle for load sharing, resource sharing, communi- cation overhead reduction, failure robustness, and several other contexts. This paper presents a summary of related work in dynamic migration, and suggests motivations for considering heterogeneous migration. Next a design for heterogeneous migra- tion, along with its restrictions is presented. A prototype implementation is described. Conclusions are drawn and future work is suggested. Heterogeneous Migration Page 2 2. MOTIVATION AND BACKGROUND Several researchers have looked at dynamic migration. Theimer [THEI85] reports on preemptable remote execution facilities within the V system. His work, designed to allow sharing other- wise unused network resources, also allows for a form of dynamic load balancing. If requested at start up time, processes are dis- tributed to idle or lightly loaded processors through remote exe- cution. Later they may be dynamically moved to other workstations if the processors where they are executing are preempted by a user logging in. A detailed discussion of motivations, technical issues, design, implementation and limitations of this scheme can be found in [THEI86]. Another cited reason is that the efficiency gained by executing a computation on a particular architecture may warrant the overhead of migration to a machine of that architecture. For example, a computation involving detailed numerical algorithms will run fas- ter on a machine with a floating-point co-processor. For a pro- cess about to do substantial floating point arithmetic, it makes sense to request migration to a processor with such capabilities. Another reason for migration is extensive access to a large file of data. For a process with such a need it may be more efficient to migrate to the machine where the data is stored than to incur the overhead of repeatedly transferring data over the network to the original site of execution. A third motivation for process- originated migration is the avoidance of imminent physical danger Heterogeneous Migration Page 3 (such as from an incoming ballistic missile) to the system where a process is executing. A process interacting with a sensor may well discover such a hazard, desire the option of assessing the risk it presents, and perhaps request migration to a safer environment. Finally in a widely dispersed system such as a net- work of satellites, migration may be advisable to reduce the latency of response. Smith [SMIT88] provides a survey paper that echoes and amplifies the motivations for the work described above. His survey also describes several other systems that support homogeneous migra- tion, including the Cambridge Distributed System, [NEED82] DEMOS/MP, [MILL87] [POWE83] and Locus and Eden. [BLAC85] Related to this work is a recent study [EAGE88] showing the benefits of migration for dynamic load balancing might not be as great as earlier anticipated, and suggesting that migration should not be used for dynamic load balancing if used at all, should be a rare event. Also, Zayas [ZAYA87] has shown a vehicle for reducing the overhead associated with infrequent migration. A possible shortcoming of dynamic migration is that it generally requires homogeneous processing elements for successful implemen- tation. The purpose of the work reported here is to delineate issues in migration between heterogeneous nodes. Such an advance would have several beneficial effects including: 1. Elimination of the requirement for identical nodes. Heterogeneous Migration Page 4 This would allow upgrades and expansions to be more flexible and to take advantage of newer technology. 2. Generalization of the technique of migration to make it more utile. Dynamic migration is usually found in the context of a distri- buted system. Distributed operating systems often use a repli- cated kernel and replicated local system servers to provide client processes access to basic services for a bare machine. [STAN85] They provide other services through global system server processes to reduce the system code that must be replicated on each processor. System services are provided by sending requests to, and receiving replies from server processes via the IPC mechanism. Examples of distributed systems abound. They include Andrew, [MORR86] Eden, [ALME85] Locus [WALK83] Medusa, [OUST81] Oryx/Pecos, [SAGE85] Sprite, [OUST88] and The V system, [CHER84] [CHER86] [CHER88] among others. Tanenbaum and Van Renesse [TANE85] provide an excellent survey of distributed systems. Dynamic Migration involves briefly freezing the progress of the unit(s) of migration (given several different names in earlier research efforts), moving the migration unit(s) to a destination system, implementing one of several mechanisms for rebinding references, and finally resuming execution. Migration units are normally an address space and control information about the threads of execution progressing through that address space. Heterogeneous Migration Page 5 In the homogeneous case, the copy is simplified because the representation of a unit of migration is the same at both source and destination. This is normally not true in the heterogeneous case. Two approaches to this diversity of representation have been investigated. The approach reported herein is detailed below. The other one [ATTA88] involves an interpretive mode of execution where there is a simulator/interpreter/emulator of hypothetical hardware (an abstract machine) and the software runs on tis abstract machine. This approach eliminates the need to cope with hardware level differences between the source and the destination by providing a common logical representation on diverse physical platforms. A major drawback of this approach is its slower run time because of the execution time mapping by the simulator/interpreter/emulator from the logical representation to the physical representation. Also at issue is the choice of a logical representation to allow for more efficient interpretation on host architectures. 3. DESIGN First an overview of the design is presented. Next, the restrict- ing assumptions are described and discussed. Finally, the design components and how migration works with these components is presented. Heterogeneous Migration Page 6 3.1 Overview Our approach is to map from one physical representation to the other before execution to allow for faster execution, to make some restrictions on the application, and to embed information in the address space to allow the mapping activity. The mapping can take place either at migration time (as in our prototype) or at page replacement time. With a few exceptions, the embedding of additional information is done before execution. In essence, we are opting to trade complexity of representation and overhead at either migration time or page fault time for pure run time speed. We opted for this tradeoff because we view (in common with others) that migration should be a reasonably rare event. This view is based in part on the motivations of Theimer [THEI85] who suggests migration be used for sharing the capacity of otherwise unused workstation, migrating away because the owner of that workstation has arrived. We were also influenced by the opinion of Eager et. al. [EAGE88] who suggest that frequent dynamic migration for load balancing is a performance loser. 3.2 Present Limitations Our preliminary investigation has (for the sake of delaying con- sideration of some potentially thorny issues) made some simplify- ing assumptions. They are detailed below. 1. We have restricted our consideration to the equivalent Heterogeneous Migration Page 7 of a strongly typed language with no variant records. 2. We have restricted ourselves to a distributed system rather than a network of systems using diverse operat- ing systems. 3. We have restricted our consideration to migration units consisting of an address space with a single thread of control. 4. We have restricted our consideration to insisting the process itself request migration. The first restriction allows us to delay considering aliasing and casting issues until later in the investigation. We believe that there is generally some form of information about which alias is being used is available somewhere and capturing that information should be straightforward. This restriction eliminates many issues related to operating sys- tem differences and is in accord with the other research detailed above. The final two restrictions force the migration unit to be in a well known state at the time of migration. This considerably sim- plifies issues related to conversion of the state from the source to the destination. The last also implies a course level of granularity for migration. Investigation into reducing this granularity is scheduled for later in our timetable. Several Heterogeneous Migration Page 8 items are worth noting here. Much existing application software fits our single thread restriction. Several implementations of multiple threads within an address space at the application level limit that address space to one active thread at a time with the inactive threads being at well known points in their progress. These considerations influenced our choice to include multiple threads later in our timetable. Also, the key issue underlying both the second and third restrictions is the degree of knowledge of the migration unit's state at migration time. Resolution of the issue of capturing the state of a single threaded migration unit at an arbitrary point in its execution will generalize to multiple threads. Two approaches are under preliminary investiga- tion at this time. They are to delay migration until all threads progress to a "migration point" and to attempt migration at an arbitrary point in execution. 3.3 Design Components Our view of a migration unit is that it contains a collection of state and state transformation information. This information is described below on a component basis. The components are interre- lated, so some elements of the design are necessarily referenced before they are described. In the descriptions presented, we use the terms migratable and non-migratable as shorthand for hetero- geneously migratable and not heterogeneously migratable, noting that a migration unit that is not heterogeneously migratable might be migratable between homogeneous nodes. Heterogeneous Migration Page 9 Also, the information described below can, in general, be obtained before initial load, either from a modified compiler, a compilation post processor, or some combination thereof. Excep- tions are noted. 1. There is a complete sequence of machine code for all target architectures. It is far simpler to generate this information before execution than to try to dynam- ically generate it during execution. This clearly makes an executable image substantially larger than a corresponding non migratable unit. However, there are several ameliorating factors. First, we have a single executable module that will run on all target architec- tures rather than several executable units, one for each architecture. The size of the migratable unit will be comparable with the sums of the sizes of the non- migratable units. Thus, external storage costs of a migratable unit will be comparable with the costs of storing all the non migratable versions. Since we have a single unit of execution rather than several units for different architectures, we will be assured that the same software runs on all architectures and will not have to worry about one architecture's version being older than anothers. Second, in a virtual memory system, the physical memory requirements will be roughly equivalent because the code for the "other" Heterogeneous Migration Page 10 architectures will not be active and thus won't reside in main memory. Also, Zayas [ZAYA87] suggests that transmission time will not be impacted by the larger size. 2. There is static data as in non migratable applications. The layout of this data is of necessity different in the migratable case. We use the term "greatest common denominator" to describe the data layout. In essence, enough space is reserved for each primitive data type to concur with size and alignment requirements of all architectures. One can, if one desires, lay things out in a fashion to allow efficient processing with a vec- tor processor. This requirement simplifies migration considerably for several reasons. a. Each primitive data item occupies the same address in all architectures. The consequence of this is that pointers to data point to an address indepen- dent of the architecture. b. Each primitive item within a structure or record is found at the same offset within the structure or record, so primitive items within structures or records can be considered individually during migration. c. Similarly, records or structures within records or structures are found at the same offset. Heterogeneous Migration Page 11 d. Data items are guaranteed to be representable in all architectures. e. Pointers to code point to code for a specific architecture (see below). f. Static data can thus be converted from one archi- tecture to another on a component by component basis (see below for details). 3. Dynamically requested data (typically obtained via a "new" statement or a call to an allocation procedure) is similar to static data. This dynamic data is laid out using the same notion of GCD for size and layout. Implicit in this is that the call to the allocation procedure must embody some descriptor information about what is being requested so the migration routines can massage this data as well. 4. Activation history information is similarly constrained by GCD considerations. This information consists of space allocated for local dynamic variables and saving state information. 5. Loader information must include the target architecture of the initial form of the migration unit since all initialized data is represented in that form. 6. State outside the address space of the migration unit must also be handled. Thus it behooves one to select Heterogeneous Migration Page 12 an environment that minimizes this state. Distributed systems do this well. Moreover, having a distributed kernel across diverse nodes assures us that kernel data structures can be uniformly designed to handle issues in this area. 7. State mapping information is new information not nor- mally found in non migratable units. This is informa- tion that allows the information described above to be converted from its representation on the source archi- tecture to its representation at the destination. The information is designed so it can be generated before execution. It includes several items: a. The function entry correspondence table contains the corresponding function entry points of the code in different architectures. This allows map- ping pointers to functions. b. The return address correspondence table contains corresponding return addresses (code pointers) in different architectures. This allows mapping return pointers in the activation history. c. Static data mapping information describes how to map the static data on a component by component basis. This information is a list of transforms. d. Dynamic data mapping information is similar. e. Activation history mapping information consists of Heterogeneous Migration Page 13 two parts. The first is a set of transformation information for the dynamically allocated local storage. This part is similar to the static and dynamic data mapping information. The second describes the semantics of the information in registers and how to handle any register mismatches between the architectures. As an example, consider migration from a vax to a sun. Suppose a record/structure stored starting at address 2268 (hex) consisting of two 8 bit integers whose values are 3 and 4 respectively, 16 bits of padding for alignment, a 32 bit integer containing the value 107 (hex), a 32 bit pointer to an integer that currently points to the 32 bit integer, and a 32 bit pointer to a function that points to the function "prune". Suppose further that the vax code for the function "prune" starts at address 1020304 (hex) and the sun code starts at address 1040503 (hex). The data mapping table would look as follows: 2268 -> 4 * 1, 2 * 4, 1 * T nil denoting map 4 one byte quantities, then 2 four byte quantities, and one four byte quantity through the function entry table. The function entry table would contain the pair, Heterogeneous Migration Page 14 1020304, 1040503 and the corresponding values in memory would be memory vax sun location before after -------- ------ ----- 2268 03 03 2269 04 04 226a ?? ?? 226b ?? ?? 226c 07 00 226d 01 00 226e 00 01 226f 00 07 2270 6C 00 2271 22 00 2272 00 22 2273 00 6C 2274 04 01 2275 03 04 2276 02 05 2277 01 03 8. On a request for migration, the migration unit is frozen; the address space is copied to the destination; a data massage routine traverses the address space to Heterogeneous Migration Page 15 map the static data, then traverses the dynamic storage to map the dynamic data, and finally traverses the activation history; and then the migration unit is reactivated. The data massage is simplified because at the time of the migration request, all the needed state information is in memory, and none is temporarily in the machine registers on the source machine. 4. PROTOTYPE IMPLEMENTATION Our prototype is implemented under the V operating system [CHER88] for several reasons. First, we maximized leverage because V already runs on two diverse architectures and already has support for homogeneous migration. Thus we were able to focus our attention on heterogeneous migration without being bogged down in the details of designing and implementing a distributed system or in the details of implementing the bulk of the migra- tion work. Secondly, V (we found out as part of this work) keeps a minimum of state information outside the address space of the migration unit. The down side was we needed to use the "C" pro- gramming language and thus needed to be careful to use the language in accord with our restrictions. An implication of this choice was that migration be between sun workstation and vax workstations. This proved to be a mixed blessing in that we had to cope with several differences (most notably byte ordering and register mismatches) but had several commonalities as well. Heterogeneous Migration Page 16 Prototype executable generation was done in an ad-hoc fashion although many phases were automated. The difficult tasks in pro- totype generation included data layout, done by manual padding of the data declarations in the source; generation of the mapping information, done by a locally designed tool; and generation of the loadable module, done by a tool that constructed an appropri- ate header, and extracted relevant portions of non-migratable executables and joined them together. The code to do the migration was augmented to include the data massage. This was straightforward, though tedious. Minimal changes were made to system tables, although we did run into some cases where data in the kernel had to be mapped as well. As the prototype was to show feasibility, we have not yet imple- mented data massage on initial load (the unit must always start execution on a sun) nor have we completed testing of migration in the opposite direction. 5. CONCLUSIONS The work described above has provided a design architecture for heterogeneous migration, and the prototype has shown feasibility of this approach. Much more effort is necessary to determine the eventual utility of this approach. Heterogeneous Migration Page 17 6. FUTURE RESEARCH This effort opens the door for much related work. In terms of pure development, there is the enhancing of the prototype to allow starting the executable in either architecture and complet- ing the code to migrate in the reverse direction. Research issues to be pursued include reduction of the granular- ity with which migration can occur and the extension to multiple threads; extension to additional architectures; investigation of parallel compilation as a vehicle to handle data layout and map- ping table generation; and investigation of the application level below which migration should not be allowed. 7. REFERENCES [ALME85] Almes, G. T., Black, A. T., Lazowska, E. D., and Noe, J. D., "The Eden System: A Technical Review," IEEE Transactions on Software Engineering, Volume 11, Number 1, January, 1985, Pages 43 - 59. [ATTA88] Attardi, G., Baldi, A., Boni, U., Carignani, F., Cozzi, G. Pelligrini, A., Durocher, E., Filotti, I., Qing, W., Hunter, M., Marks, J., Richardson, C., and Watson, A., "Techniques for Dynamic Software Migration," Proceed- ings of the 5th Annual Esprit Conference, North- Holland, November, 1988, Pages 475 - 491. Heterogeneous Migration Page 18 [BLAC85] Black, A., "Supporting Distributed Applications: Experience with Eden," Proceedings of the Tenth ACM Symposium on Operating Systems Principles, Orcas Island, Washington, December, 1985, Pages 181 - 193. [CHER84] Cheriton, D. R., "The V Kernel: A Software Base for Distributed Systems," IEEE Software, Volume 1, Number 2, April, 1984, Pages 19 - 42. [CHER86] Cheriton, D. R., Lantz, K. A. "V-System 6.0 Reference Manual," Computer Systems Laboratory, Departments of Computer Science and Electrical Engineering, Stanford University, June, 1986. [CHER88] Cheriton, D. R., "The V Distributed System," Communica- tions of the ACM, Volume 31, Number 3, March, 1988, Pages 314 - 333. [EAGE88] Eager, D. L., Lazowska, E. D., and Zahorjan, J., "The Limited Performance Benefits of Migrating Active Processes For Load Sharing," Proceedings of the 1988 Sigmetrics Conference on Measurement and Modeling of Computer Systems, Sante Fe, New Mexico, May, 1988, Pages 63 - 72. [MORR86] Morris, J. H., et. al., "Andrew: A Distributed Personal Computing Environment," Communications of the ACM, Volume 19, Number 3, March, 1986, Pages 184 - 201. Heterogeneous Migration Page 19 [MILL87] Miller, B. et. al., "DEMOS/MP: The Development of a Distributed Operating System," Software Practice and Experience, Volume 17, Number 4, April, 1987, Pages 277 - 290. [NEED82] Needham, R. and Herbert, A., "The Cambridge Distributed Computing System," Addison Wesley, 1982. [OUST81] Ousterhout, J. K., "Medusa A Distributed Operating Sys- tem," UMI Research Press, 1981. [OUST88] Ousterhout, J. K., Cherenson, A. R., Douglis, F., Nel- son, M. N., and Welch, B. B., "The Sprite Network Operating System," IEEE Computer, February, 1988, Pages 23 - 45. [POWE83] Powell, M. L. and Miller, B. P., "Process Migration in DEMOS/MP", Proceedings of the Ninth ACM Symposium on Operating Systems Principles, Bretton Woods, New Hampshire, October, 1983, Pages 110 - 119. [SAGE85] Sager, G. R. et. al. "The Oryx/Pecos Operating System," A. T. and T. Technical Journal, Volume 64, Number 1, January, 1985, Pages 251 - 268. [SMIT88] Smith, J. M., "A Survey of Process Migration Mechan- isms," Operating Systems Review, Volume 22, Number 3, July, 1988, Pages 28 - 40. Heterogeneous Migration Page 20 [STAN85] Stankovic, J. A., "Reliable Distributed System Software," IEEE Computer Society Press, 1985. [TANE85] Tanenbaum, A. S. and van Renesse, R., "Distributed Operating Systems," Computing Surveys, Volume 17, Number 4, December, 1985, Pages 419 - 470. [THEI85] Theimer, M., et. al., "Preemptable Remote Execution Facilities for the V-System," Proceedings of the Tenth ACM Symposium on Operating Systems Principles, Orcas Island, Washington, December, 1985, Pages 2 - 12. [THEI86] Theimer, M., "Preemptable Remote Execution Facilities for Loosely-Coupled Distributed Systems," Doctoral Dissertation, Stanford University, 1986. [WALK83] Walker, B., Popek, G., English, R., Kline, C., and Thiel, G., "The LOCUS Distributed Operating System," Operating Systems Review, Volume 17, Number 5, December, 1983, Pages 49 - 70. [ZAYA87] Zayas, E., "Attacking the Process Migration Bottleneck," Proceedings of the Eleventh ACM Symposium on Operating Systems Principles, Austin, Texas, December, 1987, Pages 13 - 24.