Concept: Legacy Evolution

Activities across the lifecycle:

Introduction
Establishing a Baseline
Going Forward with the RUP
The Evolution Cycle
Summary
Reference

Introduction

A legacy system has been defined as a system that "...significantly resists modification and evolution to meet new and constantly changing business requirements."[1] It usually implies that the system is large and old. In this context, it also means "a system originally developed using a process other than RUP."

By "evolution" we mean a significant project for updating, incorporating, or redeveloping a legacy system. This roadmap therefore does not describe how RUP can be used for ongoing maintenance of mature systems.

Usually, the first thing to do is to define a vision for the proposed evolution, answering questions such as:

Where is the value of your legacy system?
How do you want it to evolve?
Is there a business case supporting the planned evolution?

A more detailed discussion of these topics can be found in the Guideline: Defining a Vision for Legacy Evolution.

Common challenges of evolving legacy systems include:

The system is poorly understood.
- Documentation is out of date.
- Original developers are not available, and the remaining staff has limited knowledge of how the system actually works.
The system was developed using older software development methods and technologies that may not be suitable for future development efforts.

As with all projects, the fundamental principles of RUP are applicable to legacy evolution projects.

These principles are:

Early risk mitigation
Iterative development
Progress assessment based on concrete, measurable evidence
Organization around small, empowered teams
Verifying quality continuously
Scope management
Producing only the work products that are needed

Because these principles hold, the basic RUP lifecycle template, with its four phases of Inception, Elaboration, Construction, and Transition, is fully applicable to a legacy system project. This, in turn, makes most of the Project Management activities of the RUP fully applicable as well.

The adaptation of the Rational Unified Process (RUP) to deal with evolution of legacy systems is discussed in further detail below.

Establishing a Baseline

To go beyond simply applying the RUP lifecycle and use other disciplines of the RUP going forward, you need to establish a starting point. You must identify a minimal set of essential work products describing the legacy system.

Depending on the scope of the evolution, you may need more or less of:

Requirements
Architecture and design
Tests
User documentation

Once you have established this baseline of RUP work products, you can proceed with the legacy project as if it were a RUP Evolution cycle.

Establishing a minimal set of work products that will allow your project to proceed as per the RUP requires some reverse engineering on your legacy system. By reverse engineering, we mean trying to identify, extract, or recreate enough information to enable you to proceed almost as if the project had been originally developed using the RUP. This is the point at which many project managers are ready to scrap the RUP for their legacy project, as they perceive this reverse engineering effort to be a huge waste of time. It does not need to be such an immense effort, though, as the intent is not to recreate every single work product, but to understand the key attributes of the current system and determine what should be conserved and what should be replaced or upgraded.

The RUP templates for these work products can be used, as well as some of the associated guidelines and checklists, but you probably want to tailor the templates first to avoid falling into the trap of documenting elements you do not need. In many cases, you can "fill in" the templates (in your first pass) by cross-referencing; that is, indicating in which existing document the corresponding information can be found. If the existing documentation is online in HTML, then hyperlinks can be used.

Note that this step of establishing a baseline is not RUP specific. Whatever process or method you will use to go forward, you will need to do some reverse engineering of the existing system.

Requirements

Perhaps the greatest value of a legacy system is as a requirements specification for the new system.

For example, when we started Rational Apex, the first draft of our Vision Document stated "...it has first to do everything that the Rational Environment (version Delta) does, and do it no slower." Then we specified deviations from the Rational Environment: features added, features dropped.

You do not have to restart the requirements effort from scratch or retrospectively document every requirement of the legacy system; you only need to identify your key use cases. You probably have them already, described in the current User's Manual. Just having an inventory of the use cases (a Report: Use-Case Model Survey) may be enough. You will only need to detail the use cases that need to change. Many of the nonfunctional requirements can be derived from your marketing or installation documentation: capabilities, size and performance characteristics, operating systems, memory, peripherals, other software, general constraints, and most of the "ilities." If you are not using a requirements management tool, then maybe now is the right time to start. Finally, a good additional artifact to create while doing this reverse engineering is a Glossary of terms used in the legacy system, collecting terms as you encounter them. It can prove invaluable when going forward.

Architecture and Design

Your legacy system does not need to be completely redesigned using object-oriented (OO) techniques. You will, however, need a minimal amount of architectural information. You can create a minimal Software Architecture Document, starting from the Implementation View: What are the various subsystems or main bodies of code? What are the critical interfaces? From this information, you can identify your Deployment View and your Process View if the legacy system is distributed. You will need a precise inventory of the existing software, clearly identifying each element and the relationships among them. If the software is not yet under configuration management, now is the right time to start controlling it.
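A precise inventory can start as a simple table of elements and the elements each one depends on. The sketch below is only an illustration and is not prescribed by the RUP: the subsystem names and the helper affected_by are hypothetical. It records the relationships and then lists every element that could be affected when one element changes, which also shows what is left as a candidate for the stable core.

```python
from collections import deque

# Hypothetical inventory: each element maps to the elements it depends on.
INVENTORY = {
    "ui": ["billing", "reports"],
    "billing": ["ledger"],
    "reports": ["ledger"],
    "ledger": [],
    "archive": [],
}


def affected_by(changed, inventory):
    """Return the elements that directly or indirectly depend on `changed`."""
    # Invert the dependency relation: element -> elements that use it.
    dependents = {}
    for name, deps in inventory.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    # Breadth-first walk over the dependents of the changed element.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


impacted = affected_by("ledger", INVENTORY)
stable = set(INVENTORY) - impacted - {"ledger"}
print(sorted(impacted))  # elements to review if "ledger" changes
print(sorted(stable))    # elements untouched by that change
```

In practice the inventory would be extracted from the configuration management system or build files rather than typed by hand.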

Describing the interfaces and the scenarios of how these interfaces are exercised is crucial. Later on, you will identify the subsystems that are not affected by the evolution: the stable, core, reusable chunks of the legacy system. Do you need detailed software design documentation as well as these interface descriptions? If you have it and can trust it, that is nice, but do not embark on a huge effort to produce it before you know what pieces need to be changed. Even then, proceed on a case-by-case basis. Tools can help with this reverse engineering, although their output still needs human interpretation.

You will also need to identify the data sources of your legacy system that need to be migrated and record their data profile in the Data Migration Specification. This will be crucial information when you start defining the data mapping between existing data sources and those needed by the new version of the system.
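What goes into a data profile depends on your project. As a minimal sketch, assuming the profile records row counts, missing values, distinct values, and maximum field length per column, it could be computed as follows. The table extract and the function profile_column are hypothetical.

```python
def profile_column(values):
    """Summarize one column of a legacy data source."""
    # Treat None and the empty string as missing values.
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "max_length": max((len(str(v)) for v in present), default=0),
    }


# Hypothetical extract of a legacy customer table.
legacy_rows = [
    {"cust_no": "0001", "name": "ACME", "phone": ""},
    {"cust_no": "0002", "name": "Globex", "phone": "555-0100"},
    {"cust_no": "0003", "name": "ACME", "phone": None},
]

profile = {
    column: profile_column([row[column] for row in legacy_rows])
    for column in legacy_rows[0]
}
for column, stats in profile.items():
    print(column, stats)
```

A profile like this shows early which columns are sparsely filled or contain duplicates, before any mapping to the new data sources is defined.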

Tests

Whatever tests, test scripts, test cases, and test harnesses were developed for the legacy system will still be largely applicable to the new system.

User Documentation

Unless there is an incentive to completely revamp it, the user documentation for the legacy system can constitute a good baseline for the new system.

Going Forward with the RUP

Once you have established your minimal RUP work product baseline, much of it by reference to existing information, you can now proceed. Most of the tasks of the RUP apply, just as they do in Construction and Transition iterations for a brand new development project. Yet, as always, try to keep things as light as possible as you choose what to adopt from RUP; do not execute tasks or create work products that are unnecessary.

Requirements Management

Express new requirements using use cases. You may have to recreate a use case for existing functionality to better articulate what is being changed. If several use cases need to be added or changed, you may find it useful to derive a small Domain Model from your Glossary.

Architecture and Design

You might want to use object-oriented techniques and the UML (Unified Modeling Language) for your new development. A handy technique is to consider some of the least affected subsystems as big composite classes, especially when you are doing sequence diagrams. The resulting Design Model should only go into details for the classes that are architecturally important or that need to evolve. Proxies can be created for these classes, mapping their functionality to the existing code.

If your long-term goal is ambitious and aims at a complete, gradual replacement of the legacy system, you will have to do an architectural design for the new system, and then map it to the existing subsystems. You can create wrappers around some of the existing body of code to make it look as if it had been designed using OO techniques. Reassembling the complete system with the various wrappers can be an internal milestone in your Elaboration phase. As you go into use-case design, your use-case realizations will show you the impact on various existing subsystems. Then you can decide which of these "wrapped subsystems" need to be converted, ported, rewritten, or integrated in an EAI (Enterprise Application Integration) framework.
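To make the wrapper idea concrete, here is a minimal sketch. The two procedural routines acct_rd and acct_upd stand in for existing legacy code; their names, return conventions, and the Account class are all hypothetical. The wrapper hides the status strings and return codes behind an object with a property and a method, so new code can be written against it without touching the routines underneath.

```python
# Stand-ins for legacy procedural code (hypothetical names and conventions).
_ACCOUNTS = {"A100": 250}


def acct_rd(acct_id):
    """Legacy read routine: returns a 'status|balance' string."""
    if acct_id not in _ACCOUNTS:
        return "ERR|0"
    return "OK|%d" % _ACCOUNTS[acct_id]


def acct_upd(acct_id, delta):
    """Legacy update routine: returns 0 on success, 1 on failure."""
    if acct_id not in _ACCOUNTS:
        return 1
    _ACCOUNTS[acct_id] += delta
    return 0


class Account:
    """Wrapper that presents the legacy routines as an object."""

    def __init__(self, account_id):
        self.account_id = account_id

    @property
    def balance(self):
        # Translate the legacy status string into a value or an exception.
        status, amount = acct_rd(self.account_id).split("|")
        if status != "OK":
            raise LookupError("unknown account: " + self.account_id)
        return int(amount)

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        # Translate the legacy return code into an exception.
        if acct_upd(self.account_id, amount) != 0:
            raise LookupError("unknown account: " + self.account_id)


account = Account("A100")
account.deposit(50)
print(account.balance)
```

Because callers see only the Account interface, the routines behind it can later be ported or rewritten one at a time.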

The Data Migration Specification needs to be completed with the source-target data mapping. It will be used to implement the migration components necessary to perform the data migration.
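A source-target mapping can be written down as data and then applied mechanically by a migration component. The sketch below assumes a very simple form of mapping, one source field and one conversion per target field; the field names and the function migrate_row are hypothetical and do not come from the RUP templates.

```python
# Hypothetical mapping: target field -> (source field, conversion function).
MAPPING = {
    "customer_id": ("cust_no", int),
    "name": ("name", str.strip),
    "phone": ("phone", lambda v: v or None),
}


def migrate_row(source_row, mapping):
    """Build one target record from one legacy record."""
    return {
        target: convert(source_row[source])
        for target, (source, convert) in mapping.items()
    }


legacy_row = {"cust_no": "0042", "name": " ACME ", "phone": ""}
print(migrate_row(legacy_row, MAPPING))
```

Keeping the mapping separate from the code that applies it lets the specification be reviewed and changed without rewriting the migration component.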

In some limited cases, you might be able to use tools, such as IBM Rational XDE or Rose, to reverse engineer elements of your existing code into the UML. But do not rely on using the results blindly; they will always require some human interpretation.

Deployment

Depending on the scope of the evolution, deployment of the new system may be more challenging than for a green-field development. If you migrated the system to a new architecture or redeveloped significant portions of it, you will have to choose a strategy: either cut over "cold turkey" from the old system to the new one, or use a phased strategy and make the transition in small incremental steps. You can even have both systems working in parallel until the new one can be fully trusted. In practice, deployment is often much more delicate with a legacy system than with a new application, as you need to tackle issues of data conversion and migration, continuity of operations, retraining of personnel, and so on. This deployment strategy could be described in the Deployment Plan.
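Running both systems in parallel amounts to feeding the same inputs to each and recording where the answers differ, while the old system stays authoritative. The following is a minimal sketch of that idea only; the two pricing functions are hypothetical stand-ins for the old and new systems.

```python
def legacy_price(quantity):
    """Stand-in for the old system's calculation (hypothetical)."""
    return quantity * 10


def new_price(quantity):
    """Stand-in for the replacement; differs above 100 units (hypothetical)."""
    if quantity > 100:
        return quantity * 9
    return quantity * 10


def run_in_parallel(inputs, old, new):
    """Serve results from the old system and record where the new one differs."""
    results, mismatches = [], []
    for item in inputs:
        expected = old(item)
        actual = new(item)
        if actual != expected:
            mismatches.append((item, expected, actual))
        results.append(expected)  # the old system remains authoritative
    return results, mismatches


results, mismatches = run_in_parallel([1, 50, 200], legacy_price, new_price)
print(mismatches)
```

Each recorded mismatch is either a defect in the new system or an intended change in behavior that should be traceable to a requirement.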

Other Disciplines

Other software development disciplines with all their tasks, guidelines, techniques, and tools also apply: test and implementation, for example. Configuration management may be more relevant and required earlier in the project than for a new development as you start from day one with many existing work products, sometimes with complex dependencies between them. In a legacy system upgrade, change management becomes a dominant activity.

Often, the decision to redevelop a legacy system also represents an opportunity to reengineer business processes, using business modeling, which could lead to a different set of requirements for the new system.

The Evolution Cycle

A legacy evolution project goes through the same cycle of phases as all RUP projects. The objectives of these phases are essentially the same; however, the following sections describe some specifics for legacy evolution projects.

Inception phase

The RUP Inception phase specifies that you produce a Vision Document and Business Case, as well as an Initial Development Case specifying which work products you need to recreate. In this phase, you will also start the process of reverse engineering for some of the work products: requirements and architecture, mainly, in order to be able to choose the appropriate evolution strategy and estimate its cost.

Elaboration phase

In this phase, you will complete your RUP baseline, the minimal set of work products that you need to go forward, including the conversion of some older work products to the new tool set. For simple extensions, this can be done in one short iteration. But if there is a large number of architectural changes to go through, as in a migration strategy or redevelopment, then you will have several iterations in this Elaboration phase to implement a new architectural baseline. It may even be that this Elaboration phase is the dominant phase and that there will be little to do in Construction and Transition. Testing is put in place in the new environment, and regression testing can start early. Unlike Elaboration for a green-field development, there is from the beginning a large number of work products, code in particular, to manage, and tasks from the Change and Configuration Management discipline may have to be stressed earlier.

Construction phase

This phase is not significantly different from any other RUP project, except that much of the work involves interfacing to or reworking existing code rather than developing new code. Additional elements are reverse-engineered, redesigned, and documented as necessary.

Transition phase

The Transition phase may be more delicate, depending on the deployment strategy to go from the old system to the new one; see the section on Deployment above.

RUP's iterative approach is particularly helpful in staging legacy evolutions, with its concrete and measurable objectives for each iteration. Joe Marasco, the manager for the Rational Apex project, wrote:

"We decided which bits of functionality needed to be moved first, which parts will be moved without touching them at all, which will be moved in later iterations. The version on Sun OS was postponed to a later iteration, once the version on AIX was stable. Instead of seeing the butterfly emerge in one day from the cocoon, you plan its metamorphosis and track its evolution iteration by iteration. I cannot imagine managing the evolution of a complex legacy system by any other means."

Summary

How do you apply the RUP to a legacy system?

First, by understanding what you are trying to do.
Second, by intelligently exploiting what you already have.
Third, by focusing on the principles and not necessarily the details of the RUP.

Large portions of the RUP can be used for the evolution of a legacy system, with more or less tailoring and formality, depending on the type of evolution you envisage and how much information on the legacy system is at hand.

That the project concerns a legacy system is no reason not to have a Vision Document, describing what it is you want to achieve; a Project Plan, showing major milestones and what you want to accomplish; maybe iterations and their specific objectives; and a Risk List. You also need a Business Case to be able to discuss the benefits of doing the project and the approach you will take.

Additional RUP work products can also be developed by extracting from or reverse engineering the existing system. However, this should be done judiciously, as it is often more cost-effective to continue to use and reference existing documentation rather than change it to RUP format.

One caution is that we have seen projects fail when too many changes were attempted at the same time: a major evolution of a legacy system (e.g., a migration to a new platform) at the same time as a change of process (e.g., going to the RUP) and a change of tool set (e.g., going to Rational Suites). It is preferable to introduce a new process and new tools during an earlier project before you undertake a major legacy evolution so that developers can become familiar with the RUP, its philosophy, and its contents, as well as the tools that support it. Avoid multiplying risk for the project by introducing too many unknowns and changes simultaneously.

Reference

[1] Michael Brodie and Michael Stonebraker, Migrating Legacy Systems, San Francisco: Morgan Kaufmann Publishers, 1995.