Guideline: Designing State for J2EE Applications
This guideline discusses state management design mechanisms for a J2EE application.

Introduction

Effective management of application state is an important aspect of designing distributed applications. This guideline provides an overview of some of the common design considerations and mechanisms for state management in a J2EE application.

Design considerations related to state management should be addressed during the Elaboration Phase of the project.  The software architect should examine general approaches to state management as part of the activities associated with the Analysis & Design Discipline Activity: Define a Candidate Architecture.  During the Task: Architectural Analysis, the software architect should examine the scalability and performance requirements for the application to determine which state management techniques are needed for the application to meet its performance objectives.  As the design of the application is refined during the course of the Elaboration Phase, the architect will need to define J2EE-specific design and implementation mechanisms for managing state information within the application in the Task: Identify Design Mechanisms.

As described in Concept: J2EE Deployment Configurations, J2EE applications can be composed of several logical layers distributed across one to many physical tiers (machines).  After a brief technical overview of state management, the remaining sections of this guideline will discuss the different state management design and implementation mechanisms that can be used across the many application tiers.

Note that the software architect should document which mechanisms have been selected as part of the Artifact: Software Architecture Document, and should provide guidelines for using these mechanisms as part of project-specific design guidelines.

Technical Overview

There is a growing trend to build distributed applications that interact with the Internet in one form or another. Even though the underpinnings of the Internet are by nature stateless, almost any kind of business application needs to manage state. Consider an Internet application where a user clicks on a link from page-a to page-b. The application processing the request for page-b no longer has access to the information used to process page-a.  This behavior may be acceptable for static web pages, but most business applications require some information about the previous processing. This is where state management mechanisms provided by J2EE come in.

Transient vs. Persistent State

Before delving into the state management guidelines it is important to differentiate between types of state information. State information can be broadly divided into two categories: transient (only exists as long as the application is active) and persistent (exists after the application has terminated).

Transient state information exists as long as the entity holding this information is alive. An example is state information stored as a field in an ordinary Java class. If the container hosting this class is terminated for any reason, the state information will be lost, unless the data has been replicated elsewhere, such as on a backup server.

Persistent state exists as long as the data store used to maintain state information exists.  Persistent state information is generally stored in a file or database, and is loaded when needed by the application.  Any changes to persistent state information must be written back to the data store.  The integrity and recoverability aspects of the persistent data store should be consistent with those of the data being accessed by the application. An example of persistent state is information stored in a relational database.
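The difference can be illustrated with a small sketch. The class below is hypothetical and uses only the standard Java library: one counter lives in an ordinary field (transient), the other is written to a file (persistent). Creating a second object over the same file stands in for an application restart.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class, not part of any J2EE API.
class VisitCounter {
    private int transientCount;   // transient: lost when this object (or its JVM) goes away
    private final Path store;     // persistent: survives as long as the file exists

    VisitCounter(Path store) {
        this.store = store;
    }

    // Increments both counters and returns the persistent total.
    int recordVisit() {
        transientCount++;
        int persisted = readPersistent() + 1;
        try {
            Files.write(store, Integer.toString(persisted).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return persisted;
    }

    int transientCount() {
        return transientCount;
    }

    // Loads the persistent value from the file; a missing or empty file counts as zero.
    int readPersistent() {
        try {
            if (!Files.exists(store)) {
                return 0;
            }
            String text = new String(Files.readAllBytes(store), StandardCharsets.UTF_8).trim();
            return text.isEmpty() ? 0 : Integer.parseInt(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Records two visits, then simulates a restart by creating a fresh object over the same file.
    // Returns { transient count after restart, persistent count after restart }.
    static int[] simulateRestart() {
        try {
            Path file = Files.createTempFile("visits", ".txt");
            VisitCounter first = new VisitCounter(file);
            first.recordVisit();
            first.recordVisit();
            VisitCounter second = new VisitCounter(file);
            return new int[] { second.transientCount(), second.readPersistent() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = simulateRestart();
        System.out.println("transient=" + result[0] + " persistent=" + result[1]);
    }
}
```

After the simulated restart the field-based counter is back at zero, while the file-backed counter still reports both visits.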

Session State

Web clients often require the ability to make multiple browser requests, navigating from page to page, while retaining client-specific information, such as items in a shopping cart. Web applications handle this by creating a session ID, and associating state data with this session ID. The session ID and associated state is referred to as session state.

Session state is data associated with a particular client's interaction with a web application over a short period of time (minutes or hours, rather than days). Thus, session state is short-lived data that is commonly deleted after some time-out period, in order to avoid consuming resources.

Session state can be stored at the client or at the server, as described in later sections. The J2EE platform provides mechanisms specifically tailored to managing session state, because of its importance in web-based applications.

Basic Persistence Mechanisms

The following are common mechanisms used by web applications to store state.

Cookies

Cookies are small pieces of text that a server stores on a web-based client. The client sends the cookie back to the server on subsequent requests, giving the server access to the state data stored in the cookie.

Some issues with cookies are:

  • Many users believe that cookies compromise security and/or privacy and, therefore, they disable cookies.
  • The size of cookie headers is limited, which limits how much data can be stored.
  • Some protocols, such as Wireless Application Protocol (WAP), do not support cookies.
  • If a client logs in from another location (such as another machine) cookies stored in the other location are not available.
  • State data must be representable by string values.
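The string-only and size restrictions can be made concrete with a minimal sketch that packs a map of state values into a single cookie-safe string and back. The class name, the "key=value&key=value" layout, and the 4096-character cap are assumptions chosen for illustration; actual limits vary by client.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns state into one cookie value and back.
class CookieStateCodec {
    // Assumed cap for illustration; real per-cookie limits depend on the client.
    static final int MAX_COOKIE_CHARS = 4096;

    // Encodes each key and value, then joins the pairs with '&'.
    static String encode(Map<String, String> state) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : state.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        if (sb.length() > MAX_COOKIE_CHARS) {
            throw new IllegalArgumentException("state too large for a cookie");
        }
        return sb.toString();
    }

    // Reverses encode(); an empty cookie value yields an empty map.
    static Map<String, String> decode(String cookieValue) {
        Map<String, String> state = new LinkedHashMap<>();
        if (cookieValue.isEmpty()) {
            return state;
        }
        for (String pair : cookieValue.split("&")) {
            int eq = pair.indexOf('=');
            state.put(dec(pair.substring(0, eq)), dec(pair.substring(eq + 1)));
        }
        return state;
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    private static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> cart = new LinkedHashMap<>();
        cart.put("item", "book 42");
        cart.put("qty", "2");
        String value = encode(cart);
        System.out.println(value);
        System.out.println(decode(value));
    }
}
```

Anything that is not naturally a string (numbers, dates, object graphs) has to be converted on the way in and parsed on the way out, which is one reason larger or richer state is usually kept on the server.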

URL Rewriting

URL rewriting is a mechanism for embedding session state into the URLs referenced in each page. When a web server generates pages to be delivered to a client, it encodes the session state into the URLs of the page. Then when the user clicks on a URL, the state data stored in the URL is sent back to the server, allowing it to re-establish the session context. A similar mechanism uses HTML hidden fields. Issues with these mechanisms are:

  • All pages in a given session must be handled by the server, otherwise the server may lose track of the session.
  • State does not survive when the client shuts down her browser, or links to a specific URL by typing or using a bookmark.
  • As with cookies, the state data is not available when the client logs in from another location.
  • As with cookies, the state data must be representable by string values.
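As a rough sketch of the idea, the helper below embeds a session ID into a URL as a ";jsessionid=" path parameter, the form conventionally used by servlet containers, and recovers it again. The class and method names are hypothetical, and only the session ID (not arbitrary state) is encoded; in a real J2EE application the container performs this rewriting.

```java
// Hypothetical helper illustrating URL rewriting; not a container API.
class SessionUrlRewriter {
    private static final String MARKER = ";jsessionid=";

    // Inserts the session ID after the path, before any query string or fragment.
    static String rewrite(String url, String sessionId) {
        int cut = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            cut = Math.min(cut, query);
        }
        if (fragment >= 0) {
            cut = Math.min(cut, fragment);
        }
        return url.substring(0, cut) + MARKER + sessionId + url.substring(cut);
    }

    // Returns the session ID carried by a rewritten URL, or null if there is none.
    static String extractSessionId(String url) {
        int start = url.indexOf(MARKER);
        if (start < 0) {
            return null;
        }
        start += MARKER.length();
        int end = url.length();
        for (char stop : new char[] { '?', '#', ';' }) {
            int i = url.indexOf(stop, start);
            if (i >= 0) {
                end = Math.min(end, i);
            }
        }
        return url.substring(start, end);
    }

    public static void main(String[] args) {
        String rewritten = rewrite("/cart/view?item=3", "ABC123");
        System.out.println(rewritten);
        System.out.println(extractSessionId(rewritten));
    }
}
```

The sketch also shows why the session is lost when the user types a URL or follows a bookmark made before the session began: a URL without the embedded ID gives the server nothing to look up.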

Flat File

A flat file is one of the simplest methods of maintaining persistent state information.  Upon initialization, the flat file is read to establish the initial state values.  Each time the state is changed, the file must be rewritten to save the state.  Some disadvantages of maintaining application-state in a flat file are:

  • The scalability of the application is adversely impacted, since the application must lock the application object to prevent access to the global data while the application-state variables are being updated and rewritten to the flat file.
  • In most cases, updating data will require rewriting the entire file.
  • Flat files do not always provide recoverability in the event of an error.
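The sketch below shows these trade-offs in a hypothetical flat-file store built on java.util.Properties. Every update holds a lock and rewrites the whole file. Writing to a temporary file and then moving it over the original is one assumed way to reduce the risk of a half-written file; it is not a substitute for real recovery support.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

// Hypothetical flat-file state store; names are illustrative only.
class FlatFileState {
    private final Path file;
    private final Properties values = new Properties();

    // Reads the file once at initialization to establish the initial state.
    FlatFileState(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                values.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    synchronized String get(String key) {
        return values.getProperty(key);
    }

    // Each change rewrites the entire file while holding the lock,
    // which serializes all writers and readers of this store.
    synchronized void put(String key, String value) {
        values.setProperty(key, value);
        try {
            Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "state", ".tmp");
            try (OutputStream out = Files.newOutputStream(tmp)) {
                values.store(out, "application state");
            }
            // Replace the old file only after the new one is completely written.
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a value, then reloads the file through a fresh object to show it persisted.
    static String roundTrip() {
        try {
            Path dir = Files.createTempDirectory("flatstate");
            Path file = dir.resolve("app.properties");
            new FlatFileState(file).put("theme", "dark");
            return new FlatFileState(file).get("theme");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```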

XML

Maintaining persistent state information in an XML file is a step up from a flat file.  Some advantages of maintaining application state in an XML file as opposed to a flat file are:

  • An XML file provides structure that is not present in a flat file.
  • An XML file can be parsed using standard APIs.
  • An XML file is generally more portable.
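For instance, state kept in XML can be read with the DOM parser that ships with the Java platform. The document layout below (a "state" element containing "entry" elements with a "key" attribute) is an assumption made for this sketch, as is the class name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for an assumed XML state layout.
class XmlStateReader {
    static final String SAMPLE =
        "<state>"
        + "<entry key=\"theme\">dark</entry>"
        + "<entry key=\"pageSize\">25</entry>"
        + "</state>";

    // Parses the document and collects every <entry key="...">value</entry> into a map.
    static Map<String, String> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList entries = doc.getElementsByTagName("entry");
            Map<String, String> state = new LinkedHashMap<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                state.put(entry.getAttribute("key"), entry.getTextContent());
            }
            return state;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse state document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read(SAMPLE));
    }
}
```

Because the structure is explicit, a new entry can be added to the file without changing the parsing code, which is harder to guarantee with an ad hoc flat-file layout.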

Database

Maintaining persistent state information in a database provides the maximum recoverability.  Some advantages of maintaining application state in a database are:

  • The design of the tables provides structure.
  • The entire application state does not need to be rewritten when an application variable is updated.  Only the updated information needs to be rewritten.
  • Consistency can be maintained by coordinating application state recovery with recovery of the production database.
  • For high reliability situations, the database server can be clustered.

Databases can be accessed using the Java Database Connectivity (JDBC) API. JDBC can also be used to access other tabular data sources, including spreadsheets and flat files.

J2EE-Specific Mechanisms

The J2EE platform provides specific mechanisms for managing state. These are higher-level mechanisms that can be configured to use one or more of the basic mechanisms described thus far.

Servlet Context

Servlets can use the servlet context to save data applicable to multiple clients and client sessions.

Data stored in the servlet context are essentially global variables for the J2EE application.  As a result, the use of application state can have a significant impact on application design.  The software architect needs to factor in the following items during the Task: Identify Design Mechanisms in determining if the servlet context is appropriate:

  • The servlet context is maintained in a single process and is therefore not shared across multiple servers (clusters).  If this does not match the application's scalability needs, the architect should consider storing the state as session state instead.
  • Servlet context is part of the process memory, therefore it is typically not maintained when the process is terminated.
  • Multiple threads can access the global data.  Locking and synchronization of the global data may impact the scalability of the application.
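The threading concern can be sketched without the servlet API. In the hypothetical class below, a ConcurrentHashMap stands in for application-scoped attributes, and several threads update the same counter. The atomic merge() call plays the role of the locking that shared servlet context data would need.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for application-scoped data; this is not the ServletContext API.
class ApplicationScope {
    private final ConcurrentMap<String, Long> counters = new ConcurrentHashMap<>();

    // Atomic read-modify-write: safe when many request threads call it at once.
    long increment(String name) {
        return counters.merge(name, 1L, Long::sum);
    }

    long get(String name) {
        return counters.getOrDefault(name, 0L);
    }

    // Starts several threads that all update the same global counter, then returns the total.
    static long hammer(int threads, int perThread) {
        ApplicationScope scope = new ApplicationScope();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    scope.increment("hits");
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return scope.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(hammer(8, 1000));
    }
}
```

A plain get followed by a put would lose updates under the same load. The cost of making the update atomic is contention on the shared entry, which is the scalability impact noted above.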

HTTP Session Object

Servlets and JSPs can store data associated with a particular client session in the HTTP Session Object. When the application runs on multiple servers, the architect must consider how session data is made available on each of them. Some vendors provide the ability to route all requests from a client to the same server, a practice known as "server affinity".

The HTTP Session Object is available at the server during the processing of client requests, but may or may not be stored at the server between requests. The server may be configurable to use any of the basic persistence mechanisms described previously, including storing the session state in cookies on the client, or in files or a database on the server. It may also provide the ability to replicate session data in memory across servers.

The mechanism is selected by configuring the server - JSPs and servlets are coded independently of the selected mechanism, accessing the session object using an API specified by the Servlet specification.

Enterprise JavaBeans

Enterprise JavaBeans include high level mechanisms for storing state, which are based on the lower level mechanisms previously described, such as databases and files. Stateful session beans are used to store data associated with a particular client session, while entity beans are used to store longer-term data. See Guideline: Enterprise JavaBean (EJB) for discussion of state stored by EJBs.

Designing Session State

As described under Session State, web applications retain client-specific information across multiple browser requests by creating a session ID and associating state data with this session ID.

The session ID itself is stored on the client by one of two mechanisms:

  • Cookie - the client browser sends this cookie to the server on each request, allowing the server to re-establish session state.
  • URL rewriting - the URLs in pages delivered to the client by the server have the session ID encoded in them. When the user clicks on such a URL, the session ID is sent to the server, allowing the server to re-establish session state.

The server is configured to use the selected approach. Servlets and JSPs should be coded to work regardless of the method configured. Specifically, use the HttpServletResponse.encodeURL() method to encode all URLs. This method checks if URL rewriting is enabled, and if so, performs the encoding.

The data associated with a session ID can be stored in the HTTP session object, where it can be accessed by JSPs and servlets, or in session beans.

Both the session ID and associated data should be set to time-out, so that session data that has not been used in a long time does not consume resources indefinitely. The architect should select an appropriate time-out period.
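A minimal sketch of such a time-out policy follows. The class is hypothetical and keeps sessions in memory; the current time is passed in explicitly so the behavior is easy to follow. In a J2EE server the container tracks session lifetime itself, so this only illustrates the bookkeeping involved.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store with an inactivity time-out.
class SessionStore {
    private static final class Session {
        final Map<String, Object> attributes = new ConcurrentHashMap<>();
        volatile long lastAccessMillis;
    }

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SessionStore(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Creates a session and returns its ID (a random UUID is an assumption of this sketch).
    String create(long nowMillis) {
        String id = UUID.randomUUID().toString();
        Session session = new Session();
        session.lastAccessMillis = nowMillis;
        sessions.put(id, session);
        return id;
    }

    // Returns the session's attributes, or null if the ID is unknown or has timed out.
    // A successful lookup counts as activity and restarts the time-out period.
    Map<String, Object> lookup(String id, long nowMillis) {
        Session session = sessions.get(id);
        if (session == null) {
            return null;
        }
        if (nowMillis - session.lastAccessMillis > timeoutMillis) {
            sessions.remove(id);
            return null;
        }
        session.lastAccessMillis = nowMillis;
        return session.attributes;
    }

    // Periodic sweep that frees sessions nobody has touched within the time-out period.
    int evictExpired(long nowMillis) {
        int removed = 0;
        for (Iterator<Session> it = sessions.values().iterator(); it.hasNext();) {
            if (nowMillis - it.next().lastAccessMillis > timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        SessionStore store = new SessionStore(1000);
        String id = store.create(0);
        store.lookup(id, 500).put("cart", "book");
        System.out.println(store.lookup(id, 1200));
        System.out.println(store.lookup(id, 5000));
    }
}
```

A shorter time-out frees resources sooner but forces users to start over more often, which is the trade-off the architect weighs when choosing the period.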

Selecting the Right Mechanism

Architects should consider storing session state in the client for reasons of simplicity and performance. When the state is managed and stored at the client, servers do not have to expend resources to store state information or to ensure its consistency. The downside of storing the state information with the client is that the information needs to be sent to the server whenever it is needed, which adds network latency. There may also be security considerations if there is session state data that you do not wish to expose to the client. In this case, encryption may be an option.
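As one possible illustration of the encryption option, the sketch below seals a state string with AES in GCM mode from the standard javax.crypto package before it is handed to the client, and opens it again when it comes back. The class name, the token layout (IV followed by ciphertext, Base64 URL encoded), and the 128-bit key size are assumptions of this example; key storage and rotation are out of scope here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper for protecting client-held session state.
class ClientStateCipher {
    private static final int IV_BYTES = 12;   // IV length commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a server-side key; the client never sees it.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the state with a fresh random IV and returns IV + ciphertext as a text token.
    static String seal(String state, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(state.getBytes(StandardCharsets.UTF_8));
            byte[] token = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, token, 0, iv.length);
            System.arraycopy(ciphertext, 0, token, iv.length, ciphertext.length);
            return Base64.getUrlEncoder().withoutPadding().encodeToString(token);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decrypts a token produced by seal(); fails if the token was altered.
    static String open(String token, SecretKey key) {
        try {
            byte[] bytes = Base64.getUrlDecoder().decode(token);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, bytes, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(bytes, IV_BYTES, bytes.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("state token rejected", e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String token = seal("cart=book+42", key);
        System.out.println(token);
        System.out.println(open(token, key));
    }
}
```

The token is longer than the plain state, so encryption makes the size limits of cookies and URLs tighter rather than looser.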

If your application has large amounts of session state, it is generally preferable to store this state on the server, where there are generally fewer size and type limitations.

Generally, session state related to presentation concerns should be stored in the HTTP session object, while stateful session beans should contain state required for correctly implementing business logic. Duplication of state data should be avoided - instead, move any duplicated state data into the HTTP session, and pass this data into the session bean as parameters on session bean method invocations, as required.

If session data stored on the server must survive the failure of a server node, then consider using a mechanism to persist or replicate session data.

Designing Longer-Lived State

Session data is for short-lived client data that times out. There may also be a need for data that survives for much longer periods of time.

The right mechanism for such data depends on the nature of the data being stored. Cookies, flat files, XML files, and databases are all options. For database access, an entity bean is generally the best choice. See Guideline: Entity Beans for details.