Introduction
Effective management of application state is an important aspect of designing distributed applications. This guideline
provides an overview of some of the common design considerations and mechanisms for state management in a J2EE
application.
Design considerations related to state management should be addressed during the Elaboration Phase of the
project. The software architect should examine general approaches to state management as part of the activities
associated with the Analysis & Design Discipline Activity: Define a Candidate Architecture. During the Task: Architectural Analysis, the
software architect should examine the scalability and performance requirements for the application to determine what
state management techniques will need to be used to enable the application to meet the performance objectives. As
the design of the application is refined during the course of the Elaboration Phase, the architect will need to
define J2EE-specific design and implementation mechanisms for managing state information within the application in
the Task: Identify Design Mechanisms.
As described in Concept: J2EE Deployment Configurations, J2EE
applications can be composed of several logical layers distributed across one to many physical tiers (machines).
After a brief technical overview of state management, the remaining sections of this guideline will discuss the
different state management design and implementation mechanisms that can be used across the many application tiers.
Note that the software architect should document which mechanisms have been selected as part of the Artifact: Software Architecture Document,
and should provide guidelines for using these mechanisms as part of project-specific design guidelines.
Technical Overview
There is a growing trend to build distributed applications that interact with the Internet in one form or another. Even
though the underpinnings of the Internet are by nature stateless, more often than not, there is a need to manage
state for building any kind of business application. Consider an Internet application where a user clicks on a
link from page-a to page-b. The application processing the request for page-b no longer has access to the information
used to process page-a. This behavior may be acceptable for static web pages, but most business applications
require some information about the previous processing. This is where state management mechanisms provided by J2EE come
in.
Transient vs. Persistent State
Before delving into the state management guidelines it is important to differentiate between types of state
information. State information can be broadly divided into two categories: transient (only exists as long as the
application is active) and persistent (exists after the application has terminated).
Transient state information exists as long as the entity holding this information is alive. An example is state
information stored as a field in an ordinary Java class. If the container hosting this class is terminated for any
reason, the state information will be lost, unless the data has been replicated elsewhere, such as on a backup server.
Persistent state exists as long as the data store used to maintain state information exists. Persistent state
information is generally stored in a file or database, and is loaded when needed by the application. Any changes
to persistent state information must be written back to the data store. The integrity and recoverability aspects of
the persistent data store should be consistent with those of the data being accessed by the application. An example of
persistent state is information stored in a data store such as a relational database.
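The distinction can be sketched in a few lines of plain Java. This is a minimal illustration, not a recommended design: the class name VisitCounter and the use of a properties file as the data store are assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical example class; all names are illustrative.
final class VisitCounter {
    // Transient state: exists only as long as this object (and its JVM) is alive.
    private int visits;

    void recordVisit() { visits++; }

    int visits() { return visits; }

    // Persistent state: the value is written to a data store (here, a properties file).
    void save(Path file) {
        Properties props = new Properties();
        props.setProperty("visits", Integer.toString(visits));
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "visit counter state");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads the persisted value into a fresh object, as would happen after a restart.
    static VisitCounter load(Path file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        VisitCounter counter = new VisitCounter();
        counter.visits = Integer.parseInt(props.getProperty("visits", "0"));
        return counter;
    }
}
```

A new VisitCounter always starts at zero, while VisitCounter.load(file) restores whatever value was last saved.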
Session State
Web clients often require the ability to make multiple browser requests, navigating from page to page, while retaining
client-specific information, such as items in a shopping cart. Web applications handle this by creating a session ID,
and associating state data with this session ID. The session ID and associated state are referred to as session state.
Session state is data associated with a particular client's interaction with a web application over a short period of
time (minutes or hours, rather than days). Thus, session state is short-lived data that is commonly deleted after some
time-out period, in order to avoid consuming resources.
Session state can be stored at the client or at the server, as described in later sections. The J2EE platform provides
mechanisms specifically tailored to managing session state, because of its importance in web-based applications.
Basic Persistence Mechanisms
The following are common mechanisms used by web applications to store state.
Cookies
Cookies are small pieces of text data that a server stores on web-based clients. Subsequent client
requests send the cookie to the server, giving the server access to the state data stored in the cookie.
Some issues with cookies are:
-
Many users believe that cookies compromise security and/or privacy and, therefore, they disable cookies.
-
There are limitations on the size of cookie headers, and so this limits how much data can be stored.
-
Some protocols, such as Wireless Application Protocol (WAP), do not support cookies.
-
If a client logs in from another location (such as another machine) cookies stored in the other location are not
available.
-
State data must be representable by string values.
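The string-only and size constraints listed above can be illustrated with a small sketch that packs a list of cart items into a cookie value and reads it back. The class CartCookie, the cookie name "cart", the "|" separator, and the length budget are assumptions for this example; in a servlet, the cookie itself would normally be created and read through the Servlet API rather than by building header strings by hand.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and limits are illustrative.
final class CartCookie {
    static final String NAME = "cart";
    // Assumed budget: browsers commonly limit a single cookie to roughly 4 KB.
    static final int MAX_VALUE_LENGTH = 4000;

    // Builds a Set-Cookie header value; each item is URL-encoded so that it is a safe string.
    static String toSetCookieHeader(List<String> items, int maxAgeSeconds) {
        StringBuilder value = new StringBuilder();
        for (String item : items) {
            if (value.length() > 0) {
                value.append('|'); // safe separator: URLEncoder escapes '|' inside items
            }
            value.append(encode(item));
        }
        if (value.length() > MAX_VALUE_LENGTH) {
            throw new IllegalArgumentException("cart is too large to store in a cookie");
        }
        return NAME + "=" + value + "; Max-Age=" + maxAgeSeconds + "; Path=/";
    }

    // Extracts the cart items from a Cookie request header value such as "a=1; cart=...".
    static List<String> fromCookieHeader(String cookieHeader) {
        List<String> items = new ArrayList<>();
        for (String pair : cookieHeader.split(";")) {
            String trimmed = pair.trim();
            if (!trimmed.startsWith(NAME + "=")) {
                continue;
            }
            String value = trimmed.substring(NAME.length() + 1);
            if (value.isEmpty()) {
                continue;
            }
            for (String part : value.split("\\|")) {
                items.add(decode(part));
            }
        }
        return items;
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```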
URL Rewriting
URL rewriting is a mechanism for embedding session state into the URLs referenced in each page. When a web server
generates pages to be delivered to a client, it encodes the session state into the URLs of the page. Then when the user
clicks on a URL, the state data stored in the URL is sent back to the server, allowing it to re-establish the session
context. A similar mechanism uses HTML hidden fields. Issues with these mechanisms are:
-
All pages in a given session must be handled by the server, otherwise the server may lose track of the session.
-
State does not survive when the client shuts down her browser, or links to a specific URL by typing or using a
bookmark.
-
As with cookies, the state data is not available when the client logs in from another location.
-
As with cookies, the state data must be representable by string values.
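As a rough sketch of the idea, state can be appended to a URL as encoded query parameters when the page is generated and recovered when the URL comes back in a request. The class StateUrls and its method names are hypothetical; the example also shows why the state must be representable as strings.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative.
final class StateUrls {
    // Appends each state entry to the URL as an encoded query parameter.
    static String withState(String url, Map<String, String> state) {
        StringBuilder out = new StringBuilder(url);
        char separator = url.indexOf('?') >= 0 ? '&' : '?';
        for (Map.Entry<String, String> entry : state.entrySet()) {
            out.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return out.toString();
    }

    // Recovers the state from the query string of a requested URL.
    static Map<String, String> readState(String url) {
        Map<String, String> state = new LinkedHashMap<>();
        int queryStart = url.indexOf('?');
        if (queryStart < 0) {
            return state;
        }
        for (String pair : url.substring(queryStart + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                state.put(decode(pair.substring(0, eq)), decode(pair.substring(eq + 1)));
            }
        }
        return state;
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```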
Flat File
A flat file is one of the simplest methods of maintaining persistent state information. Upon initialization, the
flat file is read to establish the initial state values. Each time the state is changed, the file must be
rewritten to save the state. Some disadvantages of maintaining application-state in a flat file are:
-
The scalability of the application is adversely impacted, since the application must lock the application object to
prevent access to the global data while the application-state variables are being updated and rewritten to the flat
file.
-
In most cases, updating data will require rewriting the entire file.
-
Flat files do not always provide recoverability in the event of an error.
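The first two disadvantages are visible even in a very small implementation. The sketch below, with the hypothetical class name FlatFileState and an assumed "key=value" line format, takes a lock on every access and rewrites the whole file on every update. It makes no attempt at recovery if a write fails part-way through.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example; keys and values must not contain line breaks, and keys must not contain '='.
final class FlatFileState {
    private final Path file;
    private final Map<String, String> values = new TreeMap<>();

    // Reads the file once at initialization to establish the initial state values.
    FlatFileState(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try {
                for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                    int eq = line.indexOf('=');
                    if (eq > 0) {
                        values.put(line.substring(0, eq), line.substring(eq + 1));
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // All access is serialized on this object, which limits scalability.
    synchronized String get(String key) {
        return values.get(key);
    }

    // Every change rewrites the entire file while holding the lock.
    synchronized void put(String key, String value) {
        values.put(key, value);
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : values.entrySet()) {
            lines.add(entry.getKey() + "=" + entry.getValue());
        }
        try {
            Files.write(file, lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```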
XML
Maintaining persistent state information in an XML file is a step up from a flat file. Some advantages of
maintaining application state in an XML file as opposed to a flat file are:
-
An XML file provides structure that is not present in a flat file.
-
An XML file can be parsed using standard APIs.
-
An XML file is generally more portable.
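For instance, state held in XML can be read with the DOM parser that ships with the Java platform. In this sketch the document layout (a "state" root element with "entry" children carrying a "key" attribute) and the class name XmlState are assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical example; the XML layout is an assumption.
final class XmlState {
    // Parses <state><entry key="...">value</entry>...</state> into a map.
    static Map<String, String> parse(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList entries = document.getElementsByTagName("entry");
            Map<String, String> state = new LinkedHashMap<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                state.put(entry.getAttribute("key"), entry.getTextContent());
            }
            return state;
        } catch (Exception e) {
            // Parser configuration, syntax, and I/O errors are all reported the same way here.
            throw new IllegalStateException("could not parse state document", e);
        }
    }
}
```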
Database
Maintaining persistent state information in a database provides the maximum recoverability. Some advantages of
maintaining application state in a database are:
-
The design of the tables provides structure.
-
The entire application state does not need to be rewritten when an application variable is updated. Only the
updated information needs to be rewritten.
-
Consistency can be maintained by coordinating application state recovery with recovery of the production database.
-
For high reliability situations, the database server can be clustered.
Databases can be accessed using the Java Database Connectivity (JDBC) API. JDBC can also be used for accessing other
tabular data sources, including spreadsheets and flat files.
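The point that only the updated information needs to be rewritten can be sketched with a single JDBC update. The table name app_state, its columns, and the class JdbcStateStore are hypothetical, and the caller is assumed to supply an open Connection obtained from whatever driver or data source the project uses.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical example; the table and column names are assumptions.
final class JdbcStateStore {
    static final String UPDATE_SQL =
            "UPDATE app_state SET state_value = ? WHERE state_key = ?";

    // Updates one state variable; other rows are left untouched.
    // Returns the number of rows changed.
    static int update(Connection connection, String key, String value) {
        try (PreparedStatement statement = connection.prepareStatement(UPDATE_SQL)) {
            statement.setString(1, value);
            statement.setString(2, key);
            return statement.executeUpdate();
        } catch (SQLException e) {
            throw new IllegalStateException("could not update state for key " + key, e);
        }
    }
}
```

Transaction handling and connection pooling are deliberately left out of this sketch.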
J2EE-Specific Mechanisms
The J2EE platform provides specific mechanisms for managing state. These are higher-level mechanisms that can be
configured to use one or more of the basic mechanisms described thus far.
Servlet Context
Servlets can use the servlet context to save data applicable to multiple clients and client sessions.
Data stored in the servlet context are essentially global variables for the J2EE application. As a result, the
use of application state can have a significant impact on application design. The software architect needs to
factor in the following items during the Task: Identify Design Mechanisms in
determining if the servlet context is appropriate:
-
Servlet context is maintained in a single process, and is thus not shared across multiple servers (clusters). If
this does not match the application's scalability needs, the architect needs to consider storing the state as
session state.
-
Servlet context is part of the process memory, therefore it is typically not maintained when the process is
terminated.
-
Multiple threads can access the global data. Locking and synchronization of the global data may impact the
scalability of the application.
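The threading concern can be illustrated without a servlet container. The class ApplicationScope below is a hypothetical stand-in for the attribute map of a servlet context (which servlets would reach through the context's getAttribute and setAttribute methods); it uses concurrent collections so that a shared counter can be updated from many threads without a global lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for application-scoped attributes; not part of the Servlet API.
final class ApplicationScope {
    private final ConcurrentMap<String, Object> attributes = new ConcurrentHashMap<>();

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    // Atomically increments a shared counter stored under the given name.
    // Assumes the name is used only for counters; another value type would cause a ClassCastException.
    long incrementCounter(String name) {
        AtomicLong counter = (AtomicLong) attributes.computeIfAbsent(name, k -> new AtomicLong());
        return counter.incrementAndGet();
    }
}
```

Like a real servlet context, this data lives in one process only: it is lost when the process ends and is not visible to other servers in a cluster.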
HTTP Session Object
Servlets and JSPs can store data associated with a particular client session in the HTTP Session Object. When data is
stored in session objects, there may be issues around how session data is made available across multiple servers.
Some vendors provide the ability to route client requests to the same server, a practice known as "server affinity".
The HTTP Session Object is available at the server during the processing of client requests, but may or may not be
stored at the server between requests. The server may be configurable to use any of the basic persistence mechanisms
described previously, including storing the session state in cookies on the client, or in files or a database on the
server. It could also provide the ability to replicate session data in memory across servers.
The mechanism is selected by configuring the server - JSPs and servlets are coded independently of the selected
mechanism, accessing the session object using an API specified by the Servlet specification.
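One practical consequence is that anything placed in the session should typically be serializable, so that the server can write it to a file or database or copy it to another node. The sketch below shows the kind of round trip a server might perform, using standard Java serialization; the class SessionSnapshot is hypothetical, and real servers use their own, vendor-specific mechanisms.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Hypothetical illustration of persisting or replicating session attributes.
final class SessionSnapshot {
    // Turns the session attributes into bytes that could be stored or sent to another server.
    static byte[] toBytes(HashMap<String, Serializable> attributes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(attributes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores the attributes; only data written by a trusted server should be deserialized.
    @SuppressWarnings("unchecked")
    static HashMap<String, Serializable> fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (HashMap<String, Serializable>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```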
Enterprise JavaBeans
Enterprise JavaBeans include high-level mechanisms for storing state, which are based on the lower-level mechanisms
previously described, such as databases and files. Stateful session beans are used to store data associated with a
particular client session, while entity beans are used to store longer-term data. See Guideline: Enterprise JavaBean (EJB) for discussion of
state stored by EJBs.
Designing Session State
As described under Session State, web applications retain client-specific information across requests by creating a
session ID and associating state data with this session ID.
The session ID itself is stored on the client by one of two mechanisms:
-
Cookie - the client browser sends this cookie to the server on each request, allowing the server to
re-establish session state.
-
URL rewriting - the URLs in pages delivered to the client by the server have the session ID encoded. When the user
clicks on such a URL, the session ID is sent to the server, allowing the server to re-establish session state.
The server is configured to use the selected approach. Servlets and JSPs should be coded to work regardless of the
method configured. Specifically, use the HttpServletResponse.encodeURL() method to encode all URLs. This method checks
if URL rewriting is enabled, and if so, performs the encoding.
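The behavior described for encodeURL() can be approximated as follows. This is a stand-alone sketch and not the container's implementation: the class SessionUrlEncoder and its constructor arguments are hypothetical, and it assumes the conventional ";jsessionid=" path parameter placed before any query string or fragment.

```java
// Hypothetical stand-in that mimics the described behavior of HttpServletResponse.encodeURL().
final class SessionUrlEncoder {
    private final boolean rewritingEnabled;
    private final String sessionId;

    SessionUrlEncoder(boolean rewritingEnabled, String sessionId) {
        this.rewritingEnabled = rewritingEnabled;
        this.sessionId = sessionId;
    }

    // Returns the URL unchanged when rewriting is off; otherwise embeds the session ID.
    String encodeURL(String url) {
        if (!rewritingEnabled) {
            return url;
        }
        // The session ID goes at the end of the path, before '?' or '#'.
        int cut = url.length();
        int query = url.indexOf('?');
        int fragment = url.indexOf('#');
        if (query >= 0) {
            cut = Math.min(cut, query);
        }
        if (fragment >= 0) {
            cut = Math.min(cut, fragment);
        }
        return url.substring(0, cut) + ";jsessionid=" + sessionId + url.substring(cut);
    }
}
```

Because the calling code is the same in both cases, pages that route every link through such a method keep working whether the server is configured for cookies or for URL rewriting.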
The data associated with a session ID can be stored in the HTTP session object, where it can be accessed by JSPs and
servlets, or in session beans.
Both the session ID and associated data should be set to time out, so that session data that has not been used in a
long time does not consume resources indefinitely. The architect should select an appropriate time-out period.
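In a J2EE server the time-out is normally a matter of configuration, but the underlying idea can be sketched in plain Java. The class ExpiringSessions below is hypothetical; it tracks the last access time of each session and discards sessions that have been idle for longer than the configured period. The current time is passed in explicitly to keep the example easy to test.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical in-memory session registry with an idle time-out.
final class ExpiringSessions {
    private static final class Entry {
        final Map<String, Object> data = new HashMap<>();
        long lastAccessMillis;
    }

    private final Map<String, Entry> sessions = new HashMap<>();
    private final long timeoutMillis;

    ExpiringSessions(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Returns the data for the session, starting a fresh session if none exists or the old one timed out.
    // The returned map is not itself thread-safe.
    synchronized Map<String, Object> getOrCreate(String sessionId, long nowMillis) {
        Entry entry = sessions.get(sessionId);
        if (entry == null || nowMillis - entry.lastAccessMillis > timeoutMillis) {
            entry = new Entry();
            sessions.put(sessionId, entry);
        }
        entry.lastAccessMillis = nowMillis;
        return entry.data;
    }

    // Removes all sessions idle for longer than the time-out and returns how many were removed.
    synchronized int purgeExpired(long nowMillis) {
        int removed = 0;
        for (Iterator<Entry> it = sessions.values().iterator(); it.hasNext();) {
            if (nowMillis - it.next().lastAccessMillis > timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    synchronized int size() {
        return sessions.size();
    }
}
```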
Selecting the Right Mechanism
Architects should consider storing session state in the client for reasons of simplicity and performance. When the
state is managed and stored at the client, servers do not have to expend resources to store state information or to
ensure its consistency. The downside of storing the state information with the client is that the information needs to
be sent to the server whenever it is needed, which adds network latency. There may also be security
considerations, if there is session state data that you do not wish to be exposed to the client. In this case,
encryption may be an option.
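As one possible approach, state can be encrypted on the server before it is placed in a cookie or URL, and decrypted when it returns. The sketch below uses AES in GCM mode from the standard javax.crypto package, which also detects tampering; the class ClientStateCipher is hypothetical, and key storage and rotation are outside the scope of the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper; the key must stay on the server.
final class ClientStateCipher {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    private final SecretKey key;
    private final SecureRandom random = new SecureRandom();

    ClientStateCipher(SecretKey key) {
        this.key = key;
    }

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the state and returns a URL-safe token of the form base64(iv + ciphertext).
    String seal(String state) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv); // a fresh IV for every token
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(state.getBytes(StandardCharsets.UTF_8));
            byte[] token = ByteBuffer.allocate(iv.length + ciphertext.length).put(iv).put(ciphertext).array();
            return Base64.getUrlEncoder().withoutPadding().encodeToString(token);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decrypts a token; fails with an exception if the token was modified or is malformed.
    String open(String token) {
        try {
            byte[] bytes = Base64.getUrlDecoder().decode(token);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, bytes, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(bytes, IV_BYTES, bytes.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("state token rejected", e);
        }
    }
}
```

Encryption increases the size of the data sent with each request, which should be weighed against the cookie and URL size limits noted earlier.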
If your application has large amounts of session state, it is generally preferable to store this state on the server,
where there are generally fewer size and type limitations.
Generally, session state related to presentation concerns should be stored in the HTTP session object, while stateful
session beans should contain state required for correctly implementing business logic. Duplication of state data should
be avoided - instead, move any duplicated state data into the HTTP session, and pass this data into the session bean as
parameters on session bean method invocations, as required.
If session data stored on the server must survive the failure of a server node, then consider using a mechanism to
persist or replicate session data.
Designing Longer-Lived State
Session state holds short-lived client data that times out. There may also be a need for data that survives for much
longer periods of time.
The right mechanism for such data depends on the nature of the data being stored. Cookies, flat files, XML files, and
databases are all options. For database access, an entity bean is generally the best choice. See Guideline: Entity Beans for details.