New York
University
Computer
Science Department
Courant
Institute of Mathematical Sciences
Session 13:
Persistence in EJB Frameworks
Course Title: Extreme Java Course
Number: g22.3033-007
Instructor: Jean-Claude Franchitti Session: 13
Persistence in Enterprise JavaBeans is encapsulated
in the notion of EntityBeans. This handout describes bean- and
container-managed EntityBean persistence and the relative merit of these
techniques with respect to portability, productivity and performance.
Sun Microsystems produced the Enterprise JavaBeans
specification (version 1.0) in March 1998. According to its cover page, the
goal of the specification was to provide a "component architecture for the
development and deployment of object-oriented distributed, enterprise-level
applications." Two important conceptual components were described by this
architecture: "EnterpriseBeans" and their "container."
EnterpriseBeans are of two varieties: SessionBeans and EntityBeans.
SessionBeans don't have persistent state; EntityBeans do. In other words, you
can't expect the state of SessionBeans to outlive their process. EntityBeans on
the other hand have some or all of their state persist between incarnations.
EJB Architecture Overview
To discuss EntityBeans and their persistence, we first have to outline the components of the EJB architecture that will be important to our discussion. This section won't attempt to describe all the components, only the relevant ones. From a bean implementor's point of view, the first decision is whether the bean is going to be a SessionBean or an Entity-Bean. Both can have business logic but, generally speaking, SessionBeans are unshared servants to a client and have no persistent state. EntityBeans are shared among clients and do have persistent state.
For every bean there is a set of other interfaces
that must be in place for the bean to be accessible to clients. Each bean must
have a mandatory remote object interface as well as a Home interface (think
factory). The EJBObject-derived interface is a remote interface that typically
delegates calls to the bean; the EJBHome-derived interface contains bean
lifecycle and management functionality. The home interface is also remote.
You're probably asking yourself, Where does this so-called "container" fit in? Well, from a bean-provider perspective, everything that facilitates the life cycle of the bean and the dispatch of a remote client method invocation to your bean is "the container." For the purposes of this handout, this includes the EJBObject- and EJBHome-derived remote interfaces (which are application- dependent) as well as services like the implementation of JTS and JNDI (which are application-independent) that your bean can depend on. Some containers may also support container-managed persistence. This is an API that an EntityBean developer can use to delegate the persis-tence of the bean to instead of writing all of the persistence himself (by using JDBC or JSQL).
Figure 1 shows the relationship of an EntityBean (with container-managed persistence), its container and the client. In this figure the EmployeeBean is the implementation of all the logic for the bean. Employee is just a remote object interface declaration and the same is true for EmployeeHome. These three elements along with a DeploymentDescriptor (described later) are supplied by the bean provider and are application-dependent. Tools supplied by the container provider will typically generate the implementations of the remote interfaces.
These application-dependent components wrap the bean
for the container. We say they "wrap the bean for the container"
because they're mandatory and you can access the bean only through them.
Application-independent components are also part of
the container, of course. Figure 1 shows that developers can take advantage of
the application-independent APIs, that is, JTS, JNDI and the container-managed
persistence (CMP) API. Notice that in this figure, which depicts container-
managed persistence, the container can add connection pooling and transactional
caching as well as dynamic schema management. These capabilities provide a lot
of transparent functionality to the bean with container-managed persistence.
Figure 2 is very similar except that the bean provides its own persistence. Most of the rest of the container is intact, but the ability to provide caching, dynamic schema management and connection pooling transparently is lost. So in this figure the bean is fatter because to write these capabilities on top of JDBC would be a tremendous burden on productivity. Plus your bean's code is just plain fatter.
Since this handout is mainly about Entity-Beans and
how their persistence is managed, we're going to skip the discussion of
SessionBeans except to reiterate that they're unshared and typically don't have
persistent state. They may, however, have transactional, conversational state.
SessionBeans will typically contain most of your session-based (unshared
between clients) business logic. The good news is that they will typically make
use of other EnterpriseBeans through remote-object interfaces. This leads to a
highly scalable federated archictecture.
You're probably asking yourself, Where does this
so-called "container" fit in? Well, from a bean-provider perspective,
everything that facilitates the life cycle of the bean and the dispatch of a
remote client method invocation to your bean is "the container." For
the purposes of this handout, this includes the EJBObject- and EJBHome-derived
remote interfaces (which are application- dependent) as well as services like
the implementation of JTS and JNDI (which are application-independent) that
your bean can depend on. Some containers may also support container-managed
persistence. This is an API that an EntityBean developer can use to delegate
the persistence of the bean to instead of writing all of the persistence
himself (by using JDBC or JSQL).
The Development/Deployment Process
As you can see, there is a substantial amount of code needed to make an EJB application server of any complexity. You may wonder whether you need to write all that code. Well, the development process that you will follow will be highly dependent on the tools supplied by your EJB application server and development tool vendor (which may be the same). A large part of the code required is the glue that plugs your bean into the container of your EJB app server vendor. The fine folks who wrote the EJB specification provided a nifty item called the DeploymentDescriptor. This serialized class is provided for each bean and contains enough meta data for tools to generate the framework to plug the bean into an EJB container. This would include the EJBHome, derived class, the remote object interface and other hooks to manage the bean.
Contained in a bean's DeploymentDescriptor is a
ControlDescriptor for each method. The descriptors contain meta information
about the bean and its methods with respect to their behavior in several
dimensions (e.g., transactional, security).
With the state of EJB implementations today, there
are basically three approaches to EJB development: (1) hand-code everything,
(2) hand-code the bean and the deployment descriptor and (3) generate most of
the bean and the deployment descriptor from a graphical environment. Let's look
at these in a little more detail.
This means that the EJB app server doesn't support any autogeneration tools but
does provide the API for its container. In this case you'll be expected to
write implementations for the beans, their remote object interfaces, their home
interfaces and any other hooks required by the vendor's API.
Hand-Code the Bean and Descriptor
EJB was written with this in mind. By producing a EJB-jar file containing compiled code for beans, their remote object and home interface declarations, and their associated serialized DeploymentDescriptors, a tool could be used to generate the rest of the infrastructure. In this manner the container's API doesn't necessarily need to be exposed to the programmer. Rather, the code generator takes care of this mapping. Figure 3 illustrates this approach.
Graphical Bean and Descriptor Generator
For ease of use, an EJB-jar file generation from a
graphical environment can be employed. In this development environment you
simply create an object model of the beans and their relationships. The
graphical environment takes care of generating the .jar file containing the
beans and their descriptors. Tools like PowerTier for EJB from Persistence
Software, Inc., take this graphical approach and not only generate the
container framework but also an RDBMS-independent object-relational mapping for
EntityBeans. Figure 4 illustrates this approach.
The bottom line with respect to generation is that
once you have the EJB-jar file container-specific tools can then generate the
framework for adapting beans to their container.
EntityBeans: The Persistence in EJB
Given an understanding of how Entity-Beans fit into the EJB architecture, the choice that a bean developer has to make is between bean-managed and container-managed persistence. How do you choose between them? Well, let's look at the relative merits of the approaches with respect to productivity, performance and portability. In a sense, the choice for developers is a choice between fat beans, where the bean is smart about data management, and fat servers, where the EJB container is smart about data management.
Container-Managed Persistence
For container-managed persistence the EJB container automatically implements the object-relational mapping services for the bean. The EJB container uses additional meta information in the deployment descriptor to determine how to implement the object-relational mapping for the bean. This makes the bean itself "thin" because the bean only needs to contain the custom business logic added by the developer. Figure 1 shows this notion of "thin" beans and a "fat" container.
The main advantages of container-managed persistence
are performance, runtime/ development-time flexibility and productivity.
EJB containers with container-managed persistence
can provide data management services that are unavailable with bean-managed
persistence, including:
·
Object-relational
mapping: automates the task of mapping complex beans - including inheritance,
aggregation and association - to relational tables.
·
Shared
transactional caching: many clients can share access to beans in the EJB
container with the same transactional integrity provided by a database.
·
Object
state management: container-managed persistence can keep track of which objects
have been changed within a transaction. In bean-managed persistence it's up to
the developer to manage this, usually by setting a dirty flag in each object.
·
Object
management optimizations: the container can support joint database queries in
the database that return networks of heterogeneous beans and enable navigation
between bean classes in the container cache.
·
Data
management optimizations: the container can defer write operations in a
transaction until commit, then batch multiple database operations into one
call.
·
Connection
management: the container can transparently manage database connections and
implement native database operations for optimal performance.
·
Container-based
agents: the container can monitor events within the server and notify clients
when specific events occur.
To develop a class with container-managed persistence, the developer performs the following steps (this example describes the Persistence Software, Inc., PowerTier for EJB container):
1.
Define
the object model and object-relational mapping using a CASE tool.
2.
Use
the EJB container tools to load the modeling information from the CASE tool and
generate beans whose deployment descriptors and implementations allow them to
use container-managed persistence.
3.
Add
custom code to the bean class (code insertion points protect the custom code
when the bean is regenerated).
The generated beans implement all required EJB methods but add convenient query methods and methods for complex relationship management. The container provides transparent connection management, exception handling, integrity and transactions.
In addition to enabling rapid application
development, container-managed persistence also makes enterprise bean classes
independent from the database schema. Thus beans that use container-managed
persistence are more flexible and are portable across data schemas. Changes to
the object model or database schema are handled by the container's
object-relational mapping, providing a high degree of flexibility and support
for iterative development. Finally, containers whose implementation of
Container Managed Persistence includes shared and transactional caching can
provide extreme performance and scalability advantages because they can manage
concurrent transactional access to the bean from multiple clients.
The main disadvantage is that they aren't as
portable as bean-managed persistence beans with respect to the current EJB
specification. This is mainly because EJB is underspecified in the area of a portable
container API for container-managed persistence.
Bean-Managed Persistence
For bean-managed persistence the developer writes database access calls using JDBC and SQL directly in the methods of the bean. This makes the bean itself "fat," requiring hundreds of lines of developer code to implement each bean. On the other hand, this makes the EJB container relatively "thin," giving it a small footprint and high portability. Figure 2 showed this notion of the "fat" bean with a "thin" container.
The main advantage of bean-managed persistence is
its portability. A bean managing its own persistence using JDBC or JSQL is very
portable, especially with respect to the current state of the EJB
specification.
The main disadvantages of the bean-managed persistence
approach is that writing the object-relational mapping by hand can be time
consuming and requires extensive knowledge of SQL. For example, the developer
must hand-code database access calls that implement the object-relational
mapping in the enterprise bean callback methods (ejbCreate(), ejbLoad(),
ejb-Store(), etc.).
Another severe limitation of this approach is that
it hardwires the mapping to a particular database schema into the guts of the
bean. This means that any change to the object model or the underlying database
schema can require extensive changes to the code for a particular bean,
limiting the flexibility and reusability of beans built using bean-managed
persistence.
To develop a class with bean-managed persistence,
the developer must perform the following steps:
1.
Define
the object-relational mapping for the bean.
2.
Implement
required EJB class methods. Using the JDBC API and SQL, write methods to
perform ejbCreate(), ejbRemove(), ejbLoad(),ejbFind<>() and ejbStore().
This requires a detailed knowledge of SQL and the underlying database schema.
3.
Implement
accessor and mutator methods. For each attribute or relationship write methods
to manipulate the bean value. Relationship access code to manage many-to-one
relations and delete constraints can be very complex.
4.
Manage
database connections. Each method must manage its own database connections by
interacting with the database connection pool.
5.
Handle
exceptions. Each method must handle both Java and database exceptions
appropriately.
6.
Manage
bean integrity. Any method that changes the state of the class must set a flag
to mark the bean as modified so the changes will be sent to the database.
7.
Ensure
transaction integrity. Collectively, all the objects in any transaction must be
able to "roll back" their state if a transaction fails.
8.
Add
custom code to the bean class.
Even if the SQL mapping were generated for you, the
gains in performance and flexibility would not be as great as container-managed
persistence unless you wrote all of the outlined features yourself…and then
you've thrown out productivity.
Finally, because you typically won't find shared and
transactional cache management in Bean Managed Persistence, it will have
extremely poor performance and scalability since multiple clients will block
each other for the entire scope of their transactions. This can be costly even
in the case of implicit transactions.
Conclusions
While bean-managed persistence provides the most
portability, it will generally fall short with respect to productivity.
It should be clear at this point that container- managed persistence in EntityBeans provides the most productivity. This method will also typically provide the highest degree of performance since, together, the code generation tools and the bean programmer can take advantage of container- specific mapping and caching technology. However, container-managed persistence currently falls short in the area of portability.
Below are some guidelines for making decisions with
respect to which kind of persistence management to use.
When
to Use Container-Managed Persistence
Developers who will benefit most from container-managed persistence are those
requiring rapid application development, high reliability and scalable
performance in their deployed application.
·
Rapid
application development: productivity is the number one reason for choosing
container-managed persistence. For simple beans, bean-managed persistence can
require 25 lines of hand-written code for every line of hand-written code required
by container-managed persistence. For more complex beans the ratio is even
worse.
·
Reliability:
the markedly smaller amount of code the developer has to write using
container-managed persistence guarantees fewer defects and higher reliability.
·
Performance:
container-managed persistence allows use of shared transactional caching,
deferred database operations and native database implementation to optimize
performance.
The kinds of applications that benefit most from
container-managed persistence include:
·
Data
intensive applications: these applications tend to have larger and more complex
object-relational mappings.
·
OLTP
applications: these often require good database performance, robust scalability
and rock-solid data integrity.
·
Application
integration: which must map information from several different data sources.
When to Use Bean-Managed
Persistence
The primary reason for using bean-managed
persistence today would be to improve portability for beans across servers from
multiple vendors. One known omission in the Enterprise JavaBean 1.0
specification is that it did not define a standard API for EJB containers to
ensure portability among containers from different vendors.
This omission was corrected in the Enterprise JavaBean 2.0 specification, as JavaSoft made clear that full support for entity beans is mandatory in EJB 2.0. Even with bean-managed persistence, however, the number of differences between EJB 1.x servers in areas such as security and transaction management are likely to make full portability difficult in the next 12 to 18 months.
Future Directions in Portability
and Scalability
Container-managed persistence will be made portable through standard deployment descriptor formats that can help define the schema as well as standard meta-data- driven internal container APIs for transparent, portable, container-managed persistence. This can be done in a fashion similar to the OMG's Persistent State Service specification.
Speaking of the OMG, CORBA integration is essential for any EJB app server to be scalable to the enterprise. IIOP is the perfect unifying protocol to support EJB applications and is necessary to overcome some of the limitations of JRMP, the most serious of which are bugs in distributed garbage collection (IIOP would do away with DGC) and service context propagation (without this, it is difficult to implement JTS and other services).