IBM WebSphere Metadata
This article explains how WebSphere uses metadata to map
container-managed persistence (CMP) Enterprise JavaBeans to database tables.
Introduction
The J2EE 1.2 and EJB 1.1 specifications were a big step forward for enterprise Java developers.
They introduced a concept that enterprise applications had been missing for some time:
the metadata of a J2EE application could be read and written in a simple, easy-to-understand format that is essentially plain text.
IBM has gotten behind this idea in a big way in WebSphere Application Server, Version 4.0 (WebSphere 4.0).
This has some ramifications for developers working with WebSphere 4.0 and WebSphere Studio Application Developer
(Application Developer), as we will see in this article.
What is metadata?
Metadata literally means "data about data".
In a J2EE application, it consists of the parts of the application that aren't code, but that describe the code and how it fits together with other code.
Metadata is information about a resource, such as an EJB or servlet, and about
how it can be used by other J2EE resources.
An example of metadata is the EJB 1.1 Deployment Descriptor, which is described
in [EJB 1.1].
Let's say you're building a simple EJB .jar file for deployment to
WebSphere 4.0.
The .jar file contains a single container-managed persistence
(CMP) entity bean that represents a person.
The following deployment descriptor (named ejb-jar.xml) is
contained in the META-INF directory of our EJB .jar file, and
describes our Person EJB:
<!DOCTYPE ejb-jar PUBLIC "-//Sun Microsystems, Inc.//DTD Enterprise JavaBeans 1.1//EN"
"http://java.sun.com/j2ee/dtds/ejb-jar_1_1.dtd">
<ejb-jar>
<enterprise-beans>
<entity>
<ejb-name>PersonEJB</ejb-name>
<home>com.ibm.demo.ejbs.PersonHome</home>
<remote>com.ibm.demo.ejbs.Person</remote>
<ejb-class>com.ibm.demo.ejbs.PersonBean</ejb-class>
<persistence-type>Container</persistence-type>
<prim-key-class>java.lang.Integer</prim-key-class>
<reentrant>False</reentrant>
<cmp-field><field-name>id</field-name></cmp-field>
<cmp-field><field-name>name</field-name></cmp-field>
<cmp-field><field-name>age</field-name></cmp-field>
<cmp-field><field-name>educationLevel</field-name></cmp-field>
<primkey-field>id</primkey-field>
</entity>
</enterprise-beans>
<assembly-descriptor>
<security-role>
<description>
Everyone can gain access to this EJB.
</description>
<role-name>everyone</role-name>
</security-role>
<method-permission>
<role-name>everyone</role-name>
<method>
<ejb-name>PersonEJB</ejb-name>
<method-name>*</method-name>
</method>
</method-permission>
<container-transaction>
<method>
<ejb-name>PersonEJB</ejb-name>
<method-name>*</method-name>
</method>
<trans-attribute>Required</trans-attribute>
</container-transaction>
</assembly-descriptor>
</ejb-jar>
This simple deployment descriptor defines the parts of
this EJB, such as the home interface, remote interface, bean class, and
CMP fields (the fields in the bean class that will be container-managed).
In other words, these fields will be stored in and retrieved from a relational database
by code generated during deployment. Finally, the deployment descriptor
contains other information, such as the container transaction settings
and the EJB security roles defined for this bean.
WebSphere uses this information in the following ways:
- To determine how to handle transactions (whether to start a new transaction for
each method, or to "flow" existing transactions through each EJB method).
- To let the WebSphere security system determine whether a user (who is
mapped by WebSphere to one or more J2EE roles) can access a particular EJB
method.
- To determine how to generate the CMP persistence code that will actually do
the work of storing and retrieving information from a relational database.
This is exactly like a million other examples of deployment descriptors
that you can find in other books and articles, and I doubt that most of you
have learned anything new. (If you have, you may want to review the EJB 1.1
specification before moving on.)
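If you'd like to check your reading of the descriptor, the short Python sketch below pulls out the pieces just discussed: the CMP fields, the primary key field, and the transaction attribute. It is only an illustration of how a tool might read this metadata, not how WebSphere itself does it. The function name summarize_descriptor is hypothetical, the embedded XML is a trimmed copy of the PersonEJB descriptor, and the sketch assumes one transaction attribute per bean.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the PersonEJB descriptor shown above (DOCTYPE omitted).
EJB_JAR_XML = """
<ejb-jar>
  <enterprise-beans>
    <entity>
      <ejb-name>PersonEJB</ejb-name>
      <ejb-class>com.ibm.demo.ejbs.PersonBean</ejb-class>
      <persistence-type>Container</persistence-type>
      <prim-key-class>java.lang.Integer</prim-key-class>
      <cmp-field><field-name>id</field-name></cmp-field>
      <cmp-field><field-name>name</field-name></cmp-field>
      <cmp-field><field-name>age</field-name></cmp-field>
      <cmp-field><field-name>educationLevel</field-name></cmp-field>
      <primkey-field>id</primkey-field>
    </entity>
  </enterprise-beans>
  <assembly-descriptor>
    <container-transaction>
      <method>
        <ejb-name>PersonEJB</ejb-name>
        <method-name>*</method-name>
      </method>
      <trans-attribute>Required</trans-attribute>
    </container-transaction>
  </assembly-descriptor>
</ejb-jar>
"""


def summarize_descriptor(xml_text):
    """Return one summary dict per CMP entity bean (hypothetical helper)."""
    root = ET.fromstring(xml_text)

    # Simplification: remember a single transaction attribute per bean name
    # and ignore per-method differences.
    tx_by_bean = {}
    for ct in root.iter("container-transaction"):
        attribute = ct.findtext("trans-attribute")
        for method in ct.findall("method"):
            tx_by_bean[method.findtext("ejb-name")] = attribute

    beans = []
    for entity in root.iter("entity"):
        # Only container-managed beans carry cmp-field metadata.
        if entity.findtext("persistence-type") != "Container":
            continue
        name = entity.findtext("ejb-name")
        beans.append({
            "name": name,
            "cmp_fields": [f.findtext("field-name")
                           for f in entity.findall("cmp-field")],
            "primary_key": entity.findtext("primkey-field"),
            "transaction": tx_by_bean.get(name),
        })
    return beans


if __name__ == "__main__":
    for bean in summarize_descriptor(EJB_JAR_XML):
        print(bean)
```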
I won't rehash what all the various tags in a deployment descriptor mean.
Instead, let's find out what other metadata WebSphere uses in conjunction with
EJBs, and how you can use that metadata in your own projects.
Metadata in WebSphere 4.0
Let's begin by examining what happens when you generate the deployment code for
this EJB using the WebSphere Application Assembly Tool (AAT). Remember that
there are two forms of an EJB JAR:
"undeployed" form
Contains only the remote and home interfaces, the bean implementation class,
and the deployment descriptor.
"deployed" form
Contains the classes that are necessary to support persistence, transactions,
and distribution, and that are generated by the application server during
deployment.
What we want to do here is examine some of the information that
WebSphere uses in this deployment process. WebSphere AE supports three methods
for mapping CMP EJBs to a database:
Top-down
The information in the EJB is used to create a database table that corresponds
to the managed fields of the CMP EJB.
Meet-in-the-middle
There is a pre-specified correspondence between the managed fields in the CMP
EJB and the columns in one or more database tables.
Bottom-up
EJB fields are created for the columns in a database table.
The key point here is that WebSphere requires additional metadata beyond the EJB
deployment descriptor to perform these mappings. The metadata is used to drive
the code generation process for the classes that actually execute specific SQL
statements and then copy information out of the database tables into the EJB
and vice versa. If you can understand the metadata generated for a top-down
mapping, then you are well on your way to understanding how to use WebSphere to
map CMP EJBs to database tables via the meet-in-the-middle or bottom-up method.
If you use the WebSphere AAT to generate
deployment code for an EJB JAR file, or deploy an undeployed EJB JAR file using
the WebSphere Administration Console without specifying any additional
information about database mapping, it will perform a top-down mapping. So, if
you open the JAR file that contains this descriptor (attached) in AAT, generate
the deployment code, and then expand the JAR into a directory, you will see
that the META-INF directory now contains the following files:
One of these files is expected -- the MANIFEST file that is part of any JAR
file, so we won't pay special attention to it. The other files are the
interesting ones:
- ejb-jar.xml
The same as the one above, but modified by AAT to contain additional
id attributes.
- Schema/Schema.dbxmi
Contains an XML representation of the database schema and table that the CMP
EJB maps to.
- Map.mapxmi
Contains XML that shows how the CMP fields in the ejb-jar.xml
file map into the database schema in the schema file.
- Table.ddl
Contains the necessary SQL to create the table described in the schema file.
Let's begin by looking at what changed in the
ejb-jar.xml file.
The excerpt below shows the changes:
<ejb-jar id="ejb-jar_ID">
<enterprise-beans>
<entity id="ContainerManagedEntity_1">
<ejb-name>PersonEJB</ejb-name>
<home>com.ibm.demo.ejbs.PersonHome</home>
<remote>com.ibm.demo.ejbs.Person</remote>
<ejb-class>com.ibm.demo.ejbs.PersonBean</ejb-class>
<persistence-type>Container</persistence-type>
<prim-key-class>java.lang.Integer</prim-key-class>
<reentrant>False</reentrant>
<cmp-field id="CMPAttribute_1">
<field-name>id</field-name>
</cmp-field>
<cmp-field id="CMPAttribute_2">
<field-name>name</field-name>
</cmp-field>
<cmp-field id="CMPAttribute_3">
<field-name>age</field-name>
</cmp-field>
<cmp-field id="CMPAttribute_4">
<field-name>educationLevel</field-name>
</cmp-field>
<primkey-field>id</primkey-field>
</entity>
</enterprise-beans>
...
</ejb-jar>
As you can see, a few things have been added. AAT has added an id attribute to
the following tags:
- <ejb-jar>
- <entity>
- <cmp-field>
These id attributes uniquely identify each entity EJB contained in the JAR and each CMP field within it.
As we will see in a moment, this unique identification is crucial for WebSphere to operate correctly on the other metadata files.
The next file to become familiar with is not really a metadata file, but a file that WebSphere generates for your convenience.
This is the Table.ddl file, which contains the SQL to create the table for the top-down mapping:
CREATE TABLE PERSONEJB
(ID INTEGER NOT NULL,
NAME VARCHAR(250),
AGE INTEGER,
EDUCATIONLEVEL INTEGER);
ALTER TABLE PERSONEJB
ADD CONSTRAINT PERSONEJBPK PRIMARY KEY (ID);
If you carefully compare this file to the EJB deployment descriptor above,
you will see that the table that corresponds to this EJB has the same name
specified in the <ejb-name> tag in the deployment
descriptor, and that the columns of the table match the names in the <cmp-field>
tags above.
The column corresponding to the value of the <primkey-field>
tag has been declared NOT NULL (since it will be the key for this table), and a
primary key constraint has been added for this column as well.
You may be wondering how WebSphere knows what datatypes to use to create this
table. The answer is simple -- there is a fixed mapping of datatypes in the
database to the Java language types of the container-managed attributes defined
in the code of your EJB Bean class. This mapping varies from database to
database, which is why you must select the database type in either the AAT or
the WebSphere Administration Console when you deploy the EJB to WebSphere.
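To make the derivation concrete, here is a rough Python sketch that rebuilds the Table.ddl content from the descriptor values. It is not WebSphere's generator. The function name top_down_ddl and the JAVA_TO_SQL table are hypothetical, and the Java types of the four fields (Integer, String, int, int) are assumed from the type mappings that appear in the map file later in this article. The VARCHAR length of 250 is simply the value seen in the generated DDL.

```python
# Java type -> SQL type. Only the three pairs visible in this article's
# DB2 examples are listed; a real mapping is database-specific and larger.
JAVA_TO_SQL = {
    "java.lang.Integer": "INTEGER",
    "int": "INTEGER",
    "java.lang.String": "VARCHAR(250)",
}


def top_down_ddl(ejb_name, fields, primkey_field):
    """Build CREATE/ALTER statements for a top-down mapping (hypothetical).

    fields is a list of (cmp field name, Java type) pairs in declaration order.
    """
    table = ejb_name.upper()  # table name comes from <ejb-name>
    columns = []
    for name, java_type in fields:
        column = f"{name.upper()} {JAVA_TO_SQL[java_type]}"
        if name == primkey_field:
            column += " NOT NULL"  # the key column may not be null
        columns.append(column)

    create = f"CREATE TABLE {table}\n  (" + ",\n   ".join(columns) + ");"
    alter = (f"ALTER TABLE {table}\n"
             f"  ADD CONSTRAINT {table}PK PRIMARY KEY ({primkey_field.upper()});")
    return create + "\n" + alter


if __name__ == "__main__":
    person_fields = [
        ("id", "java.lang.Integer"),
        ("name", "java.lang.String"),
        ("age", "int"),
        ("educationLevel", "int"),
    ]
    print(top_down_ddl("PersonEJB", person_fields, "id"))
```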
Now that you've seen the Table.ddl file and understand how
WebSphere derived it from the code of your CMP EJB and the metadata in the EJB
deployment descriptor, the next file to investigate is the Schema.dbxmi
file held in the Schema subdirectory of the META-INF directory:
<xmi:XMI xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:RDBSchema="RDBSchema.xmi">
<RDBSchema:RDBDatabase xmi:id="RDBDatabase_1" name="TopDownDB" tableGroup="RDBTable_1">
<dataTypeSet href="UDBV7_Primitives.xmi#SQLPrimitives_1"/>
</RDBSchema:RDBDatabase>
<RDBSchema:RDBTable xmi:id="RDBTable_1" name="PERSONEJB"
primaryKey="SQLReference_1" database="RDBDatabase_1">
<columns xmi:id="RDBColumn_1" name="ID" allowNull="false"
group="SQLReference_1">
<type xmi:type="RDBSchema:SQLExactNumeric"
xmi:id="SQLExactNumeric_1">
<originatingType xmi:type="RDBSchema:SQLExactNumeric"
href="UDBV7_Primitives.xmi#SQLExactNumeric_1"/>
</type>
</columns>
<columns xmi:id="RDBColumn_2" name="NAME">
<type xmi:type="RDBSchema:SQLCharacterStringType"
xmi:id="SQLCharacterStringType_1" length="250">
<originatingType xmi:type="RDBSchema:SQLCharacterStringType"
href="JavatoDB2UDBNT_V71TypeMaps.xmi#SQLCharacterStringType_250"/>
</type>
</columns>
<columns xmi:id="RDBColumn_3" name="AGE">
<type xmi:type="RDBSchema:SQLExactNumeric"
xmi:id="SQLExactNumeric_2">
<originatingType xmi:type="RDBSchema:SQLExactNumeric"
href="UDBV7_Primitives.xmi#SQLExactNumeric_1"/>
</type>
</columns>
<columns xmi:id="RDBColumn_4" name="EDUCATIONLEVEL">
<type xmi:type="RDBSchema:SQLExactNumeric"
xmi:id="SQLExactNumeric_3">
<originatingType xmi:type="RDBSchema:SQLExactNumeric"
href="UDBV7_Primitives.xmi#SQLExactNumeric_1"/>
</type>
</columns>
<namedGroup xmi:type="RDBSchema:SQLReference"
xmi:id="SQLReference_1"
name="PERSONEJBPK" members="RDBColumn_1" table="RDBTable_1"
constraint="Constraint_PERSONEJBPK"/>
<constraints xmi:id="Constraint_PERSONEJBPK" name="PERSONEJBPK"
type="PRIMARYKEY" primaryKey="SQLReference_1"/>
</RDBSchema:RDBTable>
</xmi:XMI>
This file uses an XML standard called XMI (XML Metadata Interchange), which represents information about an object design or
object model in XML.
In fact, what it's describing is WebSphere's internal means of representing the
database schema for this EJB. It is not intended to be as easily readable as
the EJB deployment descriptor. However, it's not that hard to understand once
you study it for a few minutes. Immediately after the opening XMI tag that
describes the version and namespaces used by this file, you see the following
tags:
<RDBSchema:RDBDatabase xmi:id="RDBDatabase_1" name="TopDownDB" tableGroup="RDBTable_1">
<dataTypeSet href="UDBV7_Primitives.xmi#SQLPrimitives_1"/>
</RDBSchema:RDBDatabase>
The only important thing about this group of tags is that it specifies that this
particular schema uses the DB2 UDB 7 mapping to map Java types to database
types.
The next segment gets more interesting. Notice that these tags have the
following structure as shown in Figure 1 below.
Figure 1. Database Tag Structure
As you can see, there is an <RDBSchema:RDBTable> tag that corresponds to the table defined in the CREATE TABLE SQL above.
There is also a <columns> tag for each of the columns defined in the table. Finally, each <columns>
tag contains type information that describes both the originating type and the
type of the column. The originating type provides information on the primitive
database type (numeric, etc.), while the type tag shows how the originating
type is extended for this particular column (by providing length, scale, or
precision information).
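The table, column, and type nesting described above can be walked with a few lines of Python. Treat this as a sketch for exploring the file, not as a documented API. The function name describe_tables is hypothetical, the embedded XML is a shortened version of the schema listing, and the sketch assumes that a column without an allowNull attribute is nullable (which matches the generated DDL, where only ID is NOT NULL).

```python
import xml.etree.ElementTree as ET

# ElementTree expands prefixes, so xmi:id becomes "{namespace}id".
XMI = "{http://www.omg.org/XMI}"
RDB = "{RDBSchema.xmi}"

# Shortened version of the schema listing above (two columns only).
SCHEMA_XML = """
<xmi:XMI xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
         xmlns:RDBSchema="RDBSchema.xmi">
  <RDBSchema:RDBTable xmi:id="RDBTable_1" name="PERSONEJB"
                      primaryKey="SQLReference_1">
    <columns xmi:id="RDBColumn_1" name="ID" allowNull="false"
             group="SQLReference_1">
      <type xmi:type="RDBSchema:SQLExactNumeric" xmi:id="SQLExactNumeric_1"/>
    </columns>
    <columns xmi:id="RDBColumn_2" name="NAME">
      <type xmi:type="RDBSchema:SQLCharacterStringType"
            xmi:id="SQLCharacterStringType_1" length="250"/>
    </columns>
  </RDBSchema:RDBTable>
</xmi:XMI>
"""


def describe_tables(xml_text):
    """Map each table name to a list of column descriptions (hypothetical)."""
    root = ET.fromstring(xml_text)
    tables = {}
    for table in root.iter(RDB + "RDBTable"):
        columns = []
        for col in table.findall("columns"):
            col_type = col.find("type")
            columns.append({
                "id": col.get(XMI + "id"),
                "name": col.get("name"),
                # Assumption: a missing allowNull attribute means nullable.
                "nullable": col.get("allowNull", "true") != "false",
                "type": col_type.get(XMI + "type"),
                "length": col_type.get("length"),  # None for numeric types
            })
        tables[table.get("name")] = columns
    return tables


if __name__ == "__main__":
    for name, columns in describe_tables(SCHEMA_XML).items():
        print(name)
        for column in columns:
            print("  ", column)
```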
Here we have an XML definition of the table. At first glance, this doesn't seem
useful, since it is very similar to the information in the Table.ddl
file. However, the next file, the Map.mapxmi file, brings
everything together and helps all this make sense:
<ejbrdbmapping:EjbRdbDocumentRoot xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
xmlns:ejbrdbmapping="ejbrdbmapping.xmi" xmlns:ejb="ejb.xmi"
xmlns:RDBSchema="RDBSchema.xmi" xmlns:Mapping="Mapping.xmi"
xmi:id="EjbRdbDocumentRoot_1" outputReadOnly="false" topToBottom="true">
<helper xmi:type="ejbrdbmapping:RdbSchemaProperies"
xmi:id="RdbSchemaProperies_1" primitivesDocument="DB2UDBNT_V71">
<vendorConfiguration
href="RdbVendorConfigurations.xmi#DB2UDBNT_V71_Config"/>
</helper>
<inputs xmi:type="ejb:EJBJar" href="META-INF/ejb-jar.xml#ejb-jar_ID"/>
<outputs xmi:type="RDBSchema:RDBDatabase"
href="META-INF/Schema/Schema.dbxmi#RDBDatabase_1"/>
<nested xmi:type="ejbrdbmapping:RDBEjbMapper" xmi:id="RDBEjbMapper_1">
<helper xmi:type="ejbrdbmapping:PrimaryTableStrategy"
xmi:id="PrimaryTableStrategy_1">
<table href="META-INF/Schema/Schema.dbxmi#RDBTable_1"/>
</helper>
<inputs xmi:type="ejb:ContainerManagedEntity"
href="META-INF/ejb-jar.xml#ContainerManagedEntity_1"/>
<outputs xmi:type="RDBSchema:RDBTable"
href="META-INF/Schema/Schema.dbxmi#RDBTable_1"/>
<nested xmi:id="PersonEJB_id---PERSONEJB_ID">
<inputs xmi:type="ejb:CMPAttribute"
href="META-INF/ejb-jar.xml#CMPAttribute_1"/>
<outputs xmi:type="RDBSchema:RDBColumn"
href="META-INF/Schema/Schema.dbxmi#RDBColumn_1"/>
<typeMapping
href="JavatoDB2UDBNT_V71TypeMaps.xmi#Integer-INTEGER"/>
</nested>
<nested xmi:id="PersonEJB_name---PERSONEJB_NAME">
<inputs xmi:type="ejb:CMPAttribute"
href="META-INF/ejb-jar.xml#CMPAttribute_2"/>
<outputs xmi:type="RDBSchema:RDBColumn"
href="META-INF/Schema/Schema.dbxmi#RDBColumn_2"/>
<typeMapping
href="JavatoDB2UDBNT_V71TypeMaps.xmi#String-VARCHAR"/>
</nested>
<nested xmi:id="PersonEJB_age---PERSONEJB_AGE">
<inputs xmi:type="ejb:CMPAttribute"
href="META-INF/ejb-jar.xml#CMPAttribute_3"/>
<outputs xmi:type="RDBSchema:RDBColumn"
href="META-INF/Schema/Schema.dbxmi#RDBColumn_3"/>
<typeMapping
href="JavatoDB2UDBNT_V71TypeMaps.xmi#int-INTEGER"/>
</nested>
<nested xmi:id="PersonEJB_educationLevel---PERSONEJB_EDUCATIONLEVEL">
<inputs xmi:type="ejb:CMPAttribute"
href="META-INF/ejb-jar.xml#CMPAttribute_4"/>
<outputs xmi:type="RDBSchema:RDBColumn"
href="META-INF/Schema/Schema.dbxmi#RDBColumn_4"/>
<typeMapping
href="JavatoDB2UDBNT_V71TypeMaps.xmi#int-INTEGER"/>
</nested>
</nested>
<typeMapping xmi:type="Mapping:MappingRoot"
href="JavatoDB2UDBNT_V71TypeMaps.xmi#Java_to_DB2UDBNT_V71_TypeMaps"/>
</ejbrdbmapping:EjbRdbDocumentRoot>
A few things are key to understanding how WebSphere
EJB to RDB mapping works. It is not my intention to tell you how to generate
this file from scratch, but instead to explain what it does so that you'll be
able to make small changes to this file (and the others we've covered) in order
to handle simple challenges in CMP mappings with WebSphere.
Start off by examining the following lines of code.
<inputs xmi:type="ejb:ContainerManagedEntity" href="META-INF/ejb-jar.xml#ContainerManagedEntity_1"/>
<outputs xmi:type="RDBSchema:RDBTable" href="META-INF/Schema/Schema.dbxmi#RDBTable_1"/>
Here we have the first indication of what is going on. As you can see, these two
lines link a specific EJB in the ejb-jar.xml file
(ContainerManagedEntity_1, which was the id of the "PersonEJB" we saw earlier)
to a particular database table defined in the schema (RDBTable_1, which is
the PERSONEJB table previously seen in the schema file). In fact, if this were
a multiple-table mapping (one where the columns came from two or more tables),
you'd see multiple <outputs> tags, each referring to a
different schema file and table within that file (see footnote 1).
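This same principle continues throughout the rest of the file, as the next segment indicates:
<nested xmi:id="PersonEJB_id---PERSONEJB_ID">
<inputs xmi:type="ejb:CMPAttribute"
href="META-INF/ejb-jar.xml#CMPAttribute_1"/>
<outputs xmi:type="RDBSchema:RDBColumn"
href="META-INF/Schema/Schema.dbxmi#RDBColumn_1"/>
<typeMapping
href="JavatoDB2UDBNT_V71TypeMaps.xmi#Integer-INTEGER"/>
</nested>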
In this segment you see the connection between a particular container-managed
field defined in the
ejb-jar.xml file (CMPAttribute_1, which is
the field id) and a particular database column defined in the schema
(RDBColumn_1, which is the ID column). After the input and output mappings are
defined, the final piece to this puzzle is the type mapping -- which (as you
can see) maps a Java type (Integer) to a relational database type (INTEGER).
This kind of mapping is repeated for all of the CMP fields in the EJB.
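One way to convince yourself of how the three files hang together is to follow the href fragments by hand, or with a small script. The Python sketch below resolves each field-level <nested> entry in the map to a CMP field name and a column name. It illustrates the cross-references only and is not WebSphere's own resolution logic. The helper names index_by_id and field_to_column are hypothetical, and the three XML strings are cut-down versions of the listings in this article.

```python
import xml.etree.ElementTree as ET

XMI_ID = "{http://www.omg.org/XMI}id"  # expanded form of xmi:id

EJB_JAR_XML = """
<ejb-jar id="ejb-jar_ID">
  <enterprise-beans>
    <entity id="ContainerManagedEntity_1">
      <ejb-name>PersonEJB</ejb-name>
      <cmp-field id="CMPAttribute_1"><field-name>id</field-name></cmp-field>
      <cmp-field id="CMPAttribute_2"><field-name>name</field-name></cmp-field>
    </entity>
  </enterprise-beans>
</ejb-jar>
"""

SCHEMA_XML = """
<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" xmlns:RDBSchema="RDBSchema.xmi">
  <RDBSchema:RDBTable xmi:id="RDBTable_1" name="PERSONEJB">
    <columns xmi:id="RDBColumn_1" name="ID"/>
    <columns xmi:id="RDBColumn_2" name="NAME"/>
  </RDBSchema:RDBTable>
</xmi:XMI>
"""

MAP_XML = """
<ejbrdbmapping:EjbRdbDocumentRoot xmlns:xmi="http://www.omg.org/XMI"
    xmlns:ejbrdbmapping="ejbrdbmapping.xmi">
  <nested xmi:type="ejbrdbmapping:RDBEjbMapper" xmi:id="RDBEjbMapper_1">
    <inputs href="META-INF/ejb-jar.xml#ContainerManagedEntity_1"/>
    <outputs href="META-INF/Schema/Schema.dbxmi#RDBTable_1"/>
    <nested xmi:id="PersonEJB_id---PERSONEJB_ID">
      <inputs href="META-INF/ejb-jar.xml#CMPAttribute_1"/>
      <outputs href="META-INF/Schema/Schema.dbxmi#RDBColumn_1"/>
      <typeMapping href="JavatoDB2UDBNT_V71TypeMaps.xmi#Integer-INTEGER"/>
    </nested>
    <nested xmi:id="PersonEJB_name---PERSONEJB_NAME">
      <inputs href="META-INF/ejb-jar.xml#CMPAttribute_2"/>
      <outputs href="META-INF/Schema/Schema.dbxmi#RDBColumn_2"/>
      <typeMapping href="JavatoDB2UDBNT_V71TypeMaps.xmi#String-VARCHAR"/>
    </nested>
  </nested>
</ejbrdbmapping:EjbRdbDocumentRoot>
"""


def index_by_id(xml_text, id_attr):
    """Index every element that carries the given id attribute."""
    return {el.get(id_attr): el
            for el in ET.fromstring(xml_text).iter() if el.get(id_attr)}


def field_to_column(ejb_xml, schema_xml, map_xml):
    """Return {cmp field name: (column name, type mapping)} (hypothetical)."""
    by_ejb_id = index_by_id(ejb_xml, "id")
    by_schema_id = index_by_id(schema_xml, XMI_ID)
    result = {}
    for nested in ET.fromstring(map_xml).iter("nested"):
        inputs, outputs = nested.find("inputs"), nested.find("outputs")
        type_mapping = nested.find("typeMapping")
        if inputs is None or outputs is None or type_mapping is None:
            continue  # e.g. the bean-to-table entry has no typeMapping
        # The part after '#' is the id inside the referenced file.
        source = by_ejb_id.get(inputs.get("href").split("#")[1])
        target = by_schema_id.get(outputs.get("href").split("#")[1])
        if source is None or target is None or source.tag != "cmp-field":
            continue
        result[source.findtext("field-name")] = (
            target.get("name"),
            type_mapping.get("href").split("#")[1],
        )
    return result


if __name__ == "__main__":
    print(field_to_column(EJB_JAR_XML, SCHEMA_XML, MAP_XML))
```

Because the lookup goes through the ids and never through the names, either side can be renamed without breaking the link, which is the property the tricks later in this article rely on.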
If you're familiar with Converters in VisualAge for Java EJB Support, you'll be
relieved to know that the <typeMapping> tag is used to pick
the default converter. If you need a different conversion than what is
specified (say a specialized converter that knows how to convert the special
Strings "Yes" and "No" to a boolean), you can specify this through a <helper>
tag at this point.
Figure 2 below shows the interaction between these three primary XML files and their constituent parts.
Figure 2. Metadata file relationships
Simple Metadata Tricks
Now that you know about the existence, structure, and interrelationships of
these XML files, the question is, what do you do with them?
First of all, let's clarify what you should not do with them.
You should not try to create these files in order to perform your own bottom-up
or complex meet-in-the-middle mapping.
The reason is that the underlying schemas aren't fully documented in the
WebSphere documentation, because these files are intended to be generated and
edited by the WebSphere toolset: VisualAge for Java 4.0 and, especially,
WebSphere Studio Application Developer (see footnote 2).
The Application Developer documentation contains the best description
of the internal representation of the XMI object model that these files use.
If you are a tool builder who wants to generate your own entity EJBs using this
information, consider using the documented Application Developer tool APIs to
construct these files, rather than trying to reverse-engineer an object model
from the XML.
On the other hand, there are a couple of instances where directly changing the
XML can be the easiest way of updating your EJBs.
For example, many corporate environments have different database tables set up
to support development, test, and production.
In some cases, these databases may be hosted on the same instance of DB2 or
Oracle, and only differ by schema name (you might have DEV.PERSONEJB,
TEST.PERSONEJB and PROD.PERSONEJB).
How would you write your code so that it doesn't depend on which
environment it runs in? In the case of CMP Entity EJBs, WebSphere makes it simple.
All you need to do is change the name of the schema in the schema tag, and then
deploy the EJB JAR file to the different WebSphere instances used for the three
environments. For example, for DEV, your tag might look like this:
You can automate simple substitution with
tools like AWK, SED, or even ANT, which could also be used to invoke the
appropriate WebSphere command-line tool (SEAppInstall on Advanced Single Server
Edition, or WSCP on Advanced Edition) to generate the deployment code and
install the resulting application.
In this case, you'd start with an undeployed EJB JAR file, deploy it once, and
then copy the metadata files described above back into the build tree of your
project so that they become part of the undeployed JAR file.
When you deploy the JAR, WebSphere picks up the metadata files and generates
the deployment code appropriately.
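As a rough illustration of the kind of substitution a build script might perform, here is a Python equivalent of a one-line sed command. It rests on an assumption that you should verify against your own generated files: that the schema name is carried in the name attribute of the <RDBSchema:RDBDatabase> tag (TopDownDB in the listing earlier). The function name set_database_name is hypothetical.

```python
import re

# The RDBDatabase tags from the schema listing earlier in this article.
SCHEMA_XML = '''<RDBSchema:RDBDatabase xmi:id="RDBDatabase_1" name="TopDownDB" tableGroup="RDBTable_1">
<dataTypeSet href="UDBV7_Primitives.xmi#SQLPrimitives_1"/>
</RDBSchema:RDBDatabase>'''

# Matches the opening RDBDatabase tag up to and including name=" and
# captures the closing quote, so only the attribute value is replaced.
_NAME_ATTR = re.compile(r'(<RDBSchema:RDBDatabase\b[^>]*?\bname=")[^"]*(")')


def set_database_name(schema_text, new_name):
    """Replace the name attribute of the RDBDatabase tag (hypothetical).

    Assumption: this attribute is where the schema name lives.
    """
    updated, count = _NAME_ATTR.subn(
        lambda m: m.group(1) + new_name + m.group(2), schema_text)
    if count != 1:
        raise ValueError(f"expected exactly one RDBDatabase tag, found {count}")
    return updated


if __name__ == "__main__":
    for environment in ("DEV", "TEST", "PROD"):
        print(set_database_name(SCHEMA_XML, environment).splitlines()[0])
```

A plain text substitution is used here, rather than parsing and rewriting the XML, so that the rest of the file, including its namespace prefixes, is left byte-for-byte unchanged.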
Another simple change you can make is to update the XML to perform a minimal
meet-in-the-middle mapping when either the EJB definition or the database
schema changes. For instance, suppose you decide later in the project to change
the name of the educationLevel CMP field to edLevel. You'd only need to update
the ejb-jar.xml file to change the field like this:
<cmp-field id="CMPAttribute_4">
<field-name>edLevel</field-name>
</cmp-field>
Keep the id the same, because (as we saw earlier) the id is actually used to map
the CMP field to the corresponding column in the schema. As you can imagine, a
corresponding change in the database would involve keeping the ejb-jar.xml
the same, while updating the schema.dbxmi file appropriately.
Again, in either case, redeploy the EJB jar file after editing the XML.
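If you'd rather script the rename than edit the file by hand, a sketch along the following lines would do it. The function name rename_cmp_field is hypothetical and the embedded XML is a cut-down descriptor. Note that the sketch changes only the <field-name> text for the given id; the bean class itself, and the <primkey-field> tag if you rename the key field, would still need to be updated separately.

```python
import xml.etree.ElementTree as ET

# Cut-down descriptor with the ids that AAT added.
EJB_JAR_XML = """
<ejb-jar id="ejb-jar_ID">
  <enterprise-beans>
    <entity id="ContainerManagedEntity_1">
      <ejb-name>PersonEJB</ejb-name>
      <cmp-field id="CMPAttribute_3"><field-name>age</field-name></cmp-field>
      <cmp-field id="CMPAttribute_4"><field-name>educationLevel</field-name></cmp-field>
    </entity>
  </enterprise-beans>
</ejb-jar>
"""


def rename_cmp_field(ejb_xml, attribute_id, new_name):
    """Rename the CMP field with the given id, leaving the id untouched.

    Returns the updated XML as a string. ElementTree does not preserve a
    DOCTYPE declaration, so a real script would have to re-add it.
    """
    root = ET.fromstring(ejb_xml)
    for cmp_field in root.iter("cmp-field"):
        if cmp_field.get("id") == attribute_id:
            cmp_field.find("field-name").text = new_name
            return ET.tostring(root, encoding="unicode")
    raise KeyError(f"no cmp-field with id {attribute_id!r}")


if __name__ == "__main__":
    print(rename_cmp_field(EJB_JAR_XML, "CMPAttribute_4", "edLevel"))
```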
Summary
This article has examined some of the hidden parts of CMP EJB mapping to
relational databases in WebSphere 4.0 and Application Developer. It described a
little bit about how the ejb-jar, schema, and map files interoperate, and how
the tools that operate on these files function. This information can help you
make better use of the WebSphere tools for CMPs, and plan the best way to
handle automated configuration and deployment issues involving CMPs.
Part 2 of this article will examine some of the other features of these
files, such as associations, inheritance, converters, and composers, and also
examine the ejb-jar extension file, which is used in custom finder methods for
CMP EJBs.
Footnotes
1 This is because each schema file currently
contains only a single table.
2 AAT can generate these documents for a
top-down mapping, but you cannot edit them or perform the other mapping types
directly in AAT.