Wrox Home  
Search
Professional Oracle Programming
by Rick Greenwald, Robert Stackowiak, Gary Dodge, David Klein, Ben Shapiro, Christopher G. Chelliah
June 2005, Paperback


Excerpt from Professional Oracle Programming

Database Design Basics

There is a common belief that database design is done by the database administrator (DBA), while program design is done by programmers. While this may seem like a natural division of labor, it is actually quite unfortunate. Applications are made up of two equal and interrelated parts: data and processes. Neither has any purpose without the other and weaknesses in the design of either (or their intimate relationship) will be reflected in the entire system.

Design of the database and design of the total application require parallel efforts with frequent validations of each against the other. An elegant database design that requires awkward programming will be a maintenance nightmare. A tightly integrated and effective delineation of program modules can be rendered useless if the data needed by those modules is not structured for convenient access. Even worse is the discovery, as an application nears its final integration, that some data needed for a particular function isn't available or that some portion of the database has no means of being populated or maintained. It happens.

The solution is for the entire development team, analysts, programmers, and DBAs to work together with the system's eventual users to mutually discover both functional and data requirements and coordinate their implementation. DBAs need to gain an appreciation for modern programming constructs, while programmers need to understand the tenets of good database design.

Some parts of application development (systems engineering) are methodical and almost scientific. There are straightforward ways of performing certain common tasks that can be readily adopted to solve specific problems. Database design tends to have more guidelines and ideas than absolute rules. What rules do exist are always presented in the context of when they should be broken.

So consider database design to be more art than science. Don't ever attempt to argue that your chosen layout is the one correct design; further, be extremely suspicious of anyone else (DBA, consultant, manager, or Peruvian llama farmer) who insists that his or her approach is the only right solution.

The only absolute rule of database design is that there are always alternatives. That is not to say that some alternatives are not better than others; clearly there are wrong designs. The point is that the relative merits of design alternatives are only evaluated in the context of how the database will be used. These "how" issues are the province of the process designers and programmers. No DBA can develop an optimal database design without intimate knowledge of the way all of the programs will need to access the data.

Even with knowledge of the programs' data needs, it is still necessary to evaluate alternatives in light of multiple often-conflicting priorities. Performance is generally one of the priorities, but then it is necessary to determine which of the programs' performance needs are most critical. Modifying the database design to optimize a particular access will inevitably make some other accesses less efficient.

For an order entry system that takes in thousands of small, online orders each day, it may be most critical to optimize the database to accommodate that specific process, even though reporting, inventory management, shipping, and other related functions are either made slower or more complex. Another order entry system that receives only a few orders per hour for thousands of line items that have stringent availability requirements might be better designed around the needs of demand forecasting and analytics.

Let me offer one war story. I was the lead database designer for an order entry and customer service database for a large retailer in the early 1990s. The eventual system, when it went into production, included the largest known Oracle database on Unix in the world. I had an idea for a rather radical physical design change that would dramatically increase the availability of data in the event of a disk failure. (With the size of the total database using the 2 GB disks of the day, with no available disk mirroring, we calculated that we could expect an average of one disk failure per month.) This change would reduce recovery time by at least 50 percent and allow continued access, during most failures, to 90 percent of the customers' data while the 10 percent was being recovered.

The denormalization idea that I was considering would place some additional programming requirements on most of the common modules that had interfaces to the database. The design choice would also make certain types of access impractical, if not impossible. I spent two full months coordinating with each analyst from every subsystem team to make sure that there were no requirements to access customer order history data except within the context of a particular customer. Finally, everyone agreed that they had no such requirement. I made the database design change, and all database access modules in the system were modified to access the redesigned table structures.

All was well until about a month before we were ready to begin stress testing with an almost fully loaded database. A developer had been experimenting (without requirements or specifications) on a really interesting set of reports that would allow management to see the popularity of products across all customers. Nobody knew about this "skunkworks" operation until he brought me the SQL queries and asked to have access to the stress test database that was then about half loaded. I looked at his queries and told him they couldn't run. He was indignant and insisted that I was just a consultant and only his management could decide what was allowed to run. I tried to explain that I was not discussing permission, but practical reality. He wouldn't take no for an answer, so I let him start one of the reports before I left on Friday. When I returned on Monday morning, he understood my distinction between "can not run" and "may not run." I killed the still incomplete query.

So who was to blame? Not him — his idea would have been a great tool for management and just predated an analytical warehouse in that shop by about three years. Not me-I built a database designed to meet all of the application requirements and provide extremely high availability given the technology of the day. The blame (if that concept even applies) was only in the project management that allowed the mismatch between program/functional design and database design; neither took the other fully into account.

Only by understanding the many conflicting requirements of a system, evaluating their relative importance and then understanding the implications of the trade-offs can the design team achieve a final, optimized design.

Database Design Phases

Database design is not a single effort. Typically, the design of the database goes through multiple high-level phases. At each step it is necessary to review the current state of the database design and compare it to the current state of the processing design. At each phase the comparison becomes more detailed and exacting.

Conceptual Database Design

There is no significant reason to separate conceptual design from logical design. Some designers or methodologies move some preliminary activities into this phase, such as identifying key business drivers and high-level purposes of an application (and its database). The output of such a preliminary phase would typically be a few paragraphs (or, at most, pages) that describe the opportunity for improving a business process or provide a new service with an estimate of potential benefits and a statement of the scope of the effort. This can provide a useful perspective to set boundaries on the rest of the design processes to prevent expanding beyond what can be reasonably accomplished.

Logical Database Design

From a list of requirements (as they are discovered and documented), the team initially identifies a long list of data elements that they know must be included in the database. They additionally sort the data elements into groupings, informally at first and then with more rigor, around the entities of importance to the organization.

Logical design is performed without concern to the specific database management system that will eventually host the application's data. The logical database design for a system will look identical whether the eventual system will be hosted on DB2 on the mainframe or on SQL Server on a Windows workstation. That isn't to say that the eventual systems will be identical — it's just to say that those differences don't appear during logical design.

During the logical design phase the development team determines what data needs to be acquired, stored, and used by a particular application. Simultaneously, the team identifies what functional processes need to be performed by the application. As each model is refined, it is compared to the other to ensure that needed data for each process is included in the data model and that each element of data included in the data model has some identified processes that will create, use, and (usually) modify or purge the data.

Various methodologies, some very rigorous, others very casual, may be used by an organization for gathering requirements and identifying data elements that may be of importance to an application. Two complementary techniques [are] Entity-Relationship Diagramming and Unified Modeling Language.

For all of the data elements collected in this phase, it is necessary to begin documenting some basic characteristics of the data. You need to know where the data comes from (if it isn't being created by the new application), and you will want to gather some preliminary estimates of the type of data:

  • Is it strictly numeric? Will it always be integers or allow decimal values?
  • Is the data item a date? If so, what granularity is required . . . to the day? to the minute? to the millisecond?
  • Will the data be alphanumeric or will it need to include other characters?
  • What if a needed element isn't character data but an image or some other unstructured object? If so, how large do you expect each instance to be?
  • Is there a defined domain (range or list of allowable values) associated with the data? Does the domain relate to other data expected to be in your system? Does the domain have an external definition rule (such as spatial coordinates limited to valid latitude and longitude pairs)?
  • What is the expected lifetime of this particular data element? Is it transitory or does it continue to exist for a predictable period of time?

Gathering an initial shopping list of data elements is a necessary but relatively casual process during the early parts of logical system design. Some discipline, however, is soon needed. The primary means of organizing this rapidly growing list of data elements is normalization and is traditionally documented using Entity-Relationship Diagram (ERD) techniques.

When developers and DBAs are working effectively together, the ERDs may be supplemented with use case diagrams as well. Examples of both of these techniques are shown in the discussion of database design tools later in this excerpt.

Physical Design

The logical database design phase delivered a consistent structure of the data without regard to the eventual deployment within a specific database environment. It is possible to directly translate the logical design into a set of tables and build a database. It is usually appropriate, though, to take advantage of the specific capabilities (and limitations) of the selected relational database management system (RDBMS).

Some shops have declared that they want to maintain RDBMS neutrality and therefore will not utilize any of the extended capabilities offered by any single RDBMS. This argument is na´ve in at least three aspects.

  • First is the assumption that the "least common denominator" capabilities of all RDBMSs will be sufficient to satisfy the application's needs. The standard SQL language is the basis of all current relational database systems, but SQL does not include specifications for much of the necessary infrastructure of an actual database. Additionally, there have been several revisions of the SQL standard as well as different levels of compliance by commercial products for each revision. Two RDBMSs may both be SQL-compliant but incompatible because of the different subsets of functionality that they implement. Discovering what the true "lowest common denominator" between every SQL database would be more work than just choosing one and exploiting it.
  • Second is the assumption that by avoiding special features, the resulting design will actually be portable across RDBMSs without changes. There are so many operational and administrative differences between different databases that the SQL commonality isn't enough to facilitate simple transfer of data and application from one RDBMS to another.
  • The argument for this decision is commonly expressed as, "We decided to purchase Oracle because our study indicates that it is the best product on the market today. However, if another product is better three years from now, we want to be able to move our application to that new platform." Do you see the fundamental flaw in this logic? What led this organization to determine that Oracle was the best product available today? Inevitably it was the extended set of features differentiating it from competitors that influenced this decision and justified the expense of licensing Oracle. Ignoring all of those additional capabilities means the organization pretty much wasted the months of evaluation time as well as the licensing costs.

In any event, if your organization wants to build a system that will run unchanged on either Oracle or MySQL, you should just choose MySQL and save some money rather than lobotomize Oracle so that it will behave like MySQL.

So, logically, the next step must be to translate the generic logical design into something that is specific to the implementation using a particular RDBMS. For this article's purpose, that RDBMS will obviously be Oracle. How do you represent the logical table structure into the most reliable, best performing, most scalable, and highest concurrency database that can be built?

We won't try to preview Oracle's many features here. We will move on from this topic with the understanding that it is during this physical design phase that you should select the appropriate technology options available from Oracle that will best meet your application needs. You will need to determine access paths to the data and decide on appropriate indexing and partitioning schemes. You will have to look at your tables' needs for unique key values and build corresponding sequence generators. You'll consider the groups of users that will need similar privileges and will incorporate them into roles. You'll consider the need for alternative and potentially conflicting access patterns and possibly define views or materialized views to meet those needs. You'll consider your transactions' needs for storing intermediate result sets and define appropriate global temporary tables. If you find that you have user groups with the need for segregating their data, you will evaluate the alternatives of creating separate databases, and separate schemas or using virtual private database policies or adding label columns in the data tables to support Oracle Label Security.

It is during this phase that knowledge of available RDBMS features and experience with the requirements, usage, and limits of each comes into play. Physical database design is primarily an evaluation of trade-offs. For any given logical design, we guarantee that there are at least a hundred possible physical designs to implement it. We'll further guarantee that there will be significant differences in how those alternative physical designs behave, operate, and perform. There is usually not one absolutely correct alternative in that list, but some will fit the total set of requirements better than the others. Sorting through the alternatives by evaluating the compromises imposed by the conflicting requirements is the heart of the art of database design.

Practical Design

Okay, if you read the many good books devoted to the topic of database design, you probably won't see practical design listed as one of the standard phases of design. But in real systems development projects, there is a final phase in which the physical design is actually tested and various accommodations, minor or otherwise, are made in response to the discoveries made during testing. Even the most experienced database designers will not be able to fully anticipate the effects of the actual programs running with actual data volumes.

If your team did a good job with physical design, then the database that you create and load is likely to meet the design requirements well. However, many of the requirements that are obtained early in the process may get revised as programs are built and users are able to do their first hands-on testing of the application. Sometimes the estimates of how many users, how many rows, how many transactions, and so on turn out to be inaccurate. As this new knowledge is acquired, it is important to revisit the physical design and potentially reconsider some of the alternatives set aside earlier.

In many cases, this ongoing practical design process can be transparent to the application programs. The Oracle DBA can generally change indexing, partitioning, and virtual private database policies; add views and materialized views; and move a LOB (large object, either of character or binary data) to out-of-line storage without having to modify actual SQL. However, learning late in the development cycle that what you thought was a one-to-one relationship is actually one-to-many can have significant ripples throughout the design.

The key point of considering this as an ongoing part of the development cycle is that the earlier you recognize the need for adapting your database design, the easier it will be to fully investigate alternatives, make the necessary changes, and fully test the results. Discovering these needed changes the day after you put the database and application into production will be several orders of magnitude more painful.

Database Design Tools

Using a tool to support the documentation, diagramming, and iterative evolution of data models is strongly advised. Simple databases with only a few entities may not require a modeling tool, but "real" systems with dozens of entities, hundreds of data elements, and large development teams will need the capabilities of such a tool.

One good book on database design, not directed toward DBAs, Database Design for Mere Mortals, 2nd Edition, by Michael J. Hernandez (Addison-Wesley, 2003), provides a thorough description of one approach to the logical (but not physical) database design process. It provides a system for doing design supported by paper forms rather than an automated tool. Such an approach is possible and is actually a good general way to learn the subject. The paper forms to gather and organize information about tables, fields, physical and logical characteristics, relationships, business rules, and so on work well for a college course case study's level of complexity. The book is recommended to a developer who wants more depth on the process of logical database design — but with the two caveats that any significant project will require automation of the overwhelming amount of data gathered and organized and that the book does not cover physical design topics specific to Oracle.

Over the past 20 years or so, data modeling using entity-relationship diagrams, as originally proposed by Peter Chen, has become a standardized methodology with excellent support from several vendors' modeling tools. The most common of these tools (although not necessarily the most powerful) is ERwin Data Modeler, now provided by Computer Associates. Oracle Corporation's tool in this space, Oracle Designer, has unfortunately been somewhat lost in the shadow of its newer sister product, JDeveloper. Both are bundled into the Oracle Developer Suite along with Oracle's other development tools. If your shop acquires JDeveloper for building, debugging, and deploying Java and PL/SQL modules, be aware that you have also already acquired an excellent data modeling tool (along with Oracle Warehouse Builder for designing data warehouses and their requisite data extraction and transformation processes).

Most of these tools do a reasonable job of aiding the team during logical design, since, by definition, this is a generic process that isn't tied to a specific database management system's particular capabilities and extensions. It is during the later stages of physical (and practical) design that it is important to have a tool that is capable of fully exploiting all of the capabilities of the RDBMS. In the case of designing a database specifically for Oracle, you'll want a tool that can let you define and build partitioned tables, index-organized tables, sequence number generators, global temporary tables, bitmap indexes, materialized views, and so on.

With a tool that isn't aware of the full capabilities of the specific RDBMS, the development team must manually modify and maintain the database definitions generated by the tool. In the long run, this manual effort will present a lot of additional work and risk.

Object-Oriented Design Tools

At least a small entry needs to be added to discuss a class of program development tools that have the capability of generating database structures. Developer frameworks, such as J2EE Entity Beans, can automatically generate a database object (and the necessary access routines) that will meet the narrow needs of the particular bean to persist its state. It is dangerous to depend upon such default database objects, however. In many cases the generated object will not properly exploit the underlying database's features for performance and scalability. Further, the administration of a database with such unmodeled objects will be unnecessarily difficult and potentially lead to problems in production.

This is not to say that developers shouldn't use the capability when prototyping their program modules. Just consider the generated objects as an input to the actual database design process and not a substitute for it. Several narrowly defined objects may be better modeled as part of a larger picture, tied to more general business requirements, and thereby become more resilient to future application enhancements.

Database Design Techniques

The preceding section on design tools mentioned entity-relationship modeling as a key technique used in the development and discussion of database designs. Unified Modeling Language is another set of popular techniques that can also aid in this process.

Entity-Relationship Modeling

Entity-Relationship Diagrams (ERDs) attempt to accomplish two things early in the design of a database. First it documents the entities of interest to the enterprise. Entities may represent things (such as a customer, an employee, or a product) or an event (like a registration, a shipment, or an order). Entities have identifying characteristics and additional descriptive characteristics. For instance, an employee has an employee number, a name, a home address, a work location, a supervisor, a hire date, a pay rate, and so on. One of the challenges during design will be sorting through all of the available data "thingies" to figure out which are entities, which are identifiers of entity instances, and which are descriptive of particular instances of an entity.

In addition to documenting entities, an ERD documents the relationship between entities. From the list of example entities in the previous paragraph, you might conclude that there is a relationship between a customer and a product. It might be direct (a CUSTOMER owns a PRODUCT) or it might be more complex set of relationships:

  • each CUSTOMER may place one or more ORDERs
  • each ORDER must include one or more PRODUCTs
  • each SHIPMENT must include at least one PRODUCT to one and only one CUSTOMER
  • each ORDER may be fulfilled with one or more SHIPMENTs
  • a SHIPMENT may include parts of one or more ORDERs for one and only CUSTOMER
  • and so on

This list is an example of the relationship rules that might exist for a company. Another company with the same entities may have very different relationships and rules. For instance, an alternative relationship might be that "a SHIPMENT fulfills one and only one ORDER". Sorting through these relationships and the business rules that define them is the essence of designing a database that will meet the requirements of a specific business. Getting one rule wrong during logical design can lead to a database that will not be able to support necessary business processes.

Setting aside the order entry discussion, see Figure 1 for a quick illustration of the ERD modeling technique for documenting a logical database design that might be part of a (highly simplified) insurance billing system. Each box designates an entity . . . an object of importance to the enterprise. The ERD doesn't attempt to show all of the low-level data elements (attributes) associated with each entity. At most, identifying elements (keys) are shown. A total of seven entities have been shown in this example.

Relationships between entities are indicated by lines connecting them. Figure 1 illustrates an ERD with eight entities and nine relationships.

Figure 1
Figure 1: A simplified example of an ERD for the medical insurance claims.

While we might have shown INSURED and POLICY_OWNER as completely distinct entities, this model assumes that there are many characteristics of the two that overlap, so they have been shown as two subtypes of the PERSON entity.

The second aspect of an ERD is to document the relationships between entities, as shown by the connecting lines. One line is drawn for each relationship; every entity should be expected to have at least one relationship to some other entity, but not all entities are necessarily related to every other entity. It is possible for two entities to have more than one relationship.

Further, it should be recognized that every relationship exists in two directions. If entity A has a relationship to entity B, then entity B has a relationship to entity A. A single line defines this bidirectional relationship, but it is useful to define the relationship in both directions. Many methodologies and tools are not strict in this regard, but it helps when reading an ERD to be able to traverse the relationships without having to translate on the fly. In Figure 1, the relationship between POLICY and CLAIM entities can be stated in two ways: (1) A POLICY may have one or more CLAIMs submitted and (2) A CLAIM must be submitted against exactly one POLICY.

The nature of each relationship may be optional or mandatory in one direction or both. In this statement of the bi-directional relationship between the POLICY and CLAIM entities, the term "MUST" indicates that the relationship is mandatory in that direction (a CLAIM cannot exist without an associated POLICY.) A solid line in the ERD indicates that the relationship, in that direction is mandatory.

An optional relationship exists in the other direction — the term "MAY" informs is that a POLICY is allowed to exist without any associated CLAIMs. In the ERD, a broken line indicates that end of the line to indicate that the relationship from that direction is optional.

The degree of each relationship, additionally, may be a one-to-one correspondence between the entities or, more commonly, a one-to-many relationship. Where many occurrences of an entity may exist for a relationship, a "crow's foot" is used to designate the many end of the one-to-many relationship, such as the fact that multiple CLAIMs are allowed for a POLICY. The written description of the relationship uses the "one or more" phrase to designate this.

Where only one occurrence of an entity may participate in the particular relationship, no crow's foot is used and we'll describe the relationship using "one and only one" or "exactly one" terminology.

Many-to-many relationships do occur in the real world, and during this logical design phase, they may appear in our ERD. Figure 1 showed such a relationship between CLAIM and TREATMENT. Many-to-many relationships, however, present difficulties in creating a physical database. Remember that a relational database makes its connection between tables using data values themselves rather than pointers. We will eventually, in physical design, implement a one-to-many relationship by carrying the key value of the "one" side as a data value stored within each row on the "many" side table. Many-to-many relationships can't be stored that way, so data modelers will resolve the many-to-many relationship into two one-to-many relationships and by creating a new connecting entity to help define the relationship. This issue was ignored in the preliminary example, but Figure 2 shows that specific portion of our total ERD with this relationship resolved.

Figure 2
Figure 2: Resolving a many-to-many relationship using a connecting entity.
A note on diagramming conventions: The example ERD uses one convention for showing characteristics of a relationship. Different methodologies (such as the original methodology of Peter Chen) and tools will possibly use a different diagramming technique to represent that a relationship is mandatory or the degree of the relationship. Learn and use the conventions of your chosen tool and don't get hung up on the differences from the example.

Another common construct included within an ERD are "Type of" subentities exemplified by INSURED and POLICY_OWNER that have been shown as subtypes of the PERSON entity. At a later revision of this ERD, some differences in types of PROVIDERs might have been discovered that would lead to a similar subtyping for doctors, hospitals, physical therapists, and pharmacies.

One other ERD diagramming convention shows "one of" choices among multiple relationships. This option is shown by the arc crossing the relationship lines connecting the CLAIM entity to DENIAL and PAYMENT with an arc. According to our simple ERD, either (but not both) of these may exist for any given CLAIM.

The primary advantage of ERD diagramming is that ERDs are easily explained to nontechnical users and sponsors of a proposed system. Most business people can easily validate the important entities, their identifying attributes, and the rules that define the relationships between the entities in their real world. A second advantage is that ERDs are very easily translated from their logical representation to a preliminary physical design of a relational database. It is, after all, a relational technique. The third advantage of ERD modeling is that ERDs can scale to show the total information model of an entire enterprise or a restricted view can present just the subset of entities of interest to a particular system. (Pulling a small part of the entire ERD, as in Figure 2, is an example of this capability.) This allows enterprise data architects a wonderful way to "zoom" in and out to examine global relationships or detailed implications of a proposed change in one particular program.

The disadvantage of ERD modeling is that it is primarily a data-focused methodology and doesn't attempt to model the processing needs of the application. Processing will have to be designed to support each relationship, but the ERD doesn't suggest how that processing will be handled. ERDs, as valuable as they are, require a complementary design and documentation technique to handle processing needs and module decomposition.

Unified Modeling Language

Application designers have been modeling systems for the 50 years that computerized systems have been developed. Flowcharts (some of us actually used the plastic diagram templates to draw them) were an early means of documenting the processing requirements and how data and, more specifically, control passed from module to module. During the 1970s new structured programming methods improved our ability to model the behavior of COBOL programs.

Along with the adoption of object-oriented design and development techniques in the early 1990s, a new set of similar (but distinct) design methodologies arose. OO developers learned and adopted the Booch, Jacobson (OOSE) and/or Rumbaugh (OMT) methodologies. Each of these methodologies had its strengths and its proponents. While they shared many concepts and goals, there were enough differences to make it challenging to pull together a development team that could agree on a common approach to the design of a new system.

The current solution to this dilemma is the Unified Modeling Language (UML), formally adopted as a standard by the Object Management Group in 1997. Today UML is almost universally accepted as the methodology and documentation techniques for nearly all object-oriented application design.

We spent a great portion of this excerpt providing a somewhat detailed introduction to entity-relationship diagrams and normalization as database design practices. We will not attempt, however, to provide as much coverage of the use of UML for several reasons. First, many of the developers reading this to learn about how to utilize Oracle may already have extensive experience with UML. Second, UML is a very comprehensive modeling "language" beyond the scope of this work. The third, and most important, reason arises from the origins and emphasis of UML.

Before any reader interprets this section as a criticism of UML, let us immediately put that concern to rest. When we introduced the ERD concept for logical modeling of relational databases, we immediately admitted that ERD techniques are extremely data-centric — their weakness is that they don't provide any assistance to the documentation of processes. Neither data nor process can be adequately modeled alone. Each complements the other and the two aspects of a system must be consistently reviewed within the context of the other.

UML can do a far better job of balancing the focus of design on both process and data than an ERD. UML is, however, very process-oriented in its orientation. This is not a criticism; in fact, it is what makes UML the perfect complement to data-oriented design techniques. Many UML practitioners focus on use case diagrams to document interactions between users (Actors) and processes (Tasks) and the relationships (Composition, Association, Triggers) between tasks. Secondary emphasis is given to documenting Scenarios, Sequence Diagrams, and Narrative Descriptions. (Note that at this point the UML process begins to identify entities and attributes, the building blocks of the database design techniques discussed previously.) All of these aspects of UML are primarily focused on process, with only a secondary acknowledgement of data.

Working through the UML facilities, we finally get to UML class diagrams. We recognize that they closely mimic the diagramming techniques of ERD modeling. It is, however, tempting for many designers to consider their class and object modeling from a limited perspective of the particular use case on which they are focused. Nothing in UML prevents or restricts data definition from being performed from a more global perspective, and in fact, it has to be. Design of database objects requires all of the rigor and emphasis that is given to any particular user interface problem — in fact, more. The effects of an inappropriate database design will have ramifications throughout the entire system, and correction of any data design deficiencies are likely to force major changes within many program modules as well.

The danger of UML modeling is not a weakness of the UML techniques themselves. It is an unfortunate tendency within development teams to focus their attention on the process design techniques and ignore class diagrams.

What we hope to encourage is an increased emphasis on the part of UML practitioners for the data aspects of the process/data dichotomy. Data used by only a single process or limited sequence of tasks may not have global effects, but the very nature of centralizing data storage within a relational database provides both benefits and potential risks — failure to properly model the shared data objects is one of the greater risks. By gaining additional understanding of the data modeling process (no pun intended), application designers can produce more resilient systems that better satisfy the needs of all their user communities.