Microsoft Word is a Legal Document

Feb 25 2005

To say "Access is a relational database" is as incorrect as saying, "Word is a legal document." We need to be more clear than this as data professionals.

Terminology confusion abounds in the software industry and I believe the worst examples are in database management in particular. So I shouldn't have been surprised to read the nonsense I discuss here... but still I was. The following quote opens up edition 6.14 of Woody’s Access Watch, a normally fine e-newsletter focused on building Microsoft Access solutions:

Access is a normalized database (except for purists who don’t count a database as normalized unless it adheres to all 13 of the Codd & Date rules), and that means that you should store linked data in different tables.

If you don’t reel from an affront to your sensibilities from that one sentence, you should. This is no different, to me, than saying:

Microsoft Word is a legal document.

Sounds silly, right?

But did the Access Watch quote sound as odd to you? It should. It reveals many misunderstandings about database technology and practices and is confused and muddled, something we of all people should avoid.

Ignoring the poor grammar, here are the technical reasons this WOW statement should offend you as a technology professional:

 

1. Microsoft Access is not a database.

Read that again. Access is not a database. A database is an organized collection of information accessible for some purpose. Is Access an organized collection of information? No! It is a piece of software. Many people are far too cavalier with the word “database.” Database professionals—those in our society who are responsible for (and presumably trained for) organizing and classifying the world’s information—should be the least so.

Fundamentally, Access is a “database management system” or “DBMS” for short. You could call it “database software” or even, loosely, a “database system.” But it not a database. “Northwind” and “pubs” are databases. IMDB.com is (largely) a database. The card catalog at the library is a database. But Microsoft Access is a DBMS.

Calling Access a database is like calling Microsoft Word a “document.”

 

2. Nothing guarantees that any particular database implemented within Access will be normalized or not normalized . . .

. . . any more than using Word to write documents guarantees they will be legally enforceable documents.

In my experience databases implemented with Access frequently violate all sorts of normalization rules. One recent example I saw had a single column, “[City, State]” (yes, including the comma and space!). This clearly violates first normal form (“1NF”).

And don’t mistake this for an Access limitation or problem. No DBMS forces you to follow good database design practices. I have seen equally good and bad designs in Access, Oracle, and SQL Server. (Okay, the more advanced tools are somewhat more likely to be used by better trained professionals. But the edge is slight.)

To think of Access as a “normalized” database tool or even a “relational” DBMS would be akin to thinking of Word as a tool that automatically produced legally enforceable documents regardless of what you typed.

Being an expert at Access (or any database tool) no more qualifies you to implement correct, high quality database solutions than being a Word expert would qualify you to be a Pulitzer prize-winning author. It helps get the job done faster, but you still have to know how to do the job. (Most hiring managers do not understand this!)

 

3. Normalization does not mean adhering to Codd’s 12 rules.

Note: the guidelines are normally listed as "12" rules and were proposed by Ted Codd (not "and Date"). They are much quoted, including of course in Date's standard database text. See this site for a quick summary. Since there is a "Rule Zero," it would be fine to say the "13 Rules," but most texts do not call them such.

Codd’s 12 rules for relational database systems were geared toward those hoping to implement a relational DBMS and, as such, dictate certain behaviors of those software systems.

Normalization, in stark contrast, applies to the table-level design of one particular database.

See why it’s so important to be clear about “database” v. “DBMS”? I hope so.

(As an aside, being fully “normalized” also does not mean being “relational.” Adhering to 1NF is required. Beyond that they are simply design recommendations. See C.J. Date’s excellent article “What First Normal Form Really Means” for details.)

 

4. Normalization does not per se mean storing data in different tables and then linking back together.

Granted, such a design is usually a consequence of a normalized design. But normalization rules are strictly about removing certain types of redundancy from a certain table. Usually this is accomplished by decomposing the table in question into two or more related tables, for example splitting out a “Children” column into a separate “Child” table containing one row per child. But it’s important to note the difference.

It is especially important to note that just because you have separated data tables and relationships between them does not mean you have a normalized database or a relational database.

Legal jargon is probably necessary to produce a legal document, but just a glut of legalese scattered in my document does not mean it will be legally enforceable.

But possibly the biggest problem with the Access Watch quote is that it comes from a supposed industry “expert.” Unfortunately such ignorance of our field is rampant. You must be careful and pay attention to the fundamentals.

Keep in mind:

If you do not know the history of your own profession you can never hope to be but an amateur.

—anonymous

 

Note: For much deeper and longer treatment of many of the industry's database confusions, see Fabian Pascal's site, www.dbdebunk.com. If you regularly work with databases, you owe it to yourself to be familiar with his and Chris Date’s writings available at this site.

Data Management , ,