By bloid
via ytechie.com
Published: May 15 2008 / 07:47
A discussion came up at work recently about the extent of constraint usage in your databases. There were basically 2 camps:
1. Constrain everything humanly possible. If it’s an integer that wouldn’t normally be negative, add a ">= 0" constraint.
2. Constrain primarily where it’s necessary to maintain referential integrity.
Consider the following diagram. It’s a map of the flow of data from your user, which eventually makes its way into the database.
Comments
Kenneth Downs replied ago:
I not voting because I don't like it enough or dislike it enough to do either. The author states that constraints in the db are the most correct, but in the application are the easiest. A good programmer would ask how to get the correct ones easily. The answer is that the application needs to be able to understand the server error codes and present them to the user. I suspect the author does not understand this because he states the application must "synchronize" its constraints with the server, which is not true. The application should not be bothering with constraints that will be enforced on the server, it only needs to be able to report failures to the user.
sproketboy replied ago:
DBAs tend to favor constraints and usually fall into camp 1. This is usually to protect themselves from themselves. I'm a developer so I fall into camp 3 - which is to say I avoid constraints like the plague.
Why?
Because at the end of the day your database is going to be your bottleneck and you don't ever want to do anything that will negatively impact database performance (which constraints will do). Indexes good – constraints bad.
The most common constraint issue is referential integrity. Example: User looks up an invoice where the customer has been deleted. The invoice looks up the customer, can’t find one and you end up with an NPE. This is easily alleviated by adding a ‘deleted’ bit to you tables. When the user deletes, just set the deleted bit to 1 leaving the row in the db, and hide those rows from future search results and drop down choices.
So you have a nice solution without needing any constraints. The only person now who can mess up your database is your DBA (if you’re unfortunate enough to have one lying around somewhere). In that case just go back to step 1, rinse and repeat.
planetmcd replied ago:
The notion that referential integrity is an annoying byproduct of DBAs or that constraints are universally bad is silly at best. Martin Fowler described 2 types of databases. Application databases (a database that is only connected to 1 application) and integration databases (a database that has more than 1 application connecting to it). They serve different purposes and each has their challenges, use cases, and strengths.
In the case of an integration database, constraints have a place in the database because if you need to apply rules to data, it is the one common point to multiple applications. The DRY principle would dictate you at least put the data checks there to insure against a mistake in the client applications.
There is more leeway vis a vis constraints in the case of application databases, but to say that universally constraints are bad and indexes are good is inaccurate. Too many indexes can add unnecessary overhead in an intensely transactional database where searches are done on PK/FKs. Constraints on the other hand can be a good additional check. Maybe you don't need to check the format of an email in the database, but checking for nulls or uniqeness might be completel appropriate. Suppose, for instance, you require a unique user name. I can't think of a more efficient way to do this than let the database check it with a unique constraint. The alternative is to load every user name and implement a search, which is reinventing what the database already can do (and likely faster).
I'm not saying the database should duplicate every constraint of all the applications that connect to it, but thought should be applied to how you use database technology and understanding its capabilities and limitations as they relate to your applications will get you miles ahead of relying on misguided aphorisms.
sproketboy replied ago:
That's why I said 'application' singular. If you have multiple applications accessing your database (ones that you have no control over) then yes constraints are OK.
I have to disagree on business rule constraints. These are insidious. It might seem like a neato idea at first - let the db do a bunch of basic validations for you.
To that I can quote my database. I quote: "Violation of UNIQUE KEY constraint 'IX_JUNK'. Cannot insert duplicate key in object 'JUNK'."
Unless you want to punish your users with these kinds of messages you'll have to write code to catch the SQLException, examine the exception to see if this is a business rule exception (recoverable) or real exception (state exception, query exception, connection exception etc) and translate the message into something human readable - FOR EVERY CONSTRAINT. Nothing is for free.
You're much better off just writing the durn validation in the application itself where it belongs.
planetmcd replied ago:
Actually, the word application isn't in your comment at all, that's why I cited Fowler. But at least we're on the same page with regards to the type of database we're discussing. I'm not a database zealot, but given your position of only validating input in the application, how would you validate a unique user name string in the application with 10K, 50K, 200K, 1M existing users in the application without relying on the database unique constraint?
sproketboy replied ago:
Sorry for taking so long to respond. Yes you could use a constraint to check uniqueness or you could just have a normal index on the column and just do a select field from table where field = ? which might take a nano-second or so more time but then you wouldn't have to check the SQLException to see if this was really a constraint violation as opposed to a connection failure or other exception. (I'm talking Java here - your mileage may vary with other languages). ;)
My view is that SQLException is generally a Runtime exception - meaning there's nothing the user did to cause it.
planetmcd replied ago:
How weird that we both let it go for 9 days and check the conversation in the same hour.
Your suggested solution requires 2 calls to the database, and depending on the size of your application, and its database activity, an extra query to the database can add load unnecessary (though admittedly this is not common enough to be a concern unless your build a high transaction web app). To attempt to insert a unique username, you must search for the username, and find out if it exists, then you reply to either the database server or the user depending on whether the search was successful. The functionality for search and decision on whether it is unique and conditionally processing the insert is already built into the database. You are rewriting functionality that already exists merely to say I have all processing logic in a specific layer, regardless of how well another layer may do the job. The point is true for things such as zip code formatting or such. Databases aren't generally aware of what a zip code is. But uniqeness is something they are designed to test for and is a concern that crosses most uses for a database. I suppose that you could argue that moving this uniquenss logic away from the application code makes it more confusing to developers reading the code, but I think that this is a read herring. I would also say if the develpoer is having trouble gleaning this and doesn't know the db schema (or know how to get at it), you have bigger problems.
.
How you represent an error to the user is a different question, in either schenario we developers should give them meaningful feedback, this is just a question of how to handle internal processing.
Most database adapters can interpret specific SQL error codes and wrap them in a specific error, to not use them is a waste of effort. You would merely trap the violation of the uniqueness constraint exception. And then when that exception occurs, deal with it with a response to the user.
A SQLException is only a Runtime exception when you didn't expect it. Otherwise its using the DRY principles and not reinventing the wheel. There is no need to do two queries when 1 will do fine.
planetmcd replied ago:
P.S Thanks for the interesting discussion.
Voters For This Link (7)
Voters Against This Link (1)