I just got a chuckle while reading an article by Michael Schiff in which he lists his 5 Data Axioms. Mr. Schiff has a few decades of data management experience and has developed these "data axioms" from it. I chuckled because I can relate to what he is saying, and the discussion brought to mind experiences of my own.
Axiom #1: If the user can't access the data, the data doesn't exist.
I can't begin to tell you how many times I have been told by end users, and even business developers, "that data is not in the database." In my experience, UIs are a funny and finicky technology. Many times it has been necessary to create data manually through SQL inserts to the database in order to meet certain requirements. However, when the business user goes looking for that data, for some reason it doesn't show up. Hence, "it doesn't exist." After I show them with query extracts that the data does exist, it usually comes out that they were either looking at the wrong screen or the data they gave me to load was incorrect.
Axiom #2: If the user knows the data exists but can't obtain a computer-readable copy, the user will reenter the data and ignore the metadata.
This one really irks me sometimes. If I'm not available when the user discovers "that data is not in the database" (see #1 above) and they are trying to complete a testing cycle, sometimes they will re-enter the data through the UI. This causes a number of issues that I usually have to resolve by backing out the data that I loaded, because they've already used their new data in other processes. Oh, and my favorite is when the UI tells them the entity already exists, so they change the name by appending something like "-real" to the end of the entity name. At least that makes it easier to delete my loaded data.
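For what it's worth, spotting those "-real" twins does not have to be a manual hunt. Here is a minimal sketch, assuming a hypothetical `entity` table with a `name` column (your schema will differ) and using an in-memory SQLite database purely for illustration. It pairs each suffixed name with the original row it shadows.

```python
import sqlite3


def find_suffixed_duplicates(conn, suffix="-real"):
    """Return (original_name, suffixed_name) pairs where both rows exist.

    The table name `entity` and column `name` are assumptions for this sketch.
    """
    return conn.execute(
        """
        SELECT base.name, dup.name
        FROM entity AS dup
        JOIN entity AS base ON dup.name = base.name || ?
        ORDER BY base.name
        """,
        (suffix,),
    ).fetchall()


# Illustrative data only: one loaded entity, its re-entered twin, and an unrelated row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO entity (name) VALUES (?)",
    [("Acme",), ("Acme-real",), ("Globex",)],
)

print(find_suffixed_duplicates(conn))
```

The self-join only reports a suffixed name when the unsuffixed original is also present, so an entity that was legitimately named with "-real" and has no twin is left alone.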
Axiom #3: Data doesn't flow; it reproduces itself spontaneously. Mutations occur with each new generation.
I have a different understanding of this axiom than Mr. Schiff does, or at least of the phrase "it reproduces itself spontaneously". There have been many instances when I've found "-real" appended to data entity names and been told by the user, "you must have loaded that, because I didn't put it there." Only after I show them the source data they provided to me, and show them that none of it has "-real" appended, do they finally recall that they entered the data themselves, because "it wasn't in there".
Axiom #4: For every two systems, there is another system that reconciles data differences between the two.
My experience with this one comes from clients who bring us in to install a new software package, yet have no plans to discontinue the old one. A famous quote is, "We know what the old system does, so we need to keep using it." This usually occurs when the project has been IT-sponsored and hasn't received business buy-in.
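When both systems stay alive, that third "reconciling" system often starts life as a small comparison script. The following is a rough sketch of the idea, not anything from Mr. Schiff's article: it assumes each system can export its records as a dictionary keyed by a shared identifier, and the function and key names are made up for illustration.

```python
def reconcile(old_system, new_system):
    """Compare two {record_id: value} exports and report the differences."""
    old_keys, new_keys = set(old_system), set(new_system)
    return {
        # Records the new system never received.
        "only_in_old": sorted(old_keys - new_keys),
        # Records created in the new system with no counterpart in the old one.
        "only_in_new": sorted(new_keys - old_keys),
        # Records present in both whose values have drifted apart.
        "mismatched": sorted(
            key for key in old_keys & new_keys if old_system[key] != new_system[key]
        ),
    }


# Hypothetical exports from the old and new packages.
old_export = {"C001": "Acme", "C002": "Globex", "C003": "Initech"}
new_export = {"C001": "Acme", "C002": "Globex Corp", "C004": "Umbrella"}

print(reconcile(old_export, new_export))
```

Even a report this simple makes the cost of running two systems visible, which can help when making the case for retiring one of them.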
Axiom #5: Data in a warehouse is like the clothes in a closet; even if we haven't accessed some data in two years, we still tend not to throw it away.
I have seen this problem more and more now that storage is so cheap. I know there are regulations in different industries that mandate keeping certain historical records. However, when an application creates "tracing" tables that hold details of every transaction process, even when the process has been backed out, the database tends to grow very quickly. This can lead to growing pains that DBAs don't like to hear about. It is important to create data archiving and retirement or deletion plans as part of any software implementation project.
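An archiving plan can begin as something quite modest. As a sketch only, assume a hypothetical `trace` table with an ISO-formatted `created_on` date and a matching `trace_archive` table (both names are invented here, and a real retention period has to come from your own regulatory requirements). Old rows are copied to the archive and removed from the live table in a single transaction.

```python
import sqlite3


def archive_old_traces(conn, cutoff_date):
    """Move trace rows older than cutoff_date (ISO 'YYYY-MM-DD') to the archive.

    Returns the number of rows removed from the live table.
    """
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            """
            INSERT INTO trace_archive (id, created_on, detail)
            SELECT id, created_on, detail FROM trace WHERE created_on < ?
            """,
            (cutoff_date,),
        )
        cursor = conn.execute("DELETE FROM trace WHERE created_on < ?", (cutoff_date,))
    return cursor.rowcount


# Illustrative setup: two old trace rows and one recent one.
conn = sqlite3.connect(":memory:")
for table in ("trace", "trace_archive"):
    conn.execute(
        f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, created_on TEXT, detail TEXT)"
    )
conn.executemany(
    "INSERT INTO trace (created_on, detail) VALUES (?, ?)",
    [
        ("2021-03-01", "posted"),
        ("2021-03-02", "backed out"),
        ("2024-06-15", "posted"),
    ],
)

moved = archive_old_traces(conn, "2023-01-01")
print(moved)
```

Because ISO dates sort the same way as plain text, the string comparison in the `WHERE` clause is enough here. A date stored in another format would need converting first.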
I hope you have enjoyed my little trip down memory lane. Thank you, Mr. Schiff, for stirring up these memories. If nothing else, the phrase "you'll look back on this in the future and laugh" is so true in the software implementation world we live in. If you have memories like these, please leave a comment and share your experiences.