Many, many years ago when I first got bit by the computer bug, I remember something I learned in my first computer programming class. We learned that computers - the Apple IIe at the time - are nothing more than complex calculating machines that translate series of 1's and 0's into instructions for the processor to execute.
My programming teacher taught us about data quality at that time. He taught us that a computer is not intelligent, that it can only act on the instructions that you give it. He taught us GIGO: Garbage In, Garbage Out. (Thank you, Mr. Hills, wherever you are today.)
It is amazing that one of the first things I learned about in my Information Technology career was Data Quality. Of course it wasn't called that at the time, but we were taught that if you have poor quality data, you will have poor results.
It was a lesson that I had to learn by experience. One of our coolest projects was to create a little game in BASIC called The Farmer and the Pig. The pig, represented by the letter "P", lived inside a pen outlined with the letter "T", and the farmer was represented by the letter "F". The object of the game was simple: using the arrow keys, you had to move the farmer one space at a time to catch the pig. The pig moved randomly around the pen, but could not move outside the pen boundaries. The pig was caught when the F moved to the space occupied by the P. When the pig was caught, the game ended with a Congratulations message. My first turn-based combat game, if you will.
Well, I finished my coding, looked it over, and everything seemed correct. So it was time for my first test. I started moving my farmer, the pig ran around, and everything looked great, until I moved my farmer on top of the pig. The game should have ended, but that pig moved away! I inspected my code and found that I had miscoded the equation that determined whether the F was moving to where the P was located. That was my first experience with bad data.
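The original BASIC listing isn't reproduced here, so what follows is only a rough Python sketch of how a capture check like that might be written. The pen size, the function names `move_pig` and `take_turn`, and the coordinate layout are all assumptions made for illustration. The point is that the farmer's new position is compared with the pig's position on both coordinates, and that the comparison happens before the pig gets its turn to move.

```python
import random

# Assumed pen size; the dimensions of the original game are not given.
PEN_WIDTH, PEN_HEIGHT = 8, 5


def clamp(value, low, high):
    """Keep a coordinate inside the pen boundaries."""
    return min(max(value, low), high)


def move_pig(pig, rng=random):
    """Move the pig one random step, never outside the pen."""
    dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
    return (clamp(pig[0] + dx, 0, PEN_WIDTH - 1),
            clamp(pig[1] + dy, 0, PEN_HEIGHT - 1))


def take_turn(farmer, pig, step, rng=random):
    """Apply one farmer move and return (farmer, pig, caught)."""
    farmer = (clamp(farmer[0] + step[0], 0, PEN_WIDTH - 1),
              clamp(farmer[1] + step[1], 0, PEN_HEIGHT - 1))
    # Compare BOTH coordinates, and do it before the pig moves.
    if farmer == pig:
        return farmer, pig, True
    return farmer, move_pig(pig, rng), False
```

Calling `take_turn((2, 2), (3, 2), (1, 0))` steps the farmer right onto the pig's square, so the turn ends with `caught` set to `True` and the pig stays put. Checking only one coordinate, or letting the pig move first, are two plausible ways a check like this could go wrong and let the pig wander off.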
These days, businesses are generating ever-growing volumes of data, and that data is being used to make corporate decisions. More and more corporations are realizing that they have bad data in their systems and that they need to do something about it. The cost of poor data quality is coming to the forefront of corporate discussions.
Informatica is one of the major software players in the Data Integration and Data Quality space. They have realized that Data Quality needs to be an integral part of Data Integration and are making it easier for companies to clean up their data.
Today I found a web site sponsored by Informatica called "Do You Trust Your Data?" On that site, Informatica has compiled a collection of data quality horror stories. Some will make you laugh, some will make you cringe, and some will make you cry. If you want a good laugh, check out the video section as well.
The site, in a tongue-in-cheek manner, points out something that more and more corporations are realizing: the quality of your data not only affects your processes and decisions, it also affects your customers. It's Garbage In, Garbage Out gone wild!
If you have a great Data Quality story, you can submit it to the Informatica site and add to the hilarity. Or leave a comment below and share with us as well.