When something breaks in the production environment, it is critical to find a solution quickly. When you receive that 2 a.m. phone call saying a job has failed, it's not always easy to think clearly enough to determine the cause of the failure.
I have written previously about the fact that I am a support analyst for a large data warehouse. This means that if something goes wrong with the data warehouse, my group is responsible for fixing it. Our database houses all the data for a very large brokerage and financial services firm. There are thousands of inbound and outbound connections to this data. If something goes wrong, it needs to be fixed, usually immediately.
In my prior article I discussed how stressful it is to be a production support engineer. In this article I will discuss some ways to keep these special, high-quality people on your team and productive.
If you don't know already, I am a Production Support Engineer for a data warehouse that houses around 15 terabytes of data. A lot of my reflections recently have centered on things I've noticed in my job. I've also been speaking with guys who have been doing production support for anywhere from just over a year to several years. One thing they all bring up, and it would be wise for CIOs to understand, is...
A while back I discussed the definition of Data Quality. This time I want to give some examples of issues that are raised as data quality problems but are not really data quality issues.
Many companies have grown large production support groups to handle any sort of production issue. These issues can range from program failures during batch or real-time processing to data quality cleanup issues. Unfortunately, many production support groups are terribly understaffed and overworked. Maybe the old-school production support model is the direction to go.