Data Quality Problems are Corporate IT’s Dirty Little Secret

From Innovations, a website published by Ziff-Davis Enterprise from mid-2006 to mid-2009. Reprinted by permission.

In the early days of home broadband, I was a customer of a very large cable company, whose name I'm sure you know. When making one of my frequent calls to the technical support organization, I was typically required to give all my account information to a representative, who then transferred me to a support tech, who asked me to repeat all the account information I had just given the first person. If my problem was escalated, I got transferred to a third rep who would ask me for, you guessed it, my account information.

This company’s reputation for lousy customer service was so legendary that if you type its name and “customer service” into a search engine today, you get mostly hate sites. One of its biggest problems was poor data integration. For example, its sales database was so fragmented that I routinely received offers to sign up for the service, even though I was already a customer. I’m a customer no longer.

Does this sound familiar? Most of us have had frustrating experiences of this kind. Thanks to acquisitions, internal ownership squabbles and poor project management, most large companies have accumulated islands of disintegrated, loosely synchronized customer data. PricewaterhouseCoopers' 2001 Global Data Management survey found that 75 percent of large companies had significant problems as a result of defective data. It's unlikely the situation has improved much since then. Data quality is corporate America's dirty little secret.

The Path to Dis-Integration
There are several reasons for this, according to Paul Barth, an MIT computer science Ph.D. who’s also co-founder of NewVantage Partners. One is acquisitions. As industries have consolidated, the survivors have accumulated scores of dissimilar record-keeping systems. Even basic definitions don’t harmonize. Barth tells of one financial services company that had six different definitions for the term “active customer.”
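To make the "active customer" problem concrete, here's a minimal hypothetical sketch in Python. The two departments, their rules, and the field names (last_purchase, account_status) are all invented for illustration; they are not Barth's actual examples.

    from datetime import date, timedelta

    def is_active_marketing(customer):
        # Marketing's rule: bought something in the last 12 months.
        return customer["last_purchase"] >= date.today() - timedelta(days=365)

    def is_active_billing(customer):
        # Billing's rule: the account is open, purchases or not.
        return customer["account_status"] == "open"

    customer = {"last_purchase": date(2007, 1, 15), "account_status": "open"}
    # The same customer is "active" to one department and not the other.
    print(is_active_marketing(customer), is_active_billing(customer))

Multiply two definitions by six, spread them across as many record-keeping systems, and any report that counts "active customers" becomes an argument rather than an answer.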

A second problem is internal project management. The decentralization trend is pushing responsibility deeper into organizations. There are many benefits to this, but data quality isn’t one of them. When departmental managers launch new projects, they often don’t have the time or patience to wait for permission to access production records. Instead, they create copies of that data to populate their applications. These then take on a life of their own. Synchronization is an afterthought, and the occasional extract, transform and load procedure doesn’t begin to repair the inconsistencies that develop over time.
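Here's a hypothetical sketch of how those departmental copies drift. The records and field names are invented; the point is that once a one-time extract is taken, both copies get edited independently, and a later bulk reload can't tell which edits to keep.

    production = {101: {"name": "J. Smith", "phone": "555-0100"}}

    # A department copies production data to seed its own application.
    dept_copy = {k: dict(v) for k, v in production.items()}

    # Both copies then evolve independently.
    production[101]["phone"] = "555-0199"   # customer updates phone centrally
    dept_copy[101]["name"] = "Jane Smith"   # department cleans up the name

    # A naive "synchronization" just overwrites one side; an edit is
    # lost either way. Real reconciliation needs per-field change history.
    dept_copy.update({k: dict(v) for k, v in production.items()})
    print(dept_copy[101])  # the department's name fix is silently gone

The occasional overwrite-style reload is exactly the "occasional extract, transform and load procedure" that fails to repair these inconsistencies.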

Data entry is another problem. With more customers entering their own data in online forms and fewer validity checks being performed in the background, the potential for error has grown. Data validation is a tedious task to begin with, and really effective quality checks tend to produce a lot of frustrating error messages. E-commerce site owners sometimes decide it's easier just to allow telephone numbers to be entered in the ZIP code field, for example, as long as it moves the customer through the transaction.
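The trade-off is easy to see in a minimal sketch. Assuming U.S.-style ZIP codes, a strict check rejects the phone-number-in-the-ZIP-field case, at the cost of showing the customer one more error message:

    import re

    ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")  # 12345 or 12345-6789

    def validate_zip(value: str) -> bool:
        return bool(ZIP_RE.match(value.strip()))

    print(validate_zip("02134"))         # True
    print(validate_zip("617-555-0123"))  # False: a phone number, not a ZIP

Every site that skips a check like this to smooth the transaction is trading a little checkout friction now for dirty records later.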

Finally, data ownership is a tricky internal issue. Many business owners would rather focus on great features than clean data. If no one has responsibility for data quality, the task becomes an orphan. The more bad data is captured, the harder it is to synchronize with the good stuff. The problem compounds, and no one wants the responsibility of cleaning up the mess.

Barth has a prescription for addressing these problems. It isn’t fast or simple, but it also isn’t as difficult as you might think. Next week we’ll look at an example of the benefits of good data quality and offer some of his advice for getting your data quality act together.
