One of the most challenging aspects of any BI project lies in the tension between the “behind the scenes” data work, and the “high visibility” BI application. Everyone will see the new marketing and sales dashboard you are tasked to produce, but no one will have a clue about the mammoth data integration effort it took to consolidate opportunities from 12 different systems across the globe to produce a meaningful “company pipeline”. To make matters worse, office productivity software, and popular websites like amazon, google, or financial web sites turned your users to a very sophisticated bunch with extreme expectations. “Why would it take you two months to create this calculation? I can do it in excel in 2 seconds..”, is a typical example that illustrates the lack of understanding and appreciation for the complexity of modern data integration work. Faced with this reality, as you start planning your project, you may like to stack your work in a linear fashion: first, focus on the data exploration and integration work, and then, when you actually have the data, report back on what it can support and produce the visuals, reports, dashboards etc, to illustrate these findings. If this sounds like a good idea, well, it’s because it generally is, but unfortunately, this approach is rarely possible in light of the harsh reality of your business. A normal complexity data integration effort will take 8-12 weeks to complete, between the exploration phase, the modeling phase, the ETL phase, adding robustness and error handling and testing. During that time, you will not be able to show your users any progress, with this approach. All this hard work by your data architects, data modelers, ETL developers is completely obscured from your business users. It’s way too technical and requires a very high level of specialization to appreciate and your users, for the most part, will not be able to follow the details. Especially in these days of Agile methodologies and iterative approaches, it’s very hard to explain three months of work and budgets burnt, with nothing to show. So, what can you do? Well, you will have to work in parallel, on both tracks.
Your reporting application and data work will have to progress in parallel, rather than serially. But wait a minute, you will ask, how can I design reports and dashboard without the data to put into them? The key to doing that is your reporting data model. As long as you can agree at a relatively early phase of the project on your application requirement, functionally, navigationally, overall data scope, drillability and graphical representation, you will be able to produce a data model that will be able to support those requirements in principal. Once you have done that, you can use assumptions and deductions to fill this data model with “dummy” or “test” data, loaded from spreadsheets, manually entered, or inserted from well understood sources. This will allow your visual track to start making progress without needing to wait for all the data work to be completed. Sure, there will inevitably be some rework on both fronts due to misunderstandings, changes in requirements as the project progresses and some wrong assumptions. But, for the most part, if your model is solid and well thought out, you will be looking at MINIMAL changes or rework and MAXIMAL utilization of time. Now, instead of producing the report in 6 months (3 to build the data, 3 to build the reporting application), you can deliver your reporting application in about half the time.