Big Data, BI, Data Analytics 1 — The Early Years
How it all started — for me, at least
Over a few articles, I will try to cover the emergence of data analytics through some deep dives into real life. It is easy to get carried away by the marketing and hype around some of the current products in the market, and some of the vendors are good at making it look as if they had invented the wheel and nothing could work without their product.
But most of what happened until now was actually quite mundane and was developed step by step. We used the tools at hand, and when something new was invented, it took a little while before the world managed to do something good with it.
The beginning
It was towards the end of the 1990s, and I had just started a new job. Suddenly I was the Microsoft software development expert, after having worked for a couple of years with Visual Basic, Access and SQL Server, a prototype Exchange Server, and some other tools - including the by then somewhat dated Microsoft C/C++ package.
The new job was precisely to help a small consulting/software development company get started with Microsoft-based development, which seemed to be in demand in the market. The company had previously worked with Informix, an early relational database system that had been fairly popular but was now in decline, and it had developed solutions using the proprietary 4GL tool from Informix.
An interesting time, back then! Changes, an open attitude towards new products and new ways of making software, and a world full of opportunities for anyone with some skills in one direction or another.
In my previous job, I had done some interesting things with Crystal Reports, which we used as a component in the Visual Basic applications. And, can you imagine, just two weeks into my new job, my boss came into the room where I was placed (actually, the corridor, as all the rooms were full) and asked if any of us knew anything about Crystal Reports. I was the only person who answered, which was a bit odd, as it turned out that my colleagues had written a report for a client recommending precisely Crystal Reports as their reporting tool, to be installed on all computers and used by every white-collar employee.
Well, they had recommended it, but none of them knew the product, so when the client asked if we could send a consultant to help them get started with it, I was the one who went.
The task was initially simple - just make a handful of reports, document them, and give some quick instructions to key employees on how they could do something similar themselves. The reports were to be used for a number of practical purposes, but basically to show what happened in the company - how much money was earned, how busy the employees were, and so on.
I think that KPIs were mentioned, but there was no particular goal of measuring KPIs. The work was much more directed towards the practical purpose of making sure that the new production environment could be adjusted to perform well.
Developing the solution
As most IT people know, there is no such thing as simple - and no such thing as the client doing anything themselves after a quick round of instructions. Many tasks are initially described like that, but it never happens.
So, what actually happened was that I made a couple of reports with Crystal Reports and then had to conclude that some of the other requests would require creating stored procedures or temporary tables in the database server. This was a no-go due to warranty terms in the contract with the system vendor, so I had to be creative.
A request to buy a separate reporting server, with whatever database system they preferred but possibly Microsoft SQL Server, was rejected. They had a budget for a consultant, but none for software or servers. All computers were, however, equipped with the full Microsoft Office package, including Access, so that became the solution. The remaining reports were easily made in Access using a hierarchy of queries plus some local tables for storing temporary results. When I showed the setup to the client, we ended up deciding to switch completely to Access, abandoning Crystal Reports, and everything was converted.
That was the end of the assignment. Everything worked, and three days had passed. As I walked out of the door, the client remarked that they would get back to me soon.
Expanding the solution
It took two weeks, then they called, and I went back to them - now to see a long list of requested reports, which just continued to grow over the several years of cooperation that followed.
The client had a transactional system consisting of two separate systems that had been somewhat merged by the vendor. The merging had not been done very well: data from the two ends of the system were kept in separate databases, not really integrated at all, and actually not compatible. Everything was in French, which nobody there understood, and several data fields were coded in one way or another, as bitmaps or encrypted, making it impossible to use SQL directly for joining data. Hence the need for something more capable than Crystal Reports.
With the ever-growing reporting demands came a more and more sophisticated setup of queries and temporary tables, all done in Access, as this was the only tool I was allowed to work with. To make all the reports available when needed, I created a scheduler and started preparing some data along the way, so that it could be fetched quickly when the scheduled reports ran. Even with all such optimizations, the system became more and more of a burden for the old desktop PC that had been allocated to run it.
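The idea of preparing data ahead of the scheduled reports can be sketched roughly as follows. This is not the original Access setup: it is a minimal illustration in Python with an in-memory SQLite database standing in for Access, and the table names `transactions` and `daily_summary`, the column names, and the sample rows are all hypothetical.

```python
import sqlite3


def refresh_daily_summary(conn: sqlite3.Connection) -> None:
    """Rebuild a small pre-aggregated table from the (hypothetical) source table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_summary ("
        "day TEXT PRIMARY KEY, total REAL NOT NULL)"
    )
    # Replace existing rows so the refresh can be run repeatedly by a scheduler.
    conn.execute(
        "INSERT OR REPLACE INTO daily_summary (day, total) "
        "SELECT day, SUM(amount) FROM transactions GROUP BY day"
    )
    conn.commit()


def report_totals(conn: sqlite3.Connection) -> list:
    """A report reads only the prepared table, not the transactional data."""
    return conn.execute(
        "SELECT day, total FROM daily_summary ORDER BY day"
    ).fetchall()


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("1999-03-01", 100.0), ("1999-03-01", 50.0), ("1999-03-02", 75.0)],
)

refresh_daily_summary(conn)
totals = report_totals(conn)
```

The point of the split is that the expensive aggregation runs once per refresh, while each report only reads a small prepared table.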
Each day produced many reports, some of them quite advanced, combining historical data with real-time data. Many data extractions were needed just to calculate temporary keys that could be used to combine the data from the two systems, and we began to see the performance of the transactional system being affected by the reporting system.
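To give an impression of what calculating a temporary key to combine two incompatible systems can look like, here is a small sketch in Python. The actual field encodings are not described in this article, so everything here is assumed for illustration: one system is imagined to store a reference packed into an integer (branch number in the upper bits, sequence number in the lower 16 bits), the other as text such as "3-41". The function and field names are hypothetical.

```python
from collections import defaultdict


def decode_packed_ref(packed: int) -> tuple[int, int]:
    """Split a hypothetical bit-packed reference into (branch, sequence)."""
    branch = (packed >> 16) & 0xFF   # assumed: 8 bits of branch number
    sequence = packed & 0xFFFF       # assumed: 16 bits of sequence number
    return branch, sequence


def parse_text_ref(text: str) -> tuple[int, int]:
    """Parse a hypothetical 'branch-sequence' text reference."""
    branch, sequence = text.split("-")
    return int(branch), int(sequence)


def join_on_derived_key(rows_a: list, rows_b: list) -> list:
    """Join rows from both systems on a key computed in application code."""
    index = defaultdict(list)
    for row in rows_b:
        index[parse_text_ref(row["ref"])].append(row)

    joined = []
    for row in rows_a:
        key = decode_packed_ref(row["packed_ref"])
        for match in index.get(key, []):
            joined.append(
                {"key": key, "amount": row["amount"], "hours": match["hours"]}
            )
    return joined


# Hypothetical sample rows, for illustration only.
system_a = [
    {"packed_ref": (3 << 16) | 41, "amount": 1200.0},
    {"packed_ref": (7 << 16) | 5, "amount": 300.0},
]
system_b = [
    {"ref": "3-41", "hours": 8.5},
    {"ref": "9-1", "hours": 2.0},
]

result = join_on_derived_key(system_a, system_b)
```

Because the key has to be decoded on both sides before anything can be matched, a plain SQL join on the raw columns is not possible, and every report pays for the extra extraction work.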
Finally, something that looked like a data warehouse
I don’t remember how long we had been developing more and more complex reporting concepts in Access, but it was maybe two years or so before we were finally allowed to set up a dedicated reporting database server! Hurray!
From there, we developed a dedicated structure with ongoing updates in an Oracle database to support the actual report-making, which still took place in Access.
Along the way in this process, the world started talking about “big data”, although Hadoop itself only arrived some years later, in 2006. Towards the end of this period, in the early 2000s, Business Intelligence (BI) emerged as a concept that some pioneering companies began implementing - to be used with special analytical database setups (OLAP instead of OLTP).
Fun fact: “data warehouse” had more or less just been invented as a concept, but the term was solely used for the OLAP type of database, so we never spoke about my data warehouse as such. However, the function was effectively the same, only the mechanisms to get to the reports were different.
The use of real-time data in reports would later be introduced to the market by some of the big software vendors as Business Activity Monitoring (BAM), but it didn’t exist at the time I was doing it. Moreover, the combination with historical data, for complete reports covering any desired period, would not appear in any of the marketed BI/data warehouse systems for many years to come.
Nobody ever mentioned “data analytics” or “data science” in these contexts, but that was soon to change.