About Data Warehouse

BTH Industrial Training Program

A data warehouse is not a new concept, and the term itself makes the idea easy to grasp. In simple language, a warehouse is a place where something is stored. A few examples should make this clearer.

E-commerce websites have grown popular because products are reasonably priced and delivered to your doorstep. A supplier anticipates demand for a product and stocks it in the warehouse. As soon as a customer places an order, the goods are dispatched immediately.

Now remove the warehouse from this scenario. What do you see? Nothing is in stock. Customers still place orders on the e-commerce website, but since nothing is stored in a warehouse, the e-commerce team has to go to the suppliers and ask for each product. Can you imagine how tedious this could get? It puts a tremendous strain on the manufacturers, and customers waste time waiting on this elaborate process and see a long delay in their orders.

A data warehouse works in a similar way: it stores data procured from the transaction systems. Two concepts matter here: OLTP (online transaction processing) and OLAP (online analytical processing). The former records transactions, while the latter analyses the data, and this is where the data warehouse is used. Every transaction made through an ATM, for example, is recorded in an OLTP system, and so are many other activities.
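To make the contrast concrete, here is a minimal sketch that uses Python's built-in sqlite3 module. The table name, column names, and account numbers are invented for illustration and do not describe any real banking system. The first function does OLTP-style work (one small write per transaction), while the second does OLAP-style work (reading many rows and summarising them).

```python
import sqlite3

# In-memory database standing in for the transaction system (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE atm_transactions (
        id      INTEGER PRIMARY KEY,
        account TEXT NOT NULL,
        kind    TEXT NOT NULL,   -- 'withdrawal' or 'deposit'
        amount  REAL NOT NULL
    )
    """
)


def record_transaction(conn, account, kind, amount):
    """OLTP-style work: one small, fast write per ATM transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO atm_transactions (account, kind, amount) VALUES (?, ?, ?)",
            (account, kind, amount),
        )


def withdrawals_per_account(conn):
    """OLAP-style work: scan many rows and aggregate them for a report."""
    rows = conn.execute(
        """
        SELECT account, SUM(amount)
        FROM atm_transactions
        WHERE kind = 'withdrawal'
        GROUP BY account
        ORDER BY account
        """
    ).fetchall()
    return dict(rows)


# Hypothetical sample data.
record_transaction(conn, "A-100", "withdrawal", 200.0)
record_transaction(conn, "A-100", "withdrawal", 50.0)
record_transaction(conn, "B-200", "deposit", 500.0)
record_transaction(conn, "B-200", "withdrawal", 80.0)

report = withdrawals_per_account(conn)
print(report)
```

In this toy version both kinds of work share one database. In practice the aggregate query would usually run against a separate copy of the data, so that reporting does not compete with the recording of transactions.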

Now if you have to run queries against these systems, you will have to join multiple sources, each with its own format. Numerous customers use the ATM on a given day, and each transaction hits the OLTP system. Adding analytical queries on top would increase a load that is already heavy. Hence the OLTP system is not used for analytical querying; it is used only for recording transactions.

In how many ways could you reserve a railway ticket? You could book it on your mobile, at the station, or through one of many agents. A booking made through the website is recorded differently from one made on your mobile. These multiple disparate sources make querying difficult, and this is the hindrance to analytical processing that the example illustrates.
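The problem of disparate sources can be sketched in a few lines of Python. The two record layouts below are hypothetical: they only assume that the website and the mobile app name their fields differently, format dates differently, and store the fare in different units. Each source gets its own small converter into one common shape, after which a single query works across both.

```python
from datetime import date, datetime

# Hypothetical raw records: each channel uses its own field names and formats.
website_booking = {
    "pnr": "W123",
    "passenger_name": "Asha Rao",
    "travel_date": "2024-03-15",   # ISO date string
    "fare": "450.00",              # fare as text, in rupees
}
mobile_booking = {
    "ticketId": "M456",
    "name": "Ravi Kumar",
    "date": "15/03/2024",          # day/month/year
    "fareInPaise": 45000,          # fare as an integer, in paise
}


def from_website(rec):
    """Convert a website record into the common booking shape."""
    return {
        "ticket_id": rec["pnr"],
        "passenger": rec["passenger_name"],
        "travel_date": date.fromisoformat(rec["travel_date"]),
        "fare": float(rec["fare"]),
        "channel": "website",
    }


def from_mobile(rec):
    """Convert a mobile-app record into the common booking shape."""
    return {
        "ticket_id": rec["ticketId"],
        "passenger": rec["name"],
        "travel_date": datetime.strptime(rec["date"], "%d/%m/%Y").date(),
        "fare": rec["fareInPaise"] / 100,
        "channel": "mobile",
    }


bookings = [from_website(website_booking), from_mobile(mobile_booking)]

# Once the records share one shape, a question spanning both sources is easy.
total_fare = sum(b["fare"] for b in bookings)
print(total_fare)
```

Without the conversion step, every report would have to know about every source's quirks, which is the difficulty the railway example points at.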

Given this, the end user still wants data for reporting purposes. So you create an alternate system, the OLAP system. The diagram shows multiple sources feeding into the data warehouse. On the other side, users access the warehouse to retrieve data and generate reports.
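The flow just described, with several sources feeding one warehouse and users reading reports from it, can be sketched as follows. This is a toy illustration under stated assumptions: the channel names, ticket IDs, and fares are invented, the extracts are assumed to be cleaned already, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical, already-cleaned extracts from three booking channels.
sources = {
    "website": [("W123", 450.0), ("W124", 300.0)],
    "mobile": [("M456", 450.0)],
    "agent": [("A789", 520.0)],
}


def load_warehouse(sources):
    """Copy the rows from every source into one shared warehouse table."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE bookings (ticket_id TEXT PRIMARY KEY, channel TEXT, fare REAL)"
    )
    with warehouse:  # commit all inserts together
        for channel, rows in sources.items():
            warehouse.executemany(
                "INSERT INTO bookings (ticket_id, channel, fare) VALUES (?, ?, ?)",
                [(ticket_id, channel, fare) for ticket_id, fare in rows],
            )
    return warehouse


def revenue_by_channel(warehouse):
    """Report query: number of bookings and total fare per channel."""
    return warehouse.execute(
        """
        SELECT channel, COUNT(*), SUM(fare)
        FROM bookings
        GROUP BY channel
        ORDER BY channel
        """
    ).fetchall()


warehouse = load_warehouse(sources)
report = revenue_by_channel(warehouse)
for channel, count, total in report:
    print(channel, count, total)
```

The report user only ever touches the warehouse table, so the source systems stay free to do what they are meant for: recording bookings.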