Tuesday, 6 January 2015

Data Warehouse Models

From the perspective of data warehouse architecture, we have the following data warehouse models:
1. Virtual Warehouse
2. Data Mart
3. Enterprise Warehouse

VIRTUAL WAREHOUSE
  • A set of views over operational databases is known as a virtual warehouse. A virtual warehouse is easy to build.
  • Building a virtual warehouse requires excess capacity on the operational database servers, because every query against the views runs on those servers.
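
As a rough illustration of the idea, a virtual warehouse can be sketched as a view defined directly over an operational table. The example below uses Python's built-in sqlite3 module; the orders table, its columns and the sales_by_customer view are hypothetical names chosen for this sketch only.

```python
import sqlite3

# Hypothetical operational database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT,
        item     TEXT,
        amount   REAL
    );
    INSERT INTO orders (customer, item, amount) VALUES
        ('alice', 'pen',   2.50),
        ('bob',   'book', 12.00),
        ('alice', 'book', 12.00);

    -- The "virtual warehouse": a view over the operational table.
    -- No data is copied; the summary is recomputed on every query.
    CREATE VIEW sales_by_customer AS
        SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
""")


def sales_summary(connection):
    """Return (customer, n_orders, total) rows from the view, ordered by customer."""
    return connection.execute(
        "SELECT customer, n_orders, total FROM sales_by_customer ORDER BY customer"
    ).fetchall()


summary = sales_summary(conn)
print(summary)  # [('alice', 2, 14.5), ('bob', 1, 12.0)]
```

Because the view is evaluated against the live orders table each time it is queried, the analytical workload lands on the operational server, which is why the excess capacity mentioned above is needed.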

DATA MART
  • A data mart contains a subset of the organisation-wide data.
  • This subset of data is valuable to a specific group within the organisation.
Note: In other words, a data mart contains only the data that is specific to a particular group. For example, the marketing data mart may contain only data related to items, customers and sales. Data marts are confined to subjects.
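
One way to picture this is to copy only the subjects a group cares about out of a wider warehouse table. The following is a minimal sketch using sqlite3; the warehouse_facts table, the marketing_mart name and the subject list are assumptions made for the example, not part of any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical organisation-wide warehouse table covering several subjects.
    CREATE TABLE warehouse_facts (
        subject TEXT,   -- e.g. 'sales', 'payroll', 'inventory'
        entity  TEXT,
        value   REAL
    );
    INSERT INTO warehouse_facts VALUES
        ('sales',     'item:book',       120.0),
        ('sales',     'customer:alice',   14.5),
        ('payroll',   'employee:carol', 3000.0),
        ('inventory', 'item:book',        40.0);
""")

# Assumed list of subjects that matter to the marketing group.
MARKETING_SUBJECTS = ("sales",)


def build_mart(connection, mart_name, subjects):
    """Create a mart table holding only rows for the given subjects.

    mart_name is interpolated into the SQL text, so it must be a trusted
    identifier; the subject values themselves are passed as bound parameters.
    """
    connection.execute(
        f"CREATE TABLE {mart_name} (subject TEXT, entity TEXT, value REAL)"
    )
    placeholders = ", ".join("?" for _ in subjects)
    connection.execute(
        f"INSERT INTO {mart_name} "
        f"SELECT subject, entity, value FROM warehouse_facts "
        f"WHERE subject IN ({placeholders})",
        tuple(subjects),
    )
    return connection.execute(
        f"SELECT entity, value FROM {mart_name} ORDER BY entity"
    ).fetchall()


marketing_rows = build_mart(conn, "marketing_mart", MARKETING_SUBJECTS)
print(marketing_rows)  # [('customer:alice', 14.5), ('item:book', 120.0)]
```

The payroll and inventory rows stay in the wider table, so the mart remains small and confined to its subject.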


Points to remember about data marts
  • Windows-based or Unix/Linux-based servers are used to implement data marts.
  • They are implemented on low-cost servers.
  • The implementation cycle of a data mart is short: it is measured in weeks rather than months or years.
  • The life cycle of a data mart may become complex in the long run if its planning and design are not organisation-wide.
  • Data marts are small in size.
  • A data mart is customized by department.
  • The source of a data mart is a departmentally structured data warehouse.
  • Data marts are flexible.

ENTERPRISE WAREHOUSE
  • The enterprise warehouse collects all the information about subjects spanning the entire organization.
  • It provides enterprise-wide data integration.
  • The data is integrated from operational systems and external information providers.
  • The volume of this data can vary from a few gigabytes to hundreds of gigabytes, terabytes or beyond.

Monday, 5 January 2015

Difference between OLAP and OLTP (Data Warehouse and Operational Database)


SN | Data Warehouse (OLAP) | Operational Database (OLTP)
1 | This involves historical processing of information. | This involves day-to-day processing.
2 | OLAP systems are used by knowledge workers such as executives, managers and analysts. | OLTP systems are used by clerks, DBAs, or database professionals.
3 | This is used to analyse the business. | This is used to run the business.
4 | It focuses on information out. | It focuses on data in.
5 | This is based on the Star Schema, Snowflake Schema and Fact Constellation Schema. | This is based on the Entity Relationship Model.
6 | This is subject oriented. | This is application oriented.
7 | This contains historical data. | This contains current data.
8 | This provides summarized and consolidated data. | This provides primitive and highly detailed data.
9 | This provides a summarized and multidimensional view of data. | This provides a detailed and flat relational view of data.
10 | The number of users is in hundreds. | The number of users is in thousands.
11 | The number of records accessed is in millions. | The number of records accessed is in tens.
12 | The database size is from 100 GB to TB. | The database size is from 100 MB to GB.
13 | This is highly flexible. | This provides high performance.
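
To make the contrast in the table a little more concrete, the sketch below builds a tiny star schema (one fact table joined to two dimension tables) and runs one OLTP-style lookup and one OLAP-style summary against it. All table names, column names and figures are invented for illustration, and sqlite3 merely stands in for a real warehouse engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the "who/what/when" of each fact.
    CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

    -- The fact table holds the measures and points at the dimensions.
    CREATE TABLE fact_sales (
        item_id INTEGER REFERENCES dim_item(item_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        units   INTEGER,
        revenue REAL
    );

    INSERT INTO dim_item VALUES (1, 'pen', 'stationery'), (2, 'book', 'media');
    INSERT INTO dim_date VALUES (10, 2014, 3), (11, 2014, 4);
    INSERT INTO fact_sales VALUES
        (1, 10, 100, 250.0),
        (2, 10,  20, 240.0),
        (1, 11,  80, 200.0),
        (2, 11,  30, 360.0);
""")

# OLTP-style access: fetch a single record by its key.
oltp_row = conn.execute(
    "SELECT name FROM dim_item WHERE item_id = ?", (2,)
).fetchone()

# OLAP-style access: scan the fact table and summarise along two dimensions.
olap_rows = conn.execute("""
    SELECT d.year, d.quarter, i.category, SUM(f.revenue) AS revenue
    FROM fact_sales AS f
    JOIN dim_item AS i ON i.item_id = f.item_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY d.year, d.quarter, i.category
    ORDER BY d.year, d.quarter, i.category
""").fetchall()

print(oltp_row)
for row in olap_rows:
    print(row)
```

The lookup touches one row, while the summary reads every fact row and returns consolidated totals per quarter and category, which mirrors rows 8, 9 and 11 of the table.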

Introduction to data warehouse and its need and features



The term "Data Warehouse" was first coined by Bill Inmon in 1990. He said that a data warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data. This data supports the decision-making process of analysts in an organization.

An operational database undergoes transactions every day, which causes frequent changes to its data. If, in the future, a business executive wants to analyse earlier feedback on data such as product, supplier or consumer data, the analyst will have no data available to analyse, because the previous data has been overwritten by those transactions.

Data warehouses provide us with generalized and consolidated data in a multidimensional view. Along with this generalized and consolidated view of data, data warehouses also provide Online Analytical Processing (OLAP) tools. These tools help us in interactive and effective analysis of data in a multidimensional space. This analysis results in data generalization and data mining. Data mining functions such as association, clustering, classification and prediction can be integrated with OLAP operations to enhance interactive mining of knowledge at multiple levels of abstraction. That is why the data warehouse has now become an important platform for data analysis and online analytical processing.

Understanding Data Warehouse

  • A data warehouse is a database that is kept separate from the organization's operational database.
  • The data in a data warehouse is not updated frequently.
  • A data warehouse holds consolidated historical data, which helps the organization analyse its business.
  • A data warehouse helps executives organize, understand and use their data to take strategic decisions.
  • Data warehouse systems help in the integration of a diversity of application systems.
  • A data warehouse system allows analysis of consolidated historical data.

Definition
A data warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data that supports management's decision-making process.

Why Data Warehouses Are Kept Separate from Operational Databases

The following are the reasons why data warehouses are kept separate from operational databases:
  • An operational database is built for well-known tasks and workloads, such as searching for particular records and indexing, whereas data warehouse queries are often complex and present a generalized form of the data.
  • Operational databases support the concurrent processing of multiple transactions. They require concurrency control and recovery mechanisms to ensure the robustness and consistency of the database.
  • Operational database queries both read and modify data, while OLAP queries need only read-only access to the stored data.
  • An operational database maintains current data, whereas a data warehouse maintains historical data.
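
The read-only nature of analytical access can be sketched with SQLite's read-only URI mode. This is only an analogy under stated assumptions: the file name warehouse.db, the sales_history table and its figures are made up for the example, and a real warehouse would enforce read-only access through its own permission system.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical warehouse file, created in a temporary directory for the demo.
db_path = Path(os.path.join(tempfile.mkdtemp(), "warehouse.db"))

# A loader process fills the warehouse (the only writer in this sketch).
loader = sqlite3.connect(db_path)
loader.execute("CREATE TABLE sales_history (year INTEGER, revenue REAL)")
loader.executemany(
    "INSERT INTO sales_history VALUES (?, ?)",
    [(2013, 900.0), (2014, 1050.0)],
)
loader.commit()
loader.close()

# Analysts connect in read-only mode: queries work, modifications do not.
analyst = sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)
total = analyst.execute("SELECT SUM(revenue) FROM sales_history").fetchone()[0]

try:
    analyst.execute("DELETE FROM sales_history")
    write_rejected = False
except sqlite3.OperationalError:
    # SQLite refuses the write because the connection was opened read-only.
    write_rejected = True

print(total, write_rejected)
analyst.close()
```

Since analysts never modify the stored data, the warehouse side does not need the fine-grained locking that a transactional system relies on.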


Data Warehouse Features
The key features of a data warehouse, namely Subject Oriented, Integrated, Time-Variant and Non Volatile, are discussed below:
Subject Oriented - A data warehouse is subject oriented because it provides information around a subject rather than around the organization's ongoing operations. These subjects can be products, customers, suppliers, sales, revenue, etc. The data warehouse does not focus on ongoing operations; rather, it focuses on the modelling and analysis of data for decision making.

Integrated - A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc. This integration enhances the effective analysis of data.

Time-Variant - The data in a data warehouse is identified with a particular time period. The data in a data warehouse provides information from a historical point of view.

Non Volatile - Non volatile means that the previous data is not removed when new data is added. The data warehouse is kept separate from the operational database, so frequent changes in the operational database are not reflected in the data warehouse.

Note: A data warehouse does not require transaction processing, recovery and concurrency control, because it is physically stored separately from the operational database.
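
The Integrated, Time-Variant and Non Volatile features can be sketched together as an append-only snapshot load. In the example below, a relational stock table and a CSV "flat file" are two hypothetical sources; each load stamps its rows with a snapshot date and only ever inserts, so earlier snapshots survive later changes in the operational data. All names and values are assumptions made for this sketch.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational operational database.
operational = sqlite3.connect(":memory:")
operational.executescript("""
    CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER);
    INSERT INTO stock VALUES ('pen', 100), ('book', 40);
""")

# Source 2 (hypothetical): a flat file, represented here as CSV text.
FLAT_FILE = "item,qty\nnotebook,15\n"

# The warehouse is a separate database with a time-stamped history table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE stock_history "
    "(snapshot_date TEXT, source TEXT, item TEXT, qty INTEGER)"
)


def load_snapshot(snapshot_date, flat_file_text=""):
    """Append one dated snapshot from both sources; never update or delete."""
    rows = [
        (snapshot_date, "operational_db", item, qty)
        for item, qty in operational.execute("SELECT item, qty FROM stock")
    ]
    for record in csv.DictReader(io.StringIO(flat_file_text)):
        rows.append((snapshot_date, "flat_file", record["item"], int(record["qty"])))
    warehouse.executemany("INSERT INTO stock_history VALUES (?, ?, ?, ?)", rows)


load_snapshot("2015-01-05", FLAT_FILE)
# The operational system overwrites its current value...
operational.execute("UPDATE stock SET qty = 90 WHERE item = 'pen'")
load_snapshot("2015-01-06", FLAT_FILE)

# ...but the warehouse still holds both points in time.
pen_history = warehouse.execute(
    "SELECT snapshot_date, qty FROM stock_history "
    "WHERE item = 'pen' ORDER BY snapshot_date"
).fetchall()
print(pen_history)  # [('2015-01-05', 100), ('2015-01-06', 90)]
```

The operational table now knows only the current quantity of 90, while the history table keeps the earlier value of 100 alongside it, which is the kind of record an analyst needs for looking back over time.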

Data Warehouse Applications
As discussed before, a data warehouse helps business executives organize, analyse and use their data for decision making. A data warehouse serves as a sole part of a plan-execute-assess "closed-loop" feedback system for enterprise management. Data warehouses are widely used in the following fields:
  • Financial services
  • Banking services
  • Consumer goods
  • Retail sectors
  • Controlled manufacturing