Introduction to SQL Server Analysis Services

Many times I’ve been asked by coworkers and friends to show them how to create and work with OLAP, cubes, SSAS, MDX and so on.

Therefore I decided to start writing a simple manual on how to use SSAS from scratch.

In the examples, I will use AdventureWorksDW2008R2 as a data warehouse.
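If you want to double-check that the sample database is restored before going any further, a quick row count against one of its fact tables is enough (FactInternetSales is one of the standard fact tables in this sample; adjust the name if your copy differs):

    USE AdventureWorksDW2008R2;
    GO

    -- Verify the sample warehouse is in place by counting rows in a fact table
    SELECT COUNT(*) AS SalesRowCount
    FROM dbo.FactInternetSales;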

The central concept of the architecture

In the previous post I talked about the Kimball architecture of a data warehouse, but here I will again try to illustrate what a straightforward Business Intelligence solution should look like.
From the diagram, it is clear that many processes have to be completed before we start designing the OLAP cube.
In these sessions I will not go into detail about what those processes are; I will stick strictly to OLAP design and the usage of the Microsoft SQL Server Analysis Services (SSAS) tool.

Starting SSAS

SSAS, alongside SSIS and SSRS, is part of Microsoft’s BI solution and is accessed by opening Visual Studio with the Business Intelligence Development Studio (BIDS) add-ins installed.
If you cannot access SSAS and you installed SQL Server 2008 from a CD or other media, run the installation media again and make sure BIDS is selected as an option, along with the Management Tools.

1. Create a new solution

To create a new project, go to New -> Project and choose Analysis Services Project.
 

2. Create a data source

The first thing to do after creating a solution is to create a data source pointing to the data warehouse we will use.
Under Data Sources -> right-click -> New Data Source. In the Data Source Wizard, choose New to create a new connection manager.
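For reference, what the wizard builds is an ordinary SQL Server connection string; in my environment it ends up looking roughly like this (the provider version and the server name are assumptions, use whatever your own instance requires):

    Provider=SQLNCLI10.1;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=AdventureWorksDW2008R2;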

3. Choose Impersonation options

When to use a snowflake schema and when a star schema?

This question is very common in interviews, but it is also an essential part of data warehouse design.

Star schema

In a star schema, dimensions that reflect a hierarchy are flattened into a single table.

For example, in a star schema the Product dimension would have columns like Product Line, Model and Inventory in one denormalized table, while in a relational database this hierarchy would most likely be normalized into multiple tables with one-to-many relationships.
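Just to make the difference concrete, a flattened product dimension like the one described above could be sketched roughly like this (the column list is invented for illustration; it is not the AdventureWorksDW table):

    -- Star schema: the whole product hierarchy lives in one denormalized table
    CREATE TABLE dbo.DimProduct (
        ProductKey    INT IDENTITY(1,1) PRIMARY KEY,
        ProductNumber NVARCHAR(25),
        ProductName   NVARCHAR(50),
        Model         NVARCHAR(50),
        ProductLine   NVARCHAR(50),
        Inventory     INT
    );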

Snowflake schema

A snowflake schema does not flatten a hierarchical dimension into a single table. Instead, it has two or more tables with a one-to-many relationship, so the dimensions are kept in a more normalized structure.

A good example of when to use a snowflake schema is a company with many types of products. Some products have only a few attributes, while others have many. The best practice here is to create a main Product dimension that holds the common attributes like Name, Price and Product Number, and then additional sub-dimensions for each distinct product type, such as a Groceries dimension, a Hardware dimension, a Books dimension and so on.
Each sub-dimension must contain a foreign key referencing the main Product dimension table, as sketched below.
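A minimal sketch of that idea (all table and column names here are invented for the example):

    -- Snowflake schema: common attributes live in the main product dimension...
    CREATE TABLE dbo.DimProduct (
        ProductKey    INT IDENTITY(1,1) PRIMARY KEY,
        ProductNumber NVARCHAR(25),
        Name          NVARCHAR(50),
        Price         MONEY
    );

    -- ...and type-specific attributes live in sub-dimensions that reference
    -- the main dimension through a foreign key
    CREATE TABLE dbo.DimBook (
        BookKey    INT IDENTITY(1,1) PRIMARY KEY,
        ProductKey INT NOT NULL REFERENCES dbo.DimProduct (ProductKey),
        Author     NVARCHAR(100),
        ISBN       NVARCHAR(20)
    );

    CREATE TABLE dbo.DimHardware (
        HardwareKey INT IDENTITY(1,1) PRIMARY KEY,
        ProductKey  INT NOT NULL REFERENCES dbo.DimProduct (ProductKey),
        Voltage     NVARCHAR(20),
        Warranty    NVARCHAR(50)
    );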

What is the Kimball Architecture?

At every job interview I go to, I am asked whether I know what the Kimball Architecture is.
Here is how the Kimball Group describes their architecture on their website:

The Kimball Architecture separates the data and processes comprising the DW/BI system into the backroom extract, transformation and load (ETL) environment and the front room presentation area, as illustrated in the following diagram.
Figure 1: Kimball architecture diagram.

The Kimball Architecture focuses on the following components:

  • Backroom ETL system: The Kimball Group has identified 34 subsystems in the ETL process flow, grouped into four major operations: extracting the data from the sources, performing cleansing and conforming transformations, delivering it to the presentation server, and managing the ETL process and back room environment.
  • Front room presentation area: The Kimball Architecture presumes the data utilized by the BI applications is dimensionally-structured, organized by business process, atomically-grained (complemented by aggregated summaries for performance tuning), and tied together by the enterprise data warehouse bus architecture, as described earlier on this page.
  • Front room BI applications: The front room is the public face of the DW/BI system; it’s what business users see and work with day-to-day. There’s a broad range of BI applications supported by BI management services in the front room, including ad hoc queries, standardized reports, dashboards and scorecards, and more powerful analytic or mining/modeling applications.
  • Metadata: Metadata is all the information that defines and describes the structures, operations, and contents of the DW/BI system. Technical metadata defines the objects and processes which comprise the DW/BI system. Business metadata describes the data warehouse contents in user terms, including what data is available, where did it come from, what does it mean, and how does it relate to other data. Finally, process metadata describes the warehouse’s operational results.

Some organizations adopt an alternative data warehouse architecture that includes a third normal form (3NF) relational data warehouse. This hub-and-spoke architecture, often called the Corporate Information Factory (CIF), includes a data acquisition ETL process to gather, clean and integrate data similar to the backroom ETL system described above. With the CIF, atomic data is loaded into third normal form structures, typically called the enterprise data warehouse (EDW). Another ETL data delivery process then populates downstream reporting and analytic environments supporting the business users; these environments are typically structured dimensionally.
A modification to the Kimball Architecture, sometimes referred to as a hybrid architecture, leverages an existing 3NF data warehouse as the source of clean, integrated data to feed the front room presentation area described above.