Monday, August 30, 2010

Basic Storage Modes (MOLAP, ROLAP and HOLAP) in Analysis Services

There are three standard storage modes (MOLAP, ROLAP and HOLAP) in OLAP applications which affect the performance of OLAP queries and cube processing, storage requirements and also determine storage locations. To learn more about these standard storage modes, pros and cons of each one, click here.

Friday, August 27, 2010

Lesson 2: Defining and Deploying a Cube


After you define a data source view in your Microsoft Analysis Services project, you are ready to define an initial Analysis Services cube.

You can define a cube and its dimensions in a single pass using the Cube Wizard. Alternatively, you can define one or more dimensions and then use the Cube Wizard to define a cube that uses those dimensions. If you are designing a complex solution, you generally start by defining the dimensions



Defining a Dimension

In the following task, you will use the Dimension Wizard to build a Date dimension.

Note

This lesson requires that you have either completed all the procedures in Lesson 1 or have opened the completed project for Lesson 1 that is available by downloading and installing the updated samples. For more information, see Obtaining Updated Samples in Considerations for Installing SQL Server Samples and Sample Databases. When it is installed, the default location for this project file is C:\Program Files\Microsoft SQL Server\100\Samples\Analysis Services\Tutorials\Lesson 1 Complete.

To define a dimension

1. In Solution Explorer, right-click Dimensions, and then click New Dimension.

2. On the Welcome to the Dimension Wizard page, click Next.

3. On the Select Creation Method page, verify that the Use an existing table option is selected, and then click Next.

4. On the Specify Source Information page, verify that the Adventure Works DW data source view is selected.

5. In the Main table list, select Date.

6. Click Next.

7. On the Select Dimension Attributes page, select the check boxes next to the following attributes:

Date Key

• Full Date Alternate Key

• English Month Name

• Calendar Quarter

• Calendar Year

• Calendar Semester

8. Change the setting of the Full Date Alternate Key attribute's Attribute Type column from Regular to Date.To do this, click Regular in the Attribute Type column. Then click the arrow to expand the options. Next, click Date > Calendar > Date. Click OK. Repeat these steps to change the attribute type of the following attributes as follows:

English Month Name to Month

• Calendar Quarter to Quarter

• Calendar Year to Year

• Calendar Semester to Half Year

9. Click Next.

10. On the Completing the Wizard page, in the Preview pane, you can see the Date dimension and its attributes.

11. Click Finish to complete the wizard.

In Solution Explorer, in the Analysis Services Tutorial project, the Date dimension appears in the Dimensions folder. In the center of the development environment, Dimension Designer displays the Date dimension.

12. On the File menu, click Save All.

Defining a Cube

The Cube Wizard helps you define the measure groups and dimensions for a cube. In the following task, you will use the Cube Wizard to build a cube.

To define a cube and its properties

1. In Solution Explorer, right-click Cubes, and then click New Cube.

2. On the Welcome to the Cube Wizard page, click Next.

3. On the Select Creation Method page, verify that the Use existing tables option is selected, and then click Next.

4. On the Select Measure Group Tables page, verify that the Adventure Works DW data source view is selected.

5. Click Suggest to have the cube wizard suggest tables to use to create measure groups.

The wizard examines the tables and suggests InternetSales as a measure group table. Measure group tables, also named fact tables, contain the measures you are interested in such as the number of units sold.

6. Click Next.

7. On the Select Measures page, review the selected measures in the Internet Sales measure group, and then clear the check boxes for the following measures:

Promotion Key

• Currency Key

• Sales Territory Key

• Revision Number

By default, the wizard selects as measures all numeric columns in the fact table that are not linked to dimensions. However, these four columns are not actual measures. The first three are key values that link the fact table with dimension tables that are not used in the initial version of this cube.

8. Click Next.

9. On the Select Existing Dimensions page, select the Date dimension that you created earlier and then click Next.

10. On the Select New Dimensions page, select the new dimensions to be created. To do this, verify that the Customer, Geography and Product check boxes are selected and clear the InternetSales check box.

11. Click Next.

12. On the Completing the Wizard page, change the name of the cube to Analysis Services Tutorial. In the Preview pane, you can see the InternetSales measure group and its measures. You can also see the Date, Customer, and Product dimensions.

13. Click Finish to complete the wizard.

In Solution Explorer, in the Analysis Services Tutorial project, the Analysis Services Tutorial cube appears in the Cubes folder, and database dimensions appear in the Dimensions folder. Additionally, in the center of the development environment, Cube Designer displays the Analysis Services Tutorial cube.

14. On the toolbar of Cube Designer, change the Zoom level to 50 percent, so that you can more easily see the dimensions and fact tables in the cube. Notice that the fact table is yellow and the dimension tables are blue.

15. On the File menu, click Save All.

Adding Attributes to Dimensions

In the following tasks, you will use Dimension Designer to add attributes to the Customer and Product dimensions.

Adding Attributes to the Customer Dimension

To add attributes

1. Open Dimension Designer for the Customer dimension. To do this, double-click the Customer dimension in the Dimensions node of Solution Explorer.

2. In the Attributes pane, notice the Customer Key and Geography Key attributes that were created by the Cube Wizard.

3. On the toolbar of the Dimension Structure tab, use the Zoom icon to view the tables in the Data Source View pane at 100 percent.

4. Drag the following columns from the Customer table in the Data Source View pane to the Attributes pane:

BirthDate

• MaritalStatus

• Gender

• EmailAddress

• YearlyIncome

• TotalChildren

• NumberChildrenAtHome

• EnglishEducation

• EnglishOccupation

• HouseOwnerFlag

• NumberCarsOwned

• Phone

• DateFirstPurchase

• CommuteDistance

5. Drag the following columns from the Geography table in the Data Source View pane to the Attributes pane:

City

• StateProvinceName

• EnglishCountryRegionName

• PostalCode

6. On the File menu, click Save All.

Adding Attributes to the Product Dimension

To add attributes

1. Open Dimension Designer for the Product dimension.

2. In the Attributes pane, notice the Product Key attribute that was created by the Cube Wizard.

3. On the toolbar of the Dimension Structure tab, use the Zoom icon to view the tables in the Data Source View pane at 100 percent.

4. Drag the following columns from the Products table in the Data Source View pane to the Attributes pane:

StandardCost

• Color

• SafetyStockLevel

• ReorderPoint

• ListPrice

• Size

• SizeRange

• Weight

DaysToManufacture

• ProductLine

• DealerPrice

• Class

• Style

• ModelName

• StartDate

• EndDate

• Status

5. On the File menu, click Save All.

Reviewing Cube and Dimension Properties

After you have defined a cube, you can review the results by using Cube Designer. In the following task, you review the structure of the cube in the Analysis Services Tutorial project.

To review cube and dimension properties in Cube Designer

1. To open the Cube Designer, double-click the Analysis Services Tutorial cube in the Cubes node of Solution Explorer.

2. In the Measures pane of the Cube Structure tab in Cube Designer, expand the Internet Sales measure group to reveal the defined measures.

You can change the order by dragging the measures into the order that you want. The order you create will affect how certain client applications order these measures. The measure group and each measure that it contains have properties that you can edit in the Properties window.

3. In the Dimensions pane of the Cube Structure tab in Cube Designer, review the cube dimensions that are in the Analysis Services Tutorial cube.

Notice that although only three dimensions were created at the database level, as displayed in Solution Explorer, there are five cube dimensions in the Analysis Services Tutorial cube. The cube contains more dimensions than the database because the Date database dimension is used as the basis for three separate date-related cube dimensions, based on different date-related facts in the fact table. These date-related dimensions are also called role playing dimensions. The three date-related cube dimensions let users dimension the cube by three separate facts that are related to each product sale: the product order date, the due date for fulfillment of the order, and the ship date for the order. By reusing a single database dimension for multiple cube dimensions, Analysis Services simplifies dimension management, uses less disk space, and reduces overall processing time.

4. In the Dimensions pane of the Cube Structure tab, expand Customer, and then click Edit Customer to open the dimension in Dimension Designer.

Dimension Designer contains these tabs: Dimension Structure, Attribute Relationships, Translations, and Browser. Notice that the Dimension Structure tab includes three panes: Attributes, Hierarchies, and Data Source View. The attributes that the Dimension contains appear in the Attributes pane. For more information, see: Defining Dimension Attributes, Creating User-Defined Hierarchies, Defining Attribute Relationships

5. Switch to Cube Designer by right-clicking the Analysis Services Tutorial cube in the Cubes node in Solution Explorer and then clicking View Designer.

6. In Cube Designer, click the Dimension Usage tab.

In this view of the Analysis Services Tutorial cube, you can see the cube dimensions that are used by the Internet Sales measure group. Also, you can define the type of relationship between each dimension and each measure group in which it is used.

7. Click the Partitions tab.

The Cube Wizard defined a single partition for the cube, by using the multidimensional online analytical processing (MOLAP) storage mode without aggregations. With MOLAP, all leaf-level data and all aggregations are stored within the cube for maximum performance. Aggregations are precalculated summaries of data that improve query response time by having answers ready before questions are asked. You can define additional partitions, storage settings, and writeback settings on the Partitions tab. For more information, see Partitions (Analysis Services - Multidimensional Data), Aggregations and Aggregation Designs, Designing Partition Storage and Aggregations

8. Click the Browser tab.

Notice that the cube cannot be browsed because it has not yet been deployed to an instance of Analysis Services. At this point, the cube in the Analysis Services Tutorial project is just a definition of a cube, which you can deploy to any instance of Analysis Services. When you deploy and process a cube, you create the defined objects in an instance of Analysis Services, and populate the objects with data from the underlying data sources.

9. In Solution Explorer, right-click Analysis Services Tutorial in the Cubes node and then click View Code.

The XML code for the Analysis Services Tutorial cube is displayed on the Analysis Services Tutorial.cube [XML] tab. This is the actual code that is used to create the cube in an instance of Analysis Services during deployment. For more information, see: How to: View the XML for an Analysis Services Project

10. Close the XML code tab.

Deploying an Analysis Services Project

To view the cube and dimension data for the objects in the Analysis Services Tutorial cube in the Analysis Services Tutorial project, you must deploy the project to a specified instance of Analysis Services and then process the cube and its dimensions. Deploying an Analysis Services project creates the defined objects in an instance of Analysis Services. Processing the objects in an instance of Analysis Services copies the data from the underlying data sources into the cube objects. For more information, see Deploying Analysis Services Projects, Configuring Analysis Services Project Properties

At this point in the development process, you generally deploy the cube to an instance of Analysis Services on a development server. Once you have finished developing your business intelligence project, you will generally use the Analysis Services Deployment Wizard to deploy your project from the development server to a production server. For more information, see Planning an Analysis Services Deployment, Using the Analysis Services Deployment Wizard

In the following task, you review the deployment properties of the Analysis Services Tutorial project and then deploy the project to your local instance of Analysis Services.

To deploy the Analysis Services project

1. In Solution Explorer, right-click the Analysis Services Tutorial project, and then click Properties.

The Analysis Services Tutorial Property Pages dialog box appears and displays the properties of the Active(Development) configuration. You can define multiple configurations, each with different properties. For example, a developer might want to configure the same project to deploy to different development computers and with different deployment properties, such as database names or processing properties. Notice the value for the Output Path property. This property specifies the location in which the XMLA deployment scripts for the project are saved when a project is built. These are the scripts that are used to deploy the objects in the project to an instance of Analysis Services.

2. In the Configuration Properties node in the left pane, click Deployment.

Review the deployment properties for the project. By default, the Analysis Services Project template configures an Analysis Services project to incrementally deploy all projects to the default instance of Analysis Services on the local computer, to create an Analysis Services database with the same name as the project, and to process the objects after deployment by using the default processing option. Related topic: Configuring Analysis Services Project Properties

Note

If you want to deploy the project to a named instance of Analysis Services on the local computer, or to an instance on a remote server, change the Server property to the appropriate instance name, such as <ServerName>\<InstanceName>.

3. Click OK.

4. In Solution Explorer, right-click the Analysis Services Tutorial project, and then click Deploy.

Business Intelligence Development Studio builds and then deploys the Analysis Services Tutorial project to the specified instance of Analysis Services by using a deployment script. The progress of the deployment is displayed in two windows: the Output window and the Deployment Progress – Analysis Services Tutorial window. Open the Output window, if necessary, by clicking Output on the View menu. The Output window displays the overall progress of the deployment. The Deployment Progress – Analysis Services Tutorial window displays the detail about each step taken during deployment. Related topics: Building Analysis Services Projects, Deploying Analysis Services Projects

5. Review the contents of the Output window and the Deployment Progress – Analysis Services Tutorial window to verify that the cube was built, deployed, and processed without errors.

6. Hide the Deployment Progress - Analysis Services Tutorial window by clicking the Auto Hide icon on the toolbar of the window.

7. Hide the Output window by clicking the Auto Hide icon on the toolbar of the window.

You have successfully deployed the Analysis Services Tutorial cube to your local instance of Analysis Services, and then processed the deployed cube.

Browsing the Cube

After you deploy a cube, the cube data is viewable on the Browser tab in Cube Designer, and the dimension data is viewable on the Browser tab in Dimension Designer.

To browse the deployed cube

1. Switch to Dimension Designer for the Product dimension in Business Intelligence Development Studio. To do this, double-click the Product dimension in the Dimensions node of Solution Explorer.

2. Click the Browser tab to display the All member of the Product Key attribute hierarchy. In lesson three, you will define a user hierarchy for the Product dimension that will let you browse the dimension.

3. Switch to Cube Designer in Business Intelligence Development Studio. To do this, double-click the Analysis Services Tutorial cube in the Cubes node of Solution Explorer.

4. Select the Browser tab, and then click the reconnect icon on the toolbar of the designer.

The left pane of the designer shows the objects in the Analysis Services Tutorial cube. On the right side of the Browser tab there are two panes: the upper pane is the Filter pane, and the lower pane is the Data pane. In an upcoming lesson, you will use the cube browser to do analysis.

Lesson 1: Defining a Data Source View within an Analysis Services Project

Here I am using AdventureworksDW. You can download  here


Creating an Analysis Services Project

In the following task, you use Business Intelligence Development Studio to create a new Microsoft Analysis Services project named Analysis Services Tutorial, based on the Analysis Services Project template. A project is a collection of related objects. Projects exist within a solution, which includes one or more projects. For more information, see Defining an Analysis Services Project.

To create a new Analysis Services project

1. Click Start, point to All Programs, point to Microsoft SQL Server 2008, and then click SQL Server Business Intelligence Development Studio.

The Microsoft Visual Studio development environment opens.

2. On the File menu of Visual Studio, point to New, and then click Project.

3. In the New Project dialog box, select Business Intelligence Projects in the Project types pane, and then select Analysis Services Project in the Templates pane.

Notice the default project name, the default solution name, and the default project location in the bottom of the dialog box. By default, a new directory will be created for the solution.

4. Change the project name to Analysis Services Tutorial, which also changes the solution name, and then click OK.

You have successfully created the Analysis Services Tutorial project, based on the Analysis Services Project template, within a new solution that is also named Analysis Services Tutorial.

Defining a Data Source

After you create an Analysis Services project, you generally start working with the project by defining one or more data sources that the project will use. When you define a data source, you are defining the connection string information that will be used to connect to the data source. For more information, see Defining a Data Source Using the Data Source Wizard (Analysis Services).

In the following task, you define the AdventureWorksDW2008 sample database as the data source for the Analysis Services Tutorial project. While this database is located on your local computer for the purposes of this tutorial, source databases are frequently hosted on one or more remote computers.

To define a new data source

1. In Solution Explorer, right-click Data Sources, and then click New Data Source.

2. On the Welcome to the Data Source Wizard page, click Next to open the Select how to define the connection page.

3. On the Select how to define the connection page, you can define a data source based on a new connection, based on an existing connection, or based on a previously defined data source object. In this tutorial, you define a data source based on a new connection. Verify that Create a data source based on an existing or new connection is selected and then click New.

4. In the Connection Manager dialog box, you define connection properties for the data source. In the Provider list, verify that Native OLE DB\SQL Server Native Client 10.0 is selected.

Analysis Services also supports other providers, which are displayed in the Provider list.

5. In the Server name text box, type localhost.

To connect to a named instance on your local computer, type localhost\<instance name>. To connect to the specific computer instead of the local computer, type the computer name or IP address.

6. Verify that Use Windows Authentication is selected. In the Select or enter a database name list, select AdventureWorksDW2008.

7. Click Test Connection to test the connection to the database.

8. Click OK, and then click Next.

9. On the Impersonation Information page of the wizard, you define the security credentials for Analysis Services to use to connect to the data source. Impersonation affects the Windows account used to connect to the data source when Windows Authentication is selected. Analysis Services does not support impersonation for processing OLAP objects. Select Use the service account, and then click Next.

10. On the Completing the Wizard page, type the name Adventure Works DW and then click Finish to create the new data source.

Note:

To modify the properties of the data source after it has been created, double-click the data source in the Data Sources folder to display the data source properties in Data Source Designer.

Defining a Data Source View

After you define the data sources that you will use in an Analysis Services project, the next step is generally to define a data source view for the project. A data source view is a single, unified view of the metadata from the specified tables and views that the data source defines in the project. Storing the metadata in the data source view enables you to work with the metadata during development without an open connection to any underlying data source. For more information, see Designing Data Source Views (Analysis Services).

In the following task, you define a data source view that includes five tables from the Adventure Works DW data source.

To define a new data source view

1. In Solution Explorer, right-click Data Source Views, and then click New Data Source View.

2. On the Welcome to the Data Source View Wizard page, click Next.

3. The Select a Data Source page appears. Under Relational data sources, the Adventure Works DW data source is selected. Click Next.

Note:

To create a data source view that is based on multiple data sources, you first define a data source view that is based on a single data source. This data source is then called the primary data source. You can then add tables and views from a secondary data source. When designing dimensions that contain attributes based on related tables in multiple data sources, you might have to define a MicrosoftSQL Server data source as the primary data source to use its distributed query engine capabilities.

4. On the Select Tables and Views page, you select tables and views from the list of objects that are available from the selected data source. You can filter this list to help you in selecting tables and views.

Note:

Click the maximize button in the upper-right corner so that the window covers the full screen. This will make it easier to see the complete list of available objects.

5. In the Available objects list, select the following objects. You can select multiple tables by clicking each while holding down the CTRL key:

• DimCustomer (dbo)

• DimDate (dbo)

• DimGeography (dbo)

• DimProduct (dbo)

• FactInternetSales (dbo)

6. Click > to add the selected tables to the Included objects list.

7. Click Next.

8. In the Name field, type Adventure Works DW and then click Finish to define the Adventure Works DW data source view.

The Adventure Works DW data source view appears in the Data Source Views folder in Solution Explorer. The content of the data source view is also displayed in Data Source View Designer in Business Intelligence Development Studio. This designer contains the following elements:

• A Diagram pane in which the tables and their relationships are represented graphically.

• A Tables pane in which the tables and their schema elements are displayed in a tree view.

• A Diagram Organizer pane in which you can create subdiagrams so that you can view subsets of the data source view.

• A toolbar that is specific to Data Source View Designer.

9. To maximize the Microsoft Visual Studio development environment, click the Maximize button.

10. To view the tables in the Diagram pane at 50 percent, click the Zoom icon on the Data Source View Designer toolbar. This will hide the column details of each table.

11. To hide Solution Explorer, click the Auto Hide button, which is the pushpin icon on the title bar. To view Solution Explorer again, position your pointer over the Solution Explorer tab along the right side of the development environment. To unhide Solution Explorer, click the Auto Hide button again.

12. If the window is not hidden by default, click Auto Hide on the title bar of the Properties window.

You can now view all the tables and their relationships in the Diagram pane. Notice that there are three relationships between the FactInternetSales table and the DimDate table. Each sale has three dates associated with the sale: an order date, a due date, and a ship date. To view the details of any relationship, double-click the relationship arrow in the Diagram pane.

Modifying Default Table Names

You can change the value of the FriendlyName property for objects in the data source view to increase the user-friendliness of their names. You can also change the names of these objects after you define them.

In the following task, you will change the friendly name of each table in the Adventure Works DW data source view by removing the "Dim" and "Fact" prefixes from these tables. This will increase the user-friendliness of the cube and dimension objects that you will define in the next lesson.

Note:

You can also change the friendly names of columns, define calculated columns, and join tables or views in the data source view to increase their user-friendliness.

To modify the default name of a table:

1. In the Tables pane of Data Source View Designer, right-click the FactInternetSales table, and then click Properties to display the properties for the FactInternetSales object in the Adventure Works Tutorial data source view.

2. Click the Auto Hide button on the title bar of the Properties window so that this window will remain visible.

It is easier to change the properties for each table in the data source view when the Properties window remains open. If you do not pin the window open by using the Auto Hide button, the window will close when you click a different object in the Diagram pane.

3. Change the FriendlyName property for the FactInternetSales object to InternetSales.

When you click away from the cell for the FriendlyName property, the change is applied. In the next lesson, you will define a measure group that is based on this fact table. The name of the fact table will be InternetSales instead of FactInternetSales because of the change you made in this lesson.

4. Click DimProduct in the Tables pane. In the Properties window, change the FriendlyName property to Product.

5. Change the FriendlyName property of each remaining table in the data source view in the same way, to remove the "Dim" prefix.

6. When you have finished, click the Auto Hide button to hide the Properties window again.

7. On the File menu, or on the toolbar of BI Development Studio, click Save All to save the changes you have made to this point in the Analysis Services Tutorial project. You can stop the tutorial here if you want and resume it later.

The basic concepts of OLAP include

The basic concepts of OLAP include:


Cube
Dimension table
Dimension
Level
Fact table
Measure
Schema
Cube

The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s been aggregated to allow queries to return data quickly. For example, a cube of order data might be aggregated by time period and by title, making the cube fast when you ask questions concerning orders by week or orders by title.



Cubes are ordered into dimensions and measures. Dimensions come from dimension tables, while measures come from fact tables.

Dimension table

A dimension table contains hierarchical data by which you’d like to summarize. Examples would be an Orders table, that you might group by year, month, week, and day of receipt, or a Books table that you might want to group by genre and title.

Dimension

Each cube has one or more dimensions, each based on one or more dimension tables. A dimension represents a category for analyzing business data: time or category in the examples above. Typically, a dimension has a natural hierarchy so that lower results can be “rolled up” into higher results. For example, in a geographical level you might have city totals aggregated into state totals, or state totals into country totals.

Level

Each type of summary that can be retrieved from a single dimension is called a level. For example, you can speak of a week level or a month level in a time dimension.

Fact table

A fact table contains the basic information that you wish to summarize. This might be order detail information, payroll records, drug effectiveness information, or anything else that’s amenable to summing and averaging. Any table that you’ve used with a Sum or Avg function in a totals query is a good bet to be a fact table.

Measure

Every cube will contain one or more measures, each based on a column in a fact table that you’d like to analyze. In the cube of book order information, for example, the measures would be things such as unit sales and profit.

Schema

Fact tables and dimension tables are related, which is hardly surprising, given that you use the dimension tables to group information from the fact table. The relations within a cube form a schema. There are two basic OLAP schemas: star and snowflake. In a star schema, every dimension table is related directly to the fact table. In a snowflake schema, some dimension tables are related indirectly to the fact table. For example, if your cube includes OrderDetails as a fact table, with Customers and Orders as dimension tables, and Customers is related to Orders, which in turn is related to OrderDetails, then you’re dealing with a snowflake schema.

SQL Server Analysis Services (SSAS)

Introduction


Microsoft SQL Server Analysis Services (SSAS) delivers online analytical processing (OLAP) and data mining functionality for business intelligence applications. Analysis Services supports OLAP by letting you design, create, and manage multidimensional structures that contain data aggregated from other data sources, such as relational databases. For data mining applications, Analysis Services lets you design, create, and visualize data mining models that are constructed from other data sources by using a wide variety of industry-standard data mining algorithms.



The basic idea of OLAP is fairly simple. Let’s think about that book ordering data for a moment. Suppose you want to know how many people ordered a particular book during each month of the year. You could write a fairly simple query to get the information you want. The catch is that it might take a long time for SQL Server to churn through that many rows of data.



And what if the data was not all in a single SQL Server table, but scattered around in various databases throughout your organization? The customer info, for example, might be in an Oracle database, and supplier information in a legacy xBase database. SQL Server can handle distributed heterogeneous queries, but they’re slower.



What if, after seeing the monthly numbers, you wanted to drill down to weekly or daily numbers? That would be even more time -consuming and require writing even more queries.



This is where OLAP comes in. The basic idea is to trade off increased storage space now for speed of querying later. OLAP does this by precalculating and storing aggregates. When you identify the data that you want to store in an OLAP database, Analysis Services analyzes it in advance and figures out those daily, weekly, and monthly numbers and stores them away (and stores many other aggregations at the same time). This takes up plenty of disk space, but it means that when you want to explore the data you can do so quickly.