Getting Started with Adobe After Effects - Part 6: Motion Blur


Upload Image Close it
Select File

Browse by Tags · View All
SQL Server Storage Internals 4
T-SQL 1

Archive · View All
August 2011 5

LivingForSqlServer

SQL Server Storage Internals Part 1 - Basics

Aug 8 2011 4:56AM by Ramkumar   

While learning query tuning topics, I got a chance to understand anatomy of data page and different types and levels of Index pages.

This learning helped me to understand concepts better and to give better performance tuning solutions later.

This is my attempt to share my learning with beginners by using some simple scripts in step by step approach.

Suggest all SQL Server DBAs to spend sometime to understand anatomy of different types of Pages. This would help DBAs to take better decisions when DBCC CHECKDB throws warning messages, to provide better performance tuning solutions etc.

While writing an article in this topic, Its unavoidable to refer some Undocumented commands. I strongly recommend beginners to try all given scripts only in local or development enviornments.

Those who are new to system tables and SQL Server 2005/2008 System views, take some time to refer "Mapping SQL Server 2000 System Tables to SQL Server 2005 System Views" topic in below url.

http://msdn.microsoft.com/en-us/library/ms187997(SQL.90).aspx

Understaning system tables (SQL 2000) or system views (SQL 2005/2008) always helps you to make your day to day work easier.

This article is to make beginners comfortable in SQL Server storage internals. I 've delibrately avoided some accuracies in some places for better understanding.

Without wasting time, lets quickly jump into the topic.

To test concepts related to storage internals, lets create a database named LearningInternals

CREATE DATABASE LearningInternals
GO
USE LearningInternals
GO
    

Some key points on data and log files:

A database have three types of files

  1. Primary data file (.MDF) - Every database has one primary data file and is the starting point of database. All user tables are created in PRIMARY filegroup if no other secondary files are present)
  2. Secondary data file (.NDF) - A database can have optionally zero or more Secondary data files, created for better performance and to manage tables efficiently.
  3. Transaction Log file (.LDF) - A database by default will have one log file, used to record changes made in tables. This can be backed up and used later in case of data loss.

Here is the query to list all filegroups (primary and secondary data files) in current database

select * from sys.filegroups
go

Below query lists all types of files and its related details

select database_id, file_id, type, type_desc, name, physical_name, state_desc, size, is_read_only, backup_lsn, differential_base_guid 
from sys.master_files 
where DB_NAME(database_id) = 'LearningInternals'
go

Recomendation:

Its recommended to keep data and log files in different drives for better performance. And also its recommended not to keep Tempdb in drives where user database files are placed.

Before moving further into Storage internals, its important to understand some key internal components.

Pages

Is the fundamental unit of data storage in SQL Server. Once database is created, data space in primary and secondary files are logically divided 8k blocks called Pages. Page numbers are refered along with data file number. For example, first page of Primary file group is named as 1:0 (1 represents filegroup and 0 represents page number).

Page size : 8 k (8 * 1024 = 8192 bytes. 92 bytes for header. 8060 bytes  for data)

Extents

Extents are collection of 8 pages. Pages in an extent can be owned by one or upto 8 tables.

There are two types of extents

  1. Uniform extent - all 8 pages are belongs to one table
  2. Mixed extent - 8 pages are shared by two or more tables

when a table is created and a row is inserted, table gets 1 page in mixed extent. When a table grows, table is moved to uniform extent. This logic is to manage space efficiently.

There are different types of pages.

For our convenience We can logically divide pages into

  1. System pages - pages used by Storage engine to manage user data
  2. Data & index pages - Pages holds user created table data or index values

System pages:

Slno Page Description
1 Header Page Stores detals about data files like size and type etc.
2 GAM Global Allocation Map. GAM is the 3rd page (1:2) in each data file. It's a bitmap with appx. 64000 bits (8k page * 8 bits per byte). each bit tells whether the associated extent is uniform or not.  With one GAM, Storage engine can  identify upto 64000 uniform extents. Meaning one GAM  will cover upto 4 GB data space. What if data file size is greater than 4GB? Simple. another GAM will cover next 4 GB portion
3 SGAM Shared Global Allocation Map. SGAM is the 4th page (1:3) in each data file. It's a bitmap with appx. 64000 bits (8k page * 8 bits per byte) each bit tells whether the associated extent is mixed extent or not.  With one SGAM, Storage engine can  identify upto 64000 mixed extents. Meaning one SGAM  will cover upto 4 GB data space. What if data file size is greater than 4GB? Simple. another SGAM will cover next 4 GB portion
4 IAM Index Allocation Map. One IAM page for each table to identify pages associated with a table (there can be many IAM for one table if table is partitioned and/or a table have row overflow pages/LOB data etc. im leaving them unexplained as of now).  It's a bitmap with appx. 64000 bits. each bit tells whether the associated extent is allocated to a table or not.  With one IAM, Storage engine can  identify upto 64000 extents. Meaning one IAM will cover upto 4 GB data space. What if a table size is greater than 4GB? Simple. another IAM will cover next 4 GB portion.
5 PFS Page Free Space. PFS is used for storing space availability of each pages.  One byte for each page. There will be one PFS for every 8000 pages.
6 DCM Differential Change Map. DCM is a bitmap page helps differential backups to identify the extents changed since Full backup.
7 BCM Bulk change map. Used to monitor extents affected by bulk operations. Used when database in bulk logged recovery mode.

Data and Index Pages:

Slno Page Description
1 Data page Holds table rows. When an Insert statement is issued, PFS helps to identify the right page to hold the record.
2 Row Overflow page From SQL 2005, you have have 4 varchar(5000) columns, meaning  maximum 8000 bytes per row limitation is avoided with Row overflow page. This page is used when a record size exceeds 8000 row size.
3 LOB LOB data like text, varchar(max) pages are stored seperately.

When a database is created, a data file will have pages like this,

Page Header PFS GAM SGAM Unused Unused DCM BCM .... Data Data Data Data Data Data Data
Page No. 1:0 1:1 1:2 1:3 1:4 1:5 1:6 1:7 .... 1:15001 1:15002 1:15003 1:15004 1:15005 1:15006 1:15007

Here is the query to list all tables and pages allocated to each table.

select so.name, so.object_id, sp.index_id, sp.partition_id, sp.hobt_id, sa.container_id, internals.total_pages, internals.used_pages, internals.data_pages, first_page, root_page, first_iam_page
from sys.objects so
      inner join sys.partitions sp on so.object_id = sp.object_id
      inner join sys.allocation_units sa on sa.container_id = sp.hobt_id
      inner join sys.system_internals_allocation_units internals on internals.container_id = sa.container_id
where so.name not like 'sys%'

Summary:

When a database is created, by default one primary data file and a log file is created. Data file is logically divided in to hundreds of 8k blocks called pages. Some system related pages like GAM, SGAM, IAM and PFS are used to manage data pages.

In my next article, I am going to cover

  1. The relationship among sys.objects, sys.partitions and sys.allocation_units system views
  2. Anatomy of a page.

Tags: SQL Server Storage Internals,


Ramkumar
366 · 0% · 112
2
 
0
Lifesaver
 
0
Refreshed
 
0
Learned
 
0
Incorrect



Submit

2  Comments  

  • Good information. I've found understanding pags and extents is important to maintain contigiuous data. Keeping fragmentation levels low is good for read-ahead performance

    commented on Aug 21 2011 1:01AM
    Jack Vamvas
    5 · 26% · 8528
  • Hi Ram,

    Great article..Just happen to stumble across it..

    But I think your explanation of IAM pages is kind of incomplete.IAM pages are not used to track the pages associated with a table but rather the extents associated with an NC index or a heap(there is a lot of debate on this actually) and each IAM page will track the extents across a single GAM for a single entity.

    Also NC indexes or heaps do occupy non contiguous pages and a single NC index can span across multiple files.To track them IAM chains are used to link them together and are sorted by their file order.So when the optimizer decides to do a NC scan or a table scan IAM is the only way for it to do so(unordered scan) while for a clustered index scan it has 2 options linked list or IAM chains.

    commented on Oct 30 2011 1:39AM
    Sachin Nandanwar
    499 · 0% · 76

Your Comment


Sign Up or Login to post a comment.

"SQL Server Storage Internals Part 1 - Basics" rated 5 out of 5 by 2 readers
SQL Server Storage Internals Part 1 - Basics , 5.0 out of 5 based on 2 ratings
    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]