What's new in High Availability with Microsoft SQL Server 2005
The simple definition of the availability of a system is the amount of time a system is up in a given year. A highly available system is one that can potentially mask the effects of the failure and maintain availability to such a degree that end users are not affected by the failure.
With SQL Server 2005, Microsoft has set the bar much higher in its quest to offer a variety of solid highavailability (HA) options. These options, coupled with extensive Windows server family enhancements, provide organizations their first real chance at achieving the mythical "five-nines" (that is, 99.999%) uptime.
Here are a few of the most significant enhancements and new features that have direct or
indirect effects on increasing high availability for a SQL Server 2005–based implementation:
Increased number of nodes in a SQL cluster
You can create a SQL cluster of up
to eight nodes on Windows 2003 Data Center and up to four nodes on Windows
2003 Enterprise Edition.
Ability to do unattended cluster setup
Instead of having to use wizards to set up
SQL clustering, you can use the Unattended Cluster Setup mode. This is very useful
for fast re-creation or remote creation of SQL clustering configurations.
Full SQL Server 2005 services as cluster managed resources
All SQL Server 2005
services like Analysis Services, Notification Services, Reporting Services etc. are cluster aware.
SQL Server 2005 database mirroring
Database mirroring essentially extends the
old log shipping feature of SQL Server 2000 and creates an automatic failover capability
to a "hot" standby server
SQL Server 2005 peer-to-peer replication
a new option of data replication that
uses a publisher-to-publisher model (hence peer-to-peer).
SQL Server 2005 fast recovery
Administrators can reconnect to a recovering database
after the transaction log has been rolled forward (and before the rollback
processing has finished)
Database administrators can perform a restore operation while the
database is still online
The online index option allows concurrent modifications
(updates, deletes, and inserts) to the underlying table or clustered index data and
any associated indexes during index creation time
SQL Server 2005 allows for the generation and use of a readonly,
stable view of a database. The database snapshot is created without the
overhead of creating a complete copy of the database or having completely redundant
Data partitioning improvements
Data partitioning has been enhanced with
native table and index partitioning, which essentially allow you to manage large
tables and indexes at a lower level of granularity
Addition of a snapshot isolation level
A new snapshot isolation (SI) level is being
provided at the database level. With SI, users can access the last committed row,
using a transactionally consistent view of the database
Dedicated administrator connection
SQL Server 2005 introduces a dedicated
administrator connection that administrators can use to access a running server
even if the server is locked or otherwise unavailable. This capability enables administrators
to troubleshoot problems on a server by executing diagnostic functions or
Transact-SQL statements without having to take down the server
What is (RAID) and what are the different levels of RAID (Redundant Array of Inexpensive Disks) in relation to SQL Server 2005?
Redundant array of inexpensive disks (RAID) is used to configure a disk subsystem to provide
better performance and fault tolerance for an application. The basic idea behind using RAID is
that you spread data across multiple disk drives so that I/Os are spread across multiple drives.
RAID has special significance for database-related applications, where you want to spread
random I/Os (data changes) and sequential I/Os (for the transaction log) across different disk
subsystems to minimize disk head movement and maximize I/O performance.
Since multiple disks increases the mean time between failures (MTBF), storing data redundantly also increases fault tolerance.
A RAID appears to the operating system to be a single logical hard disk. RAID employs the technique of disk striping, which involves partitioning each drive's storage space into units ranging from a sector (512 bytes) up to several megabytes.
RAID-0 uses disk striping; that is, it writes data across multiple hard disk partitions in what is called
A stripe set. This can greatly improve speed because multiple hard disks are working at the same
time. You can implement RAID-0 through the use of Windows Server software or third-party hardware.
Although RAID-0 gives you the best speed, it does not provide any fault-tolerance. If one of
the hard disks in the stripe set is damaged, you lose all of your data. Because of the lack of faulttolerance,
Microsoft doesn’t recommend storing any of your SQL Server data on RAID-0 volumes.
RAID-1 uses disk mirroring. Disk mirroring actually writes your information to disk twice —
once to the primary file and once to the mirror. This gives you excellent fault-tolerance, but
it is fairly slow, because you must write to disk twice. Windows Server allows you to implement
RAID-1 with a single controller, or you can use a controller for each drive in the mirror,
commonly referred to as disk duplexing. This is the recommended place for storing your transaction
logs because RAID-1 gives fast sequential write speed (writing data in sequence on the
disk rather than jumping from one empty spot to the next), a requirement for transaction logs.
RAID-5—striping with parity — writes data to the hard disk in stripe sets. Parity checksums
will be written across all disks in the stripe set. This gives you excellent fault-tolerance as well
as excellent speed with a reasonable amount of overhead. You can use the parity checksums
to re-create information lost if a single disk in the stripe set fails. If more than one disk in
the stripe set fails, however, you will lose all your data. Although Windows Server supports
RAID-5 in a software implementation, a hardware implementation is faster and more reliable,
and we suggest you use it if you can afford it. Microsoft recommends storing your data files
on this type of RAID because data files require fast read speed as opposed to transaction logs,
which need fast write speed.
You should use RAID-10 (sometimes referred to as RAID 0+1) in mission-critical systems that
require 24/7 uptime and the fastest possible access. RAID-10 implements striping with parity
as in RAID-5 and then mirrors the stripe sets. So, you get the incredible speed and faulttolerance,
but RAID-10 has a drawback. With this type of RAID you get the added expense
of using more than twice the disk space of RAID-1. Then again, we are talking about a situation
that can afford no SQL Server downtime
Unless you can afford a RAID-10 array, Microsoft suggests a combination of
RAID-5 and RAID-1. In this scenario, you place your data files on the RAID-5
array for speed and redundancy. You place your transaction log files on the
RAID-1 drives so they can be mirrored.
What are filegroups and how can you use them to administer VLDBs (SqlServer 2005)?
What are the advantages and disadvantages of using filegroups in sqlserver 2005 ?
A filegroup is a logical grouping of database files used for performance and to improve administration on very large databases (VLDBs)—usually in the hundreds of gigabyte or terabyte range.
The two basic filegroup types in SQL Server 2005 are the primary, or default, filegroup that is
created with every database and the user-defined filegroups created for a particular database.
The primary filegroup will always contain the primary data file and any other files that are not
specifically created in a user-defined filegroup. You can create additional filegroups using the
ALTER DATABASE command or Management Studio.
Filegroups have several rules you should follow when you are working with them:
- The first (or primary) data file must reside in the primary filegroup.
- All system files must be placed in the primary filegroup.
- A file cannot be a member of more than one filegroup at a time.
- Filegroups can be allocated indexes, tables, text, ntext, and image data.
- New data pages are not automatically allocated to user-defined filegroups if the primary
filegroup runs out of space.
- Using filegroups, you can explicitly place database objects into a particular set of database files. For example, you can separate tables and their nonclustered indexes into separate filegroups. This can improve performance, because modifications to the table can be written to both the table and the index at the same time. This can be especially useful if you are not using striping with parity (RAID-5).
- Another advantage of filegroups is the ability to back up only a single filegroup at a time.
This can be extremely useful for a VLDB, because the sheer size of the database could make
backing up an extremely time-consuming process.
- Yet another advantage is the ability to mark the filegroup and all data in the files that are part of it as either read-only or read-write.
There are really only two disadvantages to using filegroups.
- The first is the administration that is involved in keeping track of the files in the filegroup and the database objects that are placed in them.
- The other is that if you are working with a smaller database and have RAID-5 implemented, you may not be improving performance.
If you place tables in one filegroup and their corresponding indexes in a different
filegroup, you must back up the two filegroups as a single unit—they
cannot be backed up separately.
When to use @@IDENTITY, SCOPE_IDENTITY, and IDENT_CURRENT ?
@@IDENTITY, SCOPE_IDENTITY, and IDENT_CURRENT are similar functions because they all return the last value inserted into the IDENTITY column of a table.
To give you a quick overview....
» @@IDENTITY returns the last identity value generated for any table in the current session, across all scopes.
» SCOPE_IDENTITY returns the last identity value generated for any table in the current session and the current scope.
» IDENT_CURRENT is not limited by scope and session; it is limited to a specified table. IDENT_CURRENT returns the identity value generated for a specific table in any session and any scope.
After an INSERT, SELECT INTO, or bulk copy statement is completed, @@IDENTITY contains the last identity value that is generated by the statement. If the statement did not affect any tables with identity columns, @@IDENTITY returns NULL. If multiple rows are inserted, generating multiple identity values, @@IDENTITY returns the last identity value generated. If the statement fires one or more triggers that perform inserts that generate identity values, calling @@IDENTITY immediately after the statement returns the last identity value generated by the triggers. If a trigger is fired after an insert action on a table that has an identity column, and the trigger inserts into another table that does not have an identity column, @@IDENTITY returns the identity value of the first insert. The @@IDENTITY value does not revert to a previous setting if the INSERT or SELECT INTO statement or bulk copy fails, or if the transaction is rolled back.
It returns the last identity value inserted into an identity column in the same scope. A scope is a module: a stored procedure, trigger, function, or batch. Therefore, two statements are in the same scope if they are in the same stored procedure, function, or batch.
For example, there are two tables, T1 and T2, and an INSERT trigger is defined on T1. When a row is inserted to T1, the trigger fires and inserts a row in T2. This scenario illustrates two scopes: the insert on T1, and the insert on T2 by the trigger.
Assuming that both T1 and T2 have identity columns, @@IDENTITY and SCOPE_IDENTITY will return different values at the end of an INSERT statement on T1. @@IDENTITY will return the last identity column value inserted across any scope in the current session. This is the value inserted in T2. SCOPE_IDENTITY() will return the IDENTITY value inserted in T1. This was the last insert that occurred in the same scope. The SCOPE_IDENTITY() function will return the null value if the function is invoked before any INSERT statements into an identity column occur in the scope.
IDENT_CURRENT returns the last identity value generated for a specific table in any session and any scope. It returns NULL when the function is invoked on an empty table or on a table that has no identity column.
I hope this cleared some of your doubts on using identity functions. I suggest you look at books online for more information and examples.
What are extents and pages in the context of SQL Server 2005 ? What are the different types extents and pages in SQL Server 2005 ?
An extent is a block of eight pages totaling 64KB in size. Because the extent is the basic unit
of allocation for tables and indexes and all objects are saved in a table of some kind, all objects
are stored in extents.
SQL Server has two types of extents:
» Uniform: In uniform extents, all eight pages are used by the same object.
» Mixed:Mixed extents are used by objects that are too small to take up eight pages, so more
than one object is stored in the extent.
When a table or an index needs additional storage space, another extent is allocated to that
object. A new extent will generally not be allocated for a table or index until all pages on that
extent have been used. This process of allocating extents rather than individual pages to objects
serves two useful purposes.
First, the time-consuming process of allocation takes place in one batch rather than forcing
each allocation to occur whenever a new page is needed. Second, it forces the pages allocated
to an object to be at least somewhat contiguous. If pages were allocated directly, on an asneeded
basis, then pages belonging to a single object would not be next to each other in the
data file. Page 1 might belong to table 1, page 2 might belong to index 3, page 3 might belong
to table 5, and so on. This is called fragmentation. Fragmentation can have a significant negative impact on performance. When pages for a single object are contiguous, though, reads and writes can occur much more quickly.
At the most fundamental level, everything in SQL Server is stored on an 8KB
page. The page is the one common denominator for all objects in SQL Server. Many types of pages exist, but
every page has some factors in common. Pages are always 8KB in size and always have a
header, leaving about 8,060 bytes of usable space on every page.
SQL Server has eight primary types of pages:
Data pages hold the actual database records. The data page is 8,192 bytes, but
only 8,060 of those bytes are available for data storage because a header at the beginning of
each data page contains information about the page itself. Rows are not allowed to span more
than one page, but if you have variable-length columns that exceed this limit, you can move
them to a page in the ROW_OVERFLOW_DATA allocation unit.
Index pages store the index keys and levels making up the entire index tree. Unlike
data pages, you have no limit for the total number of entries you can make on an index page.
Text and image pages hold the actual data associated with text, ntext, and
image datatypes. When a text field is saved, the record will contain a 16-byte pointer to a
linked list of text pages that hold the actual text data. Only the 16-byte pointer inside the
record is counted against the 8,060-byte record-size limit.
Global Allocation Map pages
The Global Allocation Map (GAM) page type keeps track of
which extents in a data file are allocated and which are still available.
Index Allocation Map pages
Index Allocation Map (IAM) pages keep track of what an
extent is being used for—specifically, to which table or index the extent has been allocated.
Page Free Space pages
This is not an empty page; rather, it is a special type of page that keeps
track of free space on all the other pages in the database. Each Page Free Space page can keep track
of the amount of free space of up to 8,000 other pages.
Bulk Changed Map pages
This page contains information about other pages that have been
modified by bulk operations (such as BULK INSERT) since the last BACKUP LOG statement.
Differential Changed Map pages
This page contains information about other pages that
have changes since the last BACKUP DATABASE statement.
What Recovery Models are available in SQl Server 2005 ?
How does Recovery Model options affect database backups and recovery ?
A recovery model is a database configuration option that controls how transactions are
logged, whether the transaction log is backed up, and what restore options are available
for the database. The recovery model you choose for your database has both datarecovery
implications and performance implications, based on the logging the recovery
model performs or doesn’t perform.
SQL Server 2005 provides three recovery models for databases: Full, Simple, and
Bulk-Logged. These models determine how SQL Server works with the transaction
log and selects the operations that it logs and whether it truncates the log. Truncating
the transaction log is the process of removing committed transactions and leaving log
space to new transactions.
The following is a definition of each recovery model:
The transaction log is used for very little in this recovery model. In fact, almost nothing
is recorded in the log. This means any database set to use this model can be recovered only up to the last backup. Any changes made to your database after the last backup was performed will be lost because they are not recorded in the transaction log. This model is a good choice for development databases where most data is test data that does not need to be restored after a crash, but is not appropriate for databases in which the loss of recent changes is unacceptable. It is also a good choice for databases that are not changed often, such as an OLAP database.
This model records much more information in the transaction log than the
Simple model. Bulk operations such as SELECT INTO, BCP, BULK INSERT, CREATE INDEX, and text and ntext operations are the only information not recorded. This means you can recover most of the data in the event of a crash; only bulk operations may be lost. You can set this option just before performing a bulk-insert operation to speed up the bulk insert. You need to back up your database immediately after performing bulk operations if this option is selected because everything that is inserted during this time is not in the transaction log so it will all be lost if the database crashes before the next backup.
This is the default option, which records every operation against the database in the
transaction log. Using this model, you will be able to recover your database up to the minute of a crash. This is a good option for most production databases because it offers the highest level of protection.