Shiloh-Yukon-Katmai Discussion: Federated Servers

To achieve the high levels of performance required by the largest Web sites, a multitier system typically balances the processing load for each tier across multiple servers. SQL Server 2005 shares the database processing load across a group of servers by horizontally partitioning the data in a SQL Server database. These servers are managed independently, but cooperate to process the database requests from the applications; such a cooperative group of servers is called a federation.

A federated database tier can achieve very high levels of performance only if the application sends each SQL statement to the member server that has most of the data required by the statement. This is referred to as collocating the SQL statement with the data required by the statement. Collocating SQL statements with the required data is not a requirement unique to federated servers. It is also required in clustered systems.

Federated Server Tier

There is one instance of SQL Server on each member server.

Each member server has a member database. The data is spread through the member databases.

The tables from the original database are horizontally partitioned into member tables. There is one member table per member database, and distributed partitioned views are used to make it appear as if there was a full copy of the original table on each member server.

The application layer must be able to collocate SQL statements on the member server that contains most of the data referenced by the statement.

Backing Up and Restoring Federated Database Servers

In a federated-database-server tier that is built by using distributed partitioned views, the member servers form one logical unit. Therefore, you must coordinate the recovery of the member databases to make sure that they remain synchronized correctly.

SQL Server 2005 does not require that you coordinate backups across member servers. Backups can be independently taken from each database, without regard for the state of the other member databases. Because the backups do not have to be synchronized, there is no processing overhead for synchronization and no blockage of running tasks.

The most important aspect of recovering a set of member databases is the same as recovering any other database: Plan and test the recovery procedures before you put the databases into production. You must set up processes to restore all the databases to the same logical point in time. SQL Server includes features to support the recovery of all member databases to the same point in time.

Pros

1. Federated servers if implemented correctly are a great way to load balance a database server environment and is very similar to a database farm implementation.

2. This would greatly allow you to distribute the load on your servers based on any of the following criteria:

a. Geographic location

b. Traffic in terms of users

c. Traffic in terms of transactions

d. Database table size

3. Also, federated servers give you the option on partitioning data across servers with the help of distributed partitioned views

4. This gives you the option of horizontally partitioning the data across various servers which ultimately leads to greater throughput

5. It lets you control the traffic coming in and also helps in maintaining the load thresholds across the entire setup

6. Furthermore, if we have a middle tier in the entire setup, then design changes in the federated server environment will not affect the client side applications in any manner as they would be connecting to the middle tier and the middle tier will connect with the database server

Cons

1. One of the major drawbacks is disaster recovery. If one of the member server fails, there needs to a failback plan in place which could cause minimum hindrance to the normal operations while implementation.

2. Also, if distributed partitioning is being implemented, then rules and constraints need to be strong enough to prevent any sort on inconsistencies from arising due to data modification

3. Also, if the performance of one of the member servers takes a hit, most of the distributed partitioned views would also take a hit in terms of operations done on them

4. The middle tier should be designed in such a robust manner that there is no ambiguity in resolving which server in the environment needs to targeted based on the query coming into the server

5. The backup/restore scenarios need to be designed and planned in such a way that all the member servers are all synchronized at all times

Useful Articles

The following whitepaper and TechNet articles should help a great deal here:

Planning, Implementing, and Administering Scaleout Solutions with SQL Server 2005Solution

http://download.microsoft.com/download/4/7/a/47a548b9-249e-484c-abd7-29f31282b04d/ImplementingScaleOut.doc

Scaling Out SQL Server with Data Dependent Routing

http://www.microsoft.com/technet/prodtechnol/sql/2005/scddrtng.mspx

Designing Data Tier Components and Passing Data Through Tiers http://msdn2.microsoft.com/en-us/library/ms978496.aspx