inset
SQL Server 2000 Steers Toward Enterprise
Sep. 4, 2000

Scheduled for launch Sept. 26, SQL Server 2000 (SQL 2000) will emerge in two editions—Standard and Enterprise—that roll up many small improvements requested by users over the last two years. The high-end Enterprise Edition (EE) leverages the new breed of Windows hardware to boost performance and availability and introduces a new load-balancing technology that could make big gains in scalabiilty once it has usable supporting tools. Both EE and Standard Edition have useful new features to simplify management of large databases and large numbers of servers. The catch: higher prices for EE and Web customers.

Market Background

SQL 2000 builds on SQL Server 7.0 (SQL 7), a major rewrite of the product that shipped in Nov. 1998. SQL 7 has done well, turning in a 35% boost in revenue for fiscal year 2000 and winning highly visible Web customers such as Barnesandnoble.com and Buy.com. However, it hasn't dented the lead of IBM and Oracle, which kept a combined share of 60% from 1998 through 1999, based on Dataquest estimates of new database license revenue. By comparison, Microsoft's estimated share in 1999 was 13%, up from 10% the previous year.

Prices explain some of the revenue gap: SQL Server license fees are lower on average than those of its competitors. There's a more important factor, however. "We're focused on Windows, and obviously, that has an effect," observes Jeff Ressler, lead product manager for SQL Server at Microsoft. Because SQL Server runs only on the Windows platform, it has to fight the perception—fair or not—that the platform isn't reliable or scalable enough for large, mission-critical databases.

SQL 2000 EE's mission is to help shake that perception, by

  • Breaking into high-performance territory previously occupied by databases on Unix and mainframe platforms
  • Removing limitations in its failover, backup, and recovery systems that made it hard to provide high levels of availability, particularly for very large databases
  • Offering replication and administration tools that for companies managing tens or hundreds of servers.

Performance and Scalability

SQL 2000 EE has turned in impressive benchmark numbers compared to its competition and to previous versions of the software (see the chart "SQL Server Enters Race for Performance"). It gets those numbers by exploiting improvements in Windows 2000 on multiprocessor machines and by yoking multiple machines together with a new, experimental load-balancing technology. SQL 2000 also makes important gains in performance through indexing.

Exploiting Improvements in Machines

SQL 2000 EE makes better use of multiple processors than in previous versions, for tasks such as

  • Database scans (e.g., sums and searches through unordered database tables)
  • Creation of indexes, an important operation in large databases that are frequently updated or in data warehouses that are periodically loaded with large amounts of new data in bulk
  • Database consistency checks, which administrators run periodically to detect database corruption.

Moreover, SQL 2000 EE will be one of the first applications to benefit from Windows 2000 Datacenter Server, which will ship with high-end servers this year (see "Windows 2000 Datacenter Targeting High-End Unix Servers" on page 3 of the June 2000 Update). On Datacenter, SQL supports up to 32 processors and 64GB of main memory. When run on Datacenter, SQL 2000 also supports Virtual Interface System Area Networks (such as Compaq ServerNet and GigaNet cLAN), which offer high bandwidth and offload most network processing from the CPU to the network interface card. In benchmark tests, these network technologies yielded 15% to 25% performance improvements over conventional high-speed networks.

Load Balancing

SQL 2000 EE introduces a new technology called Distributed Partitioned Views for balancing loads among multiple database servers. This database-specific technology is completely independent of the Windows Cluster Service, which doesn't support load balancing and is mainly used to limit downtime due to failures. The new technology works by sharing queries and updates among the machines in the cluster (see the illustration "Balancing Load with Distributed Partitioned Views").

Benchmark results suggest that SQL Server load balancing can provide very high levels of scalability, increasing throughput almost linearly with the number of participating machines. For example, on the Transaction Processing Council's TPC-C order entry benchmark, adding four machines to an eight-machine cluster (increasing cluster size and system cost by 50%) increased throughput by 46%. According to Microsoft, two major customers have deployed Distributed Partitioned Views in production Web sites.

However, Distributed Partitioned Views won't be practical for many customers, because SQL 2000 lacks management and development tools for the technology. To set up an eight-machine cluster, for example, an administrator must manually install eight different instances of a database. Adding an additional machine to the cluster means repeating this task, on all machines. Also, to get performance near that of the benchmarks, companies must rewrite some application code to exploit load balancing. Hence, Microsoft Senior Vice President Paul Flessner has called SQL 2000 a "first installment" on scalability through load balancing; many customers will wait until SQL Server Yukon ships (some two years from now) to realize the benefits promised by Distributed Partition Views.

Indexing Improvements

SQL 2000 makes many other performance improvements. Two of the most important include the following:

Indexed views. SQL 2000 EE supports indexed views, tables that cache the results of a query and update that cache as the underlying data change. Similar to the "materialized views" supported in competing database servers, indexed views will particularly benefit reporting and data warehousing applications, which frequently issue complex queries that summarize large volumes of infrequently updated data.

Indexes on computed columns. SQL 2000 can index database columns whose values are computed from other columns, such as an "extended price" column that is the product of quantity and unit price columns. This could particularly benefit sorting and interactive searches (e.g., locating the largest single order for a particular item). SQL 2000 Standard Edition also supports this feature.

Limiting Downtime

To speed recovery from failures, SQL EE works more effectively with the Windows Cluster Service than previous versions and offers "log shipping" tools for automatically maintaining standby database servers. To limit planned downtime due to maintenance, both SQL EE and Standard Edition provide faster, less-intrusive backup and consistency checking.

Of course, the best way to limit downtime is to avoid failures altogether. It's hard to tell if SQL 2000 is better than its predecessors in this respect, but customers such as TV Guide Online have already used beta versions in production systems, and Microsoft itself is running 40 internal applications (including its Enterprise Resource Planning system) on beta 2.

Speeding Recovery from Failures

SQL 2000 EE makes improvements to clustering and introduces complementary tools for maintaining standby servers.

Automating failover with clustering. SQL 2000 EE eliminates many bugs and limitations of SQL 7 in supporting the Windows Cluster Service. Cluster Service is a quick restart (failover) solution for a cluster of machines; when one machine in the cluster fails, its applications restart on another. Machines in the cluster must share a disk system and use specially designed network and server hardware.

SQL Server has supported Cluster Service for some time, but as Microsoft Program Manager Richard Waymire concedes, "It’s just too hard in 6.5 and 7.0." SQL 2000 introduces several features to make it easier:

  • Administrators can apply SQL Server service packs and hotfixes to a cluster without "unclustering" the servers, a time-consuming and error-prone chore.
  • Most SQL Server services and tools work properly in clusters, including full-text indexing, replication, and administration tools such as Query Analyzer.
  • Setup automatically sets up clustering when it is run in a cluster; administrators had to use a separate Cluster Wizard in previous versions.
  • The setup program and the standard database administration console (Enterprise Manager) can perform cluster administration tasks, such as removing and rebuilding failed servers; these tasks required separate cluster administration tools in earlier versions.

SQL 2000 EE also supports clusters of up to four machines on Windows 2000 Datacenter Server; previous versions are limited to two.

Maintaining standby servers with log shipping. SQL 2000 EE incorporates tools for log shipping, which maintains a copy of a production database on a standby machine by periodically applying production logs to the standby. Compared to Cluster Service, failover is slower and less automatic, because administrators must manually put the standby into production after a failure. However, log shipping doesn't require the production and standby server to share a disk (or even be on the same network) and doesn't require special cluster-ready hardware.

SQL 7 has tools for log shipping that Microsoft has used internally and shipped (unsupported) in the Back Office Resource Kit, but SQL 2000 EE's tools are integrated into the administrative console and are substantially more automated.

Scaling Up Maintenance

In both Enterprise and Standard Editions, SQL 2000 uses new algorithms and better hardware support to speed backup and reduce its impact on operations. It introduces a new compressed logging mode that can reduce logging overhead while still allowing full recovery from many failures, and it offers optimized database checks and online index rebuilds.

Faster differential backup. SQL 2000 uses new backup algorithms to speed differential backups (which record just the pages of the database that have changed since the last full backup). In addition, it supports differential backups on specific segments of a database (specific data files or file groups); previous versions could only do differential backups on entire databases.

Support for enterprise storage systems. SQL 2000's backup utilities work better with enterprise storage systems, large-capacity, high-speed disk arrays that provide numerous features for reliability and disaster recovery. In particular, SQL 2000 makes better use of "business continuation volume" (BCV) features, which enable administrators to quickly mirror a disk volume while it's in use by a database server, detach the mirror (the BCV) from the production server, and make a backup from the BCV on a secondary server. Previous versions of SQL Server worked with BCVs on storage systems such as EMC's Symmetrix and Compaq StorageWorks, but didn't provide all the features that Oracle does. Now, according to Bill Loughlin of EMC, "All the functionality available on competitive databases is available on SQL Server, in terms of gross capabilities."

For example, SQL 2000 can recover a database from a BCV on EMC's Symmetrix; previous versions could only use BCVs for backup and had to recover from slower backup media such as tape. On Compaq's StorageWorks systems, SQL 2000 can use that hardware's "snapshot" virtual copy operation to create a backup of a live database in seconds; previous versions could take snapshots only while the production server was offline. On both types of hardware, SQL 2000's tools for scheduling backups and tracking backup history will recognize the creation of a valid BCV as a full backup, something earlier versions of the tools wouldn't do.

Log compression, simplified logging. SQL 2000 introduces a new "bulk logging" option that compresses log records for operations that move large amounts of data (e.g., creation of indexes, batch data imports, writes of documents or images). Bulk logging provides a useful compromise between fully logging bulk operations (which slows down the database and bloats the log file) or not logging at all (which can lead to data loss if a server crashes).

Bulk logging only tracks which database pages change during a bulk operation, not the contents of those pages. It records the contents later, when the log is backed up. With bulk logging, sites can recover from some kinds of failures (database or OS crashes) without losing work. However, failures that corrupt database pages on disk (e.g., disk failure) can destroy work because bulk operations don't log enough information to reconstruct lost pages. Full logging is still available for sites that can't tolerate this risk.

SQL 2000 also simplifies the database options used to control logging levels. There's a single database setting that selects among three logging options: full (everything fully logged), bulk, and simple (minimal logging, which doesn't support log backup). Sites can switch levels while online. Earlier versions controlled logging using complex and confusing combinations of several options.

Faster consistency checking and index maintenance. In SQL 2000 the database consistency checking (DBCC) utility can run without locking entire database tables, thus interfering less with an online database. DBCC has a new, fast "low-level" checking mode that doesn't exhaustively check database structure, but that can still catch most hardware-caused corruption. DBCC can also reorganize (defragment) indexes while the database is online. This can be much faster and less intrusive than dropping and re-creating indexes, if the indexes haven't changed much.

Management

SQL 2000 Standard and Enterprise have many new features to help large organizations and application service providers that need to centrally maintain large collections of servers. SQL 2000 delivers many extensions to the replication system that synchronizes data among servers (see "Replication Improvements"). It also enables companies to install multiple servers on the same machine and makes many small improvements to help administer large numbers of machines.

Multiple Servers on a Machine

SQL 2000 is the first version of SQL Server to allow multiple independent installations of the software (called instances) on a single machine. On previous versions of SQL EE, multiple instances might run on a machine after a Cluster Service failover, but administrators can't explicitly install two copies of SQL Server side-by-side on the same machine.

Multi-instance support has several applications:

  • Companies can install and test new versions of a database on the same machine as old versions, by giving each version its own instance.
  • Companies can test new SQL Server versions side-by-side with the old: one or more SQL 2000 instances can be installed alongside one 6.5 or 7.0 instance on a given machine.
  • Application service providers can give subscribers their own instances, allowing each subscriber to have its own security database and server settings while sharing a machine with other subscribers.

Microsoft itself has used the multi-instance feature to completely rewrite SQL EE's support for Cluster Service, enabling many of the failover improvements discussed above.

Many Machines

Among SQL 2000's many improvements are features for coordinating backup and recovery, locating services, and copying servers and databases among multiple machines.

Coordinating recovery with restore-to-mark. In SQL 2000, applications can write named transactions (called marks) into the transaction log; administrators can later restore the database to the state before or after a mark. This is useful for recovering quickly from large-scale or risky operations that went wrong (e.g., restore to the point before an unintended run of an end-of-month archiving script). It can also be used to coordinate backup and restoration among multiple servers: all servers involved in an operation can write the same mark into their logs, and administrators can restore all servers to this mark to get them all back to the same logical point. For example, a company might use this feature to restore all servers in a cluster back to the start of monthly posting, or to restore separate human resources and payroll databases to the point before an across-the-board salary increase.

Installation disk imaging. SQL 2000 works correctly with disk imaging; administrators can use an imaging utility to copy a server installation onto a CD and put it on another machine. Imaging provides a quick way to create many identical servers. Previous versions of SQL Server don't work with imaging; a server will refuse to start up on a machine that has a different name than the one on which the server was initially installed.

Locating resources with Active Directory. Users and applications can browse a network for SQL Server machines, databases, instances, and publications (replicated data sets) using the Windows 2000 Active Directory service. This could prove useful for administrators who deal daily with large numbers of servers. Previous versions allow searching only for machines, and use the less-scalable Windows name service (WINS). SQL 2000 can still use WINS, however, since it's the only way to locate servers in NT networks.

Copying and moving databases. SQL 2000 introduces a Copy Database Wizard. This wizard automates the complex sequence of steps required to copy or move a database from one server to another (e.g., to initialize a standby server).

Pricing and Licensing

Many of the features described above are exclusive to SQL 2000 EE, which illuminates another part of its mission: to better differentiate EE from the lower-priced Standard Edition and to justify a new, higher price tag. SQL 2000 is also more expensive for typical Web uses than earlier versions. For details, see "Some Windows 2000 Server Products Get Per-Processor Pricing" on page 10 of the Aug. 2000 Update. Some relevant points from that article include the following:

Price increase for EE. The estimated retail price (ERP) for SQL 2000 EE is US$11,099 with 25 Client Access Licenses (CALs). The corresponding price for SQL 7 EE is US$8,009.

Web price increase. Companies using SQL Server on Web sites must buy one or more per-processor licenses for their servers. A single per-processor license is US$4,999 for SQL 2000 Standard and US$19,999 for EE (estimated retail). This is more expensive than the now-discontinued Internet Connector license, which cost US$2,999 per processor. On the other hand, per-processor licenses cover all users, whereas an Internet Connector license didn't cover employees or partners of the company that owned it.

Prices for volume purchasers will be lower, but they will still see increased prices over SQL 7.

Companies should note some important requirements in SQL 2000 licenses:

Cluster Service license reduction. SQL 2000 EE reduces the number of server licenses required in many cluster sites compared to earlier versions. Specifically, a SQL 2000 cluster requires only enough licenses to cover its "active" servers (servers that are accepting connections from applications, as opposed to "passive" servers acting as standbys). With SQL 7 and earlier, all servers in a cluster, active or not, require a license.

Multiple instance licensing. A SQL 2000 EE server license covers all instances installed on a single machine. A SQL 2000 Standard license, in contrast, covers only one instance.

Resources

General Resources

An overview of SQL 2000 for developers is at http://msdn.microsoft.com/msdnmag/issues/0800/sql2000/sql2000.asp.

Five useful video presentations covering SQL 2000 appear at http://msdn.microsoft.com/training/seminars/DataAccess.asp.

A series of six hands-on labs for SQL 2000 are at www.microsoft.com/SQL/productinfo/handson.htm.

For a list of all of SQL 2000's new features, see Microsoft's evaluation guide at www.microsoft.com/sql/productinfo/sql2krev.htm.

Web and XML features of SQL 2000 are covered in "XML Hooks SQL Server into Next Generation Applications" on page 8 of the June 2000 Update.

Microsoft's information site for SQL Server is www.microsoft.com/sql.

Performance and Scalability

A technical article on Distributed Partitioned Views and their use in the TPC-C benchmark system is at http://msdn.microsoft.com/msdn-online/start/features/highperform.asp.

A summary of all SQL Server benchmarks is at www.microsoft.com/sql/productinfo/LOBbench.htm. For the benchmark results with SQL 2000 and the SAP Enterprise Resource Planning system, see www.sap.com/solutions/technology/pdf/50020428.pdf.

Limiting Downtime

The Microsoft IT group's practices for limiting SQL Server downtime, including log shipping, are described at www.microsoft.com/technet/showcase/itops/availsql.asp.

For a case study of Cluster Services failover on SQL 2000, see http://msdn.microsoft.com/voices/sampleapp05182000.asp.

An introduction to SQL 2000's new logging features is in http://msdn.microsoft.com/library/periodic/period00/dbRecovery.htm.

For further information on the EMC storage products, see www.emc.com/products.

For further information on Compaq's StorageWorks systems, see www.compaq.com/storage.

Management

A detailed explanation of multiple instance support is at http://msdn.microsoft.com/library/periodic/period00/MultipleInstances.htm.

Pricing and Licensing

For a comprehensive review of Microsoft's server pricing and licensing, see "Some Windows 2000 Server Products Get Per-Processor Pricing" on page 10 of the Aug. 2000 Update.