Windows .NET Server Supports Enterprise Storage
Nov. 18, 2002
Support for networked storage in Windows .NET Server 2003 could make it cheaper and less risky for Windows organizations to manage storage, speed backups, and provide disaster recovery. Centralized storage in external disk arrays accessed over a high-speed network has been supported by Unix vendors since the 1990s and is poised for further growth. Even though Windows-based networked storage is the most rapidly growing segment, Microsoft is a latecomer to the market. The company's increasing support for the technology buttresses its claim that it is serious about enterprise-class server operating systems (OSs).

However, organizations considering networked storage for Windows servers must still overcome its high upfront costs, substantial learning curve, and potential dependence on non-interoperable vendor hardware and software. The cost of networked storage is dropping, and many storage vendors and industry analysts feel that when the economy recovers, networked storage will be widely adopted, even among mid-size organizations currently priced out of the market. Storage vendors that partner with Microsoft stand to gain from this growth.

Benefits of Networked Storage

Windows .NET Server adds support for both of the technologies that dominate networked storage today: storage area networks (SANs) and network-attached storage (NAS). (For an overview of these technologies, see the illustration "What Is Networked Storage?".) These technologies augment or replace direct-attached storage (DAS), that is, disk drives connected directly to and used exclusively by a single computer.

Organizations are turning to SANs and NAS to support growing storage requirements. A recent Storage Magazine survey of 481 IT managers found that 57% expect to increase spending on storage technology in 2003. Asked about the storage technology to be purchased, 47% said that their primary spending will be for SANs, and 20% said that it will be for NAS.
The primary reasons for this shift are as follows:

Support for snapshots. Snapshots are point-in-time copies of entire disk volumes. Until Windows .NET Server and Windows XP, native DAS-based snapshot support in an OS was rare. In contrast, nearly all SAN and NAS vendors support snapshots (although often at extra charge). Snapshots make it possible for multiple backups to be run during the workday without impacting live applications. When used in conjunction with networked storage, the snapshot can be transferred to tape storage or secondary disk storage without impacting the performance of the application server that created it. Some applications can even restore data from snapshots, in many cases without interrupting the application’s service to other users. Snapshots also make it possible to instantly get a replica of data for data warehousing or testing by another server.

However, an important caveat applies that could nullify these benefits: if applications are unaware that a snapshot is occurring, the snapshot might miss parts of transactions, which could make some of the snapshot data unusable. For this reason, some application vendors might not support use of snapshot-created data. Applications that are "snapshot aware," by contrast, can avoid this problem by briefly pausing while snapshots are taken.

Dynamic storage expansion and reallocation. Most SAN and NAS technologies virtualize storage so that disk volumes can be expanded dynamically without disrupting file system access. With DAS, unused capacity cannot easily be shifted to other servers that need additional storage. Growing a DAS storage volume usually requires backing up the volume to tape, adding or replacing disk drives, reformatting the disks, and then restoring the data. This is labor-intensive and takes the server out of service for an extended period.
Furthermore, current DAS systems simply cannot scale up to terabytes of storage, as is sometimes required for large databases, such as one that contains point-of-sale history data. With networked storage, in contrast, disks can be added to expand storage volumes without bringing down the storage unit, and storage can be shifted from one volume to another as needs dictate. This reduces the likelihood that a company would purchase more storage than it really needs.

Fault tolerance. Although hardware RAID controllers and redundant power supplies make DAS more fault-tolerant, they are per-server expenses. SAN- or NAS-based storage is generally more expensive per byte than DAS, but all connected servers get the benefit of the networked storage’s fault tolerance features. Failed disk drives can be replaced without interrupting storage availability, and many SAN and NAS systems will do this automatically from a pool of hot spares.

Multiple storage access paths for backup and clustering. Networked storage makes it practical for multiple servers to access the same stored data over multiple paths. With DAS, the only access path to stored data is via the server that controls it, and this path can become a bottleneck. For example, during backup the host server and the applications running on it can be heavily impacted as the server transfers all data from the disk to the network. With networked storage, in contrast, a backup server can back up data without impacting application servers. In clustering schemes that require one server to take over disk access from another (such as in Microsoft’s Cluster Service), it is impractical to directly connect two or more computers (each with its own fault-tolerant disk controller) to the same set of disks because of caching, state management, and configuration control issues.
However, SAN-based clusters allow multiple servers to connect to the same volume so that one server can take over access to the volume if another fails or is taken offline.

Basis for disaster recovery. Many networked storage vendors offer a way to replicate critical data (in some cases, current to the latest transaction) to backup disk storage at a geographically separate data center, enabling organizations to quickly recover from a catastrophe at the primary data center. Although organizations keep backup tapes of critical business data offsite, in the event of catastrophe it could take weeks or months to restore these tapes to another site. Even then, the data is only current to the date of the last tape set sent offsite. Many businesses could not survive prolonged outages of this nature. A much better solution is to replicate each disk write request to the disaster recovery site. This cannot be done with DAS, but it is available in networked storage solutions. Standby servers at the recovery site can access the replicated data in the event of a disaster at the primary site. With a SAN, one replication technology and communications link can service dozens of servers, which makes the whole prospect much less expensive and far more manageable.

Overall, the added flexibility, manageability, and availability of networked storage can justify its higher upfront costs. Also, as the number of servers sharing the networked storage grows, the cost per server shrinks. However, for this approach to work, an organization must have a high degree of computing centralization, which may require higher-capacity WANs to serve geographically separated sites.

Storage Improvements in Windows .NET Server

Windows .NET Server brings many new storage-related improvements, including the following:
Volume Shadow Copy Service Enables Snapshots

Windows .NET Server includes a new service and API, called the Volume Shadow Copy Service (VSS), that lets applications, OS services, storage devices, and backup applications work together so that clean "shadow copies" (i.e., snapshots) can be made by most storage technologies with only a brief pause to applications. VSS provides an interface to communicate with hardware-based snapshot technologies from networked storage vendors such as Dell, EMC, and Hewlett-Packard. (See the illustration "Volume Shadow Copy Service".)

If the only way to obtain snapshots were to freeze and split off an entire mirrored disk volume, each snapshot would require huge amounts of storage for comparatively tiny changes to the data. However, Microsoft and most of its storage vendor partners implement snapshots in a much more space-efficient way by keeping copies of only those storage blocks that change following the snapshot. In this way, snapshots only slightly increase total storage requirements.

When a VSS "requestor," such as a local backup program agent, signals that it needs to take a snapshot, any VSS-aware application (such as SQL Server or the next release of Exchange) will complete its current transactions and pause. VSS then signals the storage device to take the snapshot (which completes within a few seconds), following which the application can resume. The backup program can then perform a backup of the snapshot data with no further interference with the live application.

Microsoft claims that building this coordination mechanism into the OS is a first for any platform, including advanced Unix systems. Prior to VSS, getting an application, such as an Oracle or SQL Server database, to coordinate a data snapshot with a vendor’s SAN device required running additional intermediary agents for each particular application and storage device.
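The two ideas described above (keeping copies of only the blocks that change after a snapshot, and pausing applications briefly while the snapshot is taken) can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the VSS API: the names `CowVolume`, `ToyWriter`, and `coordinated_snapshot` are hypothetical and were invented for this illustration, and a real provider works on disk blocks rather than Python lists.

```python
class CowVolume:
    """Toy block volume with copy-on-write snapshots (hypothetical, illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, one entry per block
        self.snapshots = []          # each snapshot maps block index -> preserved old contents

    def snapshot(self):
        """Take a snapshot. No data is copied yet, so this is nearly instant."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        """Overwrite a live block, first preserving its old contents for any
        snapshot that has not already saved that block."""
        for saved in self.snapshots:
            saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        """Read a block as it was when the snapshot was taken."""
        return self.snapshots[snap_id].get(index, self.blocks[index])


class ToyWriter:
    """Stand-in for a snapshot-aware application (hypothetical)."""

    def __init__(self, volume):
        self.volume = volume
        self.pending = []            # writes belonging to in-flight transactions
        self.paused = False

    def begin(self, index, data):
        self.pending.append((index, data))

    def flush_and_pause(self):
        """Complete current transactions, then stop issuing writes."""
        for index, data in self.pending:
            self.volume.write(index, data)
        self.pending.clear()
        self.paused = True

    def resume(self):
        self.paused = False


def coordinated_snapshot(volume, writers):
    """Mimic the sequence in the text: applications finish and pause,
    the storage takes the snapshot, and the applications resume."""
    for writer in writers:
        writer.flush_and_pause()
    try:
        return volume.snapshot()
    finally:
        for writer in writers:
            writer.resume()


if __name__ == "__main__":
    vol = CowVolume(["a", "b", "c"])
    app = ToyWriter(vol)
    app.begin(0, "A")                       # transaction in flight when backup starts
    snap = coordinated_snapshot(vol, [app])
    vol.write(1, "B")                       # live volume keeps changing afterwards
    print("live:", vol.blocks)
    print("snapshot:", [vol.read_snapshot(snap, i) for i in range(3)])
    print("blocks preserved for snapshot:", len(vol.snapshots[snap]))
```

In this sketch the snapshot stores only the one block that changed after it was taken, which is why space-efficient snapshots add little to total storage. The in-flight write is flushed before the snapshot, so the snapshot contains whole transactions only.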
VSS includes a graphical management tool (see the screen shot "Creating Shadow Copies") that includes a scheduler to create timed snapshots.

In addition to significantly improving backups and restores, VSS provides an end-user benefit. Windows .NET Server includes a Windows Explorer extension that can be installed on Windows XP workstations, allowing end users to access shadow copies of files located on shared .NET Server directories (see the screen shot "End-User Access to Shadow Copies"). Although users cannot initiate snapshots, users with the necessary permissions can restore files to previous point-in-time copies without involving IT personnel.

Although networked storage enhances the value of snapshots, .NET Server also includes a provider that allows VSS to work with conventional DAS, allowing smaller organizations that cannot afford SANs or NAS to get some of the benefits of snapshots.

Virtual Disk Service

Windows .NET Server introduces a new storage management API called the Virtual Disk Service (VDS) that gives .NET Server’s Windows Disk Manager utility and third-party management applications a standard hardware-independent way to communicate with any storage device that has an associated VDS provider.

One of the problems with SANs and NAS is that each hardware vendor provides its own method for managing storage. Although some high-end storage arrays have a separate management processor and console, others use commands sent from a management application running on a server using the storage array. Each approach uses proprietary technologies, making it difficult to create storage management applications that work uniformly across multiple storage vendors.
VDS helps rectify this problem by exposing a common storage management interface, which will enable storage ISVs to automate common storage configuration operations, such as disk formatting, binding or breaking a mirror, or dynamically growing storage volumes when needed as users or applications consume disk space.

Windows .NET Server will also support a standard API for host bus adapters (HBAs), the adapters servers use to connect to SANs. This Microsoft HBA API is based on one being developed by the Storage Networking Industry Association (SNIA) and will enable management of SAN communication devices, such as Fibre Channel switches. Microsoft will provide a dynamic-link library (DLL), and the HBA vendors will write drivers that interface with it via the Windows Management Instrumentation (WMI) interface.

Other Network Storage Improvements

In addition to VSS and VDS, Windows .NET Server contains several other SAN and NAS enhancements, such as the following:

SAN boot. Windows .NET Server has an option to boot from a SAN-based storage volume. This means that all data pertinent to a server can be stored on, and backed up from, the SAN. In addition to giving OS files the same storage benefits as application files, this has important implications for disaster recovery: if the SAN data is continuously replicated to storage at another site, then if something happens to the primary server, a backup server can be pointed at the corresponding replica’s volume and boot up (as long as certain critical hardware in the backup server, such as its network card and HBA, is the same as the primary server's). Previously, with the exception of a few OEM-specific implementations of Windows 2000 SANs (unsupported by Microsoft), Windows servers could not boot from a SAN and needed a small local disk drive on each server for the Windows OS and the paging file.
This could create a problem if the data on the system disk were lost or corrupted, because the system disk stores the Registry and many other files critical to restoring the system and applications. If this information wasn’t properly backed up, restoring the OS and applications was difficult or impossible. Using a hardware redundant array of independent disks (RAID) controller and disk mirroring on each SAN-connected server just to make its system volume more fault tolerant adds expense that could be avoided by dispensing with local storage entirely and simply booting the servers off SAN-based system volumes. Although Microsoft will support booting .NET Server from a SAN, certain caveats apply: it requires vendor support and has been tested only with specific hardware combinations.

Multipath I/O. Microsoft is working with storage hardware vendors to define a hardware-independent API, called Multipath I/O, needed to enable use of dual HBAs and independent, redundant network paths between a server and its SAN storage units. When building high-availability solutions, a generally accepted principle is to avoid single points of failure. In a SAN environment, the connection between the server and the storage controller is a potential failure point. A solution is to provide redundant paths between the server and storage controller. Although certain proprietary solutions could do this under Windows 2000 Server, Microsoft has been working with Dell, EMC, Hewlett-Packard/Compaq, Hitachi, and others to deliver a standard multipath solution for hosts running Windows 2000 Server and Windows .NET Server connected to SANs. Each of the storage vendors will build its own "device-specific module" (driver) on top of Multipath I/O to fit its specific offerings. The idea is to provide a consistent code base for common functions and then work with each vendor to ensure that its driver works reliably.

LUN masking and the ability to disable dynamic scanning.
With Windows .NET Server, administrators can disable some Windows features that make it a poor SAN citizen. Windows NT and 2000 were designed with the assumption that any disk volume they could see "belonged" to them alone, and Windows would write a label called a "disk signature" to the volume so that it could track it as a resource. This behavior made servers running these earlier Windows versions grab all unclaimed SAN storage volumes (identified and addressed using Logical Unit Numbers, or LUNs), even those intended for eventual use by other servers. With Windows .NET Server, it’s now possible to completely disable the dynamic scanning behavior described above. In addition, using a new capability called "LUN masking," administrators can set a server policy that blocks access to LUNs used by other servers so that the server never attempts to read from or write to those LUNs. Although LUN access permissions are also controllable at both the storage network switch and the storage controllers hosting the LUNs, .NET Server’s LUN masking feature provides an additional level of protection independent of specific storage vendors’ hardware.

What’s Slowing Networked Storage in Windows Environments?

Before use of networked storage with Windows servers can become ubiquitous in large or even medium-size data centers, many obstacles must be addressed, including cost, historical lack of Microsoft support, complexity, and lack of standards.

High Upfront Cost

The biggest hurdle when implementing networked storage is the upfront cost. Although the prices of SAN and NAS storage are coming down, the price of conventional DAS is coming down at a similar rate. To see returns on networked storage investments, this initial cost must be offset by long-term operational savings or by specific requirements, such as a need for disaster recovery or clustering.
Furthermore, organizations will generally incur the higher cost of networked storage only for mission-critical systems, where the cost of not having it is also high. For this reason, SANs initially were more commonly found connected to large Unix servers. However, the trend toward server consolidation and a new connectivity technology are driving down the cost of networked storage, making it more of a consideration for Windows servers:

Centralization and consolidation. As server scalability grows and WANs and Internet-based virtual private networks improve and drop in price, centralizing computing resources is becoming more feasible. When more or larger servers can share a SAN infrastructure, the storage cost per user shrinks.

Cheaper connections through iSCSI. SCSI-over-IP, known as "iSCSI," is a new technology beginning to threaten Fibre Channel for host-to-storage connectivity. This technology essentially tunnels SCSI commands over IP networks and makes it possible to use relatively inexpensive high-speed Ethernet for storage connectivity instead of the much more expensive Fibre Channel HBAs, hubs, and switches.

Past Lack of Microsoft Support

In the past, Microsoft did not show much interest in supporting networked storage, except in the case of a few specific configurations that supported its clustering solutions. In the event of problems involving Microsoft’s OS or applications and networked storage, Microsoft would typically force customers to turn to the storage vendor for support. This made many Windows-centric organizations reluctant to take risks on the whole concept. The complexity of SANs and SAN technologies further slowed their adoption.

With the addition of networked storage features in Windows .NET Server, Microsoft is embracing SANs much more fully: once a storage vendor has devices certified to work with the new storage APIs, Microsoft should be able to fully support solutions built on them.
Microsoft also plans to support NAS-based storage in key applications, which could increase the penetration of NAS in Windows organizations. Although NAS configurations are not currently supported by many Microsoft applications, such as SQL Server and Exchange, Microsoft has indicated that it will move toward NAS-enabling these applications in the future. The company has also released the Windows 2000 Server Appliance Kit (SAK), which has become the basis for many Windows-powered NAS devices. (For information on the Windows 2000 SAK, see ".NET Puts Focus on Embedded Strategy" on page 6 of the July 2001 Update.) A SAK version of Windows .NET Server will ship about 90 days after Windows .NET Server ships.

Lack of Standards

Historically, supporting a SAN with components from multiple vendors on any platform, including any flavor of Unix, was a very complex procedure that was typically tested for only one particular configuration of components: a specific HBA and firmware level, a particular Fibre Channel switch and firmware level, and a particular storage controller and firmware level. Changing any of these often broke the system, which was a huge headache for customers. Thus, many organizations with SANs depend on proprietary hardware and software from a single storage vendor.

Today, no network administrator would think that a particular vendor’s network adapter was needed to connect to a Cisco Ethernet switch. In the data networking world, interoperability standards such as Ethernet, TCP/IP, Hypertext Transfer Protocol, and Simple Mail Transfer Protocol allow communications between heterogeneous systems, but few such standards exist in the storage arena. With .NET Server, Microsoft is introducing new storage standards and interfaces, such as VSS and VDS, designed to reduce this deficit and make networked storage more like data networks. However, Microsoft can only do so much.
Storage industry groups, such as the SNIA, are also trying to "herd the storage cats" (the vendors) into agreeing on common interoperability and management standards, but progress is slow and true heterogeneous connectivity is not yet here. SNIA is currently working on storage interoperability standards and on a storage management standard called Bluefin, but work is still in the early stages. .NET Server’s VDS and HBA API represent Microsoft-specific solutions that address some of Bluefin’s goals, but Microsoft has not yet announced whether it will support Bluefin when it matures.

In the meantime, customers are trying to avoid being painted into a proprietary solution corner, which in many cases causes them to wait. Microsoft’s belated embrace of networked storage should help alleviate some of their concerns.

Resources

For more information on Windows storage, see www.microsoft.com/windows2000/technologies/storage.
For additional white papers on .NET Server storage and clustering, see www.microsoft.com/windows.netserver/techinfo/overview.
For more information on the Storage Networking Industry Association, see www.snia.org.