inset
SQL Server Gets .NET, XML
Nov. 24, 2003

The next version of SQL Server, code-named Yukon, will integrate the .NET Framework's Common Language Runtime (CLR) and offer native support for XML and Web services. .NET Framework integration could improve stability of business systems that depend heavily on custom server-side database code (as many financial and billing systems do) and increase productivity of developers building such systems by letting them create database-resident logic with Visual Studio, using Visual Basic (VB) .NET or C#. However, database developers will still need to understand Transact-SQL (T-SQL), SQL Server’s native programming language, and development organizations must carefully consider where to use T-SQL versus .NET languages.

.NET Framework integration and better support for XML and Web services in Yukon could help Microsoft use the success of SQL Server to further drive use of the .NET Framework for enterprise application development, and the adoption of XML and Web services for business-to-business data exchange. With Visual Studio’s sophisticated development environment and tool set, Yukon could also offer a better alternative to database developers using Java to program competing databases, such as IBM's DB2 and Oracle.

.NET Framework Improves Server-Side Programming

Integration of the .NET Framework in SQL Server Yukon will deliver major improvements to server-side database programming.

Developers and system architects build server-side database code for two primary reasons. First, server-side logic can provide a layer between business logic and data—commonly used database queries can be made available as programming interfaces to application developers, letting those developers focus on writing business logic without having to write code that directly accesses (and potentially damages) the underlying database. Second, overall application performance can often be improved by executing certain data-intensive code in the database. Rather than incurring the performance costs of moving data across the network from a database server to a middle-tier server and back, for example, system architects will choose to place code that mainly accesses or updates data where it will perform best—as close as possible to the actual data.

Improvements over T-SQL

Today, server-side programming in SQL Server is done with T-SQL, a Microsoft extension of the Structured Query Language (SQL)—a standardized language for working with relational database management systems. (For background on T-SQL and SQL Server programming concepts, see the sidebar "SQL Server Programming Concepts".)

Despite its near ubiquity in SQL Server database programming, however, T-SQL is not a modern language: for example, it lacks support for rudimentary constructs such as arrays, and it has limited string manipulation and mathematical computation capabilities. These limitations make certain programming tasks that are fairly simple in modern languages onerous in T-SQL.

The .NET Framework offers database developers the following advantages over T-SQL for server-side programming:

Modern programming languages. With Yukon, database programmers will be able to develop server-side database code in VB.NET or C#. This means that database developers will now have access to the base features common to modern programming languages, such as object-oriented constructs, structured exception handling, and support for arrays. These features have become increasingly important as software projects grow in size and complexity.

Bigger code libraries. The .NET Framework contains extensive libraries of prebuilt classes and routines that include string manipulation functions, math functions, date/time operations, encryption algorithms, functions that access system resources, and many others. These libraries could be particularly helpful for programming tasks that involve complex computation or string manipulation and will give database developers a powerful, built-in set of tools for writing server-side database procedures and functions that are not available to developers coding only in T-SQL. For example, validating the format of a complex string (such as an e-mail address) before entering it into the database is a challenging task in T-SQL, but is trivial using the .NET Framework’s support for regular expressions.

Advanced development environment. The next release of Microsoft’s Visual Studio .NET development environment (code-named Whidbey) will offer database developers full support for server-side database programming, giving them access to the same advanced environment and tools available to other .NET developers. For example, although previous versions of Visual Studio supported debugging T-SQL code, Whidbey will support server-side database development and debugging in VB.NET and C#; developers will be able to step through database applications, even if they include multiple procedures written in a variety of languages and running on multiple tiers.

User defined data types and aggregates. User-defined data types give developers a way to create consistent nomenclatures for frequently used types of data, which can help organize large, complex software development projects. For example, a developer could define a U.S. Social Security number as a 12-character data type and use that type across multiple database tables. SQL Server 2000 developers can extend the type system in the database, but can do so based only on data types already defined by SQL Server. With Yukon, developers will be able to create data types based on any .NET data structure they define. Once defined and registered with SQL Server, these data types can be used in the database in the same way predefined system data types are used: to define columns in tables or to input parameters to functions, for instance.

T-SQL also provides a set of functions (called aggregate functions) that perform operations on a selected group of data elements. For example, the SUM function returns the sum of numbers in a table column. Yukon will allow developers to define their own aggregate functions, with arbitrarily complex internal logic; this could be particularly helpful for creating involved mathematical aggregates, such as a net income calculation for a global corporation that involves complex currency conversion and tax computations.

Portability. The languages and development environment for server-side database programming in Yukon are identical to those used for many other types of software development projects (namely the .NET languages and Visual Studio). Additionally, the classes and methods used for server-side data access in Yukon are based on those used for client-side data access. This affords two advantages to development organizations. First, it makes skills portable: developers previously limited to VB application development, for example, will find it much easier to transition to server-side database development. Second, moving code between the client- or middle-tier and the server-tier will be a relatively straightforward exercise. This could particularly help developers and system architects trying to optimize system performance; for example, it could allow them to move data-intensive client code to the database without extensive modifications.

Safer Extensions

.NET integration offers a safer option for server-side programming than Extended Stored Procedures (XPs), an alternative to T-SQL available in current versions of SQL Server. XPs are blocks of code (typically written in the C or C++ programming language) that run in the database server. Developers can use XPs to write server-side code that would be difficult in T-SQL, with one major drawback: unlike T-SQL code, XPs are unsafe. A bug in an XP can bring down the database server, and XPs can introduce security vulnerabilities, such as buffer overflows, that can be exploited by worms and viruses.

Code written in .NET languages offers a safer alternative. Such code runs under the guidance of the CLR (and is referred to as "managed code"). The CLR takes steps to verify that code is safe to run. For example, the CLR automatically performs checks designed to help prevent buffer overflows.

T-SQL Also Moves Forward

In addition to integrating the .NET Framework, Microsoft will also improve T-SQL in Yukon.

For example, Yukon will introduce a feature called Common Table Expressions (CTEs), which should ease the construction of queries against databases with nested structures, such as a database that contains employees and their reporting relationships in a large, multilevel organization. In addition, CTEs could greatly reduce the amount of T-SQL code required to answer queries, such as "find all employees reporting up through a specific manager in an organization."

Yukon will also improve error handling in T-SQL transactions. In SQL Server 2000 and previous versions, error handling is largely ad hoc: developers need to insert custom error-handling code after every statement they suspect might fail. In severe cases, errors can cause transactions and the procedures performing them to abort without giving the developer an opportunity to handle the error. Yukon will introduce structured exception handling similar to that in C# and VB.NET. Using these capabilities, for example, a developer will be able to catch errors that could cause a transaction to abort in SQL Server 2000, log the error, and roll back the failed transaction.

Getting Serious About XML, Web Services

Storing, retrieving, and updating XML data in an efficient, reliable, and transactional manner is becoming increasingly important as XML and Web services gain traction as methods of storing and exchanging key business information and documents (such as purchase orders and billing invoices).

SQL Server 2000 introduced support for XML, decomposing the internal components of an XML document into relational tables in the database (a process known as "shredding" the XML). Additionally, SQL Server 2000 can render relational tables as XML documents—for example, returning results to calling applications in XML format. Alternately, users can store XML documents as unstructured text or character data.

This presents a choice to developers working with XML documents in SQL Server 2000: they can store the XML as text, which limits SQL Server’s ability to effectively run queries against the data or perform updates or other manipulations on elements of the XML document, or they can shred the XML into SQL Server data tables and lose the original format of the document.

Yukon Native Support

Yukon support for XML enhances the XML features in SQL Server 2000 in the following ways:

Native XML data type. A new XML type lets database developers work with XML documents in their original format. The XML data type is treated the same way as other types of SQL Server data: tables containing XML can be indexed on their XML content, for example. Additionally, XML data inserted into an XML-typed column can be validated against a schema associated with the column to ensure that the data are properly formed. Perhaps most important, Yukon stores XML documents in their native form inside the database; consequently, the documents gain the same operational benefits (for example, transaction integrity and logging) as other types of SQL Server data.

Improved query capabilities. Yukon will implement an emerging query language called XML Query Language (XQuery), which is designed to build queries suited to XML documents. Additionally, Microsoft will extend XQuery in Yukon in two important ways. First, the version of XQuery supported in Yukon will allow queries against tables that contain a mixture of relational and XML data (for example, a table that has both XML data type columns and columns containing other SQL data types). Second, Yukon XQuery will support data manipulation operations in XML, allowing developers to write queries that not only return result sets from XML columns but also update, insert, and delete elements within the XML data in the columns.

Direct Web services access. SQL Server 2000 introduced an add-on that, in combination with Microsoft’s Internet Information Services (IIS), allowed developers to expose database logic (such as stored procedures) as Web services. Among other things, this process allows SQL Server to better integrate with other applications in a heterogeneous business application, such as a business application in which logic resides on Linux servers and business data are stored in a SQL Server database.

Yukon will remove the middle layer from this process: it will support direct HTTP access to SQL Server without IIS, simplifying deployment and reducing operational cost and complexity. (For an illustration, see "A Yukon-Based Web Service".)

Questions and Considerations

Yukon represents a significant overhaul of SQL Server and could revolutionize the way database developers program the product. However, with these changes come a number of questions and considerations that organizations evaluating Yukon will need to weigh:

T-SQL knowledge is still required. T-SQL will not be replaced by .NET programming languages. In fact, data definition and manipulation operations will still be performed with T-SQL. For example, a developer writing a stored procedure in C# that selects rows from a database will have to define a string that contains the appropriate T-SQL select command. This string will be passed to the T-SQL parser by a C# method call and then executed. Consequently, for the foreseeable future, server-side database development will require at least basic familiarity with T-SQL.

T-SQL or managed code? Development organizations face some complex decisions regarding where and when to use managed code versus T-SQL. Because .NET languages rely on T-SQL for certain operations (such as creating tables or indexes, or inserting, updating, or deleting rows in tables), executing such operations from .NET code will incur overhead. Consequently, procedures and functions that primarily involve those types of operations will likely perform better if written in T-SQL. On the other hand, procedures containing complicated algorithms or computational steps will be both easier to write and faster when coded in .NET languages.

Support for other .NET languages. Microsoft plans to offer support for VB.NET and C# in Yukon. Support for other .NET languages may eventually be offered. (For example, Microsoft is considering support for J#.)

Where to place code? In today’s business systems and applications, complex business logic frequently runs outside of SQL Server, on middle-tier servers that connect to clients on one side and the database server on the other. There are two main reasons for this. First, T-SQL is a functionally weak language and, as such, is not the best option for building large, computationally involved systems. Second, many system architects move compute-intensive logic away from the database server to avoid having this code interfere with the database’s primary job: fast and efficient data retrieval, update, and insertion.

Although Yukon addresses the first concern, it does not alter the fact that placing a great deal of computational code on the database server may impede the server’s performance, and thus the overall performance of the business system or application. Developers and architects will need to carefully weigh the performance gains their functions and procedures may see from direct, in-process access to SQL Server data against the performance hit that will undoubtedly result if too much code runs on the database server.

Availability and Resources

At the Oct. 2003 Professional Developers Conference, Microsoft handed out Yukon kits that included the product bits (an early version of its official Beta 1 release, which is an invitation-only beta), preliminary product documentation, and a number of technical white papers.

A broader-based second Beta will likely be available in the first half of 2004. The production release of Yukon is targeted for the end of 2004.

Technical white papers and other information on Yukon can be found at www.microsoft.com/sql/yukon/productinfo/default.asp.