XML to Glue Applications Together
Feb. 11, 2002
The newly released .NET development platform gives a prominent spot to the Extensible Markup Language (XML) data format. In fact, XML is popping up in all of Microsoft's strategic products (and seemingly all of its executives' speeches). "XML is affecting everything at Microsoft, not just our tools and our platforms but the very applications themselves and the way they deal with data," says Chief Software Architect Bill Gates. By understanding XML, IT planners and software architects can design better applications with Microsoft's operating systems, servers, and tools, and can tie those applications into larger information systems more effectively.

Three Key Roles

The XML format is showing up throughout Microsoft's product line, from Access (as a file and report format) to Windows XP (as a format for program installation information). However, XML will have the biggest impact on software and information system design on the Microsoft platform when it is used in the following three ways:

A "hub" format for application integration. XML provides a data exchange format for integrating applications between departments or companies. This can encompass everything from connecting a company's outsourced e-commerce Web site to its mainframe applications, to hooking a subsidiary's line-of-business applications to other applications at headquarters. In any case, companies can use application integration to cut costs, reduce turnaround time, and minimize errors by eliminating data reentry and automating transaction flow.

A "write-once, view-many" format for user interfaces. XML provides a neutral data format for diverse browsers and devices. The goal is to enable applications (or authors) to write data such as reports, forms, and Web pages once, and then present the data differently for each type of user. This "write-once, view-many" process can make it easier to maintain data, and it enables designers to work more independently from content authors and application developers.
A "catchall" data format for databases. XML provides a standard format for storing and retrieving data in databases and other data stores. In particular, XML enables applications to move "semi-structured," nontabular data like Web pages, purchase orders, and bug reports into and out of databases, and to integrate data from diverse back-end data stores. This can help companies get the benefits of database technology in more applications and enhance the scalability of Web applications.

Microsoft is counting on XML to succeed in all three roles. It believes XML-based application integration will drive demand for products such as BizTalk Server and its .NET development platform. It hopes that XML-processing "smart client" Windows applications will reduce the threat of "thin client" applications running on other platforms. And it is counting on XML processing features in SQL Server to make that database product a stronger platform for Web applications and a viable storage service for Microsoft products such as Exchange. Overall, Microsoft hopes that XML will provide a pervasive "glue" technology like COM that enables it and its customers to tie together systems built on Microsoft products.

Extensible Format, Standard Libraries

Even though it can play many roles, XML itself is a fairly simple idea: an extensible, text-based data format. Like HTML, XML data consist of text mixed with angle-bracketed tags. XML data can include binary information, but it must be encoded as text or stored in separate files referenced in the XML. (See the illustration "An Update Article in XML".)

The reason XML can play so many roles is that developers can define new application-specific data formats by extending it. Furthermore, applications can use a standard code library called an XML processor to read and validate data in developer-defined formats. (See the illustration "Using an XML Processor".)
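As a rough sketch of what handing data to an XML processor looks like, the following Python example uses the standard library's ElementTree parser to read a small article document, including a binary value encoded as base64 text. The element names, the deadline value, and the `read_article` helper are hypothetical and are not taken from the Update article illustration. ElementTree checks only that the data is well-formed; validating against a DTD or schema would require a validating processor, which is not shown here.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical article format; element and attribute names are illustrative only.
ARTICLE_XML = """\
<article deadline="2002-02-01">
  <title>XML to Glue Applications Together</title>
  <body>XML is an extensible, text-based data format.</body>
  <thumbnail encoding="base64">iVBORw0KGgo=</thumbnail>
</article>
"""


def read_article(xml_text):
    """Parse the article and return its fields as a plain dictionary."""
    # The processor raises ET.ParseError if the text is not well-formed XML.
    root = ET.fromstring(xml_text)
    thumbnail = root.find("thumbnail")
    return {
        "deadline": root.get("deadline"),
        "title": root.findtext("title"),
        "body": root.findtext("body"),
        # Binary data travels as text, so decode it back into bytes.
        "thumbnail": base64.b64decode(thumbnail.text) if thumbnail is not None else b"",
    }


article = read_article(ARTICLE_XML)
print(article["title"], article["deadline"], len(article["thumbnail"]))
```

The application code never scans for angle brackets itself; it asks the processor for elements and attributes by name, which is the division of labor the text describes.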
Developers define XML data formats in one of two types of documents: a Document Type Definition (DTD) or an XML schema in the XML Schema language. (See the illustration "Defining the Update Article Format".) One of the key benefits of the XML processor is that developers can create data formats that fit the needs of their applications, but they don't have to write all the code that reads, validates, and writes application data. The XML processor does that job.

Application Integration

Microsoft's strategy for application integration with XML revolves around Web services, Web server applications that communicate with the outside world by sending messages in XML formats over standard Internet protocols. (Microsoft uses the term XML Web services, but Web services is used by many other vendors.) To integrate two applications with Web services, organizations perform the following steps:
(See the illustration "Application Integration with Web Services".)

The Simple Object Access Protocol (SOAP) provides a standard XML message format for Web services, and it has the support of many software vendors, including Microsoft, IBM, Oracle, and SAP. SOAP defines a set of generic message fields (such as the address of the application that should receive a message) in XML and also defines a general way for developers to encode application-specific data (such as a purchase order or syndicated news article) as XML in the message. Developers also use the XML-based Web Service Description Language (WSDL) to write machine-readable descriptions of Web services. With a WSDL description of a Web service, a developer can automatically create application code that exchanges messages with the service, using tools such as Visual Studio .NET and the Office Web Services Toolkit.

Of course, XML does not solve all application integration problems. In particular, organizations have to agree on an XML format for the application-specific data Web services exchange. For example, a news syndicate that delivers articles through a Web service has to agree with its customers that an article's title should be marked with a tag called <title> rather than a tag called <article>. Various industry groups and software vendors (including the Microsoft-led BizTalk.org) are defining message formats for particular industries or business processes (e.g., purchasing), but no single "Grand Unified Message Format" is likely to emerge anytime soon; organizations will have to work with their partners to pick among the available message formats, and quite possibly define formats from scratch.
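To make the split between generic SOAP fields and application-specific data more concrete, here is a minimal Python sketch that builds and reads a SOAP 1.1 style envelope with the standard library. The envelope namespace is the SOAP 1.1 one; everything else (the `urn:example:news` namespace, the `recipient` header, the `article`, `title`, and `text` elements, and both helper functions) is a hypothetical format that a syndicate and its customers would have to agree on, not something defined by SOAP or by the article.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
NEWS_NS = "urn:example:news"  # hypothetical application namespace


def build_article_message(recipient, title, text):
    """Wrap a news article in a SOAP envelope and return it as a string."""
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("news", NEWS_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    # Generic, message-level information goes in the header (hypothetical field).
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{NEWS_NS}}}recipient").text = recipient
    # Application-specific data goes in the body, in the agreed format.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    article = ET.SubElement(body, f"{{{NEWS_NS}}}article")
    ET.SubElement(article, f"{{{NEWS_NS}}}title").text = title
    ET.SubElement(article, f"{{{NEWS_NS}}}text").text = text
    return ET.tostring(envelope, encoding="unicode")


def read_title(message):
    """Extract the article title; works only if both sides agree on the tags."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{NEWS_NS}}}article/{{{NEWS_NS}}}title"
    return root.findtext(path)


message = build_article_message(
    "http://example.com/inbox", "XML to Glue Applications Together", "..."
)
print(read_title(message))
```

The receiver's `read_title` returns `None` if the sender marks the title with a different tag, which is the agreement problem the text describes.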
Why XML Works for Application Integration

Numerous companies have already integrated applications using distributed object frameworks (Microsoft’s Distributed COM [DCOM], the Object Management Group’s CORBA, and Sun’s Java Remote Method Invocation [RMI]) or message queuing systems (Microsoft Message Queue, IBM MQSeries). Electronic Data Interchange (EDI) has also gained considerable use for functions such as supply chain integration in manufacturing. However, several factors make XML and SOAP more appealing than these existing technologies.

Reach. Microsoft, IBM, and other vendors have made XML processors available for almost any operating system and software development platform, free of charge. These vendors have also worked within the World Wide Web Consortium (W3C) to standardize XML and verify that their processors work correctly with one another. The result is that XML processors are more readily available and less expensive than software for technologies such as EDI. They are also less tied to vendor-specific software platforms than technologies like DCOM and MQSeries.

Cheaper computing. XML processors read and validate data more slowly than EDI or DCOM software, which has been carefully tuned for these specific formats. However, increases in computing power have made this disadvantage much less noticeable. Similarly, XML's restriction to printable characters and its use of named tags make it much less compact than text EDI formats or DCOM's binary format, but cheaper disk storage and increased availability of network bandwidth have made this disadvantage much less serious.

Whatever its technical merits, XML seems fated to win a larger role in application integration because of vendor politics: Microsoft, IBM, Oracle, and Sun have all adopted the format in their software platforms.
Microsoft and IBM in particular have promoted XML for application integration, in part as a way to integrate their own far-flung software product lines. However, all platform vendors are feeling network effects as major commercial application developers such as PeopleSoft and SAP adopt XML integration and encourage vendors to support it. The result is that Microsoft and its competitors generally agree on the details of basic XML technologies like SOAP and WSDL needed for application integration, even as they continue to fight the battles of .NET versus Java, and Windows versus Unix.

.NET, BizTalk Lead Integration Lineup

Microsoft has already delivered software for integrating applications with Web services and XML, including the following:

Visual Studio .NET and ASP.NET. The latest version of Visual Studio .NET (VS.NET), Microsoft's integrated development environment (IDE), provides numerous tools for working with Web services, such as a "Web reference" tool that automatically generates code for exchanging messages with Web services. Developers can also easily create Web services on the ASP.NET Web server API, which ships in VS.NET and the free .NET Framework SDK. In the simplest case, the developer can turn an application into a Web service by adding a few keywords to the application code. At run time, ASP.NET takes care of sending and receiving SOAP messages, running the application code, and converting application data to and from XML.

BizTalk Server. This server product enables companies to create logical hubs that route messages among applications, manage transactions, and convert data between application-specific message formats, protocols, and APIs. BizTalk Server hubs can be Web services that send and receive messages in SOAP or other XML formats, as well as some non-XML formats (e.g., EDIFACT EDI).
BizTalk Server uses XML document objects to represent messages internally and includes a graphical Mapper tool for writing Extensible Stylesheet Language Transformations (XSLT) that convert between message formats.

The SOAP Toolkit. This free collection of development tools and software components enables developers to create XML Web service front-ends for existing COM applications. Components in the toolkit automatically translate application data between COM binary data types and XML, substantially simplifying the job of the developer. However, the .NET SDK, VS.NET, and BizTalk Server have substantially better tools for building and using Web services.

Microsoft also plans to add Web service support to the COM+ component framework in Windows .NET Server when it ships in mid-2002. According to Microsoft, administrators will be able to automatically create a Web service front-end for any COM+ application on Windows .NET by checking a box at deployment time.

Presenting Data to Diverse Clients

Microsoft supports XML as a presentation-independent data format for diverse "thin clients" (that is, browsers interacting with Web sites or Web server applications). However, the company is betting that XML will increasingly serve "smart client" applications, which use downloadable presentation code running on the client device.

Style Sheets for Thin Clients

XML supports a presentation architecture based on client-specific style sheets. (See the illustration "Thin Client Presentation with Style Sheets".) A style sheet is a specialized script that converts XML data into the presentation format (frequently HTML) required by the particular client type. At run time, the application detects which client type it is dealing with, selects the appropriate style sheet for that client, and then applies the style sheet to any XML data it needs to present.
The result is that an application or author produces the data once, and doesn't need to know the details of the markup language used by a particular client, or all the design principles needed to create a good presentation for that client. These details can be left to a style sheet designer.

There are many solutions other than XML for supporting diverse thin clients. For example, Web sites hosted on Microsoft's Content Management Server use Active Server Pages (ASP) and a SQL Server database to dynamically generate content for specific users and client devices. Also, Microsoft's recently released Mobile Internet Toolkit enables developers to create ASP.NET applications that adapt their output to a user's browser and device, making it possible to create a single page that supports Pocket PCs, browser-equipped cellular phones, and PCs. The advantage that XML and style sheets offer over these technologies is that they are standardized by the W3C and supported on platforms other than Windows; however, other technologies might provide better tools for authoring content and presentation, especially by graphic designers and others who might be uncomfortable working with style sheet languages like XSLT.

How Microsoft Supports Style Sheets

Microsoft provides several ways to use style sheets with XML. For example, it supplies a free, unsupported Internet Information Server (IIS) Web server plug-in that accepts requests for XML data, selects a style sheet based on information in the request (such as the identity of the user's browser), and then uses the style sheet to convert the requested XML data to HTML. Like BizTalk Server, this IIS plug-in uses style sheets in the XSLT scripting language. Microsoft itself makes use of server-side style sheets in its "digital dashboard" technology for corporate portals. (See "Corporate Portal Strategy in Flux" on page 3 of the Nov. 2001 Update.)
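The detect, select, and apply flow for server-side style sheets can be sketched in a few lines of Python. This is only an illustration of the control flow: Python's standard library has no XSLT processor, so plain Python functions stand in for the style sheets here, and the client types, the user-agent check, and all function names are hypothetical assumptions rather than a description of the IIS plug-in.

```python
import xml.etree.ElementTree as ET
from html import escape

# The data is written once, with no presentation markup (hypothetical format).
ARTICLE_XML = (
    "<article><title>XML Roles</title>"
    "<body>XML can act as hub, view, and storage format.</body></article>"
)


def desktop_sheet(article):
    """Stand-in for a style sheet that renders a full HTML page."""
    title = escape(article.findtext("title", ""))
    body = escape(article.findtext("body", ""))
    return f"<html><body><h1>{title}</h1><p>{body}</p></body></html>"


def phone_sheet(article):
    """Stand-in for a style sheet that renders a compact, title-only view."""
    title = escape(article.findtext("title", ""))
    return f"<p><b>{title}</b></p>"


STYLE_SHEETS = {"desktop": desktop_sheet, "phone": phone_sheet}


def detect_client(user_agent):
    """Very rough client detection based on the request's user-agent string."""
    return "phone" if "Mobile" in user_agent else "desktop"


def present(xml_text, user_agent):
    """Select the style sheet for this client and apply it to the XML data."""
    article = ET.fromstring(xml_text)
    sheet = STYLE_SHEETS[detect_client(user_agent)]
    return sheet(article)


print(present(ARTICLE_XML, "ExampleBrowser/1.0"))
print(present(ARTICLE_XML, "ExampleBrowser/1.0 Mobile"))
```

Supporting a new client type means adding one entry to `STYLE_SHEETS`; the article data and the code that produces it stay unchanged, which is the point of the "write-once, view-many" approach.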
Versions 5.0 and later of Internet Explorer (IE) can also apply style sheets to the XML Web pages they retrieve; these style sheets run on the client machine, which can provide higher performance than running the style sheet on the server. IE supports both XSLT and Cascading Style Sheets (CSS), a style-sheet language that's more limited than XSLT but simpler and more widely used.

Web Services for Smart Clients

More recently, Chief Software Architect Bill Gates has been promoting a "smart client" presentation architecture, in which the application's user interface is delivered in device-specific, downloadable client software, and the rest of the application is one or more Web services. (See the illustration "Smart Client Presentation with Windows Forms".) This architecture gives users a more responsive user interface than a typical thin client could, while giving administrators the ability to deploy and maintain the client code centrally.

Microsoft does not expect smart clients to replace all thin clients, but believes they will prove useful for combining information from multiple, independent Web services. For example, a smart-client application for a company's sales force might provide a single front end to Web services for placing orders, reporting expenses, and managing customer relationships.

Microsoft is counting on the .NET Framework and its Windows Forms (WinForms) graphical user interface (GUI) library to solve the deployment and security problems that have turned companies away from "fat client" technologies such as Visual Basic (VB), client-side browser scripts, Java applets, and ActiveX controls. WinForms provides a set of GUI controls (e.g., buttons, text input fields, calendar input controls for dates) that can be used to view and edit XML data. Like VB clients, WinForms clients can provide full Windows GUIs, but users can more easily download and install WinForms clients over the Web.
WinForms clients also offer finer-grained control over security than other technologies for downloadable code, such as scripts, applets, and ActiveX. These technologies either sharply limit what downloaded clients can do or give clients the full privileges of their users. In contrast, WinForms clients enable administrators to control the client's precise privileges through policy. For example, an administrator can specify that downloaded WinForms clients may only write to files in an isolated "safe" area on the local disk. This means that users can download and run more powerful client applications than before, with less chance of picking up a virus or worm.

Accessing Data Through XML

Microsoft supports XML as a format for exchanging data with databases, particularly databases in SQL Server. Supporting XML for data access makes it simpler for databases to store and retrieve "semi-structured" data that include a mixture of unstructured blocks of text or binary data (e.g., the text and graphics of the sample Update article) and structured fields (e.g., the deadline attribute in the sample Update article). XML data access can also simplify coding and speed performance in XML applications. Finally, XML can provide a single "hub" format for integrating SQL Server with other data sources. This lets developers access the data in a single format, using standard tools and APIs, regardless of the data's source. (See the illustration "XML Data Access Architecture".)

XML for SQL Server in Free Kit

A free SQL Server 2000 Web Services Toolkit delivers XML data access support for SQL Server. The toolkit contains extensions to Microsoft's ActiveX Data Objects (ADO) data access library that enable applications to get query results in XML and update databases with XML messages. Applications can also cache retrieved XML data, update it, and send batches of changes back to the database in XML.
(These capabilities were previously delivered in a free toolkit called SQLXML, which is now part of the SQL Web Services Toolkit.)

The toolkit also delivers XML data access components for ADO.NET, the new data access API of the .NET Framework. ADO.NET provides much simpler APIs for XML data access than ADO. In particular, it provides a single "DataSet" API for accessing both XML data and SQL Server databases, enabling developers to learn and use a single API for a large range of data stores. Microsoft's own WinForms controls for smart clients use the DataSet API to access data, enabling them to present database data or XML with equal ease.

In addition, the toolkit enables developers to selectively publish groups of SQL Server stored procedures (code that runs on the database server) as Web services. Applications can execute these procedures by sending SOAP messages to the server. This feature will essentially make SQL Server a "Web service in a box" for accessing databases. The .NET My Services, Microsoft's planned public Web services for storing user data, might be constructed this way. (See ".NET My Services Picture Getting Clearer" on page 26 of the Dec. 2001 Update.)

Finally, the toolkit includes an IIS plug-in that enables clients to query a database over HTTP and get results back in HTML. With this plug-in, thin client server applications can be hosted entirely in SQL Server. Experiments suggest that applications built this way can have a small amount of code and high performance compared to alternatives such as Active Server Pages (ASP).

Note that neither ADO nor ADO.NET supports exchanging XML with Oracle 9i or IBM DB2. However, applications can use the native APIs of these data sources to retrieve XML, and then use ADO.NET to work with the resulting XML data (through the DataSet API) in the same way as they work with data retrieved from SQL Server.
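The idea of treating XML as the common format for several data sources can be illustrated with a small Python sketch. It does not use ADO, ADO.NET, or SQL Server: an in-memory SQLite database stands in for the relational source, a literal XML string stands in for data retrieved from another source through its native API, and the `results` and `article` element names and both helper functions are hypothetical. The sketch also assumes that column names are valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(cursor, row_tag):
    """Serialize the rows of an executed query as a simple XML document."""
    columns = [description[0] for description in cursor.description]
    root = ET.Element("results")
    for row in cursor:
        item = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(item, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


def titles(xml_text):
    """Consumer code that sees only XML and does not care where it came from."""
    return [item.findtext("title") for item in ET.fromstring(xml_text)]


# Source 1: a relational database (SQLite used here purely as a stand-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(1, "XML Roles"), (2, "Web Services")],
)
from_db = rows_to_xml(
    conn.execute("SELECT id, title FROM articles ORDER BY id"), "article"
)

# Source 2: XML obtained from some other data store (hard-coded for the sketch).
from_other = "<results><article><id>3</id><title>Smart Clients</title></article></results>"

all_titles = titles(from_db) + titles(from_other)
print(all_titles)
```

Once both sources deliver the same XML shape, the `titles` function handles them identically.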
In effect, XML can serve as an integration format for data sources, just as it serves as an integration format for applications.

New Database Engine To Improve XML Caching

Microsoft is working on a new XML-enabled database engine based on SQL Server 2000 that will eventually be used in all Microsoft server products. Due in 2003 or later, this Yukon engine will even replace the company's other XML-enabled database, the Web Store technology used in Exchange and SharePoint Portal Server. Yukon will provide additional features for XML caching, such as a cache management protocol for signaling applications to refresh cached data. (See "Microsoft Rethinks Exchange Storage Architecture" on page 3 of the May 2001 Update.)

The Future: Less Visible but More Pervasive

XML has become central to Microsoft's thinking on application architecture because it provides a good solution for integrating disparate applications, presenting application data to diverse clients, and transmitting and caching database data. Most important, it allows a single software infrastructure, the XML processor, to help solve all these problems at once. As a result, the XML processor also functions as a single point of contact where information systems built with Microsoft products can interact with components from the company's partners and competitors.

Microsoft has a lot of work ahead of it to realize its full vision for XML. In 2002, for instance, it must update IE, BizTalk Server, and numerous other products to support the final, W3C-recommended XML Schema language; shipping versions of these products support only a limited, preliminary subset of the language. The greatest challenge could be implementing Yukon, integrating it into Exchange, and then re-architecting its Outlook client to use the XML capabilities of the new database product.
Nevertheless, XML will become increasingly important as a way to tie application components together on Windows, and to tie applications into larger information systems. Visual Studio .NET and ASP.NET will make XML less visible to the developer, but this could actually make the format more pervasive by lowering the amount of developer effort required to use it. XML will have its biggest impact on server applications in the near term, but Microsoft is already preparing Office and other desktop applications to become smart clients.

The company has committed itself to XML for the long haul. As Chief Executive Officer Steve Ballmer puts it, "If you think about how long it took the PC really to catch on or graphical user interfaces to catch on, this is not a one-year phenomenon. We will really see the XML Web service revolution play out fully in four or five years."

Resources

A good starting point for general information on XML is www.oasis-open.org/cover. XML information for Windows developers is at msdn.microsoft.com/xml. For more information on XML Web Services and their role in Microsoft's overall strategy, see the July 2001 Research Report, "Understanding .NET." For more information on ASP.NET and other .NET developer technology, see the Feb. 2002 Research Report, "The .NET Development Platform." For more information on SQL Server's XML capabilities, see the Dec. 2000 Research Report, "SQL Server 2000 Targets Enterprises and the Web." The SQL Server 2000 Web Services Toolkit is at www.microsoft.com/sql/webservices. The SOAP Toolkit is described in "New Toolkits Power Web Services Now" on page 7 of the Mar. 2001 Update. For information on the Mobile Internet Toolkit, see "Beta SDK Supports Mobile Web Applications" on page 5 of the Feb. 2001 Update.