| Home > Samples > Update > January 2007 |
New File Formats in Office 2007
By Rob Helm
The following is the full text of an article published by Directions on Microsoft, an independent research firm focused exclusively on Microsoft strategy & technology. More samples of our content, as well as a list of upcoming articles and reports, are also available.

New XML-based file formats are the default for Word, Excel, PowerPoint, and Access in Office 2007. The formats offer smaller file size, better security, and more direct access to Office data by third-party applications, which will strengthen Office as a client for document management. Office 2007 will support existing Office file formats, and older versions will support the new formats with an add-on. To use the new formats, organizations will need to plan Office migrations carefully so that their chosen format is the default for the organization and can be read by all users, including vendors, partners, and contractors.

New Formats for Compression, Security, Applications

Word, Excel, PowerPoint, and Access 2007 all save documents by default as compressed (zipped) packages with document content and formatting in XML. These new XML-based formats resemble the optional XML formats currently supported by Word and Excel 2003. The new default formats have several advantages:

Smaller file size. XML formats typically produce larger files than binary formats do, but the files compress better. As a result, Office XML files can actually be significantly smaller than the corresponding binary files. (See the chart "File Sizes by Format".) On the other hand, some files in the new formats will take longer to load because of the overhead of uncompressing and processing XML. Large Excel XML files are particularly slow, so Microsoft has created a new Excel 2007 binary file format intended to enable large workbooks to load more quickly.

Potential viruses blocked. Office 2007's default file formats exclude executable code, including macros.
This measure prevents viruses and other malicious code from propagating in Office files, which in turn means organizations can be more liberal about letting files be exchanged over the Web and sent as e-mail attachments. Office 2007 does support alternate file formats that can contain macros; for example, Word supports a DOCX format (macro-free document) and a DOCM format (document that allows macros). Administrators can control the use of these formats by Group Policy.

Access to Office data from other applications. The new formats are fully documented (via an XML schema), are licensed royalty-free, and can be uncompressed and accessed through standard APIs and tools, even on platforms other than Windows. Furthermore, the formats are extensible: Office XML files can include application-specific data that conform to a developer-defined custom XML schema. These features simplify development of applications that process Office files (e.g., an application that generates a Word offer letter from human resources data). To write such applications previously, developers either had to embed Office in their applications or attempt to reverse-engineer the Office binary formats. The former was difficult for server applications because Office was hard to configure and run reliably on servers (and didn't run at all on non-Windows servers), while the latter was impractical for any but the most determined developers (e.g., developers of Office competitors).

Note that while the bulk of an Office 2007 file in the new format is XML, not all of the data in that file are. For example, images, OLE objects such as embedded Visio diagrams, and macro code in macro-enabled files are all stored in a binary format within the compressed file. Consequently, some parts of Office files will remain opaque to developers, but document content and formatting will be accessible as XML.
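As a rough illustration of what "accessed through standard APIs and tools" can mean in practice, the sketch below opens a package with Python's standard `zipfile` and `xml.etree.ElementTree` modules, separates XML parts from binary parts, checks for a macro part, and pulls out the document text. It is a minimal sketch under stated assumptions, not a full reader: the part names `word/document.xml` and `vbaProject.bin` are the conventional ones rather than something this article specifies, and `summarize_package` and `make_sample_package` are hypothetical helpers. The sample package is a tiny stand-in, not a complete, valid Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace used by the XML inside the package.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def summarize_package(data: bytes) -> dict:
    """Summarize an Office XML package held in memory as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as pkg:
        names = pkg.namelist()
        # Parts that are readable as XML versus parts stored in binary form.
        xml_parts = [n for n in names if n.endswith((".xml", ".rels"))]
        binary_parts = [n for n in names if n not in xml_parts]
        # Assumption: a part named vbaProject.bin signals macro code.
        has_macros = any(n.lower().endswith("vbaproject.bin") for n in names)
        text = ""
        # Assumption: the main document part sits at word/document.xml.
        if "word/document.xml" in names:
            root = ET.fromstring(pkg.read("word/document.xml"))
            # Text runs live in <w:t> elements; join them in document order.
            text = "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))
    return {
        "xml_parts": xml_parts,
        "binary_parts": binary_parts,
        "has_macros": has_macros,
        "text": text,
    }


def make_sample_package(with_macros: bool = False) -> bytes:
    """Build a tiny stand-in package for demonstration (hypothetical data)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        "<w:t>Offer letter</w:t></w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("word/document.xml", document_xml)
        pkg.writestr("word/media/image1.png", b"\x89PNG placeholder bytes")
        if with_macros:
            pkg.writestr("word/vbaProject.bin", b"placeholder macro bytes")
    return buf.getvalue()


if __name__ == "__main__":
    info = summarize_package(make_sample_package())
    print(info["xml_parts"], info["binary_parts"], info["has_macros"])
    print(info["text"])
```

To try it on a real file, read the file's bytes (for example with `open(path, "rb").read()`) and pass them to `summarize_package`. A production reader would locate the main document part through the package's relationship files instead of assuming a fixed path.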
Payoffs for Content Management, Public Sector

The most important benefit of the new file formats will be for software developers and integrators who use Office as part of a larger solution. Because the Office XML formats are documented and accessible through standard APIs and tools, applications other than Office can do tasks such as generating documents from user input and extracting data from documents for business applications such as customer relationship management (CRM) systems. Microsoft itself could eventually benefit: the company's business applications (such as Dynamics CRM) could exploit the formats to extract information from Office documents or annotate them.

The new formats could also help secure Office's position in the public sector. Some agencies and advocates have opposed using Office documents for public records because the documents could be accessed only from Microsoft products. Some have used this argument on behalf of Sun Microsystems' StarOffice suite (and the OpenOffice.org suite based on it), which uses an XML format similar to that of Office 2007; a subset of this format has been standardized as OpenDocument through the International Organization for Standardization (ISO). In response, Microsoft has made the Office XML format schemas available royalty-free and submitted them to Ecma International (formerly the European Computer Manufacturers Association), a standards body that also standardized JavaScript and some aspects of Microsoft's .NET technology. Microsoft has also supported development of a utility to convert between Office formats and OpenDocument. These moves have somewhat defused the arguments against Office documents for public archives.

Implementing the Change

This is not the first change to Office file formats, and past transitions give cause for concern. Many organizations struggled with incompatibilities between Office 95 and the initial release of Office 97, or between Access 2000 and Access 2002/2003.
Organizations that move to Office 2007 will probably stick with the existing Office file formats initially, then eventually move to the Office 2007 formats.

Keeping Existing Formats

Office 2007 can read and write earlier Office file formats. As a result, organizations can upgrade to Office 2007 without altering any existing files. This will be critical for organizations that exchange Office documents with partners or customers who are still on older versions. Some organizations might also enforce use of existing file formats during their migrations to Office 2007, when computers are running a mix of versions. To enforce use of existing Office file formats, administrators have two main tools at their disposal:
However, there are some potential compatibility problems when using the old file formats in Office 2007:

New Office 2007 features disabled. Office 2007 has features that aren't supported in the existing file formats, such as Excel worksheets that have more than the 65,536 rows allowed by Excel 2003, and SmartArt diagrams (which are diagrams generated automatically from text). To avoid compatibility problems from these features, Office 2007 normally runs in Compatibility Mode when using the old file formats. Compatibility Mode disables some features that can't be supported in earlier Office versions, such as creation of Excel worksheets with more than 65,536 rows. Compatibility Mode also automatically converts document components and warns the user when the conversion could change the look or behavior of the document; for example, Compatibility Mode converts SmartArt diagrams to bitmap images. A Compatibility Checker command in Office 2007 enables users to spot parts of an Office 2007 document that might not properly convert into the older file formats.

Office 2007 won't support some old features. Office 2007 lacks some features of earlier versions of Office, and so it can't fully support some Office files created in earlier versions. For example, Word 2007 cannot open Word files that contain multiple document versions, a feature supported by Word prior to Word 2007.

To help organizations spot compatibility problems with existing files, Microsoft is working on a package of tools and documentation called the Office Migration Planning Manager. The package describes features that have changed between Office versions and identifies potential compatibility problems of Office 2007. It also provides file scanning and reporting tools that locate Office files on a corporate network and can report on files that pose problems. Available in beta form in Nov. 2006, the Migration Planning Manager should be ready when Office 2007 is generally available in Jan. 2007. (See the illustration "Migration Planning Manager Scan Report".)

Moving to the New Formats

Organizations that move to the Office 2007 formats can simplify the move in several ways:

Update earlier Office versions for new formats. A free Compatibility Pack patch enables Office 2000, Office XP, and Office 2003 to read and write documents in some Office 2007 formats, but does not enable earlier Office versions to support all Office 2007 features (such as larger Excel spreadsheets). Some specific Office 2007 features, such as SmartArt diagrams, will maintain their integrity when opened in an earlier version of Office with the patch, then reopened in Office 2007. For example, a file with a SmartArt diagram can be created in Office 2007, loaded and saved in Office 2003, and then edited again in Office 2007 without any change to the diagram's behavior. However, not all Office 2007 features work this way, so Office 2007 users will still need to use Compatibility Mode and the Compatibility Checker to ensure their files can handle a trip through earlier Office versions. The Compatibility Pack does not work with Access; it works only with Word, Excel, and PowerPoint. It is currently available in beta form, with a final version planned for Jan. 2007.

Convert files with the Migration Planning Manager. The Migration Planning Manager includes a command-line tool that will bulk-copy and convert files from older Office formats to the corresponding Office 2007 formats. As noted, some existing files will not convert perfectly to the new formats because they use features that have been eliminated from Office 2007. However, the Migration Planning Manager includes tools to mitigate the worst problems: for example, it can separate a Word file that includes multiple document versions into a set of separate files, one for each version.

A Long Transition

Overall, most organizations will have to deal with both old and new Office formats to some extent.
Any organization that exchanges Office documents with outside vendors, partners, and customers will occasionally have to send files in old formats to counterparts who have not upgraded Office or who have kept the old formats. Organizations will also occasionally have to read and update files in the new formats that they have received from counterparts who have upgraded Office and moved to the new formats.

In summary, any organization moving to Office 2007 will probably want to adhere to an old network interoperability maxim: "Be conservative in what you send, and liberal in what you accept." For Office files, that suggests generating new documents in existing Office formats whenever possible and limiting use of new Office features that aren't supported by the existing formats, but ensuring that all computers can process incoming documents in the new formats.

Resources

Macro and other code compatibility and security in Office 2007 are summarized in "Custom Code Moves to Office 2007" on page 29 of the June 2006 Update and "Office 2007 Retires Web Controls" on page 32 of the June 2006 Update.

Access 2007 compatibility considerations were discussed along with new features in "Access 2007 Targets First-Timers, SharePoint Users" on page 19 of the Dec. 2006 Update.

Office 2007 support for OpenDocument formats was outlined in "Open Document Format for Office" on page 22 of the Aug. 2006 Update.

Office 2007 deployment guidance and tools are part of the Office Resource Kit at technet2.microsoft.com/Office/en-us/library/9a753419-726c-422b-9863-7dfaf2f522c21033.mspx?mfr=true.

A list of potential Office compatibility problems is at technet2.microsoft.com/Office/en-us/library/9a753419-726c-422b-9863-7dfaf2f522c21033.mspx.

Compatibility Mode capabilities and limitations are summarized at technet2.microsoft.com/Office/en-us/library/2f7456c2-67e6-4948-9e76-81fce661210e1033.mspx.

A preview of the Migration Planning Manager is at www.microsoft.com/downloads/details.aspx?familyid=13580cd7-a8bc-40ef-8281-dd2c325a5a81.

Recommended Office 2007 and Windows Vista migration processes and tools are available in the beta Business Desktop Deployment Accelerator at www.microsoft.com/technet/desktopdeployment/bdd/2007/default.mspx.

A developer overview of the Office 2007 file format is at msdn2.microsoft.com/en-us/library/ms406049.aspx.

A technical blog about Office file formats is at blogs.msdn.com/brian_jones.