DWF format

From Autodesk
Edited by Paul Bourke

The DWF format is intended only for the efficient viewing of drawings, similar to an electronic plot. It is not intended for the interchange of higher level data between applications, especially as most DWF data is post-tessellation.

For example, a CAD application generates a .dwf file based on a drawing in the application's native format. This .dwf file is then transmitted and displayed by a much simpler viewing application, such as an Internet Web browser. Due to the lack of non-visual data, the DWF is not intended to be read back into the original CAD application, although the .dwf file can refer to another file in the application's native format with a DWF link or an embed operation.

File Organization

.dwf files are organized into three main sections as shown in figure 1.

File identification header
File data block
File termination trailer

Figure 1. DWF file organization

The data in the header and the trailer are encoded as readable ASCII text. Data in the file data block is delimited by operation codes (opcodes) and argument data used by the opcodes (operands) as in table 1.

Table 1. File data block

There are two types of opcode-operand pairs: readable ASCII text and coded binary. All DWF operations have a readable ASCII opcode/operand form, and most operations also have a coded binary opcode/operand form. By using the proper opcode form, you can create a file that is humanly readable or one that is more efficient from a processing and storage point of view or, more commonly, a mixture of both types.

An application reading a .dwf file may not understand a set of opcodes, especially when the application reading the file predates the application that created the file. For this reason, DWF is designed to allow a file reader to skip most opcodes. In order for the file reader to skip an opcode, it must know the length of its operand. DWF has three categories of opcodes:

1. Single-byte opcodes that must be recognized for efficiency reasons and thus cannot be skipped.

A reader application need not implement these opcodes but must be able to compute their operand length, which requires that the opcode be recognized. If a single-byte opcode is unrecognized by a DWF reading application, the rest of the file cannot be read.

2. Extended ASCII opcodes (humanly readable) that have delimited and nestable operands.

By following some simple rules, a reader application can safely skip such an opcode/operand pair without understanding the operation or its contents.

3. Extended binary opcodes that indicate their operand length so that a reader application can easily skip past the unknown operation and data.

To improve the readability of .dwf files, an opcode may be preceded by white-space. White-space is defined as any number of ASCII spaces, tabs, carriage returns, or line feeds.

File Format Details

This section describes each block of the .dwf file format in detail.

File Identification Header

The file header has two basic functions:

To allow for easy identification of .dwf files by human or machine.
To identify which version of the DWF specification, for example 00.30, was used to encode the file.

Table 2a. File header

Byte	0	1	2	3	4	5	6	7	8	9	10	11
Character	(	D	W	F	(space)	V	0	0	.	3	0	)
ASCII (Hex)	28	44	57	46	20	56	30	30	2E	33	30	29

The header, shown in table 2a, is 12 bytes that can be interpreted as a (possibly unterminated) ASCII string, for example, "(DWF V00.30)". The first six bytes are the constant "(DWF V", which identifies this as a .dwf file. Note that these are in upper case. This constant is followed by a 5-byte version number in the format shown in table 3, and then by a closing parenthesis.

Table 3. The 5-byte version number

Byte	Symbol	Description
6 & 7	<0 .. 99>	ASCII integer; major revision value of the format used
8	. (period)	ASCII decimal point
9 & 10	<0 .. 99>	ASCII integer; minor revision value of the format used

The application generating a .dwf file should specify the lowest possible version number that an application reading the file would need to support in order to properly use the data. Generally, a reader application should not attempt to read a file with a higher major revision value than what it was designed for. If the minor revision value is higher, this indicates that the file may contain opcodes unknown to the reader, which can and should be skipped.
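The header layout and the version rule above can be sketched in a few lines. This is only an illustration, not part of the DWF specification; the names parse_dwf_header and can_read are hypothetical.

```python
import re

# Layout taken from the header table: "(DWF V" + MM + "." + NN + ")"
HEADER_RE = re.compile(rb"\(DWF V(\d{2})\.(\d{2})\)")


def parse_dwf_header(data: bytes):
    """Return (major, minor) read from the first 12 bytes, or raise ValueError."""
    match = HEADER_RE.fullmatch(data[:12])
    if match is None:
        raise ValueError("not a DWF file: bad identification header")
    return int(match.group(1)), int(match.group(2))


def can_read(file_version, reader_version) -> bool:
    """Refuse a higher major revision; a higher minor revision is acceptable
    because the unknown opcodes it may introduce are meant to be skipped."""
    return file_version[0] <= reader_version[0]


print(parse_dwf_header(b"(DWF V00.30)"))
```

A reader would typically call parse_dwf_header on the first 12 bytes and stop early when can_read returns False.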

File Data Block

The file data block starts at the 13th byte of a .dwf file, and is a series of opcode and operand pairs, as in table 4.

Table 4. Opcode, operand pairs

White Space

An opcode may be preceded by any number of white-space characters, which are defined as any number of ASCII spaces, tabs, carriage returns, or line feeds.

Opcode Forms

Opcodes are a single byte in length except for two special cases: extended ASCII and extended binary. This allows for over 200 operations.

Note: Some values in the range from 0 to 255 of a byte are not legal for opcode use.

These single-byte opcodes may have operands that are either readable ASCII or coded binary. Some common operations have separate opcodes for both ASCII and binary operand forms. Generally, if an operand is formatted as readable ASCII, then its single-byte opcode is also a readable ASCII character, which allows the file to be edited with a normal text editor. The following example shows a line drawing operation using the L single-byte opcode followed by a readable ASCII operand:

L 500,20300 90100,48000

This line could also be represented using a binary-coded operand, as shown in the following example, which uses the l opcode. In this example, each character after the opcode stands for one byte of binary operand data:

lXXXXYYYYxxxxyyyy

Except for the two special types of opcodes, extended ASCII and extended binary, a file reader must know how to compute the operand length.

Illegal Opcodes

The ASCII representations for the following cannot be used as opcodes:

A space (0x20)	A double quotation mark (0x22)
A tab (0x09)	A period (0x2E)
A hyphen (0x2D)	Parentheses (0x28 and 0x29)
The ASCII digits 0-9 (0x30 - 0x39)	Curly brackets (0x7B and 0x7D)
A carriage return (0x0D)	Square brackets (0x5B and 0x5D)
A line feed (0x0A)	A backslash (0x5C)
A single quotation mark (0x27)
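As a rough sketch, a reader could classify the first byte of each operation using the list above. The function name classify_lead_byte and the category labels are hypothetical; the only data used is the list of excluded characters.

```python
WHITESPACE = frozenset(b" \t\r\n")

# Byte values that the list above rules out as single-byte opcodes.
ILLEGAL = frozenset(b" \t\r\n-'\".()[]{}\\0123456789")


def classify_lead_byte(byte: int) -> str:
    """Classify one byte found where an opcode is expected."""
    if byte in WHITESPACE:
        return "whitespace"       # may precede any opcode
    if byte == ord("("):
        return "extended_ascii"   # starts an extended ASCII opcode
    if byte == ord("{"):
        return "extended_binary"  # starts an extended binary opcode
    if byte in ILLEGAL:
        return "illegal"
    return "single_byte"


# 256 byte values minus the 25 excluded ones leaves 231 candidates,
# which agrees with the "over 200 operations" figure quoted earlier.
print(256 - len(ILLEGAL))
```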

Binary Operands

In the case of an opcode with a binary coded operand, the binary data is stored in little-endian format. On systems with big-endian processors, such as Motorola-, Sun-, and Hewlett-Packard-based systems, byte swapping must be performed before writing or reading the .dwf file.
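The two forms of the line operation can be compared with a short sketch. It assumes the l operand is four 32-bit signed integers, as the single-byte opcode table later in this document lists it; the function names line_ascii and line_binary are hypothetical. The "<" prefix in the struct format forces little-endian output regardless of the host processor, so no manual byte swapping is needed.

```python
import struct


def line_ascii(x1, y1, x2, y2) -> bytes:
    """Readable form: the L opcode followed by an ASCII operand."""
    return f"L {x1},{y1} {x2},{y2}".encode("ascii")


def line_binary(a, b, c, d) -> bytes:
    """Binary form: the l opcode followed by four little-endian 32-bit
    signed integers. The values are packed as given; how a reader
    interprets them depends on the opcode definition."""
    return b"l" + struct.pack("<4l", a, b, c, d)


print(line_ascii(500, 20300, 90100, 48000))
print(line_binary(500, 20300, 90100, 48000).hex())
```

The binary form is always 17 bytes long (one opcode byte plus 16 operand bytes), while the ASCII form grows with the number of digits.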

The Extended ASCII opcode

The single-byte opcode ( (the open parenthesis character) indicates an extended and possibly nested readable ASCII opcode. Following the open parenthesis is a multiple-byte string token opcode, followed by white space, followed by zero or more operands, followed by a close parenthesis terminator, ). For example:

(Origin 240 120)

Extended ASCII opcodes may be nested:

(Owner (FirstName Brian)(LastName Mathews))

Extended ASCII opcodes may contain literal strings surrounded by the single quote mark, ':

(Account (Person 'Brian Mathews ;-)') (Company 'Autodesk \'ADSK\''))

Note: Inside a quoted string, the \ character may be used to treat a subsequent ' or \ character as literal data.
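The escaping rule in the note can be illustrated with a small decoder. This is a sketch only; decode_quoted is a hypothetical helper, and it assumes the caller has already isolated the quoted token, including both quote marks.

```python
def decode_quoted(text: str) -> str:
    """Decode a single-quoted literal, dropping each escaping backslash."""
    if len(text) < 2 or text[0] != "'" or text[-1] != "'":
        raise ValueError("expected a string wrapped in single quotes")
    body = text[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            i += 1          # drop the backslash, keep the next character as data
            ch = body[i]
        out.append(ch)
        i += 1
    return "".join(out)


print(decode_quoted(r"'Autodesk \'ADSK\''"))
```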

The Extended binary opcode

The single-byte opcode { (the open curly brace character) indicates an extended and possibly nested binary section of data. Immediately following the open curly brace is a 4-byte integer that represents the length (in bytes) of the binary data. Following the length field is a 2-byte extended-binary opcode, which allows for over 65,000 operations. Finally, the binary stream is terminated with a closing curly brace character, }. For example, the binary data for a raster of pixels can be represented as

{cccceexxxxxxxxx}

where cccc is the length of the binary data, ee is the opcode for a raster, and xxxxxxxxx is the raster data. The extended binary opcode and the terminating character, } , are counted as part of the binary data stream. Thus, the value of cccc is 12 (nine x's, two e's, and one } ) in this example, which is encoded as a little-endian binary value.
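The length rule can be checked with a small writer sketch. The helper name extended_binary is hypothetical, the opcode is passed as two raw bytes because this text does not state their byte order, and treating the length as unsigned is an assumption.

```python
import struct


def extended_binary(opcode: bytes, payload: bytes) -> bytes:
    """Wrap payload as { + length + opcode + payload + }."""
    if len(opcode) != 2:
        raise ValueError("extended binary opcodes are two bytes long")
    # The count covers the opcode, the payload and the closing brace.
    length = len(opcode) + len(payload) + 1
    return b"{" + struct.pack("<L", length) + opcode + payload + b"}"


block = extended_binary(b"ee", b"x" * 9)
print(block)
```

With nine payload bytes the stored count is 12, matching the worked example above.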

Skipping Unrecognized Opcodes

Skipping extended ASCII and extended binary opcodes.

Skipping extended ASCII opcodes

If a reader application does not recognize an extended ASCII opcode, it should keep scanning the file, matching open paren characters, (, with closed paren characters, ), until the terminating closed paren character is found. If a single quote character, ', is found, scanning should continue until a matching single quote is found, ignoring any open paren or closed paren characters inside.

Note: While parentheses may be nested, single quote marks are not, as the latter always contain a single literal string.

A backslash character, \ , indicates a literal character will follow that should not be used for literal string termination. Thus, the following would pass the operand This is\was a 'happy' face :-) comment! to the Comment opcode:

(Comment 'This is\\was a \'happy\' face :-) comment!')
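The scanning rules above can be sketched as follows. The function name skip_extended_ascii is hypothetical, and this version deliberately ignores nested extended binary data, which would need the length-based skip described in the next section.

```python
def skip_extended_ascii(data: bytes, start: int) -> int:
    """Return the index just past the ")" that closes the "(" at data[start]."""
    if data[start:start + 1] != b"(":
        raise ValueError("expected '(' at start")
    depth = 0
    in_quote = False
    i = start
    while i < len(data):
        ch = data[i:i + 1]
        if in_quote:
            if ch == b"\\":
                i += 1              # skip the escaped character
            elif ch == b"'":
                in_quote = False    # end of the literal string
        elif ch == b"'":
            in_quote = True
        elif ch == b"(":
            depth += 1
        elif ch == b")":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unterminated extended ASCII opcode")


data = rb"(Comment 'This is\\was a \'happy\' face :-) comment!')L 1,2 3,4"
end = skip_extended_ascii(data, 0)
print(data[end:])
```

Parentheses inside the quoted string, such as the one in the smiley, do not affect the nesting depth.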

Skipping extended binary opcodes

It is possible, although not recommended, for an extended ASCII opcode to contain nested extended binary data, as in

(Embedded_DWG (FileName house.dwg) {ccccXXXXXXXXXX})

where cccc represents a 4-byte little-endian integer indicating the length of the binary data: 11 in this example, covering the ten bytes represented by "XXXXXXXXXX" plus the terminating curly brace, "}".

To skip any binary object, either opcode or operand data, the four-byte count cccc must be used rather than searching for the curly brace character, }. Also, notice that this method allows a reader application to skip even a nested set of binary streams, as the parent stream's cccc count includes the subobjects' data.

If the four-byte binary data run count cccc has the value zero, this indicates that the DWF writing application was unable to compute the length of the binary data. Such an opcode cannot be skipped, and therefore the reading application must either know how to parse the opcode, or must fail to read the remainder of the .dwf file. Obviously, DWF writing applications should refrain from this practice whenever possible.
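A length-based skip, including the zero-count case, might look like the sketch below. The name skip_extended_binary is hypothetical, the count is assumed to be unsigned, and the closing-brace check is an extra sanity test rather than something the format requires of a reader.

```python
import struct


def skip_extended_binary(data: bytes, start: int) -> int:
    """Return the index just past the block that opens with "{" at data[start]."""
    if data[start:start + 1] != b"{":
        raise ValueError("expected '{' at start")
    if start + 5 > len(data):
        raise ValueError("truncated length field")
    (count,) = struct.unpack_from("<L", data, start + 1)
    if count == 0:
        raise ValueError("length unknown (count is zero); cannot skip")
    end = start + 5 + count
    # The count includes the closing brace, so it should be the last byte.
    if end > len(data) or data[end - 1:end] != b"}":
        raise ValueError("length field does not land on a closing brace")
    return end


inner = b"{" + struct.pack("<L", 6) + b"\x00\x02" + b"abc" + b"}"
outer = b"{" + struct.pack("<L", 2 + len(inner) + 1) + b"\x00\x01" + inner + b"}"
print(skip_extended_binary(outer + b"L 1,2 3,4", 0))
```

Because the outer count already covers the nested block, the reader never has to look inside it.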

The Logical Coordinate System

Most of the coordinates specified in .dwf files are in logical coordinates, as opposed to screen or device coordinates. Logical coordinates are specified as the positive range of 32-bit signed integers (31 bits of precision) with a legal range from 0 to a maximum of 2,147,483,647 (2³¹- 1). Normally, a DWF writing application should scale the geometric primitives of the illustration that is being stored so that a large portion of this 31-bit range is used. This allows a DWF reading application to scale the illustration for the desired display or a user to zoom in on the drawing with sufficient precision to render fine details.

Integer Versus Floating-Point Values

Thirty-two-bit integer values are used because they allow for more precision and greater computing speed than 32-bit floating-point values. In the common IEEE 754 single-precision format, 9 of a floating-point number's 32 bits store the sign and exponent, leaving only 24 bits of true precision (23 stored bits plus an implicit leading bit), not to be confused with a floating-point number's large range.

If a map were drawn with DWF's 31-bit integer coordinates, over 21,000 kilometers (>12,000 miles) of distance could be uniquely resolved down to 1-centimeter increments. If 32-bit floating-point coordinates were used, only 167 kilometers (100 miles) could be resolved to this level of detail. By contrast, AutoCAD uses 64-bit double-precision floating-point coordinates to address this issue, with a resulting 52 bits of precision and an enormous range. For the more limited purpose of representing an electronic plot, DWF's 31 bits of precision are more than adequate.
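The distances quoted above follow from simple arithmetic, which the sketch below reproduces. It assumes one logical unit per centimeter and uses IEEE 754 single precision (via the struct module) to stand in for "32-bit floating point".

```python
import struct

CM_PER_KM = 100_000

int_steps = 2 ** 31      # distinct non-negative values of a 31-bit coordinate
float_steps = 2 ** 24    # consecutive integers a 24-bit significand holds exactly

print(int_steps / CM_PER_KM)     # 21474.83648 km at 1 cm per unit
print(float_steps / CM_PER_KM)   # 167.77216 km at 1 cm per unit

# Past 2**24 a 32-bit float can no longer tell neighbouring integers apart.
roundtrip = struct.unpack("<f", struct.pack("<f", float(2 ** 24 + 1)))[0]
print(roundtrip == float(2 ** 24))
```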

Relative Logical Coordinates

Depending upon the opcode in use, these logical coordinate values may be encoded in a .dwf file literally as absolute coordinates or as relative coordinates. Whereas absolute logical coordinates may only range from 0 to 2,147,483,647 (31-bit unsigned), relative coordinates may range from negative 2,147,483,647 to positive 2,147,483,647 (32-bit signed). A relative coordinate is formed from an absolute coordinate by taking the literal coordinate and subtracting from it the previous absolute coordinate in the file.

Relative coordinates are used in order to increase the effectiveness of the DWF's data compression algorithm, which tries to find repeating patterns of data. Common drawings have objects, represented by sequences of lines, circles, and so forth, that may occur multiple times, such as the four tires in an illustration of a car. If absolute coordinates were used, each of the lines and circles that make up an object would have differing coordinates for each instance of that object in an illustration, due to their differing positions. If relative coordinates are used, however, only the first coordinate in the sequence of coordinates differs for each instance of the object. Since the remainder of the coordinate sequence is independent of the object instance, the data compression algorithm will find longer and more frequent sequences of repeating data.
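The effect on repeated objects can be seen in a short sketch. The helper names to_relative and to_absolute are hypothetical, and starting from a "previous coordinate" of (0, 0) is an assumption made here for illustration.

```python
def to_relative(points, origin=(0, 0)):
    """Turn absolute (x, y) points into deltas from the previous point."""
    deltas = []
    prev_x, prev_y = origin
    for x, y in points:
        deltas.append((x - prev_x, y - prev_y))
        prev_x, prev_y = x, y
    return deltas


def to_absolute(deltas, origin=(0, 0)):
    """Inverse of to_relative: accumulate deltas back into absolute points."""
    points = []
    x, y = origin
    for dx, dy in deltas:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


# The same square drawn at two different positions.
square_a = [(10, 10), (20, 10), (20, 20), (10, 20)]
square_b = [(510, 310), (520, 310), (520, 320), (510, 320)]
print(to_relative(square_a))
print(to_relative(square_b))
```

Only the first delta differs between the two squares, so a compressor that looks for repeated byte sequences sees the remaining three deltas as identical data.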

Sixteen bit coordinates

For many applications, the extreme level of detail allowed by DWF's 31-bit logical coordinates is not necessary and may be undesirable due to the increased file size needed to store such large values. For this reason, many drawing operations allow 16-bit signed relative coordinates to be used. When a DWF reading application is given either a 32-bit or a 16-bit relative coordinate, its value is converted to a full 31-bit absolute logical coordinate before use. This is a lossless form of compression since the full 31 bits of precision are preserved even when storing only a 16-bit value.
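A writer could choose between the two widths along the lines of the sketch below. The function pack_deltas is hypothetical; a real writer would also emit the matching opcode for the chosen width, which is left out here.

```python
import struct


def pack_deltas(deltas):
    """Pack (dx, dy) pairs as little-endian 16-bit values when every value
    fits in a signed 16-bit integer, otherwise as 32-bit values.
    Returns (bit_width, packed_bytes)."""
    flat = [value for pair in deltas for value in pair]
    fits16 = all(-32768 <= value <= 32767 for value in flat)
    fmt = "<%d%s" % (len(flat), "h" if fits16 else "l")
    return (16 if fits16 else 32), struct.pack(fmt, *flat)


print(pack_deltas([(10, 0), (0, -10)]))
print(pack_deltas([(40000, 0)]))
```

The stored deltas are exact in both cases, which is why the narrower encoding loses no precision once the reader adds them back onto the previous absolute coordinate.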

How Opcodes are Selected

When assigning an opcode to an operation, apply the following principles:

By default, a new operation is assigned to an extended ASCII opcode so that it is humanly readable, can be easily skipped by older DWF reading applications, and doesn't use one of the scarce single-byte opcodes. If the operand to this operation represents binary information, this binary information should be converted to a readable ASCII form for storage in the .dwf file.
If the operation's operand represents a large amount of binary information which would be inefficient to convert to a readable ASCII form, then the operation should be assigned an extended binary opcode. This has the advantage of storing the binary data more efficiently from a processing and storage point of view. Also, this choice doesn't use one of the scarce single-byte opcodes and it still allows older DWF reading applications to skip unrecognized operations.
If an operation is expected to be used frequently in a .dwf file and the operation has a relatively small operand, it should be assigned one of the scarce single-byte opcodes. For common operations with large operands, the space savings from a single-byte opcode is not justified and an extended style opcode should be used instead. Single-byte opcodes must be known to the DWF-reading application and, thus, cannot be skipped.

Execution of the File Data Block

To preserve proper drawing order, the opcodes found in the .dwf file should be executed in the order they are received.

File Termination Trailer

The DWF trailer is simply a special opcode indicating the end of the DWF data sequence, normally at the end of the file. It is possible for an application to store non-DWF data following the .dwf file termination opcode.

Table 2b. File trailer

Byte	0	1	2	3	4	5	6	7	8	9
Character	(	E	n	d	O	f	D	W	F	)
ASCII (Hex)	28	45	6E	64	4F	66	44	57	46	29

Example

Following is an example of a .dwf file that uses readable ASCII opcodes exclusively. This same example could be represented more efficiently using binary coding, but it is difficult to show this in a printed document.

(DWF V01.00)
(DrawingInfo

(SourceFilename house plan.dwg)
(Description 'Blueprints for the first floor of the new library.')
(Creator 'AutoCAD R13C4')
(Created 820497600 'January 1, 1996')
(Author 'Brian P. Mathews, Autodesk')
(Bounds 1,1 9750000,256234000)
(Scale 0.0001 0.0001 meter)
(Projection stretched)

)
(Comment This is a DWF comment! )
(View 30,40 6500,9000)
(Layer 1 Electrical)

(Comment changing the color from the default)
C 132
(Comment 'drawing some lines :-)' )
L 25,30 250,400 L 100,150 120,119 L 200,150 120,219
L 400,150 120,19
(Comment Removing some of the spaces)
L25,30 250,400L100,150 120,119L200,150 120,219

(Layer 2 Heating)

C12
L250,300 250,400
(URL 'http://www.autodesk.com')

v (Comment The following line wont be visible.)
L40,15 120,19
V (Comment Subsequent geometry will be visible.)

(URL)
L430,115 120,19

(EndOfDWF)

Opcode Mnemonics

Table 1. Opcode mnemonics

Mnemonic	Meaning
<B>	Unsigned byte of binary memory
<S>	Signed short integer (two bytes) of binary memory, stored in little-endian format. If used to represent a coordinate value, then it is interpreted to be a relative coordinate.
<US>	Unsigned short integer (two bytes) of binary memory, stored in little-endian format. If used to represent a coordinate value, then it is interpreted to be an absolute coordinate.
<L>	Signed long integer (four bytes) of binary memory, stored in little-endian format. If used to represent a coordinate value, then it is interpreted to be a relative coordinate.
<UL>	Unsigned long integer (four bytes) of binary memory, stored in little-endian format. If used to represent a coordinate value, then it is interpreted to be an absolute coordinate.
<I>	Integer (signed or unsigned) value expressed as a series of ASCII-coded bytes (including the hyphen (-) and the ASCII digits 0-9); sequence is terminated by a non-digit character.
<F>	Floating-point (signed or unsigned) value expressed as a series of ASCII-coded bytes (including the hyphen (-), the period (.), "E", and the ASCII digits 0-9).
<R>	Nested (recursive) set of opcodes.
<T>	ASCII text string (a series of ASCII characters) terminated by a ")" and possibly enclosed in ' marks.
[ ]	Optional pattern that may occur in the data stream exactly once, or not at all.
[ ]⁺	Pattern that must occur in the data stream at least one time and may repeat multiple times.
[ ]^*	Pattern that may occur once, many times, or not at all.
\	Indicates that the expression continues on the next line because it would not fit in the table properly (the \ is not part of the data stream).
<ws>	Number of ASCII white-space bytes, including the space, tab, line feed, or carriage return characters.

Opcodes Listed by Format

This section lists opcodes by single byte, extended ASCII, and extended binary formats. It is a convenient, quick reference to the standard opcode definitions in chapter 5.

Single Byte Opcodes

Table 1. Single byte formatted opcodes

ASCII	Hex	Operand Format	Refer to
Ctrl-C	03	<B><B><B><B>	Set Color
Ctrl-f	06	<B_count>[<US_Ecount>]\ <B_c-i>+<B_b+i><L_cs>\ <L_p&f>	Set Font
Ctrl-G	07	<B>[<US>]<S><S><UL>\ <S><S><UL>[<S><S><UL>]⁺	Draw Gouraud Polytriangle
Ctrl-k	0B	<B_CS-count>[<US_CS-count>]\ [<B_P-counti>[<US_P-counti>]]⁺\ <S_x1><S_y1>[<S_xj><S_yj>]⁺	Draw Contour Set
Ctrl-L	0C	<S><S><S><S>	Draw Line
-	8D	<B>[<US>][<S><S>]⁺	Draw Polymarker
Ctrl-P	10	<B>[<US>][<S><S>]⁺	Draw Polyline
Ctrl-R	12	<S><S><US>	Draw Circle
Ctrl-x	18	<L_f><L_H><L_x><L_y> <B_count>[<US_Ecount>]<US_ci>⁺	Draw Text
C	43	[<ws>]<I>	Set Color
E	45	[<ws>]<I>,<I><ws><I>,<I>\ <ws><I>,<I><ws><I>	Draw Ellipse
F	46	*None*	Set Fill Mode
G	47	[<ws>]<I>	Set Marker Glyph
K	71	<B_CS-count>[<US_CS-count>]\ [<B_P-counti>[<US_P-counti>]]⁺\ <L_x1><L_y1>[<L_xj><L_yj>]⁺	Draw Contour Set
L	4C	[<ws>]<I>,<I><ws><I>,<I>	Draw Line
M	4D	[<ws>]<I>[<ws><I>,<I>]⁺	Draw Polymarker
O	4F	<UL><UL>	Set CurrentPoint.
P	50	[<ws>]<I>[<ws><I>,<I>]⁺	Draw Polyline/Polygon
R	52	[<ws>]<I>,<I>,<I><ws><I>,<I>	Draw Circle
S	53	[<ws>]<I>	Set Marker Size
V	56	*None*	Set Visibility
b	62	<B>[<US>]<L><L>\ [<L><L><L><L><L><L>]⁺	Draw PolyBézier curve
c	63	<B>	Set Color
e	65	<L><L><UL><UL><US><US><US>	Draw Ellipse
f	66	*None*	Set Fill Mode
g	67	<B>[<US>]<L><L><UL>\ <L><L><UL>[<L><L><UL>]⁺	Draw Gouraud Polytriangle
l	6C	<L><L><L><L>	Draw Line
m	6D	<B>[<US>][<L><L>]⁺	Draw Polymarker
p	70	<B>[<US>][<L><L>]⁺	Draw Polyline
r	72	<L><L><UL>	Draw Circle
s	73	<UL>	Set Marker Size
t	74	<B>[<US>][<L><L>]⁺	Draw Polytriangle
v	76	*None*	Set Visibility
w	77	<B>[<US>][<L><L>]⁺	Draw Textured Polytriangle
x	78	<US_ws><US_ics><L_f><US_q>\ <B_os-count>[<US_os-Ecount>]\ [<B_os-posi>[<US_os-Eposi>]]^*\ <B_us-count>[<US_us-Ecount>]\ [<B_us-posj>[<US_us-Eposj>]]^*\ <L_f><L_H><L_x><L_y><L_D0><L_D1>\ <L_D2><L_D3><L_D4><L_D5><L_D6><L_D7>\ <B_count>[<US_Ecount>]<US_ci>⁺	Draw Text
N/A	87	<US>	Set Marker Glyph
N/A	8C	<B>[<S><S><S><S>]⁺	Draw Line
N/A	92	<L><L><UL><US><US>	Draw Circle

Extended ASCII Opcodes

Table 2. Extended ASCII formatted opcodes

ASCII Extended Opcode	Operand Format	Refer to
(Author	<ws><T>)	Define Drawing Author
(Background	<ws><I>,<I>,<I>,<I>[<ws>])	Define Drawing Background
(Bezier	<ws><I><ws><I>,<I>\ [<ws><I>,<I><ws><I>,<I><ws><I>,\ <I>]⁺[<ws>])	Draw PolyBezier curve
(Bounds	<ws><I>,<I><ws><I>,<I>[<ws>])	Define Drawing Bounds
(Clip	<ws><I>,<I><ws><I>,<I>[<ws>])	Set Clip
(Color	<ws><I>,<I>,<I>,<I>[<ws>])	Set Color
(ColorMap	<ws><I>[<ws><I>,<I>,<I>,<I>]⁺[<ws>])	Set Color Map
(Comment	[<ws><T>])	Comment
(ContourSet	<ws><I_CS-count>[<ws><I_P-counti>]⁺\ <ws><I_xl>,<I_yl>[<ws><I_xj>,<I_yj>]⁺	Draw Contour Set
(Created	<ws><I><ws><T>)	Define Drawing Creation Time
(Creator	[<ws><T>])	Define Drawing Creator
(Description	[<ws><T>])	Define Drawing Description.
(DrawingInfo	[<ws><T>])	Define Drawing Information Block
(EmbedFile	<ws>(<T>/<T>;[<T>])\ <ws>(<T>)<ws>(<T>)<ws>{<D>})	Embed Source File
(EmbedRef	<ws>(<T>/<T>;[<T>])\ <ws>(<T>)<ws>(<T>)<ws>(<T>))	Embed Source File
(Gouraud	<ws><I><ws><I>,<I><ws>\ <I>,<I>,<I>,<I><ws><I>,<I><ws>\ <I>,<I>,<I>,<I>[<ws><I>,<I><ws>\ <I>,<I>,<I>,<I>]⁺[<ws>])	Draw Gouraud Polytriangle
(Image	<ws><I>,<I><ws>\ <I>,<I><ws><I>,<I><ws><T>\ [<ws>[,]<I>]⁺)	Draw Image
(Layer	<ws><I>[<ws><T>])	Set Layer
(LineCap	<ws><T>)	Set Line Cap
(LineJoin	<ws><T>)	Set Line Join
(LinePattern	<ws><T>)	Set Line Pattern
(LineWeight	<ws><I>[<ws>])	Set Line Weight
(Modified	<ws><I><ws><T>)	Define Drawing Modification Time
(Projection	<ws><T>)	Set Projection
(Scale	<ws><F><ws><F><ws><T>)	Define Drawing Scale
(SourceCreated	<ws><I><ws><T>)	Define Source Drawing Creation Time
(SourceFilename	[<ws><T>])	Define Source Drawing Filename
(SourceModified	<ws><I><ws><T>)	Define Source Drawing Modification Time
(URL	[<ws><T>])	Set URL Link
(View	[<ws><I>,<I><ws><I>,<I>][<ws>])	Define Initial View

Extended Binary Opcodes

Table 3. Extended binary formatted opcodes

Extended Binary Opcode (Hex)	Operand Format	Refer to
00 01	<B>[<B><B><B><B>]⁺	Set Color Map
00 02	<US><US><L><L><L><L>\ '<T>'[<B>]⁺	Draw Image
00 03	<US><L><L><UL><R>	Define Marker Glyph.