KTX File Format Specification

22nd December 2018

Editors: Mark Callow, Georg Kolling and Jacob Ström

Abstract

KTX is a format for storing textures for OpenGL® and OpenGL® ES applications. It is distinguished by the simplicity of the loader required to instantiate a GL texture object from the file contents.

Status of this document

This is the approved final specification.

File Structure

Byte[12] identifier
UInt32 endianness
UInt32 glType
UInt32 glTypeSize
UInt32 glFormat
UInt32 glInternalFormat
UInt32 glBaseInternalFormat
UInt32 pixelWidth
UInt32 pixelHeight
UInt32 pixelDepth
UInt32 numberOfArrayElements
UInt32 numberOfFaces
UInt32 numberOfMipmapLevels
UInt32 bytesOfKeyValueData
  
for each keyValuePair that fits in bytesOfKeyValueData
    UInt32   keyAndValueByteSize
    Byte     keyAndValue[keyAndValueByteSize]
    Byte     valuePadding[3 - ((keyAndValueByteSize + 3) % 4)]
end
  
for each mipmap_level in numberOfMipmapLevels¹
    UInt32 imageSize
    for each array_element in numberOfArrayElements²
       for each face in numberOfFaces³
           for each z_slice in pixelDepth²
               for each row or row_of_blocks in pixelHeight²
                   for each pixel or block_of_pixels in pixelWidth
                       Byte data[format-specific-number-of-bytes]⁴
                   end
               end
           end
           Byte cubePadding[0-3]
       end
    end
    Byte mipPadding[0-3]
end

¹ Replace with 1 if this field is 0 or if glInternalFormat is one of the GL_PALETTE* formats from GL_OES_compressed_paletted_texture.
² Replace with 1 if this field is 0.
³ Must be 1 if glInternalFormat is one of the GL_PALETTE* formats from GL_OES_compressed_paletted_texture.
⁴ Uncompressed texture data matches a GL_UNPACK_ALIGNMENT of 4.
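
As an illustration of the fixed-size part of this layout, the following Python sketch reads the identifier and the thirteen UInt32 fields that follow it. The function name parse_header, the constant names and the returned dictionary are assumptions made for this example and are not defined by this specification.

```python
import struct

KTX_IDENTIFIER = bytes([0xAB, 0x4B, 0x54, 0x58, 0x20, 0x31, 0x31, 0xBB,
                        0x0D, 0x0A, 0x1A, 0x0A])
HEADER_FIELDS = (
    "endianness", "glType", "glTypeSize", "glFormat", "glInternalFormat",
    "glBaseInternalFormat", "pixelWidth", "pixelHeight", "pixelDepth",
    "numberOfArrayElements", "numberOfFaces", "numberOfMipmapLevels",
    "bytesOfKeyValueData",
)
HEADER_SIZE = 12 + 4 * len(HEADER_FIELDS)  # 12-byte identifier + 13 UInt32 = 64


def parse_header(data):
    """Return the header fields as a dict (hypothetical helper)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to hold a KTX header")
    if data[:12] != KTX_IDENTIFIER:
        raise ValueError("identifier mismatch, not a KTX file")
    # Try both byte orders; the endianness field says which one is right.
    for order in ("<", ">"):
        values = struct.unpack_from(order + "13I", data, 12)
        if values[0] == 0x04030201:
            header = dict(zip(HEADER_FIELDS, values))
            header["byte_order"] = order  # "<" little endian, ">" big endian
            return header
    raise ValueError("endianness field is neither 0x04030201 nor its reverse")
```

The key/value data and the mipmap levels follow the 64 bytes consumed here.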

Field Descriptions

identifier

The file identifier is a unique set of bytes that will differentiate the file from other types of files. It consists of 12 bytes, as follows:

Byte[12] FileIdentifier = {
   0xAB, 0x4B, 0x54, 0x58, 0x20, 0x31, 0x31, 0xBB, 0x0D, 0x0A, 0x1A, 0x0A
}

This can also be expressed using C-style character definitions as:

Byte[12] FileIdentifier = {
    '«', 'K', 'T', 'X', ' ', '1', '1', '»', '\r', '\n', '\x1A', '\n'
}

The rationale behind the choice of values in the identifier is based on the rationale for the identifier in the PNG specification. This identifier both identifies the file as a KTX file and provides for immediate detection of common file-transfer problems.

Byte [0] is chosen as a non-ASCII value to reduce the probability that a text file may be misrecognized as a KTX file.
Byte [0] also catches bad file transfers that clear bit 7.
Bytes [1..6] identify the format, and are the ASCII values for the string "KTX 11".
Byte [7] is for aesthetic balance with byte [0] (they are a matching pair of double-angle quotation marks).
Bytes [8..9] form a CR-LF sequence which catches bad file transfers that alter newline sequences.
Byte [10] is a control-Z character, which stops file display under MS-DOS, and further reduces the chance that a text file will be falsely recognised.
Byte [11] is a final line feed, which checks for the inverse of the CR-LF translation problem.
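
The checks listed above can be sketched in Python as follows. The function name check_identifier and the returned strings are illustrative assumptions; the specification only requires that the 12 bytes match.

```python
KTX_IDENTIFIER = b"\xABKTX 11\xBB\r\n\x1A\n"


def check_identifier(prefix):
    """Classify the first bytes of a file (hypothetical helper)."""
    if prefix[:12] == KTX_IDENTIFIER:
        return "ok"
    # A transfer that clears bit 7 turns 0xAB into 0x2B and 0xBB into 0x3B.
    if prefix[:12] == bytes(b & 0x7F for b in KTX_IDENTIFIER):
        return "bit 7 cleared"
    # The first 8 bytes survived but the CR-LF / control-Z / LF tail did not,
    # which points at newline translation or text-mode handling.
    if prefix[:8] == KTX_IDENTIFIER[:8]:
        return "tail altered"
    return "not KTX"
```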

endianness

endianness contains the number 0x04030201 written as a 32-bit integer. If the file is little endian then this is represented as the bytes 0x01 0x02 0x03 0x04. If the file is big endian then this is represented as the bytes 0x04 0x03 0x02 0x01. When reading endianness as a 32-bit integer produces the value 0x04030201 then the endianness of the file matches the endianness of the program that is reading the file and no conversion is necessary. When reading endianness as a 32-bit integer produces the value 0x01020304 then the endianness of the file is opposite the endianness of the program that is reading the file, and in that case the program reading the file must endian convert all header bytes and, if glTypeSize > 1, all texture data to the endianness of the program (i.e. a little endian program must convert from big endian, and a big endian program must convert from little endian).
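
A minimal Python sketch of this test follows. It reads the four bytes in the native byte order of the running program, as the paragraph above describes. The function name needs_conversion is an assumption made for this example.

```python
import struct


def needs_conversion(endianness_bytes):
    """True if the file's byte order is opposite to this program's."""
    # "=I" reads an unsigned 32-bit integer in the native byte order.
    (value,) = struct.unpack("=I", endianness_bytes)
    if value == 0x04030201:
        return False  # same byte order, no conversion necessary
    if value == 0x01020304:
        return True   # opposite byte order, swap header fields and data
    raise ValueError("invalid endianness field")
```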

glType

For compressed textures, glType must equal 0. For uncompressed textures, glType specifies the type parameter passed to glTex{,Sub}Image*D, usually one of the values from table 8.2 of the OpenGL 4.4 specification (UNSIGNED_BYTE, UNSIGNED_SHORT_5_6_5, etc.).

glTypeSize

glTypeSize specifies the data type size that should be used when endianness conversion is required for the texture data stored in the file. If glType is not 0, this should be the size in bytes corresponding to glType. For texture data which does not depend on platform endianness, including compressed texture data, glTypeSize must equal 1.
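
To make the role of glTypeSize concrete, here is a hedged Python sketch that reverses the bytes of each glTypeSize-sized element. The function name swap_texture_data is hypothetical. The sketch assumes the caller has already determined that conversion is required.

```python
def swap_texture_data(data, gl_type_size):
    """Reverse the byte order of every gl_type_size-byte element."""
    if gl_type_size < 1 or len(data) % gl_type_size:
        raise ValueError("data length is not a multiple of glTypeSize")
    if gl_type_size == 1:
        return bytes(data)  # byte-sized or compressed data needs no swap
    out = bytearray(len(data))
    # Byte i of every element comes from byte (size - 1 - i) of that element.
    for i in range(gl_type_size):
        out[i::gl_type_size] = data[gl_type_size - 1 - i::gl_type_size]
    return bytes(out)
```

For example, with glTypeSize equal to 2 the bytes 01 02 03 04 become 02 01 04 03, and with glTypeSize equal to 4 they become 04 03 02 01.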

glFormat

For compressed textures, glFormat must equal 0. For uncompressed textures, glFormat specifies the format parameter passed to glTex{,Sub}Image*D, usually one of the values from table 8.3 of the OpenGL 4.4 specification (RGB, RGBA, BGRA, etc.).

glInternalFormat

For compressed textures, glInternalFormat must equal the compressed internal format, usually one of the values from table 8.14 of the OpenGL 4.4 specification. For uncompressed textures, glInternalFormat specifies the internalformat parameter passed to glTexStorage*D or glTexImage*D, usually one of the sized internal formats from tables 8.12 & 8.13 of the OpenGL 4.4 specification. The sized format should be chosen to match the bit depth of the data provided. glInternalFormat is used when loading both compressed and uncompressed textures, except when loading into a context that does not support sized formats, such as an unextended OpenGL ES 2.0 context where the internalformat parameter is required to have the same value as the format parameter.

glBaseInternalFormat

For both compressed and uncompressed textures, glBaseInternalFormat specifies the base internal format of the texture, usually one of the values from table 8.11 of the OpenGL 4.4 specification (RGB, RGBA, ALPHA, etc.). For uncompressed textures, this value will be the same as glFormat and is used as the internalformat parameter when loading into a context that does not support sized formats, such as an unextended OpenGL ES 2.0 context.

pixelWidth, pixelHeight, pixelDepth

The size of the texture image for level 0, in pixels. No rounding to block sizes should be applied for block compressed textures.

For 1D textures pixelHeight and pixelDepth must be 0. For 2D and cube textures pixelDepth must be 0.

numberOfArrayElements

numberOfArrayElements specifies the number of array elements. If the texture is not an array texture, numberOfArrayElements must equal 0.

numberOfFaces

numberOfFaces specifies the number of cubemap faces. For cubemaps and cubemap arrays this should be 6. For non cubemaps this should be 1. Cube map faces are stored in the order: +X, -X, +Y, -Y, +Z, -Z.

Because GL_OES_compressed_paletted_texture does not define the interaction between cubemaps and its GL_PALETTE* formats, numberOfFaces must be 1 if glInternalFormat is one of the GL_PALETTE* formats.
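
The rules for pixelHeight, pixelDepth, numberOfArrayElements and numberOfFaces together determine the kind of texture stored in the file. The following Python sketch shows one way a loader might classify it. The function name texture_kind and the returned labels are assumptions made for this example.

```python
def texture_kind(pixel_height, pixel_depth, array_elements, faces):
    """Return a label such as "2D" or "cube array" (hypothetical helper)."""
    if faces == 6:
        # Cube textures must have pixelDepth == 0 and a non-zero height.
        if pixel_height == 0 or pixel_depth != 0:
            raise ValueError("invalid dimensions for a cubemap")
        base = "cube"
    elif faces == 1:
        if pixel_height == 0:
            if pixel_depth != 0:
                raise ValueError("1D textures must have pixelDepth == 0")
            base = "1D"
        elif pixel_depth == 0:
            base = "2D"
        else:
            base = "3D"
    else:
        raise ValueError("numberOfFaces should be 1 or 6")
    # A non-zero numberOfArrayElements marks an array texture.
    return base + (" array" if array_elements > 0 else "")
```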

numberOfMipmapLevels

numberOfMipmapLevels must equal 1 for non-mipmapped textures. For mipmapped textures, it equals the number of mipmaps. Mipmaps are stored in order from largest size to smallest size. The first mipmap level is always level 0. A KTX file does not need to contain a complete mipmap pyramid. If numberOfMipmapLevels equals 0, it indicates that a full mipmap pyramid should be generated from level 0 at load time (this is usually not allowed for compressed formats).

For the GL_PALETTE* formats, this equals the number of mipmaps and is passed as the levels parameter when uploading to OpenGL {,ES}. However, all levels are packed into a single block of data along with the palette, so numberOfMipmapLevels is considered to be 1 in the for loop over the data. Individual mipmaps are not identifiable.
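
As a worked illustration, the Python sketch below computes the dimensions of a given mipmap level and the number of levels in a full pyramid. It relies on the usual OpenGL rule that each level halves the previous one, rounding down, with a minimum of 1; that rule comes from the OpenGL specification rather than from this document. The function names are assumptions made for this example.

```python
def mip_dimensions(width, height, depth, level):
    """Size of one mip level; unused (0) dimensions are treated as 1."""
    return tuple(max(1, max(1, d) >> level) for d in (width, height, depth))


def full_pyramid_levels(width, height=0, depth=0):
    """Number of levels from level 0 down to the 1x1x1 level."""
    # floor(log2(largest dimension)) + 1, computed with integer arithmetic.
    return max(width, height, depth, 1).bit_length()
```

For a 32 x 32 texture, full_pyramid_levels gives 6 levels (32, 16, 8, 4, 2, 1).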

bytesOfKeyValueData

An arbitrary number of key/value pairs may follow the header. This can be used to encode any arbitrary data. The bytesOfKeyValueData field indicates the total number of bytes of key/value data including all keyAndValueByteSize fields, all keyAndValue fields, and all valuePadding fields. The file offset of the first imageSize field is located at the file offset of the bytesOfKeyValueData field plus the value of the bytesOfKeyValueData field plus 4.
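
The offset arithmetic in the last sentence can be written out as a short Python sketch. The constant and function names are assumptions made for this example. The offset 60 follows from the 12-byte identifier and the twelve UInt32 fields that precede bytesOfKeyValueData.

```python
IDENTIFIER_SIZE = 12
# Twelve UInt32 fields precede bytesOfKeyValueData in the header.
BYTES_OF_KEY_VALUE_DATA_OFFSET = IDENTIFIER_SIZE + 12 * 4  # = 60


def first_image_size_offset(bytes_of_key_value_data):
    """File offset of the first imageSize field."""
    return BYTES_OF_KEY_VALUE_DATA_OFFSET + bytes_of_key_value_data + 4
```

With no key/value data the first imageSize field is at offset 64; with 16 bytes of key/value data it is at offset 80.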

keyAndValueByteSize

keyAndValueByteSize is the number of bytes of combined key and value data in one key/value pair following the header. This includes the size of the key, the NUL byte terminating the key, and all the bytes of data in the value. If the value is a UTF-8 string it should be NUL terminated and the keyAndValueByteSize should include the NUL character (but code that reads KTX files must not assume that value fields are NUL terminated). keyAndValueByteSize does not include the bytes in valuePadding.

keyAndValue

keyAndValue contains 2 separate sections. First it contains a key encoded in UTF-8 without a byte order mark (BOM). The key must be terminated by a NUL character (a single 0x00 byte). Keys that begin with the 3 ASCII characters 'KTX' or 'ktx' are reserved and must not be used except as described by this spec (this version of the KTX spec defines a single key). Immediately following the NUL character that terminates the key is the Value data.

The Value data may consist of any arbitrary data bytes. Any byte value is allowed. It is encouraged that the value be a NUL terminated UTF-8 string but this is not required. UTF-8 strings must not contain BOMs. If the Value data is binary, it is a sequence of bytes rather than of words. It is up to the vendor defining the key to specify how those bytes are to be interpreted (including the endianness of any encoded numbers). If the Value data is a string of bytes then the NUL termination should be included in the keyAndValueByteSize byte count (but programs that read KTX files must not rely on this).

valuePadding

valuePadding contains between 0 and 3 bytes of value 0x00 to ensure that the byte following the last byte in valuePadding is at a file offset that is a multiple of 4. This ensures that every keyAndValueByteSize field, and the first imageSize field, is 4 byte aligned. This padding is included in the bytesOfKeyValueData field but not the individual keyAndValueByteSize fields.
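
The three fields keyAndValueByteSize, keyAndValue and valuePadding can be walked as in the following Python sketch. The function name parse_key_values and its parameters are assumptions made for this example; block is taken to be exactly the bytesOfKeyValueData bytes that follow the header, and byte_order is a struct prefix ("<" or ">") chosen from the endianness field.

```python
import struct


def parse_key_values(block, byte_order="<"):
    """Return {key: raw value bytes} for a key/value block (sketch)."""
    pairs = {}
    offset = 0
    while offset + 4 <= len(block):
        (size,) = struct.unpack_from(byte_order + "I", block, offset)
        offset += 4
        if offset + size > len(block):
            raise ValueError("keyAndValueByteSize runs past the block")
        key, nul, value = block[offset:offset + size].partition(b"\x00")
        if not nul:
            raise ValueError("key is not NUL terminated")
        # The value is kept as raw bytes: it may or may not end in NUL.
        pairs[key.decode("utf-8")] = value
        # valuePadding: 0 to 3 bytes up to the next multiple of 4.
        offset += size + (3 - ((size + 3) % 4))
    return pairs
```

The value is deliberately not decoded, since readers must not assume that it is a NUL-terminated string.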

imageSize

For most textures imageSize is the number of bytes of pixel data in the current LOD level. This includes all array layers, all z slices, all faces, all rows (or rows of blocks) and all pixels (or blocks) in each row for the mipmap level. It does not include any bytes in mipPadding.

The exception is non-array cubemap textures (any texture where numberOfFaces is 6 and numberOfArrayElements is 0). For these textures imageSize is the number of bytes in each face of the texture for the current LOD level, not including bytes in cubePadding or mipPadding.
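
As a worked example of the two cases, the Python sketch below computes the expected imageSize for a block-compressed level. The function name and parameters are assumptions made for this example: the caller supplies the block dimensions and the bytes per block for the format in use, since this specification does not tabulate them. The sketch does not apply to uncompressed data whose rows need padding to a 4-byte boundary.

```python
def expected_image_size(width, height, depth, array_elements, faces,
                        block_width, block_height, bytes_per_block):
    """Expected imageSize for one level of a block-compressed texture."""
    # Round up to whole blocks; dimensions of 0 are treated as 1.
    blocks_x = -(-max(1, width) // block_width)
    blocks_y = -(-max(1, height) // block_height)
    face_size = blocks_x * blocks_y * max(1, depth) * bytes_per_block
    if faces == 6 and array_elements == 0:
        return face_size  # non-array cubemap: imageSize covers one face
    return face_size * max(1, array_elements) * faces
```

For a 32 x 32 texture in a format with 4 x 4 blocks of 8 bytes, this gives 8 * 8 * 8 = 512 bytes.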

cubePadding

For non-array cubemap textures (any texture where numberOfFaces is 6 and numberOfArrayElements is 0) cubePadding contains between 0 and 3 bytes of value 0x00 to ensure that the data in each face begins at a file offset that is a multiple of 4. In all other cases cubePadding is empty (0 bytes long).

In practice, cubePadding is empty in the non-array cubemap case as well. The requirement of GL_UNPACK_ALIGNMENT = 4 means the size of uncompressed textures will always be a multiple of 4 bytes. All known compressed formats that are usable for cubemaps have block sizes that are a multiple of 4 bytes.

The field is still shown in case a compressed format emerges with a block size that is not a multiple of 4 bytes.

mipPadding

Between 0 and 3 bytes of value 0x00 to make sure that all imageSize fields are at a file offset that is a multiple of 4.

This is empty for all known texture formats for the reasons given in cubePadding and is retained for the same reason.
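
The way imageSize, cubePadding and mipPadding fit together when stepping through the file can be sketched in Python as follows. The function name iter_mip_levels and its parameters are assumptions made for this example. For the GL_PALETTE* formats the caller is assumed to pass 1 for levels. The sketch also assumes that offset is a multiple of 4, which holds for every imageSize field.

```python
import struct


def iter_mip_levels(data, offset, levels, faces, array_elements,
                    byte_order="<"):
    """Yield, per mip level, the byte chunks described by imageSize."""
    # Only non-array cubemaps store imageSize per face; everything else
    # stores one imageSize covering the whole level.
    chunks_per_level = 6 if (faces == 6 and array_elements == 0) else 1
    for _ in range(max(1, levels)):
        (image_size,) = struct.unpack_from(byte_order + "I", data, offset)
        offset += 4
        chunks = []
        for _ in range(chunks_per_level):
            chunks.append(bytes(data[offset:offset + image_size]))
            offset += image_size
            # cubePadding (per face) or mipPadding (per level): skip to
            # the next multiple of 4. Expected to be 0 for known formats.
            offset += 3 - ((image_size + 3) % 4)
        yield chunks
```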

General comments

The unpack alignment is 4. I.e. uncompressed pixel data is packed according to the rules described in section 8.4.4.1 of the OpenGL 4.4 specification for a GL_UNPACK_ALIGNMENT of 4.
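
For the common case, an unpack alignment of 4 means each row of uncompressed pixels is padded to a multiple of 4 bytes. A small Python sketch, with the function name row_stride assumed for this example:

```python
def row_stride(width, bytes_per_pixel, alignment=4):
    """Bytes from the start of one row to the start of the next."""
    unpadded = width * bytes_per_pixel
    # Round up to the next multiple of the alignment.
    return (unpadded + alignment - 1) // alignment * alignment
```

For example, a row of 3 RGB pixels with one byte per component occupies 9 bytes of pixel data and is stored with a stride of 12 bytes.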

Values listed in tables referred to in the OpenGL 4.4 specification may be supplemented by extensions. The references are given as examples and do not imply that all of those texture types can be loaded in OpenGL ES or earlier versions of OpenGL.

Texture data in a KTX file are arranged so that the first pixel in the data stream for each face and/or array element is closest to the origin of the texture coordinate system. In OpenGL that origin is conventionally described as being at the lower left, but this convention is not shared by all image file formats and content creation tools, so there is abundant room for confusion.

The desired texture axis orientation is often predetermined by, e.g. a content creation tool's or existing application's use of the image. Therefore it is strongly recommended that tools for generating KTX files clearly describe their behaviour, and provide an option to specify the texture axis origin and orientation relative to the logical orientation of the source image. At minimum they should provide a choice between top-left and bottom-left as origin for 2D source images, with the positive S axis pointing right. Where possible, the preferred default is to use the logical upper-left corner of the image as the texture origin. Note that this is contrary to the standard interpretation of GL texture coordinates. However, the majority of texture compression tools use this convention.

As an aid to writing image manipulation tools and viewers, the logical orientation of the data in a KTX file may be indicated in the file's key/value metadata. Note that this metadata affects only the logical interpretation of the data and has no effect on the mapping from pixels in the file byte stream to texture coordinates. The recommended key to use is:

KTXorientation

It is recommended that viewing and editing tools support at least the following values:

S=r,T=d
S=r,T=u
S=r,T=d,R=i
S=r,T=u,R=o

where

S indicates the direction of increasing S values
T indicates the direction of increasing T values
R indicates the direction of increasing R values
r indicates increasing to the right
l indicates increasing to the left
d indicates increasing downwards
u indicates increasing upwards
o indicates increasing out from the screen (moving towards viewer)
i indicates increasing in towards the screen (moving away from viewer)

Although other orientations can be represented, it is recommended that tools that create KTX files use only the values listed above as other values may not be widely supported by other tools.
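
A viewer might parse the KTXorientation value as in the following Python sketch. The function name parse_orientation and the returned dictionary are assumptions made for this example. The accepted letters are those listed above.

```python
ALLOWED_DIRECTIONS = {"S": "rl", "T": "du", "R": "oi"}


def parse_orientation(value):
    """Turn e.g. "S=r,T=d" into {"S": "r", "T": "d"} (sketch)."""
    if isinstance(value, bytes):
        # The value may or may not carry a terminating NUL.
        value = value.rstrip(b"\x00").decode("utf-8")
    result = {}
    for part in value.split(","):
        axis, equals, direction = part.partition("=")
        if (not equals or axis not in ALLOWED_DIRECTIONS
                or len(direction) != 1
                or direction not in ALLOWED_DIRECTIONS[axis]):
            raise ValueError("unrecognised orientation: %r" % part)
        result[axis] = direction
    return result
```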

An example KTX file:

// HEADER
0xAB, 0x4B, 0x54, 0x58, // first four bytes of Byte[12] identifier
0x20, 0x31, 0x31, 0xBB, // next four bytes of Byte[12] identifier
0x0D, 0x0A, 0x1A, 0x0A, // final four bytes of Byte[12] identifier
0x04, 0x03, 0x02, 0x01, // Byte[4] endianness (Big endian in this case)
0x00, 0x00, 0x00, 0x00, // UInt32 glType = 0
0x00, 0x00, 0x00, 0x01, // UInt32 glTypeSize = 1
0x00, 0x00, 0x00, 0x00, // UInt32 glFormat = 0
0x00, 0x00, 0x8D, 0x64, // UInt32 glInternalFormat = GL_ETC1_RGB8_OES
0x00, 0x00, 0x19, 0x07, // UInt32 glBaseInternalFormat = GL_RGB
0x00, 0x00, 0x00, 0x20, // UInt32 pixelWidth = 32
0x00, 0x00, 0x00, 0x20, // UInt32 pixelHeight = 32
0x00, 0x00, 0x00, 0x00, // UInt32 pixelDepth = 0
0x00, 0x00, 0x00, 0x00, // UInt32 numberOfArrayElements = 0
0x00, 0x00, 0x00, 0x01, // UInt32 numberOfFaces = 1
0x00, 0x00, 0x00, 0x01, // UInt32 numberOfMipmapLevels = 1
0x00, 0x00, 0x00, 0x10, // UInt32 bytesOfKeyValueData = 16
// METADATA
0x00, 0x00, 0x00, 0x0A, // UInt32 keyAndValueByteSize = 10
0x61, 0x70, 0x69, 0x00, // UTF8 key:   'api\0'
0x67, 0x6C, 0x65, 0x73, // UTF8 value: 'gles2\0' (first four bytes)
0x32, 0x00, 0x00, 0x00, // last two bytes of value, then Byte[2] valuePadding (2 bytes)
// TEXTURE DATA
0x00, 0x00, 0x02, 0x00, // UInt32 imageSize = 512 bytes
0xD8, 0xD8, 0xD8, 0xDA, // Byte[512] ETC compressed texture data...
...
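
The same example can be assembled in Python to check the sizes and offsets. This is a sketch only: the 512 bytes of ETC data are not given in full above, so zero bytes stand in for them here, and the variable names are assumptions made for this example.

```python
import struct

identifier = bytes([0xAB, 0x4B, 0x54, 0x58, 0x20, 0x31, 0x31, 0xBB,
                    0x0D, 0x0A, 0x1A, 0x0A])
# ">" packs each UInt32 big endian, matching the example file.
header = identifier + struct.pack(
    ">13I",
    0x04030201,  # endianness
    0,           # glType (compressed)
    1,           # glTypeSize
    0,           # glFormat (compressed)
    0x8D64,      # glInternalFormat = GL_ETC1_RGB8_OES
    0x1907,      # glBaseInternalFormat = GL_RGB
    32, 32, 0,   # pixelWidth, pixelHeight, pixelDepth
    0, 1, 1,     # numberOfArrayElements, numberOfFaces, numberOfMipmapLevels
    16,          # bytesOfKeyValueData
)
metadata = struct.pack(">I", 10) + b"api\x00" + b"gles2\x00" + b"\x00\x00"
# Placeholder payload: zero bytes stand in for the real ETC1 data.
texture = struct.pack(">I", 512) + bytes(512)
example = header + metadata + texture
```

The header is 64 bytes, the metadata is 16 bytes, and the imageSize field therefore sits at file offset 80.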

IANA Mime-Type Registration Information

Permission is expressly granted to IANA to copy this section as necessary for managing the MIME types registry.

Type name: Image

Subtype name: ktx

Required parameters: none

Optional parameters: none

Encoding considerations: binary

Security considerations:

The ktx type is a binary data stream which contains no executable code that could disrupt a client processor. There is no provision in the type specification that would allow authors to insert executable code that would present any security risk to a client machine.

Because every item's length is available at its beginning, there is robust defense against corrupted or fraudulent data that might overflow a decoder's buffer. Also the signature bytes provide early detection of common file transmission errors.

The ktx type may contain texture data compressed using OpenGL standard or vendor-specific schemes. These compression schemes are designed so small blocks of data (typically around 64 bits) can be decompressed in real time into a small block of pixels (typically 4x4) during texel fetch. In such schemes it is not possible for a small amount of data to expand enormously because the level of compression is limited; the compressed size is related directly to the number of pixels in the uncompressed image and not to the content of the data.

The ktx type does not provide encryption of the data payload. Users or applications wishing or needing to keep their images confidential must overlay their own encryption on the ktx data during transmission.

Interoperability considerations:

The ktx type includes a field identifying the endianness of the machine which created the data. Applications reading the data are expected to check this field and convert the endianness, if necessary. The texture data payload may be compressed using an OpenGL-vendor-specific scheme. In this case, only devices or applications having a matching decompressor will be able to display the data. The compression scheme is identified in the ktx data so applications can quickly reject data using unsupported schemes.

Published specification:

Applications that use this media type:

Currently only etcpack. It is anticipated it will be widely used by applications built on top of the OpenGL family of standards, OpenGL, OpenGL ES and WebGL, as the means of delivering texture data.

Additional information:

Magic number(s): 12 octets - { 0xAB, 0x4B, 0x54, 0x58, 0x20, 0x31, 0x31, 0xBB, 0x0D, 0x0A, 0x1A, 0x0A }
File extension(s): .ktx
Macintosh file type code(s):

Person & email address to contact for further information: Mark Callow (callow_mark at hicorp.co.jp)

Intended usage: COMMON

Restrictions on usage: none

Authors: Mark Callow, Georg Kolling, Jacob Ström

Change controller: The Khronos Group, the industry consortium responsible for standards such as OpenGL, OpenGL ES, WebGL and OpenCL.

References

Normative references

OpenGL® 4.4 Core Profile, Mark Segal, Kurt Akeley, Jon Leech, July 2013.

This reference is not intended to imply that values in file header fields are limited to those in the referenced tables. New values may be introduced at any time by OpenGL {,ES} extensions or new versions.

OES_compressed_paletted_texture, Aaftab Munshi, July 2003.

Other references

Acknowledgements

This specification was produced by the Khronos OpenGL® ES Working Group.

Special thanks to: Acorn Pooley (NVIDIA), Bruce Merry (ARM).

Revision History

03 Sept 2013: Clarified that glBaseInternalFormat is only used when a context does not support sized formats, updated references to point to the OpenGL 4.4 specification with hyperlinks and included TexStorage*D and TexSubImage*D with the functions whose parameters are provided by a KTX file header.
26 July 2018: Clarified when data needs to be endian converted. Documented handling of paletted texture formats. Documented that cubePadding and mipPadding are likely to be empty. Removed hidden "joke".
22 Dec 2018: Clarified that UTF-8 strings must not contain BOMs and that padding bytes must have the value 0.