Mathematical typesetting using HTML

Written by Paul Bourke
January 1994


Warning
There may well be parts of this document which "don't look right" given the exact browser/machine/fonts combination. Consider this to be a personal example of the problem of using mathematics in HTML documents; all the examples below function in the author's environment.


Mathematical typesetting within WWW documents using HTML has always been problematic. HTML, as a page description/layout language, was designed around a minimal client/browser model, that is, the content needed to be reliably displayed on all platforms no matter how primitive. Indeed, it wasn't even assumed that the browser would have graphical capabilities (eg: lynx), and it was recommended that authors of HTML files which included graphics would include the "alt" attribute with images so that browsers unable to display images would have some idea what the image was about.
It was also assumed that the HTML document would not be able to make assumptions regarding fonts; most browsers give control over fonts to the user, not the author. One immediate issue this raises is the inability of authors to use typefaces consisting of Greek symbols, a common convention in scientific documentation.

The remainder of this document will discuss some of the solutions to the problem of delivering mathematical typesetting within HTML files. The examples shown here should display correctly on the major WWW browsers; they were, however, designed using NetScape version 3.

Graphical equations

Replacing equations with images has the advantage that the equation can be rendered with whatever typesetting tools are available; the image representation is placed in the document as a GIF image. As an example, the following was created by hand using a standard drawing package.

The following was extracted from a LaTeX-based document.

While this technique works well for equations on a line by themselves, it can look ugly when an attempt is made to include images on a line with other text. The typeface is normally different, has a different size, and the images will not normally align vertically with the text.
There is an align middle attribute for the <img> tag which can be used to align the symbols closer to the right line height; see for example the following four lower case Greek characters: pi (), beta (), lower case x (), phi ().
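
The image approach described above might be written along the following lines. This is only a sketch: the file names pi.gif, beta.gif and equation1.gif are hypothetical, and the sentence text is invented for illustration. The alt text gives browsers that cannot display images something readable.

```html
<!-- Inline symbols: pi.gif and beta.gif are hypothetical file names
     for small symbol images; align="middle" moves them closer to the
     line height of the surrounding text. -->
<p>
The circumference is 2 <img src="pi.gif" align="middle" alt="pi"> r,
and the angle <img src="beta.gif" align="middle" alt="beta"> is fixed.
</p>

<!-- An equation on a line by itself; equation1.gif is also hypothetical. -->
<p align="center">
<img src="equation1.gif" alt="x/a + y/b + z/c = 1">
</p>
```
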
The whole alphabet based on the standard "Symbol" font is

Character set

The only assured typeface is the so called "Latin 1" set. The characters from 32 to 126 are the standard ASCII characters; the characters from 160 to 255 are mostly accented letters. There are some useful symbols, for example, plus and minus (±), degree (°), and double angle brackets («»).
Symbols with special HTML significance have & escape sequences, namely ampersand (&amp;), less than (&lt;) and greater than (&gt;).
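
As a small illustration of the symbols mentioned above, the fragment below writes them as character entities. The entity names (plusmn, deg, laquo, raquo) are not given in the text above; they are the standard Latin 1 entity names, and the sentences are made up for the example.

```html
<!-- Latin 1 symbols written as named character entities -->
<p>
Tolerance: 5 &plusmn; 0.1 mm<br>
Angle: 45&deg;<br>
Quotation: &laquo;text&raquo;
</p>

<!-- Characters with special meaning in HTML must be escaped -->
<p>
a &lt; b, b &gt; c, and AT&amp;T
</p>
```

The numeric forms, for example &#177; for plus and minus and &#176; for degree, should behave the same way where a browser does not recognise the names.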

Some extensions to HTML allow font specification. This will not be discussed here as it obviously depends on the character sets available on the client's computer. For normal text, if the viewer sees a different font from the one specified they can at least still read it. If the font was used because it contains special symbols, then the viewer cannot in general determine any meaning if the result is displayed in another font.

Superscript and subscript

The HTML specification is changing slowly and the browsers evolve to follow those trends; the browsers also often add additional features. For some time now it has been possible to superscript and subscript using the tags <sup> </sup> and <sub> </sub>. So, for example, the following work inline with text: $N^2$, $P_{ij}$, and $f(t) = A(t)\,e^{-jwt}$.

While the exact implementation of these is browser specific, they normally automatically reduce the font size of the subscript and superscript. Unfortunately they don't work ideally in combination, as in $P_{ij}{}^{2}$, that is, the ij doesn't appear directly below the 2.
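
The inline expressions discussed above could be marked up roughly as follows. This is a sketch of the source only; the exact rendering depends on the browser.

```html
<!-- Superscripts and subscripts inline with text -->
<p>
N<sup>2</sup>,
P<sub>ij</sub>, and
f(t) = A(t) e<sup>-jwt</sup>
</p>

<!-- The combination: the superscript follows the subscript on the line,
     so the 2 is typically drawn to the right of the ij, not above it -->
<p>
P<sub>ij</sub><sup>2</sup>
</p>
```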

The Greek character set images used above can also have superscripts and subscripts added to them, although they can't be used as superscripts or subscripts themselves, for example, a character image with the superscript n.
This can be extended to other commonly used symbols such as integrals () and square roots ()
$\int_0^1 f(x)\,dx = 2$
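
A symbol image combined with limits might be written as below. The file names integral.gif and pi.gif are hypothetical, and the choice of pi for the Greek character is only an assumption for the sake of the example.

```html
<!-- pi.gif is a hypothetical image of a Greek character,
     followed by a superscript n -->
<p>
<img src="pi.gif" align="middle" alt="pi"><sup>n</sup>
</p>

<!-- integral.gif is a hypothetical image of the integral sign;
     the limits 0 and 1 are added with <sub> and <sup> -->
<p>
<img src="integral.gif" align="middle" alt="integral"><sub>0</sub><sup>1</sup>
f(x) dx = 2
</p>
```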

Space alignment

Another approach is to use the <pre> </pre> tags and space align symbols, fractions, etc.
   x   y   z
   - + - + - = 1
   a   b   c 
Unfortunately not all browsers allow you to include other HTML styles within <pre> </pre>. NetScape does, so the following will work:
   x²   y²   z²
   -  + -  + -  = r²
   a²   b²   c²
In general this technique should be avoided. Not only is it not very attractive, it also makes assumptions about the font size when applying other HTML formatting such as bold, italic, etc.
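
For reference, the source for the two space aligned examples above might look something like the sketch below. Note that the tags inside <pre> take up room in the source but not on the screen, so the columns have to be counted as they will be displayed, not as they are typed.

```html
<!-- Plain space alignment: works wherever <pre> is supported -->
<pre>
   x   y   z
   - + - + - = 1
   a   b   c
</pre>

<!-- With superscripts inside <pre>: relies on the browser allowing
     other tags within <pre>, which not all browsers do -->
<pre>
   x<sup>2</sup>   y<sup>2</sup>   z<sup>2</sup>
   -  + -  + -  = r<sup>2</sup>
   a<sup>2</sup>   b<sup>2</sup>   c<sup>2</sup>
</pre>
```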

Tables

Tables have been available in the HTML specification for quite a while, for example:

Column 1   Column 2   Column 3   Column 4   Column 5
Row 1      Left          Right   Centered   4
Row 2      1          2          3          4
Row 3      1          2          3          4
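
A table like the one above could be produced by markup along these lines. The align attributes on the first row are a guess based on the cell labels (Left, Right, Centered), not something confirmed by the original source.

```html
<table border="1">
<tr>
  <th>Column 1</th> <th>Column 2</th> <th>Column 3</th>
  <th>Column 4</th> <th>Column 5</th>
</tr>
<tr>
  <td>Row 1</td>
  <!-- cell alignment assumed from the labels -->
  <td align="left">Left</td>
  <td align="right">Right</td>
  <td align="center">Centered</td>
  <td>4</td>
</tr>
<tr>
  <td>Row 2</td> <td>1</td> <td>2</td> <td>3</td> <td>4</td>
</tr>
<tr>
  <td>Row 3</td> <td>1</td> <td>2</td> <td>3</td> <td>4</td>
</tr>
</table>
```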

Tables can be used to lay out multiple items. An obvious application is to place labels on the right for equations; the line below consists of a two column, one row table. The first cell holds the equation and the second the right aligned equation number.
. . . . (3a)
This is obviously nicer than including the label within the graphic image, since the above method will automatically adjust with page size.
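
A minimal sketch of such a two column, one row table is given below. The file name equation3a.gif is hypothetical, and the width and alignment settings are one possible choice, not necessarily those used for the example above.

```html
<!-- equation3a.gif is a hypothetical file name for the equation image -->
<table border="0" width="100%">
<tr>
  <!-- first cell: the equation itself -->
  <td align="center"><img src="equation3a.gif" alt="equation 3a"></td>
  <!-- second cell: the equation number, pushed to the right margin -->
  <td align="right">(3a)</td>
</tr>
</table>
```

Because the table is set to the full page width, the label stays at the right margin when the window is resized.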

A table without borders is suitable for items in regular rows and columns:

\begin{array}{rcrcrcr}
x_0 & + & x_1  & + & x_2 & = & 0  \\
    & - & x_1  & - & x_2 & = & 1  \\
x_0 & + & 2x_1 & + & x_2 & = & 15
\end{array}

Each of the above items, including the individual signs, is in its own table cell.
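
One way the borderless table above might be written is sketched here. The right alignment of the terms is an assumption made so that the columns line up; the original markup may have differed.

```html
<table border="0">
<tr>
  <td align="right">x<sub>0</sub></td> <td>+</td>
  <td align="right">x<sub>1</sub></td> <td>+</td>
  <td align="right">x<sub>2</sub></td> <td>=</td>
  <td align="right">0</td>
</tr>
<tr>
  <!-- empty first cell: this equation has no x0 term -->
  <td></td> <td>-</td>
  <td align="right">x<sub>1</sub></td> <td>-</td>
  <td align="right">x<sub>2</sub></td> <td>=</td>
  <td align="right">1</td>
</tr>
<tr>
  <td align="right">x<sub>0</sub></td> <td>+</td>
  <td align="right">2x<sub>1</sub></td> <td>+</td>
  <td align="right">x<sub>2</sub></td> <td>=</td>
  <td align="right">15</td>
</tr>
</table>
```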

Linking

The primary means of moving around the WWW is through links. These may be links to other relevant documents anywhere else on the WWW. Links can also refer to positions within the same document by using the <a name="---"> </a> link reference. This is particularly useful for jumping from the main text to a figure or reference whenever it is mentioned in the document. For example see equation 1.
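
A link to a position within the same document might be set up as in the following sketch. The anchor name "eq1" and the file name equation1.gif are hypothetical.

```html
<!-- Target: a named anchor placed just before the equation -->
<a name="eq1"></a>
<p align="center">
<img src="equation1.gif" alt="equation 1">
</p>

<!-- Elsewhere in the same document: a link that jumps to the anchor -->
<p>
For example see <a href="#eq1">equation 1</a>.
</p>
```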

Summary

The above illustrates that while conventional mathematical typesetting in HTML is still not possible, there are a large number of effects that can be produced with "standard" HTML and which will thus be viewed correctly on the vast majority of browsers. Simple inline expressions can mostly be dealt with; the others can be made into images and placed on a line by themselves.