Evaluation of the Linux XIG drivers
for the WildCat 5110 and 6110

Written by Paul Bourke

(/usr/X11R6/lib/X11/AcceleratedX/bin/Xsupport)


The following are observations on the WildCat 5110 and 6110 OpenGL cards and, more importantly, the Linux drivers for them from XIG. All applications are developed in-house and tested on a variety of cards. These cards and the XIG drivers were used in three modes.

While we're not actively using the drivers/card in dual-view mode, I have verified that they function correctly, presenting frame-sequential stereo at 120Hz on a double-width screen. The card is rated to 1280x1024 at 120Hz in dual view, but I've only tested it at 1024x768.


Texture problem with DELL 650s

There seems to be a serious problem with the cards/drivers on the more recent DELL machines we have been purchasing, namely the 650s. Considerable time has been spent trying to track this down, without success. The "guess" is that it is a problem between the drivers and the increased cache (256 to 512) in those machines. It manifests itself in applications that make heavy use of textures, and it seems to occur more often during window resizing.

Unresolved. This has prompted us to switch to the Quadro4-900 cards and Linux drivers from NVIDIA.


Refresh sync on 6110

I don't seem to be able to find the sync-to-vertical-refresh option on the WildCat 6110 Xsetup panel. Am I missing something, or doesn't it exist? My digital movie playback looks pretty bad without it (compared to the WildCat 5110).

We had to remove the option from Xsetup because of some problems we were seeing with server / direct-rendering-client communication. The feature is now controlled via an environment variable: set the OGL_NO_VBLANK variable to 0 (zero) before running your app. If you are executing from a bash shell, just 'export OGL_NO_VBLANK=0'.
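
If it is more convenient to set this per application rather than per shell, something like the following should also work. This is only a sketch, assuming the driver reads OGL_NO_VBLANK at GL context creation, so setenv() must be called before any GLUT/GL initialisation:

   #include <stdlib.h>
   #include <GL/glut.h>

   int main(int argc,char **argv)
   {
      /* 0 = wait for the vertical blank; must be set before the context exists */
      setenv("OGL_NO_VBLANK","0",1);

      glutInit(&argc,argv);
      /* ... create the window, register callbacks, enter glutMainLoop() ... */
      return 0;
   }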


16:9 resolutions

How does one choose 16:9 screen resolutions? For example, to drive a DLP projector whose native resolution is 1280x720 pixels.

Our server doesn't support 1280x720 officially. However, you may be able to get it to work using the Xtiming I've created for you below. If that doesn't work, we'll probably have to get one of these projectors in so we can add support for it officially. Note that I'm assuming the projector runs at 60Hz. Here's what you'll have to do to try this:

1) Edit your /usr/X11R6/lib/X11/AcceleratedX/etc/Xtimings file, and paste the information below into it.

2) Edit your board's .xqa file to include 1280x720 in all [RESOLUTION] sections. The name of the .xqa file for your board can be found in /etc/Xaccel.ini, and it is located in /usr/X11R6/lib/X11/AcceleratedX/boards/YourBoardVendor/.

You should then be able to configure 1280x720 @ 60Hz in Xsetup.

[PREADJUSTED_TIMING]
    PreadjustedTimingName = "1280x720 @ 60Hz";

    HorPixel              = 1280;
    VerPixel              = 720;
    PixelWidthRatio       = 4;
    PixelHeightRatio      = 3;
    HorFrequency          = 44.772;
    VerFrequency          = 60.016;
    ScanType              = NONINTERLACED;
    HorSyncPolarity       = NEGATIVE;
    VerSyncPolarity       = POSITIVE;
    CharacterWidth        = 8;
    PixelClock            = 74.500;
    HorTotalTime          = 22.336;
    HorAddrTime           = 17.181;
    HorBlankStart         = 17.181;
    HorBlankTime          = 5.154;
    HorSyncStart          = 17.933;
    HorSyncTime           = 1.826;
    VerTotalTime          = 16.663;
    VerAddrTime           = 16.082;
    VerBlankStart         = 16.082;
    VerBlankTime          = 0.581;
    VerSyncStart          = 16.104;
    VerSyncTime           = 0.067;
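
As a sanity check (useful when adapting this entry for another projector), the numbers above are self-consistent, assuming the usual Xtimings units of MHz, kHz, usec and msec: 1/44.772kHz gives the 22.336usec total line time; 74.5MHz x 17.181usec of active line time gives the 1280 visible pixels; 16.082msec / 22.336usec gives the 720 visible lines; and 1/16.663msec gives the roughly 60Hz vertical refresh.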

Hmmmm, this seems unsustainable if one needs a different entry for each projector... it depends of course on how many of the parameters are critical. There are, for example, half a dozen native 16:9 projectors we're considering for an upcoming project. As time goes on, 16:9 is going to be more prevalent for home-theatre type projection, etc.

I will let engineering know that there is some demand for official 16:9 support. If these projectors are similar in specification, however, and they all run at 1280x720, the same .vda file _should_ work for all of them. As long as they all do 1280x720 @ 60Hz, I would be surprised if they didn't work with the same configuration. The Xtiming is system-wide, and the .xqa file is specific to the card being used, so the only variable would be the .vda file.


3D texture and mipmaps

There seems to be a problem with 3D textures and mipmapping where the lower corner of the 3D texture data appears to be corrupted by the mipmaps.

For example, the following works fine (only a 32x32x32 3D texture):
   glGenTextures(1,&texid);
   glBindTexture(GL_TEXTURE_3D,texid);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_WRAP_S,GL_CLAMP);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_WRAP_T,GL_CLAMP);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_WRAP_R,GL_CLAMP);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_MIN_FILTER,GL_LINEAR);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_MAG_FILTER,GL_LINEAR);
   glTexImage3D(GL_TEXTURE_3D,0,4,ntex,ntex,ntex,0,GL_RGBA,GL_UNSIGNED_BYTE,tex3d);
while the following doesn't:
   glGenTextures(1,&texid);
   glBindTexture(GL_TEXTURE_3D,texid);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_WRAP_S,GL_CLAMP);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_WRAP_T,GL_CLAMP);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_WRAP_R,GL_CLAMP);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_MIN_FILTER,GL_LINEAR_MIPMAP_LINEAR);
   glTexParameteri(GL_TEXTURE_3D,GL_TEXTURE_MAG_FILTER,GL_LINEAR);
   glTexImage3D(GL_TEXTURE_3D,0,4,ntex,ntex,ntex,0,GL_RGBA,GL_UNSIGNED_BYTE,tex3d);
   gluBuild3DMipmaps(GL_TEXTURE_3D,4,ntex,ntex,ntex,GL_RGBA,GL_UNSIGNED_BYTE,tex3d);
(Screenshots omitted: one with mipmaps, showing the corrupted corner, and one without.)
Unresolved.
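
A possible workaround, untested against this driver and only a sketch: bypass gluBuild3DMipmaps and upload each mipmap level yourself with glTexImage3D. Here ntex is assumed to be a power of two, and BoxFilter3D is a hypothetical helper returning a new RGBA volume with each 2x2x2 cell averaged down to one texel:

   int level = 0, size = ntex;
   unsigned char *src = tex3d, *dst;

   /* texture object and parameters set up as in the failing example above */
   glTexImage3D(GL_TEXTURE_3D,0,GL_RGBA,size,size,size,0,
                GL_RGBA,GL_UNSIGNED_BYTE,src);
   while (size > 1) {
      dst = BoxFilter3D(src,size);   /* hypothetical 2x2x2 box filter */
      size /= 2;
      level++;
      glTexImage3D(GL_TEXTURE_3D,level,GL_RGBA,size,size,size,0,
                   GL_RGBA,GL_UNSIGNED_BYTE,dst);
      if (src != tex3d)
         free(src);
      src = dst;
   }
   if (src != tex3d)
      free(src);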


Line Loop Limitation

Quote from page 102 of the "Blue Book": Regardless of the value chosen for mode, there is no limit to the number of vertices that can be defined between glBegin and glEnd.

Unfortunately glBegin(GL_LINE_LOOP) will cause a severe crash if the number of vertices is 512 or more (the demo below fails for N >= 511, presumably because the loop-closing vertex pushes the count to 512). A crappy demo program is given here: test2.c

   glBegin(GL_LINE_LOOP);         /* this fails for N >= 511 */
   /* glBegin(GL_LINE_STRIP); */  /* works for "any" number */
   for (i=0;i<N;i++) {
      phi = TWOPI * i / (double)N;
      p.z = 0.5;
      p.y = 2 * cos(phi);
      p.x = 1.5 * sin(phi);
      glVertex3f(p.x,p.y,p.z);
   }
   glEnd();
Fixed in 2.1-3 of the driver and V25 of xsvc.
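
For driver versions without the fix, one workaround sketch (my suggestion, not XIG's) is to emulate GL_LINE_LOOP with GL_LINE_STRIP, closing the loop by reissuing the first vertex:

   glBegin(GL_LINE_STRIP);
   for (i=0;i<=N;i++) {           /* note <=: vertex 0 is sent again at the end */
      phi = TWOPI * (i % N) / (double)N;
      glVertex3f(1.5*sin(phi),2*cos(phi),0.5);
   }
   glEnd();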


Flood drawing

If frames are drawn as fast as possible (glutPostRedisplay() on every glut idle event) the machine quickly "hangs". Actually the graphics are still being drawn, but the mouse and keyboard freeze; I guess they aren't getting any events or interrupts. I don't experience this on any other system I work with. The fault doesn't necessarily occur immediately, but it can be hurried along with a few mouse clicks, window shifts, etc.
Here's a small piece of code that illustrates the problem: test1.c.

For example:
/*
   Only run the idle callback while the window is visible
*/
void HandleVisibility(int visible)
{
   if (visible == GLUT_VISIBLE)
      glutIdleFunc(HandleIdle);
   else
      glutIdleFunc(NULL);
}

/*
   What to do on an idle event
*/
void HandleIdle(void)
{
   /* This throttled version works fine (GetRunTime() returns elapsed seconds)
   static double tstart = -1;
   double tstop;
   double target = 0.04;

   if (tstart < 0)
      tstart = GetRunTime();
   tstop = GetRunTime();
   if (tstop - tstart > target) {
      glutPostRedisplay();
      tstart = tstop;
   }
   */
   /* Just the following will eventually fail */
   glutPostRedisplay();
}
Unresolved.


Sync to refresh

The sync-to-refresh option as provided in the Xsetup utility doesn't seem to have any effect. My sample model runs at 100+ frames per second even though the refresh may be set to 60Hz (for DLP projectors, say). Tearing and uneven playback are obvious on many types of content.

Fixed in 2.1-2 with xsvc 22, verified!


Image flashing

I notice many colour flashes when first launching OpenGL applications. The flash seems to be parts of the last image displayed. No amount of buffer clearing in OpenGL makes any difference. This is VERY disturbing when running full-screen applications and transitioning between them through a black screen!

That is a hardware issue with the WC2/WC3. What happens is that there is something I would call a pixel-cache for drawing, i.e. some on-chip memory that is not written back to the framebuffer immediately after rendering. We have found instances where the display controller's notion of what is actually to be displayed and the pixel-cache's notion of what needs to be written back disagree. 3Dlabs told us about a workaround for this problem which they have used in their Windows drivers (a forced flush of the pixel-cache). However, using this exact workaround causes hangs under Linux. We are aware of the issue and are trying to resolve it (without hanging the graphics chip, of course).


Background process performance

Currently, if I run an OpenGL application (one which isn't particularly demanding of memory or CPU) I then can't connect to the machine remotely, or at least only VERY slowly. It almost seems as if the OpenGL application is using every iota of CPU and not allowing other processes or the kernel to do anything. All KDE window functions seem to run very slowly; I can almost see the bottom panel drawing itself.

Previously unresolved; seems to be fixed in 2.1-3 with xsvc 25.


Screen size

2.1-3 with xsvc 25 has introduced a height reporting problem. For example, both
    glutReshapeFunc(HandleReshape);
and
    glutGet(GLUT_WINDOW_HEIGHT);
report a height one pixel less than they should!

Fixed with 2.1-4.


Drawing over other windows

It seems that if I bring another X window up in front of my OpenGL window, the X window gets overwritten. This occurs for things like the "forms" library, which I use a lot, but also for more basic things such as glut menus. It ends up being pretty messy.

To my slight annoyance this has stopped happening. "Annoyance" because I worry that it might start happening again.


Display lists

Using display lists results in a HUGE performance degradation in some unusual cases: from 100fps to 5fps for a mere 2500 lines in the display list. The "red book" states that using display lists should be no worse than issuing the instructions individually.

Here's an example; it works fine without display lists but runs like a dead dog in a display list:
   for (j=0;j<360;j+=6) {
      glPushMatrix();
      glRotatef((double)j,0.0,1.0,0.0);
      for (i=-140;i<140;i++) {
         glLineWidth(1.0+drand48()*4);
         glColor3f(0.7+drand48()/4,0.7+drand48()/4,0.7+drand48()/4);
         glBegin(GL_LINES); 
         x = r1 + r1 * cos(i*DTOR);
         y = r2 * sin(i*DTOR);
         z = 0;
         glVertex3f(x,y,z);   
         x = r1 + r1 * cos((i+1)*DTOR);
         y = r2 * sin((i+1)*DTOR);
         z = 0;
         glVertex3f(x,y,z);
         glEnd();
      }   
      glPopMatrix();      
   }

By anyone's standards this is poor OpenGL code, but it is what one uses when line segments are drawn each with a different thickness (glLineWidth() cannot be called between glBegin and glEnd). Sure, they can be sorted and drawn in chunks, as sketched below. The same applies to points!

The workaround is easy.
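
For instance, a minimal sketch of the chunking approach, analogous to the point-size example below (seg, nseg and the field names are illustrative, and the segments are assumed pre-sorted by width):

   float oldwidth = seg[0].width;

   glLineWidth(oldwidth);
   glBegin(GL_LINES);
   for (i=0;i<nseg;i++) {
      if (seg[i].width != oldwidth) {
         glEnd();                 /* width changes must happen outside glBegin/glEnd */
         oldwidth = seg[i].width;
         glLineWidth(oldwidth);
         glBegin(GL_LINES);
      }
      glVertex3f(seg[i].p1.x,seg[i].p1.y,seg[i].p1.z);
      glVertex3f(seg[i].p2.x,seg[i].p2.y,seg[i].p2.z);
   }
   glEnd();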


Performance of points

Unlike other cards/drivers I've used, there is a huge performance difference between the following two.

   for (i=0;i<ndot;i++) {
      glBegin(GL_POINTS);
      SetColour(dot[i].colour,transparency,dot[i].p);
      glVertex3f(dot[i].p.x,dot[i].p.y,dot[i].p.z);
      glEnd();
   }
and
   glBegin(GL_POINTS);
   for (i=0;i<ndot;i++) {
      SetColour(dot[i].colour,transparency,dot[i].p);
      glVertex3f(dot[i].p.x,dot[i].p.y,dot[i].p.z);
   }
   glEnd();

Sure, the second is obviously better, but the performance hit is higher than expected. Why would one want to do the first? For cases where there are lots of point size changes: glPointSize() cannot be called between glBegin and glEnd.

The solution of course is to sort by size and do something like:

   oldsize = dot[0].size;
   glPointSize(oldsize);
   glBegin(GL_POINTS);
   for (i=0;i<ndot;i++) {
      if (oldsize != dot[i].size) {
         oldsize = dot[i].size;
         glEnd();
         glPointSize(oldsize);
         glBegin(GL_POINTS);
      }
      SetColour(dot[i].colour,transparency,dot[i].p);
      glVertex3f(dot[i].p.x,dot[i].p.y,dot[i].p.z);
   }
   glEnd();


View port size

The maximum viewport size, "glGetIntegerv(GL_MAX_VIEWPORT_DIMS,num);", is limited to 4096 square instead of the much larger sizes I'm used to on other cards. This makes the standard method for creating high-resolution images for printing a little restrictive: one creates a large viewport, renders portions of it in a grid fashion, saves the buffer for each grid portion, and merges all the pieces together into a high-resolution result.

XIG inform me that this is as big as it can be, due to a limit on the WildCat of 8192 less some framing space because of a hardware bug ... details sketchy. This isn't a terribly serious problem because in most cases (not all) 4096 pixels is high enough resolution, and there are ways around the limitation.
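
For reference, a sketch of the grid method mentioned above, staying under the 4096 limit: each tile re-renders the scene through a sub-frustum and is read back individually (DrawScene and SaveTile are hypothetical application routines, and the window is assumed to be at least tilew x tileh):

   #include <stdlib.h>
   #include <GL/gl.h>

   void RenderTiles(int nx,int ny,int tilew,int tileh,
                    double left,double right,double bottom,double top,
                    double near,double far)
   {
      int i,j;
      double dx = (right - left) / nx;
      double dy = (top - bottom) / ny;
      unsigned char *buffer = malloc(4 * tilew * tileh);

      for (j=0;j<ny;j++) {
         for (i=0;i<nx;i++) {
            glViewport(0,0,tilew,tileh);
            glMatrixMode(GL_PROJECTION);
            glLoadIdentity();
            /* the frustum for just this tile of the full image plane */
            glFrustum(left+i*dx,left+(i+1)*dx,bottom+j*dy,bottom+(j+1)*dy,near,far);
            glMatrixMode(GL_MODELVIEW);
            DrawScene();                    /* hypothetical: application draw code */
            glReadPixels(0,0,tilew,tileh,GL_RGBA,GL_UNSIGNED_BYTE,buffer);
            SaveTile(i,j,buffer);           /* hypothetical: write the tile out */
         }
      }
      free(buffer);
   }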


Unsupported at the moment