
May 18, 2005

Fragment Program Reference

The OpenGL Extensions Guide has a great chapter on the ARB fragment program language, and I recommend buying it, but there aren't many good references online. The most useful is the official spec, but it's designed as an exhaustive guide, not a quick reference for programmers. Here's a rundown of the instruction set, plus some tips and tricks.

Here's a cut-out-and-keep table of all instructions, based on table X.5 in the spec:



ARB fragment program instructions

Key: v = vector, s = scalar, ssss = scalar result replicated to all four components, ss-- = results written to .x and .y only, u = texture image unit, t = texture target.

| Instruction | Inputs | Output | Description | Pseudocode | Notes |
|---|---|---|---|---|---|
| ABS | v | v | absolute value | fabs(arg1) | |
| ADD | v,v | v | add | arg1+arg2 | |
| CMP | v,v,v | v | compare | if (arg1<0) arg2 else arg3 | |
| COS | s | ssss | cosine | cos(arg1) | Synthesised using five instructions on ATI R3xx |
| DP3 | v,v | ssss | 3-component dot product | (arg1.x*arg2.x)+(arg1.y*arg2.y)+(arg1.z*arg2.z) | |
| DP4 | v,v | ssss | 4-component dot product | (arg1.x*arg2.x)+(arg1.y*arg2.y)+(arg1.z*arg2.z)+(arg1.w*arg2.w) | Emulated with 4 native instructions on ATI R3xx |
| DPH | v,v | ssss | homogeneous dot product | (arg1.x*arg2.x)+(arg1.y*arg2.y)+(arg1.z*arg2.z)+arg2.w | |
| DST | v,v | v | distance vector | Funky, see spec | |
| EX2 | s | ssss | exponential base 2 | pow(2,arg1) | Using negative args on NVidia gives different results to ATI |
| FLR | v | v | floor | floor(arg1) | |
| FRC | v | v | fraction | arg1-floor(arg1) | |
| KIL | v | v | kill fragment | if (any component of arg1<0) discard fragment | |
| LG2 | s | ssss | logarithm base 2 | log2(arg1) | |
| LIT | v | v | compute light coefficients | Funky, see spec | |
| LRP | v,v,v | v | linear interpolation | (arg2*arg1)+(arg3*(1-arg1)) | Order is lerpValue, end, start |
| MAD | v,v,v | v | multiply and add | (arg1*arg2)+arg3 | Really useful to reduce instruction counts |
| MAX | v,v | v | maximum | if (arg1<arg2) arg2 else arg1 | |
| MIN | v,v | v | minimum | if (arg1>arg2) arg2 else arg1 | |
| MOV | v | v | move | arg1 | |
| MUL | v,v | v | multiply | arg1*arg2 | |
| POW | s,s | ssss | exponentiate | pow(arg1,arg2) | |
| RCP | s | ssss | reciprocal | 1/arg1 | |
| RSQ | s | ssss | reciprocal square root | 1/sqrt(arg1) | |
| SCS | s | ss-- | sine/cosine | result.x=cos(arg1) result.y=sin(arg1) | Synthesised using multiple instructions on ATI R3xx |
| SGE | v,v | v | set on greater than or equal | if (arg1>=arg2) 1.0 else 0.0 | |
| SIN | s | ssss | sine | sin(arg1) | Synthesised using five instructions on ATI R3xx |
| SLT | v,v | v | set on less than | if (arg1<arg2) 1.0 else 0.0 | |
| SUB | v,v | v | subtract | arg1-arg2 | |
| SWZ | v | v | extended swizzle | Funky, see spec | Synthesised using multiple instructions on ATI R3xx |
| TEX | v,u,t | v | texture sample | | Texture instructions are almost always the performance bottleneck |
| TXB | v,u,t | v | texture sample with bias | | |
| TXP | v,u,t | v | texture sample with projection | | |
| XPD | v,v | v | cross product | [(arg1.y*arg2.z-arg1.z*arg2.y),(arg1.z*arg2.x-arg1.x*arg2.z),(arg1.x*arg2.y-arg1.y*arg2.x)] | |

Always specify a write mask if you're only using some components of an instruction's result. The compilers generally aren't smart enough to figure out that you only use the .x component of the result later on, and both vendors' hardware has clever tricks it can play executing vector and scalar instructions in parallel.

Try to calculate everything you can using arithmetic instructions rather than doing table lookups from textures. Memory access almost always seems to be the limiting factor on the speed of our fragment programs, so you've got a lot of free instruction slots that can be filled with extra calculations while the hardware's waiting on memory.

ATI cards natively support simple swizzling: either masking out some components in the result register (ADD foo.xy, bob, jim;) or duplicating a single component across the whole register (ADD foo, bob, jim.x;).
Anything more complicated will be emulated using multiple instructions (ADD foo, bob.zyzy, jim; or ADD foo, bob.xxxy, jim;).
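
If the swizzle and mask syntax is new to you, this C sketch emulates what those ADD examples do to the registers. The reg4 type and helper names are invented for the illustration; it models only the semantics, not how any particular card executes them.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical register type: c[0..3] hold x, y, z, w. */
typedef struct { float c[4]; } reg4;

/* Map a component letter (x, y, z or w) to its index. */
static int component_index(char name)
{
    const char *names = "xyzw";
    const char *p = strchr(names, name);
    assert(name != '\0' && p != NULL);
    return (int)(p - names);
}

/* Source swizzle: a four-letter pattern such as "zyzy" reorders components,
   a single letter such as "x" replicates that component to all four. */
reg4 swizzle(reg4 src, const char *pattern)
{
    size_t n = strlen(pattern);
    reg4 out;
    assert(n == 1 || n == 4);
    for (int i = 0; i < 4; ++i)
        out.c[i] = src.c[component_index(pattern[n == 1 ? 0 : i])];
    return out;
}

/* Destination write mask: only the named components of dst are overwritten. */
void write_masked(reg4 *dst, reg4 value, const char *mask)
{
    for (const char *m = mask; *m != '\0'; ++m) {
        int i = component_index(*m);
        dst->c[i] = value.c[i];
    }
}

/* Component-wise ADD. */
reg4 add4(reg4 a, reg4 b)
{
    reg4 r;
    for (int i = 0; i < 4; ++i)
        r.c[i] = a.c[i] + b.c[i];
    return r;
}
```

With these helpers, ADD foo.xy, bob, jim; corresponds to write_masked(&foo, add4(bob, jim), "xy"), and ADD foo, bob.zyzy, jim; corresponds to foo = add4(swizzle(bob, "zyzy"), jim).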

On ATI, using GL_TEXTURE_RECTANGLE_EXT textures in TEX instructions (RECT as the target) will generate hidden instructions to convert the coordinates to the 0 to 1 range, from the input range of 0 to the texture's width and height in texels that the extension uses. This is especially tricky because it adds another hidden level of texture indirection.
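
I don't know exactly which instructions the driver inserts, but the conversion itself amounts to a scale by one over the texture size, since rectangle textures are addressed in texels. Here's a minimal C sketch of that arithmetic, with a made-up texcoord2 type and function name:

```c
#include <assert.h>

/* Hypothetical 2D texture coordinate type for this sketch. */
typedef struct { float s, t; } texcoord2;

/* Convert rectangle-texture coordinates (0..width, 0..height, in texels)
   to the normalized 0..1 range used by ordinary 2D textures. */
texcoord2 rect_to_normalized(texcoord2 rect, float width, float height)
{
    texcoord2 n;
    assert(width > 0.0f && height > 0.0f);
    n.s = rect.s / width;
    n.t = rect.t / height;
    return n;
}
```

So a lookup at texel (128, 64) in a 256 by 128 rectangle texture ends up at (0.5, 0.5) in normalized coordinates.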

Posted by petewarden at May 18, 2005 04:28 PM
