The Compiler
Now all we need is the routine that turns a normal sprite into a compiled sprite: a sprite compiler. The sprite compiler is actually simpler than the RLE converter, because with compiled sprites we simply ignore transparent pixels and emit one mov instruction for every visible one. Hold your breath! Here we go:
PROCEDURE ConvertSpriteToCompiled(VAR Sprite : SpriteType);
VAR
  SpriteSrc,                 { Current index into source sprite data }
  SpriteDest : WORD;         { Current index into dest sprite data }
  NewSprite  : SpriteType;   { Temporary sprite }
  Value      : BYTE;         { Color of current pixel }
  X, Y       : INTEGER;      { Current pixel being compiled }
BEGIN
  NewSprite.Width   := Sprite.Width;
  NewSprite.Height  := Sprite.Height;
  NewSprite.DataLen := Sprite.Width*Sprite.Height*5;  { Worst case }
  NewSprite.SType   := STCompiled;
  GETMEM(NewSprite.Data, NewSprite.DataLen);          { Get a chunk of memory }
  SpriteSrc := 0; SpriteDest := 0;
  FOR Y := 0 TO Sprite.Height-1 DO                    { Step over each pixel }
  BEGIN
    FOR X := 0 TO Sprite.Width-1 DO
    BEGIN
      Value := Sprite.Data^[SpriteSrc];               { Speedup... }
      IF Value <> 0 THEN                              { Ignore blank pixels! }
        IF X = 0 THEN        { Special case of no offset from start of line }
        BEGIN
          NewSprite.Data^[SpriteDest  ] := $C6;       { Opcode for MOV }
          NewSprite.Data^[SpriteDest+1] := $05;       { MOD/RM value for 0 offset }
          NewSprite.Data^[SpriteDest+2] := Value;     { Value... }
          INC(SpriteDest, 3);
        END
        ELSE IF X < 128 THEN                          { Can use a byte offset! }
        BEGIN
          NewSprite.Data^[SpriteDest  ] := $C6;       { Opcode for MOV }
          NewSprite.Data^[SpriteDest+1] := $45;       { MOD/RM value for byte ofs }
          NewSprite.Data^[SpriteDest+2] := Lo(X);     { Offset }
          NewSprite.Data^[SpriteDest+3] := Value;     { Value... }
          INC(SpriteDest, 4);
        END
        ELSE
        BEGIN                                         { Must use a word offset... }
          NewSprite.Data^[SpriteDest  ] := $C6;       { Opcode for MOV }
          NewSprite.Data^[SpriteDest+1] := $85;       { MOD/RM value for word ofs }
          NewSprite.Data^[SpriteDest+2] := Lo(X);     { Low byte of the offset }
          NewSprite.Data^[SpriteDest+3] := Hi(X);     { High byte of the offset }
          NewSprite.Data^[SpriteDest+4] := Value;     { Value... }
          INC(SpriteDest, 5);
        END;
      INC(SpriteSrc);
    END;
    IF Y < (Sprite.Height-1) THEN                     { Don't do this on the last line }
    BEGIN                                             { ADD DI, BX: step to next line }
      NewSprite.Data^[SpriteDest  ] := $01;           { Opcode for ADD r/m16, r16 }
      NewSprite.Data^[SpriteDest+1] := $DF;           { MOD/RM byte: r/m = DI, reg = BX }
      INC(SpriteDest, 2);
    END;
  END;
  NewSprite.Data^[SpriteDest] := $CB;                 { Add in our RetF }
  INC(SpriteDest);
  KillSprite(Sprite);                                 { Don't need this anymore! }
  Sprite := NewSprite;
  Sprite.DataLen := SpriteDest;                       { Resize the sprite data to the }
  GETMEM(Sprite.Data, Sprite.DataLen);                { exact amount needed }
  Move(NewSprite.Data^, Sprite.Data^, Sprite.DataLen); { Copy the data }
  KillSprite(NewSprite);                              { Free the worst-case buffer }
END;
In this routine, I made one small and simple optimization. The block starting with "IF Value <> 0 THEN", which writes the mov instruction, uses the smallest form of the instruction it can. The form depends on whether the offset is zero (3 bytes, no displacement), less than 128 (4 bytes, a signed byte displacement), or anything larger (5 bytes, a word displacement). This means that we can still use sprites that are 320 pixels wide, but small sprites don't pay for the larger encoding.
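To see the three encodings side by side, here is a minimal sketch that pulls the mov-writing logic out into a stand-alone helper and prints the bytes it produces. The names EmitMov and CodeBuf are made up for this illustration and are not part of the Sprite unit; the opcode and MOD/RM values are the same ones the routine above writes.

```pascal
PROGRAM MovEncodingSketch;
{ Sketch only: EmitMov and CodeBuf are illustrative names,
  not part of the Sprite unit. }
TYPE
  CodeBuf = ARRAY[0..4] OF BYTE;

{ Writes "MOV BYTE PTR [DI+X], Value" into Buf using the shortest
  form available and returns the number of bytes written. }
FUNCTION EmitMov(VAR Buf : CodeBuf; X : WORD; Value : BYTE) : INTEGER;
BEGIN
  Buf[0] := $C6;              { Opcode for MOV r/m8, imm8 }
  IF X = 0 THEN
  BEGIN
    Buf[1] := $05;            { MOD/RM: [DI], no displacement }
    Buf[2] := Value;
    EmitMov := 3;
  END
  ELSE IF X < 128 THEN
  BEGIN
    Buf[1] := $45;            { MOD/RM: [DI + byte displacement] }
    Buf[2] := Lo(X);
    Buf[3] := Value;
    EmitMov := 4;
  END
  ELSE
  BEGIN
    Buf[1] := $85;            { MOD/RM: [DI + word displacement] }
    Buf[2] := Lo(X);          { Low byte first (little endian) }
    Buf[3] := Hi(X);
    Buf[4] := Value;
    EmitMov := 5;
  END;
END;

CONST
  HexDigits : STRING[16] = '0123456789ABCDEF';
  TestX : ARRAY[1..3] OF WORD = (0, 100, 300);
VAR
  Buf : CodeBuf;
  I, J, Len : INTEGER;
BEGIN
  FOR I := 1 TO 3 DO
  BEGIN
    Len := EmitMov(Buf, TestX[I], 15);   { Color 15 as the pixel value }
    WRITE('X = ', TestX[I]:3, ', ', Len, ' bytes:');
    FOR J := 0 TO Len-1 DO
      WRITE(' ', HexDigits[(Buf[J] SHR 4) + 1], HexDigits[(Buf[J] AND 15) + 1]);
    WRITELN;
  END;
END.
```

For offsets 0, 100 and 300 this should produce the byte sequences C6 05 0F, C6 45 64 0F and C6 85 2C 01 0F, so every pixel in the first 128 columns saves at least one byte over the word form.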
There are a number of optimizations that could be made (as discussed earlier) and will probably make up some future articles... Until then, I give you COMPILED SPRITES!!!
Example Program
This example program shows the relative speeds and sizes of the different kinds of sprites. It profiles them using three different test sprites, to show off the strengths and weaknesses of each of the three types (Standard, RLE, and Compiled). This program uses the timer unit to achieve millisecond accuracy, so that the resolution of the timer does not interfere with the timings. You'll also have to grab the newest Sprite unit, which includes the compiler and all of the other nifty stuff we've been creating.
As an example, here are the values that I received on my 486SX-33 (just the percentages) from the example program:
Sprite Timings
| Sprite type     | Ball    | Checkerboard | Solid box |
|-----------------|---------|--------------|-----------|
| Standard Sprite | 100.00% | 100.00%      | 100.00%   |
| RLE Sprite      | 53.52%  | 179.90%      | 100.75%   |
| Compiled Sprite | 60.55%  | 55.28%       | 122.61%   |
It should be pointed out that the Standard sprite does not use transparency at all, so it is just copying the pixels out. If it did take into consideration the transparency, then it would be much slower.
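To make that difference concrete, here is a small sketch of the two ways a standard sprite row could be copied. It works on plain arrays rather than video memory, and the procedure names are hypothetical stand-ins, not the Sprite unit's real drawing code. The opaque version is a single block move per row; the transparent version has to test every pixel.

```pascal
PROGRAM TransparentCopySketch;
{ Sketch only: the buffers and procedure names are hypothetical
  stand-ins, not the Sprite unit's real drawing code. }
CONST
  RowLen = 8;
TYPE
  RowType = ARRAY[0..RowLen-1] OF BYTE;

{ Opaque copy: one block move per row, no per-pixel decisions. }
PROCEDURE CopyRowOpaque(VAR Src, Dest : RowType);
BEGIN
  Move(Src, Dest, SizeOf(RowType));
END;

{ Transparent copy: a compare and a branch for every pixel. }
PROCEDURE CopyRowTransparent(VAR Src, Dest : RowType);
VAR
  X : INTEGER;
BEGIN
  FOR X := 0 TO RowLen-1 DO
    IF Src[X] <> 0 THEN       { Color 0 is treated as transparent }
      Dest[X] := Src[X];
END;

VAR
  Src, A, B : RowType;
  X : INTEGER;
BEGIN
  FOR X := 0 TO RowLen-1 DO
  BEGIN
    Src[X] := (X MOD 2) * 20; { Alternating transparent / color 20 }
    A[X] := 7;                { Both destinations start as color 7 }
    B[X] := 7;
  END;
  CopyRowOpaque(Src, A);
  CopyRowTransparent(Src, B);
  WRITE('Opaque     :');
  FOR X := 0 TO RowLen-1 DO WRITE(A[X]:3);
  WRITELN;
  WRITE('Transparent:');
  FOR X := 0 TO RowLen-1 DO WRITE(B[X]:3);
  WRITELN;
END.
```

The opaque copy overwrites the background with zeros, while the transparent copy leaves the 7s showing through wherever the source pixel is 0. That per-pixel test is exactly the work a compiled sprite does once, at compile time, instead of on every draw.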
As the timings in the table show (run it on your machine... cache and graphics cards make a big difference), the compiled sprites are faster when they are mostly transparent, and slower when the sprite is solid. Why is this? The compiled sprite transfers data to memory in byte-sized chunks using offsets, one mov instruction per visible pixel, while the standard sprite uses a rep movs instruction. The standard sprite pays less for instruction fetch because it only has to read one instruction from CS for each line. Anyway, enjoy compiled sprites, and don't overindulge!
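A rough way to see the instruction-fetch cost is to count how many bytes of code the compiler generates for a sprite. The sketch below does that for a 64x64 sprite, using the same 3, 4 and 5 byte mov sizes plus the 2 byte ADD DI, BX between lines and the final RetF. MovLen and CompiledSize are illustrative helpers, not part of the Sprite unit, and the "every Step-th pixel is opaque" pattern is just a convenient assumption for the demonstration.

```pascal
PROGRAM CompiledSizeSketch;
{ Sketch only: MovLen and CompiledSize are illustrative helpers,
  not part of the Sprite unit. }

{ Size in bytes of the mov emitted for a pixel at column X. }
FUNCTION MovLen(X : INTEGER) : INTEGER;
BEGIN
  IF X = 0 THEN
    MovLen := 3
  ELSE IF X < 128 THEN
    MovLen := 4
  ELSE
    MovLen := 5;
END;

{ Code bytes generated for a W x H sprite in which every Step-th
  pixel of each row is opaque (Step = 1 means a solid box). }
FUNCTION CompiledSize(W, H, Step : INTEGER) : LONGINT;
VAR
  X : INTEGER;
  RowBytes : LONGINT;
BEGIN
  RowBytes := 0;
  FOR X := 0 TO W-1 DO
    IF X MOD Step = 0 THEN
      INC(RowBytes, MovLen(X));
  { One ADD DI, BX (2 bytes) between lines, one RetF at the end }
  CompiledSize := RowBytes * H + LONGINT(H-1) * 2 + 1;
END;

BEGIN
  WRITELN('Raw pixel data, 64x64  : ', LONGINT(64)*64, ' bytes');
  WRITELN('Compiled, solid box    : ', CompiledSize(64, 64, 1), ' bytes');
  WRITELN('Compiled, 1 pixel in 4 : ', CompiledSize(64, 64, 4), ' bytes');
  WRITELN('Worst-case allocation  : ', LONGINT(64)*64*5, ' bytes');
END.
```

A solid 64x64 box comes out at 16447 bytes of code against 4096 bytes of raw pixel data, so the processor has to fetch roughly four times as many bytes as the standard sprite reads. With only one pixel in four visible the code shrinks to 4159 bytes, which is where skipping the transparent pixels starts to pay off. Both figures stay under the Width*Height*5 buffer the compiler allocates.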
-----------------] Example Starts here [-----------------
PROGRAM SpriteExample;
USES GraphPro, Sprite, Timer;
VAR
SprNum : INTEGER;
S : ARRAY[1..3] OF SpriteType;
I : INTEGER;
T1T, T1S, { Test 1 Time, Test 1 Size }
T2T, T2S, { Test 2 Time, Test 2 Size }
T3T, T3S : ARRAY[1..3] OF REAL; { Sprites types 1..3 }
CONST
Size = 63;
BEGIN
InitGraph;
StartTimer(1000); { 1000 ticks per second... Accurate timer! }
{ Do the first test... }
ClrScr(0);
FOR I := 0 TO Size SHR 1 DO { Ball sprite }
BEGIN
Line(Size SHR 1, I, I, Size SHR 1, 16+I);
Line(Size SHR 1+1, I, Size-I, Size SHR 1, 16+I);
Line(Size SHR 1, Size-I, I, Size SHR 1+1, 16+I);
Line(Size SHR 1+1, Size-I, Size-I, Size SHR 1+1, 16+I);
END;
GetSprite(S[1], 0, 0, Size, Size);
GetSprite(S[2], 0, 0, Size, Size);
GetSprite(S[3], 0, 0, Size, Size);
T1S[1] := S[1].DataLen;
ConvertSpriteToRLE(S[2]);
T1S[2] := S[2].DataLen;
ConvertSpriteToCompiled(S[3]);
T1S[3] := S[3].DataLen;
FOR SprNum := 1 TO 3 DO
BEGIN
Time := 0;
FOR I := 0 TO 319-Size DO
DrawSprite(S[SprNum], I, (SprNum-1)*(Size+2));
T1T[SprNum] := Time;
T1T[SprNum] := T1T[SprNum] / 1000 / (320-Size); { Do not include division in time }
END;
{ Do the second test... }
ClrScr(0);
FOR I := 0 TO Size SHR 1 DO { Checkerboard sprite }
BEGIN
Line(I*2, 0, Size, Size - I*2, 16+I);
Line(0, I*2, Size - I*2, Size, 16+I);
END;
GetSprite(S[1], 0, 0, Size, Size);
GetSprite(S[2], 0, 0, Size, Size);
GetSprite(S[3], 0, 0, Size, Size);
T2S[1] := S[1].DataLen;
ConvertSpriteToRLE(S[2]);
T2S[2] := S[2].DataLen;
ConvertSpriteToCompiled(S[3]);
T2S[3] := S[3].DataLen;
FOR SprNum := 1 TO 3 DO
BEGIN
Time := 0;
FOR I := 0 TO 319-Size DO
DrawSprite(S[SprNum], I, (SprNum-1)*(Size+2));
T2T[SprNum] := Time;
T2T[SprNum] := T2T[SprNum] / 1000 / (320-Size); { Do not include division in time }
END;
{ Do the third test... }
ClrScr(0);
FOR I := 0 TO Size DO { Solid box sprite }
Line(I, 0, I, Size, 16+I);
GetSprite(S[1], 0, 0, Size, Size);
GetSprite(S[2], 0, 0, Size, Size);
GetSprite(S[3], 0, 0, Size, Size);
T3S[1] := S[1].DataLen;
ConvertSpriteToRLE(S[2]);
T3S[2] := S[2].DataLen;
ConvertSpriteToCompiled(S[3]);
T3S[3] := S[3].DataLen;
FOR SprNum := 1 TO 3 DO
BEGIN
Time := 0;
FOR I := 0 TO 319-Size DO
DrawSprite(S[SprNum], I, (SprNum-1)*(Size+2));
T3T[SprNum] := Time;
T3T[SprNum] := T3T[SprNum] / 1000 / (320-Size); { Do not include division in time }
END;
EndTimer;
CloseGraph;
WRITELN('Summary of gathered data for ', Size+1,'x', Size+1, ' sprite: ');
WRITELN;
WRITELN(' Time (s) | Size | Time %');
WRITELN('Test #1 - Ball, typical application');
WRITELN('Standard: ', T1T[1]:2:9, ' |', T1S[1]:6:0,
' | ', T1T[1]/T1T[1]*100:3:2, '%');
WRITELN('RLE : ', T1T[2]:2:9, ' |', T1S[2]:6:0,
' | ', T1T[2]/T1T[1]*100:3:2, '%');
WRITELN('Compiled: ', T1T[3]:2:9, ' |', T1S[3]:6:0,
' | ', T1T[3]/T1T[1]*100:3:2, '%');
WRITELN;
WRITELN('Test #2 - Checkerboard, RLE stress test');
WRITELN('Standard: ', T2T[1]:2:9, ' |', T2S[1]:6:0,
' | ', T2T[1]/T2T[1]*100:3:2, '%');
WRITELN('RLE : ', T2T[2]:2:9, ' |', T2S[2]:6:0,
' | ', T2T[2]/T2T[1]*100:3:2, '%');
WRITELN('Compiled: ', T2T[3]:2:9, ' |', T2S[3]:6:0,
' | ', T2T[3]/T2T[1]*100:3:2, '%');
WRITELN;
WRITELN('Test #3 - Solid, Compiled stress test');
WRITELN('Standard: ', T3T[1]:2:9, ' |', T3S[1]:6:0,
' | ', T3T[1]/T3T[1]*100:3:2, '%');
WRITELN('RLE : ', T3T[2]:2:9, ' |', T3S[2]:6:0,
' | ', T3T[2]/T3T[1]*100:3:2, '%');
WRITELN('Compiled: ', T3T[3]:2:9, ' |', T3S[3]:6:0,
' | ', T3T[3]/T3T[1]*100:3:2, '%');
WRITELN;
WRITE('Press Enter to continue!');
READLN;
END.
-----------------] Example Ends here [-----------------
Created by Chris Lattner