float4x4

Computer graphics and stuff

Hardware Instancing in XNA


Drawing large quantities of (mostly) identical objects is something every graphics programmer will have to do sooner or later. I haven’t been able to find much information on the subject; most of it is either outdated or very specialized and thus hard to learn from. With this tutorial I’m going to try to show you how to do hardware instancing on Windows in a general, step-by-step way.

Since we are going to let the hardware do most of the work for us, we’ll need hardware that is actually capable of doing so. All you need is a graphics card with shader model 3.0 or higher. Don’t worry about it though: if you have an ATI X1300, NVIDIA 6000 or Intel GMA X3000 series card or higher, you will probably have shader model 3.0 support. Note that this method will not work on the Xbox 360!

Let’s start by creating a new XNA 3.1 Windows project. We will first try to show an object on the screen that we want to duplicate. Create the following members in our game:

VertexBuffer geometryBuffer;
IndexBuffer indexBuffer;
VertexDeclaration vertexDeclaration;
Effect effect;
VertexElement[] streamElements;
Matrix view;
Matrix projection;

Now let’s fill these members by creating a GenerateBuffers function. Call it in your game’s LoadContent function.


private void GenerateBuffers()
{
    // Six vertices: one triangle at y = 0 and a copy of it at y = 1.
    VertexPositionColor[] vertices = new VertexPositionColor[6];
    vertices[0].Position = new Vector3(-1, 0, -1);
    vertices[0].Color = Color.Red;
    vertices[1].Position = new Vector3(1, 0, -1);
    vertices[1].Color = Color.Green;
    vertices[2].Position = new Vector3(0, 0, 1);
    vertices[2].Color = Color.Blue;
    vertices[3].Position = new Vector3(-1, 1, -1);
    vertices[3].Color = Color.Red;
    vertices[4].Position = new Vector3(1, 1, -1);
    vertices[4].Color = Color.Green;
    vertices[5].Position = new Vector3(0, 1, 1);
    vertices[5].Color = Color.Blue;
    geometryBuffer = new VertexBuffer(GraphicsDevice, typeof(VertexPositionColor), 6, BufferUsage.WriteOnly);
    geometryBuffer.SetData(vertices);

    // 24 indices = 8 triangles (three indices each).
    int[] indices = new int[24];
    indices[0] = 1; indices[1] = 3; indices[2] = 0;
    indices[3] = 4; indices[4] = 3; indices[5] = 1;
    indices[6] = 2; indices[7] = 4; indices[8] = 1;
    indices[9] = 5; indices[10] = 4; indices[11] = 2;
    indices[12] = 5; indices[13] = 2; indices[14] = 0;
    indices[15] = 3; indices[16] = 5; indices[17] = 0;
    indices[18] = 2; indices[19] = 1; indices[20] = 0;
    indices[21] = 5; indices[22] = 3; indices[23] = 4;
    indexBuffer = new IndexBuffer(GraphicsDevice, typeof(int), 24, BufferUsage.WriteOnly);
    indexBuffer.SetData(indices);
}

Next up is creating a vertex declaration. For this we’ll create a custom one, as we want to extend it later on. Create a function GenerateStreamElements and call it in LoadContent as well.


private void GenerateStreamElements()
{
    streamElements = new VertexElement[2];
    streamElements[0] = new VertexElement(0, 0, VertexElementFormat.Vector3, VertexElementMethod.Default, VertexElementUsage.Position, 0);
    streamElements[1] = new VertexElement(0, sizeof(float) * 3, VertexElementFormat.Color, VertexElementMethod.Default, VertexElementUsage.Color, 0);
    vertexDeclaration = new VertexDeclaration(GraphicsDevice, streamElements);
}

The vertex declaration basically tells your GPU how to interpret the data you are sending it. In this case, each vertex starts with three floats that give its position, followed at byte offset 12 by an element that holds its colour.
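To make the layout concrete, here is a small sketch of the byte offsets that the two elements describe. GeometryLayout is a hypothetical helper written for this illustration, not an XNA type. It only assumes that VertexElementFormat.Vector3 is three 4-byte floats and that VertexElementFormat.Color is a packed 4-byte colour.

```csharp
using System;

// Hypothetical helper (not part of XNA): spells out the byte layout that the
// two VertexElements in GenerateStreamElements describe for stream 0.
public static class GeometryLayout
{
    public const int PositionOffset = 0;                           // position starts the vertex
    public const int PositionSize = sizeof(float) * 3;             // Vector3: 12 bytes
    public const int ColorOffset = PositionOffset + PositionSize;  // colour follows at byte 12
    public const int ColorSize = 4;                                // packed colour: 4 bytes
    public const int Stride = ColorOffset + ColorSize;             // 16 bytes per vertex

    // Returns a one-line, human-readable summary of the layout.
    public static string Describe()
    {
        return string.Format(
            "position @ {0} ({1} bytes), colour @ {2} ({3} bytes), stride {4}",
            PositionOffset, PositionSize, ColorOffset, ColorSize, Stride);
    }
}
```

The colour offset is the same value as the sizeof(float) * 3 passed to the second VertexElement, and the 16-byte stride should match the VertexPositionColor.SizeInBytes that the draw code hands to SetSource.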

We also need a shader to display our triangle. Create a new fx file and load it in your LoadContent function.

float4x4 WVP;

struct InstancingVSinput
{
    float4 Position : POSITION0;
    float4 Color : COLOR0;
};

struct InstancingVSoutput
{
    float4 Position : POSITION0;
    float4 Color : COLOR0;
};

InstancingVSoutput InstancingVS(InstancingVSinput input)
{
    InstancingVSoutput output;
    float4 pos = input.Position;
    pos = mul(pos, WVP);

    output.Position = pos;
    output.Color = input.Color;
    return output;
}

float4 InstancingPS(InstancingVSoutput input) : COLOR0
{
    return input.Color;
}

technique Instancing
{
    pass Pass0
    {
        VertexShader = compile vs_3_0 InstancingVS();
        PixelShader = compile ps_3_0 InstancingPS();
    }
}

We’re almost ready to draw. First we need to create our view and projection matrices in our LoadContent function.

view = Matrix.CreateLookAt(new Vector3(5, 5, 5), Vector3.Zero, Vector3.UnitZ);
projection = Matrix.CreatePerspectiveFieldOfView(MathHelper.PiOver4, (float)Window.ClientBounds.Width / (float)Window.ClientBounds.Height, 0.1f, 2000.0f);

Finally we can get to our draw method:


protected override void Draw(GameTime gameTime)
{
    GraphicsDevice.Clear(Color.CornflowerBlue);
    GraphicsDevice.VertexDeclaration = vertexDeclaration;
    effect.CurrentTechnique = effect.Techniques["Instancing"];
    effect.Parameters["WVP"].SetValue(view * projection);
    GraphicsDevice.Indices = indexBuffer;
    effect.Begin();
    foreach (EffectPass p in effect.CurrentTechnique.Passes)
    {
        p.Begin();
        GraphicsDevice.Vertices[0].SetSource(geometryBuffer, 0, VertexPositionColor.SizeInBytes);
        // 6 vertices, 8 triangles (24 indices / 3).
        GraphicsDevice.DrawIndexedPrimitives(PrimitiveType.TriangleList, 0, 0, 6, 0, 8);
        p.End();
    }
    effect.End();
    base.Draw(gameTime);
}

If you run this code, you should get the same result as in the picture above. So far, we’ve done nothing out of the ordinary: we simply create a vertex and index buffer and draw them. Our next step is to draw more of these triangles, a lot more. A naive way of doing this would be to loop over the draw call, changing the world matrix each time. What we’ll do instead is create another buffer that contains these world matrices and send that to the GPU.

The first thing we need to do is create a list of positions for each of our triangles. Let’s create the following structure and add some more members:

int count = 200;
VertexBuffer instanceBuffer;

struct InstanceInfo
{
    public Matrix World;
};

We will fill the instance buffer with the GenerateInstanceInformation function, which creates translation matrices with random positions. Again, simply call this function from LoadContent. Each matrix holds 16 floats, so the buffer needs 16 * sizeof(float) = 64 bytes per instance.


private void GenerateInstanceInformation()
{
    InstanceInfo[] instances = new InstanceInfo[count];
    Random rnd = new Random();
    for (int i = 0; i < count; i++)
    {
        instances[i].World = Matrix.CreateTranslation(new Vector3(-rnd.Next(200), -rnd.Next(300), -rnd.Next(150)));
    }
    instanceBuffer = new VertexBuffer(GraphicsDevice, sizeof(float) * (16) * count, BufferUsage.WriteOnly);
    instanceBuffer.SetData(instances);
}
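As a quick sanity check of the size passed to the VertexBuffer constructor, the arithmetic can be sketched on its own. InstanceBufferMath is a hypothetical helper for illustration only; it uses long for the total merely so the sketch cannot overflow for very large counts.

```csharp
using System;

// Hypothetical helper (not part of XNA): the size arithmetic behind
// sizeof(float) * 16 * count in GenerateInstanceInformation.
public static class InstanceBufferMath
{
    public const int FloatsPerMatrix = 16;   // a 4x4 matrix

    // Bytes occupied by one instance: 16 floats of 4 bytes each.
    public static int StrideInBytes()
    {
        return sizeof(float) * FloatsPerMatrix;
    }

    // Total bytes needed for the given number of instances.
    public static long BufferSizeInBytes(int instanceCount)
    {
        if (instanceCount < 0)
        {
            throw new ArgumentOutOfRangeException("instanceCount");
        }
        return (long)StrideInBytes() * instanceCount;
    }
}
```

For the count of 200 used here, BufferSizeInBytes(200) gives 12,800 bytes; for 200,000 instances it gives 12,800,000 bytes, roughly 12 MB.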

Remember that our vertex declaration tells our GPU what’s coming, so we’ll need to add these instance matrices to our stream elements. The problem is that XNA doesn’t provide a matrix format for us to add. Luckily we can use texture coordinates for the job. A matrix is basically four Vector4 elements, so we’ll simply add those. This is what our new GenerateStreamElements function should look like:


private void GenerateStreamElements()
{
    streamElements = new VertexElement[6];
    streamElements[0] = new VertexElement(0, 0, VertexElementFormat.Vector3, VertexElementMethod.Default, VertexElementUsage.Position, 0);
    streamElements[1] = new VertexElement(0, sizeof(float) * 3, VertexElementFormat.Color, VertexElementMethod.Default, VertexElementUsage.Color, 0);
    streamElements[2] = new VertexElement(1, 0, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 0);
    streamElements[3] = new VertexElement(1, sizeof(float) * 4, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 1);
    streamElements[4] = new VertexElement(1, sizeof(float) * 8, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 2);
    streamElements[5] = new VertexElement(1, sizeof(float) * 12, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 3);
    vertexDeclaration = new VertexDeclaration(GraphicsDevice, streamElements);
}

Here we added four Vector4 elements to a second stream. Stream 0 now contains our geometry, stream 1 our instances. Note that the offset is per stream, so we start with offset 0 for element 2. The final argument, the usage index, determines which texture coordinate field will be used. In this sample, I use TEXCOORD0 to TEXCOORD3. If you use texturing on your objects, you may want to use TEXCOORD1 to TEXCOORD4 instead, leaving TEXCOORD0 free for the texture coordinates.
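The pattern in those four elements (offsets restarting at 0 in stream 1, one usage index per row) can be sketched as a small loop. InstanceStreamLayout and RowElement are hypothetical names for this illustration and are not XNA types; the result only lists the offset and usage index you would pass to each VertexElement.

```csharp
using System;

// Hypothetical helper (not part of XNA): lists the byte offset and TEXCOORD
// usage index for each of the four Vector4 rows of an instance matrix.
public static class InstanceStreamLayout
{
    public struct RowElement
    {
        public int Offset;      // byte offset inside stream 1
        public int UsageIndex;  // the n in TEXCOORDn
    }

    // firstUsageIndex is 0 in the tutorial; pass 1 to keep TEXCOORD0 free.
    public static RowElement[] MatrixRows(int firstUsageIndex)
    {
        RowElement[] rows = new RowElement[4];
        for (int i = 0; i < rows.Length; i++)
        {
            rows[i].Offset = sizeof(float) * 4 * i;    // each row is one Vector4 (16 bytes)
            rows[i].UsageIndex = firstUsageIndex + i;  // consecutive TEXCOORD slots
        }
        return rows;
    }
}
```

MatrixRows(0) yields offsets 0, 16, 32 and 48 on TEXCOORD0 to TEXCOORD3, matching the declaration above. MatrixRows(1) keeps the same offsets but shifts the usage indices to TEXCOORD1 to TEXCOORD4, in which case the shader semantic would have to change accordingly.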

Let’s update our shader to accept the matrices. All we need to do is change the vertex shader.

InstancingVSoutput InstancingVS(InstancingVSinput input, float4x4 instanceTransform : TEXCOORD0)
{
    InstancingVSoutput output;
    float4 pos = input.Position;
    pos = mul(pos, transpose(instanceTransform));
    pos = mul(pos, WVP);
    output.Position = pos;
    output.Color = input.Color;
    return output;
}

As you can see, all we do is use the instanceTransform matrix as a world matrix by multiplying our position with it before we apply the view and projection matrix. We need to transpose this matrix, however, because we read it as a single float4x4 element and not as four float4 elements: the four vectors we stored row by row do not arrive in the orientation that mul expects. I suppose you could reconstruct the matrix by reading four float4 elements, but this solution is much more elegant.
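The effect of the missing transpose can be reproduced on the CPU. The sketch below uses System.Numerics (modern .NET) as a stand-in for the XNA math types, and it assumes that the shader ends up seeing the four rows we uploaded as the columns of its float4x4; that assumption is modelled by AsSeenByShader. TransposeDemo is a hypothetical name for this illustration.

```csharp
using System;
using System.Numerics;

// Hypothetical CPU-side illustration of why the shader calls transpose().
public static class TransposeDemo
{
    // The matrix as built on the CPU: the translation sits in the fourth row.
    public static Matrix4x4 World()
    {
        return Matrix4x4.CreateTranslation(10f, 20f, 30f);
    }

    // Assumption: the rows we uploaded become the columns of the shader's float4x4.
    public static Matrix4x4 AsSeenByShader()
    {
        return Matrix4x4.Transpose(World());
    }

    // Row vector times the matrix exactly as it arrived: the translation is lost.
    public static Vector4 WithoutTranspose(Vector4 position)
    {
        return Vector4.Transform(position, AsSeenByShader());
    }

    // Row vector times the transposed matrix: the original world matrix is restored.
    public static Vector4 WithTranspose(Vector4 position)
    {
        return Vector4.Transform(position, Matrix4x4.Transpose(AsSeenByShader()));
    }
}
```

For the position (1, 2, 3, 1), WithTranspose returns (11, 22, 33, 1), the translated point, while WithoutTranspose leaves x, y and z untouched and pushes the translation into w instead, giving (1, 2, 3, 141).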

We aren’t quite done yet. We still need to make some changes to our drawing code and our view matrix so that we can see all our triangles.

view = Matrix.CreateLookAt(new Vector3(150, 150, 150), Vector3.Zero, Vector3.UnitZ);

protected override void Draw(GameTime gameTime)
{
    GraphicsDevice.Clear(Color.White);
    GraphicsDevice.VertexDeclaration = vertexDeclaration;
    effect.CurrentTechnique = effect.Techniques["Instancing"];
    effect.Parameters["WVP"].SetValue(view * projection);
    GraphicsDevice.Indices = indexBuffer;
    effect.Begin();
    foreach (EffectPass p in effect.CurrentTechnique.Passes)
    {
        p.Begin();
        // Stream 0: the geometry, drawn once per instance.
        GraphicsDevice.Vertices[0].SetSource(geometryBuffer, 0, VertexPositionColor.SizeInBytes);
        GraphicsDevice.Vertices[0].SetFrequencyOfIndexData(count);
        // Stream 1: one world matrix (16 floats) per instance.
        GraphicsDevice.Vertices[1].SetSource(instanceBuffer, 0, sizeof(float) * (16));
        GraphicsDevice.Vertices[1].SetFrequencyOfInstanceData(1);
        GraphicsDevice.DrawIndexedPrimitives(PrimitiveType.TriangleList, 0, 0, 6, 0, 8);
        p.End();
    }
    effect.End();
    base.Draw(gameTime);
}

As you can see, we set the frequency of index data on stream 0, our geometry stream, to the number of instances we wish to draw. We also bind the instance buffer to stream 1 and set its frequency of instance data to one, so that it advances by one element per instance. People often mix up these two functions, so if you’re having trouble, check those first.
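Conceptually, the two frequency settings make the GPU replay the geometry stream for every instance while stepping through the instance stream once per instance. The following CPU-side sketch emulates that pairing. It is only an illustration: InstancingEmulation is a hypothetical name, System.Numerics stands in for the XNA math types, and a plain translation offset stands in for the full world matrix.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical CPU emulation of how the two streams are combined.
public static class InstancingEmulation
{
    // Returns every vertex the single instanced draw call would produce.
    public static List<Vector3> Expand(Vector3[] vertices, int[] indices, Vector3[] instanceOffsets)
    {
        List<Vector3> result = new List<Vector3>(indices.Length * instanceOffsets.Length);
        foreach (Vector3 offset in instanceOffsets)   // stream 1: advances once per instance
        {
            foreach (int index in indices)            // stream 0: replayed for every instance
            {
                result.Add(vertices[index] + offset); // what the vertex shader does per vertex
            }
        }
        return result;
    }
}
```

With 24 indices and 200 instance offsets, Expand returns 4,800 vertices, which is the same work the naive approach would spread over 200 separate draw calls.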

Running this code should net you the following result for count 200 and count 200000 respectively. Note that even with this massive amount of triangles, we still get a steady 60 fps with V Sync turned on using an ATI Radeon Mobility 4650.

As you can see, we already have a decent set-up that can be used to draw grass, trees, buildings or any other repetitive objects. You’ll undoubtedly want to go one step further and add some sort of variance to instances to hide the fact that every single piece of grass is identical. Let’s do this by giving our triangles some randomized colours.

The idea is to not only send a world matrix per instance, but also a colour. This will affect our current code in quite a few ways. Firstly, we need to update our InstanceInfo structure to contain colour data, and our vertex declaration needs another element.

struct InstanceInfo
{
    public Matrix World;
    public Vector4 Color;
};

private void GenerateStreamElements()
{
    streamElements = new VertexElement[7];
    streamElements[0] = new VertexElement(0, 0, VertexElementFormat.Vector3, VertexElementMethod.Default, VertexElementUsage.Position, 0);
    streamElements[1] = new VertexElement(0, sizeof(float) * 3, VertexElementFormat.Color, VertexElementMethod.Default, VertexElementUsage.Color, 0);
    streamElements[2] = new VertexElement(1, 0, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 0);
    streamElements[3] = new VertexElement(1, sizeof(float) * 4, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 1);
    streamElements[4] = new VertexElement(1, sizeof(float) * 8, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 2);
    streamElements[5] = new VertexElement(1, sizeof(float) * 12, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 3);
    streamElements[6] = new VertexElement(1, sizeof(float) * 16, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 4);
    vertexDeclaration = new VertexDeclaration(GraphicsDevice, streamElements);
}

The total amount of floats per instance is now 20 (16 for the matrix and 4 for the colour). We need to update this where needed. We also need to generate a random colour. The following functions need changes:


private void GenerateInstanceInformation()
{
    InstanceInfo[] instances = new InstanceInfo[count];
    Random rnd = new Random();
    for (int i = 0; i < count; i++)
    {
        instances[i].World = Matrix.CreateTranslation(new Vector3(-rnd.Next(200), -rnd.Next(300), -rnd.Next(150)));
        instances[i].Color = new Vector4((float)rnd.NextDouble(), (float)rnd.NextDouble(), (float)rnd.NextDouble(), 1);
    }
    instanceBuffer = new VertexBuffer(GraphicsDevice, sizeof(float) * (16 + 4) * count, BufferUsage.WriteOnly);
    instanceBuffer.SetData(instances);
}

protected override void Draw(GameTime gameTime)
{
    GraphicsDevice.Clear(Color.White);
    GraphicsDevice.VertexDeclaration = vertexDeclaration;
    effect.CurrentTechnique = effect.Techniques["Instancing"];
    effect.Parameters["WVP"].SetValue(view * projection);
    GraphicsDevice.Indices = indexBuffer;
    effect.Begin();
    foreach (EffectPass p in effect.CurrentTechnique.Passes)
    {
        p.Begin();
        GraphicsDevice.Vertices[0].SetSource(geometryBuffer, 0, VertexPositionColor.SizeInBytes);
        GraphicsDevice.Vertices[0].SetFrequencyOfIndexData(count);
        GraphicsDevice.Vertices[1].SetSource(instanceBuffer, 0, sizeof(float) * (16 + 4));
        GraphicsDevice.Vertices[1].SetFrequencyOfInstanceData(1);
        GraphicsDevice.DrawIndexedPrimitives(PrimitiveType.TriangleList, 0, 0, 6, 0, 8);
        p.End();
    }
    effect.End();
    base.Draw(gameTime);
}
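Because the stride now appears in two places (the buffer size and SetSource), it is easy for one of them to drift out of sync with the structure. One way to check the 20-float figure is to compare it against the actual size of a struct with the same fields. The sketch below is an assumption-laden stand-in: InstanceInfoMirror and InstanceStride are hypothetical names, and System.Numerics types on modern .NET replace XNA's Matrix and Vector4, which hold the same number of floats.

```csharp
using System;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical mirror of the tutorial's InstanceInfo, using System.Numerics
// types in place of the XNA ones (both are 16 floats plus 4 floats).
[StructLayout(LayoutKind.Sequential)]
public struct InstanceInfoMirror
{
    public Matrix4x4 World;
    public Vector4 Color;
}

public static class InstanceStride
{
    // The stride as written in the tutorial's buffer size and SetSource call.
    public static int FromFormula()
    {
        return sizeof(float) * (16 + 4);
    }

    // The stride as measured from the struct itself.
    public static int FromStruct()
    {
        return Unsafe.SizeOf<InstanceInfoMirror>();
    }
}
```

Both methods should report 80 bytes. If a field is added to the structure later, comparing the two values flags the mismatch before it turns into garbled instances on screen.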

Last but not least, our shader needs to read this colour from TEXCOORD4 and use it rather than its geometry stream colour. We update the vertex shader as such:

InstancingVSoutput InstancingVS(InstancingVSinput input, float4x4 instanceTransform : TEXCOORD0, float4 colour : TEXCOORD4)
{
    InstancingVSoutput output;
    float4 pos = input.Position;
    pos = mul(pos, transpose(instanceTransform));
    pos = mul(pos, WVP);
    output.Position = pos;
    output.Color = colour;
    return output;
}

If everything went according to plan, you should get a result like the picture above. As you can see, adding extra parameters per instance is not that hard. Hopefully this tutorial helps you improve the quality and performance of your application or game. If you have any comments or questions, feel free to leave me a message. The full code can be found below.

Code:

public class Game1 : Microsoft.Xna.Framework.Game
{
    GraphicsDeviceManager graphics;

    VertexBuffer geometryBuffer;
    IndexBuffer indexBuffer;
    VertexDeclaration vertexDeclaration;
    Effect effect;
    VertexElement[] streamElements;
    Matrix view;
    Matrix projection;
    int count = 200000;
    VertexBuffer instanceBuffer;

    struct InstanceInfo
    {
        public Matrix World;
        public Vector4 Color;
    };

    public Game1()
    {
        graphics = new GraphicsDeviceManager(this);
        Content.RootDirectory = "Content";
    }

    protected override void Initialize()
    {
        base.Initialize();
    }

    protected override void LoadContent()
    {
        GenerateBuffers();
        GenerateStreamElements();
        GenerateInstanceInformation();

        effect = Content.Load<Effect>("instancing");
        view = Matrix.CreateLookAt(new Vector3(150, 150, 150), Vector3.Zero, Vector3.UnitZ);
        projection = Matrix.CreatePerspectiveFieldOfView(MathHelper.PiOver4, (float)Window.ClientBounds.Width / (float)Window.ClientBounds.Height, 0.1f, 2000.0f);
    }

    protected override void UnloadContent()
    {
    }

    private void GenerateBuffers()
    {
        VertexPositionColor[] vertices = new VertexPositionColor[6];
        vertices[0].Position = new Vector3(-1, 0, -1);
        vertices[0].Color = Color.Red;
        vertices[1].Position = new Vector3(1, 0, -1);
        vertices[1].Color = Color.Green;
        vertices[2].Position = new Vector3(0, 0, 1);
        vertices[2].Color = Color.Blue;
        vertices[3].Position = new Vector3(-1, 1, -1);
        vertices[3].Color = Color.Red;
        vertices[4].Position = new Vector3(1, 1, -1);
        vertices[4].Color = Color.Green;
        vertices[5].Position = new Vector3(0, 1, 1);
        vertices[5].Color = Color.Blue;
        geometryBuffer = new VertexBuffer(GraphicsDevice, typeof(VertexPositionColor), 6, BufferUsage.WriteOnly);
        geometryBuffer.SetData(vertices);

        int[] indices = new int[24];
        indices[0] = 1; indices[1] = 3; indices[2] = 0;
        indices[3] = 4; indices[4] = 3; indices[5] = 1;
        indices[6] = 2; indices[7] = 4; indices[8] = 1;
        indices[9] = 5; indices[10] = 4; indices[11] = 2;
        indices[12] = 5; indices[13] = 2; indices[14] = 0;
        indices[15] = 3; indices[16] = 5; indices[17] = 0;
        indices[18] = 2; indices[19] = 1; indices[20] = 0;
        indices[21] = 5; indices[22] = 3; indices[23] = 4;
        indexBuffer = new IndexBuffer(GraphicsDevice, typeof(int), 24, BufferUsage.WriteOnly);
        indexBuffer.SetData(indices);
    }

    private void GenerateStreamElements()
    {
        streamElements = new VertexElement[7];
        streamElements[0] = new VertexElement(0, 0, VertexElementFormat.Vector3, VertexElementMethod.Default, VertexElementUsage.Position, 0);
        streamElements[1] = new VertexElement(0, sizeof(float) * 3, VertexElementFormat.Color, VertexElementMethod.Default, VertexElementUsage.Color, 0);
        streamElements[2] = new VertexElement(1, 0, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 0);
        streamElements[3] = new VertexElement(1, sizeof(float) * 4, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 1);
        streamElements[4] = new VertexElement(1, sizeof(float) * 8, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 2);
        streamElements[5] = new VertexElement(1, sizeof(float) * 12, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 3);
        streamElements[6] = new VertexElement(1, sizeof(float) * 16, VertexElementFormat.Vector4, VertexElementMethod.Default, VertexElementUsage.TextureCoordinate, 4);
        vertexDeclaration = new VertexDeclaration(GraphicsDevice, streamElements);
    }

    private void GenerateInstanceInformation()
    {
        InstanceInfo[] instances = new InstanceInfo[count];
        Random rnd = new Random();
        for (int i = 0; i < count; i++)
        {
            instances[i].World = Matrix.CreateTranslation(new Vector3(-rnd.Next(200), -rnd.Next(300), -rnd.Next(150)));
            instances[i].Color = new Vector4((float)rnd.NextDouble(), (float)rnd.NextDouble(), (float)rnd.NextDouble(), 1);
        }
        instanceBuffer = new VertexBuffer(GraphicsDevice, sizeof(float) * (16 + 4) * count, BufferUsage.WriteOnly);
        instanceBuffer.SetData(instances);
    }

    protected override void Update(GameTime gameTime)
    {
        if (Keyboard.GetState().IsKeyDown(Keys.Escape))
            this.Exit();

        base.Update(gameTime);
    }

    protected override void Draw(GameTime gameTime)
    {
        GraphicsDevice.Clear(Color.White);
        GraphicsDevice.VertexDeclaration = vertexDeclaration;
        effect.CurrentTechnique = effect.Techniques["Instancing"];
        effect.Parameters["WVP"].SetValue(view * projection);
        GraphicsDevice.Indices = indexBuffer;
        effect.Begin();
        foreach (EffectPass p in effect.CurrentTechnique.Passes)
        {
            p.Begin();
            GraphicsDevice.Vertices[0].SetSource(geometryBuffer, 0, VertexPositionColor.SizeInBytes);
            GraphicsDevice.Vertices[0].SetFrequencyOfIndexData(count);
            GraphicsDevice.Vertices[1].SetSource(instanceBuffer, 0, sizeof(float) * (16 + 4));
            GraphicsDevice.Vertices[1].SetFrequencyOfInstanceData(1);
            GraphicsDevice.DrawIndexedPrimitives(PrimitiveType.TriangleList, 0, 0, 6, 0, 8);
            p.End();
        }
        effect.End();
        base.Draw(gameTime);
    }
}

Shader:

float4x4 WVP;

struct InstancingVSinput
{
    float4 Position : POSITION0;
    float4 Color : COLOR0;
};

struct InstancingVSoutput
{
    float4 Position : POSITION0;
    float4 Color : COLOR0;
};

InstancingVSoutput InstancingVS(InstancingVSinput input, float4x4 instanceTransform : TEXCOORD0, float4 colour : TEXCOORD4)
{
    InstancingVSoutput output;
    float4 pos = input.Position;
    pos = mul(pos, transpose(instanceTransform));
    pos = mul(pos, WVP);
    output.Position = pos;
    output.Color = colour;
    return output;
}

float4 InstancingPS(InstancingVSoutput input) : COLOR0
{
    return input.Color;
}

technique Instancing
{
    pass Pass0
    {
        VertexShader = compile vs_3_0 InstancingVS();
        PixelShader = compile ps_3_0 InstancingPS();
    }
}

4 Responses

  1. I definitely don’t want to get on your nerves, but I am so curious to hear about your solution. Have you had some time already? Sorry for bothering you :)

  2. Wow, that would be a great help. Thank you a lot!

  3. Hey Daniel,

    Once I get back from work I will see if I can put together an XNA 4 example with textures. I haven’t posted in a long while so it’s time for something anyway :)

  4. Great introduction. Although it does not work in XNA 4.0 anymore it makes clear how hardware instancing works. I am just wondering how this could be done with textures instead of colors. Because textures can not be represented as floats. Do you have some hints for me on how to do this?
