I am trying to write an OpenGL wrapper that will let me keep all my existing graphics code (written for OpenGL) and redirect the OpenGL calls to their Direct3D equivalents. So far this has worked surprisingly well, but performance has proven to be quite a challenge.
Now, I admit that I am most likely using D3D in a way it was never designed for. I update a single vertex buffer thousands of times per render cycle. Each time I draw a "sprite", I send 4 vertices to the GPU with texture coordinates, etc., and once the number of "sprites" on screen at one time reaches about 1 to 1.5k, the FPS of my application drops below 10.
Using the VS2012 performance analysis tools (which are impressive, by the way), I can see that the ID3D11DeviceContext->Draw method takes up most of the time:
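To make the per-sprite pattern described above concrete, here is a minimal sketch that only counts the work one frame generates. It contains no D3D calls, and SpriteVertex, MakeSpriteQuad, FrameStats and SimulateFrame are hypothetical names made up for illustration, not part of my real code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real vertex type: position, color, texture coordinate.
struct SpriteVertex {
    float x, y, z;
    float r, g, b, a;
    float u, v;
};

// Builds the 4 vertices of one axis-aligned sprite in triangle-strip order
// (top-left, top-right, bottom-left, bottom-right).
std::vector<SpriteVertex> MakeSpriteQuad(float x, float y, float w, float h) {
    assert(w >= 0.0f && h >= 0.0f);
    return {
        { x,     y,     0.0f, 1.0f, 1.0f, 1.0f, 1.0f, 0.0f, 0.0f },
        { x + w, y,     0.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 0.0f },
        { x,     y + h, 0.0f, 1.0f, 1.0f, 1.0f, 1.0f, 0.0f, 1.0f },
        { x + w, y + h, 0.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f },
    };
}

// Tallies of the work a frame generates when every sprite is submitted on its own.
struct FrameStats {
    std::size_t bufferUpdates = 0;
    std::size_t drawCalls = 0;
    std::size_t vertices = 0;
};

// Simulates one frame: each sprite costs one buffer update and one draw call.
FrameStats SimulateFrame(std::size_t spriteCount) {
    FrameStats stats;
    for (std::size_t i = 0; i < spriteCount; ++i) {
        const auto quad = MakeSpriteQuad(static_cast<float>(i), 0.0f, 16.0f, 16.0f);
        ++stats.bufferUpdates;          // stands in for one Map/memcpy/Unmap
        ++stats.drawCalls;              // stands in for one Draw
        stats.vertices += quad.size();  // 4 vertices per sprite
    }
    return stats;
}
```

Calling SimulateFrame(1500) reports 1500 buffer updates and 1500 draw calls for only 6000 vertices, which is the ratio I suspect is the problem.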
Screenshot here
Is there some setting I am getting wrong when creating my vertex buffer or in the draw method? Is it really that bad to use the same vertex buffer for all my sprites? If so, what other options do I have that would not drastically change the architecture of my existing graphics code base (which is built around the OpenGL paradigm: send EVERYTHING to the GPU every frame)?
The biggest FPS killer in my game is displaying a lot of text on the screen. Each character is a textured quad, and each one requires a separate update of the vertex buffer and a separate call to Draw. If D3D or the hardware doesn't like many Draw calls, how else can you draw a lot of text on the screen at once?
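For comparison, the batching alternative this question hints at could be sketched roughly as follows: all glyph quads of a string go into one CPU-side array, so they could be uploaded once and drawn with a single call. This is only a sketch under assumptions, not tested against D3D. GlyphVertex, AppendGlyphQuad and BuildTextVertices are hypothetical names, and the fixed-width 16x16 ASCII font atlas is an assumption. Each quad is emitted as two independent triangles (6 vertices), because a triangle strip like the one in my draw method cannot hold disjoint quads without degenerate triangles.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal vertex: screen position plus texture coordinate.
struct GlyphVertex {
    float x, y;
    float u, v;
};

// Appends one quad as two independent triangles (triangle-list layout).
void AppendGlyphQuad(std::vector<GlyphVertex>& out,
                     float x, float y, float size,
                     float u0, float v0, float u1, float v1) {
    const GlyphVertex topLeft     = { x,        y,        u0, v0 };
    const GlyphVertex topRight    = { x + size, y,        u1, v0 };
    const GlyphVertex bottomLeft  = { x,        y + size, u0, v1 };
    const GlyphVertex bottomRight = { x + size, y + size, u1, v1 };
    out.push_back(topLeft);
    out.push_back(topRight);
    out.push_back(bottomLeft);
    out.push_back(topRight);
    out.push_back(bottomRight);
    out.push_back(bottomLeft);
}

// Builds the vertices for a whole string in one array.
// Assumption: the font texture is a 16x16 grid of fixed-width glyphs indexed by character code.
std::vector<GlyphVertex> BuildTextVertices(const std::string& text,
                                           float x, float y, float glyphSize) {
    assert(glyphSize > 0.0f);
    std::vector<GlyphVertex> vertices;
    vertices.reserve(text.size() * 6);
    const float cell = 1.0f / 16.0f;  // size of one atlas cell in texture space
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        const float u0 = static_cast<float>(c % 16) * cell;
        const float v0 = static_cast<float>(c / 16) * cell;
        AppendGlyphQuad(vertices, x + static_cast<float>(i) * glyphSize, y, glyphSize,
                        u0, v0, u0 + cell, v0 + cell);
    }
    return vertices;
}
```

With this layout a string of N characters yields 6 * N vertices that share one texture, so in principle it needs one vertex buffer update and one Draw with a triangle-list topology instead of N of each.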
Let me know if there is any other code you want to see to help me diagnose this problem.
Thanks!
Here is the hardware I'm running on:
- Core i7 @ 3.5GHz
- 16 gigabytes of RAM
- GeForce GTX 560 Ti
And here is the software that I run:
- Windows 8 Preview
- VS 2012
- DirectX 11
Here is the draw method:
    void OpenGL::Draw(const std::vector<OpenGLVertex>& vertices)
    {
        // Upload the current matrix. It is transposed because HLSL constant
        // buffers default to column-major packing.
        auto matrix = *_matrices.top();
        _constantBufferData.view = DirectX::XMMatrixTranspose(matrix);
        _context->UpdateSubresource(_constantBuffer, 0, NULL, &_constantBufferData, 0, 0);

        // Bind the vertex stage.
        _context->IASetInputLayout(_inputLayout);
        _context->VSSetShader(_vertexShader, nullptr, 0);
        _context->VSSetConstantBuffers(0, 1, &_constantBuffer);

        D3D11_PRIMITIVE_TOPOLOGY topology = D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP;

        // Bind the pixel stage and the currently selected texture.
        ID3D11ShaderResourceView* texture = _textures[_currentTextureId];
        _context->PSSetShader(_pixelShaderTexture, nullptr, 0);
        _context->PSSetShaderResources(0, 1, &texture);

        // Map the shared vertex buffer with WRITE_DISCARD on every call and
        // copy the new vertices in at the current offset.
        D3D11_MAPPED_SUBRESOURCE mappedResource;
        D3D11_MAP mapType = D3D11_MAP::D3D11_MAP_WRITE_DISCARD;
        auto hr = _context->Map(_vertexBuffer, 0, mapType, 0, &mappedResource);
        if (SUCCEEDED(hr))
        {
            OpenGLVertex *pData = reinterpret_cast<OpenGLVertex *>(mappedResource.pData);
            memcpy(&(pData[_currentVertex]), vertices.data(), sizeof(OpenGLVertex) * vertices.size());
            _context->Unmap(_vertexBuffer, 0);
        }

        // Bind the buffer and issue one draw call for just these vertices.
        UINT stride = sizeof(OpenGLVertex);
        UINT offset = 0;
        _context->IASetVertexBuffers(0, 1, &_vertexBuffer, &stride, &offset);
        _context->IASetPrimitiveTopology(topology);
        _context->Draw(static_cast<UINT>(vertices.size()), _currentVertex);

        // Advance the write position for the next call.
        _currentVertex += (int)vertices.size();
    }
And here is the method that creates the vertex buffer:
    void OpenGL::CreateVertexBuffer()
    {
        // Dynamic, CPU-writable vertex buffer sized for _maxVertices vertices.
        D3D11_BUFFER_DESC bd;
        ZeroMemory(&bd, sizeof(bd));
        bd.Usage = D3D11_USAGE_DYNAMIC;
        bd.ByteWidth = _maxVertices * sizeof(OpenGLVertex);
        bd.BindFlags = D3D11_BIND_VERTEX_BUFFER;
        bd.CPUAccessFlags = D3D11_CPU_ACCESS_FLAG::D3D11_CPU_ACCESS_WRITE;
        bd.MiscFlags = 0;
        bd.StructureByteStride = 0;

        // Declared but not used: the buffer is created without initial data.
        D3D11_SUBRESOURCE_DATA initData;
        ZeroMemory(&initData, sizeof(initData));

        _device->CreateBuffer(&bd, NULL, &_vertexBuffer);
    }
Here is my vertex shader code:
    cbuffer ModelViewProjectionConstantBuffer : register(b0)
    {
        matrix model;
        matrix view;
        matrix projection;
    };

    struct VertexShaderInput
    {
        float3 pos : POSITION;
        float4 color : COLOR0;
        float2 tex : TEXCOORD0;
    };

    struct VertexShaderOutput
    {
        float4 pos : SV_POSITION;
        float4 color : COLOR0;
        float2 tex : TEXCOORD0;
    };

    VertexShaderOutput main(VertexShaderInput input)
    {
        VertexShaderOutput output;
        float4 pos = float4(input.pos, 1.0f);

        // Transform the vertex position into projected space.
        pos = mul(pos, model);
        pos = mul(pos, view);
        pos = mul(pos, projection);
        output.pos = pos;

        // Pass through the color and texture coordinate without modification.
        output.color = input.color;
        output.tex = input.tex;

        return output;
    }