System Used: Surface Pro 4
CPU SKU: i5
GPU SKU: Intel® HD Graphics 520
Processor Line: 6300u
System BIOS Version:
CMOS settings:
Graphics Driver Version: 20.19.15.4380 (Beta, also occurs with most recent certified driver)
GOP/VBIOS Version:
Operating System: Windows
OS Version: 10 Pro
API: DX12
Running sample code using ExecuteIndirect crashes the driver and aborts the application.
Tested using the 'Asteroids' demo [intel]: https://github.com/GameTechDev/asteroids_d3d12 (toggle ExecuteIndirect with key)
Tested using the 'MiniEngine/Microsoft Samples : D3D12ExecuteIndirect' : https://github.com/Microsoft/DirectX-Graphics-Samples
Both samples appear correct. In one case the ExecuteIndirect argument buffer is populated by the CPU into a GENERIC_READ/UPLOAD buffer, and in the other it is populated by the GPU via a UAV write.
The application will crash shortly after launching with ExecuteIndirect enabled and a Driver Recovery notice will appear.
Occurs with and without Debug validation library enabled.
Tested most recently on the latest 4380 (beta) driver, but also occurs on latest certified driver provided by Windows update.
Have not found any documents to suggest HD 520 should not support ExecuteIndirect on D3D12.
Intel documentation describes implementation details that use Compute to patch the command buffer, which sounds portable, and Compute otherwise seems to behave correctly.
If there are alignment/stride/size restrictions on the command buffer that the patching requires, I'd appreciate documentation.
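To make the alignment/stride/size question concrete, here is a minimal sketch of the kind of check I have in mind. It is not taken from either sample: `DrawIndexedArgs`, `IndirectRecord`, `isDwordAligned`, `argumentBytesNeeded` and `argumentRangeLooksValid` are hypothetical names, the structs only mirror the D3D12 layouts so the code builds without d3d12.h, and the 4-byte rule is my assumption about what D3D12 expects, not a documented Intel restriction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical local mirror of D3D12_DRAW_INDEXED_ARGUMENTS (five 32-bit fields).
struct DrawIndexedArgs {
    std::uint32_t indexCountPerInstance;
    std::uint32_t instanceCount;
    std::uint32_t startIndexLocation;
    std::int32_t  baseVertexLocation;
    std::uint32_t startInstanceLocation;
};
static_assert(sizeof(DrawIndexedArgs) == 20, "five 32-bit fields");

// Hypothetical per-draw record: a constant buffer address plus the draw arguments.
struct IndirectRecord {
    std::uint64_t   constantBufferAddress;  // stands in for D3D12_GPU_VIRTUAL_ADDRESS
    DrawIndexedArgs draw;
};
static_assert(sizeof(IndirectRecord) % 4 == 0, "stride should be a multiple of 4");

// Assumption: offsets and strides are expected to be multiples of 4 bytes.
constexpr bool isDwordAligned(std::uint64_t value) { return (value % 4u) == 0; }

// Bytes the argument buffer must hold for maxCommands records starting at offset.
constexpr std::uint64_t argumentBytesNeeded(std::uint64_t offset,
                                            std::uint32_t maxCommands,
                                            std::uint32_t stride) {
    return offset + static_cast<std::uint64_t>(maxCommands) * stride;
}

// True when the offset and stride are 4-byte aligned and the records fit in the buffer.
inline bool argumentRangeLooksValid(std::uint64_t bufferSize, std::uint64_t offset,
                                    std::uint32_t maxCommands, std::uint32_t stride) {
    assert(stride != 0);
    return isDwordAligned(offset) && isDwordAligned(stride) &&
           argumentBytesNeeded(offset, maxCommands, stride) <= bufferSize;
}
```

Calling `argumentRangeLooksValid(bufferSize, offset, maxCommands, sizeof(IndirectRecord))` before recording ExecuteIndirect would catch the obvious mistakes on the application side; anything stricter than this would have to come from the driver documentation.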
Thanks,
Daniel
3 Replies
Hi Daniel,
I have this captured and am getting the D3D driver team to investigate. I will let you know what we find out.
-Michael
Hi Daniel,
We reproduced the issue and have a fix ready. We were fortunate a similar issue was fixed in the driver already. So the next driver release will fix your issue. The driver release will happen next week.
-Michael
Hi Michael,
I have the 4404 beta driver and it appears to behave correctly now. Great, this is super useful.
That said, I was disappointed in the performance of ExecuteIndirect in the Asteroids Intel example.
On the HD 520/Surface I am using, the demo is heavily GPU bound, so I expected some drop in performance from moving the CPU work of setting up the dispatches to the GPU on this hardware.
However, it seems that the driver spends about 5 ms of GPU time translating (I presume) the 12500 indirect draw arguments (400000 bytes total) into something it can dispatch, dropping the FPS from ~60 (no vsync) to ~40.
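For reference, the arithmetic behind those figures can be sketched as below. The helper names (`frameMs`, `addedFrameMs`, `bytesPerArgument`, `nsPerArgument`) are hypothetical and the inputs are only the approximate numbers quoted above. Going from ~60 to ~40 FPS is roughly 16.7 ms to 25 ms per frame, so about 8.3 ms more per frame against the ~5 ms I attribute to translation; the figures are rough, so the gap may be nothing more than rounding. The quoted buffer size works out to 32 bytes per argument, and 5 ms over 12500 arguments is about 400 ns each.

```cpp
#include <cassert>
#include <cstdint>

// Frame time in milliseconds for a given frame rate.
constexpr double frameMs(double fps) { return 1000.0 / fps; }

// Extra milliseconds per frame when the rate drops from fpsBefore to fpsAfter.
constexpr double addedFrameMs(double fpsBefore, double fpsAfter) {
    return frameMs(fpsAfter) - frameMs(fpsBefore);
}

// Average bytes per indirect argument record in a buffer of totalBytes.
inline double bytesPerArgument(std::uint64_t totalBytes, std::uint64_t count) {
    assert(count > 0);
    return static_cast<double>(totalBytes) / static_cast<double>(count);
}

// Average GPU nanoseconds per argument if gpuMs milliseconds cover count arguments.
inline double nsPerArgument(double gpuMs, std::uint64_t count) {
    assert(count > 0);
    return gpuMs * 1.0e6 / static_cast<double>(count);
}
```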
Can you tell me if this is the expected GPU performance impact of ExecuteIndirect for ~12k draws on an HD 520 type GPU?
Efficiently walking 40-byte structures in compute isn't fun in my experience. Are there more suitable command signature strides that might perform better?
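To show what I mean by stride choice, here is a rough sketch that compares candidate strides. The helpers `roundUpTo` and `countStraddling` are hypothetical, and the 16-byte and 64-byte granules are illustrative assumptions about load width and cache-line size, not confirmed properties of the HD 520 or of the driver's patching pass.

```cpp
#include <cassert>
#include <cstdint>

// Round stride up to the next multiple of granule (e.g. 40 -> 48 for 16, 40 -> 64 for 64).
constexpr std::uint64_t roundUpTo(std::uint64_t stride, std::uint64_t granule) {
    return ((stride + granule - 1) / granule) * granule;
}

// Count how many tightly packed records cross a granule boundary,
// i.e. whose first and last byte fall in different granule-sized blocks.
inline std::uint64_t countStraddling(std::uint64_t count, std::uint64_t stride,
                                     std::uint64_t granule) {
    assert(stride > 0 && granule > 0);
    std::uint64_t straddling = 0;
    for (std::uint64_t i = 0; i < count; ++i) {
        const std::uint64_t first = i * stride;
        const std::uint64_t last = first + stride - 1;
        if (first / granule != last / granule) {
            ++straddling;
        }
    }
    return straddling;
}
```

With a 64-byte granule, half of 12500 tightly packed 40-byte records cross a boundary, while 32-byte or 64-byte strides never do. Padding costs memory (12500 records at 64 bytes is 800000 bytes), and whether it actually helps here is exactly what I am asking.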
Regards,
Daniel