GDC 2014: "Approaching Zero Driver Overhead in OpenGL (Presented by NVIDIA)" by Cass Everitt (NVIDIA), Tim Foley (Intel), John McDonald (NVIDIA), and Graham Sellers (AMD) https://gdcvault.com/play/1020791/Approaching-Zero-Driver-Overhead-in
-
@GDCPresoReviews @TomF I think it's also just an issue of such diverging hardware capabilities, combined with the desire for low-level access. It's a lot less bad if you filter for the ones appropriate for a given device class. At least in CPU land, Intel and AMD mostly straight-up copy each other's ISA extensions.
@GDCPresoReviews @TomF like extension bloat is certainly annoying, but mostly you never even consider using most of them. OTOH things like push constants combined with buffer device address, dynamic rendering, etc. are *hugely* simplifying (and core Vulkan!) but if you need to support mobile it's not happening. IMO extensions have the scary numbers, but the variety of ways to implement even the most basic things like "how to send data to a shader" is the true pain of learning and using the API.
-
Right, I think that’s why I had such a strong reaction to this AZDO talk in particular. The techniques they were describing weren’t iterative improvements on top of what you already have. They were describing fundamentally changing how your algorithms are structured. And they described a few different ways to do it, and indicated that you should use whichever is supported by the user’s device at runtime. Which just seems totally untenable to me.
-
@dotstdy @GDCPresoReviews Heh - eventually. And usually the second one to get there does it better because they have a bit of hindsight. AMD has the best implementation of AVX512 so far
-
I’ve heard others make the point I’m trying to make like this: if you use this dialect of the technology, would it recognize itself in the mirror?
Meaning, like, does the AZDO dialect of OpenGL look like OpenGL? Does it smell like OpenGL?
For comparison: you can write C++-style code but spell it in Lisp, and you can write Lisp-style code but spell it in C++, and both of those are poor uses of programming languages.
-
@TomF @dotstdy @GDCPresoReviews Can't Intel have hindsight now and redo it? The spec is the same, I guess. Is sunk cost stopping them?
-
@GDCPresoReviews @dotstdy Totally true, but even by the time of AZDO, it's a totally legit question to ask "what does OpenGL look like anyway?" It was an old old API, and had gone through so many revisions. It doesn't help that most extensions, even the official ones, were written as deltas on previous docs!
-
@breakin @dotstdy @GDCPresoReviews They did! That's what AVX10 is. And I mostly think it's a good idea. It sucks that you can't rely on 512-bit support, but that's a physical limit, and I'll just have to accept the designers' words that they can't make it work. But given that, AVX10 is very acceptable.
-
@TomF @GDCPresoReviews deltas are the most annoying part of khronos extensions, but at least these days for vulkan extensions they're publishing a "wtf is this" document as well, which goes a surprisingly long way towards improving things. e.g. https://github.com/KhronosGroup/Vulkan-Docs/blob/main/proposals/VK_KHR_shader_quad_control.adoc
-
@TomF @dotstdy @GDCPresoReviews Interesting! Almost seems as if one might have a different program running on the e-cores with this!
Edit: seems like this is no different from AVX512.
-
@breakin @dotstdy @GDCPresoReviews AVX10 just means a core can support "AVX256" (i.e. AVX512 features but half the width) without supporting the full 512 bits, which is difficult for the small cores. So that's a good thing.
-
@TomF @breakin @dotstdy @GDCPresoReviews IMO Intel should just support AVX512 even in E-cores, even if it means being really, really slow. It's more important that a feature works than that it's fast. But that's just my take.
-
@sol_hsa @TomF @breakin @GDCPresoReviews this is what they're doing in AVX10.2, it's just maximally confusing because why not.
-
@dotstdy @sol_hsa @breakin @GDCPresoReviews This is the (Intel) way.
-
@TomF @sol_hsa @breakin @GDCPresoReviews sometimes I think people take "a sign of a good deal is that everyone is unhappy" a bit too seriously :')