Title: Links of the Open World Programmer
Brief: Various links to articles and technical documentation I've found along the way of making my engines, mostly centered around OpenGL.
Date: 1714393506
Tags: Programming, OpenGL, GLSL, Generation
CSS: /style.css

## optimization articles

- [Various instancing hacks and methods, with benchmarks](https://solhsa.com/instancing.html)
- [Buffered memory can have drastic differences in read speed](https://community.intel.com/t5/Developing-Games-on-Intel/glMapBuffer-reading-mapped-memory-is-very-slow/td-p/1059726)
- [Selecting device is problematic with OpenGL, but there are some hacky ways](https://stackoverflow.com/questions/68469954/how-to-choose-specific-gpu-when-create-opengl-context)
- [For Linux environments it's a lot messier](https://bbs.archlinux.org/viewtopic.php?id=266456)
- [About vertex caching in instanced draw](http://eelpi.gotdns.org/papers/fast_vert_cache_opt.html)
- [Two level indexing](https://stackoverflow.com/questions/11148567/rendering-meshes-with-multiple-indices)
- [OpenGL Insights](https://openglinsights.com/)
- [Da Pipeline](https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/)
- [Apple guide on GLES](https://developer.apple.com/library/archive/documentation/3DDrawing/Conceptual/OpenGLES_ProgrammingGuide/Introduction/Introduction.html)
- [About GPU cache](https://www.rastergrid.com/blog/gpu-tech/2021/01/understanding-gpu-caches/)
- [NVIDIA guide](https://docs.nvidia.com/drive/drive_os_5.1.6.1L/nvvib_docs/index.html#page/DRIVE_OS_Linux_SDK_Development_Guide/Graphics/graphics_opengl.html)
- [Optimizing Triangle Strips for Fast Rendering](http://www.cs.umd.edu/gvil/papers/av_ts.pdf)
- [Tegra specific GLES2 guide](https://docs.nvidia.com/drive/drive_os_5.1.6.1L/nvvib_docs/DRIVE_OS_Linux_SDK_Development_Guide/baggage/tegra_gles2_performance.pdf)
- [List of old NVIDIA GLSL pragmas](http://www.icare3d.org/news_articles/nvidia_glsl_compiling_options.html)
- [Radeon 9XXX series optimization guide](https://people.freedesktop.org/~mareko/radeon-9700-opengl-programming-and-optimization-guide.pdf)
- [GLSL optimizations](https://www.khronos.org/opengl/wiki/GLSL_Optimizations)
- [iPhone 3D Programming book](https://www.oreilly.com/library/view/iphone-3d-programming/9781449388133/bk01-toc.html)
- [Article about cache utilization tips and techniques](https://johnnysswlab.com/make-your-programs-run-faster-by-better-using-the-data-cache/)
- [Pixel Buffer Object](http://www.songho.ca/opengl/gl_pbo.html)
- [Screen quads](https://stackoverflow.com/questions/2588875/whats-the-best-way-to-draw-a-fullscreen-quad-in-opengl-3-2)
- [Performance problems with framebuffer swaps](https://stackoverflow.com/questions/10729352/framebuffer-fbo-render-to-texture-is-very-slow-using-opengl-es-2-0-on-android)
- [Matrices inside textures](https://stackoverflow.com/questions/29672810/efficient-way-to-manage-matrices-within-a-graphic-application-using-texture-buff)
- [OpenGL Insights - Asynchronous Buffer Transfers](https://zbook.org/read/448e5c_opengl-insights-university-of-pennsylvania.html)
- [Scene rendering techniques presentation](https://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf)
- [High-performance extension galore](http://behindthepixels.io/assets/files/High-performance,%20Low-Overhead%20Rendering%20with%20OpenGL%20and%20Vulkan%20-%20Edward%20Liu.pdf)
- [NVIDIA example on shader based occlusion culling](https://github.com/nvpro-samples/gl_occlusion_culling)
- [Packing](http://smt565.blogspot.com/2011/04/bit-packing-depth-and-normals.html)
- [Thread on texture DMA when transferring](https://community.khronos.org/t/texture-performance/49104)
- [More recent DMA thread](https://community.khronos.org/t/direct-memory-access-in-opengl/108312/22)
- [Mesa driven GLSL optimizer, might be relevant for devices with poor optimizing compilers](https://github.com/aras-p/glsl-optimizer)
- [In-depth OpenGL feature overview with hardware support listed](https://www.g-truc.net/doc/Effective%20OpenGL.pdf)
- [Z-order curve to increase locality of multidimensional data](https://en.wikipedia.org/wiki/Z-order_curve#Texture_mapping)
- [Thread group locality](https://developer.nvidia.com/blog/optimizing-compute-shaders-for-l2-locality-using-thread-group-id-swizzling/)
- [Shader-db to prove instruction counts](https://blogs.igalia.com/apinheiro/2015/09/optimizing-shader-assembly-instruction-on-mesa-using-shader-db/)
- [Texture cache](https://computergraphics.stackexchange.com/questions/357/is-using-many-texture-maps-bad-for-caching)
- [Hierarchical Z map occlusion culling](https://www.rastergrid.com/blog/2010/10/hierarchical-z-map-based-occlusion-culling/)
- [Reducing driver overhead](https://gdcvault.com/play/1020791/)
- [Persistent mapping](https://www.khronos.org/opengl/wiki/Buffer_Object#Persistent_mapping)
- [Post transform cache friendly way of rendering regular grids](http://www.ludicon.com/castano/blog/2009/02/optimal-grid-rendering/)
- [Discussion on above](https://community.khronos.org/t/optimize-grid-rendering-for-post-t-l-cache/72272/9)
- [Vertex optimization on modern GPUs](https://www.tugraz.at/fileadmin/user_upload/Institute/ICG/Images/team_steinberger/Pipelines/HPG-2018_shading_rate-authorversion.opt.pdf)
- [General SIMD usage and techniques](https://repository.dl.itc.u-tokyo.ac.jp/record/48871/files/A32992.pdf)
- [Fragment friendly circle meshing](http://www.humus.name/index.php?page=News&ID=228)
- [Occlusion culling for terrain](https://www.researchgate.net/publication/248358913_Voxel_Column_Culling_Occlusion_Culling_For_Large_Terrain_Models)
- [Billboard quad transformation optimization](https://gamedev.stackexchange.com/questions/201963/efficient-calculation-of-billboard-sprite-transformations)
- [NVIDIA bindless extensions](https://developer.download.nvidia.com/opengl/tutorials/bindless_graphics.pdf)

## technical stuff

- [Determinism between OpenGL vendors](https://stackoverflow.com/questions/7922526/opengl-deterministic-rendering-between-gpu-vendor)
- [Early fragment test, my beloved](https://www.khronos.org/opengl/wiki/Early_Fragment_Test)
- [Shape digitalization](https://tug.org/docs/hobby/hobby-thesis.pdf)
- [Line and circle rasterization](http://www.sunshine2k.de/coding/java/Bresenham/RasterisingLinesCircles.pdf)
- [Occlusion culling of Vintage Story](https://github.com/tyronx/occlusionculling)
- [Minecraft work on cave occlusion, in 2 parts](https://tomcc.github.io/2014/08/31/visibility-1.html)
- [Order independent blending technique](https://jcgt.org/published/0002/02/09/)
- [High performance voxel engine](https://nickmcd.me/2021/04/04/high-performance-voxel-engine/)
- [Monotone meshing](https://blackflux.wordpress.com/tag/monotone-meshing/)
- [Capsule collision detection](https://wickedengine.net/2020/04/26/capsule-collision-detection/)
- [Forsyth vertex cache optimization](https://tomforsyth1000.github.io/papers/fast_vert_cache_opt.html)
- [Depth buffer based lighting](https://www.researchgate.net/publication/320616607_Eye-Dome_Lighting_a_non-photorealistic_shading_technique)

## generational stuff

- [Domain warping](https://iquilezles.org/articles/warp/)
- [Portable and fast Perlin noise in legacy GLSL](https://arxiv.org/abs/1204.1461)
- [Evaluation of GPU noise hashing solutions](https://jcgt.org/published/0009/03/02/paper.pdf)
- [SHISHUA](https://espadrine.github.io/blog/posts/shishua-the-fastest-prng-in-the-world.html)
- [Generalized lattice noise](https://www.codeproject.com/Articles/785084/A-generic-lattice-noise-algorithm-an-evolution-of)
- [Procedural hydrology](https://nickmcd.me/2020/04/15/procedural-hydrology/)
- [Tectonics](https://nickmcd.me/2020/12/03/clustered-convection-for-simulating-plate-tectonics/)
- [Approximation of heightmaps](https://www.cs.cmu.edu/~garland/scape/scape.pdf)

## notable extensions

- [Vertex array locking](https://registry.khronos.org/OpenGL/extensions/EXT/EXT_compiled_vertex_array.txt)
- [Packed pixels](https://people.freedesktop.org/~marcheu/extensions/EXT/packed_pixels.html)
- [Provoking vertex](https://registry.khronos.org/OpenGL/extensions/EXT/EXT_provoking_vertex.txt)
- [Framebuffer fetch](https://registry.khronos.org/OpenGL/extensions/EXT/EXT_shader_framebuffer_fetch.txt)
- [Integer textures](https://registry.khronos.org/OpenGL/extensions/EXT/EXT_texture_integer.txt)
- [Texture swizzle](https://registry.khronos.org/OpenGL/extensions/EXT/EXT_texture_swizzle.txt)
- [Shader binaries](https://registry.khronos.org/OpenGL/extensions/ARB/ARB_get_program_binary.txt)
- [Internal format query](https://registry.khronos.org/OpenGL/extensions/ARB/ARB_internalformat_query2.txt)
- [Direct state access](https://registry.khronos.org/OpenGL/extensions/EXT/EXT_direct_state_access.txt)
- [Texture view](https://registry.khronos.org/OpenGL/extensions/EXT/EXT_texture_view.txt)
- [No error](https://registry.khronos.org/OpenGL/extensions/KHR/KHR_no_error.txt)
- [Trinary min and max](https://registry.khronos.org/OpenGL/extensions/AMD/AMD_shader_trinary_minmax.txt)
- [NV occlusion query, with partitioning, without locking, potentially with less overdraw](https://registry.khronos.org/OpenGL/extensions/NV/NV_conditional_render.txt)
- [ES2 compatibility, can be used to query precision of floats](https://registry.khronos.org/OpenGL/extensions/ARB/ARB_ES2_compatibility.txt)
- [Pipeline stats](https://registry.khronos.org/OpenGL/extensions/ARB/ARB_pipeline_statistics_query.txt)
- [Parallel shader compile](https://registry.khronos.org/OpenGL/extensions/ARB/ARB_parallel_shader_compile.txt)
- [Shader inter group communication](https://registry.khronos.org/OpenGL/extensions/ARB/ARB_shader_ballot.txt)
- [Granular buffer memory control](https://registry.khronos.org/OpenGL/extensions/ARB/ARB_sparse_buffer.txt)
- [Window pos](https://people.freedesktop.org/~marcheu/extensions/ARB/window_pos.html)
- [Optimized fixed function fog](https://people.freedesktop.org/~marcheu/extensions/doc/fog_coord.html)

## data representations

- [Efficient varying-length integers](https://john-millikin.com/vu128-efficient-variable-length-integers)
- [Awesome article on hashtables](https://thenumb.at/Hashtables/)
- [Crit-bit trees](https://cr.yp.to/critbit.html)
- [QP tries](https://dotat.at/prog/qp/README.html)