My current job is in robotic space exploration, working on a renderer used to immerse a robot in a virtual world. This is just play in silico, relying heavily on immersive technology developed for video games. In evolutionary terms, play's function is to learn in preparation for real events, which appears to hold on the technological frontier too.
GPGPU was possible before compute shaders, but very painful. ShaderToy is the hard way today (and for the last ~15 years).
Simulations commonly target the GPU, and a game is just an interactive simulation, so in some respects it's straightforward to write games for the GPU. From a utilization perspective, though, you need tens of thousands of dynamic game entities at minimum, which is one among many reasons why it's uncommon.
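For intuition only (not any particular engine's API), here's a toy sketch of the data-parallel shape a GPU wants: one small, uniform update per entity, over enough entities to fill the machine. The entity count and update rule are made up, and NumPy stands in for a compute shader.

    # Toy data-parallel entity update (NumPy standing in for a compute shader).
    # A GPU runs tens of thousands of such updates concurrently; with only a
    # few hundred entities most of the chip would sit idle.
    import numpy as np

    N = 100_000                              # hypothetical entity count
    pos = np.random.rand(N, 3).astype(np.float32)
    vel = np.random.randn(N, 3).astype(np.float32)
    dt = 1.0 / 60.0

    def update(pos, vel, dt):
        # One uniform rule applied to every entity -- the "regularity" the
        # quoted article is after.
        pos += vel * dt
        vel *= 0.99                          # made-up drag term
        return pos, vel

    pos, vel = update(pos, vel, dt)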
Indeed. Though I did like the rest of the article, three of the pillars the authors use to define specialization are dubious:
> 1. substantial numbers of calculations can be parallelized
> 2. the computations to be done are stable and arrive at regular intervals ('regularity')
> 3. relatively few memory accesses are needed for a given amount of computation ('locality')
Where (1) fails, any modern multicore + SIMD + ILP desktop/console/mobile CPU will run at a tiny fraction of its peak throughput. While sufficiently small serial tasks still complete in "good enough" time, the same could be said of running serial programs on a GPU (in fact this is sometimes required in GPU programming). People routinely (and happily) use PL implementations which are ~100x slower than C. The acceptability of ludicrous under-utilization factors depends on how tiny your workload is and how much time you have to kill. Parallelism is used broadly for performance; it's about as un-specialized as you can get!
(2) and (3) are really extensions of (1), but both remain major issues for serial implementations too. There mostly aren't "serial" or "parallel" applications; rather, parallelism is a factor in algorithm selection and optimization, and almost anything can be made parallel. Naturally you specialize HW to extract high performance, and extracting high performance requires parallelism for specialized HW just as it does everywhere else.
The authors somewhat gesture towards the faults of their definition of "specialized" later on. Truly specialized HW trades away much (or all) programmability in favor of performance, a bar which excludes GPUs of the last ~15 years:
> [The] specialization with GPUs [still] benefited a broad range of applications... We also expect significant usage from those who were not the original designer of the specialized processor, but who re-design their algorithm to take advantage of new hardware, as deep learning users did with GPUs.
Right. Note that Gaussian splatting as a rendering primitive dates from the early 90s, but it never saw much use. Splats aren't very good for magnification (important for medical/scientific/engineering visualization), nor do they have easy support for domain repetition (important for video games).
The new thing is fast regression of a real light field into Gaussian splats, which can then be rendered at reasonable rates. Dynamic lighting requires full inverse rendering as a preprocess, which is way beyond the scope of the technique. Technically Gaussian splats could form part of an inverse rendering pipeline, and also be the final target representation, but I'm not sure there would be any benefit over alternative rendering primitives.
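For reference, the primitive itself is simple: each splat contributes a Gaussian falloff around its projected center, composited front to back. Below is a minimal 2D sketch; the centers, radii, colors, and opacities are made up, and the real 3DGS pipeline additionally projects anisotropic 3D Gaussians and sorts them per screen tile.

    # Minimal front-to-back compositing of 2D Gaussian splats at one pixel.
    import numpy as np

    def shade_pixel(p, splats):
        color = np.zeros(3)
        transmittance = 1.0
        for center, sigma, rgb, opacity in splats:   # assumed front-to-back order
            d2 = np.sum((p - center) ** 2)
            alpha = opacity * np.exp(-0.5 * d2 / sigma**2)
            color += transmittance * alpha * np.asarray(rgb)
            transmittance *= 1.0 - alpha
        return color

    splats = [(np.array([0.5, 0.5]), 0.1, (1.0, 0.2, 0.2), 0.8),
              (np.array([0.6, 0.4]), 0.2, (0.2, 0.2, 1.0), 0.6)]
    print(shade_pixel(np.array([0.52, 0.48]), splats))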
> I don't see "resisting using neural anything" as a good thing, just because they are popular
I read Aras' quip as more narrowly technical. OFC it's ambiguous and you'd have to ask the man himself.
The gist of NeRF is to obtain an NN representation of a 5D light field (3D position + 2D direction) from samples (photographs) of the real-world light field. Alarm bells ring already: 5 dimensions isn't that many! Considering the directional domain is typically handled with a low-rank spherical harmonic representation in NeRF-family methods, it's even more like 3D-and-change. To reconstruct a function of such low dimensionality, why choose an NN?
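To make the "low-dimensional directional domain" point concrete, here's a sketch of a degree-2 real spherical harmonic basis of the kind splat/voxel methods use for view-dependent color. The basis constants are the standard real SH normalization factors; the 9x3 coefficient table in the usage example is made up.

    # Degree-2 real spherical harmonic basis: the whole 2D directional domain
    # collapses to 9 coefficients per color channel.
    import numpy as np

    def sh_basis(d):
        x, y, z = d / np.linalg.norm(d)
        return np.array([
            0.282095,
            0.488603 * y, 0.488603 * z, 0.488603 * x,
            1.092548 * x * y, 1.092548 * y * z,
            0.315392 * (3.0 * z * z - 1.0),
            1.092548 * x * z, 0.546274 * (x * x - y * y),
        ])

    def eval_radiance(sh_coeffs, d):      # sh_coeffs: shape (9, 3)
        return sh_basis(d) @ sh_coeffs

    coeffs = np.zeros((9, 3))
    coeffs[0] = [0.5, 0.5, 0.5]           # flat grey, just as an example
    print(eval_radiance(coeffs, np.array([0.0, 0.0, 1.0])))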
Then at inference time, for each pixel, you have to sample the NN repeatedly along the view ray. This part is exceedingly silly, as compact representations of light fields are a solved, bread-and-butter problem in graphics.
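For context, the per-pixel cost comes from the standard emission-absorption quadrature NeRF is built on: march the ray, query density and color at each sample, composite. This is a sketch of that quadrature, not NeRF's exact implementation; toy_field is a made-up stand-in for the NN.

    # Emission-absorption quadrature along a view ray. Every sample is one
    # query of the field (an NN in NeRF's case), hence the per-pixel cost.
    import numpy as np

    def render_ray(origin, direction, field, t_near=0.1, t_far=4.0, n=64):
        ts = np.linspace(t_near, t_far, n)
        dt = ts[1] - ts[0]
        color = np.zeros(3)
        transmittance = 1.0
        for t in ts:
            sigma, rgb = field(origin + t * direction, direction)  # the expensive part
            alpha = 1.0 - np.exp(-sigma * dt)
            color += transmittance * alpha * rgb
            transmittance *= 1.0 - alpha
        return color

    def toy_field(p, d):
        # Hypothetical density blob in place of the trained NN.
        sigma = 5.0 * np.exp(-8.0 * np.dot(p, p))
        return sigma, np.array([1.0, 0.5, 0.2])

    print(render_ray(np.array([0.0, 0.0, -2.0]), np.array([0.0, 0.0, 1.0]), toy_field))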
Later on Plenoxels explicitly took the "Ne" out of "NeRF", giving far higher training and inference performance (also mentioned ITT). To be fair, and later still, Nvidia somewhat redeemed NNs here with Instant NeRF:
https://nvlabs.github.io/instant-ngp/assets/mueller2022insta...
...where the twist was to interpolate fancy input embeddings, which are run through a tiny NN. That tininess is important: having to fetch NN weights from VRAM would kick NNs right off the Pareto frontier.
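Roughly, the trick is: hash the lattice corners around the query point at several resolutions, trilinearly interpolate learned feature vectors, concatenate across levels, and feed that to an MLP small enough to stay on-chip. A heavily simplified sketch of one level's lookup follows; the hash constants are commonly used spatial-hash primes, while the table size, feature width, and random table contents are placeholders (in the real thing they're trained).

    # One level of a multiresolution hash-grid lookup (heavily simplified).
    import numpy as np

    PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)
    T = 2**14                     # hash table size for this level (assumed)
    F = 2                         # features per entry (assumed)
    table = np.random.randn(T, F).astype(np.float32)

    def hash_corner(ijk):
        return int(np.bitwise_xor.reduce(ijk.astype(np.uint64) * PRIMES) % T)

    def lookup(p, resolution):
        # p in [0,1)^3; trilinear interpolation over the 8 surrounding corners.
        x = p * resolution
        i0 = np.floor(x).astype(np.int64)
        w = x - i0
        feat = np.zeros(F, dtype=np.float32)
        for dz in (0, 1):
            for dy in (0, 1):
                for dx in (0, 1):
                    corner = i0 + [dx, dy, dz]
                    weight = ((w[0] if dx else 1 - w[0]) *
                              (w[1] if dy else 1 - w[1]) *
                              (w[2] if dz else 1 - w[2]))
                    feat += weight * table[hash_corner(corner)]
        return feat    # concatenated across levels, this feeds the tiny MLP

    print(lookup(np.array([0.3, 0.7, 0.1]), resolution=64))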
Zooming out, NNs have only seen wide adoption in graphics engineering for reconstruction from sparse data (incl. denoising). Makes sense, as that's a high-dimensional problem. Still, beware that the NN solutions rarely blow handmade algorithms out of the water. I also think tiny NNs for compression (closely related to reconstruction) have a future. Beyond that, if NNs were to set the graphics world ablaze, it would've happened by now.
Lots of graphics engineering is just approximating functions, so it's natural NNs have some place here. However, our functions tend to be more understandable, tractable, malleable. It's not an application domain where it's virtually impossible to write an algorithmic solution by hand (let alone one that performs well), like natural language understanding.
Three sentences in, it was obviously sleep apnea. Guess the author was going for dramatic irony. I'm curious why he didn't do a cursory Internet search for his symptoms.
At work I inherited a raytracer codebase with a severe memory bloat problem on terrains. The size of terrain BLASes is precisely what one would expect from a bog-standard BVH with branch factor 2, so I'm sure you're right.
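Back-of-envelope for why a bog-standard binary BVH bloats on terrains: with one triangle per leaf you get N leaves and N-1 internal nodes, so ~2N nodes total. The node size and terrain resolution below are assumptions for illustration only.

    # Rough BLAS size estimate for a binary BVH over a heightfield terrain.
    quads_per_side = 4096
    triangles = quads_per_side * quads_per_side * 2      # ~33.5M triangles
    nodes = 2 * triangles - 1                            # N leaves + (N-1) internal
    node_bytes = 32                                      # assumed compact AABB node
    print(f"{nodes * node_bytes / 2**30:.1f} GiB of nodes")  # ~2.0 GiB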
This is on Turing. Nvidia would've been motivated to de-risk the introduction of RTX by making boring choices. You may well see different results on later archs.
Pixels are points with a (usually) square footprint. Whether the point-ness or square-ness is emphasized depends on context. "Pixel" = "picture element" = "quantum of a picture", nothing more.