3D APIs have used the same idea for a long time, probably going all the way back to the mid-90s with D3D2 'execute buffers'. (GL display lists are even older, but they're only similar: they're expected to be recorded once and executed many times instead of being rebuilt each frame.)
And io_uring itself was more directly inspired by NVMe and RDMA, which of course work with the same kind of queues as GFX cards. The original io_uring patch compares itself to SPDK, whose premise is "what if we expose an abstraction for a hardware queue per thread to an application" - basically the same programming model as io_uring. And SPDK was just taking techniques from networking (DPDK) and applying them to storage.