tanelpoder's comments | Hacker News

At an old startup attempt we once created a nested hierarchy metrics visualization chart that I later ended up calling Bookshelf Charts, as some of the boxes filled with smaller boxes looked like a bookshelf (if you tilted your head 90 degrees). Something between FlameGraphs and Treemaps. We also picked “random” colors for aesthetics, but it was interactive enough that you could choose a heat map color for the plotted boxes (where red == bad).

The source code got lost ages ago, but here are some screenshots of bookshelf graphs applied to SQL plan node level execution metrics:

https://tanelpoder.com/posts/sql-plan-flamegraph-loop-row-co...


Very neat. And if anyone from Plotly should happen to be reading this, a compact format like this might be an interesting option for Icicle Charts, akin to how the compact, indented version of Excel pivot tables saves horizontal space over the "classic" format pivot table.
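
To make the suggestion concrete, here is a rough sketch using the current go.Icicle API with made-up plan-node data; the compact, indented tiling itself is the part that doesn't exist yet, this just shows the standard layout with heat-map coloring:

  # Rough sketch with made-up plan-node data; the compact/indented tiling
  # suggested above is not an existing Plotly option. This only shows the
  # standard Icicle layout with heat-map coloring (red == hot).
  import plotly.graph_objects as go

  labels  = ["SELECT", "HASH JOIN", "TABLE SCAN A", "INDEX SCAN B"]
  parents = ["",       "SELECT",    "HASH JOIN",    "HASH JOIN"]
  rows    = [100, 100, 80, 20]

  fig = go.Figure(go.Icicle(
      labels=labels, parents=parents, values=rows,
      branchvalues="total",            # parent value = sum of its children
      marker=dict(colors=rows, colorscale="RdYlGn", reversescale=True),
  ))
  fig.show()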

Thanks for sharing, that is a neat in-between.

I figure it’s one way to keep your compiler version unchanged for eBPF work, while you might update/upgrade your dev OS packages over time for other reasons. The title of the linked issue is this:

“Checksum code does not work for LLVM-14, but it works for LLVM-13”

Newer compilers might use new optimizations that the verifier won’t be happy with. I guess the other option would be to find some config option to disable that specific incompatible optimization.


If anyone is interested in reading about a similar “local-NVMe made redundant & shared over network as block devices” engine, last year I did some testing of Silk’s cloud block storage solution (1.3M x 8kB IOPS and 20 GiB/s throughput when reading the block store from a single GCP VM). They’re using iSCSI with multipathing on the client side instead of a userspace driver:

https://tanelpoder.com/posts/testing-the-silk-platform-in-20...


I submitted the link that Julian posted (to his article) on X yesterday. There’s indeed some refresh action in my browser too, but the article opens up.


When it was first submitted the URL was http://0.0.0.0:4000/ .. that's since been magically fixed (by a mod?). I checked http://www.hydromatic.net and the post wasn't visible there either (the most recent post was the Morel Rust release 0.2.0).


Interesting. I just posted the link from this X post [1] without modifying it (and it wasn’t 0.0.0.0…). I think one thing HN does is that when it detects an HTTP redirect, it starts using the redirected-to URL instead…

[1] https://x.com/julianhyde/status/1982637782544900243


Yep.. it was the canonical header (not the fault of your submission, it's wrong on the blog and will affect SEO):

  <link rel="canonical" href="http://0.0.0.0:4000/draft-blog/2025/10/26/history-of-lambda-syntax.html" />
.. and the fact that the blog is called draft-blog doesn't help either.


(I submitted this link). My interest in this approach in general is about observability infra at scale - thinking about buffering detailed events, metrics and thread samples at the edge, filtering early, and only later extracting the things of interest. I’m a SQL & database nerd, so this approach looks interesting to me.


One snippet: "At this stage, the remaining events are recorded to a 40 Petabyte disk buffer."

A 40 PB disk buffer :-)


Indeed, it would be nice if there were a standardized API/naming for internal NVMe events, so you'd not have to look up the vendor-specific RAW counters and their offsets. Somewhat like the libpfm/PerfMon2 library, which provides standardized naming for common CPU counters/events across architectures.

The `nvme id-ctrl -H` (human readable) option does parse and explain some configuration settings and hardware capabilities in a more standardized fashion, but the availability of internal activity counters and events varies greatly across vendors, products and firmware versions (and even your currently installed nvme & smartctl software package versions).
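
About the only portable baseline is the spec-defined SMART/health log; a rough sketch of pulling it programmatically via nvme-cli's JSON output (the exact JSON key names are an assumption here and can differ between nvme-cli versions):

  # Rough sketch: read the standard (spec-defined, non-vendor) NVMe SMART/health
  # log via nvme-cli's JSON output. Assumes nvme-cli is installed and you have
  # the privileges to query /dev/nvme0; treat the JSON key names as illustrative,
  # they can vary between nvme-cli versions.
  import json, subprocess

  out = subprocess.run(
      ["nvme", "smart-log", "/dev/nvme0", "--output-format=json"],
      capture_output=True, text=True, check=True,
  ).stdout

  smart = json.loads(out)
  for key in ("data_units_read", "data_units_written", "media_errors"):
      print(key, smart.get(key, "n/a"))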

Regarding eBPF (for an OS-level view), the `biolatency` tool supports a -F option to additionally break down I/Os by their IORQ flags. I have added iorq_flags to my eBPF `xcapture` tool as well, so I can break down I/Os (and latencies) by submitter PID, user, program, etc., and see IORQ flags like "WRITE|SYNC|FUA" that help explain why some write operations are slower than others (especially on commodity SSDs without a power-loss-protected write cache).

An example of the IORQ flags output is below:

https://tanelpoder.com/posts/xcapture-xtop-beta/#disk-io-wai...


It's not only NVMe/SSD that could use such standardization.

If you want detailed Ryzen stats you have to use ryzen_monitor. If you want detailed Seagate HDD stats you have to use OpenSeaChest. If you want detailed NIC queue stats there's ethq. I'm sure there are other examples as well.

Most hardware metrics are still really difficult to collect, understand and monitor.


Good article, thanks for sharing. I've been working on one part of this problem space for quite a while too. I want the ability to directly drill down into latency reasons and the underlying application component threads' wall-clock time, instead of having to correlate various systemwide utilization metrics and try to manually connect the dots.

I'm using eBPF-based dimensional data analysis, starting from the bottom up (every system is a bunch of threads, including distributed systems) and moving up from there. This doesn't replace existing distributed tracing approaches for an end-to-end request view, but it gives you deep observability all the way down to each service's underlying threads' wall-clock time (where they're blocked or sleeping and why, etc).
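
As a toy illustration of the "dimensional" part (the column names below are made up for the sketch, not xcapture's actual output schema): once you have periodic thread samples, the analysis boils down to a group-by over whichever dimensions you care about:

  # Toy illustration of dimensional analysis over periodic thread samples.
  # Column names are made up for this sketch, not xcapture's actual schema.
  import pandas as pd

  samples = pd.DataFrame([            # one row per sampled thread per second
      {"comm": "postgres", "state": "RUNNING",  "wait": "-"},
      {"comm": "postgres", "state": "SLEEPING", "wait": "io_schedule"},
      {"comm": "postgres", "state": "SLEEPING", "wait": "io_schedule"},
      {"comm": "nginx",    "state": "RUNNING",  "wait": "-"},
  ])

  # Each sample approximates 1 second of wall-clock time for that thread;
  # group by any combination of dimensions to see where the time went.
  breakdown = (samples.groupby(["comm", "state", "wait"])
                      .size().rename("seconds")
                      .sort_values(ascending=False))
  print(breakdown)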

At this year's P99CONF I will launch the first GA release of my (open source) 0x.tools xcapture eBPF collectors, along with a reference TUI tool (xtop) that demonstrates dimensional performance modeling on these new thread sampling signals.

A couple of 1-minute asciicasts of xtop are here: https://tanelpoder.com/posts/xcapture-xtop-beta/


I guess it's because SimpleFold came from a research lab with a different degree of autonomy, fewer competing interests and less internal politics...


Yeah looks like this will be a common theme :-)

