Yours matches my own experience and work habits.

My mental model is that LLMs are obedient but lazy. The laziness shows in output that matches the letter of the prompt but with the highest "code entropy" possible.

What I mean by "code entropy" is, for example, that copy-paste-tweak (high entropy) is always easier (in the short term) for LLMs (and humans) to output than defining a function to hold the concepts common across the pastes, with the "tweak" represented by function arguments.
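A toy contrast of the two forms (hypothetical snippet, just to illustrate the term):

```python
import math

# High entropy: copy-paste-tweak. Each paste repeats the whole concept.
red_area = 3.14159 * 2.0 ** 2
blue_area = 3.14159 * 3.5 ** 2

# Lower entropy: the common concept gets a name; the tweak becomes an argument.
def circle_area(radius):
    return math.pi * radius ** 2

red_area = circle_area(2.0)
blue_area = circle_area(3.5)
```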

LLMs will produce high entropy output unless constrained to produce lower entropy ("better") code.

Until/unless LLMs are trained to actually apply craft learned by experienced humans, we must be explicit in our prompts.

For example, I get good results from, say, Claude Sonnet when my instructions include:

- Statements of specific file, class, function names to use.

- Explicit design patterns to apply. ("loop over the outer product of lists of choices for each category")

- Implementation hints ("use itertools.product() to iterate over the combinations"; see the sketch after this list)

- And, "ask questions if you are uncertain" helps trigger an iteration to quickly clarify something instead of fixing the resulting code.
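For instance, the second and third hints above might expand to something like this (the category names here are hypothetical):

```python
import itertools

# Lists of choices for each category (made-up example data).
choices = {
    "optimizer": ["sgd", "adam"],
    "batch_size": [32, 128],
    "lr": [1e-3, 1e-4],
}

# Loop over the outer product of the per-category choice lists.
for combo in itertools.product(*choices.values()):
    config = dict(zip(choices.keys(), combo))
    print(config)  # one configuration per combination
```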

This specificity makes prompting a lot more work but it pays off. I only go this far when I care about the resulting code. And, I still often "retouch" as you also describe.

OTOH, when I'm vibing I'll just give end goals and let the slop flow.


Data flow graphs could arguably be called structured concurrency (granted, of nodes that resemble actors).

FWIW, this has become a perfectly cromulent pattern over the decades.

It allows highly concurrent computation limited only by the size and shape of the graph while allowing all the payloads to be implemented in simple single-threaded code.

The flow graph pattern can also be extended into a distributed system by giving certain nodes side effects that transfer data to other systems running in other contexts. This extension does not need any particularly advanced design changes and, most importantly, the changes are limited to just the "entrance" and "exit" nodes that communicate between contexts.

I am curious to learn more about your system, in particular what language or mechanism you use to describe the graph.


We’re using the C++ Actor Framework (CAF) to provide the actor system implementation, and then we ended up using a stupid old protobuf to describe the compute graph. Protobuf doubles as a messaging format and a schema with reflection, so it lets us receive pipeline jobs over gRPC and then inflate them with less boilerplate (by C++ standards, anyway).

Related to what you were saying, the protobuf schema has special dedicated entries for the entrance and exit nodes, so only the top level pipeline has them. Thus the recursive aspect (where nodes can themselves contain sub-graphs) applies only to the processor-y bit in the middle. That allowed us to encourage the side effects to stay at the periphery, although I think it’s still possible in principle. But at least the design gently guides you towards doing it that way.

After having created our system, I discovered the Reactor framework (e.g. Lingua Franca). If I could do it all over, I think I would have built using that formalism, because it is better suited for making composable dataflows. The issue with the actor model for this use case is that actors generally know about each other and refer to each other by name. Composable dataflows want the opposite assumption: you just want to push data into some named output ports, relying on the orchestration layer above you to decide who is hooked up to that port.
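A minimal sketch of that port-based assumption in plain Python (this is neither CAF nor Lingua Franca; all names here are made up):

```python
from collections import defaultdict

class Node:
    def __init__(self):
        self._subs = defaultdict(list)   # output port name -> consumer callbacks

    def emit(self, port, payload):
        # A node only knows its own port names, never who consumes them.
        for consumer in self._subs[port]:
            consumer(payload)

def connect(producer, port, consumer):
    # The orchestration layer, not the node, decides who is hooked up.
    producer._subs[port].append(consumer)

class Doubler(Node):
    def push(self, x):
        self.emit("out", 2 * x)

doubler = Doubler()
connect(doubler, "out", print)   # wiring lives entirely outside the node
doubler.push(21)                 # prints 42
```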

To solve the above problem, we elected to write a rather involved subsystem within the inflation layer that stitches the business actors together via “topic” actors. CAF also provides a purpose-built flows system that sits on top of the actors, which allows us to write the internals of a business actor in a functional reactive-x style. When all is said and done, our business actors don’t really look much like actors; they’re more like MIMO dataflow operators.

When you zoom out, it also becomes obvious that we are in many ways re-creating gstreamer. But if you’ve ever used gstreamer before, you may understand why “let’s rest our whole business on writing gstreamer elements” is too painful a notion to be entertained.


Sounds good. Thanks for describing it.

Since you still have C++ involved, and if you are still looking for composable dataflow ideas, take a look at TBB's "flow_graph" module. Its graph execution is all in-process, while what you describe sounds more distributed, but perhaps it is still interesting.


I see the "hunter2" exploit is ready to be upgraded for the LLM era.


Eject mass in the forward direction of its current tangent of motion. Slow down to go down.


So, for this they have a bit of expendable extra mass on board? What material is it? Wouldn't it cause even more debris then?


The 'expendable mass' is almost never a solid or liquid. It's the gaseous combustion exhaust or plasma exhaust from the satellite's thrusters. The advantage of gases is that they expand and disperse fast enough to be too wispy to cause any damage on impact.

However, there are a few systems that do use solid masses for obtaining a reaction force. A remarkable example is called a 'yo-yo despinner' [1]. It was used in missions like Phoenix (Mars mission) and Dawn (asteroid belt proto-planet mission). And yes, it does create space debris, but that debris is probably somewhere in orbit around the sun. Nothing those guys are going to be too worried about.

[1] https://en.wikipedia.org/wiki/Yo-yo_de-spin


https://starlink.com/technology

> Efficient argon thrusters enable Starlink satellites to orbit raise, maneuver in space, and deorbit at the end of their useful life. Starlink is the first argon propelled spacecraft ever flown in space.

And you can see "How Ion Engines Work in Under 60 Seconds" https://www.youtube.com/shorts/_MUv28Yf_4g


Satellites need thrusters for station keeping. Otherwise they drift out of their desired orbits over time.


Yes, though the smallest ones, like cubesats, don't have them. They do tend to have reaction wheels for keeping themselves aligned, but they can't actually affect their own orbit.


You're correct. I'm just going to add a bit more technical context here. The process of keeping a steady orbit is called 'station keeping'. Similarly, the process of maintaining the correct orientation/alignment is called 'attitude control' (attitude is the technical term for orientation).

Attitude control can be achieved to a finite limit using momentum wheels or reaction wheels. But at some point a wheel will reach its maximum speed, and its control capability saturates. You then need to 'desaturate' the wheels to restore that capability. One method is to produce a counter-torque using special reaction control thrusters (RCTs) called 'attitude control thrusters'. That needs propellant, and smaller satellites don't have that luxury. So they instead exploit Earth's magnetic field by using a 'magnetic torquer' to push against it. That needs only power, not propellant.
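For a rough sense of the magnetic torquer trade: the available torque is the cross product of the coil's commanded magnetic dipole moment and the local geomagnetic field (tau = m x B). A toy calculation with illustrative numbers, not from any real spacecraft:

```python
import numpy as np

m = np.array([0.0, 0.0, 0.2])        # dipole moment, A*m^2 (cubesat-class torquer)
B = np.array([2.0e-5, 0.0, 3.0e-5])  # geomagnetic field in LEO, tesla (~36 uT)

tau = np.cross(m, B)                 # torque: tau = m x B
print(tau, np.linalg.norm(tau), "N*m")  # a few micronewton-meters, slow but free
```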


> No matter how advanced the model gets, you'll get better results if you can clarify your thoughts well in written language.

This definitely agrees with my experience. But a corollary is that written human language is very cumbersome for encoding some complex concepts. More and more I give up on LLM-assisted programming, because it is easier to express my desires in code than to describe in English what forms I want to see in the produced code. Perhaps once LLMs get something akin to judgement and wisdom, I can express my desires in the terms I use with other experienced humans and take for granted certain obvious quality aspects I want in the results.


Pedantically, this game is at least 2D due to including time. It presents more dimensions if we consider the different LED colors.

I'm now trying to contemplate what a truly 1D pong game would be. We can't escape time, so we would have to remove the positional and chromatic dimensions. That leaves us with a single blinking monochromatic LED.

Perhaps the game would resemble Richmond's flashing lights.


L1 (the sum of absolute differences) is useful because minimizing it approximates minimizing L0 (the count of nonzero entries, aka maximizing sparsity). The reason for the substitution is that L1 has a gradient, so minimization can be fast with conventional gradient descent methods, while minimizing L0 is a combinatoric problem and solving that is "hard". It is also common to add an L1 term to an L2 term to bias the solution toward sparsity.
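A minimal sketch of the substitution in action, using ISTA (proximal gradient descent, where the L1 proximal step is soft-thresholding) on made-up data:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1: shrinks entries toward zero and
    # zeroes the small ones -- this is what induces sparsity.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 100))
x_true = np.zeros(100)
x_true[[3, 17, 42]] = [1.0, -2.0, 0.5]   # sparse ground truth
b = A @ x_true

lam = 0.1
x = np.zeros(100)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the smooth L2 part
for _ in range(500):
    grad = A.T @ (A @ x - b)             # gradient of 0.5*||Ax - b||^2
    x = soft_threshold(x - step * grad, step * lam)

print(np.count_nonzero(x))               # few nonzeros: a sparse solution
```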


The article is about comments. But, more generally, I think the issue here is about naming things.

Names capture ideas. Only if we name something can we (or at least I) reason about it. The more clear and descriptive a name for something is, the less cognitive load is required to include the thing in that reasoning.

TFA's example, that "weight" is a better variable name than "w", works because "weight" immediately has a meaning, while the use of "w" requires me to carry around the cumbersome "w is weight" whenever I see or think about "w".

Function names serve the same purpose as variable names but for operations instead of data.

Of course, with naming, context matters, and defining functions adds lines of code, which adds complexity. As does defining overly verbose variable names: "the_weight_of_the_red_ball" instead of "weight". So, some balance that takes the context into account is needed, and perhaps there is some art in finding that balance.
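The trade-off in toy form (all names hypothetical):

```python
w = 4.2                            # reader must carry "w is weight" in their head
weight = 4.2                       # meaning is immediate
the_weight_of_the_red_ball = 4.2   # so verbose it adds load of its own
```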

Comments, then, provide a useful intermediate on a spectrum between function-heavy "Uncle Bob" style and function-less "stream of consciousness" style.


The first time you write something, descriptive names are handy, but if you're writing a second or third copy, or trying to combine several back down into one, those names are all baggage and come with a full mental model.

An alternative I've seen work well is names that aren't descriptive on their own, but are unique and memorable, and can be looked up in a dictionary.


I was also wondering about the inherent resolution for the BPM precision claims.

Besides the sample period, the total number of samples matters for frequency resolution (aka BPM precision).

A 44100 Hz sampling frequency (22.675737 us period) for 216.276 s gives 9537772 samples (rounding to the nearest integer). This gives frequency samples with a bin width of 0.0046237213 Hz, which is 0.27742328 BPM.
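The arithmetic, for anyone who wants to check it:

```python
fs = 44100.0            # sampling frequency, Hz
T = 216.276             # duration of the song, s

N = round(fs * T)       # 9537772 samples
df = fs / N             # bin width: ~0.0046237213 Hz (equivalently ~1/T)
print(N, df, df * 60.0) # 60 s/min converts Hz to BPM: ~0.27742328 BPM
```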

Any claim of a BPM more precise than about 0.3 BPM is "creative interpretation".

And this is a minimum precision. Peaks in real-world spectra have width, which further reduces the precision of their location.

Edit to add:

https://0x0.st/Pos0.png

This takes my FLAC rip of the CD and simply uses the full song waveform. This artificially increases the frequency precision a little compared to taking only the time span where beats are occurring.


This is plainly false though. You're saying beats can't be localized to less than one second of precision (regardless of track length, which already smells suspect). Humans can localize a beat to within 50ms.


Yes, I got lost in the numbers and blundered by misinterpreting what we mean by frequency resolution expressed in "BPM" instead of Hz.

It is correct to say "0.0046237213 Hz, which is 0.27742328 BPM". My mistake was to interpret 0.27742328 BPM as the limit of frequency resolution in units of BPM. Rather, any measured BPM must be an exact multiple of 0.27742328 BPM.

Thanks for pointing out my mistake!

> (regardless of track length, which already smells suspect)

Frequency resolution being dependent on the number of samples is a very well known property of basic sampling theory and signal analysis.

In fact, one can interpolate the frequency spectrum by zero-padding the time samples. This increases the resolution in an artificial way, because it is, after all, an interpolation. However, a longer song has more natural frequency resolution than a shorter one.

Note, this frequency resolution is not related to fidelity, which is some messy, human-related thing evaluated over a sliding window of shorter duration, and which I don't pretend to understand.

BTW, the opposite is also possible: you can zero-pad the spectrum as a means of resampling (interpolating) the time domain. This is slower but more spectrally correct than, say, time-domain linear or cubic interpolation.

These techniques require an FFT and so are somewhat expensive to apply to long signals like an entire song, as I did for the plot. Daft Punk's HBFS takes about 8 seconds on one CPU core with Numpy's FFT.
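For the curious, a minimal sketch of the spectral zero-padding resampler (upsampling only, and glossing over the Nyquist-bin bookkeeping that a production routine such as scipy.signal.resample handles):

```python
import numpy as np

def fft_upsample(x, n_out):
    # Zero-pad the (real) spectrum, then invert: band-limited interpolation.
    n_in = len(x)
    assert n_out >= n_in, "this sketch handles upsampling only"
    X = np.fft.rfft(x)
    X_pad = np.zeros(n_out // 2 + 1, dtype=complex)
    X_pad[: len(X)] = X
    # Rescale so amplitudes are preserved by the longer inverse transform.
    return np.fft.irfft(X_pad, n_out) * (n_out / n_in)

t = np.linspace(0.0, 1.0, 100, endpoint=False)
x = np.sin(2 * np.pi * 3 * t)   # 3 Hz tone, 100 samples
y = fft_upsample(x, 400)        # same tone on a 4x finer time grid
```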


Well, plants and eyes long predate apes.

Water is most transparent in the middle of the "visible" spectrum (green); it absorbs red and scatters blue. The atmosphere has a lot of water, as does, of course, the ocean, which was the birthplace of plants and eyeballs.

It would be natural for both plants and eyes to evolve to exploit the fact that there is a green notch in the water transparency curve.

Edit: after scrolling, I find more discussion on this below.


Eyes aren't all equal. Our trichromacy is fairly rare in the world of animals.

