Many of the podcast problems were due to Honey leaking every promo code to the world, including employee discounts and customer-specific discounts.
When Honey is scraping every browser tab, there is no way to link a podcast campaign to engagement, so the way podcasters were paid for driving traffic is lost.
Basically, Honey copied the Ashley Madison model: unconstrained addition with pay-to-delete. Ashley Madison had no email verification, FYI; any bot or angry neighbor could sign you up for an account, and then they wanted payment to delete it.
Honey would extract any promo code they could find, then try to make you pay to remove it.
TLA+ is not a silver bullet, and like all temporal logic, has constraints.
You really have to be able to reduce your models to: “at some point in the future, this will happen," or "it will always be true from now on”
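For anyone not steeped in the notation, those two shapes are just the standard "eventually" and "always" temporal operators; roughly, in LaTeX-ish notation (the leads-to pattern is the usual liveness idiom, nothing specific to any one spec):

```latex
% "At some point in the future, P will happen" (eventually P)
\Diamond P \;\equiv\; \lnot \Box \lnot P

% "P will always be true from now on" (always P)
\Box P

% A common liveness pattern, "P leads to Q":
P \leadsto Q \;\equiv\; \Box (P \Rightarrow \Diamond Q)
```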
Have probabilistic outcomes? Or even floats? [0] It becomes challenging, and strings are a mess.
> Note there is not a float type. Floats have complex semantics that are extremely hard to represent. Usually you can abstract them out, but if you absolutely need floats then TLA+ is the wrong tool for the job.
TLA+ works for the problems it is suitable for; try to extend it past that and it simply fails.
> You really have to be able to reduce your models to: “at some point in the future, this will happen," or "it will always be true from now on”
You really don't. It's not LTL. Abstraction/refinement relations are at the core of TLA.
> Or even floats [0] and it becomes challenging and strings are a mess.
No problem with floats or strings as far as specification goes. The particular verification tools you choose to run on your TLA+ spec may or may not have limitations in these areas, though.
> TLA+ works for the problems it is suitable for, try and extend past that and it simply fails.
TLA+ can specify anything that could be specified in mathematics. That there is no predefined set of floats is no more a problem than the one physicists face because mathematics has no "built-in" concept for metal or temperature. TLA+ doesn't even have any built in notions of procedures, memory, instructions, threads, IO, variables in the programming sense, or, indeed programs. It is a mathematical framework for describing the behaviour of discrete or hybrid continuous-discrete dynamical systems, just as ODEs describe continuous dynamical systems.
But you're talking about the verification tools you can run on a TLA+ spec, and like all verification tools, they have their limitations. I never claimed otherwise.
You are, however, absolutely right that it's difficult to specify probabilistic properties in TLA+.
> No problem with floats or strings as far as specification goes. The particular verification tools you choose to run on your TLA+ spec may or may not have limitations in these areas, though.
I think it's disingenuous to say that TLA+ verifiers "may or may not have limitations" wrt floats when none of the available tools support floats. People should know going in that they won't be able to verify specs with floats!
I'm not sure how a "spec with floats" differs from a spec with networks, RAM, 64-bit integers, multi-level cache, or any computing concept, none of which exists as a primitive in mathematics. A floating point number is a pair of integers, or sometimes we think about it as a real number plus some error, and TLAPS can check theorems about specifications that describe floating-point operations.
Of course, things can become more involved if you want to account for overflow, but overflow can get complicated even with integers.
You say no tools but you can "verify floats" with TLAPS. I don't think that RAM or 64-bit integers have facsimiles in TLA+. They can be described mathematically in TLA+ to whatever level of detail you're interested in (e.g. you have to be pretty detailed when describing RAM when specifying a GC, and even more when specifying a CPU's memory-access subsystem), but so can floating point numbers. The least detailed description - say, RAM is just data - is not all that different from representing floats as reals (but that also requires TLAPS for verification).
The complications in describing machine-representable numbers also apply to integers, but these complications can be important, and the level of detail matters just as it matters when representing RAM or any other computing concept. Unlike, say, strings, there is no single "natural" mathematical representation of floating point numbers, just as there isn't one for software integers (integers work differently in C, Java, JS, and Zig; in some situations you may wish to ignore these differences, in others - not). You may want to think about floating point numbers as a real + error, or you may want to think about them as a mantissa-exponent pair, perhaps with overflow or perhaps without. The "right" representation of a floating point number highly depends on the properties you wish to examine, just like any other computing construct. These complications are essential, and they exist, pretty much in the same form, in other languages for formal mathematics.
P.S. for the case of computing a mean, I would use Real rather than try to model floating point, unless the idiosyncrasies of a particular FP implementation were important. That means you can't use TLC. In some situations it could suffice to represent the mean as any number (even an integer) that is ≥ min and ≤ max, but TLC isn't very effective even for algorithms involving non-tiny sets of integers when there's "interesting" arithmetic involved.
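To make that concrete, here is a rough sketch of the two levels of detail mentioned above, written as ordinary set-theoretic definitions of the kind TLA+ allows (my own formulation, not from any standard module):

```latex
% Finer abstraction: a float as a mantissa-exponent pair of integers,
% interpreted as m * 2^e (ignoring rounding modes, NaN, overflow, ...).
Float \triangleq Int \times Int
Val(\langle m, e \rangle) \triangleq m \cdot 2^{e}

% Coarse abstraction for the mean: any value bounded by the data's min and max.
mean \in \{ r \in Real : Min(S) \le r \le Max(S) \}
```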
I don't know the state of contemporary model checkers that work with theories of reals and/or FP, and I'm sure you're much more familiar with that than me, but I believe that when it comes to numeric computation, deductive proofs or "sampling tests" (such as property-based testing) are still more common than model checking. It could be interesting to add a random sampling mode to TLC that could simulate many operations on reals using BigDecimal internally.
> TLA+ can specify anything that could be specified in mathematics.
You are talking about the logic of TLA+, that is, its mathematical definition. No tool for TLA+ can handle all of mathematics at the moment. The language was designed for specifying systems, not all of mathematics.
There are excellent probabilistic extensions to temporal logic out there that are very useful to uncover subtle performance bugs in protocol specifications, see e.g. what PRISM [1] and Storm [2] implement. That is not within the scope of TLA+.
Formal methods are really broad, ranging from lightweight type systems to theorem proving. Some techniques are fantastic for one type of problem but fail at others. This is quite natural, the same thing happens with different programming paradigms.
For example, what is adequate for a hard real-time system (timed automata) is useless for a typical CRUD application.
> TLA+ is not a silver bullet, and like all temporal logic, has constraints.
>
> You really have to be able to reduce your models to: “at some point in the future, this will happen," or "it will always be true from now on”
I think people get confused by the word "temporal" in the name of TLA+. Yes, it has temporal operators. If you threw them away, TLA+ (minus the temporal operators) would still be extremely useful for specifying the behavior of concurrent and distributed systems. I have been using TLA+ for writing specifications of distributed algorithms (e.g., distributed consensus) and checking them for about 6 years now. The question of liveness comes last, and even then, the standard temporal logics are barely suitable for expressing liveness under partial synchrony. The value of temporal properties in TLA+ is overrated.
The "temporal" in TLA+ isn't about □ and ◇. It's about ' and the abstraction-refinement relation with stuttering at its core (contrasted with ○ and its non-stuttering meaning). Of course, you can't really specify anything in TLA+ without □ (unless you rely on TLC, which inserts the □ for you).
You cannot specify much in TLA+ without ' and □, and the "temporal" part of TLA+ - i.e. the TLA logic - is essential; but saying it is like "all temporal logics" ignores the abstraction-refinement relation, which is the heart of TLA+ (that is what ⇒, plain implication, means in TLA+) and which other temporal logics miss.
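For anyone following along, the canonical shape of a TLA+ spec and the refinement-as-implication idea above look roughly like this (standard textbook form, not tied to any particular spec):

```latex
% A specification: initial condition, then always take a Next step
% or stutter (leave the variables unchanged).
Spec \triangleq Init \land \Box [Next]_{vars}

% Refinement: "Impl implements Spec" is just implication between the formulas.
Impl \Rightarrow Spec
```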
Of course, you could hypothetically use the + part of TLA+, the formalised set theory, to specify everything, but that would be very inconvenient.
> Once the code for all destinations lived in a single repo, they could be merged into a single service. With every destination living in one service, our developer productivity substantially improved. We no longer had to deploy 140+ services for a change to one of the shared libraries. One engineer can deploy the service in a matter of minutes.
This is the problem with the undefined nature of the term `microservices`. In my experience, if you can't develop in a way that allows you to deploy all services independently and without coordination between services, the pattern may not be a good fit for your org's needs.
In the parent's SOA (v2), what they described is a well-known anti-pattern: [0]
* Application Silos to SOA Silos: Doing SOA right is not just about technology. It also requires optimal cross-team communications.
* Web Service Sprawl: Create services only where and when they are needed. Target areas of greatest ROI, and avoid the service sprawl headache.
If you cannot, for technical or political reasons, retain the ability to deploy a service independently (no matter whether you actually choose to deploy independently), you will not gain most of the advantages that were the original selling point of microservices, which had more to do with organizational scaling than with technical concerns.
There are other reasons to consider the pattern, especially due to the tooling available, but it is simply not a silver bullet.
And yes, I get that not everyone is going to accept Chris Richardson's definitions[1], but even in more modern versions of this, people always seem to run into the most problems because they try to shove it in a place where the pattern isn't appropriate, or isn't possible.
But kudos to Twilio for doing what every team should be doing: reassessing whether their previous decisions are still valid and moving forward with new choices when they aren't.
I would caution that microservices should be architected with technical concerns first; being able to deploy independently is a valid technical concern too.
Doing it for organizational scaling can lead to insular vision with turf defensive attitude, as teams are rewarded on the individual service’s performance and not the complete product’s performance. Also refactoring services now means organizational refactoring, so the friction to refactor is massively increased.
I agree that patterns should be used where most appropriate, instead of blindly.
What pains me is that a term like “Cloud-Native” has been usurped to mean microservices. Did Twilio just stop having a “Cloud-Native” product due to shipping a monolith? According to the CNCF, yes. According to reason, no.
IMHO there are some serious problems here: the approach won't apply to many situations, what it measures is not really "waste" in the way claimed, and it will probably result in greater spend.
> Memory waste: request - actual usage [0]
Memory "requests" are hints to the kube-scheduler for placement, not a target for expected usage.
Memory limits are for enforcement, typically deciding when to invoke the OOM killer.
Neither placement nor OOM-kill limits should have anything to do with normal operating parameters.
> The memory request is mainly used during (Kubernetes) Pod scheduling. On a node that uses cgroups v2, the container runtime might use the memory request as a hint to set memory.min and memory.low [2]
By choosing to label the delta between these two as "waste", you will absolutely suffer from Goodhart's law, and you will teach your dev team to not just request but actually allocate memory and never free it, so that they can fit inside this invalid metric's assumptions.
It is going to work against the more reasonable goals of having developers set their limits as low as possible without negative effects, while also protecting the node and pod from memory leaks, while still gaining the advantages of over-provisioning, which is where the big gains are to be made.
You are technically right that requests are scheduling hints, but in a cluster-autoscaler world, requests = bill.
If I request 8GB for a pod that uses 1GB, the autoscaler spins up nodes to accommodate that 8GB reservation. That 7GB gap is capacity the company is paying for but cannot use for other workloads.
Valid point on Goodhart's Law, though the goal shouldn't be to fill the RAM, but rather to lower the request to match the working set so we can bin-pack tighter.
This script does nothing to solve that and actually exacerbates the problem of people over-speccing.
It makes assumptions about pricing [0], and if you do need a peak of 8GB it effectively forces you into launching and consuming that 8GB immediately, because it is just reading a current snapshot from /proc/$pid/status:VmSize [1] and saying you are wasting memory based on "request - actual usage (MiB)" [2].
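In other words, the core of the metric amounts to something like the following (a hypothetical reconstruction for illustration, not the actual script; the field parsing and unit handling are my assumptions). It reads one instantaneous value and compares it to the request, so any transient peak is invisible to it:

```python
# Hypothetical sketch of the snapshot-based "waste" metric being criticized.
def vm_size_mib(pid: int) -> int:
    """Current virtual memory size of a process, from a single /proc snapshot."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmSize:"):
                return int(line.split()[1]) // 1024  # /proc reports the value in kB
    raise ValueError("VmSize not found")

def memory_waste_mib(request_mib: int, pid: int) -> int:
    # "request - actual usage (MiB)" from one point in time: a process that
    # needs its full request once a day but is idle right now shows up here
    # as almost entirely "waste".
    return request_mib - vm_size_mib(pid)
```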
What if once a week you need to reconcile and need that 8GB? What if you only need 8GB once every 10 seconds? Their script won't see that; so to be defensive you can't release that memory, or you will be 'wasting' resources despite that peak need.
What if your program only uses 1GB, but you are working on a lot of Parquet files, and with less RAM you start to hit EBS IOPS limits or don't finish the nightly DW run because you have to hit disk instead of working from the buffer with headroom, etc.?
This is how bad metrics wreck corporate cultures; the ones in this case encourage overspending. If I use all that RAM I will never hit the "top_offender" list [3], even if I cause 100 extra nodes to be launched.
Without context and far more sophisticated analytics, "request - actual usage (MiB)" is meaningless and trivial to game.
What incentive does this metric create, other than making sure that your pod's request ~= RES 24x7x365 ~= OOM-kill limit/2, just to avoid being on the "top_offender" list?
Once your skip's-skip's-skip sees that some consultant labeled you a "top_offender", despite your transient memory needs, how do you work that through? How do you "prove" your case against a team gaming the metric?
Also, as a developer you don't have control over the cluster's placement decisions, nor typically over the choice of machine types. So blaming the platform user for the platform team's inappropriate choice of instance types shuts down many chances for collaboration and typically isn't a very productive path.
Minimizing cloud spend is a very challenging problem, which typically depends on collaboration more than anything else.
The point is that these scripts are not providing a valid metric, and are absolutely presenting that metric in a hostile way. They could be changed to help a discovery process, but they absolutely will not in their current form.
Really fair critique regarding the snapshot approach. You're right that optimizing limits based on a single point in time is dangerous for bursty workloads (the "need 8GB for 10 seconds" scenario).
The intent of this script isn't to replace long-term metric analysis like Prometheus/Datadog trends, but to act as a smoke test for gross over-provisioning, e.g. the dev who requested 16GB for a sidecar that has flatlined at 100MB for weeks.
You make a great point about the hostile framing of the word waste. I definitely don't want to encourage OOM risks. I'll update the readme to clarify that this delta represents potential capacity to investigate rather than guaranteed waste.
Appreciate the detailed breakdown on the safety buffer nuances.
As we are talking about microservices, K8s has two patterns that are useful.
The first is a global namespace root with sub-namespaces holding just the desired config and the current config, with the complexity hidden in the controller.
The second is closer to your issue above, but it is just dependency inversion: the kubelet has zero knowledge of how to launch a container, create a network, or provision storage, and hands that off to CRI, CNI, or CSI.
Those are hard dependencies that can follow a simple wants/provides model, and depending on context this is often simpler when failures happen and allows for replacement.
E.g. you probably wouldn't notice whether crun or runc is being used, nor would you notice that it is often systemd that is actually launching the container.
But finding those separation of concerns can be challenging. And K8s only moved to that model after suffering from the pain of having them in tree.
I think a DAG is a better aspirational default though.
I am running a container on an old 7700K with a 1080 Ti that gives me VSCode completions with RAG, with similar latency and enough accuracy to be useful for boilerplate etc…
That is something I would possibly pay for, but as the failures on complex tasks are so expensive, this seems to be the major use case, and it will just become a commodity.
Creating the scaffolding for a JWT token or other similar tasks will be a race to the bottom IMHO, although valuable and tractable.
IMHO they are going to have to find ways to build a moat, and what these tools are really bad at is the problem domains that make your code valuable.
Basically anything that can be vibe coded can be trivially duplicated and the big companies will just kill off the small guys who are required to pay the bills.
Something like surveillance capitalism will need to be found to generate the revenue needed for the scale of Microsoft etc…
Given how every CPU vendor seems to be pushing for some kind of NPU, locally running models will probably be far more common in the next 5 years. And convincing everyone to pay a subscription for very minimal improvements in functionality is going to be hard.
Have you documented your VSCode setup somewhere? I've been looking to implement something like that. Does your setup provide next edit suggestions too?
I keep idly wondering what would be the market for a plug and play LLM runner. Some toaster sized box with the capability to run exclusively offline/local. Plug it into your network, give your primary machine the IP, and away you go.
Of course, the market segment who would be most interested, probably has the expertise and funds to setup something with better horsepower than could be offered in a one size fits all solution.
The problem is that math is not some universal language; it is a broad field with various subdomains, each with their own conventions, assumptions, and needs.
Polysemy vs Homonymy vs Context Dependency will always be a problem.
There are lots of areas to improve, but one of the reasons learning math is hard is that in the elementary forms we pretend that there is a singular ubiquitous language, only to change it later.
That is why books that try to be rigorous tend to dedicate so much room at the start to definitions.
Abstract algebra is what finally helped it click for me, but it is rare for people to be exposed to it.
To add to the other responses, cloud-init is also not a secure delivery mechanism; you shouldn't be using it to deliver sensitive information, but you could use it to kick off scripts that access more appropriate mechanisms, like in AWS where you can access a local URL to obtain instance credentials.
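For example, on AWS the instance metadata service exposes the attached role's temporary credentials at a link-local URL; a minimal sketch of the IMDSv2 flow (error handling and caching omitted, not production code):

```python
import json
import urllib.request

IMDS = "http://169.254.169.254/latest"

def imds_token(ttl: int = 21600) -> str:
    # IMDSv2: fetch a session token first, then send it on every request.
    req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl)},
    )
    return urllib.request.urlopen(req).read().decode()

def instance_credentials() -> dict:
    headers = {"X-aws-ec2-metadata-token": imds_token()}
    base = f"{IMDS}/meta-data/iam/security-credentials/"
    # First request returns the attached role name, second the temporary keys.
    role = urllib.request.urlopen(
        urllib.request.Request(base, headers=headers)
    ).read().decode().strip()
    creds = urllib.request.urlopen(
        urllib.request.Request(base + role, headers=headers)
    ).read().decode()
    return json.loads(creds)  # AccessKeyId, SecretAccessKey, Token, Expiration
```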
The tooling for that exists today in Linux, and it is fairly easy to use with podman etc.
K8s' choices cloud that a little, but for VSCode completions, as an example, I have a pod that systemd launches on request.
I have nginx receive the socket from systemd, and it communicates with llama.cpp through a socket on a shared volume. As nginx inherits the socket from systemd, it doesn't have internet access either.
If I need a new model I just download it to a shared volume.
llama.cpp has no internet access at all, and it is usable on an old 7700K + 1080 Ti.
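In case it helps anyone reproduce the socket-handoff part: systemd socket activation passes listening sockets to the service via the LISTEN_FDS convention, with the first fd at 3. A minimal sketch of the receiving side in Python (nginx obviously does this natively in C; the example service is just illustrative):

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # systemd passes activated sockets starting at fd 3

def sockets_from_systemd() -> list:
    """Adopt listening sockets handed over by systemd socket activation."""
    # systemd sets LISTEN_PID to the service's pid and LISTEN_FDS to the count.
    if os.environ.get("LISTEN_PID") != str(os.getpid()):
        return []  # not socket-activated; caller should bind a socket itself
    count = int(os.environ.get("LISTEN_FDS", "0"))
    return [socket.socket(fileno=SD_LISTEN_FDS_START + i) for i in range(count)]

if __name__ == "__main__":
    for listener in sockets_from_systemd():
        conn, _addr = listener.accept()  # serve one connection and exit
        conn.sendall(b"hello from a socket-activated service\n")
        conn.close()
```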
People thinking that the k8s concept of a pod, with shared UTS, net, and IPC namespaces, is all a pod can be confuses the issue.
The same unshare() call that runc uses is very similar to how clone() can drop the parent's IPC namespace, etc…
I should probably spin up a blog on how to do this as I think it is the way forward even for long lived services.
The information is out there but scattered.
If it is something people would find useful please leave a comment.
Plan9 had this by default in 1995, no third party tools required. You launch a program, it gets its own namespace, by default it is a child namespace of whatever namespace launched the program.
I should not have to read anything to have this. Operating systems should provide it by default. That is my point. We have settled for shitty operating systems because it’s easier (at first glance) to add stuff on top than it is to have an OS provide these things. It turns out this isn’t easier, and we’re just piling shit on top of shit because it seems like the easiest path forward.
Look how many lines of code are in Plan9 then look at how many lines of code are in Docker or Kubernetes. It is probably easier to write operating systems with features you desire than it is to write an application-level operating system like Kubernetes which provide those features on top of the operating system. And that is likely due to application-scope operating systems like Kubernetes needing to comply with the existing reality of the operating system they are running on, while an actual operating system which runs on hardware gets to define the reality that it provides to applications which run atop it.
You seem to have a misunderstanding of what namespaces accomplished on Plan9, and of the fact that it was extending Unix concepts and assembling them in another way.
As someone who actually ran Plan9 over 30 years ago, I assure you that if you go back and look at it, the namespaces were intended to abstract away the hardware limitations of the time, to build distributed execution contexts out of a large assembly of limited resources.
And if you have an issue with Unix sockets you would have hated it, as it didn't even have sockets and everything was about files.
Today we have a different problem, where machines are so large that we have to abstract them into smaller chunks.
Plan9 was exactly the opposite: when your local CPU was limited, you would run the cpu command and use another host, and guess what, it handed your file descriptors to that other machine.
The goals of Plan9 were dramatically different from isolation.
But the OSes you seem to hate so much implemented many of the plan9 ideas, like /proc, union file systems, message passing etc.
Also note I am not talking about k8s in the above, I am talking about containers and namespaces.
K8s is an orchestrator; the kernel functionality may be abstracted by it, but K8s is just a user of those Plan9-inspired ideas.
Netns, pidns, etc… can be used directly, and you can call unshare(2) [0] directly, use a runtime like crun, or use podman.
Heck, you could call the ip(8) command and run your app in an isolated network namespace with a single command if you wanted to.
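As a concrete illustration of how little is needed, here is a minimal sketch that calls unshare(2) directly via ctypes to drop a process into an empty network namespace (requires root/CAP_SYS_ADMIN; the `ip addr` exec at the end is just to show the result):

```python
import ctypes
import os

CLONE_NEWNET = 0x40000000  # from <sched.h>: create a new network namespace

libc = ctypes.CDLL(None, use_errno=True)

# Detach the current process into a fresh, empty network namespace.
if libc.unshare(CLONE_NEWNET) != 0:
    err = ctypes.get_errno()
    raise OSError(err, os.strerror(err))

# The process now sees only a down loopback interface: no routes, no internet.
os.execvp("ip", ["ip", "addr"])
```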
Kubernetes is an operating system on top of an operating system. Its complexity is insane.
The base OS should be providing a lot/all of these features by default.
Plan9 is as you describe out of the box, but what I want is what Plan9 might be if it were designed today, and it could be that with a little work. Isolation would not be terribly difficult to add to it. The default namespace a process gets could limit it to its own configuration directory, its own data directory, and standard in and out. And imagine every instance of that application getting its own distinct copy of that namespace, with none of them able to talk to each other or scan any disk. They would only do work sent to them via stdin, as dictated in the srv configuration for that software.
Everything doesn’t HAVE to be a file, but that is a very elegant abstraction when it all works.
> call the ip() command and run your app in an isolated namespace with a single command if you wanted to.
I should not have to opt in to that. Processes should be isolated by default. Their view of the computer should be heavily restricted; look at all these goofy NPM packages running malware, capturing credentials stored on disk. Why can an NPM package see any of that stuff by default? Why can it see anything else on disk at all? Why is everything wide fucking open all the time?
Because containers on Linux will never be able to provide this. They are fundamentally insecure from the kernel layer up, and adding another OS stack on top (k8s) will never address the underlying mess that Linux containers fundamentally are.
The fact that tools like docker, podman and bubblewrap exist and work points out that the OS supports it, but using the OS APIs directly sucks. Otherwise the only "safe" implementations of such features would need a full software VM.
If using software securely were really a priority, everyone would be rustifying everything, and running everything on separate physical machines with restrictive AppArmor, SELinux, TOMOYO and Landlock profiles, with mTLS everywhere.
It turns out that in Security, "availability" is a very important requirement, and "can't run your insecure-by-design system" is a failing grade.
> The fact that tools like docker, podman and bubblewrap exist and work points out that the OS supports it
Only via virtualization in the case of MacOS. Somehow, even windows has native container support these days.
A much more secure system can be made I assure you. Availability is important, but an NPM package being able to scan every attached disk in its post-installation script and capture any clear text credentials it finds is crossing the line. This isn’t going to stop with NPM, either.
One can have availability and sensible isolation by default. Why we haven’t chosen to do this is beyond me. How many people need to get ransomwared because the OS lets some crappy piece of junk encrypt files it should not even be able to see without prompting the user?
This sounds very interesting to me. I'd read through that blog post, as I'm working on expanding my K8s skills - as you say knowledge is very scattered!