It lets you do stuff so much faster than typing everything manually into the terminal. I also really enjoy the "Undo Last Commit" feature and how I can easily see all modified files at once and shuffle stuff in and out of the staging area.
This looks cherry-picked. For example, Claude Opus had a higher score on SWE-Bench Verified, so they conveniently left it out. Also, GDPval is literally a benchmark made by OpenAI.
And who believes that the difference between 91.9% and 92.4% is significant in these benchmarks? Clearly these have margins of error that are swept under the rug.
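To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the benchmark has 500 tasks (a hypothetical size, `n_tasks` below, since the actual task count isn't given here) and treats each task as an independent pass/fail trial, which is a simplification because both models are usually scored on the same tasks.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Binomial standard error of a pass rate p measured on n tasks."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical benchmark size: an assumption for illustration only.
n_tasks = 500

# The two scores being compared in the comment above.
score_a, score_b = 0.919, 0.924

# Approximate 95% margin of error on a single score (normal approximation).
half_width_a = 1.96 * standard_error(score_a, n_tasks)

# Standard error of the difference, treating the two runs as independent.
se_diff = math.sqrt(
    standard_error(score_a, n_tasks) ** 2 + standard_error(score_b, n_tasks) ** 2
)
z = (score_b - score_a) / se_diff

print(f"95% margin of error on one score: +/- {half_width_a * 100:.1f} points")
print(f"z-score of the difference: {z:.2f}")
```

Under those assumptions the margin of error on a single score comes out to about 2.4 points, several times larger than the 0.5 point gap, and the z-score is around 0.3, nowhere near the usual 1.96 threshold. A larger benchmark would shrink the margin, so the conclusion depends on the real task count.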
But yeah, exercise can be an addiction, as can sex. Doing these daily is fine. It turns into an addiction when you can't stop or when it interferes negatively with your routine.
With the speed of how pricing information propagates, this seems way too dependent on how the agent is built, what information it has access to, and the feedback loop between the LLM and actions it can carry out
> To save the situation, a consortium of US banks provided a $1.1 billion line of credit to the brothers which allowed them to pay Bache which, in turn, survived the ordeal.
It seems that once you amass a certain amount of wealth, you just get automatically bailed out of your mistakes.
Seems that in the end it didn't work out well for them:
> The Hunts lost over a billion dollars through this incident, but the family fortunes survived. They pledged most of their assets, including their stake in Placid Oil, as collateral for the rescue loan package they obtained. However, the value of their assets (mainly holdings in oil, sugar, and real estate) declined steadily during the 1980s, and their estimated net wealth declined from $5 billion in 1980 to less than $1 billion in 1988.
The article goes on to say that they had $5B net wealth around the time of the incident. It's not that unreasonable to get a loan of roughly 20% of your wealth in a hurry, especially if said loan immediately benefits the lenders.
The American series (House and Elementary) have the advantage of more seasons and episodes, which I think is sometimes required to showcase the challenges of drug addiction and mental health.
The fact that these characters have to come back and face the same problem over and over, episode after episode, is truer to the nature of mental health problems.
BBC Sherlock has too few episodes to bring the audience along on a prolonged struggle with mental health.
It's definitely one of those pieces of media that is extremely frustrating because of how close it came to being an all-time great adaptation. Moffat needed somebody in the writers' room with him to tell him 'no' sometimes.