

Sure just take the fun out of it why don’t you
Good news, they find a treatment regimen that, when applied to mice, causes them to have a health span several times longer than the average health span of a mouse.
Bad news, the treatment regimen when applied to humans causes them to have a health span several times longer than the average health span of a mouse.
Yeah it’s essentially letting China win by default…
The issue here is that we’re well into sharply exponential expenditure of resources for diminishing gains, with a lot of good theory predicting that the breakthroughs we have seen are about tapped out, and no good way to anticipate when a further breakthrough might happen; it could be real soon or another few decades off.
I anticipate a pullback of resources invested and a settling for some middle ground where the current state of the art is absolutely useful/good enough: mostly wrong but very quick when it’s right, with relatively acceptable consequences for the mistakes. Perhaps society will get used to the sorts of things it will fail at and reduce how much we try to make the LLMs play in that 70%-wrong sort of use case.
I see LLMs as replacing first-line support, maybe escalating to a human when actual stakes arise for a call (issuing a warranty replacement, a usage scenario that actually has serious consequences, a customer demanding human escalation after recognizing they are falling through the AI cracks without the AI figuring out to escalate). I expect to rarely ever see “stock photography” used again. I expect animation to employ AI at least for backgrounds like “generic forest that no one is going to actively look at, but it must be plausibly forest”. I expect it to augment software developers, but not to enable a generic manager to code up whatever he might imagine. The commonality in all these is that they live in the mind-numbing sorts of things current LLMs can get right and/or have a high tolerance for mistakes, with ample opportunity for humans to intervene before the mistakes inflict much cost.
Well, here’s me pinning my hopes on your interpretation. A few more moderate leaders in the world would be a gigantic relief after so many years of how things have been going. I mostly grew up the last time the world swung a bit more moderate and would be ecstatic to feel that way again.
I’ve found that as an ambient code completion facility it’s… interesting, but I don’t know if it’s useful or not…
So on average, it’s totally wrong about 80% of the time, 19% of the time the first line or two is useful (either correct or close enough to fix), and 1% of the time it seems to actually fill in a substantial portion in a roughly acceptable way.
It’s exceedingly frustrating and annoying, but I’m not sure I can call it a net loss in time.
So reviewing a proposed completion for relevance, where to cut it off, and what to edit adds time to my workflow. Let’s say that on average, for a given suggestion, I spend 5% more time deciding to trash it, use it, or amend it versus not having a suggestion to evaluate in the first place. If the 20% of cases where it’s useful are 500% faster, then I come out ahead overall, though I’m annoyed 80% of the time. My guess as to whether a suggestion is even worth looking at improves with context: if I’m filling in something pretty boilerplate (e.g. taking some variables and starting to write out argument parsing), it has a high chance of a substantial match. If I’m doing something even vaguely esoteric, I just ignore the suggestions popping up.
However, the 20% is still a problem, since I’m maybe too lazy and complacent: spending the 100 milliseconds glancing at one word that looks right in review will sometimes fail me compared to spending the 2-3 seconds typing that same word out by hand.
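As a back-of-envelope sketch of that arithmetic (the 5% review overhead, 20% useful rate, and roughly 5x speedup are the guesses from above, not measurements):

```python
# Rough model of whether code completion is a net win.
# All numbers are illustrative guesses from the discussion above, not measurements.

def net_time_ratio(useful_frac=0.20, review_overhead=0.05, speedup=5.0):
    """Time spent with completions, as a fraction of time without them.

    Every suggestion costs `review_overhead` extra evaluation time;
    the `useful_frac` of cases that pan out run `speedup` times faster.
    """
    wasted = (1 - useful_frac) * (1 + review_overhead)       # trashed suggestions
    helped = useful_frac * (1 / speedup + review_overhead)   # accepted/amended ones
    return wasted + helped

ratio = net_time_ratio()
# A ratio below 1.0 means a net time saving despite the 80% miss rate;
# with these guesses it comes out to about 0.89, i.e. roughly 11% faster overall.
```

Push `useful_frac` toward zero and the ratio climbs above 1.0, which is the intuition for why prompt-driven tasks with a worse hit rate can be a net loss even when completions are a net win.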
That 20% success rate, where I can fix up what’s useful and dispose of the rest, works for code completion, but prompt-driven tasks seem so much worse for me that it’s hard to imagine them being better than the trouble they bring.
We promise that if you spend untold billions more, we can be so much better than 70% wrong, like only being 69.9% wrong.
Is it the case that the “they” that want a more moderate leader are consistent with the “they” that actually get to make the call?
I think the US has sunk its own ship without any particular effort by BRICS, frankly.
They will require the requester to prove they control the standard HTTP(S) ports, which isn’t possible behind any NAT.
It won’t work for such users, but also wouldn’t enable any sort of false claims over a shared IP.
If you can get their servers to connect to that IP under your control, you’ve earned it
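For context, the proof in question is an ACME HTTP-01-style challenge (RFC 8555): the CA hands the requester a token, and validation only succeeds if fetching `/.well-known/acme-challenge/<token>` on port 80 of the address being validated returns the token joined to the account key’s thumbprint. A minimal sketch of the path and response body the CA expects (the token and thumbprint values below are made-up examples):

```python
# Sketch of the HTTP-01 key-authorization exchange an ACME CA uses to
# validate control of an address (per RFC 8555); behind a NAT you cannot
# answer this fetch on port 80 of the shared public IP, hence the comment above.

def challenge_path(token: str) -> str:
    """URL path the CA fetches on port 80 of the address being validated."""
    return f"/.well-known/acme-challenge/{token}"

def key_authorization(token: str, account_thumbprint: str) -> str:
    """Response body the server must return: the token dot the key thumbprint."""
    return f"{token}.{account_thumbprint}"

# Hypothetical example values, not real credentials:
path = challenge_path("evaGxfADs6pSRb2LAv9IZf17Dt3juxGJ-PCt92wr-oA")
body = key_authorization("evaGxfADs6pSRb2LAv9IZf17Dt3juxGJ-PCt92wr-oA",
                         "9jg46WB3rR_AHD-EBXdN7cBkH1WOu0tA3M9fm21mqTI")
```

Since the CA connects outward to the IP under validation, anyone else sharing that IP behind a NAT can’t be impersonated this way, which is the point of the comment above.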
Yeah but their violence in Ukraine dilutes NATO military attention, even if they aren’t that powerful a direct military ally.
I suspect the concern is that no one, including China itself, knows how strong they would be in a military conflict, since they haven’t been in an at-scale conflict in living memory, relying instead on economic power to great effect.
If they are really wanting to violently assert their view on Taiwan, they want global attention divided.
So your take is that because the US has misbehaved, then Russia should misbehave harder? Not that the nations should behave better in general…
If they marketed on the actual capability, customer executives wouldn’t be as eager to open their wallets. Get them thinking they can reduce headcount and they’ll fall over themselves. Tell them their staff will remain about the same but some facets of the job will be easier, and they are less likely to recognize the value.
Also codecs… even with the right repositories enabled, you’ll tend to install a media application that manages to be utterly incapable of actually processing most media.
They’ve made strides on this front but it’s still messed up.
Also, sometimes they are too aggressive on one front. Some of the applications you can install from their repository that have Python-based features are broken because they can’t handle Python 3.13. There’s some ability to install Python 3.12, but without much beyond the core, making it less useful.
The research I saw mentioning LLMs as being fairly good at chess had the caveat that they allowed up to 20 attempts, to cover for the model just making up invalid moves that merely sounded like legit moves.
I remember seeing that, and early on it seemed fairly reasonable then it started materializing pieces out of nowhere and convincing each other that they had already lost.
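That retry scheme is easy to picture: the harness keeps asking the model for a move and only accepts one that is legal in the current position, giving up after 20 tries. A sketch with stand-ins for the real pieces (`ask_model` and the `legal_moves` set are placeholders; a real harness would call the LLM and use a chess engine for legality):

```python
# Sketch of the "up to 20 attempts" harness described above, which papers
# over LLMs proposing plausible-sounding but illegal chess moves.

MAX_ATTEMPTS = 20

def first_legal_move(ask_model, legal_moves, history):
    """Query the model until it emits a legal move, or give up after MAX_ATTEMPTS."""
    for attempt in range(MAX_ATTEMPTS):
        move = ask_model(history, attempt)
        if move in legal_moves:   # a real harness would ask a chess engine here
            return move
    return None  # model never produced a legal move; count it as a forfeit

# Toy demonstration: a fake model guesses badly twice, then gets it right.
fake_replies = ["Qh9", "Nxe9", "e4"]
result = first_legal_move(
    ask_model=lambda hist, i: fake_replies[min(i, len(fake_replies) - 1)],
    legal_moves={"e4", "d4", "Nf3"},
    history=[],
)
# result is "e4", accepted on the third attempt
```

The “materializing pieces out of nowhere” failure mode is exactly what the legality check filters out; without it, the game state and the model’s idea of the game state drift apart.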
Because the business leaders are famously diligent about putting aside the marketing push and reading into the nuance of the research instead.
Yeah, they are frequently just parroting things like CVE notices as highlighted by a fairly stupid scanning tool.
The security ecosystem has long been diluted because no one wants to doubt a “security” person and be wrong, and over time that has created a pretty low bar for people to gain credibility as a security person.