

I’ve been using Claude with mediocre results, so this time I used Gemini 3, because everyone in my company is screaming “this time it works, trust us bro”. Claude hasn’t been working so great for me at my day job either.




It’s certainly a use case that an LLM has a decent shot at.
Of course, having said that, I gave it a spin with Gemini 3 and it just hallucinated a bunch of crap that doesn’t exist instead of properly identifying capable libraries or frontending media tools…
But in principle, and on occasion, it can take care of little convenience utilities/functions like that. I continue to have no idea, though, why some people claim to be able to ‘vibe code’ anything of significance; even when I thought I was giving it an easy hit, it completely screwed it up…


So if it can be vibe coded, it’s pretty much certainly already a “thing”, but with some awkwardness.
Maybe what you need is a combination of two utilities, maybe the interface is very awkward for your use case, maybe you have to make a tiny compromise because it doesn’t quite match.
Maybe you want a little utility to do stuff with media. You could navigate your way through ffmpeg and mkvextract, which together handle what you want, with some scripting to keep you from having to remember the specific way to do things amid the myriad of stuff those utilities do. An LLM could probably knock that script out for you quickly without you having to delve too deeply into the projects’ documentation.
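As a hypothetical sketch of the kind of wrapper I mean (the function names and flag choices here are mine, not from any real project), the script mostly just remembers the invocations for you:

```python
def extract_audio_cmd(src: str, dst: str, codec: str = "copy") -> list[str]:
    """Build an ffmpeg command line that pulls the audio out of a video:
    -vn drops the video stream, -acodec picks the audio codec."""
    return ["ffmpeg", "-i", src, "-vn", "-acodec", codec, dst]


def extract_track_cmd(src: str, track_id: int, dst: str) -> list[str]:
    """Build an mkvextract command line for extracting one track by id."""
    return ["mkvextract", src, "tracks", f"{track_id}:{dst}"]


if __name__ == "__main__":
    # Print the command so it can be sanity-checked first; pass the list
    # to subprocess.run(cmd, check=True) to actually execute it.
    print(" ".join(extract_audio_cmd("talk.mkv", "talk.flac", codec="flac")))
```

Building the argument list separately from running it also makes the “did I remember the flags right” part trivially checkable.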
So, sure: in this group of only billionaires, someone climbed onto the roof of a pickup truck to take a picture, and that would be realistic? Trying to picture this scenario, it seems even more awkward than a drone shot.
And again, 7 of the most famous billionaires manage to casually run into each other in a parking lot with no one else around… Well except the guy climbing onto the roof of a pickup truck to take a picture…
There may or may not be artifacts; I’m not really interested enough to scrutinize, but I’m thinking a lack of artifacts won’t be a good gauge once someone can Photoshop the rest of the way and/or the models get better.


Yeah, but mispredicting that would hurt. The market can stay irrational longer than I can stay solvent, as they say.


Yeah, this one is going to hurt. I’m pretty sure my rather long career will be toast, as my company, and most of my network of opportunities, have bought so hard into the AI hype that I don’t know whether they’ll survive it going away.
Without the usual artifacts, there’s still plenty to go on.
The scene just doesn’t make sense. They are standing around in a circle in a parking garage next to a Cybertruck. They wouldn’t meet up there if they were meeting. Further, think about the camera situation: it looks like someone on a ladder, or a drone, took the picture from above. That’s a weird amount of effort for a pretty stupid shot.
But yeah, if the person has a better sense of plausible scenes, then it’s much harder to know whether it’s an AI image. On video we mostly have duration to go by, since AI video generation can’t hold it together for long (other weirdness happens too, of course, but broadly speaking we just can’t trust pictures or very short videos).
And someone would be taking a picture of them taking a selfie? They would be standing in a circle in a parking lot while someone gets above them to take a shot from above, either on a ladder or a drone? Forget the situation, the logistics of the pictures themselves just don’t make sense either.
Exactly what comments are you saying they didn’t read?


At work there are a lot of rituals where processes demand that people write long internal documents no one will read. Management will at least open one up, scroll through, and be happy to see such a long document with credible-looking diagrams, but never read it, maybe glancing at a sentence or two they don’t understand and nodding sagely.
LLM can generate such documents just fine.
Incidentally, an email went out to salespeople. It told them they didn’t need to know how to code or even have technical skills; they could just use Gemini 3 to code up whatever a client wants and then sell it to them. I can’t imagine the mind that thinks that would be a viable business strategy, even if it worked that well.


An LLM can generate code like an intern out over their skis. If you let it generate enough code, it will do some gnarly stuff.
Another facet is the nature of mistakes it makes. After years of reviewing human code, I have this tendency to take some things for granted, certain sorts of things a human would just obviously get right and I tend not to think about it. AI mistakes are frequently in areas my brain has learned to gloss over and take on faith that the developer probably didn’t screw that part up.
AI generally generates the same sort of code I hate to encounter when humans write it, and debugging it is a slog: lots of repeated code, not well factored. You would assume that if the same exact thing is done in many places, you’d have a common function with common behavior, but no, the AI repeated itself and didn’t always get consistent behavior out of identical requirements.
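To make that smell concrete, here is an invented example (the validation rules are made up for illustration): two inlined copies of “the same” check that have quietly drifted apart, versus one shared helper.

```python
# Two inlined copies of "the same" name check that have drifted apart --
# identical requirements, inconsistent behavior:
def create_user_v1(name):
    if not name or len(name) > 30:      # limit is 30 here...
        raise ValueError("bad name")
    return {"name": name}


def rename_user_v1(user, name):
    if not name or len(name) > 32:      # ...but 32 here
        raise ValueError("bad name")
    user["name"] = name
    return user


# The factored version: one helper, so one behavior everywhere.
def validate_name(name, limit=30):
    if not name or len(name) > limit:
        raise ValueError("bad name")
    return name


def create_user(name):
    return {"name": validate_name(name)}


def rename_user(user, name):
    user["name"] = validate_name(name)
    return user
```

A 31-character name is rejected by `create_user_v1` but accepted by `rename_user_v1`; the factored pair can’t disagree.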
His statement is perhaps an oversimplification, but I get it. Fixing code like that is sometimes more trouble than just doing it yourself from the start.
Now I can see the value in generating code in digestible pieces, discarding it when the LLM gets oddly verbose for a simple function, when it gets it wrong, or when you can tell by looking that you’d hate to debug that code. But the generation can be a huge mess, and if you did a large project exclusively through prompting, I could see the end result being hopeless. I’m frankly surprised he could even declare an initial “success”, but it was probably “tutorial-ware”, which is ripe fodder for the code generators.


So I don’t get it; I have mine up with a domain, without tailscale… The clients are quite happy wherever they are. I don’t even see that much “crawling” traffic on the domain; most of it just hits the server by IP and gets the static 401 page the “default” site is hard-coded to serve.
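For reference, a minimal sketch of that kind of catch-all in nginx syntax (the directives are standard nginx; the actual setup described above is assumed, not quoted):

```nginx
# Default site: anything arriving by bare IP or an unknown Host header
# lands here and gets the static 401.
server {
    listen 80 default_server;
    server_name _;
    return 401;
}

# The real site only answers to its hostname.
server {
    listen 80;
    server_name example.com;  # placeholder domain
    root /var/www/site;
}
```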
Hardware RAID limits your flexibility: if any part fails, you probably have to closely match it in the replacement.
Performance-wise, there’s not much to recommend it. Once upon a time the XOR calculations weighed on the CPU enough to matter, but CPUs far outpaced storage throughput and now it’s a rounding error. Controllers retained some edge through battery-backed RAM, but now you can use NVMe as a cache. In random access a controller can actually be a liability, since it collapses all the per-drive command queues into one.
The biggest advantage is that it simplifies booting from such storage, but that can be handled in other ways, so I wouldn’t worry about it.
While SAS is faster, the difference is moot if you have even a modest NVMe cache.
I don’t know that it’s especially more reliable; I would take new SATA over second-hand SAS any day.
Hardware RAID means everything is locked together: lose a controller and you have to find a compatible controller; lose a disk and you have to match the previous disk pretty closely. JBOD would be my strong recommendation for home usage, where you need flexibility in the event of failure.
Not true, sometimes it’s DNS.
I remember this sort of stuff from a long time ago. There were wifi drivers that were either Linux-native but closed source, or, horror of horrors, you had to resort to ndiswrapper…
Of course, the Ubuntu derivatives made this easy enough by just including it, but Fedora was much more purist about open source and so wouldn’t even tell you about rpm-fusion, let alone enable proprietary drivers for basic network access.
Now Fedora has edged a bit more practical and proactively lets users know how to add proprietary stuff, and the wifi industry takes Linux seriously, if not for desktop use then for all the embedded use cases they would be shut out of without good Linux support. Fedora is still fairly far on the ‘purist’ side (try to play a lot of media using only dnf-provided software and it will tend to break), but not as hard-line as it used to be.


The TLS-ALPN-01 challenge requires an HTTPS server that generates a self-signed certificate on demand in response to a specific request. So we have to shut down our usual traffic forwarder and let an ACME implementation control the port for a minute or so. It’s not a long downtime, but it’s irritatingly awkward, and it can disrupt traffic on a site with clients in every timezone, so there’s no universal ‘3 in the morning’; even then, our service is used as part of other clients’ ‘3 in the morning’ maintenance windows… Folks can generally tolerate a blip in the provider, but they don’t like that we generate a blip in their logs if they connect at just the wrong minute of the month…
As to why they don’t support going straight to 443, I don’t know. I know they designed TLS-ALPN-01 purely as a TLS extension to stay out of the services’ URL space; that had value for people who liked being able to handle validation entirely in TLS termination, which is frequently nothing but a reverse proxy and in principle has no business messing with the payload the way HTTP-01 requires. For nginx at least this is awkward, though, since nginx doesn’t support it.
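The “purely TLS” part is just ALPN: the validation handshake advertises the protocol id acme-tls/1 (defined in RFC 8737), so a terminator can route it without ever parsing HTTP. A minimal Python sketch of the server-side selection, with the actual certificate handling omitted:

```python
import ssl

# A server context that will negotiate the ACME validation protocol when
# a validator offers it, and fall back to normal HTTP/1.1 otherwise.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.set_alpn_protocols(["acme-tls/1", "http/1.1"])

# A real responder would also generate and load a fresh self-signed cert
# carrying the acmeIdentifier extension for the domain being validated;
# that needs a crypto library and is omitted from this sketch.
```

The routing decision happens entirely inside the handshake, which is the appeal for pure reverse proxies.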


Hey, it’s just spitting hard facts like Musk has the “potential to drink piss better than any human in history,”


Frankly, another choice virtually forced by the broader IT.
If the broader IT either provides or brokers a service, we are not allowed to independently spend money and must go through them.
Fine, they will broker commercial certificates, so just do that, right? Well, to renew a certificate we have to open a ticket, attach our CSR and a “business justification”, and our department incurs a hundred-dollar internal charge just for opening the ticket. Then it sits for a day or two until one of their techs gets to it. Then we’re likely to get feedback that, say, their policy changed to forbid EC keys and we must use RSA instead, or vice versa because someone changed their mind. They may email an unexpected manager for confirmation in accordance with some new review process they implemented. Then, eventually, a tech manually renews it with a provider and attaches the certificate to the ticket.
It’s pretty much a loophole that we can use Let’s Encrypt: they don’t charge, and technically the restrictions only kick in when purchasing is involved. There was a security guy raising hell that some of our sites used that “insecure” Let’s Encrypt, demanding the standards change to explicitly ban it, but the bureaucracy to do that was insurmountable, so we continue.
The type of problem is, in my experience, the biggest source of differing results.
Ask for something consistent with very well-trodden territory and it has a good shot. But go off the beaten path, where it really can’t credibly generate code, and it generates anyway, making up function names, file paths, REST URLs and attributes, and whatever else would sound good and consistent with the prompt, with no connection to real stuff.
It’s usually not that it does the wrong thing because it “misunderstood”; it usually produces very appropriate-looking code, consistent with the request, that has no link to reality, with no recognition of when it invented a nonexistent thing.
If it’s a fairly milquetoast web UI manipulating a SQL backend, it tends to chew through that more reasonably (though in various results I’ve tried it screwed up a fundamental security principle; once I saw it suggest weird custom certificate validation and disable the default validation while transmitting sensitive data, before ever meaningfully executing the custom validation).
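For concreteness, here is a representative reconstruction of that anti-pattern (I don’t have the model’s actual output, so this is what the mistake typically looks like in Python’s ssl module), next to the correct default:

```python
import ssl


# DANGEROUS: the kind of context that results from "add custom certificate
# validation" gone wrong. This turns off the default checks entirely --
# any certificate, from anyone, is accepted.
def insecure_context():
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE   # default validation is now off
    # ...the "custom validation" was supposed to run later, but sensitive
    # data is already flowing over an unverified channel.
    return ctx


# What it should be: leave the defaults alone.
def secure_context():
    return ssl.create_default_context()  # verifies chain and hostname
```

The insecure version still connects successfully to everything, which is exactly why this class of mistake sails through casual testing.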