AI bots (or clients claiming to be one) appear on new sites quite fast; at least that's what I saw recently in a few places. They probably monitor Certificate Transparency logs, so you won't hide just by never linking to the site. Unless you are ok with staying in the shadow of plain HTTP, that is.
Okay, but then what? Host your sites on something other than 'www' or '*', exclude them from search engines, and never link to them? Then, for the few people who do resolve these subdomains, you just have to hope they aren't using a DNS server owned by a company with an AI product (like Google, Microsoft, or Amazon)?
I really don't know how you're supposed to shield your content from AI without also shielding it from humanity.
The biggest problem I have seen with AI scraping is that they blindly try every possible combination of URLs once they find your site and blast it 100 times per second for each page they can find.
They don’t respect robots.txt, they don’t care about your sitemap, they don’t bother caching; they just mindlessly churn away, effectively a DDoS.
Google at least played nice.
And so that is why things like Anubis exist, and why people flock to Cloudflare and all the other tried-and-true methods to block bots.
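One cheap server-side defense short of those tools is to flag clients whose traffic looks like blind URL enumeration: a flood of requests that are mostly 404s. A minimal sketch in Python; all class names and thresholds here are made up for illustration, not taken from any real WAF:

```python
import time
from collections import defaultdict, deque

# Illustrative thresholds -- real values would need tuning per site.
WINDOW = 10.0        # seconds of history to keep per client
MAX_REQS = 100       # more requests than this in the window is suspicious
MAX_404_RATIO = 0.5  # mostly-404 traffic suggests blind URL guessing

class BotDetector:
    """Hypothetical per-IP sliding-window counter for scraper-like traffic."""

    def __init__(self):
        self.hits = defaultdict(deque)  # ip -> deque of (timestamp, status)

    def record(self, ip, status, now=None):
        """Record one request; return True if the client looks like a scraper."""
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        q.append((now, status))
        # Drop entries older than the window.
        while q and now - q[0][0] > WINDOW:
            q.popleft()
        if len(q) <= MAX_REQS:
            return False
        ratio_404 = sum(1 for _, s in q if s == 404) / len(q)
        return ratio_404 > MAX_404_RATIO
```

You would call `record()` from your request-handling path and return a 429 (or a proof-of-work challenge) for flagged clients. It won't stop a distributed crawler, but it catches the single-source "try every URL" pattern described above.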
I don't see how that is possible. A web site is a disconnected graph with a lot of components. If they get hold of a URL, maybe that gets them to a few other pages, but not all of them. Most of the pages on my personal site are .txt files with no outbound links, for that matter. Nothing to navigate.
> However, they absolutely also lower the barrier to entry and dethrone “pure single tech” (ie backend only, frontend only, “I don’t know Kubernetes”, or other limited scope) software engineers who’ve previously benefited from super specialized knowledge guarding their place in the business.
This argument gets repeated frequently, but to me it seems to be missing a final, actionable conclusion.
If one "doesn't know Kubernetes", what exactly are they supposed to do now, with an LLM at hand, in a professional setting? They still "can't" assess the quality of the output, after all. They can't just ask the model, since they can't know whether the answer is misleading.
Assuming we are not expecting people to operate with implicit delegation of responsibility to the LLM (something that is ultimately not possible anyway - taking blame is a privilege humans will keep for the foreseeable future), I guess the argument in the form above collapses to "it's easier to learn new things now"?
But this does not eliminate (or even reduce) the need for specialized knowledge on the employee side, and there is only so much one can specialize in.
The bottleneck may have shifted right somewhat (from the time/effort of the learning stage to the cognition and memory limits of an individual), but one could argue the output on the other side of the funnel (of learn->understand->operate->take-responsibility-for) didn't necessarily widen that much.
> If one "doesn't know Kubernetes", what exactly are they supposed to do now, with an LLM at hand, in a professional setting? They still "can't" assess the quality of the output, after all. They can't just ask the model, since they can't know whether the answer is misleading.
This is the fundamental problem that all these cowboy devs do not even consider. They talk about churning out huge amounts of code as if it were an intrinsically good thing. It reminds me of those awful VB6 desktop apps people kept churning out. VB6 sure made tons of people Nx more productive, but it also led to loads of legacy systems that no one wanted to touch because they were built by people who didn't know what they were doing. LLMs-for-code are another tool in the same category.
> They still "can't" assess the quality of the output, after all. They can't just ask the model, since they can't know whether the answer is misleading.
Wasn't this a problem before AI? If I took a book or online tutorial and followed it, could I be sure it was teaching me the right thing? I would need to make sure I understood it, that it made sense, that it worked when I changed things around, and I would need to combine multiple sources. That still needs to be done. You can ask the model, and you'll have to judge the answer, same as if you asked another human. You have to make sure you are in a realm where you are learning, but aren't so far out that you can easily be misled. You do need to test out explanations and seek multiple sources, of which AI is only one.
An AI can hallucinate and just make things up, but the chance that different sessions with different AIs produce the same hallucinations, consistently building upon each other, is small enough not to be worth worrying about.
I don’t think the conclusion is right. Your org might still require enough React knowledge to keep you gainfully employed as a pure React dev, but if all you did was change some forms, that is now something pretty much anyone can do. The value of good FE architecture has increased, if anything, since you will be adding code quicker. Making sure the LLM doesn’t stupidly couple stuff together is quite important for long-term success.
If you don’t know k8s, or any tech really, you can RTFM, you can generate or apply some premade manifests, you can feed the errors into the LLM and ask about them, you can google the error message, you can do a lot of things. Oftentimes, in the “real world” of software engineering, you learn by having zero idea of how to do something to start with and gradually come up with ideas from screwing around with a particular tool or prototyping a solution and seeing how well it works.
I agree that some of the above basically amounts to: it’s easier to learn new things. Which itself might sound ho-hum, but it really is a fundamental responsibility of software engineers to learn new things, understand new and complex problems, and learn how to do it correctly and repeatably. LLMs unquestionably help with this, even with their tendency to hallucinate: usually proof by contradiction (or the failure of an over-confident chaos machine) is even better than just having a thing that spits out perfect solutions without needing the operator to understand it.
However, I will say that there is a very large gulf between learning how to reason about complex systems or code and learning how to use the entropy machine to produce nominally acceptable work. Pure reliance and delegation of responsibility to the AI will torpedo a lot of projects that a good engineer could solve, and no amount of lines of code makes up for a poorly conceived product or a brittle implementation that the LLM later stumbles over. Good engineering principles are more important than ever, and the developer has to force the LLM to conform to those.
There are many things to question about agentic coding: whether it’s truly cost/effort effective, whether it saves time, whether it makes you worse at problem solving by handing you facile half-solutions that wither in the face of the chaos of the real world, etc. But they clearly aren’t a technology which “doesn’t do ANYTHING useful”, as some HN posters claim.
It really depends on whether a coding agent is closer to a "compiler" or not. Very few amongst us verify assembly code. If the program runs and does the thing, we just assume it did the right thing.
But that moves the burden of maintenance from the provider of the service to its users (and/or partially to an intermediary in the form of a "skills registry" of sorts, which apparently is a thing now).
So maybe a hybrid approach would make more sense? Something like /.well-known/skills/README.md exposed and owned by the providers?
That is assuming that the whole idea of "skills" makes sense in practice.
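To make the hybrid idea concrete: a client would derive the discovery URL from the provider's own origin. Note that `/.well-known/skills/` is not a registered well-known URI; this sketch is purely hypothetical:

```python
from urllib.parse import urljoin

# Hypothetical: "/.well-known/skills/" is NOT a registered well-known URI.
# This only shows how a provider-owned discovery path could be derived.
def skills_readme_url(base_url: str) -> str:
    """Map any URL on a provider's site to its hypothetical skills README."""
    return urljoin(base_url, "/.well-known/skills/README.md")
```

For example, `skills_readme_url("https://example.com/docs/page")` yields `https://example.com/.well-known/skills/README.md`, so the provider keeps ownership of the file no matter which of their URLs a client starts from.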
Yeah that's true, skill distribution isn't a solved problem yet - MCPs have a URL, which is a great way of making them available for people to start using without extra steps.
> Anyone here use this testing in the wild? Where's it most useful? Do you have the issue I described? Is there an easy way to overcome it?
One example would be when you have a naive implementation of some algorithm and you want to introduce a second one, optimized but with a much more complex implementation. The naive one then acts as a model (an oracle) for comparison.
Another case that comes to mind is when you have rather simple properties to test (like: does it finish without crashing within a given time? does the output stay within some bounds?) and want to easily run over a sensible set of varying inputs.
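A self-contained sketch of the first case, using a naive O(n²) max-subarray as the model against an optimized O(n) version (the algorithm choice is just an example; a property-testing library like Hypothesis would generate the inputs for you):

```python
import random

def naive_max_subarray(xs):
    """O(n^2) reference model: max sum over all non-empty slices."""
    return max(sum(xs[i:j])
               for i in range(len(xs))
               for j in range(i + 1, len(xs) + 1))

def kadane_max_subarray(xs):
    """Optimized O(n) implementation under test (Kadane's algorithm)."""
    best = cur = xs[0]
    for x in xs[1:]:
        cur = max(x, cur + x)    # extend the run or start a new one
        best = max(best, cur)
    return best

def check_against_model(trials=500, seed=0):
    """Compare the optimized version against the naive model on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 30))]
        assert kadane_max_subarray(xs) == naive_max_subarray(xs), xs
```

The naive version is slow but obviously correct, which is exactly what makes it a usable model: any disagreement points at a bug in the fast implementation (or at an underspecified property).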
I have been for some time now, at a scale of around 20 hosts in their cloud offering. No restarts or network outages. I do see "migrations" from time to time (a VM migrating to different hardware, I presume), but without impact on metrics.
You can decouple the Postgres (and surrounding userspace) upgrade cycle from your host OS, if that is something you want. Or run multiple different PG versions (with independent upgrade schedules) without being tied to host-OS-specific mechanisms for that.