
We had proper licenses for all PADS seats at my previous work, but the users always installed the cracked versions anyway because the software was unusable with the dongle.

I started such an alternative project just before GPT-3 was released. It was really promising (lots of neuroscience-inspired solutions, pretty different from Transformers), but I had to put it on hold because the investors I approached seemed like they would only invest in LLM stuff. Now, a few years later, I'm trying to approach investors again, only to find that they now want to invest in companies USING LLMs to create value and still don't seem interested in new foundational types of models... :/

I guess it makes sense; there is still tons of value to be created just by using the current LLMs for stuff, though maybe the low-hanging fruit has already been picked, who knows.

I heard John Carmack talk a lot about his alternative (also neuroscience-inspired) ideas and it sounded just like my project, the main difference being that he's able to self-fund :) I guess funding an "outsider" non-LLM AI project now requires finding someone like Carmack to get on board - I still don't think traditional investors are disappointed enough in LLMs yet to risk money on other types of projects..


  > I guess funding an "outsider" non-LLM AI project now requires finding someone like Carmack to get on board
And I think this is a big problem. Especially since these investments tend to be a lot cheaper than the existing ones. Hell, there's stuff in my PhD I tabled, and several models I made whose performance I'm confident I could have doubled with less than a million dollars' worth of compute. My methods could already compete while requiring less compute, so why not give them a chance to scale? I've seen this happen to hundreds of methods. If "scale is all you need", shouldn't the belief be that any of those methods would also scale?


Markets are very good at squeezing profit out of sometimes ludicrous ventures, but not good at all at funding foundational research.

It's the problem of having organised our economic life in this way, or rather, exclusively this way.


  > exclusively this way.
I think an important part is to recognize that fundamental research is extremely foundational. We often don't recognize the impacts because by the time we see them they have passed through other layers. Maybe in the same way that we forget about the ground existing and being the biggest contributor to Usain Bolt's speed. Can't run if there's no ground.

But to see the economic impact, I'll make the bet that a single mathematical work (technically two) had a greater economic impact than all technologies in the last 2 centuries: Calculus. I haven't run the calculations (seems like it'd be hard, and I'd definitely need calculus to do them), but I'd be willing to bet that every year Calculus results in a greater economic impact than FAANG, MANGO, or whatever the hot term is these days.

It seems almost silly to say this because it is so obviously influential. But things like this fade into the background the same way we almost never think about the ground beneath our feet.

I have to say this because we're living in a time where people argue we shouldn't build roads because cars are the things that get us places. But this is just framing, and poor framing at that. Frankly, part of it is that roads are typically built through public funds and cars through private. It's this way because a road creates much higher economic impact as a public utility than as a private one. Incidentally, this makes the argument not to build roads self-destructive. It's short-sighted. Just like actual roads, research has to be continuously carried out. The reality is more akin to those cartoon scenes where a character is laying down the railroad tracks just in front of the speeding train.[0] I guess if you're not Gromit laying down the tracks, it is easy to assume they just exist.

But unlike actual roads, research is relatively cheap. Sure, maybe a million mathematicians don't produce anything economically viable for a year, and maybe not for 20, but one will produce something worth trillions. And they do this at a mathematician's salary! You can hire research mathematicians at least 2 to 1 for a junior SWE, 10 to 1 for a staff SWE. It's just crazy to me that we're arguing we don't have the money for these kinds of things. I mean, just look at the impact of Tim Berners-Lee and his team. That alone offsets all costs for the foreseeable future. Yet somehow his net worth is in the low millions? I think we really should question this notion that wealth is strongly correlated with impact.

[0] Why does a 10hr version of this exist... https://www.youtube.com/watch?v=fwJHNw9jU_U


The issue with this is that if someone hacks one of the hosts, they now have access to the backups of all your other hosts. At least with borg and the standard setup; it would be cool if I were wrong though.


At least with restic that is not an issue. See my other comment here: https://news.ycombinator.com/item?id=44626515

Backups are append-only, and each host gets its own key; the keys can be individually revoked.

Edit: I have to correct myself. After further research, it seems that append-only != write-only. Thus you are correct in that a single host could possibly access/read data backed up by another host. I suppose it depends on use-case whether that is a problem.


It would be nice if one of the backup systems supported public key crypto for the bulk of the data, so that the keys used for recovering data would be different from the keys used for backing up. I know there is an open ticket for one of restic/borg, because I subscribed to it a few years ago and periodically get updates on it, but nobody has come up with a solution to it yet.
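Conceptually, sealed boxes from libsodium do exactly this split: the backup side holds only a public key and can add data, while restores require a private key kept elsewhere. A minimal sketch of the idea using PyNaCl (the key handling and data here are purely illustrative, not how restic or borg would actually integrate it):

  # Asymmetric backup encryption sketch (libsodium sealed boxes via PyNaCl).
  # The backup host holds ONLY the public key: it can encrypt new chunks,
  # but reading anything back requires the offline private key.
  from nacl.public import PrivateKey, SealedBox

  # Done once, offline; keep the private key off all backup hosts.
  recovery_key = PrivateKey.generate()
  backup_key = recovery_key.public_key

  # On the backup host: encrypt with the public key only.
  chunk = b"some file contents or a content-addressed chunk"
  sealed = SealedBox(backup_key).encrypt(chunk)

  # At restore time, with the private key:
  restored = SealedBox(recovery_key).decrypt(sealed)
  assert restored == chunk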


I can see why you would chime in to say that in your experience you don't get any value out of it, but to chime in to say that the millions of people who do are "inexperienced" is pretty offensive. In the hands of skilled developers these tools are a complete gamechanger.


> In the hands of skilled developers these tools are a complete gamechanger.

This is where both sides are basically just accusing the other of not getting it

The AI coders are saying "These tools are a gamechanger in the hands of skilled developers" implying if you aren't getting gamechanging results you aren't skilled

The non-AI coders are basically saying the same thing back to them. "You only think this is gamechanging because you aren't skilled enough to realize how bad they are"

Personally, I've tried to use LLMs for coding quite a bit and found them really lacking

If people are finding a lot of success with them, either I'm using them wrong and other people have figured out a better way, or their standards are way, way lower than mine, or maybe they wind up spending just as long fixing the broken code as it would take me to write it


Sorry for the late reply here, but let me expand a bit on what I meant. I have long and broad programming experience, but I'm also an entrepreneur with many projects that I switch between. I could have focused full-time on, say, React Native, and then I could churn out my monkey code all day long. But I don't spend full-time on RN, and then suddenly 12 months have passed where I didn't write any at all, so my specific knowledge of that domain is always a bit behind.

But something like o4-mini-high is a domain expert in all versions of React, Redux, RN etc. and knows every internal SDK change over the last 10 years (or goes out and reads the changelogs and code itself). Countless times I've had it port old code to new and it figures it out 100%. It formulates good, modern, canonical ways to solve stuff. It knows all the stupid tricks you have to do to get RN stuff to run well on Android and iOS that I would never be able to keep in my head unless I worked full-time on that. And it does the eye-wateringly boring styling code that nobody likes; you can even just upload a screenshot of another app or a sketch on paper and it will correctly output code for the style in a matter of seconds.

The end result is that I can, without investing full-time effort in keeping myself current, do a professional RN dual Android/iOS app development cycle, because I have the general skill to understand what to ask it and how to merge its output properly. This leaves me time to do other stuff and generally be more productive.

My guess is that many who gave up on the AI coding stuff tried the bad tools, like the default ChatGPT 4o-mini (or tried the tools available 2 years ago), and got a bad experience. There are light-years of difference between those and something like o4-mini-high.

TL;DR: use the correct model for the job, and it doesn't really need to be an argument - if it makes you more productive it's a good tool; if it doesn't, nobody is forcing you to use it. But I don't think you should imply that everybody who likes these tools is stupid.


Stop telling everyone the secret.


Are these “millions of people” in the room with us now?


Where are these millions, and where is their output? You're in an echo chamber, mate; there aren't millions of people using AI to do significant amounts of work.

Indie Hackers just did an article on 4 vibe-coded startups, and they all seem like a joke.

And they could only find 4!

I didn't look at them all, but the flight sim is spectacularly bad, the revenue numbers are obviously unsustainable, and it looks like something moderately motivated school children might have made for a school project in a week.


It's sort of a slow-motion avalanche. For example, the price of outsourcing app development is really coming down now, because it's one of the areas where the AI coding tools really excel. It's a lot of boilerplate-style code, pretty canonical stuff, no rocket science and, to be frank, not really that much that has to be clever. You can give the tools a screenshot of a UI and they gladly output correct styling code for it in seconds. It supercharges already skilled developers. And if you needed 6 in-house app developers before, now you only need 2. It isn't an immediate effect, but it is an effect that is slowly going to run through the business.

The question is, are companies going to use fewer people to do the same, or the same number of people and just create better products?

For prototyping new ideas this is also an invaluable turbocharge for startups who can't really afford to have hordes of developers trying out alternative solutions.


Why?

What does it add that you couldn't have written yourself faster if you're so skilled?


> if you're so skilled?

I think this is needlessly snarky and also presupposes something that wasn't said. No one said it can write something that the developer couldn't write (faster) themselves. Tab complete and refactoring tools in your IDE/editor don't do anything you can't write on your own but it's hard to argue that they don't increase productivity.

I have only used Cline for about a week, but honestly I find it useful as an auto-grepper in a (imo badly organized) codebase at work. Just asking it "Where does the check for X take place?" in a codebase with tons of inheritance and auto-constructor magic that I rarely touch, it does a pretty good job of showing me the flow of logic.


Why would you do that? A bus driver costs $10/h, and the bus costs several $100k even if it's NOT self-driving. The cost of the driver must be minuscule in comparison, not to mention that the self-driving bus and associated insurance will likely cost several more multiples of $100k...


> a bus driver costs $10/h

For an urban transport system bus driver (vs. say, someone driving a private minibus)? Not sure where that is, but it's closer to 1000 EUR/week here: a 39-hour week, 4 weeks of holidays, so >25 EUR/hour, plus benefits, employer taxes, and so on and so forth.

I doubt self-driving city buses will be a thing for a while, but it's not because the drivers are cheap; it's because the job's actually quite difficult.


It doesn't necessarily invalidate your overall point, but a bus driver would cost significantly more than $10/h. In Germany, the hourly rate is closer to double that, plus additional costs the employer must pay, such as taxes and a contribution towards healthcare. Plus the overhead of recruitment, training, HR, uniforms, sickness cover, etc...


And once you've bought it, a self-driving bus can operate 24/7, but to get 24/7 coverage with drivers you need at least 3 drivers per bus, probably more like 4 when accounting for illness and time off (a 168-hour week is more than four 40-hour shifts). You can cut that a bit by reducing operating hours without a major effect on service, but it's still going to be more than 1 driver per bus at least.


A bus driver costs $10/h (or whatever; an NYC bus driver averages $28/h) per hour of operation.

A transit bus lasts an average of 12 years, and even if you don't run it nights or weekends, that's still around 50,000 hours of driving.

That's $500k at $10/h, and if you run a bus 24/7 in NYC you could spend $3M over the course of the bus's life.


Per hour per hour? Ok, let's see... second-order integral (these always confuse me because rates are stepping up a dimension but derivatives are stepping down a dimension, but that's neither here nor there), assuming no constants, so they start at $0/h and there's no sign-up bonus...

Got it. After one day, they've made $2,880. A week gets them over $140k, putting a solid, but not insurmountable, dent in the transportation budget. A month gets them past $2.5 million, easily bankrupting most midsized towns' bus budgets. After a year they're closing in on $400 million.
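(A quick check of that arithmetic, in case anyone wants to play along:)

  # Pay rate grows at $10/h per hour: rate(t) = 10*t dollars/hour,
  # so earnings(T) = integral of 10*t dt from 0 to T = 5*T^2 dollars.
  for label, hours in [("day", 24), ("week", 168), ("month", 720), ("year", 8760)]:
      print(label, 5 * hours ** 2)
  # day 2880, week 141120, month 2592000, year 383688000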

Maybe I should become a bus driver.


The point of the Narrative Clip (https://getnarrative.com) was to let people both live in the moment AND capture it. It was launched at the height of the life-logger/wearable wave 10 years ago, but it's still an interesting idea, and the thought of having a more thorough "log" of your life might become more and more relevant the closer we get to vivid AI recreations and even brain augmentation.

Some customers also just had a bad memory and loved sort of re-living their day every evening, which made the memories stick better in the brain.

There are social "contracts" that users need to consider when using stuff like this though, as you do take photos of those around you and whoever you interact with..


Ok, so I'm thinking here that.. hmm... maybe.. just maybe... there is something that, kind of, steers the rest of the thought process into a, you know.. more open process? What do you think? What do I think?

As opposed to the more literary, authoritative prose of textbooks and papers, where the model output has to commit to a chain of thought from the get-go. Some interesting, relatively new results show that time spent on output tokens corresponds more or less linearly to better inference quality, so I guess this is a way to just achieve that.

The tokens are inserted artificially in some inference setups: when the model wants to end the sentence, you swap out the end token for "hmmmm" and it will happily continue.
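A minimal sketch of how that swap might look in a hand-rolled decode loop (the model, the token budgets, and the " Hmm," cue are all assumptions for illustration, not anyone's production setup):

  # Budget forcing sketch: if the model tries to stop before a minimum
  # "thinking" budget, splice in a continuation cue instead of the end token.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model
  model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

  MIN_THINK_TOKENS, MAX_TOKENS = 64, 256                 # assumed budgets
  cue_ids = tok.encode(" Hmm,")                          # continuation cue

  ids = tok.encode("Question: ...? Let me think.", return_tensors="pt")
  generated = 0
  while generated < MAX_TOKENS:
      with torch.no_grad():
          next_id = int(model(ids).logits[0, -1].argmax())
      if next_id == tok.eos_token_id:
          if generated >= MIN_THINK_TOKENS:
              break                                      # budget spent, allow the stop
          # Too early to stop: swap the end token for the cue and keep going.
          ids = torch.cat([ids, torch.tensor([cue_ids])], dim=1)
          generated += len(cue_ids)
          continue
      ids = torch.cat([ids, torch.tensor([[next_id]])], dim=1)
      generated += 1

  print(tok.decode(ids[0]))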


You're partially correct, but describing it that way makes it sound like the statistics would disappear if you could "just look a little bit closer", which doesn't happen. So it's more subtle than this. Fundamentally it's because QM doesn't use additive probabilities, but rather additive amplitudes, which are complex numbers; the probability is the squared magnitude of the sum of these, so you can get interference between amplitudes. You can never get interference by adding probabilities.

In the double-slit experiment this is visible in that you can't get the interference effects by summing the probabilities for "particle through slit 1" and "particle through slit 2"; rather, you need to sum the amplitudes of the two processes.
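A toy illustration of the difference (the amplitudes and the phase are made up, just to show the structure):

  # Toy double-slit numbers: amplitudes add, probabilities don't.
  import cmath

  a1 = cmath.exp(1j * 0.0) / 2 ** 0.5   # amplitude "through slit 1"
  a2 = cmath.exp(1j * 1.2) / 2 ** 0.5   # amplitude "through slit 2"

  p_quantum = abs(a1 + a2) ** 2                # |a1 + a2|^2, has interference
  p_classical = abs(a1) ** 2 + abs(a2) ** 2    # summed probabilities: always 1 here

  # The difference is the cross term 2*Re(a1 * conj(a2)),
  # which traces out the familiar fringe pattern as the phase varies.
  print(p_quantum, p_classical)                # ~1.36 vs 1.0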

Working physicists have just done this for 100 years; there is no practical need to interpret it further, but it would be cool if someone could find some prediction/experiment mismatch that does indeed require tweaking it!


No, you're understanding it correctly (I think): the behaviour of a single detected particle depends on all the possible paths it could have taken to reach the detector.

This is fundamental to 100 years of quantum mechanics and underlies most of physics, including all semiconductors, materials science, chemistry, lasers, etc. The double-slit experiment is just a very good illustration of the principle, boiled down to its essentials, which is why it's everywhere in pop-sci. It makes for a more accessible story than describing how a hydrogen atom works.


For someone jumping back on the local LLM train after having been out for 2 years, what is the current best local web-server solution to host this for myself on a GPU (RTX 3080) Linux server? Preferably with support for multimodal image input and LaTeX rendering on the output..

I don't really care about insanely "full kitchen sink" things that feature 100 plugins to all existing cloud AI services etc. Just running the released models the way they are intended on a web server...



Preemptively adding this for us AMD users - it's pretty seamless to get Ollama working with ROCm, and if you have a card that's a bit below the waterline (the lowest supported is a 6800 XT; I bought a 6750 XT), you can use a community patch that will enable it for your card anyway:

https://github.com/likelovewant/ollama-for-amd/wiki#demo-rel...

As someone who is technical but isn't proficient with building from source (yet!), I specifically recommend the method where you grab the patched rocblas.dll for your card model and replace the one that Ollama is using.


What's the benefit of the container over installing as a tool with uv? It seems like extra work to get it up and running with a GPU, and if you're using a Mac, the container slows down your models.


For that GPU the best Gemma 3 model you'll be able to run (with GPU-only inference) is 4-bit quantized 12b parameter model: https://ollama.com/library/gemma3:12b

You could use CPU for some of the layers, and use the 4-bit 27b model, but inference would be much slower.
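If you go the Ollama route, the Python client is an easy way to wire it into whatever frontend you want. A minimal sketch (the model tag is from the link above; the prompt is just an example):

  # Minimal sketch with the official ollama Python client
  # (pip install ollama; assumes `ollama serve` is running and
  # the model has been pulled with `ollama pull gemma3:12b`).
  import ollama

  response = ollama.chat(
      model="gemma3:12b",
      messages=[{"role": "user", "content": "Explain KV caching briefly."}],
  )
  print(response["message"]["content"])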


LM Studio in API mode, then literally any frontend that talks the OpenAI API.

Or just use the LM Studio frontend; it's better than anything I've used for desktop use.

I get 35 t/s with gemma 3 12b Q8 - you'll need a smaller one, probably gemma 3 12b q4_k_l. I have a 3090, that's why.
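Since LM Studio's server speaks the OpenAI API, any standard client works against it. A minimal sketch (port 1234 is LM Studio's default; the model name is a placeholder for whatever you've loaded):

  # Point the standard OpenAI client at LM Studio's local server.
  # pip install openai; the API key can be any placeholder string.
  from openai import OpenAI

  client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
  resp = client.chat.completions.create(
      model="gemma-3-12b-it",  # whatever model you have loaded locally
      messages=[{"role": "user", "content": "Hello!"}],
  )
  print(resp.choices[0].message.content)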


LibreChat + Ollama is the best I have tried. Fairly simple setup if you can grok YAML configs.

