On my MBP M2 Max 16" with 96GB of RAM, 2 prompts on llama3.3:70b or 1 prompt on deepseek-r1:70b take about 10% of battery. Really makes me nervous when I think about people (ab)using o1 or 4o for everything.
For a prompt to make the cut, I usually judge whether the task (or the documentation) is boring enough, time-consuming enough, yet precise enough for an LLM to give me the solution. If something requires more than three to five prompts, then I need to work on it myself and come back with smaller, more precise questions.
IMO, if the green energy industry wants to penetrate an important domain of modern life fast, the serving of open source models seems like really low-hanging fruit.
With 10% of battery I can code for 3 hours. If I compare the result of that activity with the 10% of battery spent on a single question whose answer probably won't be completely correct, I consider the latter excessive, in relative terms, even if I take into account the cost of my time. Theoretically, if I needed to ask 100 questions to complete one task, I would need to recharge my computer 10 times. This is why I offload the burden of energy usage to a third party such as OpenAI.
BTW I don't drive if I don't have to either, which might make me more sensitive than you to this, comparatively low, energy usage diff.
Funny, I actually think the energy use was high, not low. The energy to move a 1,000 kg thing 100 meters, just to answer a simple question? Try doing that with your own muscles.
And that from a system that uses something like 10% of the energy that OpenAI does. So for OpenAI, every useless request is like pushing 1,000 kg for 1 km.
Ok I have been waiting for some time to find the right context on hackernews to shamelessly mention a democratic system that I have been working on for the last year:
https://arxiv.org/pdf/2109.01436.pdf
IMO the problem with most {something} democracy systems is that they are voting paradigms that focus either only or mostly on how decisions are made but not how the options to decide upon are formed. This is true for the systems that you mentioned as well.
Moreover, the fact that decision-making and policy authoring are operations that take time leads to citizens' beliefs updating during the processes and after the fact, but with "old-school" type representative democracies there is no real-time mechanism to synchronise policy suggestions and voting outcomes with the new states of information accumulation by voters. "High-tech" democracies should take advantage of the technologies that enable fast and cheap communication but they should also have mechanisms that disincentivise the spam that inevitably follows cheap talk.
Regarding AI policy making, there are very interesting projects such as https://pol.is which address a lot of the problems that I mentioned above but, in my opinion, their output should be used as advice only. We have not yet exhausted human-centric democratic systems, and neural networks are still privately-owned and privately-trained black boxes.
Been working with and around Polis software for a few years, and have had the chance to visit Taiwan to participate in vTaiwan for a few months. Definitely seconding it as massively interesting.
> they are voting paradigms that focus either only or mostly on how decisions are made but not how the options to decide upon are formed.
Just wanted to second your thought here, as we seem to have converged on this independently. I've been speaking similarly about the framing of Polis. Facilitative processes often require cycles of expansion of possibility and then collapse into singular options (whether the options to vote between, or the yay/nay decision). This expansion/collapse is often called the double diamond process.
For example, civil society discussion is often expansion, and deliberative assembly in parliament is also ideally about expansion. Putting forward electoral candidates is expansion. Voting (electoral or parliamentary) is purely collapse. Who participates in either the expansion or the collapse is a critical feature that differs across all of these.
Different levels of expansion/collapse of the possibility space are stitched together in our old processes, built on the technology of their era: postal mail and telegraphs (communicating electoral results) and horse-drawn carriages (attending parliamentary proceedings). In its prime, Robert's Rules of Order was a cutting-edge social technology that (coupled with other parliamentary democracy infra) allowed as many people as possible to have their views enshrined in governing law.
Most digital innovation focuses on modernizing the collapse process inherent in voting, but it is the less interesting part imho.
Polis is a tool built on new technology that potentially unseats Robert's Rules of Order as the universal technology of deliberation (expansion). Its dimensionality reduction allows for a lightweight process that accepts TONS of participation. It facilitates larger processes where people can participate more directly in issue-based expansion of the possibilities on the way to collapsing into a singular decision or compromise.
It is a deliberation-support tool (expansion), not a decision-making tool (collapse).
Hope this helps others trying to grapple with the significance of tools like Polis :)
Thank you very much for the nice comment! I like the double diamond process idea. Informative comments like this make me come back to hackernews.
I would say that the model I have been working on is like a framework for both sides. From the few papers that exist about deliberative democracy, the idea is that with a well-thought-out and inclusive process of expansion on a particular subject, the collapse becomes almost a formality, since its result will simply confirm the outcome of the deliberation.
Polis seems to work quite well with the plurality of opinions but IMO, for the process to be inclusive and for malevolent participants not to hijack the conversation, there needs to exist a set of incentives and disincentives that will make citizens behave honestly towards the system.
Of course these types of conversations are what I live for so I sent you a message on Twitter :)
On the other hand, you can test a neural network a million times, exhaustively mapping its biases and failures. You can't do that with a human; you've got to take a chance.
Yes, you can do a million tests, but the biases and failures of the neural net are chosen by human developers, a minuscule subset of citizens who themselves have some form of bias depending on their educational background, the places where they grew up, family and friends, etc.
While you can replace a human in a powerful position with another one, we do not know how to surgically correct individual weights in order to remove a specific decisional bias from a neural net; we can only retrain it and hope for the best. Because we are humans ourselves, we can understand the incentives of other humans and create adequate mechanisms to correct for their selfishness and their bias, but what would be the incentive of an all-powerful neural net? Which loss is it minimizing? And how does it know the citizens' preferences in the present and in the future? If the citizens stop feeding it (accurate) information at some point in time, will it still be the benevolent dictator that it was supposed to be?
*Edit: Another counterargument that generally applies to neural nets making decisions for human activities is the argument of accountability. While you can put a bad politician on trial for their harmful-to-society decision making, who is to blame when the neural net inevitably spews the wrong output on an issue it has not encountered before and a policy decision is made based on that? Will we put the developers on trial?
Apparently GPT-3 has learned a large number of personality types and their fine-grained biases, well enough to simulate a poll with good accuracy. This means you can poll it any time to check how a population of choice would have reacted had something happened. The language model does not carry just the biases of its builders; it learns all biases equally, and you just need to specify the desired bias (personality type) when you call it. The responsibility falls with the user; the builders are not to blame this time, unless they didn't cover every bias equally well.
Ok, so GPT-3 can model biases well. This still doesn't solve problems such as the optimal aggregation of citizens' preferences, which is the actual optimization problem of policy making. Just to give an idea of how complex a field this is: there is a subdomain of economic theory and information theory called social choice https://en.wikipedia.org/wiki/Social_choice_theory that works on these issues, and even it does not have many theories about policy formation, mostly about choice between already-formed policies.
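To make the aggregation problem concrete, here is a minimal sketch (in JavaScript, with made-up ballots) of the classic Condorcet paradox: three individually rational voters whose pairwise majority preferences form a cycle, so no rule based on pairwise majorities can pick a stable winner.

```javascript
// Three voters, three options; each ballot ranks options best-first.
const ballots = [
  ['A', 'B', 'C'],
  ['B', 'C', 'A'],
  ['C', 'A', 'B'],
];

// Does a single ballot rank x above y?
function prefers(ballot, x, y) {
  return ballot.indexOf(x) < ballot.indexOf(y);
}

// Does a strict majority of voters rank x above y?
function majorityPrefers(x, y) {
  const wins = ballots.filter(b => prefers(b, x, y)).length;
  return wins > ballots.length / 2;
}

console.log(majorityPrefers('A', 'B')); // true: 2 of 3 prefer A to B
console.log(majorityPrefers('B', 'C')); // true: 2 of 3 prefer B to C
console.log(majorityPrefers('C', 'A')); // true: a cycle — no stable winner
```

Every voter has perfectly consistent preferences, yet the "collective" preference is circular, which is a small taste of why aggregation (and, a fortiori, policy formation) is hard.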
If a neural net finds a way to write policies that are Nash-equilibrium collective decisions in all possible democracy systems, then we will be close to solving the problem.
> you can test a neural network a million times [...] You can't do that with a human, you got to take a chance.
Funny, as a biochemist, I always took it that minds had essentially been through unfathomable numbers and levels of iteration through cultural and biological evolution cycles, and that ANNs were the untrustworthy new kids on the block :)
From my recent experience with WebAssembly developing a cryptographic library for Node.js and the browser [1], I have to say that once someone needs memory allocation, passing typed arrays from JS to WASM (I did not manage to make the opposite direction work), etc., it quickly becomes obvious that there is a lack of documentation, and build-system fragmentation that only hurts community growth IMO. If I were less motivated to finish the undertaking, I would just give up and go with libsodium-wrappers or tweetnacljs.
I started with clang targeting wasm32-unknown-unknown-wasm as my build system, but that just did not work with malloc/free unless I targeted WASI; and if I targeted WASI, I could not run the module in the browser except with a polyfill that was hard to set up with a C/TS stack. I ended up with Emscripten because it imports the module with all the right helper functions, but there I was getting memory errors in debug mode but not in production. I needed to pass the Uint8Arrays from JS to WASM in a very specific way (via HEAP8), otherwise the pointers did not work properly, and I was not able to find this in the documentation. I only found out from a Stack Overflow comment somewhere after two weeks of brain melting (why would Uint8Array(memory.buffer, offset, len).byteOffset not work?).
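For anyone hitting the same wall, one common culprit (I can't be sure it was the cause here, but the symptom matches): growing WASM memory detaches the old ArrayBuffer, so any Uint8Array view created earlier silently goes dead (length and byteOffset collapse to 0). Emscripten re-creates its HEAP8/HEAPU8 views after growth, which is why going through them works while a cached view does not. A self-contained Node sketch of the failure mode, no Emscripten needed:

```javascript
// A WASM linear memory with room to grow (sizes are in 64 KiB pages).
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });

// Cache a view over the current buffer, as one might do once at startup.
const view = new Uint8Array(memory.buffer, 0, 4);
view[0] = 42;
console.log(view.length); // 4

// Growing the memory detaches the old ArrayBuffer...
memory.grow(1);

// ...so the cached view is now dead: its length reads as 0.
console.log(view.length); // 0
console.log(memory.buffer.byteLength); // 131072 (2 pages)
```

The fix is to always re-derive the view from `memory.buffer` (or use Emscripten's HEAP views, which are refreshed for you) instead of caching it.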
After I compiled the project successfully and the JS was giving the correct results, I decided to compile with the -s SINGLE_FILE flag in order to make the package as portable as possible, but this increased the size significantly, because it encodes the binary as base64, which is then decoded back into a WASM module from JS. A package manager for a compiled language that outputs cross-env JS and solves these problems automagically would be, IMO again, a game changer for the ecosystem. I believe this is what AssemblyScript tries to achieve, but I honestly could not make it work for my project after experimenting with it for a day or two.
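The size hit from embedding the binary is mostly the base64 encoding itself: base64 represents every 3 bytes as 4 ASCII characters, so roughly a 33% inflation before the JS wrapper is even counted. A quick Node sketch (the 3,000-byte buffer is just a stand-in for a .wasm payload):

```javascript
// 3,000 bytes of arbitrary binary data, standing in for a .wasm file.
const bytes = Buffer.alloc(3000, 0xab);

// Base64 maps every 3 input bytes to 4 output characters.
const b64 = bytes.toString('base64');

console.log(b64.length); // 4000
console.log(b64.length / bytes.length); // ≈ 1.33
```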
I get that a lot of the problems come from the incompatibility of browser and Node.js APIs and the different agendas of the various stakeholders, but I would very much like to see these differences reconciled so that we can have a good developer experience for cross-platform WASM modules, which will lead to more high-performance components for JS, a programming language that affects so many people.
Thank you so much for the feedback! This is my first project in C so it means the world to me. I try to follow all the best practices in the hope that others might want to contribute someday.