More

kuczmama · 2026-02-06T16:20:55 1770394855

This is a sad day. I used heroku for years (in the past).

A few alternatives to consider

- https://render.com/ - this is very close to heroku

- https://coolify.io/ - My personal favorite. It's slightly more involved, but you can run it on any hardware like hetzner and save a boatload.

swat535 · 2026-02-06T19:13:12 1770405192

How does render compare to fly.io? Does anyone has experience running production rails apps on these?

ojusave · 2026-02-06T20:34:22 1770410062

https://render.com/articles/render-vs-fly-io

mcintyre1994 · 2026-02-07T09:35:46 1770456946

I find it quite funny that they give themselves maximum marks for transparent pricing. If you go to their pricing page, everything is priced as “per user/month plus compute costs*”. Maybe it’s just because I’m on mobile and the page doesn’t seem to work super well, but reading if I have no idea what those compute costs are and therefore what the actual cost is.

tptacek · 2026-02-07T06:10:43 1770444643

Honestly, this is super fair. I went in expecting to hate-read it (not because I have any issues with Render, they're great, just these X vs. Y competitive posts). But yeah, I think this is a reasonable way to look at it.

swat535 · 2026-02-07T02:41:23 1770432083

Thank you! Can't believe I missed that..

joelanman · 2026-02-07T10:48:29 1770461309

https://railway.com is another good one

kuczmama · on Feb 16, 2025

South Bay Gen AI Meetup: https://www.meetup.com/gen-ai/

kuczmama · on Dec 27, 2024

You could get this plastic test: https://blueprint.bryanjohnson.com/products/microplastics-te.... I haven't tried it, but it's the only one I know of.

hombre_fatal · on Dec 28, 2024

I don’t know what that’s measuring but TFA is about plastic chemicals like BPA, not microplastics

kuczmama · on Dec 12, 2024

This is awesome! Thank you for building this.

kuczmama · on Nov 7, 2024

I have been using Google voice for 10+ years, and it's always been something that concerns me. I've been worried about Google either banning me or dropping the service. Does anyone know of a good alternative, I'm happy to pay for something but the main thing is it needs to be able to work from the browser like Google voice.

wiredfool · on Nov 7, 2024

I've been pondering something based on twilio or similar, but haven't found anything. Don't really need the web calling, but web sms would be useful at times.

jmp.chat doesn't really fill me with confidence, but it might be an option in an emergency.

xethos · on Nov 7, 2024

I've been using (and recommended up-thread) Voip.ms, whose services include web-based SMS as an option

miloignis · on Nov 7, 2024

I've been playing around with https://jmp.chat/ (SMS and Voice over XMPP) and might make the switch for my main number soon.

CharlesW · on Nov 7, 2024

One option is to buy an Ooma ($80) and port your number to it ($40). The my.ooma.com site isn't responsive, but otherwise works fine on mobile.

mdaniel · on Nov 7, 2024

I had never heard of them, but FWIW their "simplysafe competitor" says it comes with phone service, so that one-stop-shopping may interest this audience https://www.ooma.com/home-security/diy-security-system/#:~:t...

ed: however, unlike Google Voice it does not appear this service supports SMS/MMS so it's for sure not a one-to-one replacement :(

It _seems_ that Asterisk supports SMS, but from the wording I can't tell if they mean only in Europe or what https://docs.asterisk.org/Configuration/Applications/Short-M...

kuczmama · on Aug 7, 2024

Shameless plug: This is a podcast interview I did with Yi Ding of Llama index where we discussed some of the differences between LangChain and LlamaIndex (among other things). Give it a listen as a supplement to the article:

https://podcast.genaimeetup.com/e/typescript-ai-crafting-the...

kuczmama · on March 26, 2024

That is very cool. I'm curious if you can share some of the things people are using this synthetic DNA for?

koeng · on March 26, 2024

I can share a few things! (but definitely not all)

- One is finding different T7 RNA polymerases with unique properties by manipulating the backbone. They can be used for things like in-vitro RNA production for vaccines - Another is synthesizing a phage that has been sequenced for a specific organism, but that the samples are now lost of. So resynthesizing that genome from scratch - A different project (personal one) is building a DNA parts toolkit with standardized DNA parts so you can combine em together like legos. Pretty much nowhere but FreeGenes has open source genetic parts (I used to run that project), and I think open source genetic parts need to be in the world

kuczmama · on March 7, 2024

Somewhat related is "Oh My ZSH!" which is basically zsh on steroids, it's always one of the first things I install on a new computer. It gives things like new colors, themes, plugins, and more. Highly recommend you check it out.

https://ohmyz.sh/

zemnl · on March 8, 2024

I used OhMyZsh! for a long time and really liked it, but now I moved to Starship [1]. However I noticed I still missed some of the "steroids" that OhMyZsh brought so I keep it installed in my system and I only source the plugins I need in the .zshrc:

  source /usr/share/oh-my-zsh/lib/completion.zsh
  source /usr/share/oh-my-zsh/lib/directories.zsh
  source /usr/share/oh-my-zsh/lib/history.zsh
  source /usr/share/oh-my-zsh/lib/spectrum.zsh
  source /usr/share/oh-my-zsh/lib/key-bindings.zsh

[1]: https://starship.rs/

jerpint · on March 8, 2024

This + ohmytmux is my go to

kuczmama · on Jan 29, 2024

Realistically, what hardware would be required to run this? I assumed a RTX 3090 would be enough?

summarity · on Jan 29, 2024

A Studio Mac with an M1 Ultra (about 2800 USD used) is actually a really cost effective way to run in. Its total system power consumption is really low, even spitting out tokens at full tilt (<250W).

michaelt · on Jan 29, 2024

You can run a similarly sized model - Llama 2 70B - at the 'Q4_K_M' quantisation level, with 44 GB of memory [1]. So you can just about fit it on 2x RTX 3090 (which you can buy, used, for around $1100 each)

Of course, you can buy quite a lot of hosted model API access or cloud GPU time for that money.

[1] https://huggingface.co/TheBloke/Llama-2-70B-GGUF

kkzz99 · on Jan 29, 2024

RTX 3090 has 24GB of memory, a quantized llama70b takes around 60GB of memory. You can offload a few layers on the gpu, but most of them will run on the CPU with terrible speeds.

nullc · on Jan 29, 2024

You're not required to put the whole model in a single GPU.

You can buy a 24GB gpu for $150-ish (P40).

kuczmama · on Jan 29, 2024

Wow that's a really good idea. I could potentially buy 4 Nvidia P40's for the same price as a 3090 and run inference on pretty much any model I want.

frognumber · on Jan 30, 2024

For reference for readers.

SUPPORTED

=========

* Ada / Hopper / A4xxx (but not A4000)

* Ampere / A3xxx

* Turing / Quadro RTX / GTX 16xx / RTX 20XX / Volta / Tesla

EOL 2023/2024

=============

* Pascal / Quadro P / Geforce GTX 10XX / Tesla

Unsupported

===========

* Maxwell

* Kepler

* Fermi

* Tesla (yes, this one pops up over and over, chaotically)

* Curie

Older don't really do GPGPU much. The older cards are also quite slow relative to modern ones! A lot of the ancient workstation cards can run big models cheaply, but (1) with incredible software complexity (2) very slowly, even relative to modern CPUs.

Blender rendering very much isn't ML, but it is a nice, standardized benchmark:

https://opendata.blender.org/

As a point of reference: A P40 has a score of 774 for Blender rendering, and a 4090 has 11,321. There are CPUs ($$$) in the 2000 mark, so about dual P40. It's hard for me to justify a P40-style GPU over something like a 4060Ti 16GB (3800), an Arc a770 16GB (1900), or a 7600XT 16GB (1300). They cost more, but the speed difference is nontrivial, as is the compatibility difference and support life. A lot of work is going into making modern Intel / AMD GPUs supported, while ancient ones are being deprecated.

nullc · on Jan 30, 2024

P40 is essentially a faster 1080 with 24GB ram. For many tasks (including LLMs) it's easy to be memory bandwidth bottlenecked and if you are they are more evenly matched. (newer hardware has more bandwidth, sure but not in a cost proportional manner).

I find that my hosts using 9x P40 do inference on 70b models MUCH MUCH faster than a e.g. a dual 7763 and cost a lot less. ... and can also support 200B parameter models!

For the price of a single 4090, which doesn't have enough ram to run anything I'm interested in, I can have slower cards which have cumulatively 15 times the memory and cumulatively 3.5 times the memory bandwidth.

frognumber · on Jan 30, 2024

Interesting.

Technically, P40 is rated at an impressive 347.1GB/sec memory bandwidth, and 4060, at a slightly lower 272GB/sec. For bandwidth-limited workloads, the P40 still wins.

The 4090 is about 3-4x that, but as you point out, is not cost-competitive.

What do you use to fit 9x P40 cards in one machine, supply them with 2-3kW of power, and keep them cooled? Best I've found are older rackmount servers, and the ones I was looking at stoped short of that.

nullc · on Jan 30, 2024

I technically have 10 plus a 100GBE card in them, but due to nvidia chipset limitations using more than 9 in a single task is a pain. (Also, IIRC one of the slots is only 8x in my chassis)

Supermicro has made a couple 5U chassis that will take 10x double width cards and provide adequate power and cooling. SYS-521GE-TNRT is one such example (I'm not sure off the top of my head which mine are, they're not labeled on the chassis, but they may be that).

They pricey new but they show up on ebay for 1-2k. The last ones I bought I paid $1800, I think the earlier set I paid $1500/ea-- around the time that ethereum gpu mining ended ( I haven no clue why someone was using chassis like these for gpu mining, but I'm glad to have benefited! ).

eurekin · on Jan 29, 2024

Just make sure you're comfortable with manually compiling the bitsandbytes and generally combine a software stack of almost out of date libraries

nullc · on Jan 29, 2024

P40 still works with 12.2 at the moment. I used to use K80s (which I think I paid like $50 for!) which turned into a huge mess to deal with older libraries, especially since essentially all ML stuff is on a crazy upgrade cadence with everything constantly breaking even without having to deal with orphaned old software.

You can get gpu server chassis that have 10 pci-slots too! for around $2k on ebay. But note that there is a hardware limitation on the PCI-E cards such that each card can only directly communicate with 8 others at a time. Beware, they're LOUD even by the standards of sever hardware.

Oh also the nvidia tesla power connectors have cpu-connector like polarity instead of pci-e, so at least in my chassis I needed to adapt them.

Also keep in mind that if you aren't using a special gpu chassis, the tesla cards don't have fans, so you have to provide cooling.

kuczmama · on Jan 29, 2024

That's a good point. Are you referring to the out of date cuda libraries?

eurekin · on Jan 29, 2024

I don't remember exactly (either cuda directly or the cudnn version used by the flashattention)... Anyway, /r/localLlama has few instances of such builds. Might be really worthwhile looking that up before buying

trentnelson · on Jan 29, 2024

Can that be split across multiple GPUs? i.e. what if I have 4xV100-DGXS-32GBs?

kuczmama · on Jan 11, 2024

> The politicians against healthcare reform are well-known

Who are they specifically?

paxys · on Jan 11, 2024

The answer is the majority of the Republican party and their electorate, but just saying that will get you a barrage of downvotes on HN and other similar forums.

We get the system we deserve..

yunwal · on Jan 12, 2024

I generally vote Democrat, but there is no American political party putting serious support toward a sensible healthcare system.

wolverine876 · on Jan 12, 2024

It was top billing in Democrat policy for decades, from Clinton through Clinton, and Biden did some too. Top billing in Republican policy was to oppose it and undermine it - remember the endless, pointless, virtue-signaling votes during Obama's administration by Congressional Republicans to kill Obama's healthcare plan. Remember he got it through on the slimist margin, through procedural maneuver, with (no? one?) Republican votes.

Organize and get the ball rolling. What are the problems? How do we solve them? You're not a customer or a victim; you tell them what they should be doing. Don't worry, most politicians will do anything to get some votes!

yunwal · on Jan 12, 2024

> Remember the endless…

I remember how the ACA was essentially a giant gift to the insurance industry, and that despite being mandated in most states, only increased the insurance rate by a few percent. Oh, and the vast majority of the increase in insurance was health plans with a $10000+ deductible. I remember how medical debt bankruptcies continued to climb after it was passed.

> Don't worry, most politicians will do anything to get some votes!

Before they get votes, they have to win the money primary. Then we get to decide which technocrat or self-funded billionaire candidate sounds nicest to our ear.

It’s time to stop thinking you can vote your way out of this. Our leverage is our labor.

horns4lyfe · on Jan 12, 2024

Well we got Democrat-led reform and it’s arguably worse.

hasty_pudding · on Jan 11, 2024

We get to pick from the politicians that the DNC and the RNC allow us to pick from.

verisimi · on Jan 11, 2024

Just vote them out. The medical system is the only issue that matters. /s