mgens's comments

mgens · 2025-08-22T19:51:35

A month ago I saw a Waymo turn left into a tiny alley in Palo Alto and continue at the full 25 mph, which was alarming. I guess the alley is marked as a regular road in the software? It highlights how, even if Waymo is safer than humans on average, it needs to minimize these weird behaviors in order to get socially accepted and avoid $$$ liability when there is an accident.

vesrah · 2025-08-22T20:14:34

New speed bumps were installed in a school zone near my housing complex recently. We're a heavy Waymo area, and I watched one of them launch itself over a bump without slowing down.

potato3732842 · 2025-08-23T00:05:53

They installed one of those near my friend's house. A couple of mechanic shops in the vicinity used it for diagnosis while driving exactly the posted speed limit. It lasted about a month until the people who complained it into existence complained it out of existence.

mgens · on March 25, 2025

Have been using them for non-interactive coding where latency is not an issue. Specifically, turning a set of many free-text requirements into SQL statements, so that later when an item's data is entered into the system, we can efficiently find which requirements it meets. The reasoning models' output quality is much better than that of non-reasoning models like 3.5 Sonnet. It's not a subtle difference.
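
To make the idea concrete, here is a minimal sketch of the matching step, not the actual system. The `items` table, its columns, and the three requirement predicates are all made up for illustration; the assumption is that each free-text requirement has already been converted (by the model, then reviewed) into a SQL predicate.

```python
import sqlite3

# Hypothetical requirements: free text -> SQL predicate produced ahead of time.
REQUIREMENTS = {
    "Must weigh under 10 kg": "weight_kg < 10",
    "Must be red or blue": "color IN ('red', 'blue')",
    "Must cost at least 50 and be in stock": "price >= 50 AND in_stock = 1",
}


def matching_requirements(conn, item_id):
    """Return the requirement texts whose SQL predicate holds for the item."""
    met = []
    for text, predicate in REQUIREMENTS.items():
        # Predicates are assumed to be trusted, pre-reviewed SQL;
        # only the item id is passed as a bound parameter.
        row = conn.execute(
            f"SELECT 1 FROM items WHERE id = ? AND ({predicate})", (item_id,)
        ).fetchone()
        if row is not None:
            met.append(text)
    return met


# Hypothetical schema and a single example item.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, weight_kg REAL, color TEXT, "
    "price REAL, in_stock INTEGER)"
)
conn.execute("INSERT INTO items VALUES (1, 4.5, 'red', 20.0, 1)")

print(matching_requirements(conn, 1))
```

The expensive LLM call happens once per requirement, offline. Checking a new item is then just cheap SQL, which is why latency of the reasoning model doesn't matter here.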

mgens · on Feb 11, 2025

Unfortunately quite common to see serious mathematical issues in the medical literature. I guess due to a combination of math being essential to interpreting medical data and trial results, but most practitioners not having much depth of math knowledge. Just this week I came across the quote "Frequentist 95% CI: we can be 95% confident that the true estimate would lie within the interval." This is an incorrect interpretation of confidence intervals, but the amusing part is that it is from a tutorial paper about them, so the authors should have known better. And cited by 327! https://pmc.ncbi.nlm.nih.gov/articles/PMC6630113/
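
For anyone wondering what the correct reading is: the 95% describes the procedure, not any one interval. If you repeated the study many times, about 95% of the intervals computed this way would contain the true parameter; a single computed interval either contains it or it doesn't. A quick simulation sketch of that, assuming normally distributed data with a known standard deviation (all the numbers below are arbitrary choices for illustration):

```python
import random
import statistics


def coverage(true_mean=10.0, sigma=2.0, n=30, trials=2000, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    # Known-sigma z-interval: sample mean +/- 1.96 * sigma / sqrt(n).
    half_width = 1.96 * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / trials


# Should land close to 0.95; each individual interval is simply a hit or a miss.
print(coverage())
```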

mgens · on Jan 31, 2025

Right. For large-volume requests that use reasoning this will be quite useful. I have a task that requires the LLM to convert thousands of free-text statements into SQL select statements, and o3-mini-high is able to get many of the more complicated ones that GPT-4o and Sonnet 3.5 failed at. So I will be switching this task to either o3-mini or DeepSeek-R1.

mgens · on Dec 6, 2024

Yes! Reading some basic documentation on the language or framework, then starting to build in Cursor with AI suggestions works so well. The AI suggests using functions you didn't even know about yet, then you can go read documentation on them to flesh out your knowledge. Learned basic web dev with Django and Tailwind this way and it accelerated the process greatly. Related to the article, this relies on being curious and taking the time to learn any concepts the AI is using, since you can't trust it completely. But it's a wonderfully organic way to learn by doing.

dogleash · on Dec 6, 2024

Tell me you're a Cursor advertisement chatbot without telling me you're a Cursor advertisement chatbot.

mgens · on Dec 7, 2024

I'll have you know that I successfully clicked on several traffic lights and motorcycles this week to get into a web site. Didn't even need Cursor!