I'm a software engineer working with scientist-turned-programmers, and what I've...

LeonardoTolstoy · on Jan 6, 2024

> Lack of familiarity with common data structures and algorithms

This part I 100% agree with. Adapting scientific code is my day-to-day, and most of the issues in it come down to making things 100x slower than they need to be, and then even implementing insane approximations to "fix" the speed issue instead of actually fixing it.
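A minimal sketch of the kind of slowdown meant here. It is a hypothetical example, not taken from any real scientific codebase; the function names and the sample-ID data are invented for illustration. Checking membership against a list scans the list on every lookup, while converting it to a set once makes each lookup roughly constant time:

```python
def filter_known_slow(sample_ids, known_ids):
    """Keep the samples that appear in known_ids, where known_ids is a list.

    Each `in` check scans the list linearly, so the total work grows
    with len(sample_ids) * len(known_ids).
    """
    return [s for s in sample_ids if s in known_ids]


def filter_known_fast(sample_ids, known_ids):
    """Same result, but with a one-time conversion of known_ids to a set.

    Set membership is a hash lookup, so the total work grows roughly
    with len(sample_ids) + len(known_ids).
    """
    known = set(known_ids)
    return [s for s in sample_ids if s in known]


if __name__ == "__main__":
    import time

    # Hypothetical data: integer sample IDs, half of which are "known".
    sample_ids = list(range(20_000))
    known_ids = list(range(0, 40_000, 2))

    for fn in (filter_known_slow, filter_known_fast):
        start = time.perf_counter()
        result = fn(sample_ids, known_ids)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: {len(result)} matches in {elapsed:.4f}s")
```

Both functions return the same list. How large the gap is depends entirely on the data sizes and the machine, so the timings printed here should not be read as a benchmark.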

> "Big ball of mud" design

Funnily enough, this is explicitly how the PI at my current job wants to implement software. In his opinion the biggest roadblock in scientific software is actually convincing scientists to use the software. And what scientists want is a big ball of mud that they can iterate on easily and that requires basically no installation. In his opinion a giant Python file with a requirements.txt file and a Python version is all you need. I find the attitude interesting. For the record he is a software engineer turned scientist, not the other way around, but our mutual hatred for Conda makes me wonder if he is onto something ...

> I think the appearance of "I'm just getting shit done" is often a superficial one, because it doesn't factor in the real costs: other scientists and engineers can't use their solutions because they're not designed in a way that makes them work in any other setting than the narrow one they were solving for.

For the record, my experience is the exact opposite. The crazy trash software produced by scientists, probably written in Python, is often the software most easily iterated on and used by other scientists. The software scientists and researchers can't use is the over-engineered stuff written in a language they don't know (e.g. Scala or Rust) that requires them to install a hundred things before they can use it.

karmelapple · on Jan 6, 2024

> The mindset … might be fine in a research setting

A vast amount of software is written for research papers that would be useful to people other than the paper’s authors. A lot of software that is in common use by commercial teams started off in academia.

One of the major issues I see is the lack of maintenance of this software, especially given all the problems described in your post and the one above. If the software is a big ball of mud, good luck to anyone trying to come in and modify it for their own similar research paper or commercial application.

I don’t know the answer to this, but I think additional funding for biology labs to hire something like a software developer devoted to making sure the lab’s software follows software development best practices reasonably closely would be a great start. If it’s a full-time position where they’d likely stick around for many years, some of the maintenance issues would resolve themselves, too. This software-minded person would still be at the lab even after the biology researchers have moved on elsewhere, and could answer questions from other people interested in code written years ago.

gyrovagueGeist · on Jan 7, 2024

This is the goal of the RSE field, but it's often still quite rare :(

https://us-rse.org/

karmelapple · on Jan 12, 2024

That's fantastic, I hadn't heard of this group before! I wish a lot more effort were spent here.

This seems like a much better way to spend one's software development time and experience than, say, ad-tech... at least in my humble opinion :)