In Python, len() on a bytes type gives you the number of bytes, and len() on a str type gives you the number of codepoints. I think that makes sense, as strings are only intended to deal with text, and you should never have to worry about byte indexing at all.
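A quick illustration (any character outside ASCII shows the gap):

    s = "héllo"            # 5 code points; 'é' is U+00E9
    b = s.encode("utf-8")  # 'é' takes two bytes in UTF-8

    print(len(s))  # 5
    print(len(b))  # 6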
As someone who has done both, I'd say that argument is wrong. It is much more convenient to index by code point. Indexing by bytes is almost never what you want, and it leads to a lot of errors.
In many cases it's not very useful, but there are clearly cases where it is, e.g. if you want to normalize text, compose/change emojis, stuff like that.
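For instance, normalization can change the code point count without changing the rendered text at all; a quick sketch using the stdlib unicodedata module:

    import unicodedata

    composed = "é"  # U+00E9, one code point
    decomposed = unicodedata.normalize("NFD", composed)  # 'e' + U+0301 combining accent

    print(len(composed))    # 1
    print(len(decomposed))  # 2
    print(unicodedata.normalize("NFC", decomposed) == composed)  # True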
A codepoint is the "smallest useful addressable unit" when dealing with Unicode text, so it makes sense that's the default.
It's also comparatively expensive to address grapheme clusters: their boundaries depend on Unicode property data and have to be found by scanning, whereas indexing a str by code point is O(1).
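For example, even counting them takes a full scan (a sketch using the third-party regex module, which supports \X for extended grapheme clusters; stdlib re doesn't):

    import regex  # pip install regex

    s = "👍🏿"  # U+1F44D + U+1F3FF: 2 code points, 1 grapheme cluster

    print(len(s))                        # 2 code points
    print(len(regex.findall(r"\X", s)))  # 1 grapheme cluster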
> In many cases it's not very useful, but there are clearly cases where it is, e.g. if you want to normalize text, compose/change emojis, stuff like that.
I can see that iterating through the text by codepoint could be useful for some of those cases, but I still can't see why you'd ever want to index by codepoint?
For the same reason you want to index anything: to slice, insert, remove, etc. E.g. to replace a skin tone modifier in an emoji: "s = s[:i] + chr(0x1f3ff) + s[i+1:]", or to insert one: "s = s[:i] + chr(0x1f3ff) + s[i:]". (Python strings are immutable and you can't concatenate an int to a str, so it has to be chr() plus slicing rather than "str[i] = 0x1f3ff".)
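A runnable version of that, with i and the modifier as example values:

    DARK = chr(0x1f3ff)  # U+1F3FF EMOJI MODIFIER FITZPATRICK TYPE-6

    s = "👍🏻"  # thumbs up + light skin tone modifier: two code points
    i = 1      # index of the modifier code point

    s = s[:i] + DARK + s[i + 1:]  # replace the modifier
    print(s)                      # 👍🏿

    bare = "👍"
    bare = bare[:1] + DARK + bare[1:]  # insert a modifier
    print(bare)                        # 👍🏿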
But that's a pointlessly inefficient way to do it - surely what you want there is to iterate and transform rather than scan through and then slice? (And don't you need to group by extended grapheme cluster rather than codepoint anyway for that to make sense?)
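Something like this one-pass version, say (SKIN_TONES and darken are just illustrative names):

    SKIN_TONES = {chr(cp) for cp in range(0x1f3fb, 0x1f400)}  # U+1F3FB..U+1F3FF
    DARK = chr(0x1f3ff)

    def darken(s):
        # Single pass: transform each code point while iterating,
        # instead of scanning for an index and then slicing.
        return "".join(DARK if ch in SKIN_TONES else ch for ch in s)

    print(darken("👍🏻 👋🏼"))  # 👍🏿 👋🏿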