
That came up a few days ago, but in the context of Julia. I wrote a long response at the time [0]. I find it interesting that his title is "Why numbering should start at zero", because it only works as an argument that starting at 0 is best if you accept that this is the proper notation for ranges:

  a <= x < b
If you accept that, then the rest of his argument follows. But if you use:

  a <= x <= b
Then 1-based (where a = 1) indexing actually makes a lot of sense as well, because the other reasons for selecting his preferred range notation fall away. 1-based is only "weird" if you use his range notation because then you describe ranges as:

  1 <= x < N+1
Where N is the length of the range. But, if you use the second form:

  1 <= x <= N
It reads fairly well.
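
To make the comparison concrete, here is a minimal Python sketch. The helper names `half_open` and `closed` are made up for illustration, and N = 5 is an arbitrary choice. It only shows that both notations describe the same 1-based indices and differ in which bound carries N.

```python
def half_open(a, b):
    """Integers x with a <= x < b."""
    return list(range(a, b))


def closed(a, b):
    """Integers x with a <= x <= b."""
    return list(range(a, b + 1))


N = 5  # arbitrary length, chosen only for illustration

# 1 <= x < N+1: the upper bound is N+1, not the length
first_form = half_open(1, N + 1)

# 1 <= x <= N: the upper bound is the length itself
second_form = closed(1, N)
```

Both lists hold the indices 1 through 5; only the written upper bound changes.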

[0] https://news.ycombinator.com/item?id=25723676



Dijkstra's argument is, pound for pound, one of the densest things I have ever read.

If you spend some more time with it, and you really should, you will discover that most of it is in fact arguing that what he calls convention a) is the best choice.

So what you're saying here is circular; of course if you accept his argument, then you accept his conclusion, which is the title of the paper!

You really should accept his argument though. It's quite persuasive. You're really just saying that 1-based indexing is okay if you use convention c), as indeed, Lua does.

But you're not even touching on why he says not to. If you do, I'm confident you'll conclude, as the field in general has, that a) is the more powerful and general choice.

But I would like to do more than just gesture at Dijkstra's magnificent argument and say "reread this!". The One True Wiki has a useful discussion of the subject: in particular, I find the observation that 0-based indexing unifies counting and measurement to be persuasive, and it's found nowhere in the paper in question.

https://wiki.c2.com/?ZeroAndOneBasedIndexes


I have read this EWD several times now, and I think he makes a solid case for:

  a <= x < b
over the other three forms in the general case. But he fails to make an argument for the special case of 0-based indexing vs 1-based.

When restricted to just the discussion of 1-based and 0-based:

1. The experience report doesn't seem to apply as it's about the general notation advantage, not the specific cases of 0- or 1-based indexes. We would need additional experience reports about those (which I suspect would tend to favor 0-based).

2. We don't need to worry about calculating the range size, because it's obvious for both cases. Yes, (a) is better if you're using 0-based indexes and it's awkward if you're using 1-based. But (c) is better if you're using 1-based and awkward if you're using 0-based. (a) leaves the range size of a 0-based index in the range description, and (c) leaves the range size of a 1-based index in the range description. The argument becomes a wash, neither is obviously better than the other on this basis.

3. The argument that (a) is better because you can see at a glance whether two ranges are adjacent is a persuasive one. But how do you have two adjacent ranges when they both have the same starting position? It's irrelevant to the case for either 0-based or 1-based arrays.
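
As a rough illustration of the adjacency point in item 3, here is a Python sketch with ranges stored as (lower, upper) tuples. The function names are hypothetical. It shows the general-notation advantage of (a), which holds whatever the starting index is.

```python
def adjacent_half_open(r1, r2):
    """Convention (a), lower <= x < upper: adjacent when r1's end equals r2's start."""
    return r1[1] == r2[0]


def adjacent_closed(r1, r2):
    """Convention (c), lower <= x <= upper: adjacency needs a +1 adjustment."""
    return r1[1] + 1 == r2[0]


def length_half_open(r):
    """Size under convention (a) is simply upper - lower."""
    return r[1] - r[0]


def length_closed(r):
    """Size under convention (c) is upper - lower + 1."""
    return r[1] - r[0] + 1


# The same two neighbouring blocks of integers, 2..4 and 5..8, in each convention.
half_open_pair = ((2, 5), (5, 9))
closed_pair = ((2, 4), (5, 8))
```

Neither check depends on whether indexing starts at 0 or 1, which is the point being made about this argument.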

His final argument is that:

  1 <= x < n+1
is more awkward than

  0 <= x < n
But he doesn't even consider the alternative, that:

  1 <= x <= n
is less awkward than:

  0 <= x <= n-1
Again, a non-argument because it assumes the outcome he wants, that 0-based is fundamentally better.
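
The four combinations can be tabulated with a short Python sketch. `index_range` is a made-up helper, n = 10 is arbitrary, and "a" and "c" are the convention labels used in this thread.

```python
def index_range(base, n, convention):
    """Return (lower, upper) bounds describing n indices that start at `base`.

    "a" means lower <= x < upper; "c" means lower <= x <= upper.
    """
    if convention == "a":
        return (base, base + n)
    if convention == "c":
        return (base, base + n - 1)
    raise ValueError("unknown convention: " + convention)


n = 10  # arbitrary length for illustration

# Bounds for every (base, convention) pair.
bounds = {
    (base, conv): index_range(base, n, conv)
    for base in (0, 1)
    for conv in ("a", "c")
}

# True where the written upper bound is exactly the length n.
upper_is_n = {key: upper == n for key, (lower, upper) in bounds.items()}
```

The upper bound equals n for 0-based with (a) and for 1-based with (c); the other two pairings need the n+1 or n-1 adjustment.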

Now, all that said, 0-based has other advantages, but Dijkstra fails to address them. Ultimately, though, as I said in the other discussion, your language shouldn't restrict your range description options to either.

> You really should accept his argument though. It's quite persuasive.

I don't think anyone should accept the argument because it is not persuasive. There are much better arguments for 0-based indexing than this EWD. We should scrap flawed arguments when we have much better ones available. The best part of the EWD is the description of which range notation is "better", but not the part about 0-based indexing.


2. How is (c) better for calculating range size if you're using 1-based?

3. "But how do you have two adjacent ranges when they both have the same starting position?" What do you mean?

Consider the length n prefix of a sequence. With 1-indexing, for n = 0, we have sequence[1:0], which means

a. The end position is before the start position (?!).

b. We have to use 0 anyway.
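
A small Python sketch of the empty-prefix case, under the assumption that a 1-based closed prefix of length n is written 1..n and a 0-based half-open prefix is written 0..n. The function names are hypothetical.

```python
def prefix_one_based_closed(n):
    """Bounds (start, end) for the length-n prefix, read as start <= x <= end."""
    return (1, n)


def prefix_zero_based_half_open(n):
    """Bounds (start, end) for the length-n prefix, read as start <= x < end."""
    return (0, n)


# The empty prefix, n = 0.
closed_start, closed_end = prefix_one_based_closed(0)
half_open_start, half_open_end = prefix_zero_based_half_open(0)

# Python's own slices are 0-based and half-open, so the empty prefix is seq[0:0].
sequence = ["a", "b", "c"]
empty_prefix = sequence[0:0]
```

In the 1-based closed form the end (0) falls before the start (1); in the 0-based half-open form the two bounds coincide.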


In (c) with 1-based ranges, the size of the range is present in the range description. There is nothing to calculate because the value is already there: 1 <= x <= size

One argument for the notation (not 0 vs 1 based ranges) is that notation (a) offers easier detection of adjacent ranges. This is irrelevant to the 0 and 1 based range debate.


A range doesn’t have to start at 1.



