The doc for sort.Slice
states:
// The less function must satisfy the same requirements as
// the Interface type's Less method.
This is referring to the less function interface here:
// Less must describe a transitive ordering:
// - if both Less(i, j) and Less(j, k) are true, then Less(i, k) must be true as well.
// - if both Less(i, j) and Less(j, k) are false, then Less(i, k) must be false as well.
This describes only transitive ordering. In contrast, the underlying algorithm, pdqsort, requires strict weak ordering.
This is an important distinction. If a comparator is, for example:
- reflexive
- transitive
Then it will be conforming to the documented requirements yet will still cause crashes under nebulous implementation defined circumstances. e.g. the number of elements to sort is greater than 33, and elements happen to be in a specific order.
To fix:
Change the documented requirement of sort.Slice
to what is stated in sort.SortFunc
.
// SortFunc sorts the slice x in ascending order as determined by the cmp
// function. This sort is not guaranteed to be stable.
// cmp(a, b) should return a negative number when a < b, a positive number when
// a > b and zero when a == b or a and b are incomparable in the sense of
// a strict weak ordering.
//
// SortFunc requires that cmp is a strict weak ordering.
// See https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings.
// The function should return 0 for incomparable items.
Comment From: gabyhelp
Related Issues
Related Code Changes
(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)
Comment From: gopherbot
Change https://go.dev/cl/666355 mentions this issue: sort/slice: fix doc for sort.Slice
Comment From: randall77
I'm not certain that changing the spec is the way forward here. I'd rather change the implementation so that it conforms to the spec as written.
What is the case that causes a problem here? Is it when less(x,x)
returns true? Can we detect that somehow in pdqsort and handle it, or maybe even bail out to a slower sort?
Comment From: jagprog5
@randall77 think of this as a clarification and not a change to the doc. the doc is written poorly. practically speaking, everyone assumed strict weak ordering. but it was written in a way that, I'd argue, gave weaker requirements than is necessary.
I've changed the title
Comment From: randall77
Should we just require a strict weak ordering everywhere then?
(I'm not sure why in the CL sort.Slice
and sort.SliceStable
have explicitly different requirements.)
Comment From: jagprog5
Should we just require a strict weak ordering everywhere then?
Yes, I've changed it. we already require a strict weak ordering everywhere, this just clarifies that.
As for the difference in sort vs stable, I've dropped that. There is a different in the requirements only from the implementation, but not the doc. I'd prefer it to be part of the doc / spec as well, but that would require more work in justifying. And, I see that C++ stdlib doesn't provide a public distinction either, so it's probably best to just stay consistent anyway.
Comment From: randall77
Do you have an example of a sort that actually goes wrong with a non-strict-weak-ordering comparison? Particularly, an ordering which is transitive and asymmetric but is (sometimes at least) reflexive.
Comment From: jagprog5
Sure, here's an example which satisfies the letter but not the spirit of the current spec:
fn less(a, b): // pseudocode
true // intended to trivially breaks irreflexive property (CRASH)
To clarify why this works, let's look at the current spec:
// Less reports whether the element with index i
// must sort before the element with index j.
This says nothing. We are defining what less means! So it's circular.
// If both Less(i, j) and Less(j, i) are false,
// then the elements at index i and j are considered equal.
This part doesn't matter, since like above, we are defining what "equal" means through the comparator so this part also says nothing. But anyways: requirement is never used since Less(i, j) and Less(j, i)
will never be false, both are stated literally as true.
// Sort may place equal elements in any order in the final result,
// while Stable preserves the original input order of equal elements.
Not applicable.
// Less must describe a transitive ordering:
// - if both Less(i, j) and Less(j, k) are true, then Less(i, k) must be true as well.
This will always be satisfied since Less(i, k)
is stated literally to be true regardless of preconditions.
// - if both Less(i, j) and Less(j, k) are false, then Less(i, k) must be false as well.
Will never occur since Less(i, j) and Less(j, k)
are both stated literally to be true.
Comment From: randall77
// Less reports whether the element with index i // must sort before the element with index j. This says nothing. We are defining what less means! So it's circular.
I don't agree. This says a lot. To me it means that Less(i, j) implies that s[i] must appear before s[j] in the output. It is not possible to satisfy that if i!=j and both Less(i, j) and Less(j, i) are true.
And this presumably also implies that we can't have Less(i, i) be true, because then s[i] must appear before itself. Depending on whether you agree that "before" here means "strictly before", which I think is clear but maybe there's a bit more wiggle room than the previous paragraph.
Comment From: jagprog5
That you for your patience @randall77
I don't agree. This says a lot.
Ok fair enough! So, if this statement:
// Less reports whether the element with index i
// must sort before the element with index j.
Describes everything that is needed. From that alone, we can derive all three strict weak ordering properties.
But then why is the transitive ordering line stated below that? It's confusing as it's like an elaboration on the specifics but only stating one.
There is a standard way of stating these requirements and only this one part of the stdlib isn't using it. If the documentation for Less is defining a strict weak ordering, then it should say so using those words for clarity! I've got someone at my work who defined a complex comparator but didn't know these properties, and improving the doc is helpful for others.
Comment From: randall77
Describes everything that is needed. From that alone, we can derive all three strict weak ordering properties.
I don't think you can derive transitivity from that.
Consider the case where only Less(1,2) and Less(2,3) return true. It is easy to satisfy "Less(i,j) => s[i] appears before s[j]", but that isn't a transitive relation. Particularly, Less(1,3) is false. In other words, !Less(1,3) does not imply that s[3] appears after s[1].
Comment From: randall77
I agree that we should probably make all the docs consistent, if in fact all our implementations require the same thing.
Comment From: jagprog5
@prattmic please consider removing "NeedsDecision"
Comment From: jagprog5
@prattmic please consider using a different tag. The work has been done - I've already open a patch request and PR.