Proposal Details

I propose the slices package recognize a new build tag like slicesoptsize (or just gooptsize if for all of std) that changes to slower implementations that result in smaller binaries, avoiding generics until the handful of generics-makes-big-binaries issues are addressed.

Basically, the build tag would be a declaration from the user akin to C compiler -Os saying that the user wants to optimize for size, not speed.

Background

While trying to optimize our binary for small environments (tiny Linux IoT devices), I noticed (using shotizam) that the slices package was using a surprising amount space in our final linux binaries.

It turns to to be because of generics and instantiating optimized sort routines for all shapes.

The generic slices.Sort, SortFunc, etc are now popular in the ecosystem (and thus in our dependencies) and while they can sort millions of items ~15% more efficiently, they can also make our binaries 632 KB larger.

As an experiment for benchmarking, in https://github.com/tailscale/go/pull/136 (CLA-signed, safe for Go contributors to look at + take), I just replaced the generic code with calls to the old slice sort code using reflect.Swapper:


 // Sort sorts a slice of any ordered type in ascending order.
 // When sorting floating-point numbers, NaNs are ordered before other values.
 func Sort[S ~[]E, E cmp.Ordered](x S) {
-       n := len(x)
-       pdqsortOrdered(x, 0, n, bits.Len(uint(n)))
+       sortSlice(x, func(i, j int) bool {
+               return x[i] < x[j]
+       })
 }

 // SortFunc sorts the slice x in ascending order as determined by the cmp
@@ -28,14 +30,36 @@ func Sort[S ~[]E, E cmp.Ordered](x S) {
 // See https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings.
 // The function should return 0 for incomparable items.
 func SortFunc[S ~[]E, E any](x S, cmp func(a, b E) int) {
-       n := len(x)
-       pdqsortCmpFunc(x, 0, n, bits.Len(uint(n)), cmp)
+       sortSlice(x, func(i, j int) bool {
+               return cmp(x[i], x[j]) < 0
+       })
 }

... etc.

It made a huge difference in binary size! See tailscale/go#136 for numbers.

% ~/go/bin/shotizam --sqlite before
sqlite> select sum(size) from bin where func like 'sort%' or func like 'slices%';
485649

After:

% ~/go/bin/shotizam --sqlite after
^[[ASQLite version 3.43.2 2023-10-10 13:08:14
Enter ".help" for usage hints.
sqlite> select sum(size) from bin where func like 'sort%' or func like 'slices%';
51370

The code that goes away is all the stuff like:

sqlite> .mode column
sqlite> .width 50
sqlite> select func, size from bin where func like 'sort%' or func like 'slices%'  order by 2 desc limit 20;
Func                                                Size
--------------------------------------------------  ----
slices.partitionCmpFunc[go.shape.struct { AllocByt  3680
es int64; FreeBytes int64; AllocObjects int64; Fre
eObjects int64; Stack []uintptr }]

slices.partitionCmpFunc[go.shape.struct { Name str  3680
ing; Href string; Desc string; Count int }]

slices.partitionCmpFunc[go.shape.7259c45d4bf1dcc9d  3648
80d21eb6239a2e2ab64cc46f70de9a02885c0f89f58e623]

slices.partitionCmpFunc[go.shape.struct { encoding  3552
/json.name string; encoding/json.nameBytes []uint8
; encoding/json.nameNonEsc string; encoding/json.n
ameEscHTML string; encoding/json.tag bool; encodin
g/json.index []int; encoding/json.typ reflect.Type
; encoding/json.omitEmpty bool; encoding/json.omit
Zero bool; encoding/json.isZero func(reflect.Value
) bool; encoding/json.quoted bool; encoding/json.e
ncoder encoding/json.encoderFunc }]

slices.partitionCmpFunc[go.shape.struct { github.c  3520
om/gaissmai/bart.n *github.com/gaissmai/bart.node[
go.shape.bool]; github.com/gaissmai/bart.is4 bool;
 github.com/gaissmai/bart.path github.com/gaissmai
/bart.stridePath; github.com/gaissmai/bart.depth i
nt; github.com/gaissmai/bart.idx uint; github.com/
gaissmai/bart.cidr net/netip.Prefix; github.com/ga
issmai/bart.val go.shape.bool }]

slices.partitionCmpFunc[go.shape.struct { github.c  3520
om/gaissmai/bart.n *github.com/gaissmai/bart.node[
go.shape.struct {}]; github.com/gaissmai/bart.is4
bool; github.com/gaissmai/bart.path github.com/gai
ssmai/bart.stridePath; github.com/gaissmai/bart.de
pth int; github.com/gaissmai/bart.idx uint; github
.com/gaissmai/bart.cidr net/netip.Prefix; github.c
om/gaissmai/bart.val go.shape.struct {} }]

slices.partitionCmpFunc[go.shape.struct { github.c  3520
om/gaissmai/bart.n *github.com/gaissmai/bart.node[
go.shape.*uint8]; github.com/gaissmai/bart.is4 boo
l; github.com/gaissmai/bart.path github.com/gaissm
ai/bart.stridePath; github.com/gaissmai/bart.depth
 int; github.com/gaissmai/bart.idx uint; github.co
m/gaissmai/bart.cidr net/netip.Prefix; github.com/
gaissmai/bart.val go.shape.*uint8 }]

slices.partitionCmpFunc[go.shape.struct { github.c  3104
om/go-json-experiment/json/jsontext.name []uint8;
github.com/go-json-experiment/json/jsontext.buffer
 []uint8 }]

slices.partitionCmpFunc[go.shape.struct { URL *net  3104
/url.URL; StatusCode int; ExpectedContent string;
SupportsTailscaleChallenge bool; Provider tailscal
e.com/net/captivedetection.EndpointProvider }]