like https://github.com/golang/go/issues/28414, but for all null values
copied from https://github.com/golang/go/pull/57066:
Currently, when scanning a NULL database value into a go type that is not nil-able, this package returns an error like "sql: Scan error on column index 0, name "foo": converting NULL to int is unsupported". This makes logical sense; nil and 0 are different values, and only the latter can be assigned to an int. The Null* types (NullString, NullInt64, etc) account for this difference, giving programmers the ability to safely bridge the gap between nullable SQL values and primitive go types.
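For concreteness, a minimal sketch of that status quo (the foo/bar schema is invented for illustration): scanning the nullable column into sql.NullString is safe, and the programmer bridges back to a plain string by hand.

import "database/sql"

func readBar(db *sql.DB, id int) (string, error) {
    var bar sql.NullString // handles NULL explicitly
    if err := db.QueryRow("SELECT bar FROM foo WHERE id = ?", id).Scan(&bar); err != nil {
        return "", err
    }
    if !bar.Valid {
        return "", nil // NULL: fall back to the zero value by hand
    }
    return bar.String, nil
}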
I propose we give programmers the option to change that behavior and instead ignore NULL values. There are a few reasons I think this option makes sense:
Alignment with programmer intent
In go, if I want a struct with a string field, I would write type foo struct{ bar string }
In mysql, if I want a table with a string column, I would write CREATE TABLE foo (bar varchar(255));
But this produces incompatible types -- in mysql, any value can be NULL unless you explicitly say otherwise. If we wanted to enforce that our foo.bar column could never be NULL, we would have to write CREATE TABLE foo (bar varchar(255) NOT NULL);
This nuance is seldom clear to programmers, and is a lesson that generally only gets learned the hard way. For instance, if you google "go sql null", you will get back numerous pages of blog posts, stack overflow posts, etc -- whether or not it is "right", the undeniable fact is that this behavior subverts programmer expectations on a regular basis.
Furthermore, when a programmer writes some go code to fetch sql data and work with it in a go program, choosing to type their variable as a non-nilable type is a signal that they do not need to write special handling for the null case. It isn't up to us to determine whether or not they should; by writing their code this way, the programmer has indicated that they have no use for this distinction. When we force them to care (at runtime, often long after the code has been written, when we unexpectedly encounter a NULL value in the database), this produces broken software that isn't always straightforward to triage/remediate.
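To make that failure mode concrete, a hedged sketch (same invented foo/bar schema as above): this compiles and works on every non-NULL row, then breaks at runtime the first time bar is NULL.

var bar string // signals "I don't need to handle NULL" -- but database/sql disagrees
err := db.QueryRow("SELECT bar FROM foo WHERE id = ?", id).Scan(&bar)
// on a NULL row:
// sql: Scan error on column index 0, name "bar": converting NULL to string is unsupported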
Fundamentally, the responsibility of this package is to bridge the gap between sql and go. Accounting for the disparate treatment of NULL in these two languages is currently the responsibility of the programmer, but by reconciling this difference in the translation layer, we can prevent bugs and make it easier to write great go code.
Consistency with json unmarshaling behaviors
As shown in this playground, unmarshaling json with NULL values into non-nilable types already works according to the programmer expectations described above -- we leave these values untouched. Whether or not the json unmarshaler is "right" is up for debate (see https://github.com/golang/go/issues/33835), but the fact of the matter is that this will likely have to remain the default behavior forever. The fact that these two standard libraries have opposite opinions is unfortunate, and can probably never be changed. However, we compound this issue by giving programmers no mechanism to make them behave the same way. So, every go programmer has to be aware of this distinction and correctly design their types for the environment(s) that they expect to instantiate them from. This is clunky at best, and in practice it creates vast swaths of bug habitat.
It would be great to introduce an analogous configuration option in the standard json unmarshaler (or perhaps even reuse the same configuration flag), but that is out of scope for this issue.
For the above reasons, it's important to give programmers the ability to opt out of this behavior. Using Null* types or pointers to primitives remains viable for situations where we need to distinguish between NULL and other values, but in situations where the programmer does not care about this distinction, we should allow them this freedom.
I'm not totally sure of the most standard way to specify this bit of configuration; I think it would likely suffice for it to be a global setting (not per connection, transaction, statement, etc).
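In the meantime, the proposed behavior can be approximated in user code with a custom sql.Scanner that simply leaves its value untouched on NULL. A minimal sketch (the NullIgnoring name is invented for illustration, and sql.Null[T] requires Go 1.22+ and import "database/sql"):

// NullIgnoring ignores NULL on Scan, leaving V at whatever it already holds.
type NullIgnoring[T any] struct{ V T }

func (n *NullIgnoring[T]) Scan(src any) error {
    if src == nil {
        return nil // NULL: keep the current value, as encoding/json does for null
    }
    var v sql.Null[T]
    if err := v.Scan(src); err != nil { // reuse database/sql's conversion logic
        return err
    }
    n.V = v.V
    return nil
}

Usage: declare var bar NullIgnoring[string], pass &bar to Scan, and read bar.V afterwards.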
Comment From: rittneje
Why is this necessary over using COALESCE/IFNULL in the SQL query itself?
unmarshaling json with NULL values into non-nilable types already works according to the programmer expectations described above
The JSON unmarshaler does not turn null into the zero value. Rather, it treats this the same as the field being absent.
https://go.dev/play/p/iTpNi9LewMR
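A condensed sketch of the behavior described here (assuming the usual encoding/json and fmt imports): unmarshaling null leaves a pre-set field exactly as it was, the same as omitting it entirely.

type foo struct {
    Bar string `json:"bar"`
}

f := foo{Bar: "preset"}
_ = json.Unmarshal([]byte(`{"bar": null}`), &f)
fmt.Println(f.Bar) // "preset" -- null had no effect, just like an absent field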
choosing to type their variable as a non-nilable type is a signal that they do not care about the distinction between NULL and zero
It could also mean that they have an application bug. You could make the same argument about Go turning nil pointer dereferences into the corresponding zero value.
Comment From: MisterSquishy
Why is this necessary over using COALESCE/IFNULL in the SQL query itself?
This is preferable as it doesn't require the programmer to know/remember whether or not a given SQL column is nullable. It's not necessary, though; you're right that writing all your queries this way would avoid the problem too (as would remembering to include NOT NULL in the definition of every column you intend to someday scan into a non-nilable type).
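For concreteness, the query-side workaround looks like this (invented schema again); it works, but you must remember it for every nullable column in every query:

var bar string
err := db.QueryRow("SELECT COALESCE(bar, '') FROM foo WHERE id = ?", id).Scan(&bar)
// NULL arrives as '' instead of failing the Scan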
The JSON unmarshaler does not turn null into the zero value. Rather, it treats this the same as the field being absent
This is a great point, thanks! I was mistaken about how that worked. The sql package should be able to behave like the JSON unmarshaler, so I think we should instead give the option to ignore null values (and not coalesce them to zero). I'll update the proposal to be less prescriptive.
It could also mean that they have an application bug. You could make the same argument about Go turning nil pointer dereferences into the corresponding zero value.
Agreed that there are scenarios where programmers would not want this behavior (as described in https://github.com/golang/go/issues/33835). We should allow programmers to make this decision for themselves.
Comment From: hungtcs
This problem has bothered me for days. I finally realized that the data cannot be scanned directly into the destination; my current workaround is to use a data wrapper.
// DatabaseValueScanner captures a single column value, including NULL,
// without triggering a conversion error from database/sql.
type DatabaseValueScanner struct {
    Name  string // column name
    Type  string // database type name, e.g. "VARCHAR"
    Value any
}

func (scanner *DatabaseValueScanner) Scan(src any) error {
    switch value := src.(type) {
    case []uint8:
        scanner.Value = string(value) // raw bytes become a string
    case nil:
        scanner.Value = nil // NULL passes through instead of erroring
    default:
        scanner.Value = value // int64, float64, bool, time.Time, ...
    }
    return nil
}
Here is a common use case:

columnTypes, err := rows.ColumnTypes()
if err != nil {
    return err
}
// one scanner per column (not per row)
pointers := make([]any, len(columnTypes))
for i := range pointers {
    pointers[i] = &DatabaseValueScanner{
        Name: columnTypes[i].Name(),
        Type: columnTypes[i].DatabaseTypeName(),
    }
}
for rows.Next() {
    if err := rows.Scan(pointers...); err != nil {
        return err
    }
    var values []any
    for _, valueScanner := range pointers {
        values = append(values, valueScanner.(*DatabaseValueScanner).Value)
    }
    dataset.Rows = append(dataset.Rows, values)
}
Comment From: arvenil
IMHO this should be aligned with json.Unmarshal. I continually see issues with scanning sql results into structs, but thankfully the same is not true of json unmarshaling.
Whenever I'm faced with the issue, the solution ~100% of the time is to use COALESCE/CONCAT, and I've been doing that since the beginning of go. It's a huge waste of time. What's worse, inexperienced developers often resort to sql.Null* types, leading to bloated code, leaking sql.Null* into handlers, and often introducing additional bugs when they mishandle the Valid flag.
I don't see a need to keep the current scan error; if NULL is a valid value, one can just switch to a sql.Null* type.