Proposal Details

It's not uncommon to see code like this:

type Bread struct {
    Weight      int      `json:"weight" yaml:"weight" toml:"weight" xml:"weight"`
    Slices      int      `json:"slices" yaml:"slices" toml:"slices" xml:"slices"`
    WholeGrain  bool     `json:"whole_grain" yaml:"whole_grain" toml:"whole_grain" xml:"whole_grain"`
    SourDough   bool     `json:"sour_dough" yaml:"sour_dough" toml:"sour_dough" xml:"sour_dough"`
    Price       int      `json:"price" yaml:"price" toml:"price" xml:"price"`
    Ingredients []string `json:"ingredients,omitempty" yaml:"ingredients,omitempty" toml:"ingredients,omitempty" db:"ingredients,omitempty"`
}

If you want the struct to (un)marshal with more than one encoder, then you typically need to add a struct tag for every encoder. Two, three, or more types of struct tags are fairly common, and they often all have the same value.

Most encoders don't support changing the struct tag they look at (and probably shouldn't, either).

All of this is noisy, annoying to maintain, and hard to read.

I propose adding a new generic name alias, which can be re-used by different encoders. This would be similar to encoding.Text{Unmarshaler,Marshaler}. With that, the above would be:

type Bread struct {
    Weight      int      `text:"weight"`
    Slices      int      `text:"slices"`
    WholeGrain  bool     `text:"whole_grain"`
    SourDough   bool     `text:"sour_dough"`
    Price       int      `text:"price"`
    Ingredients []string `text:"ingredients,omitempty"`
}

Maybe text isn't the best name; but it does fit nicely with encoding.TextMarshaler / encoding.TextUnmarshaler. Alternatively it could be name, alias, or maybe something else.

The logic would be:

name, ok := lookupTag(field, "json") // Or yaml, xml, etc.
if !ok {
    name, ok = lookupTag(field, "text") // Generic fallback tag.
}
if !ok {
    name = field.Name // No tag at all: use the Go field name.
}

That is:

  1. Prefer marshaller-specific tag if it exists.
  2. Fall back to the generic text if it doesn't.
  3. And the field name if that also doesn't exist.

This allows setting reasonable defaults that should be fine for the common use case, and deviating from that if need be.

The only allowed option is omitempty (although that should be a vet check). A value of - means the field is skipped, as it works currently.
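To make the tag rules concrete, here is a small sketch of how an encoder might split a text tag value. The parseTextTag helper and its return values are hypothetical. Unknown options are only collected, not rejected, on the assumption that reporting them would be left to a vet check.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTextTag is a hypothetical helper that splits the value of a
// `text:"..."` tag into its name and options.
func parseTextTag(tag string) (name string, omitEmpty, skip bool, unknown []string) {
	// A bare "-" means the field is skipped entirely.
	if tag == "-" {
		return "", false, true, nil
	}
	name, opts, _ := strings.Cut(tag, ",")
	if opts == "" {
		return name, false, false, nil
	}
	for _, o := range strings.Split(opts, ",") {
		if o == "omitempty" {
			omitEmpty = true
		} else {
			// Assumption: collected for a vet-style check to report.
			unknown = append(unknown, o)
		}
	}
	return name, omitEmpty, false, unknown
}

func main() {
	for _, tag := range []string{"ingredients,omitempty", "-", "price,string"} {
		name, omit, skip, unknown := parseTextTag(tag)
		fmt.Printf("%q: name=%q omitempty=%v skip=%v unknown=%v\n", tag, name, omit, skip, unknown)
	}
}
```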


Concretely, the change required would be to implement this in the encoding/json, encoding/xml, and database/sql packages, and maybe document it somewhere. The wider ecosystem of parsers (YAML, TOML, various "fast json" packages, etc.) can then adopt this.

This change should be fairly low-complexity throughout, and simplify the life of everyone wanting to deal with more than one encoding format.

Comment From: gabyhelp

Related Issues and Documentation

Comment From: ianlancetaylor

CC @dsnet @mvdan

Comment From: seankhliao

Let's fold into #60791, seems to be the same, including the only option being omitempty