Proposal Details

It seems the json encoder and decoder have significant overhead when escaping strings. I've attached a set of benchmarks at the end of this report. In short, I have a large string (hex in this case) that I would like to insert into a JSON field. My benchmarks just JSON-encode that single hex value.

BenchmarkMarshalString-12                    162       7383418 ns/op
BenchmarkMarshalRawJSON-12                    42      28148749 ns/op
BenchmarkMarshalTexter-12                    153       7785682 ns/op
BenchmarkMarshalJsoner-12                     40      28960272 ns/op
BenchmarkMarshalCopyString-12               4141        263625 ns/op

I would expect the performance to be close to the speed of copying the data. However, Go seems to do a lot of extra processing. This report questions several parts of that:

  • I can imagine Go wanting to double check the content of a string, but in that case it would be nice to have a way to tell the json encoder/decoder that I know the content is valid and it should just process it without wasting a ton of time.
  • I expected RawMessage to skip all pre- and post-processing, but alas, Go elegantly ignores that it's "raw" and still does everything.
  • Annoyingly enough, for types that implement MarshalJSON, it seems the escaping runs 3 (!!!) times. I haven't found the 3rd one, but I think two of them are https://github.com/golang/go/blob/master/src/encoding/json/encode.go#L587 and the line right after it, where both lines do an appendString call, which internally does the escape checks (yeah, the noescape flag only disables HTML escape checking, not ASCII escape checking).
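The post-processing of RawMessage described above can be observed from outside the package. Here is a minimal sketch: it only shows that the bytes are validated, compacted and HTML-escaped, not how many passes the encoder makes internally. The helper names marshalRaw and encodeRawNoHTMLEscape are made up for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// marshalRaw (hypothetical helper) passes raw JSON text through
// json.Marshal wrapped in a json.RawMessage.
func marshalRaw(raw string) (string, error) {
	out, err := json.Marshal(json.RawMessage(raw))
	return string(out), err
}

// encodeRawNoHTMLEscape (hypothetical helper) does the same through an
// Encoder with HTML escaping switched off.
func encodeRawNoHTMLEscape(raw string) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	err := enc.Encode(json.RawMessage(raw))
	return buf.String(), err
}

func main() {
	// Inputs: insignificant whitespace, HTML-sensitive characters, and
	// invalid JSON. None of them is copied through untouched.
	for _, raw := range []string{`{ "a" : 1 }`, `"<x>"`, `{`} {
		out, err := marshalRaw(raw)
		fmt.Printf("Marshal(%s) -> %q, err=%v\n", raw, out, err)
	}

	// With HTML escaping off, '<' and '>' survive, but the whitespace is
	// still stripped, so the bytes are still scanned.
	out, err := encodeRawNoHTMLEscape(`{ "a" : "<x>" }`)
	fmt.Printf("Encoder -> %q, err=%v\n", out, err)
}
```

Since whitespace removal and validation happen even with SetEscapeHTML(false), the raw bytes are walked at least once regardless of that flag.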

I'm not even entirely sure what the solution to the various issues is.

  • I'd expect to be able to use the json package without escaping.
  • I'd expect RawMessage to not be post-processed.
  • I'd expect the escaping code to be fast, and not take more time than encoding all the fields.
  • I'd expect encoding to run once, not 3 times.
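Until something like the first point exists, one workaround is to bypass the encoder for the hot field and assemble the document by hand, which is essentially what BenchmarkMarshalCopyString measures. A rough sketch follows; appendQuoted is a hypothetical helper, and it assumes the caller guarantees the string contains nothing that needs JSON escaping (true for hex).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// appendQuoted (hypothetical helper) appends s to dst as a JSON string
// WITHOUT escaping. This is an assumption-laden shortcut: it is only
// correct when s has no quotes, backslashes, control characters or
// invalid UTF-8, as is the case for hex output.
func appendQuoted(dst []byte, s string) []byte {
	dst = append(dst, '"')
	dst = append(dst, s...)
	return append(dst, '"')
}

func main() {
	payload := "00ff10" // stands in for the large hex string

	// Build {"data":"<payload>"} with one allocation and plain copies.
	buf := make([]byte, 0, len(payload)+len(`{"data":""}`))
	buf = append(buf, `{"data":`...)
	buf = appendQuoted(buf, payload)
	buf = append(buf, '}')

	// json.Valid is used here only as a sanity check; calling it on the
	// hot path would reintroduce a full scan of the data.
	fmt.Println(string(buf), json.Valid(buf))
}
```

The catch is that the result cannot be handed back to encoding/json through MarshalJSON or RawMessage without being reprocessed, so the whole enclosing document has to be written this way.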
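
// Benchmarks (a _test.go file; the package name here is arbitrary).
package bench

import (
    "bytes"
    "encoding/hex"
    "encoding/json"
    "testing"
)

func BenchmarkMarshalString(b *testing.B) {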
    src := bytes.Repeat([]byte{'0'}, 4194304)
    str := hex.EncodeToString(src)

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        json.Marshal(str)
    }
}

func BenchmarkMarshalRawJSON(b *testing.B) {
    src := bytes.Repeat([]byte{'0'}, 4194304)
    msg := json.RawMessage(`"` + hex.EncodeToString(src) + `"`)

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        json.Marshal(msg)
    }
}

func BenchmarkMarshalTexter(b *testing.B) {
    src := bytes.Repeat([]byte{'0'}, 4194304)
    txt := &Texter{str: hex.EncodeToString(src)}

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        json.Marshal(txt)
    }
}

func BenchmarkMarshalJsoner(b *testing.B) {
    src := bytes.Repeat([]byte{'0'}, 4194304)
    jsn := &Jsoner{str: hex.EncodeToString(src)}

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        json.Marshal(jsn)
    }
}

func BenchmarkMarshalCopyString(b *testing.B) {
    src := bytes.Repeat([]byte{'0'}, 4194304)
    str := hex.EncodeToString(src)

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        buf := make([]byte, len(str)+2)
        buf[0] = '"'
        copy(buf[1:], str)
        buf[len(buf)-1] = '"'
    }
}

type Texter struct {
    str string
}

func (t Texter) MarshalText() ([]byte, error) {
    return []byte(t.str), nil
}

type Jsoner struct {
    str string
}

func (j Jsoner) MarshalJSON() ([]byte, error) {
    return []byte(`"` + j.str + `"`), nil
}
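
As a sanity check on the comparison, the four json.Marshal variants above should produce byte-identical output for a hex string, so the timing differences come from encoder overhead and not from differing results. A small sketch of that check, with the types repeated so it runs on its own (allEqual is a name made up for this example):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type Texter struct{ str string }

func (t Texter) MarshalText() ([]byte, error) { return []byte(t.str), nil }

type Jsoner struct{ str string }

func (j Jsoner) MarshalJSON() ([]byte, error) { return []byte(`"` + j.str + `"`), nil }

// allEqual (hypothetical helper) marshals hexStr through each benchmarked
// path and reports whether every result equals the quoted string.
func allEqual(hexStr string) (bool, error) {
	want := []byte(`"` + hexStr + `"`)
	values := []any{
		hexStr,
		json.RawMessage(want),
		&Texter{str: hexStr},
		&Jsoner{str: hexStr},
	}
	for _, v := range values {
		got, err := json.Marshal(v)
		if err != nil {
			return false, err
		}
		if !bytes.Equal(got, want) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	ok, err := allEqual("3030")
	fmt.Println(ok, err)
}
```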

Comment From: ianlancetaylor

Let's focus further encoding/json optimization discussions on encoding/json/v2. #63397

Comment From: karalabe

Ah, works for me!