What version of Go are you using (go version
)?
$ go version 1.17.1
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env
)?
go env
Output
$ go env ➜ ~ go env GO111MODULE="" GOARCH="amd64" GOBIN="" GOCACHE="/home/chrisg/.cache/go-build" GOENV="/home/chrisg/.config/go/env" GOEXE="" GOEXPERIMENT="" GOFLAGS="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOINSECURE="" GOMODCACHE="/home/chrisg/pkg/mod" GONOPROXY="" GONOSUMDB="" GOOS="linux" GOPATH="/home/chrisg" GOPRIVATE="" GOPROXY="https://proxy.golang.org,direct" GOROOT="/home/chrisg/sdk/go1.17.1" GOSUMDB="sum.golang.org" GOTMPDIR="" GOTOOLDIR="/home/chrisg/sdk/go1.17.1/pkg/tool/linux_amd64" GOVCS="" GOVERSION="go1.17.1" GCCGO="gccgo" AR="ar" CC="gcc" CXX="g++" CGO_ENABLED="1" GOMOD="/dev/null" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build237025076=/tmp/go-build -gno-record-gcc-switches"
What did you do?
Parsed a url with a non url-encoded semicolon in a value of a query string parameter https://go.dev/play/p/n68WBqiRmkt
u, err := url.Parse("http://foo.com/?bar=;&baz=foobar&abc&xyz=&ikj=n;m")
if err != nil {
panic(err)
}
fmt.Printf("%+v\n", u)
fmt.Printf("%+v\n", u.Query())
fmt.Printf("%+v\n", u.RawQuery)
q, err := url.ParseQuery(u.RawQuery)
if err != nil {
panic(err)
}
fmt.Printf("%+v\n", q)
What did you expect to see?
The value parsed literally, and successfully - e.g.,
map[abc:[] baz:[foobar] xyz:[] bar:[;] ikj=n;m]
What did you see instead?
http://foo.com/?bar=;&baz=foobar&abc&xyz=&ikj=n;m
map[abc:[] baz:[foobar] xyz:[]]
bar=;&baz=foobar&abc&xyz=&ikj=n;m
panic: invalid semicolon separator in query
goroutine 1 [running]:
main.main()
/tmp/sandbox883685007/prog.go:22 +0x1eb
Program exited.
There have been a number of issues related to the new semicolon handling (some of which raised by myself): - https://github.com/golang/go/issues/49683 (closed as a duplicate of https://github.com/golang/go/issues/47425 -- I don't believe it is) - https://github.com/golang/go/issues/49399 - https://github.com/golang/go/issues/47425
The original issue, and proposal to add AllowQuerySemicolons
:
- https://github.com/golang/go/issues/25192
- https://github.com/golang/go/issues/45973
It's not clear why the current behavior has been chosen. The oringal issue only highlighted semicolons as an issue as a seperator. The current implementation ignores them entirely -- which is incredibly frustrating.
The fact of the matter is that urls containing raw semicolons do exist and those parameter values have meaning.
I'd really appreciate if we could have a dialog about potential options. I'm willing to invest time to do any implementation, but it's not clear to me what the powers that be will find acceptable.
As it stands, the issues introduced by the url parsing change in 1.17 has prevented upgrades at my org. I'd really like to help find a safe & viable way forward
Comment From: seankhliao
cc @FiloSottile
Comment From: chrisguiney
Adding from the rfc: https://datatracker.ietf.org/doc/html/rfc3986#section-2.2
If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.
where
```reserved = gen-delims / sub-delims gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
note `;` in `sub-delims`
Given that `;` as a delimiting role has been rejected, it reads that the only correct thing to do is parse `;` as though it were any other non-reserved character
**Comment From: ianlancetaylor**
No version of Go ever produced what you expect to see:
map[abc:[] bar:[;] baz:[foobar] ikj=n;m xyz:[]]
Go 1.16 and earlier produces this:
map[abc:[] bar:[] baz:[foobar] ikj:[n] m:[] xyz:[]]
The change in semicolon behavior was chosen to prevent mishandling of invalid queries while diverging as little as possible from earlier Go behavior. See https://github.com/golang/go/issues/25192#issuecomment-824216147.
**Comment From: chrisguiney**
@ianlancetaylor you are correct, but under no circumstance would I as a user expect the current behavior. That is, it was wrong, and it's still wrong....but worse, because now data is gone, and my logs are flooded.
**Comment From: chrisguiney**
@ianlancetaylor to your point though, I did mis-attribute the current situation with #25192 -- they're not strictly related. Please forgive the mistake, I'm a bit frustrated at the whole thing, and perhaps wrote up the issue in a bit of haste (and with doubt that anything would come of it)
They are somewhat tangentially related in that
1. when merely upgrading without any more thought, my logs get flooded, because I deal with _many_ urls with semicolons in query string parameter value. I don't immediately notice any behavior change though, because in my case, I'm forwarding using `url.RawQuery` to apache , which doesn't treat them as separators, but does parse them as described in the RFC quote above.
2. when adding `http.AllowQuerySemicolons`, all `;` get replaced with `&` on the `url.RawQuery` -- leaving the keys, but removing the values. Log messages go away, which is nice, but without the value being sent upstream, it's not viable for me.
when facing the two issues above, and seeing that go now either errors out (if using `url.ParseQuery`) or silently ignores (if using `(*URL).Query()`), it was clear to me that the current behavior was definitely not correct...but instead of being able to blissfully ignore it as I had in past releases, I have to do _something_ about it.
**Comment From: chrisguiney**
In an effort to stay engaged and productive on this - I have a couple questions:
- Is there any path forward for raw semicolons in the url? That is, is the go team even amenable to continuing the conversation, or is this considered a solved problem/settled issue? It seems like the current behavior was a deliberate choice. Is there room to revisit that?
- Is this better formulated as a proposal? From my point of view, it's certainly a bug, but I can see an argument against that.
While informative, @ianlancetaylor 's response didn't really give any direction on the issue
**Comment From: ianlancetaylor**
I'm not closely involved with this issue. That said, Go pays a lot of attention to backward compatibility. If we completely change the handling of semicolons in the net/url and net/http packages, then it seems to me that we risk silently changing the behavior of existing Go programs. It seems to me that that might open up security holes, which would be bad.
Can you address your specific issue by taking the URL string and converting `;` to `%3B` before passing it to `url.ParseQuery`? I'm not trying to claim that that is the ideal solution but perhaps it would help you move forward.
Whether that helps or not, please don't expect any quick resolutions here. It's hard to find paths that work for as many people as possible.
**Comment From: FiloSottile**
The current behavior was the least-bad option in a minefield, and it's motivated by trying to minimize the situations in which parser mis-alignment between a proxy and an application leads to security vulnerabilities.
As mentioned in https://github.com/golang/go/issues/49399 we'll remove the log line now that it has done its job, which was alerting applications of potential breakage. That should solve the issue for anyone who's not parsing the query but just forwarding RawQuery.
Anyone who wants the pre-Go 1.17 behavior can opt-in with AllowQuerySemicolon.
The only thing that's not addressed is applications that wish to parse *unencoded* semicolons as non-separators. Are there a lot of clients that send unescaped semicolons expecting them to NOT be handled as separators?
If yes, maybe we could add a EscapeQuerySemicolon Handler?
func EscapeQuerySemicolons(h Handler) Handler { return HandlerFunc(func(w ResponseWriter, r Request) { if strings.Contains(r.URL.RawQuery, ";") { r2 := new(Request) r2 = r r2.URL = new(url.URL) r2.URL = *r.URL r2.URL.RawQuery = strings.ReplaceAll(r.URL.RawQuery, ";", "%3B") h.ServeHTTP(w, r2) } else { h.ServeHTTP(w, r) } }) }
**Comment From: chrisguiney**
> The only thing that's not addressed is applications that wish to parse unencoded semicolons as non-separators. Are there a lot of clients that send unescaped semicolons expecting them to NOT be handled as separators?
> If yes, maybe we could add a EscapeQuerySemicolon Handler?
@FiloSottile I work with a lot of really wacky urls. I'm certain that there are *some* clients that will send unescaped semicolons expecting them not to be separators.
I invite you to engage with my proposal for adding a query parsing interface (https://github.com/golang/go/issues/56300). URL parsing has been shown to be very sensitive to change. It's also something that keeps cropping up in different areas, such as `httputil.ReverseProxy`, and `http.Client`.
In the `http.Client` case, there's currently no way to change how it behaves on parsing a `Location` header prior to following a redirect. (I just discovered this the other day, and haven't had a chance to note it on the proposal yet)
The challenges have lead me to believe the best way forward is to provide interfaces, and provide a safe and sane default implementation. Let users that have particular needs supply their own, so they can get consistent behavior across the various http libraries without worrying about the next go release subtly changing behavior.
**Comment From: vmallet**
> The only thing that's not addressed is applications that wish to parse _unencoded_ semicolons as non-separators. Are there a lot of clients that send unescaped semicolons expecting them to NOT be handled as separators?
@FiloSottile just as an example of something probably pretty common:
Share directions from Google Maps on iOS, get a short URL:
https://maps.app.goo.gl/Pu8hswB8Tu9an9Gb8?g_st=ic
Follow the redirect to get the full URL, here's what's returned by Google:
https://maps.google.com/?geocode=Fd8FPgId1pq0-ClVVVVVjHePgDE_7c0aV1zypA%3D%3D;FXshOgIdFYO7-CmbxbP6w8uPgDE-fvZtP0T6vA%3D%3D&daddr=sjc&saddr=sfo&dirflg=d&ftid=0x808f778c55555555:0xa4f25c571acded3f;0x808fcbc3fab3c59b:0xbcfa443f6df67e3e&lucs=,47071704&g_ep=CAISBjYuODQuNBgAIJScASoJLDQ3MDcxNzA0QgJVUw%3D%3D&g_st=ic
(Found my way here after upgrading a service running some ancient Go to 1.21).
**Comment From: frankli0324**
just want to drop another example of google service using semicolons in url query

at *the* google website

at google docs
and guess what, with Google Chrome
irony apart, I also want to mention that `http.Request.ParseForm` also uses `url.ParseQuery` for parsing "application/x-www-form-urlencoded" request bodies. Request bodies are often serialized in a different logic with the url query.
> that might open up security holes
I don't remember any other url implementations, especially those handling form bodies, not taking ';' as a legal non-reserved character. do you mean some applications assumes that `Go` rejects semicolon when parsing queries, and relies on that behavior? otherwise I don't see any other "back-compatibility" issue. also this is *very* wierd and surprising behavior to me so I personally don't believe anyone is *relying* on this behavior.
On security side, I not only disagree that this strengthens security, but maybe even introducing some of the security risks. consider this scenario where a golang service acts as a WAF in front of a service, parses the request and looks for malicious parameters. an attacker may incapacitate the WAF by simply including a semicolon at any part of the query.
(Found my way here after realizing Go is rejecting around 0.0003% of requests on a *payment* service taking 7k qps).
**Comment From: MUCZ**
Just encountered the same issue with using semicolons in URL query strings using `go1.21`. In my use case, semicolons are a common part of query parameter values, and I believe this is true for many other applications as well.
The `EscapeQuerySemicolons` proposed above works but it's not yet part of the standard library, and I have concerns about its efficiency.
**Comment From: jub0bs**
I think it's high time to revisit this issue.
@FiloSottile
> The only thing that's not addressed is applications that wish to parse unencoded semicolons as non-separators. Are there a lot of clients that send unescaped semicolons expecting them to NOT be handled as separators?
As others have noted, the current behaviour creates interoperability problems, since it's not compliant with the URL Living Standard. The latter indeed [makes no special case for semicolons in the query string](https://url.spec.whatwg.org/#urlencoded-parsing). For instance, all modern browsers behave as follows:
```js
const params = new URLSearchParams('bar=;&baz=foobar&abc&xyz=&ikj=n;m');
params.get('bar'); // ';'
params.get('baz'); // 'foobar'
params.get('abc'); // ''
params.get('xyz'); // ''
params.get('ikj'); // 'n;m'
Because net/url
has been choking on semicolons in the query string for years (since Go 1.17), I expect very few clients of Go servers to still rely on semicolons as query-param separators. Couldn't this "cooldown period" serve as a scaffold for tolerating the presence of semicolons in the query string again, albeit not as query-param separators?
I cannot think of opportunities for Web-cache poisoning that this would create... If anything, aligning with the URL Living Standard can only reduce the potential for parser differentials.
Comment From: jub0bs
Funnily enough, gorilla/mux still relies on the old behaviour of splitting on both ;
and &
, thereby creating a parser differential with net/url: https://github.com/gorilla/mux/issues/781