Love seeing meaningful stdlib improvements.
I just ran our full suite of a few thousand unit tests with GOEXPERIMENT=jsonv2 and they all passed. (well, one test failed because an error message was changed, but that's on us)
I'm especially a fan of breaking out the syntactic part into its own jsontext package. It makes a ton of sense, and I could see us implementing a couple parsers on top of that to get better performance where it really matters.
I wish they would take this chance to ditch omitempty in favor of just the newly-added omitzero (which is customizable with IsZero()), to which we'll be switching all our code over Real Soon Now. The two tags are so similar that it takes effort to decide between them.
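For anyone weighing the two tags, here is a minimal sketch of how they differ in v1 encoding/json. It assumes Go 1.24 or later, where omitzero is available; the Event type and the encode helper are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event is a hypothetical type used only for illustration.
type Event struct {
	Name    string    `json:"name"`
	Retries int       `json:"retries,omitempty"` // dropped when 0
	Started time.Time `json:"started,omitempty"` // v1 omitempty never drops a struct value
	Ended   time.Time `json:"ended,omitzero"`    // dropped when Ended.IsZero() reports true
}

// encode marshals v with v1 encoding/json and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Started survives as the zero timestamp, Ended is omitted.
	fmt.Println(encode(Event{Name: "build"}))
}
```

The zero time.Time is the usual motivating case: omitempty leaves it in the output, while omitzero consults the IsZero method and drops it.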
Benchmark Analysis: Sonic vs Standard JSON vs JSON v2 in Go
https://github.com/centralci/go-benchmarks/tree/b647c45272c7...
Those numbers look similar to goccy. I used to use it in the past, and even Kubernetes uses it as a direct dependency, but the number of open issues has been piling up for quite some time, so I no longer trust it.
So it seems both are operating at the edge of Go's capabilities.
Personally, I think JSON should be in Go's core as highly optimised SIMD C code, not in Go's std library as standard Go code. As JSON is such an important part of the web nowadays, it deserves to be treated with more care.
IIRC, sonic does JIT, has inline assembly (github says 41%), and it's huge. There's no way you can audit it. If you don't need to squeeze every cpu cycle out of your json parser (and most of us don't; go wouldn't be the first choice for such performance anyway), I'd stick with a simpler implementation.
And Sonic with its "cutting edge" optimization is still slower than std JSON on arm64 with basic use cases. It shows that JIT, SIMD, and low-level code come at the cost of maintenance across all platforms.
https://github.com/bytedance/sonic/issues/785
first of all, that doesn't exercise JSON v2 at all, afaict
second of all, sonic apparently uses unsafe to (unsafe-ly) cast byte slices to strings, which of course is gonna be faster than doing things correctly, but is also of course incomparable to doing things correctly
like almost all benchmark data posted to hn -- unsound, ignore
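To illustrate why a zero-copy cast is not comparable to a copying conversion, here is a minimal sketch. It is not Sonic's code, only the general technique, and it assumes Go 1.20 or later for unsafe.String and unsafe.SliceData; zeroCopyString and demo are hypothetical helpers.

```go
package main

import (
	"fmt"
	"unsafe"
)

// zeroCopyString reinterprets b as a string without copying.
// The result aliases b's memory, so it is only valid while b is never modified.
func zeroCopyString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// demo returns a copied string and an aliased string built from the same
// buffer, after the buffer has been mutated.
func demo() (copied, aliased string) {
	buf := []byte("hello")
	copied = string(buf)          // allocates and copies the bytes
	aliased = zeroCopyString(buf) // shares memory with buf
	buf[0] = 'j'                  // breaks the "strings are immutable" contract for aliased
	return copied, aliased
}

func main() {
	copied, aliased := demo()
	fmt.Println(copied)  // the copy is unaffected by the later write
	fmt.Println(aliased) // the aliased string observes the write
}
```

The cast saves an allocation per string, but any reuse of the input buffer silently changes strings that were already handed out, which is the correctness cost being pointed at.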
I will say this, and I feel it's true: dealing with JSON in Go is a pain. You should be able to write JSON and not have to care about the marshalling and the unmarshalling. That's the way serde in Rust behaves, and more or less every other language I've had to deal with. Having to manage this behavior yourself gets complicated when there are multiple writers.
Are you referring to the json macro that allows variable interpolation? Doing that will void type safety. Might be useful in dynamic languages like Python but I wouldn’t want to trade type safety for some syntactic sugar in Go
> serde rust
That does look a lot cleaner. I was just grumbling about this in golang yesterday (yaml, not json, but effectively the same problem).
>Over time, packages evolve with the needs of their users, and encoding/json is no exception
No, it's an exception. It was badly designed from the start - it's not just that people's json needs (which hardly changed) outgrew it.
A bad design doesn't invalidate the sentence you have quoted.
Over time, it became evident that the JSON package didn't meet the needs of its users, and the package has evolved as a result. The size of the evolution doesn't matter.
Well, mostly I have seen people learn the shortcomings of software by using or creating it, and come up with a new version when possible. In your case it seems v1s are perfect each time.
Please do run this on your own workloads! It's fairly easy to set up and run. I tried it a few weeks ago against a large test suite and saw huge perf benefits, but also found a memory allocation regression. In order for this v2 to be a polished release in 1.26, it needs a bit more testing.
> Since encoding/json marshals a nil slice or map as a JSON null
How did that make it into the v1 design?
I had a back and forth with someone who really didn't want to change that behavior. Their reasoning was that since you can create and provide an empty map or slice, having the marshaler do that for you, and then also needing a way to disable that behavior, was unnecessary complexity.
how is a nil map not null? It certainly isn’t a zero-valued map, that would be {}.
Well, those are different things, aren't they? An empty slice/map is different from nil. So it makes a lot of sense that nil = null and []string{} = [], and you have the option to use both. That being said, it starts to make less sense if you work with Go, where the API mostly treats them as equivalent (append, len, []). So that would be my guess as to how it ended up the way it did.
Also, now that nil map is an empty object, shouldn't that extend to every nil struct that doesn't have a custom marshaller? It would be an object if it wasn't nil after all...
Why shouldn't it be? The nil is null and empty array is an empty array, they are completely different objects.
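For readers who haven't hit this, a minimal sketch of the v1 behavior being debated, using only encoding/json as it exists today. Payload and encode are hypothetical names for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Payload is a hypothetical type used only for illustration.
type Payload struct {
	Tags  []string          `json:"tags"`
	Attrs map[string]string `json:"attrs"`
}

// encode marshals v with v1 encoding/json and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Nil slice and nil map: v1 writes JSON null for both.
	fmt.Println(encode(Payload{}))

	// Empty but non-nil slice and map: v1 writes [] and {}.
	fmt.Println(encode(Payload{Tags: []string{}, Attrs: map[string]string{}}))
}
```

Both values behave the same under len, range and append, which is why the difference in JSON output tends to surprise API consumers.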
Could somebody give a high level overview of this for me, as not a godev? It looks like Go JSON lib has support to encode native go structures in JSON, which is cool, but maybe it was bad, which is not as cool. Do I have that right?
Go already has a JSON parser and serializer. It kind of resembles the JS api where you push some objects into JSON.stringify and it serializes them. Or you push some string and get an object (or string etc) from JSON.parse.
The types themselves have a way to customize their own JSON conversion code. You could have a struct serialize itself to a string, an array, do weird gymnastics, whatever. The JSON module calls these custom implementations when available.
The current way of doing it is shit though. If you want to customize serialization, you need to return a json string basically. Then the serializer has to check if you actually managed to return something sane. You also have no idea if there were some JSON options. Maybe there is an indentation setting or whatever. No, you return a byte array.
Deserialization is also shit because a) again, no options. b) the parser has to send you a byte array to parse. Hey, I have this JSON string, parse it. If that JSON string is 100MB long, too bad, it has to be read completely and allocated again for you to work on because you can only accept a byte array to parse.
The new API fixes these. It provides a Decoder or Encoder to you. These carry any options from the top. And they can also stream data. So you can serialize your 10GB array value by value while the underlying writer writes it to disk, for example, instead of allocating it all in memory first, as the older API forces you to.
There are other improvements too, but the post mainly focuses on these, so that's what I got from it (I haven't tried the new API btw, this is all from the post, so maybe I'm wrong on some points)
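To make the v1 side of that concrete, here is a minimal sketch of a type customizing its own JSON with the existing encoding/json interfaces. Celsius, Reading, encode and decodeTemp are hypothetical names; the v2 streaming methods are deliberately not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// Celsius is a hypothetical type that serializes itself as a string like "21.5C".
type Celsius float64

// MarshalJSON must return a complete JSON value as bytes. It receives no
// encoder and no options, so it cannot see settings such as indentation.
func (c Celsius) MarshalJSON() ([]byte, error) {
	text := strconv.FormatFloat(float64(c), 'f', -1, 64) + "C"
	return json.Marshal(text)
}

// UnmarshalJSON receives the whole value as one byte slice, however large.
func (c *Celsius) UnmarshalJSON(data []byte) error {
	var text string
	if err := json.Unmarshal(data, &text); err != nil {
		return err
	}
	f, err := strconv.ParseFloat(strings.TrimSuffix(text, "C"), 64)
	if err != nil {
		return err
	}
	*c = Celsius(f)
	return nil
}

// Reading is a hypothetical struct that embeds the custom type.
type Reading struct {
	Temp Celsius `json:"temp"`
}

// encode marshals v with v1 encoding/json and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

// decodeTemp parses a Reading and returns its temperature.
func decodeTemp(input string) (Celsius, error) {
	var r Reading
	err := json.Unmarshal([]byte(input), &r)
	return r.Temp, err
}

func main() {
	fmt.Println(encode(Reading{Temp: 21.5}))
	fmt.Println(decodeTemp(`{"temp":"37C"}`))
}
```

Both methods trade in whole []byte values, which is the limitation described above: the library has to re-validate what MarshalJSON returns and has to buffer the full value before calling UnmarshalJSON.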
The main issues are under the Behavior differences https://go.dev/blog/jsonv2-exp#behavior-differences
The largest problems were around the handling of nil in Go and what to convert it to in JSON, and vice versa.
* v2 now returns an error for invalid UTF-8 (v1 silently accepted it), which meant one had to preprocess or re-process the JSON before sending it off to the server.
* A Go nil slice or map is now converted to an empty JSON array or object, respectively. Previously it was converted to JSON null.
* JSON field names are now matched to Go field names case-sensitively. Before, matching was case-insensitive, which caused lots of problems if fields collided (say there's bankName and bankname in the JSON).
* omitempty was problematic. It was used so that, say, Go amount: nil would omit the field and give {} instead of { amount: null }. However, it also meant that Go amount: 0 was omitted instead of being written as { amount: 0 }, which is surprising. The new omitempty only omits values that encode as null or as an empty string, array, or object, and no longer 0 or false. There's a new omitzero tag for that.
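A minimal sketch of the v1 side of the case-matching and omitempty points, using today's encoding/json. Account, Charge and the helpers are hypothetical names for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Account and Charge are hypothetical types used only for illustration.
type Account struct {
	BankName string `json:"bankName"`
}

type Charge struct {
	Amount int `json:"amount,omitempty"`
}

// decodeBankName unmarshals input with v1 encoding/json and returns the field.
func decodeBankName(input string) string {
	var a Account
	if err := json.Unmarshal([]byte(input), &a); err != nil {
		return "error: " + err.Error()
	}
	return a.BankName
}

// encode marshals v with v1 encoding/json and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// v1 falls back to a case-insensitive match, so "bankname" fills BankName.
	fmt.Println(decodeBankName(`{"bankname":"lower"}`))

	// Both keys map to the same field, so one silently overwrites the other.
	fmt.Println(decodeBankName(`{"bankName":"exact","bankname":"lower"}`))

	// v1 omitempty drops a zero int, so a real amount of 0 disappears.
	fmt.Println(encode(Charge{Amount: 0}))
	fmt.Println(encode(Charge{Amount: 5}))
}
```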
Nah, the existing implementation is pretty decent, actually, but doesn’t address every use case and has some flaws that are hard or impossible to fix. But for lots of use cases it works great.
Now here’s a new implementation that addresses some of the architectural problems that made the old library structurally problematic for some use cases (streaming large JSON docs being the main one).
If/once this goes through, I wonder what the adoption is going to be like now that all LLMs still only have the v1 api in their corpus.
Hopefully people will remember documentation exists once errors start popping up and refer to it.
I'm coming in a little hot and contrarian. I've been working with the Go JSON library for well over a decade at this point, since before Go 1.0, and I think v1 is basically fine.
I have two complaints. Its decoder is a little slow; PHP's decoder blows it out of the water. I also wish there was an easy "catch all" map you could add to a struct for items you didn't define but were passed.
None of the other things it "solves" have ever been a problem for me - and the "solution" here is a drastically more complicated API.
I frankly feel like doing a v2 is silly. Most of the things people want could be resolved with struct tags varying the behavior of the existing system while maintaining backwards compatibility.
My thoughts are basically as follows:
The struct/slice merge issue? I don't think you should be decoding into a dirty struct or slice to begin with. Just declare it unsupported, undefined behavior and move on.
Durations as strings? Why? That's just gross.
Case sensitivity by default? Meh. Just add a case sensitivity struct tag. Easy to fix in v1
Partial decoding? This seems so niche it should just be a third-party library's job.
Basically everything could've been done in a backwards compatible way. I feel like Rob Pike would not be a fan of this at all, and it feels very un-Go.
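On the "catch all" map wish: a common v1 workaround is to decode twice inside a custom UnmarshalJSON. A minimal sketch under that assumption follows; User, Extra and decodeUser are hypothetical names, and this is one possible approach rather than anything the library prescribes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical type. Extra collects members that have no field.
type User struct {
	Name  string                     `json:"name"`
	Extra map[string]json.RawMessage `json:"-"`
}

// UnmarshalJSON decodes the known fields first, then keeps everything else.
func (u *User) UnmarshalJSON(data []byte) error {
	// plain has User's fields but none of its methods, which avoids
	// calling this UnmarshalJSON recursively.
	type plain User
	if err := json.Unmarshal(data, (*plain)(u)); err != nil {
		return err
	}
	var all map[string]json.RawMessage
	if err := json.Unmarshal(data, &all); err != nil {
		return err
	}
	// Remove the members that already landed in a field. Only the exact
	// name is removed here; differently cased spellings stay in Extra.
	delete(all, "name")
	u.Extra = all
	return nil
}

// decodeUser parses input into a User.
func decodeUser(input string) (User, error) {
	var u User
	err := json.Unmarshal([]byte(input), &u)
	return u, err
}

func main() {
	u, err := decodeUser(`{"name":"ann","age":30}`)
	fmt.Println(u.Name, string(u.Extra["age"]), err)
}
```

The cost is that the input is parsed twice and the list of known names has to be maintained by hand, which is presumably why people keep asking for built-in support.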
It goes against Go's whole worse is better angle.
null != nil !!!
It is good to see some partial solutions to this issue. It plagues most languages and introduces a nice little ambiguity that is just trouble waiting to happen.
Ironically, JavaScript with its hilarious `null` and `undefined` does not have this problem.
Most JSON parsers and emitters in most languages should use a special value for "JSON null".
Fixed in 1976 by ML, followed up by Eiffel in 2005, but unfortunately yet to be made common.
Null and undefined are fine imho, with a sort of empty/missing semantics (especially since you mostly just care to == them). I have bigger issues with how similar yet different it is to have an undefined key and a not-defined key. I would almost prefer if
obj['key']=undefined
was the same as
delete obj['key']
This v2 is still pushing forward the broken behavior from v1 when it comes to handling nil for maps, slices and pointers. I am so sick and tired of this crap. I had to fork v1 to make it behave properly, and they still managed to fuck up a completely new version just as well (by pushing omitempty and ignoring omitnil behavior as a standalone case), which means I will be stuck with the snail's-pace v1 forever.
Are you sure about that? Unless I'm misunderstanding they did fix this: https://pkg.go.dev/encoding/json
"In v1, a nil Go slice or Go map is marshaled as a JSON null. In contrast, v2 marshals a nil Go slice or Go map as an empty JSON array or JSON object, respectively. The jsonv2.FormatNilSliceAsNull and jsonv2.FormatNilMapAsNull options control this behavior difference. To explicitly specify a Go struct field to use a particular representation for nil, either the `format:emitempty` or `format:emitnull` field option can be specified. Field-specified options take precedence over caller-specified options."
What is your preferred behavior for a nil map/slice? Feels weird that it doesn't map to null.
Isn't that what the new 'omitzero' option is for?
https://pkg.go.dev/github.com/go-json-experiment/json#exampl...
This is the second time a v2 has been released for a package in Go's standard library. Other ecosystems are not free of this problem.
And then people complain that Rust doesn't have a batteries-included stdlib. It is done to avoid cases like this.
That has its own downsides, though.
Both v1 packages continue to work; both are maintained. They get security updates, and were both improved by implementing them on top of v2 to the extent possible without breaking their respective APIs.
More importantly: the Go authors remain responsible for both the v1 and v2 packages.
What most people want to avoid with a "batteries included standard library" (and few additional dependencies) is the debacle we had just today with NPM.
Well maintained packages, from a handful of reputable sources, with predictable release schedules, a responsive security team and well specified security process.
You can't get that with 100s of independently developed dependencies.
Wow, two whole times in 19 years? That sounds terrible.
Yes, we should definitely go with the Rust approach instead.
Anyway, I'd better get back to figuring out which crate I'm meant to be using...
I’d rather have 2 jsons in the stdlib after 15 years than 0 jsons in the stdlib
It's not like this is new. Look at Java and .NET collection APIs, for example - both languages have the OG 1.0 versions, and then the more modern ones. In a similar vein, .NET has four different ways to deal with XML, of which three (XmlDocument, XPathDocument, and XDocument) are basically redundant representations of XML trees, each one doing things differently based on lessons learned.
It's still better than the mess that is Node.js.
Two v2s in 15 years seems pretty good given the breadth of the stdlib.
I'm not sure how this is a problem, and I'm very sure that even in the presence of this "problem" it is far better for a language to have a batteries-included stdlib than to not
I still don't get how a common thing like JSON is not solved in go. How convoluted it is to just get a payload from an API call, compared to other languages, is baffling
> I still don't get how a common thing like JSON is not solved in go.
Given that it is not even yet solved in its namesake language, Javascript, that's not saying much.
You should read that, it's still relevant: https://seriot.ch/projects/parsing_json.html