Is there any well-known method for DRYing JSON

Consider this JSON response:
[{
Name: 'Saeed',
Age: 31
}, {
Name: 'Maysam',
Age: 32
}, {
Name: 'Mehdi',
Age: 27
}]

This works fine for small amounts of data, but when you want to serve larger amounts (say, many thousands of records), it seems logical to avoid repeating the property names in the JSON response somehow.
I Googled the concept (DRYing JSON) and, to my surprise, didn’t find any relevant results. One way, of course, is to compress the JSON using a simple home-made algorithm and decompress it on the client side before consuming it:
[['Name', 'Age'],
['Saeed', 31],
['Maysam', 32],
['Mehdi', 27]]

However, a best practice would be better than each developer trying to reinvent the wheel. Have you guys seen a well-known, widely accepted solution for this?


Solution 1:

One solution is known as the hpack algorithm.
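
To make the general idea concrete, here is a minimal sketch of header-row packing that mirrors the layout shown in the question: keys are listed once, followed by one array of values per record. The helper names `packRecords` and `unpackRecords` are hypothetical, and this is not the hpack library's actual API or output format. It also assumes every record has the same keys as the first one.

```javascript
// Hypothetical helpers illustrating header-row packing; not the hpack API.

// Turn [{Name, Age}, ...] into [[keys...], [values...], ...].
function packRecords(records) {
  if (records.length === 0) return [];
  const keys = Object.keys(records[0]); // assumes all records share these keys
  const rows = records.map((record) => keys.map((key) => record[key]));
  return [keys, ...rows];
}

// Rebuild the original array of objects from the packed form.
function unpackRecords(packed) {
  if (packed.length === 0) return [];
  const [keys, ...rows] = packed;
  return rows.map((row) =>
    Object.fromEntries(keys.map((key, i) => [key, row[i]]))
  );
}

const people = [
  { Name: "Saeed", Age: 31 },
  { Name: "Maysam", Age: 32 },
  { Name: "Mehdi", Age: 27 },
];

const packed = packRecords(people);
console.log(JSON.stringify(packed));
// [["Name","Age"],["Saeed",31],["Maysam",32],["Mehdi",27]]
console.log(JSON.stringify(unpackRecords(packed)));
```

The packed form is still valid JSON, so the server only needs `packRecords` before `JSON.stringify` and the client only needs `unpackRecords` after `JSON.parse`.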

Solution 2:

First off, JSON is not meant to be the most compact way of representing data. It’s meant to be parseable directly into a JavaScript data structure designed for immediate consumption without further parsing. If you want to optimize for size, then you probably don’t want self-describing JSON; instead, you need to let your code make a number of assumptions about how to handle the data, and do some manual parsing on the receiving end. It’s those assumptions and the extra coding work that save you space.

If the property names and format of the server response are already known to the code, you could just return the data as an array of alternating values:

['Saeed', 31, 'Maysam', 32, 'Mehdi', 27]
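
On the receiving end, such a flat array could be turned back into objects roughly like this. It is only a sketch: the `FIELDS` list and the `fromFlatArray` name are assumptions, standing in for whatever field order the client and server have agreed on.

```javascript
// Field order is an assumption shared by client and server (hypothetical).
const FIELDS = ["Name", "Age"];

// Group a flat array of alternating values into one object per record.
function fromFlatArray(flat, fields = FIELDS) {
  if (flat.length % fields.length !== 0) {
    throw new Error("flat array length is not a multiple of the field count");
  }
  const records = [];
  for (let i = 0; i < flat.length; i += fields.length) {
    const record = {};
    fields.forEach((field, offset) => {
      record[field] = flat[i + offset];
    });
    records.push(record);
  }
  return records;
}

const flat = JSON.parse('["Saeed", 31, "Maysam", 32, "Mehdi", 27]');
console.log(fromFlatArray(flat));
```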

or, if it’s safe to assume that names don’t include commas, you could even just return a comma-delimited string that you could split into its pieces and stick into your own data structures:

"Saeed, 31, Maysam, 32, Mehdi, 27"

or, if you still want it to be valid JSON, you can put that string in an array like this, which is only slightly better than my first version, where the items themselves are array elements:

["Saeed, 31, Maysam, 32, Mehdi, 27"]
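
Splitting the delimited string has one extra cost worth seeing: everything arrives as text, so the client has to restore the types itself. A rough sketch, assuming names never contain commas and fields always alternate name, age (the `fromDelimitedString` name is hypothetical):

```javascript
// Assumes "name, age, name, age, ..." with no commas inside names.
function fromDelimitedString(text) {
  const parts = text.split(",").map((part) => part.trim());
  const records = [];
  for (let i = 0; i + 1 < parts.length; i += 2) {
    // Age comes back as a string, so convert it to a number explicitly.
    records.push({ Name: parts[i], Age: Number(parts[i + 1]) });
  }
  return records;
}

// The one-element array wrapper from the example above.
const [payload] = JSON.parse('["Saeed, 31, Maysam, 32, Mehdi, 27"]');
console.log(fromDelimitedString(payload));
```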

These assumptions and this compactness put more of the responsibility for parsing the data on your own JavaScript, but it is precisely the removal of the self-describing nature of the full JSON you started with that makes the result more compact.

Solution 3:

You might be able to use a CSV format instead of JSON, as you would only specify the property names once. However, this would require a rigid structure like in your example.

JSON isn’t really the kind of thing that lends itself to DRY, since it’s already quite well-packaged considering what you can do with it. Personally, I’ve used bare arrays for JSON data that gets stored in a file for later use, but for simple AJAX requests I just leave it as it is.

DRY usually refers to what you write yourself, so if your object is being generated dynamically you shouldn’t worry about it anyway.

Solution 4:

Use gzip compression, which is already built into most web servers and clients?

It will still take some (extra) time and memory to generate and parse the JSON at each end, but it will not take that much time to send over the network, and it will take minimal implementation effort on your part.

Might be worth a shot even if you pre-compress your source data somehow.
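
If you want to measure the effect on your own data, something like the following Node.js sketch would do. The records here are synthetic and made up purely for illustration, so the numbers it prints say nothing about your real payloads. In production the web server would normally handle the compression for you; calling `zlib` directly is just a convenient way to compare sizes.

```javascript
const zlib = require("zlib");

// Synthetic data with heavily repeated property names (illustration only).
const records = Array.from({ length: 5000 }, (_, i) => ({
  Name: `User${i}`,
  Age: 20 + (i % 50),
}));

const json = JSON.stringify(records);
const compressed = zlib.gzipSync(json);

console.log("raw bytes:    ", Buffer.byteLength(json));
console.log("gzipped bytes:", compressed.length);

// The receiver gets the original JSON back unchanged after decompression.
const restored = JSON.parse(zlib.gunzipSync(compressed).toString("utf8"));
console.log("round trip ok:", JSON.stringify(restored) === json);
```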

Solution 5:

It’s actually not a problem for JSON that you’ve often got massive string or “property” duplication (nor is it for XML).

This is exactly what the duplicate string elimination component of the DEFLATE algorithm (used by GZip) addresses.

While most browser clients can accept GZip-compressed responses, traffic back to the server won’t be compressed.

Does that warrant using “JSON compression” (i.e. hpack or some other scheme)?

  1. It’s unlikely to be much faster than implementing GZip compression in JavaScript (which is not impossible; on a reasonably fast machine you can compress 100 KB in 250 ms).

  2. It’s pretty difficult to safely process untrusted JSON input. You need to use stream-based parsing and decide on a maximum complexity threshold, or else your server might be in for a surprise. See for instance Armin Ronacher’s Start Writing More Classes:

    If your neat little web server is getting 10000 requests a second through gevent but is using json.loads then I can probably make it crawl to a halt by sending it 16MB of well crafted and nested JSON that hog away all your CPU.
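
As a rough illustration of a complexity threshold, the sketch below rejects a payload that is too large or too deeply nested before handing it to `JSON.parse`. It is a simple pre-check on the raw text, not stream-based parsing, and the limits `MAX_BYTES` and `MAX_DEPTH` as well as the function names are hypothetical values chosen for the example. It assumes a Node.js server (for `Buffer`).

```javascript
// Hypothetical limits; tune them for your own service.
const MAX_BYTES = 1024 * 1024;
const MAX_DEPTH = 32;

// Scan the raw text once and return the deepest nesting of [] and {} seen,
// ignoring brackets that appear inside string literals.
function maxNestingDepth(text) {
  let depth = 0;
  let deepest = 0;
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "[" || ch === "{") {
      depth++;
      if (depth > deepest) deepest = depth;
    } else if (ch === "]" || ch === "}") {
      depth--;
    }
  }
  return deepest;
}

// Apply both limits, then parse as usual.
function parseUntrustedJson(text) {
  if (Buffer.byteLength(text) > MAX_BYTES) {
    throw new Error("payload too large");
  }
  if (maxNestingDepth(text) > MAX_DEPTH) {
    throw new Error("payload too deeply nested");
  }
  return JSON.parse(text);
}

console.log(parseUntrustedJson('{"Name": "Saeed", "Tags": ["a", "b"]}'));
```

The same reasoning applies to any packed format: a decompression or unpacking step on the server is more code that has to survive hostile input.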