JSON vs. XML for configuration files?

One of the topics we’re discussing in Thunderbird dev land is how to distribute configuration files for Thunderbird, so that if you’re part of a group (users of large ISPs, enterprise users, Gmail users, whatever), the “right” default configuration can be picked up as automagically as possible.

There’s lots behind that effort, but there’s one point of debate which I thought I’d throw out here explicitly to get input from people outside the Thunderbird community. Thunderbird can easily support either an XML dialect or JSON files. There’s a feeling that the formats are isomorphic, and that choosing a syntax ends up being a tooling issue, in particular for unknown third parties who might also want to implement parsing of these files. Some people believe that XML, being more mature, is more likely to be parseable by various software packages “out there”. Others believe that JSON is a better fit, because it’s lighter weight, more mashable, etc.
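To make the “isomorphic” claim concrete, here is a minimal sketch in Python (chosen only for illustration, not something Thunderbird would use). The element and key names (incomingServer, hostname, port) are hypothetical and not a proposed schema. It parses the same made-up settings from both syntaxes with the standard library and ends up with identical data.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical settings; the names are invented for this example.
JSON_TEXT = '{"incomingServer": {"hostname": "imap.example.com", "port": "993"}}'
XML_TEXT = """
<incomingServer>
  <hostname>imap.example.com</hostname>
  <port>993</port>
</incomingServer>
""".strip()


def element_to_dict(element):
    """Turn an element into nested dicts, or into its text if it has no children.

    Assumes child tag names are unique within a parent; repeated tags
    would overwrite each other in this simple sketch.
    """
    children = list(element)
    if not children:
        return (element.text or "").strip()
    return {child.tag: element_to_dict(child) for child in children}


from_json = json.loads(JSON_TEXT)

root = ET.fromstring(XML_TEXT)
from_xml = {root.tag: element_to_dict(root)}

# Both syntaxes yield the same nested key/value structure.
print(from_json == from_xml)
```

The equivalence only holds for simple nesting like this. XML attributes, mixed content and namespaces have no direct JSON counterpart, and JSON’s native numbers and booleans arrive from XML as plain text, which is why the port is written as a string here.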

Are there best practices that people have developed for when to use which format?

Comments here, please. Keep it civil and constructive, please!

26 thoughts on “JSON vs. XML for configuration files?”

  1. I’ve usually used JSON when the config file is more ad-hoc; a collection of essentially arbitrary key/value pairs, for example. For me, the overhead of XML is only worth dealing with if you’ve got a more heavy-weight, structured set of data. Then, the ability to validate the XML config is super-handy. In those cases I’ll write a schema so that configs can be validated; the lack of schema support for JSON means it’s not feasible there.

  2. I personally prefer XML.
    On the negative side, it is very verbose compared to JSON, and also more time-consuming to parse.
    On the positive side, it is a more robust and extensible language, and it has a little more support through applications and programming libraries.
    Another important point favoring XML is that it was designed and has always been used as a data representation format. JSON is based on JavaScript, which was initially meant for web scripting. The point I’m making here is that JSON introduces security concerns for those who choose to use it. Third parties (or maybe even Thunderbird) could handle these files incorrectly, allowing code execution if malicious files find their way into a user’s profile. It’s a stretch, but it’s something that may bite you in the future, so why risk it?

  3. If there are compelling use cases where the data is consumed in a browser, then JSON is probably preferable (since XML parsing is still plagued by cross-browser quirks like entity support, etc).

    XML is more easily extended, since alternate namespaces, etc. give a lot of opportunities to add in extensions within elements.

    Those are the two issues that have most often moved us in one direction or another.

  4. XML. Validating your data is your friend. There are some efforts to define schemas for JSON, but why bother waiting when XML has them and has great support in EVERY language? Plus, JavaScript has E4X, so the base implementation language has fantastic XML support. It’s a no-brainer.

  5. If CPU, storage, and bandwidth are infinite then XML wins out due to an already existing mechanism for defining structure. As soon as any of those three becomes an issue, it’s time to determine if it’s worth creating a mechanism for defining JSON structure.

  6. Saying that JavaScript has “fantastic XML support” because it has E4X is wrong. I have nothing against the idea, and I think eventually it’ll happen, but E4X is the wrong way to do it. For a time I fixed E4X bugs in SpiderMonkey, Mozilla’s JS engine, and half the pain involved in doing so was discovering and documenting specification bugs. E4X is not the right way to do XML in JS.

    Back on topic, Thunderbird’s not going to run its files against a schema on every startup, so the ability to create schemas (which then have to be updated as the format changes, don’t forget!) for your XML is pointless. Stick with JSON, a slightly smaller format, and run with it. But really, it doesn’t matter which you use as long as you don’t spend much time thinking about the choice.

  7. XML for the win!

    Not being able to validate is a Bad Thing ™. And XML’s validation stuff is as robust as you can want. Especially when you’re dealing with a standard that can have arbitrary extensions.

    This is the Internet after all – you can’t get much more arbitrary than that 😉

  8. It’s pretty hard to make a config file readable for people who aren’t developing the product when it’s in JSON.

    If the config is never updated by hand for any reason then it’s fine; you’re just persisting the config data. But if people are expected to read and edit it, you’re not gonna make it readable in JSON.

    Not to mention I don’t know of a JSON serializer that does “pretty printing” like most XML libraries do. Having the config be one giant JSON object on a single line would be a real pain to work with by hand.

  9. Why not something more rigid like SQLite + a lightweight XUL editor? It’s more structured than XML, is portable/compact, and has great access speeds.

    Otherwise, XML is more well known (and is nominally more accessible) from most languages and environments. Web devs know JSON, but everyone else knows a bit of XML in one way or another.

  10. I go for XML too. A lot of companies haven’t even heard of JSON. Almost everybody knows XML, and generating an XML file should be no problem for most developers and most systems.

    I also think it’s very important that features such as multiple identities, signatures, etc. are supported in these config files.

  11. If we want to make Thunderbird a robust product, then validation of the config file is a must. Hence XML.

    JSON may be lighter, but Thunderbird will anyway have an XML parser within…

  12. I think JSON is much more readable due to less verbosity. It also makes values typed, which can make it easier to understand and makes it stricter in some ways than XML.

    That said, XML would be nice if you expect the format to have many overlapping extensions (namespaces) and I think you can’t go wrong with either, but I definitely think JSON is the cleaner choice.

  13. JSON is cool when you have JavaScript… But if the goal is to open up the config files to 3rd-parties, then parsing a trivial XML file and handling the config data inside does not require a full-blown browser-like environment.
    It’s furthermore easier, IMHO, to make typos in JSON than in a well-designed trivial XML file.
    While we’re here, I’d like to remind the readers that the original format of preference files of Netscape/Mozilla is also simple, fine for 3rd-parties, fine for JS:

    user_pref("bla.foo.bar", true);

    Isn’t it?

  14. Don’t have an opinion between XML and JSON.

    However, I’m more interested in support for ACAP:

    http://en.wikipedia.org/wiki/Application_Configuration_Access_Protocol

    I used to use the Mulberry email client. This was unpleasant to use in many ways, but its strongest point was that any computer (at our organisation, my home, a colleague’s home, etc) could be used to log in to my email account with just my username and password and every single setting/config was instantly there, to the point that the window would resize to how I normally use it.

    Settings stored remotely are a good thing :)

    Chris Keene

  15. I prefer XML.
    I don’t see any problems with CPU-related issues; most configs will be a few kilobytes, so it’s not a big deal even on old computers.
    The next thing we should talk about is HOW you distribute these files, especially across a large site such as an enterprise. Options are DNS or DHCP with links to files on a main server.

  16. The two are isomorphic. So the decision criterion is “what are enterprises/ISPs going to say when we tell them this is the format we’ve chosen?”. And it does seem to me that XML will mostly get “ah, yes, I know about that”, and JSON will get a “huh?” (except perhaps from their web dev team, which is unlikely to be the same people as their desktop software deployment team).

  17. Speaking from the pov of security: the goals are to have all your own code, and to be simple. Insecurity lurks in complexity, and in other people’s code. Not because people are evil but because their objectives differ from yours in ways that you can’t control.

    Generally, then, you should use your own format. Because that ensures that it is your own code doing the reading. If you feel a little fast and fancy free, pick something that is inside the Mozilla family as at least you can have a conversation …

    To be more precise, I think this approach rules out XML, if one were considering security as the only goal. XML is far too complex, it drags in all sorts of unknown stuff which the average developer cannot control, and you are highly dependent on the other people’s code sets.

    (I’m not familiar with JSON, but a little googling and I found a page that describes it … in a page. From a strictly security pov, that makes it win hands down. I already understand what it can do, so I can secure it.)

  18. I am only a user.
    I had to use an XML file with an upgrade of the FileZilla software to recover my FTP server connection info.
    It was perfect: copy-paste and bingo, it works.
    If JSON is more complex than copy and paste, keep XML.
    Copy and paste in Windows Explorer is already complex for an average Windows user.
    Moreover, I can open XML in Notepad to look at the data in it. I liked it.

  19. I am just a user as well, though I am looking into making my own XULRunner app, and I love both FF and TB.

    I can’t really see any huge advantages to using JSON. From what I understand, you are talking about a way of importing settings into the program on a one-time basis, not about a permanent storage format. If that is true, then I would make security and validation concerns a priority over performance and portability.

    Using XML you can cook up a TB config schema to set everything from general server settings to optional or even required add-ons and their settings. I also agree with the sentiment that while JSON may be easier to write, more technical employees are more familiar with XML, especially app development/distribution and server admin types who seem to be the ones who would be writing these config files. An additional benefit to using XML would be the possibility of corps/ISPs writing their own XUL setup dialogs to further brand TB.

    After reading Chris Keene’s comment and following the given link I am intrigued with ACAP and believe that it should certainly be an option if possible, but I wouldn’t use it as the sole option as there doesn’t seem to be a great deal of implementation of this protocol out there yet.

  20. Sorry, you got me thinking about this and I came up with some other ideas of what the XML config could potentially be used to set up.

    The files could include a XUL overlay to include custom help menu items in the form of either a simple browser link, or a complex add-on documentation system.

    In TB 1 and 2 the account node (the root node for each account, with the account’s name as its label) was always quite boring in my opinion. You could make this an overlay so that it could be customized with helpful tools like a XUL progress bar to visualize the account’s quota status, an interface to create a help desk ticket or check antivirus-quarantined email (either local or the provider’s AV), an organization’s custom IM platform like Sametime, etc., all of which could once again be set up via an XML settings import file.

    The file could further be processed during custom distro installs by including it in a directory like chrome/branding/autoconf/autoconf.xml or maybe default/autoconf/autoconf.xml, or imported after TB has already been installed for new/additional accounts.

    Doing any of these things is quite feasible with XML (and perhaps extra linked in files) but very difficult to do with just JSON.

  21. Both can be used to the same end, but XML is much better at deeply structured, document-type data. For example, you would most likely rather deal with tax return documents or books in XML than you would in JSON. JSON shines the most when dealing with key-value pairs. My general rule of thumb is: if it’s document-centric data, use XML; if it’s key-value pairs, consider JSON. Comment 21 by “rnd” is interesting too, however. XML is extensible in the sense that you can include other XML languages in your own host language, which allows for some pretty interesting possibilities. Not only can you store data values, you can store things like mathematical expressions (MathML), SVG graphics, XUL overlays, etc. If there is even a remote chance that this sort of thing might be useful in some way, choose XML and leave the door open. If it’s simply strict key-value pairs and always will be, JSON has some technical advantages.

  22. I don’t think you can put comments in a JSON file, can you? That makes showing example config files a pain.

    Is the trailing comma after ‘2’ valid JSON in this example:
    {"a": 1,
    "b": 2,
    }
    ? That generates a JS-strict warning from Mozilla’s JS engine. If that is not valid (at least I know it isn’t allowed by the JSON parser that Django uses — guessing that is simplejson), then that kind of thing is a minor pain for maintaining JSON files via any hand editing.

  23. XML means validation, but it also means better error messages when you leave a tag unclosed. JSON of course won’t know whether a closing } is at a particular depth, whereas a closing </mail> (providing there’s enough variation in the XML) will almost certainly be traceable to a particular node.

    I don’t know whether any parts of your Thunderbird config are similar to existing XML standards (e.g. vCard-XML http://www.xmpp.org/extensions/xep-0054.html ) but that’s something to consider: building upon standards.

    I’d recommend XML or SQLite, probably not JSON.

    ps. Any chance of a comment preview? Thanks.

  24. Schemas are definitely a pro for XML, and not just for validation either. If you’re using a decent XML editor, having a schema will make it give you auto-completion of names and tell you what you can put where, which is very helpful if you don’t have to edit the file often.

    With the actual situation you’re discussing this for (large enterprise-y groups), there is also the fact that there are lots of tools for XML which will help users, such as being able to easily merge their data with a template, XSLT and the like.

  25. I would be interested to hear what people say about this these days. Our organization doesn’t use much JSON, but I am trying to convince them to use JSON for configuration of a JavaScript browser softphone, using http://json-schema.org/ for validation.
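
Several comments above raise hand-editing questions about JSON: whether comments and trailing commas are allowed, and whether a serializer can pretty-print. As a rough illustration, here is a minimal sketch using Python’s standard json module. Python is an assumption made for this example only, and the is_valid_json helper is hypothetical. Other parsers may be more lenient than the JSON specification, which allows neither comments nor trailing commas.

```python
import json


def is_valid_json(text):
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# The trailing-comma example from the comments, with plain quotes.
trailing_comma = '{"a": 1,\n"b": 2,\n}'

# A JavaScript-style comment inside the object.
with_comment = '{"a": 1 /* default value */}'

# The same data written strictly.
strict = '{"a": 1, "b": 2}'

print(is_valid_json(trailing_comma))  # rejected by this parser
print(is_valid_json(with_comment))    # rejected by this parser
print(is_valid_json(strict))          # accepted

# Pretty-printing: indent=2 puts each key on its own line.
pretty = json.dumps(json.loads(strict), indent=2)
print(pretty)
```

So a config shipped as JSON does not have to be one giant line, but example files cannot carry inline comments unless the format reserves a key for them or the parser is deliberately made more permissive.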
