This issue is alive and well. Chronological updates are at the bottom of the text.
JSON parsing? Well, that surely must be easy, must it not? The native JSON object is here, so why should anyone still worry about cross-browser JSON parsing into ES5 objects? After all, JSON has very simple syntactical rules.
Yes, that line of thinking might be a sensible approach. But how is the issue of using “nonstandard” JSON strings solved in your “cross-browser javascript”?
Ok, let us dive deep, without hesitation.
Browser | Call | Result
IE8 | JSON.parse("{ 'a':1 }") | Syntax Error
CHROME | JSON.parse("{ 'a':1 }") | OK
FF | JSON.parse("{ 'a':1 }") | Syntax Error
OPERA 10.10 | JSON.parse("{ 'a':1 }") | Undefined variable JSON
SAFARI 4.0.4 | JSON.parse("{ 'a':1 }") | Syntax Error
See the problem? Your ES5 code receives JSON strings that are out of your control. And in a standard JSON string, property names have to be in double quotes, so in JavaScript source the whole string is most conveniently embedded in single quotes. Like so:

    // example of a legal JSON string
    // (values may be strings, numbers, objects, arrays, true, false or null;
    //  a function is not a legal JSON value)
    '{ "name" : "json", "id": 1, "whoami" : "guess who?" }'

The problem is that not every browser follows this simple rule. CHROME 4.x, which is/was a browser with a (very) significant number of users, does not. That is perhaps not such a big issue. The big issue is that today on the WWW there is a large number of legacy systems connected to it, all passing “almost-standard” (aka “illegal”) JSON strings around.
I suppose, now you might reply: “…OK, why support nonstandard usage … ?”. And then someone else might reply: “but CHROME does”? Ad infinitum … Instead of endless debate, I suggest a slight detour. Here is the trick: JSON.parse() does not have to be used to “parse” JSON strings. The good old Function() approach, “just works” :
    var data = "{ 'a':1 }" ;
    // or "{ a : 1 }"
    // or the proper syntax '{ "a" : 1 }'
    data = (new Function("return " + data))();

The above “trick” works in each browser, regardless of the fact that the string passed in is sometimes not proper JSON syntax. This works everywhere, including IE8, FF, and SAFARI, where JSON.parse("{'a':1}") dutifully throws an exception. The above trick also works in browsers that have no JSON as a native object at all. End of detour.
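To make the contrast concrete, here is a minimal sketch that runs the three strings from above through both approaches. The helper names strict_parse and loose_parse are hypothetical, invented only for this illustration; the only standard calls used are JSON.parse and the Function constructor. It assumes an engine whose native JSON.parse follows the standard.

```javascript
// Hypothetical helpers, for illustration only.
// strict_parse: native JSON.parse, reports failure instead of throwing
function strict_parse(text) {
    try {
        return { ok: true, value: JSON.parse(text) };
    } catch (x) {
        return { ok: false, value: undefined };
    }
}

// loose_parse: the Function() "trick"; it evaluates the text as a
// JavaScript expression, so it must never see untrusted input
function loose_parse(text) {
    try {
        return { ok: true, value: (new Function("return " + text))() };
    } catch (x) {
        return { ok: false, value: undefined };
    }
}

var samples = [ "{ 'a':1 }", "{ a : 1 }", '{ "a" : 1 }' ];
var report = [];
for (var i = 0; i < samples.length; i++) {
    report.push({
        text: samples[i],
        strict: strict_parse(samples[i]).ok,
        loose: loose_parse(samples[i]).ok
    });
}
```

On a standard-conforming engine only the last sample passes the strict parser, while the loose one accepts all three. That permissiveness is exactly what makes the trick convenient and risky at the same time.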
In case you feel safe, I am sorry to tell you, this worketh today in my CHROME 93.0.X.
Good in case you want to support it. I don’t. Here is why.
By now perhaps you might want to adopt a more sensible approach? A no-compromise approach, where your library will NOT allow non-standard JSON strings anymore!
Maybe I did not say it clearly enough, so I will do it now: I agree 100% with a no-compromise approach to “almost-standard” (aka illegal) JSON strings. It is only that in reality, there are well-known and (commercial) paid-for RESTful services, which return this wrong kind of JSON.
Especially this kind: "{ look-ma-no-quotes : 1 }" is, it seems, in widespread use. No quotes whatsoever around names.
What is also relevant in this context is that there are other JSON issues, especially security ones. Issues much larger than the travails of any JavaScript library out there. And they need to be ultimately resolved by W3C, IEEE, http://www.soa-standards.org/, etc., not jQuery, Dojo or any other “ninja” team. In my opinion, the best one can do, for her library, is to document these issues. And that will immediately show whether the browser of choice accepts JSON strings that are NOT legal.
What is especially worrying is that new kinds of “AJAX” (not AJAX) platforms are starting to appear. Non-DOM and non-ES5 code, which “just” uses the idea of REST (JSON + HTTP). Like Node.js or Ruby apps or Python Tornado… all happily working without DOM, JavaScript, or browsers. And which talk to some external and proprietary server-side quaky “REST” (not REST).
This issue is old news with XML, but XML text is not source code in any programming language. JSON is much more dangerous since it actually is source code, not a document markup language like XML is.
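A minimal sketch of what “JSON is source code” means in practice. The payload string and the demo_flag property are hypothetical, made up for this illustration. The payload looks like something that yields data, but it is really a function call, and the Function() approach will run it. A standard-conforming JSON.parse is assumed for the second half.

```javascript
// the global object, obtained the ES5 way (non-strict function body)
var root = (new Function("return this"))();

// Hypothetical payload: it returns { a : 1 }, but only after
// running arbitrary code with access to the global object
var payload = "(function () { this.demo_flag = 'code ran'; return { a : 1 }; })()";

// 1. the Function() trick evaluates the payload
root.demo_flag = "untouched";
var evaluated = (new Function("return " + payload))();
var flag_after_function = root.demo_flag;

// 2. native JSON.parse refuses it without running anything
root.demo_flag = "untouched";
var native_rejected = false;
try {
    JSON.parse(payload);
} catch (x) {
    native_rejected = true;
}
var flag_after_native = root.demo_flag;
```

After this runs, flag_after_function shows that the embedded code executed, while flag_after_native is still "untouched". The returned object looks perfectly innocent in both senses, which is the worrying part.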
As a very good example: in CHROME 4.x, window.JSON.parse will happily parse "{ 'a' : 1 }", which is not standard. It will even parse "{ a : 1}", and yes, it will also parse this:

    // inner single quotes escaped, so the JavaScript literal itself is legal
    JSON.parse( '{ "document.writeln(\'!\')" : 1 }' ); // ok in CHROME

In your organization, you might want to allow for this or not. It is up to you. I would not. Actually, the simplest “way out” is to check, in your JavaScript library, which browser it is currently running in. In the case of the omnipresent jQuery, I would add a new jQuery.support member:

    //
    // NOTE: below is also false
    // if there is no native JSON object
    //
    jQuery.support.nonstandard_json_string = function () {
        try {
            JSON.parse("{ a : 1 }");
            return true ;
        } catch(x) {
            return false;
        }
    }();

As far as I know “only” in CHROME, the above yields true (Tested up to CHROME 4.0.302.3):

    // In CHROME 4.X
    jQuery.support.nonstandard_json_string === true ;

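The same kind of flag can be computed without jQuery, and with more than one probe. The sketch below is only an illustration: the probe list and the function name accepted_illegal_json are hypothetical, and the list is far from a complete conformance test. Every probe is a string that standard JSON does not allow.

```javascript
// Hypothetical probe list: each entry is NOT legal JSON
var illegal_probes = [
    "{ a : 1 }",      // unquoted property name
    "{ 'a' : 1 }",    // single-quoted property name
    '{ "a" : 1, }',   // trailing comma in an object
    "[ 1, ]",         // trailing comma in an array
    "01",             // number with a leading zero
    "undefined"       // not a JSON value
];

// Returns the probes the native parser wrongly accepts,
// or null when there is no native JSON.parse at all.
function accepted_illegal_json() {
    if (typeof JSON !== "object" || typeof JSON.parse !== "function") {
        return null;
    }
    var accepted = [];
    for (var i = 0; i < illegal_probes.length; i++) {
        try {
            JSON.parse(illegal_probes[i]);
            accepted.push(illegal_probes[i]); // parsed: the parser is too lenient
        } catch (x) {
            // rejected, as the standard requires
        }
    }
    return accepted;
}

var accepted = accepted_illegal_json();
// same meaning as the support flag: false also when native JSON is missing
var nonstandard_json_string = accepted !== null && accepted.length > 0;
```

Returning the list of accepted probes, rather than a bare boolean, also documents which kind of illegal string a given browser lets through.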
Having this in place, one can go ahead and implement her JSON parsing logic. Here is my attempt :

    /*
    (c) 2013 - 2021 by dbj at dbj dot org, https://dbj.org/license_dbj/
    */
    // cross browser safe JSON parsing
    // illegal JSON stops here ...
    //
    (function (window, undefined) {

        // taken from http://json.org/json2.js
        var ok_json = function ( data ) {
            return /^[\],:{}\s]*$/.test(
                data.replace( /\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@")
                    .replace( /"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, "]")
                    .replace( /(?:^|:|,)(?:\s*\[)+/g, "")
            ) ;
        } ;

        // true if native JSON exists and supports non-standard JSON
        var ok_wrong_json = function () {
            try {
                JSON.parse("{ a : 1 }");
                return true ;
            } catch(x) {
                return false ;
            }
        }();

        window.json_parse =
        ( window.JSON && ("function" === typeof window.JSON.parse) ) ?
            ( ok_wrong_json ) ?
                function json_parse ( data ) {
                    // Case 1 : native JSON is here but supports illegal strings
                    if ( ! ok_json( data ) )
                        throw new Error("Bad JSON string.") ;
                    return window.JSON.parse( data ) ;
                }
            : // else
                function json_parse ( data ) {
                    // Case 2: native JSON is here ,
                    // and does not support illegal strings
                    // this will throw on illegal strings
                    return window.JSON.parse( data ) ;
                }
        : // else
            function json_parse ( data ) {
                // Case 3: there is no native JSON present
                if ( ! ok_json( data ) )
                    throw new Error("Bad JSON string.") ;
                return (new Function("return " + data))();
            } ;
    })(window) ;

Above is definitely a “slow and safe” approach. Perhaps this is good enough, to represent a safe cross-browser JSON parsing mechanism?
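To see what the validation step alone buys, here is a small self-contained sketch that applies the check to a few strings. The regular expressions are written out as I recall them from json2.js, so treat them as an assumption and compare against the published file before relying on them. The checks object and its property names are hypothetical.

```javascript
// Validation expression in the style of json2.js (recalled, not copied
// from the file; verify against the original before use).
function ok_json(data) {
    return /^[\],:{}\s]*$/.test(
        data
            // 1. neutralise backslash escapes
            .replace(/\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4})/g, "@")
            // 2. collapse strings, literals and numbers into "]"
            .replace(/"[^"\\\n\r]*"|true|false|null|-?\d+(?:\.\d*)?(?:[eE][+\-]?\d+)?/g, "]")
            // 3. drop opening brackets that follow a colon, comma or start
            .replace(/(?:^|:|,)(?:\s*\[)+/g, "")
    );
}

var checks = {
    legal_object:   ok_json('{ "a" : 1 }'),
    legal_array:    ok_json('[ 1, 2, "x" ]'),
    unquoted_name:  ok_json("{ a : 1 }"),
    single_quotes:  ok_json("{ 'a' : 1 }"),
    function_value: ok_json('{ "f" : function () { return 1; } }')
};
```

The two legal strings pass and the other three fail, because anything left over after the three replacements that is not punctuation or whitespace makes the final test fail. This is a filter in front of the parser, not a full JSON grammar.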
Update: 2010 Feb 01
I made a little “fuss” on the V8 (Chrome JavaScript engine) “Issues” forum, and today it appears to be fixed: http://code.google.com/p/v8/issues/detail?id=372 . So in “V8”, JSON.parse() , now works as it should. Therefore some near future CHROME updates will not fall into “case 1” anymore, in the code above.
Update: 2013 Jan 29
Of course, 3 years later the situation is much better in every major browser. But. These days we have a multitude of mobile device browsers that have to be tested. Not an easy task, but there are teams doing it. This is nothing less than a good thing, because this kind of code wakes me up in the early morning in a cold sweat:

    var sneaky = JSON.parse( '{ "document.writeln(x)" : "WTF!?" }' ) ;

    // every property NAME is compiled and executed as code
    for ( var j in sneaky )
        ( new Function ("x",j)) (sneaky [j]) ;

The whole Internet might actually not be as safe as we think it is, yes?
Update: 2021 Sep
I was assured javascript “ES5” testing suite is alive and well; alas somewhere in the clouds. I assume, brave researchers might find it by visiting ECMA-262.
–DBJ
4 thoughts on “JSON : Naughty parsing is still allowed”
There are many different ways for JSON to be invalid, how do you propose testing every single one of them to see which browsers “correctly” parse all possible variations of invalid JSON?
JSON validity checking, in this solution, is left to the JSON.parse() function, or to ok_json(). The latter implementation is borrowed from json2.js, made by one Mr D. Crockford.
If it were not for the (current) CHROME, the solution would be (much) simpler. That is: just use JSON, if available, otherwise fall back onto the Function() “trick”, with this ok_json().
Perhaps ultimately it is not possible to check the 100% validity of a JSON string. Especially if it is a very large one. And also: what is the upper limit of a JSON string size? There are more than a few unresolved issues here. A job for W3C, I guess. Perhaps the JSON community could/should borrow from XML parsing and security concepts.
Update 2010 Jan 25
I see now what exactly you mean! You mean: how do we know that ok_wrong_json() is rigorous enough.
To which I think the valid answer is: to have 100% rigorous testing in there, one would have to apply dozens of tests (see: http://es5conform.codeplex.com/ ). Instead my logic is: if the basic test is not passed, then all the other tests are irrelevant. If JSON.parse("{ a : 1}") works, nothing else matters. Because this means that this JSON implementation is seriously flawed.
Thanks for your comment : DBJ
http://code.google.com/p/v8/issues/detail?id=573#c11