Line Delimited JSON
- This article was considered for deletion at Wikipedia on November 6 2015. This is a backup of Wikipedia:Line_Delimited_JSON. All of its AfDs can be found at Wikipedia:Special:PrefixIndex/Wikipedia:Articles_for_deletion/Line_Delimited_JSON, the first at Wikipedia:Wikipedia:Articles_for_deletion/Line_Delimited_JSON.
- Wikipedia editors had multiple issues with this page:
- The topic of this article may not meet Wikipedia's general notability guideline. Please help to establish notability by citing reliable secondary sources that are independent of the topic and provide significant coverage of it beyond a mere trivial mention. (October 2013)
Line Delimited JSON is a standard for delimiting JSON in stream protocols (such as TCP).
Introduction
This is a minimal specification for sending and receiving JSON over a stream protocol, such as TCP.
The Line Delimited JSON framing is so simple that no specification had previously been written for this ‘obvious’ way to do it.
Example output
(With \r\n line separators.)
<source lang="javascript">
{"some": "thing"}
{"foo": 17, "bar": false, "quux": true}
{"may": {"include": "nested", "objects": ["and", "arrays"]}}
</source>
Motivation
There is currently no standard for transporting JSON within a stream protocol (primarily plain TCP), apart from WebSockets, which is unnecessarily complex for non-browser applications.
An important use case is processing a large number of JSON objects where the receiver of the data should not have to receive every single byte before it can begin decoding it. The processing time and memory usage of a JSON parser trying to parse a multi-gigabyte (or larger) string is often prohibitive. Thus, a "properly" encoded JSON list of millions of lines is not a practical way to pass and parse data.[1]
There were numerous possibilities for JSON framing, including counted strings and ASCII control characters or non-ASCII characters as delimiters (DLE STX and ETX, or WebSocket's 0xFF).
Scope
The primary use case for LDJSON is an unending stream of JSON objects, delivered at variable times over TCP, where each object needs to be processed as it arrives; for example, a stream of stock quotes or chat messages.
Philosophy / requirements
The specification must be:
- trivial to implement in multiple popular programming languages
- flexible enough to handle arbitrary whitespace (pretty-printed JSON)
- free of non-printable characters
- netcat/telnet-friendly
Functional specification
Software that supports Line Delimited JSON
PostgreSQL
As of version 9.2, PostgreSQL has a function called row_to_json.[2] In addition, PostgreSQL supports JSON as a field type, so it can output nested components in much the same way as MongoDB and other NoSQL databases.
<source lang="bash">
user@host:~$ echo 'SELECT row_to_json(article) FROM article;' | sudo -u postgres psql --tuples-only
{"article_id":1,"article_name":"ding","article_desc":"bellsound","date_added":null}
{"article_id":2,"article_name":"dong","article_desc":"bellcountersound","date_added":null}
user@host:~$
</source>
Apache
Apache access logs can be formatted as JSON lines by setting the LogFormat directive. For example, here is how to write logs for consumption by Logstash and Kibana: "Getting Apache to output JSON (for logstash 1.2.x)". http://untergeek.com/2013/09/11/getting-apache-to-output-json-for-logstash-1-2-x/.
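As a rough sketch of the idea (the field selection and the nickname "ldjson" are illustrative, not taken from the linked article), a LogFormat directive along these lines emits one JSON object per request. Note that Apache does not JSON-escape field values such as %r, so quotes inside a request line can produce invalid JSON; %B is used rather than %b because %b logs "-" instead of 0 for empty responses, which would break the JSON number.

```apache
# Illustrative JSON-lines access log format; "ldjson" is an arbitrary name.
LogFormat "{ \"time\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \"host\": \"%h\", \"request\": \"%r\", \"status\": %>s, \"bytes\": %B }" ldjson
CustomLog "logs/access_json.log" ldjson
```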
NGINX
NGINX logs can likewise be formatted as JSON lines by setting the log_format directive, as in this example: "Logging to Logstash JSON Format in Nginx". https://blog.pkhamre.com/logging-to-logstash-json-format-in-nginx/.
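A minimal sketch of such a configuration (the variable selection and the name "ldjson" are illustrative; the escape=json parameter requires nginx 1.11.8 or later and ensures field values are valid JSON strings):

```nginx
# Illustrative JSON-lines access log; one object per request.
log_format ldjson escape=json
    '{"time":"$time_iso8601","remote_addr":"$remote_addr",'
    '"request":"$request","status":$status,"bytes":$body_bytes_sent}';
access_log /var/log/nginx/access_json.log ldjson;
```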
jline
Command-line tools for manipulating JSON lines in much the same way that grep, sort and other Unix tools manipulate CSV.
jq
A command-line JSON processor, often described as "sed for JSON", implemented in C and compiled to a standalone binary.
pigshell
A shell-in-a-browser whose pipelines carry structured objects rather than plain text.
Sending
Each JSON object must be written to the stream followed by the carriage return and newline characters (0x0D 0x0A). The JSON objects may contain newlines, carriage returns and any other permitted whitespace. See http://www.json.org/ for the full specification.
All serialized data must use the UTF-8 encoding.
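The sending rule above amounts to one serializer call plus a CRLF. A minimal Node.js sketch (the function name encodeLdjson is ours, not from any library):

```javascript
// Serialize one value as an LDJSON frame: JSON text followed by CRLF.
// JSON.stringify never emits raw newlines itself, but the spec would also
// permit pretty-printed output, since receivers must tolerate it.
function encodeLdjson(obj) {
  return JSON.stringify(obj) + '\r\n';
}

// Usage with a TCP socket (net module), writing UTF-8 as required:
//   socket.write(Buffer.from(encodeLdjson({ some: 'thing' }), 'utf8'));
```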
Receiving
The receiver should handle pretty-printed (multi-line) JSON.
The receiver must accept all common line endings: 0x0A (Unix), 0x0D (Mac), 0x0D0A (Windows).
Trivial implementation
A simple implementation is to accumulate received lines. Every time a line ending is encountered, an attempt must be made to parse the accumulated lines into a JSON object.
If the parsing of the accumulated lines is successful, the accumulated lines must be discarded and the parsed object given to the application code.
If the amount of unparsed, accumulated characters exceeds 16 MiB the receiver may close the stream. Resource constrained devices may close the stream at a lower threshold, though they must accept at least 1 KiB.
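The trivial algorithm above can be sketched in Node.js as follows (makeReceiver and the overflow error are our naming; the 16 MiB limit comes from the text):

```javascript
const MAX_UNPARSED = 16 * 1024 * 1024; // 16 MiB limit from the spec text

// Returns a feed(chunk) function; calls onObject(parsed) for each decoded value.
function makeReceiver(onObject) {
  let pending = ''; // text not yet split into complete lines
  let acc = '';     // complete lines accumulated since the last successful parse

  return function feed(chunk) {
    pending += chunk;
    let m;
    // Accept all common line endings: \n (Unix), \r (Mac), \r\n (Windows).
    while ((m = pending.match(/\r\n|\r|\n/)) !== null) {
      acc += pending.slice(0, m.index) + '\n';
      pending = pending.slice(m.index + m[0].length);
      try {
        onObject(JSON.parse(acc)); // parse succeeded: deliver and discard
        acc = '';
      } catch (e) {
        // Incomplete, e.g. a pretty-printed object continuing on the next line.
        if (acc.length > MAX_UNPARSED) throw new Error('unparsed data exceeds 16 MiB');
      }
    }
  };
}
```

Because a parse is only attempted at line endings, pretty-printed objects are handled naturally: each intermediate line fails to parse and stays in the accumulator until the closing brace arrives.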
Implementations
- ldjson-stream parser and serializer (Node.js, BSD License)
- LDJSONStream simple and secure, no dependencies (Node.js, ISC License)
MIME type and file extensions
When used over HTTP or email, the MIME type for Line Delimited JSON should be application/x-ldjson.
When saved in a file, the file extension should be .ldjson or .ldj.
Many parsers handle Line Delimited JSON,[3] and a commonly suggested content type for "streaming JSON" is application/json; boundary=NL.
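Put together, an HTTP response carrying Line Delimited JSON might look like this (all headers other than Content-Type, and the body payload, are illustrative):

```
HTTP/1.1 200 OK
Content-Type: application/x-ldjson

{"some": "thing"}
{"foo": 17, "bar": false}
```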
See also
- Server-sent events (EventSource)
Notes and references
- ↑ "JSON.parse() on a large array of objects is using way more memory than it should". http://stackoverflow.com/questions/30564728/json-parse-on-a-large-array-of-objects-is-using-way-more-memory-than-it-should. Retrieved 31 May 2015.
- ↑ "row_to_json". http://www.postgresql.org/docs/9.2/static/functions-json.html. Retrieved 6 October 2014.
- ↑ trephine.org. "Newline Delimited JSON". trephine.org. http://trephine.org/t/index.php?title=Newline_delimited_JSON. Retrieved 2 July 2013.
- Ryan, Film Grain. "How We Built Filmgrain, Part 2 of 2". filmgrainapp.com. http://blog.filmgrainapp.com/2013/07/02/how-we-built-filmgrain-part-2-of-2/. Retrieved 4 July 2013.