Working with JSON Lines in Python: My Real-World Experience Using jsonline
As a backend developer, I’ve dealt with JSON files that were far too large to load into memory.
Log pipelines, analytics exports, ML datasets — they all looked fine until a single
json.load() call brought my service to a halt. That’s when I first encountered
JSON Lines and eventually the jsonline Python library.
In this article, I’ll share what JSON Lines actually are, how I used the
jsonline library in production, the mistakes I made along the way,
and how I debugged them. This is written from hands-on experience, not theory.
What Is JSON Lines (JSONL)?
JSON Lines is a format where each line in a file is a standalone JSON object. Instead of one massive JSON array, data is written incrementally, one record per line.
{"id":1,"event":"login","user":"alex"}
{"id":2,"event":"purchase","amount":49.99}
{"id":3,"event":"logout","user":"alex"}
This format is common in:
- Server logs
- Streaming pipelines
- Data science workflows
- Large exports from APIs
The key benefit is that you can process files line-by-line without loading everything into memory.
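The standard library alone gets you this benefit; here is a minimal sketch that streams one record at a time:
import json

# A file object yields one line per iteration, so memory use stays flat
# no matter how large the file is.
with open("events.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        print(record["event"])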
Why I Chose the jsonline Python Library
I discovered jsonline while searching for a way to explore multi-gigabyte JSONL files without rewriting my entire pipeline. The library provides list-like access to JSON Lines files while reading only the required parts from disk.
What stood out to me:
- No full file loading
- Indexed access to JSON objects
- Append-friendly workflow
- Simple mental model
Installing jsonline
pip install jsonline
Reading JSON Lines with jsonline
This is the first working example I used while inspecting production logs:
from jsonline import JsonLines

data = JsonLines("events.jsonl")  # the file is not loaded into memory up front
first = data[0]                   # fetch a single record by index
count = len(data)                 # total record count
The surprising part for me was how fast this felt compared to streaming manually. Accessing individual records felt almost instant, even with very large files.
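I can't speak for jsonline's internals, but the usual way to make indexed access fast is a byte-offset index: scan the file once, remember where each line starts, and seek() straight to any record afterwards. A rough sketch of that idea in plain Python (an illustration of the technique, not the library's actual code):
import json

def build_offset_index(path):
    # One linear pass: remember the byte position where each line begins.
    offsets = []
    position = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(position)
            position += len(line)
    return offsets

def read_record(path, offsets, i):
    # Jump directly to record i; nothing else is read from disk.
    with open(path, "rb") as f:
        f.seek(offsets[i])
        return json.loads(f.readline())

offsets = build_offset_index("events.jsonl")
print(read_record("events.jsonl", offsets, 2))  # the third record, read in isolation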
Appending Data Safely
Appending new records without rewriting the file was another reason I kept using jsonline. Here’s a real snippet from a batch job I maintained:
from jsonline import JsonLines

data = JsonLines("events.jsonl")
data.append({"id": 4, "event": "timeout"})  # appends a new line; existing records are untouched
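For comparison, the plain-Python equivalent is a one-line write in append mode; the appeal of JSONL is that appending never rewrites what is already on disk:
import json

# Append mode writes one more line; earlier records are never touched.
with open("events.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps({"id": 4, "event": "timeout"}) + "\n")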
Common Errors I Faced (And How I Fixed Them)
1. Invalid JSON on a Single Line
One malformed line caused unexpected crashes. JSON Lines requires each line to be valid JSON. I fixed this by validating incoming data before writing.
import json

json.loads(line)  # raises json.JSONDecodeError if the line is malformed
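In practice I wrap that check in a guard so a single bad record never reaches the file. A minimal version, where incoming_lines is a stand-in name for whatever batch you are about to persist:
import json

def is_valid_json(line):
    # True only if the line parses as a standalone JSON value.
    try:
        json.loads(line)
        return True
    except json.JSONDecodeError:
        return False

# incoming_lines is a hypothetical batch of raw lines from upstream.
clean = [line for line in incoming_lines if is_valid_json(line)]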
2. Trailing Commas
Some upstream services produced JSON-like output that wasn’t valid JSON. Trailing commas were the most common issue.
Before writing the file, I ran the payload through JSON Formatter Pro to normalize and validate it.
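Strict parsers reject trailing commas outright, which is easy to confirm with the standard library:
import json

json.loads('{"id": 1, "event": "login"}')   # parses fine
json.loads('{"id": 1, "event": "login",}')  # raises json.JSONDecodeError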
3. Encoding Errors
UTF-8 encoding issues surfaced when logs came from different systems. Opening files explicitly with encoding resolved this.
open("events.jsonl", encoding="utf-8")
How I Debug JSON Lines Faster
When files became unreadable, I stopped debugging manually and started using browser tools. My workflow now looks like this:
- Extract a problematic line
- Validate it using JSON Formatter Pro
- Fix structural issues
- Append back safely
This alone saved hours of guesswork.
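The extraction step itself is easy to script. A small stdlib helper along these lines finds the first unparseable line and its line number:
import json

def first_bad_line(path):
    # Returns (line_number, text) of the first invalid line, or None.
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            try:
                json.loads(line)
            except json.JSONDecodeError:
                return number, line
    return None

print(first_bad_line("events.jsonl"))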
JSON Lines Examples in Other Languages
JavaScript (Node.js)
const fs = require("fs");

// Fine for files that fit in memory; for very large files, stream with readline instead.
const lines = fs.readFileSync("events.jsonl", "utf8").trim().split("\n");
const objects = lines.map((line) => JSON.parse(line));
Go
file, err := os.Open("events.jsonl")
if err != nil {
    log.Fatal(err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
    var obj map[string]interface{}
    if err := json.Unmarshal(scanner.Bytes(), &obj); err != nil {
        log.Printf("skipping invalid line: %v", err)
    }
}
Java
// Uses org.json; try-with-resources closes the reader automatically.
try (BufferedReader reader = new BufferedReader(new FileReader("events.jsonl"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        JSONObject obj = new JSONObject(line);
    }
}
PHP
$lines = file("events.jsonl", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($lines as $line) {
    $obj = json_decode($line, true); // true => associative arrays, not stdClass objects
}
When jsonline Is the Right Choice
- Large datasets
- Append-only workflows
- Log processing
- Streaming exports
For small files, standard JSON parsing is fine. For everything else, JSON Lines and jsonline scale far better.
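Migration between the two is also cheap. If you are starting from a classic JSON array, converting it to JSONL is a few lines of standard library code (assuming the old file fits in memory one last time; export.json is a stand-in name):
import json

# One final full load of the legacy array, then one line per record forever after.
with open("export.json", encoding="utf-8") as src:
    records = json.load(src)
with open("export.jsonl", "w", encoding="utf-8") as dst:
    for record in records:
        dst.write(json.dumps(record) + "\n")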
Alternative Tools and Libraries
Depending on your stack, these tools solve similar problems:
- ujson (Python): a faster drop-in replacement for the standard json module
- orjson (Python): an even faster JSON library, written in Rust
- jq (CLI): command-line filtering and transformation of JSON streams
- Pandas read_json with lines=True: loads JSONL straight into a DataFrame
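The pandas route in particular is close to a one-liner, and chunksize turns it into an iterator for files that don't fit in memory (process() below is a placeholder for your own handler):
import pandas as pd

df = pd.read_json("events.jsonl", lines=True)

# For very large files, read in chunks instead of all at once:
for chunk in pd.read_json("events.jsonl", lines=True, chunksize=10_000):
    process(chunk)  # process() is hypothetical; substitute your own logic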
Final Thoughts
JSON Lines solved real problems for me that traditional JSON couldn’t. The jsonline library made it practical without introducing complexity or memory pressure.
If you’re debugging malformed JSON or converting data before writing JSONL files, I strongly recommend validating your payloads using JSON Formatter Pro before processing them further.
This workflow has been battle-tested in production environments, and it’s one I still rely on today.