Dealing with JSON Files


Sooner or later you will very likely have to parse a JSON file rather than a well-organised CSV file. This post gathers a few Python snippets that can help you get started making sense of the JSON format. Let’s have a quick look.

Let’s import some packages

# Import
import json
# json_normalize lives in pandas (under pandas.io.json in versions before 1.0)
from pandas import json_normalize

We create a dataset to play with

# Let's manually create some JSON string
json_string = """
{
    "person": {
        "name": "John",
        "age": 31,
        "city": "San Francisco",
        "relatives": [
            {
                "name": "Jane",
                "age": 34,
                "city": "Los Angeles"
            }
        ]
    }
}
"""
data = json.loads(json_string)
{'person': {'age': 31,
  'city': 'San Francisco',
  'name': 'John',
  'relatives': [{'age': 34, 'city': 'Los Angeles', 'name': 'Jane'}]}}
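Before normalizing anything, it helps to see that json.loads simply returns ordinary Python objects: JSON objects become dicts and JSON arrays become lists. The sketch below walks the same illustrative record by hand; the "email" key is a made-up example of a missing field, not something in the dataset.

```python
import json

# Same illustrative record as in the post, written out in full
json_string = """
{
    "person": {
        "name": "John",
        "age": 31,
        "city": "San Francisco",
        "relatives": [
            {"name": "Jane", "age": 34, "city": "Los Angeles"}
        ]
    }
}
"""

data = json.loads(json_string)

# JSON objects become dicts, JSON arrays become lists
person = data["person"]
first_relative = person["relatives"][0]

print(type(person).__name__)                # dict
print(type(person["relatives"]).__name__)   # list
print(person["name"])                       # John
print(first_relative["city"])               # Los Angeles

# .get() returns a default instead of raising KeyError when a key is absent
# ("email" is a hypothetical field used only to show the fallback)
email = person.get("email", "unknown")
print(email)                                # unknown
```

Plain indexing like this is fine for a handful of fields; once the nesting gets deeper or the records more numerous, a DataFrame is usually easier to work with.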

Let’s parse the data!

We can easily transform a dataset in JSON format into a more readable format, such as a DataFrame, using the json_normalize function. See below:

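data_parsed = json_normalize(data)
   person.age    person.city person.name                                   person.relatives
0          31  San Francisco        John  [{'name': 'Jane', 'age': 34, 'city': 'Los Ange...

Normalizing the list stored under person.relatives on its own gives a flat table:

   age         city  name
0   34  Los Angeles  Jane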
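One possible way to get a relatives table like the one above, while keeping track of whom each relative belongs to, is the record_path and meta arguments of json_normalize. This is a minimal sketch, not necessarily how the table above was produced; the "person_" prefix is my own choice, needed because both the person and the relative have a "name" field.

```python
import pandas as pd

# Same illustrative record as in the post
data = {
    "person": {
        "name": "John",
        "age": 31,
        "city": "San Francisco",
        "relatives": [{"name": "Jane", "age": 34, "city": "Los Angeles"}],
    }
}

# record_path points at the nested list to expand: one row per relative.
# meta copies fields from the parent record onto every row, and meta_prefix
# avoids a clash between the person's "name" and the relative's "name".
relatives = pd.json_normalize(
    data["person"],
    record_path="relatives",
    meta=["name"],
    meta_prefix="person_",
)

print(relatives)
```

Each row is a relative, with an extra person_name column telling you which person the row came from.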

Loading a JSON file into a Jupyter notebook, normalizing the data and saving it into a DataFrame can be done as follows:

filepath = "pathtojonfile/file.json"
# Open the file in a with-block so it is closed once it has been read
with open(filepath) as f:
    dataJSON = json.load(f)
dataDF = json_normalize(dataJSON)
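If you load JSON files often, the reading step can be wrapped in a small helper. The sketch below is only an illustration: load_json is a hypothetical name, and the throwaway sample.json file is written to a temporary folder just so the example runs end to end.

```python
import json
import os
import tempfile


def load_json(filepath):
    """Read a JSON file and return the parsed Python object."""
    # The with-block closes the file even if parsing fails
    with open(filepath, encoding="utf-8") as f:
        return json.load(f)


# Write a small throwaway file so the sketch is runnable as-is
sample = {"person": {"name": "John", "age": 31}}
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    loaded = load_json(sample_path)

print(loaded["person"]["name"])  # John
```

The object returned by load_json can then be passed straight to json_normalize, exactly like dataJSON above.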

Et voila!
