JSON has become a popular format for storing and exchanging data on the web. Whether you are retrieving data through an API or storing it in a database, you will most likely be dealing with data in JSON format. That is why, in this tutorial, we will discuss how to work with JSON files using Python.
Although JSON originated from the JavaScript programming language, you need not worry: every example here uses Python, which provides an easy way of working with JSON files. By the end of this tutorial, you will know how to work with JSON data in Python.
Specifically, here’s what you will learn in this tutorial.
- What is JSON
- The Structure and Syntax of JSON
- The JSON Library
- Converting a Python Object to a JSON String
- Prettifying JSON Data
- Sorting the Keys in a JSON String
- Saving the JSON string as a JSON file
- Converting JSON to a Python Object
- Decoding a JSON file to a Python Object
- Working with JSON from APIs
Let’s dive in.
What is JSON
JSON stands for JavaScript Object Notation. It was inspired by the object literal syntax of JavaScript. According to the official JSON website, JSON is completely independent of any programming language. However, it uses conventions familiar from the C family of programming languages, which makes it easy for C, C++, C#, Java, and Python programmers to understand. Let’s look at the structure of JSON.
The Structure and Syntax of JSON
As mentioned, JSON is derived from JavaScript, so it keeps JavaScript’s syntax for object notation:
- Data is stored in name-value pairs
- Objects are encapsulated in curly braces
- Arrays are held in square brackets
- Data items are separated by commas
An example of a JSON file is shown below:
{ "Key_1": "value_1", "Key_2": "value_2" }
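To see these syntax rules being enforced, here is a minimal sketch that uses Python's json module (covered later in this tutorial) to check three strings. The helper name is_valid_json is hypothetical and only serves this illustration.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Double-quoted names and values, no trailing comma: this is valid JSON
print(is_valid_json('{"Key_1": "value_1", "Key_2": "value_2"}'))

# Single quotes are not allowed around names or strings
print(is_valid_json("{'Key_1': 'value_1'}"))

# A trailing comma after the last pair is not allowed either
print(is_valid_json('{"Key_1": "value_1",}'))
```

Only the first string follows the rules listed above, so it is the only one the parser accepts.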
The datatypes permissible in JSON are:
- A number
- A string
- A JSON object
- A boolean
- An array
- Null
Note that when you convert data between Python and JSON, each Python data type is mapped to one of the JSON data types listed above, and back again. Converting a Python object to JSON is called encoding, and converting JSON to a Python object is called decoding. Going forward, we will see how encoding and decoding are done between JSON and Python.
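As a small illustration of decoding, the sketch below parses one JSON document that contains each of the six data types and prints the Python type each value ends up with. The document and its key names are made up for this example.

```python
import json

# One JSON document containing each of the six JSON data types
document = '''
{
   "number": 3.5,
   "string": "Cake",
   "object": {"width": 200},
   "boolean": true,
   "array": [1, 2, 3],
   "nothing": null
}
'''

decoded = json.loads(document)

# Show the Python type chosen for each JSON value
for name, value in decoded.items():
    print(name, '->', type(value).__name__)
```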
The JSON Library
Python comes with a built-in library for working with JSON files, the json module. You can access this library with a simple import statement.
import json
Using this library, here are the four most common methods to use when working with JSON files.
- dumps() method: This is used for converting a Python object into a JSON string (encoding)
- dump() method: This is used for writing a Python object to a file as JSON (encoding)
- loads() method: This is used for converting a JSON string to a Python object (decoding)
- load() method: This is used for reading a JSON file into a Python object (decoding)
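As a quick preview of how the four methods relate, here is a minimal sketch. It uses io.StringIO as a stand-in for a real file, which is an assumption made to keep the example self-contained; the sample dictionary is also made up for illustration.

```python
import io
import json

data = {"name": "Cake", "width": 200}

# dumps(): Python object -> JSON string
text = json.dumps(data)

# loads(): JSON string -> Python object
from_text = json.loads(text)

# dump(): Python object -> JSON written to a file-like object
buffer = io.StringIO()
json.dump(data, buffer)

# load(): JSON read from a file-like object -> Python object
buffer.seek(0)
from_file = json.load(buffer)

print(text)
print(from_text == data, from_file == data)
```

The methods ending in "s" work with strings, while the ones without it work with file objects.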
Let’s take each of these processes one after the other.
Converting a Python Object to a JSON String
As mentioned earlier, data types change during the encoding process. When encoding (converting Python to JSON), here’s how the data types are converted.
Python | JSON |
list, tuple | array |
dict | object |
float | number |
str | string |
None | null |
True | true |
False | false |
int | number |
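A quick way to check these conversions yourself is to encode one value of each type on its own. This is only a sketch; the sample values are arbitrary.

```python
import json

# Each Python value below is encoded on its own to show the JSON it becomes
samples = [
    [1, 2],      # list  -> array
    (1, 2),      # tuple -> array as well
    {"a": 1},    # dict  -> object
    "text",      # str   -> string
    3,           # int   -> number
    2.5,         # float -> number
    True,        # True  -> true
    False,       # False -> false
    None,        # None  -> null
]

encoded = [json.dumps(value) for value in samples]
for value, result in zip(samples, encoded):
    print(repr(value), '->', result)
```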
Now, let’s go ahead and convert a Python object to a JSON string. Recall we use the dumps() method.
#import necessary libraries
import json
#define a python object as a dictionary
python_object = {
    "id": "0001",
    "type": None,
    "name": "Cake",
    "image":
        {
            "url": "images/0001.jpg",
            "width": 200,
            "height": 200
        },
    "thumbnail":
        {
            "url": "images/thumbnails/0001.jpg",
            "width": 32.3,
            "height": 32.5
        }
}
#encode the python object as a json string
json_string = json.dumps(python_object)
#print json string
print(json_string)
Output:
{"id": "0001", "type": null, "name": "Cake", "image": {"url": "images/0001.jpg", "width": 200, "height": 200}, "thumbnail": {"url": "images/thumbnails/0001.jpg", "width": 32.3, "height": 32.5}}
Notice some of the data type changes. For instance, the None data type is now changed to null.
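Not every Python type has a JSON counterpart. As a hedged sketch, the example below shows what happens with a date value, and one possible way around it using the default argument of dumps(). The order dictionary and its values are made up for illustration, and default=str is just one option, not the only one.

```python
import datetime
import json

# Hypothetical data containing a type that JSON does not support
order = {"name": "Cake", "baked_on": datetime.date(2021, 3, 14)}

# A date has no JSON equivalent, so dumps() raises TypeError by default
try:
    json.dumps(order)
except TypeError as error:
    print('Cannot encode:', error)

# default=str tells dumps() to call str() on anything it cannot encode
encoded_order = json.dumps(order, default=str)
print(encoded_order)
```

Keep in mind that the date comes back as a plain string when the result is decoded again.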
Prettifying JSON Data
To beautify our JSON string a bit, we can add some indentation to the JSON string. This is done by defining the indent argument. Let’s set the indent to 3 and observe the result.
#encode string with indentation
json_string = json.dumps(python_object, indent=3)
#print json string
print(json_string)
Output:
{
   "id": "0001",
   "type": null,
   "name": "Cake",
   "image": {
      "url": "images/0001.jpg",
      "width": 200,
      "height": 200
   },
   "thumbnail": {
      "url": "images/thumbnails/0001.jpg",
      "width": 32.3,
      "height": 32.5
   }
}
Now, you can read and understand the output better. Observe that the image and thumbnail values are themselves objects with their own keys and values.
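Indentation is one end of the spectrum. For comparison, here is a small sketch that encodes the same object three ways: with the defaults, with indent, and in a compact form using the separators argument of dumps(). The shortened python_object is an assumption made to keep the output small.

```python
import json

# A shortened version of the object used in this tutorial
python_object = {"id": "0001", "image": {"width": 200, "height": 200}}

# Default output: a single line with ", " and ": " as separators
default_text = json.dumps(python_object)

# Indented output: one key per line, nested objects indented further
pretty_text = json.dumps(python_object, indent=3)

# Compact output: no spaces after the separators
compact_text = json.dumps(python_object, separators=(',', ':'))

print(default_text)
print(pretty_text)
print(compact_text)
```

All three strings decode to the same Python object; only the whitespace differs.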
Sorting the Keys in a JSON String
If you wish, you can sort the keys in the JSON string using the sort_keys argument. When you set it to True, the keys are sorted in alphabetical order. Let’s try it out.
#encode string with sorted keys
json_string = json.dumps(python_object, indent=3, sort_keys=True)
#print json string
print(json_string)
Output:
{
   "id": "0001",
   "image": {
      "height": 200,
      "url": "images/0001.jpg",
      "width": 200
   },
   "name": "Cake",
   "thumbnail": {
      "height": 32.5,
      "url": "images/thumbnails/0001.jpg",
      "width": 32.3
   },
   "type": null
}
You can see that the keys are now sorted in alphabetical order: id, image, name, thumbnail, and type. The keys inside the nested image and thumbnail objects are sorted as well.
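One practical reason to sort keys is that the output no longer depends on the order in which the dictionary was built. A minimal sketch, using two made-up dictionaries that hold the same data in a different order:

```python
import json

# The same data, built with keys in a different order
first = {"name": "Cake", "id": "0001", "type": None}
second = {"type": None, "id": "0001", "name": "Cake"}

# Without sorting, the output follows insertion order, so the strings differ
print(json.dumps(first) == json.dumps(second))

# With sort_keys=True, both produce the same string
sorted_first = json.dumps(first, sort_keys=True)
sorted_second = json.dumps(second, sort_keys=True)
print(sorted_first == sorted_second)
print(sorted_first)
```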
Saving the JSON string as a JSON file
You can decide to save the JSON string as a file. This time, you use the dump() method rather than the dumps() method.
Let’s see an example.
#import necessary libraries
import json
#define a python object as a dictionary
python_object = {
    "id": "0001",
    "type": None,
    "name": "Cake",
    "image":
        {
            "url": "images/0001.jpg",
            "width": 200,
            "height": 200
        },
    "thumbnail":
        {
            "url": "images/thumbnails/0001.jpg",
            "width": 32.3,
            "height": 32.5
        }
}
#encode with sorted keys and write the result to a json file
with open('new_jsonfile.json', 'w') as file:
    json.dump(python_object, file, indent=3, sort_keys=True)
Observe that the dump() method takes the Python object to encode and the file object to write into.
Output:
A new JSON file has been created.
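If you would like to confirm what dump() actually wrote, you can read the file back as plain text. The sketch below does this in a temporary directory so that nothing on disk is overwritten; the file name example.json and the shortened object are assumptions for illustration.

```python
import json
import os
import tempfile

python_object = {"name": "Cake", "id": "0001"}

# Write into a temporary directory so no existing file is overwritten
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'example.json')  # hypothetical file name

    with open(path, 'w') as file:
        json.dump(python_object, file, indent=3, sort_keys=True)

    # Read the raw text back to see exactly what was written
    with open(path) as file:
        written_text = file.read()

print(written_text)
```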
Next, let’s see how to decode a JSON file.
Converting JSON to a Python Object
Converting JSON to a Python object is called decoding. In this section, we will discuss how to decode a JSON string or file. Let’s begin with a JSON string. To decode a JSON string, you use the loads() method.
Let’s see an example.
#import necessary libraries
import json
#define a json string
json_string = '''
{
   "id": "0001",
   "type": null,
   "name": "Cake",
   "image": {
      "url": "images/0001.jpg",
      "width": 200,
      "height": 200
   },
   "thumbnail": {
      "url": "images/thumbnails/0001.jpg",
      "width": 32.3,
      "height": 32.5
   }
}
'''
#decode the json string
python_object = json.loads(json_string)
#print the result
print(python_object)
Output:
{'id': '0001', 'type': None, 'name': 'Cake', 'image': {'url': 'images/0001.jpg', 'width': 200, 'height': 200}, 'thumbnail': {'url': 'images/thumbnails/0001.jpg', 'width': 32.3, 'height': 32.5}}
Notice that the null data type has been converted to None. Because the decoded Python object is a dictionary, we can look up its values by key. Say we wish to check the value of the thumbnail key, we can write
#print the values of the thumbnail key
print(python_object['thumbnail'])
Output:
{'url': 'images/thumbnails/0001.jpg', 'width': 32.3, 'height': 32.5}
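Since nested JSON objects become nested dictionaries, you can chain keys to reach deeper values. The sketch below also shows dict.get(), which is one way to handle a key that may be missing. The topping key is hypothetical and deliberately absent from the data.

```python
import json

python_object = json.loads('''
{
   "id": "0001",
   "thumbnail": {"url": "images/thumbnails/0001.jpg", "width": 32.3}
}
''')

# Chain keys to reach a value inside a nested object
thumbnail_url = python_object['thumbnail']['url']
print(thumbnail_url)

# dict.get() returns a default instead of raising KeyError for a missing key
topping = python_object.get('topping', 'not specified')
print(topping)
```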
You can also parse a JSON file and convert it to a Python object.
Decoding a JSON file to a Python Object
To decode a JSON file to a Python object, you use the load() method rather than the loads() method.
#import necessary libraries
import json
#read json file
with open('new_jsonfile.json') as file:
    new_python_object = json.load(file)
#print the result
print(new_python_object)
Output:
{'id': '0001', 'image': {'height': 200, 'url': 'images/0001.jpg', 'width': 200}, 'name': 'Cake', 'thumbnail': {'height': 32.5, 'url': 'images/thumbnails/0001.jpg', 'width': 32.3}, 'type': None}
As seen, the JSON file we created earlier is decoded and printed as a Python dictionary.
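In practice, the file may be missing or may not contain valid JSON. Here is a minimal sketch of one way to guard against both cases. The helper name read_json_file and the file name no_such_file.json are hypothetical, and the file is assumed not to exist.

```python
import json

def read_json_file(path):
    """Return the decoded content of a JSON file, or None if it cannot be read."""
    try:
        with open(path) as file:
            return json.load(file)
    except FileNotFoundError:
        print(f'{path} does not exist')
    except json.JSONDecodeError as error:
        print(f'{path} is not valid JSON: {error}')
    return None

# A file name that is assumed not to exist, to show the fallback
result = read_json_file('no_such_file.json')
print(result)
```

Returning None is a design choice for this sketch; depending on your program, letting the exception propagate may be more appropriate.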
Working with JSON from APIs
When retrieving data from the web using an API, most APIs return the data in JSON format. This is one common application of working with JSON. Let’s say we wish to get the exchange information for different currencies; we can get the data from the GDAX API. See the example below.
#import the necessary libraries
import json
from urllib.request import urlopen
#open the JSON from the API
with urlopen('https://api.gdax.com/products/') as response:
    texts = response.read()
#convert json to python objects
data = json.loads(texts)
#convert python object to a prettified json
print(json.dumps(data, indent=3))
Output:
[
   {
      "id": "LOOM-USDC",
      "base_currency": "LOOM",
      "quote_currency": "USDC",
      "base_min_size": "1.00000000",
      "base_max_size": "2500000.00000000",
      "quote_increment": "0.00000100",
      "base_increment": "1.00000000",
      "display_name": "LOOM/USDC",
      "min_market_funds": "0.1",
      "max_market_funds": "100000",
      "margin_enabled": false,
      "post_only": false,
      "limit_only": true,
      "cancel_only": false,
      "trading_disabled": false,
      "status": "online",
      "status_message": ""
   },
   {
      "id": "DAI-USDC",
      "base_currency": "DAI",
      "quote_currency": "USDC",
      "base_min_size": "1.00000000",
      "base_max_size": "100000.00000000",
      "quote_increment": "0.00000100",
      "base_increment": "0.00001000",
      "display_name": "DAI/USDC",
      "min_market_funds": "5",
      "max_market_funds": "100000",
      "margin_enabled": false,
      "post_only": false,
      "limit_only": true,
      "cancel_only": false,
      "trading_disabled": false,
      "status": "online",
      "status_message": ""
   },
   {
      "id": "XTZ-USD",
      "base_currency": "XTZ",
      "quote_currency": "USD",
      "base_min_size": "1.00000000",
      "base_max_size": "100000.00000000",
      "quote_increment": "0.00010000",
      "base_increment": "0.01000000",
      "display_name": "XTZ/USD",
      "min_market_funds": "10",
      "max_market_funds": "100000",
      "margin_enabled": false,
      "post_only": false,
      "limit_only": false,
      "cancel_only": false,
      "trading_disabled": false,
      "status": "online",
      "status_message": ""
   },
   {
      "id": "EOS-BTC",
      "base_currency": "EOS",
      "quote_currency": "BTC",
      "base_min_size": "0.10000000",
      "base_max_size": "50000.00000000",
      "quote_increment": "0.00000100",
      "base_increment": "0.10000000",
      "display_name": "EOS/BTC",
      "min_market_funds": "0.001",
      "max_market_funds": "30",
      "margin_enabled": false,
      "post_only": false,
      "limit_only": false,
      "cancel_only": false,
      "trading_disabled": false,
      "status": "online",
      "status_message": ""
   },
   ...
]
The output shown above is only part of the full response. If, say, we want to print all the currency exchanges in this data, we can write
#print the list of exchanges available
for item in data:
    print(item['id'], end=', ')
print()
print(f'There are {len(data)} currency exchanges in this data')
Output:
ETC-GBP, FIL-USD, BNT-GBP, ETH-EUR, NMR-BTC, XRP-USD, UNI-BTC, FIL-EUR, BNT-BTC, AAVE-GBP, BNT-EUR, BTC-GBP, BCH-BTC, LRC-USD, XRP-EUR, EOS-USD, ALGO-GBP, EOS-EUR, MKR-USD, UNI-USD, UMA-EUR, COMP-BTC, BNT-USD, ETH-USDC, NU-GBP, NU-EUR, LTC-BTC, REP-USD, BAND-BTC, EOS-BTC, KNC-USD, LTC-USD, DASH-USD, NU-USD, WBTC-USD, BTC-USDC, ZEC-USD, XLM-USD, XTZ-USD, FIL-BTC, BAL-USD, ETH-BTC, SNX-EUR, SNX-GBP, YFI-BTC, DASH-BTC, DAI-USD, GRT-BTC, UMA-GBP, CVC-USDC, XTZ-BTC, REN-BTC, ZEC-USDC, GNT-USDC, CGLD-EUR, LTC-EUR, FIL-GBP, CGLD-USD, ALGO-BTC, XTZ-EUR, GRT-USD, MANA-USDC, SNX-BTC, ATOM-USD, BAL-BTC, KNC-BTC, CGLD-BTC, NMR-USD, BAND-EUR, ALGO-EUR, OMG-EUR, XLM-BTC, ETC-USD, YFI-USD, BAT-USDC, OMG-BTC, LINK-EUR, NU-BTC, LINK-ETH, OMG-USD, ETH-GBP, DNT-USDC, ZRX-EUR, REP-BTC, CGLD-GBP, AAVE-EUR, ETC-EUR, REN-USD, UMA-USD, LRC-BTC, XRP-GBP, XTZ-GBP, ETC-BTC, ATOM-BTC, NMR-GBP, LOOM-USDC, ZEC-BTC, BCH-GBP, GRT-GBP, COMP-USD, ETH-USD, BCH-USD, ETH-DAI, BCH-EUR, AAVE-USD, UMA-BTC, OXT-USD, BAND-GBP, BAT-ETH, XRP-BTC, XLM-EUR, MKR-BTC, WBTC-BTC, ALGO-USD, SNX-USD, DAI-USDC, OMG-GBP, LINK-BTC, LINK-USD, BTC-EUR, ZRX-BTC, AAVE-BTC, LTC-GBP, GRT-EUR, BAND-USD, NMR-EUR, ZRX-USD, BTC-USD, LINK-GBP,
There are 129 currency exchanges in this data
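Because the decoded data is a list of dictionaries, ordinary list comprehensions work on it. Here is a minimal sketch that filters the entries by quote currency. To stay runnable without a network connection, it uses a small sample copied from the response shown earlier (a subset of the fields), not live data.

```python
# A small sample shaped like the API response shown earlier (subset of fields)
data = [
    {"id": "LOOM-USDC", "base_currency": "LOOM", "quote_currency": "USDC"},
    {"id": "DAI-USDC", "base_currency": "DAI", "quote_currency": "USDC"},
    {"id": "XTZ-USD", "base_currency": "XTZ", "quote_currency": "USD"},
    {"id": "EOS-BTC", "base_currency": "EOS", "quote_currency": "BTC"},
]

# Keep only the ids of the entries quoted in USDC
usdc_pairs = [item['id'] for item in data if item['quote_currency'] == 'USDC']
print(usdc_pairs)

# Join all ids into one string without leaving a trailing separator
all_ids = ', '.join(item['id'] for item in data)
print(all_ids)
```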
If, say, we wish to check the currency exchange with the highest market fund, we can run
#create a list to store each id with its max market fund
max_market_list = []
for item in data:
    max_market_list.append((item['id'], item['max_market_funds']))
#print the largest (id, max_market_funds) tuple
print(max(max_market_list))
Output:
('ZRX-USD', '100000')
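As seen, the result is the ZRX to USD exchange, with a max market fund of 100000. Note, however, that max() compares tuples element by element, starting with the id, and that the fund values are strings. The result is therefore the entry whose id comes last alphabetically, which is not necessarily the one with the highest market fund. Ranking by fund requires converting the values to numbers and comparing on them.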
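To rank by the fund value itself rather than by tuple order, one option is to pass max() a key function that converts the string to a number. The sketch below is only an illustration: it uses a few sample records copied from the response shown earlier (a subset of the fields), not live data.

```python
# A small sample shaped like the API response shown earlier (subset of fields)
data = [
    {"id": "LOOM-USDC", "max_market_funds": "100000"},
    {"id": "XTZ-USD", "max_market_funds": "100000"},
    {"id": "EOS-BTC", "max_market_funds": "30"},
]

# max_market_funds arrives as a string, so convert it before comparing
top = max(data, key=lambda item: float(item['max_market_funds']))
print(top['id'], top['max_market_funds'])

# Several entries can share the highest value, so collect all of them
highest = float(top['max_market_funds'])
tied = [item['id'] for item in data if float(item['max_market_funds']) == highest]
print(tied)
```

When several entries are tied, max() returns the first one it encounters, which is why the second step collects every entry with the highest value.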
This is how you can work with data in JSON format from APIs.
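Network requests can fail, and a response is not guaranteed to be valid JSON. As a hedged sketch, the two helpers below wrap the fetch and decode steps with basic error handling. The names decode_body and fetch_json, the 10-second timeout, and the choice to return None on failure are all assumptions made for this illustration.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen

def decode_body(body):
    """Decode a response body (bytes or str) into a Python object, or None."""
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None

def fetch_json(url, timeout=10):
    """Fetch url and decode its JSON body; return None if URLError is raised."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return decode_body(response.read())
    except URLError:
        return None

# json.loads() accepts bytes, which is what response.read() returns
print(decode_body(b'[{"id": "XTZ-USD", "status": "online"}]'))

# A body that is not JSON yields None instead of an exception
print(decode_body(b'not json at all'))
```

You would call fetch_json() with the URL of the API you are working with; it is not called here so that the example runs without a network connection.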
To conclude,
We have discussed how to encode and decode JSON strings and files in Python. You have also learned that it is good practice to prettify your JSON files using the indent argument. Finally, you have seen how to work with JSON files from APIs.
If you’ve got any questions, feel free to leave them in the comment section and I’ll do my best to answer them.