Streaming JSONL data

In this example, we demonstrate how to stream JSONL data from a URL, parse it into a dataframe, and visualise the data. We use the XSGD-USDC trading pair on Uniswap V3.

JSON Lines

JSON Lines is a convenient format for storing structured data that may be processed one record at a time. It is the preferred method for getting large amounts of OHLCV data because records can be processed incrementally, which keeps the memory footprint low.

JSON Lines is like JSON, but each new line (record) is a new JSON object. The basic structure of JSON Lines is very simple:

{"name": "John", "age": 30, "city": "New York"}
{"name": "Jane", "age": 25, "city": "Chicago"}
{"name": "Jim", "age": 35, "city": "Los Angeles"}

In the above example, each line is a valid JSON object, and each line can be parsed independently of the other lines. This property makes JSON Lines ideal for data storage where new records are appended over time, such as log files. It also suits line-by-line processing with Unix tools such as ‘grep’, ‘sed’, or ‘awk’.
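
Because every line stands alone, reading JSON Lines needs little more than a loop and `json.loads`. The following is a minimal sketch using the sample records above; the helper name `iter_jsonl` is illustrative and not part of any library.

```python
import io
import json
from typing import Any, Dict, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)


# An in-memory stand-in for a file or network stream
sample = io.StringIO(
    '{"name": "John", "age": 30, "city": "New York"}\n'
    '{"name": "Jane", "age": 25, "city": "Chicago"}\n'
    '{"name": "Jim", "age": 35, "city": "Los Angeles"}\n'
)

# Records are parsed one at a time; only the names are kept
names = [record["name"] for record in iter_jsonl(sample)]
print(names)  # ['John', 'Jane', 'Jim']
```

The same generator works unchanged on an open file handle, since iterating over a text file also yields one line at a time.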

In comparison, the same data in regular JSON would look like:

[
  {"name": "John", "age": 30, "city": "New York"},
  {"name": "Jane", "age": 25, "city": "Chicago"},
  {"name": "Jim", "age": 35, "city": "Los Angeles"}
]

In the above, you can see that the entire string has to be parsed as a whole to get the data, unlike JSON Lines where each line is independently a JSON object.
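
To make the difference concrete, the sketch below cuts both formats off mid-record, as might happen with an interrupted download. The truncated JSON array cannot be parsed at all, while every complete line of the JSON Lines text is still usable. The sample strings are made up for illustration.

```python
import json

# Both inputs end abruptly in the middle of the third record
json_array = '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}, {"name": "Jim"'
json_lines = '{"name": "John", "age": 30}\n{"name": "Jane", "age": 25}\n{"name": "Jim"'

# Regular JSON: one syntax error anywhere invalidates the whole document
try:
    json.loads(json_array)
    array_ok = True
except json.JSONDecodeError:
    array_ok = False

# JSON Lines: parse line by line and keep whatever is complete
recovered = []
for line in json_lines.splitlines():
    try:
        recovered.append(json.loads(line))
    except json.JSONDecodeError:
        break  # the last line was cut off; earlier records are intact

print(array_ok)        # False
print(len(recovered))  # 2
```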

Data Streaming

[1]:

import requests
import pandas as pd
import json

pair_url = "https://tradingstrategy.ai/api/candles-jsonl?pair_ids=2699634&time_bucket=1d"

def get_candles(url: str) -> pd.DataFrame:
    response = requests.get(url)
    response.raise_for_status()  # fail early on HTTP errors

    # Each non-empty line of the response body is one JSON object (one candle)
    json_objects = []
    for line in response.text.splitlines():
        if not line.strip():
            continue
        try:
            json_objects.append(json.loads(line))
        except json.JSONDecodeError:
            pass  # skip malformed lines

    candles = pd.DataFrame(json_objects)
    candles.rename(columns = {'ts':'date','o':'open', 'h':'high','l':'low','c':'close','v':'volume'}, inplace = True)
    # 'date' holds Unix timestamps in seconds, so the unit must be given explicitly
    candles['timestamp'] = pd.to_datetime(candles['date'], unit='s')
    candles = candles.set_index('date')
    return candles

pair_data = get_candles(pair_url)

display(pair_data.head())

	open	high	low	close	volume	xr	b	s	tc	bv	sv	p	sb	eb	timestamp
date
1623196800	0.751273	0.761408	0.751273	0.753983	46404.942246	1.0	None	None	None	None	None	2699634	12599483	12602480	2021-06-09
1623283200	0.753983	0.754360	0.749023	0.754360	27170.926637	1.0	None	None	None	None	None	2699634	12607138	12608081	2021-06-10
1623369600	0.754511	0.754511	0.754511	0.754511	556.739452	1.0	None	None	None	None	None	2699634	12611519	12611519	2021-06-11
1623456000	0.749398	0.753380	0.733605	0.753380	126186.194150	1.0	None	None	None	None	None	2699634	12618259	12619487	2021-06-12
1623542400	0.744022	0.751574	0.744022	0.751574	44663.119939	1.0	None	None	None	None	None	2699634	12628856	12628860	2021-06-13
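
The `get_candles` function downloads the full response before parsing it. To process candles as they arrive, `requests` can stream the body with `stream=True` and `iter_lines()`. The sketch below is one possible approach, not necessarily how the notebook authors would do it. `parse_candle_lines`, `stream_candles` and `highest_high` are hypothetical names, and the short field names (`ts`, `h`) are taken from the rename mapping in the cell above. The demo lines are an assumed shape built from the values in the table.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional, Union


def parse_candle_lines(lines: Iterable[Union[str, bytes]]) -> Iterator[Dict[str, Any]]:
    """Parse JSONL lines lazily, skipping blank (e.g. keep-alive) lines."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def stream_candles(url: str) -> Iterator[Dict[str, Any]]:
    """Yield candles one by one while the HTTP response is still downloading."""
    import requests  # third-party; imported lazily so the parser works without it

    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        yield from parse_candle_lines(response.iter_lines())


def highest_high(candles: Iterable[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the candle with the largest high ("h"), holding one candle at a time."""
    best = None
    for candle in candles:
        if best is None or candle["h"] > best["h"]:
            best = candle
    return best


# Offline demonstration with lines shaped like the API output (assumed shape)
demo_lines = [
    b'{"ts": 1623196800, "h": 0.761408}',
    b"",
    b'{"ts": 1623283200, "h": 0.754360}',
]
peak = highest_high(parse_candle_lines(demo_lines))
print(peak["ts"], peak["h"])
```

With network access, `highest_high(stream_candles(pair_url))` would run the same aggregation against the live endpoint without ever holding the full dataset in memory.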

Data Visualisation

[2]:

from tradingstrategy.charting.candle_chart import visualise_ohlcv

def get_figure(candles: pd.DataFrame, chart_name: str):
    return visualise_ohlcv(
        candles,
        height=600,
        theme="plotly_white",
        chart_name=chart_name,
        y_axis_name="Price",
        volume_axis_name="volume",
    )

fig = get_figure(pair_data, "XSGD/USDC")

fig.show()