Streaming JSONL data
In this example, we demonstrate how to stream JSONL data from a URL, parse it into a pandas DataFrame, and visualise it. We use the XSGD-USDC trading pair on Uniswap V3 as an example.
JSON Lines
JSON Lines is a convenient format for storing structured data that may be processed one record at a time. It is the preferred method for getting large amounts of OHLCV data because it allows incremental processing and has a lower memory footprint.
JSON Lines is like JSON, but each new line (record) is a new JSON object. The basic structure of JSON Lines is very simple:
{"name": "John", "age": 30, "city": "New York"}
{"name": "Jane", "age": 25, "city": "Chicago"}
{"name": "Jim", "age": 35, "city": "Los Angeles"}
In the above example, each line is a valid JSON object, and each line can be parsed independently of the other lines. This property makes JSON Lines ideal for data storage where new records are appended over time, such as log files. It also suits data processing that can be carried out line by line, for example with Unix tools such as grep, sed, or awk.
In comparison, the same data in regular JSON would look like:
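To make the line-by-line property concrete, here is a minimal sketch that decodes the three records above one line at a time using only the standard library. The helper name `parse_jsonl` is hypothetical and chosen for illustration.

```python
import json

# The three example records from above, one JSON object per line
jsonl_text = """{"name": "John", "age": 30, "city": "New York"}
{"name": "Jane", "age": 25, "city": "Chicago"}
{"name": "Jim", "age": 35, "city": "Los Angeles"}
"""


def parse_jsonl(text: str) -> list:
    """Decode JSON Lines text, parsing each non-empty line on its own."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines, e.g. after a trailing newline
        records.append(json.loads(line))
    return records


records = parse_jsonl(jsonl_text)
print([record["name"] for record in records])
```

Because every line is decoded separately, the loop never needs more than one record in memory at a time unless the caller chooses to collect them.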
[
{"name": "John", "age": 30, "city": "New York"},
{"name": "Jane", "age": 25, "city": "Chicago"},
{"name": "Jim", "age": 35, "city": "Los Angeles"}
]
In the above, you can see that the entire string has to be parsed as a whole to get the data, unlike JSON Lines where each line is independently a JSON object.
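One practical consequence can be sketched by simulating a download that is cut off partway through the last record. This is an illustrative experiment, not part of the original notebook: the truncated JSON array cannot be parsed at all, while the truncated JSON Lines text still yields every complete record.

```python
import json

records = [
    {"name": "John", "age": 30, "city": "New York"},
    {"name": "Jane", "age": 25, "city": "Chicago"},
    {"name": "Jim", "age": 35, "city": "Los Angeles"},
]

as_json = json.dumps(records)                          # one JSON array
as_jsonl = "\n".join(json.dumps(r) for r in records)   # one object per line

# Simulate an interrupted transfer by dropping the last 10 characters
truncated_json = as_json[:-10]
truncated_jsonl = as_jsonl[:-10]

# Regular JSON: the whole document is invalid once the end is missing
try:
    json.loads(truncated_json)
    json_parsed = True
except json.JSONDecodeError:
    json_parsed = False

# JSON Lines: every complete line is still usable
recovered = []
for line in truncated_jsonl.splitlines():
    try:
        recovered.append(json.loads(line))
    except json.JSONDecodeError:
        break  # the final line is incomplete, stop here

print(json_parsed, [r["name"] for r in recovered])
```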
Data Streaming
[1]:
import json

import pandas as pd
import requests

pair_url = "https://tradingstrategy.ai/api/candles-jsonl?pair_ids=2699634&time_bucket=1d"


def get_candles(url: str) -> pd.DataFrame:
    """Download JSONL candle data from the URL and return it as a DataFrame."""
    response = requests.get(url)
    response.raise_for_status()  # fail early on HTTP errors

    # Each non-empty line of the response body is one JSON object
    json_objects = []
    for line in response.text.splitlines():
        try:
            json_objects.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # skip blank or malformed lines

    candles = pd.DataFrame(json_objects)
    candles = candles.rename(columns={'ts': 'date', 'o': 'open', 'h': 'high', 'l': 'low', 'c': 'close', 'v': 'volume'})
    # 'date' holds UNIX timestamps in seconds, so the unit must be given explicitly
    candles['timestamp'] = pd.to_datetime(candles['date'], unit='s')
    candles = candles.set_index('date')
    return candles


pair_data = get_candles(pair_url)
display(pair_data.head())

date | open | high | low | close | volume | xr | b | s | tc | bv | sv | p | sb | eb | timestamp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1623196800 | 0.751273 | 0.761408 | 0.751273 | 0.753983 | 46404.942246 | 1.0 | None | None | None | None | None | 2699634 | 12599483 | 12602480 | 2021-06-09 |
1623283200 | 0.753983 | 0.754360 | 0.749023 | 0.754360 | 27170.926637 | 1.0 | None | None | None | None | None | 2699634 | 12607138 | 12608081 | 2021-06-10 |
1623369600 | 0.754511 | 0.754511 | 0.754511 | 0.754511 | 556.739452 | 1.0 | None | None | None | None | None | 2699634 | 12611519 | 12611519 | 2021-06-11 |
1623456000 | 0.749398 | 0.753380 | 0.733605 | 0.753380 | 126186.194150 | 1.0 | None | None | None | None | None | 2699634 | 12618259 | 12619487 | 2021-06-12 |
1623542400 | 0.744022 | 0.751574 | 0.744022 | 0.751574 | 44663.119939 | 1.0 | None | None | None | None | None | 2699634 | 12628856 | 12628860 | 2021-06-13 |
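The function above downloads the whole response body before parsing it. For larger datasets, the records can instead be decoded as they arrive. The following is a minimal sketch of that approach, assuming the endpoint returns one JSON object per line as shown above. The helper names `iter_jsonl` and `stream_candles` are hypothetical; `stream=True` and `iter_lines()` are standard `requests` features.

```python
import json
from typing import Iterable, Iterator, Union


def iter_jsonl(lines: Iterable[Union[bytes, str]]) -> Iterator[dict]:
    """Yield one decoded record per non-empty line."""
    for raw in lines:
        if not raw or not raw.strip():
            continue  # skip keep-alive and blank lines
        yield json.loads(raw)  # json.loads accepts both str and bytes


def stream_candles(url: str) -> Iterator[dict]:
    """Yield candles from a JSONL endpoint without buffering the whole body."""
    import requests  # imported here so iter_jsonl works without the HTTP dependency

    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        yield from iter_jsonl(response.iter_lines())


# Offline demonstration with two candles shaped like the API output above
sample_lines = [
    b'{"ts": 1623196800, "o": 0.751273, "c": 0.753983}',
    b"",
    b'{"ts": 1623283200, "o": 0.753983, "c": 0.754360}',
]
closes = [candle["c"] for candle in iter_jsonl(sample_lines)]
print(closes)
```

With network access, `pd.DataFrame(list(stream_candles(pair_url)))` would collect the streamed records into a DataFrame, or each candle could be processed one at a time inside a `for` loop to keep memory usage low.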
Data Visualisation
[2]:
from tradingstrategy.charting.candle_chart import visualise_ohlcv


def get_figure(candles: pd.DataFrame, chart_name: str):
    return visualise_ohlcv(
        candles,
        height=600,
        theme="plotly_white",
        chart_name=chart_name,
        y_axis_name="Price",
        volume_axis_name="volume",
    )


fig = get_figure(pair_data, "XSGD/USDC")
fig.show()