Data Collection

To backtest the FX strategy, we will collect 1-hour bars from Interactive Brokers for EUR.USD.

EUR.USD is included in QuantRocket's list of free sample data. No QuantRocket subscription is required to collect EUR.USD data, but an IBKR account is. (IBKR provides FX data free of charge to account holders.)

Collect EUR.USD listing

The first step is to collect the securities master record for EUR.USD from Interactive Brokers.

To collect data, first start IB Gateway:

In [1]:
from quantrocket.ibg import start_gateways
start_gateways(wait=True)
Out[1]:
{'ibg1': {'status': 'running'}}

Then collect the securities master record:

In [2]:
from quantrocket.master import collect_ibkr_listings
collect_ibkr_listings(exchanges="IDEALPRO", symbols="EUR.USD")
Out[2]:
{'status': 'the IBKR listing details will be collected asynchronously'}

Monitor flightlog for completion:

quantrocket.master: INFO Collecting IDEALPRO listings from IBKR website (EUR.USD only)
quantrocket.master: INFO Requesting details for 1 IDEALPRO listings found on IBKR website
quantrocket.master: INFO Saved 1 IDEALPRO listings to securities master database

Look up EUR.USD

We need to look up the Sid (security ID) for EUR.USD so we can collect historical data for it. We use the command line interface here because it requires less typing.

Prefixing a line with ! allows running terminal commands from inside a notebook.

In [3]:
!quantrocket master get --symbols 'EUR.USD' --json | json2yaml
---
  - 
    Sid: "FXEURUSD"
    Symbol: "EUR.USD"
    Exchange: "IDEALPRO"
    Country: null
    Currency: "USD"
    SecType: "CASH"
    Etf: 0
    Timezone: "America/New_York"
    Name: "European Monetary Union Euro"
    PriceMagnifier: 1
    Multiplier: 1
    Delisted: null
    DateDelisted: null
    LastTradeDate: null
    RolloverDate: null
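
To reuse the Sid in later cells without retyping it, the JSON emitted by the command above can be parsed in Python. The following is a minimal sketch, not part of the original walkthrough: the cli_output string is hand-transcribed and abbreviated from the record shown above rather than captured from a live call, and find_sid is a hypothetical helper, not a QuantRocket function.

```python
import json

# Abbreviated JSON for the record shown above. ASSUMPTION: hand-transcribed
# from the YAML output, not a live response from the CLI.
cli_output = (
    '[{"Sid": "FXEURUSD", "Symbol": "EUR.USD", '
    '"Exchange": "IDEALPRO", "SecType": "CASH"}]'
)


def find_sid(raw_json, symbol):
    """Return the Sid of the single record whose Symbol equals `symbol`.

    Hypothetical helper: raises ValueError when zero or several records match,
    so an ambiguous lookup is never silently accepted.
    """
    matches = [rec["Sid"] for rec in json.loads(raw_json) if rec["Symbol"] == symbol]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match for {symbol}, found {len(matches)}")
    return matches[0]


sid = find_sid(cli_output, "EUR.USD")
print(sid)
```

Failing loudly on zero or multiple matches is a deliberate choice here, since a symbol can in principle be listed on more than one exchange.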

Collect historical data

Next, we create a database for collecting 1-hour EUR.USD bars. Because FX is an OTC market, historical bars represent the bid-ask midpoint rather than actual trades. We specify MIDPOINT as the bar type to be explicit, although this would be implied for CASH securities even if we omitted the bar_type parameter. We name the database after EUR.USD's nickname, Fiber.

The shard parameter is required for intraday databases and determines how to split up large databases into smaller pieces for better performance. Since our universe contains only one pair, we can turn this feature off. See the usage guide to learn more.

In [4]:
from quantrocket.history import create_ibkr_db
create_ibkr_db("fiber-1h", sids="FXEURUSD", bar_size="1 hour", bar_type="MIDPOINT", shard="off")
Out[4]:
{'status': 'successfully created quantrocket.v2.history.fiber-1h.sqlite'}

Then collect the data:

In [5]:
from quantrocket.history import collect_history
collect_history("fiber-1h")
Out[5]:
{'status': 'the historical data will be collected asynchronously'}

Monitor flightlog for completion:

quantrocket.history: INFO [fiber-1h] Collecting history from IBKR for 1 securities in fiber-1h
quantrocket.history: INFO [fiber-1h] Saved 85778 total records for 1 total securities to quantrocket.v2.history.fiber-1h.sqlite
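
As a rough sanity check on the completion message, the record count can be converted into an approximate time span. This sketch is an illustration only: the log line is copied from the output above, saved_record_count is a hypothetical helper, and the figure of about 120 hourly bars per week (24 hours, 5 days) is an assumption about FX trading hours, not something reported by QuantRocket.

```python
import re

# Completion line copied from the flightlog output above.
log_line = (
    "quantrocket.history: INFO [fiber-1h] Saved 85778 total records for 1 total "
    "securities to quantrocket.v2.history.fiber-1h.sqlite"
)

# ASSUMPTION: FX trades roughly 24 hours a day, 5 days a week,
# giving about 120 hourly bars per week. Actual sessions vary.
BARS_PER_WEEK = 24 * 5


def saved_record_count(line):
    """Extract N from a 'Saved N total records' line, or return None if absent."""
    match = re.search(r"Saved (\d+) total records", line)
    return int(match.group(1)) if match else None


records = saved_record_count(log_line)
approx_weeks = records / BARS_PER_WEEK
approx_years = approx_weeks / 52
print(f"{records} bars is roughly {approx_weeks:.0f} weeks ({approx_years:.1f} years) of hourly data")
```

If the estimate is far from the history you expected, the collection may have stopped early or the assumed bars-per-week figure may not fit the data.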

For more detailed feedback during the data collection, check the detailed logs (quantrocket flightlog stream -d), which reveal data being collected one month at a time:

quantrocket_history_1|Issuing to ibg1 historical data request for 1 M of 1 hour MIDPOINT for EUR.USD CASH (sid FXEURUSD) ending 20130208 23:00:00 GMT
quantrocket_history_1|Issuing to ibg1 historical data request for 1 M of 1 hour MIDPOINT for EUR.USD CASH (sid FXEURUSD) ending 20130308 23:00:00 GMT
...
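
The month-at-a-time stepping can be confirmed by parsing the end timestamps from consecutive request lines. This is a sketch under stated assumptions: the two lines are copied from the detailed log above, the regular expression is inferred from those two lines only, and request_end is a hypothetical helper rather than part of QuantRocket.

```python
import re
from datetime import datetime

# Two consecutive request lines copied from the detailed log above.
log_lines = [
    "quantrocket_history_1|Issuing to ibg1 historical data request for 1 M of 1 hour "
    "MIDPOINT for EUR.USD CASH (sid FXEURUSD) ending 20130208 23:00:00 GMT",
    "quantrocket_history_1|Issuing to ibg1 historical data request for 1 M of 1 hour "
    "MIDPOINT for EUR.USD CASH (sid FXEURUSD) ending 20130308 23:00:00 GMT",
]

# ASSUMPTION: pattern inferred from the two lines above; other log lines may differ.
REQUEST_RE = re.compile(r"ending (\d{8} \d{2}:\d{2}:\d{2}) GMT")


def request_end(line):
    """Parse the 'ending ... GMT' timestamp of a request line into a naive datetime."""
    match = REQUEST_RE.search(line)
    if match is None:
        raise ValueError("not a historical data request line")
    return datetime.strptime(match.group(1), "%Y%m%d %H:%M:%S")


ends = [request_end(line) for line in log_lines]
step = ends[1] - ends[0]
print(ends[1].date(), "follows", ends[0].date(), "by", step.days, "days")
```

The same parsing could be applied to a longer stream of log lines to see how far the collection has progressed.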

Next Up

Part 2: Time-of-Day Research