F# – A trading strategy backtester #1

In this series of posts I will present an F#-based trading backtester. To follow the material you need a basic understanding of financial market trading principles and an intermediate knowledge of F#.

This first post shows a way to get market data from the internet (using Alpha Vantage). Later posts will show charting with XPlot and a callback-based trading strategy tester built on Deedle.

Background

Trading is “the action or activity of buying and selling goods and services”. In finance, though, “goods and services” takes on an extended meaning: what is traded is not real goods or services, but paper or even immaterial assets (electronic records). What do all these material and immaterial assets have that makes them tradable? A value, or price. And while goods and services are concrete things traded out of necessity or desire, why would anyone need or want to possess a piece of paper or a bunch of bits in an electronic storage device? Well, the price of that asset is not fixed but changes over time, so if we buy it now and sell it later when the price is higher, we realize a profit. This, in simple words, is speculation. Of course, not all financial trading is speculative, but most is.

However, prices do not always rise; they can also fall. In that case we make a loss instead of a profit.

It is actually possible to make a profit when asset values fall (short selling), but I don’t want to make it too complicated here.

So, all we need is to know when to buy and when to sell in order to achieve profits and avoid losses. In other words we need a strategy, a profitable strategy.

Prices can be influenced by a plethora of causes. In the past, markets were fairly limited and it was possible to develop a kind of “gut feel” for what a price would do in the future. Globalization has made this really difficult, because the number of factors and correlations to account for is more than a human brain can handle. That’s where computers help. Humans tell computers what data to consider and how to consider them, and computers give back the longed-for signal: BUY, SELL, WAIT.

Even when the amount of data is manageable, it takes a lot of effort to analyse the data correctly and to place the trades by hand, especially if we want to diversify our portfolio and trade more frequently. Again, computers come to the rescue.

Finally, as past behavior is the best predictor of future behavior, whatever strategy we devise, we would trust it far more if we could test it against past history. This is where computers are massively useful. With them, we can test and tune thousands of strategies over decades of historical data in a tiny fraction of the time humans would take through paper trading.

There are countless developments in financial computer programming. Python and C++ are among the most used programming languages. The former is a dynamic, duck-typed language that makes it easy to quickly write and refactor operational code; the latter is the go-to choice when speed is a key factor, as it compiles to highly optimized, unmanaged native code and is the high-level object-oriented language closest to the machine.

In these posts, though, I will use F#, because I think it offers the best of both worlds. I believe F# follows the Pareto rule here: it allows us to write code quickly and run it fast enough to replace the process of “test in Python, then translate to C++ for execution” in 80% of cases. Very-high-frequency trading, where decision-making algorithms need to execute in milliseconds, will still require C++ (or C or assembly), but I doubt such systems make up more than 20% of all financial software.

Getting market data from the internet

Getting historical market data from the internet is quite easy nowadays. For a long time, both Yahoo and Google offered free data APIs. Lately, though, these mainstream companies have largely withdrawn from this service.

At the time of writing, however, a new free service has been introduced: Alpha Vantage. It provides free historical feeds for many markets, along with pre-calculated technical indicators, going back 20 years.

So, what we are going to do is write a simple API composed of two functions:

getStock symbol timeframe
getIndi symbol timeframe indicator

to download market data and transform them into our domain type values.

Alpha Vantage data is provided in JSON or CSV format, both of which have Type Providers available for F#. However, looking at the shape of the data, there does not seem to be a well-defined static schema that would make Type Providers fully advantageous in this case:

{
    "Meta Data": {
        "1. Information": "Daily Prices (open, high, low, close) and Volumes",
        "2. Symbol": "MSFT",
        "3. Last Refreshed": "2018-08-17",
        "4. Output Size": "Compact",
        "5. Time Zone": "US/Eastern"
    },
    "Time Series (Daily)": {
        "2018-08-17": {
            "1. open": "107.3600",
            "2. high": "107.9000",
            "3. low": "106.6900",
            "4. close": "107.5800",
            "5. volume": "18050570"
        },
        "2018-08-16": {
            "1. open": "108.3000",
            "2. high": "108.8600",
            "3. low": "107.3000",
            "4. close": "107.6400",
            "5. volume": "21384289"
        },...

The series of market data is not a collection (a JSON array), but an object with one field per date, which doesn’t fit well with a Type Provider.

Fortunately the FSharp.Data framework offers a JSON parser as well (which the JSON Type Provider is built upon BTW). Central to this parser is the type JsonValue, which contains most of the plumbing we need for our purpose.

  • JsonValue.AsyncLoad takes a web request URL and asynchronously downloads and parses the requested data
  • JsonValue.Record is the Discriminated Union case that encapsulates a JSON object and exposes an array of 2-tuples (one for each of the record’s fields), with the first element being the field name and the second being the field value, the latter recursively enclosed in another JsonValue instance

The above JSON sample, once parsed, would become:

JsonValue.Record [|"Meta Data", V1; "Time Series (Daily)", V2|]

where V1 would be a JsonValue instance as in:

JsonValue.Record [|"1. Information", V3; "2. Symbol", V4; ...|]

V3 and V4, in turn, would be (again) JsonValue instances embedding “Daily Prices (open, high, low, close) and Volumes” and “MSFT”. The … stands for the remaining fields.

Analogously, V2 would be:

JsonValue.Record [|"2018-08-17", M1; "2018-08-16", M2; ...|]

where Mx:

JsonValue.Record [|"1. open", O; "2. high", H; ...|]

O, H and so on would (finally) contain the actual price value as JsonValue instances.
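
To make the nesting concrete, here is a minimal sketch, meant to be run as an F# script, that parses a cut-down version of the sample above and walks the nested records. The names sample and describe are illustrative only and are not part of the backtester; the script assumes the FSharp.Data NuGet package is available.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// A cut-down version of the Alpha Vantage response shown above.
let sample = """
{
  "Meta Data": { "2. Symbol": "MSFT", "3. Last Refreshed": "2018-08-17" },
  "Time Series (Daily)": {
    "2018-08-17": { "1. open": "107.3600", "4. close": "107.5800" },
    "2018-08-16": { "1. open": "108.3000", "4. close": "107.6400" }
  }
}"""

// Walk the two top-level records and list each date with its field names.
let describe (jv: JsonValue) =
    match jv with
    | JsonValue.Record [| _, JsonValue.Record _meta; _, JsonValue.Record series |] ->
        [ for (date, bar) in series do
            match bar with
            | JsonValue.Record fields ->
                yield date, (fields |> Array.map fst |> List.ofArray)
            | _ -> () ]
    | _ -> failwith "Unrecognized JSON format"

let bars = describe (JsonValue.Parse(sample))
```

Each element of bars pairs a date string with the names of the fields found under it, which is the same (name, value) shape the parsing functions later in this post rely on.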

We are not interested in all data here. The domain is defined as:

type PriceBarPart = | Open | High | Low | Close
type OHLCV = { O:float; H:float; L:float; C:float; V:int }
type InstrumentMarketDataElement = DateTime * OHLCV
type InstrumentMarketData = InstrumentMarketDataElement seq
type IndicatorMarketDataElement = DateTime * float
type IndicatorMarketData = IndicatorMarketDataElement seq
type IndicatorDefinitions =
    | SMA of period:int * price:PriceBarPart
..
with
    member this.Id =
        match this with
        | SMA (period, price) -> sprintf "SMA%d%A" period price

type Instrument = {
    Symbol : string
    MarketData : InstrumentMarketData
}

type Indicator = {
    Instrument : string
    Definition : IndicatorDefinitions
    MarketData : IndicatorMarketData
}
with
    member this.Id = sprintf "%s(%s)" this.Definition.Id this.Instrument

type MarketData =
    | Instrument of Instrument
    | Indicator of Indicator

InstrumentMarketDataElement is the type we want to fill for instruments, IndicatorMarketDataElement the one for indicators.

For those not familiar with trading market data, a full explanation of market feed is out of scope for this post, but I give some basic info in the following block.

Market makers and brokerage firms usually maintain historical data for security prices. Financial markets work continuously, and prices change many times, even several times per second. These price-change events are commonly called ticks. The full history at tick granularity is a huge amount of data, and usually only a few summary values are used to make trading decisions. A market data stream is therefore commonly associated with a timeframe, that is, a time span (e.g. 1 hour or 1 day). One of the most used timeframes is the daily timeframe. Given a timeframe, time is divided into periods, and for every period five numbers are given: the Open, High, Low and Close prices plus the traded Volume during that period. So, for example, the daily market history for a hypothetical instrument ACME (in practice a real symbol such as MSFT) will consist of a dataset similar to

Time      Open  High  Low   Close  Volume
1-1-2000  3.22  3.56  2.87  2.98   3342
2-1-2000  2.99  3.11  2.65  2.68   2665
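
As a rough illustration of how ticks collapse into one bar per period, the sketch below groups a handful of made-up ticks by calendar day. The Tick and Bar types and the toDailyBars function are assumptions introduced for this example only; Alpha Vantage already delivers aggregated bars, so the backtester never needs this step.

```fsharp
open System

// Hypothetical tick: a timestamp, a traded price and a traded size.
type Tick = { Time: DateTime; Price: float; Size: int }

// Same shape as the OHLCV record in the domain model.
type Bar = { O: float; H: float; L: float; C: float; V: int }

// Group ticks by calendar day and reduce each group to a single bar.
let toDailyBars (ticks: Tick seq) =
    ticks
    |> Seq.sortBy (fun t -> t.Time)
    |> Seq.groupBy (fun t -> t.Time.Date)
    |> Seq.map (fun (day, dayTicks) ->
        let prices = dayTicks |> Seq.map (fun t -> t.Price) |> Seq.toList
        let bar =
            { O = List.head prices   // first price of the day
              H = List.max prices    // highest price of the day
              L = List.min prices    // lowest price of the day
              C = List.last prices   // last price of the day
              V = dayTicks |> Seq.sumBy (fun t -> t.Size) }
        (day, bar))
    |> Seq.toList

// Four made-up ticks on the same day.
let ticks =
    [ { Time = DateTime(2000, 1, 1, 10, 0, 0); Price = 3.22; Size = 1000 }
      { Time = DateTime(2000, 1, 1, 11, 0, 0); Price = 3.56; Size = 500 }
      { Time = DateTime(2000, 1, 1, 12, 0, 0); Price = 2.87; Size = 800 }
      { Time = DateTime(2000, 1, 1, 15, 0, 0); Price = 2.98; Size = 1042 } ]

let dailyBars = toDailyBars ticks
```

With these four ticks the result is a single bar whose values match the first row of the table above.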

Indicators, instead, are not primary market data (prices), but single per-period values calculated from the price information using specific algorithms. The most common indicators are Moving Averages, which exist in several flavors (e.g. Simple, Exponential,…). For example, a “20-period ACME Simple Moving Average on Close” would have, at time t, a value calculated as the average of the Close prices of the ACME instrument taken at times t, t-1, t-2… t-19. The dataset would look like

Time      SMA20Close
1-1-2000  2.44
2-1-2000  2.57
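
The definition above can be sketched in a few lines of F#. This is only an illustration of the calculation, using a 3-period average on made-up closes; in this post the indicator values are downloaded pre-calculated from Alpha Vantage rather than computed locally, and the sma function is a hypothetical helper.

```fsharp
open System

// Simple moving average on (time, close) pairs. Each output value is stamped
// with the time of the most recent bar in its window, so the first value
// appears only once `period` bars are available.
let sma (period: int) (closes: (DateTime * float) list) =
    closes
    |> List.windowed period
    |> List.map (fun window ->
        (fst (List.last window), (window |> List.averageBy snd)))

// Made-up closing prices for four consecutive days.
let closes =
    [ (DateTime(2000, 1, 1), 2.0)
      (DateTime(2000, 1, 2), 4.0)
      (DateTime(2000, 1, 3), 6.0)
      (DateTime(2000, 1, 4), 8.0) ]

// Two values: 4.0 stamped 3 January and 6.0 stamped 4 January.
let sma3 = sma 3 closes
```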

Putting this together, we first write a function that takes the full downloaded JSON (as JsonValue) and extracts the “Last Refreshed” field from the “Meta Data” record and the complete “Time Series”

// Returns the server time ("Last Refreshed") and the raw series fields.
// The four patterns cover the "Meta Data" layouts of the different calls:
// 5, 6 or 7 fields with the server time third, or 5 fields with it fourth.
let extract jv =
    match jv with
    | JsonValue.Record [|_metaTitle, JsonValue.Record [|_; _; _, serverTime; _; _|]; _seriesTitle, series|]
    | JsonValue.Record [|_metaTitle, JsonValue.Record [|_; _; _, serverTime; _; _; _|]; _seriesTitle, series|]
    | JsonValue.Record [|_metaTitle, JsonValue.Record [|_; _; _, serverTime; _; _; _; _|]; _seriesTitle, series|]
    | JsonValue.Record [|_metaTitle, JsonValue.Record [|_; _; _; _, serverTime; _|]; _seriesTitle, series|] when fst (DateTime.TryParse(serverTime.AsString())) ->
        serverTime.AsDateTime(),
        match series with
        | JsonValue.Record quotes -> quotes
        | _ -> failwith "Unrecognized JSON format"
    | _ -> failwith "Unrecognized JSON format"

Pattern matching keeps the function short and self-explanatory. The match expression has multiple entries because the “Meta Data” JSON object has slightly different formats depending on which Alpha Vantage call is made: the number of fields differs, and so does the position of the server time field.
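
Because the positional patterns depend on the exact field order, one possible alternative is to look the server time up by field name instead. The sketch below is not the code used in this series: extractByName and tryDate are hypothetical helpers, the sample JSON is made up, and the script assumes the "Meta Data" field name always ends with “Last Refreshed”.

```fsharp
#r "nuget: FSharp.Data"
open System
open FSharp.Data

// Parse a date string, returning None when it is not a valid date.
let tryDate (s: string) =
    match DateTime.TryParse(s) with
    | true, dt -> Some dt
    | _ -> None

// Find the "Meta Data" field whose name ends with "Last Refreshed",
// wherever it sits, and return it together with the raw series fields.
let extractByName (jv: JsonValue) =
    match jv with
    | JsonValue.Record [| _, JsonValue.Record meta; _, JsonValue.Record quotes |] ->
        let serverTime =
            meta
            |> Array.tryPick (fun (name, value) ->
                match value with
                | JsonValue.String s when name.EndsWith("Last Refreshed") -> tryDate s
                | _ -> None)
        match serverTime with
        | Some dt -> (dt, quotes)
        | None -> failwith "Unrecognized JSON format"
    | _ -> failwith "Unrecognized JSON format"

// Made-up response with the server time in second position.
let sampleJson =
    JsonValue.Parse("""{ "Meta Data": { "1. Symbol": "MSFT", "2. Last Refreshed": "2018-08-17" },
                         "Series": { "2018-08-17": { "SMA": "107.1" } } }""")

let serverTime, quotes = extractByName sampleJson
```

The trade-off is a slightly longer function in exchange for not having to add a new pattern whenever a call returns a different number of metadata fields.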

This function returns a 2-tuple with the server time and the market series as an array of (name, value) fields (called quotes in the function). This series, however, can belong to an instrument or an indicator, which have different shapes, so we use two different functions to extract the values from them:

let json2Instrument (dt, jvq) : InstrumentMarketDataElement =
    match jvq with
    | JsonValue.Record [| _,sO; _,sH; _,sL; _,sC; _, sV|] -> (DateTime.Parse(dt), {O = sO.AsFloat(); H = sH.AsFloat(); L = sL.AsFloat(); C = sC.AsFloat(); V = sV.AsInteger()})
    | _ -> failwith " Unrecognized JSON format"

let json2Indicator (dt, jvq) : IndicatorMarketDataElement =
    match jvq with
    | JsonValue.Record [|_,sV|] -> (DateTime.Parse(dt), sV.AsFloat())
    | _ -> failwith " Unrecognized JSON format"

The main function using them is

let exec parser call = async {
    let url = buildUrl call
    let! jv = JsonValue.AsyncLoad(url)
    let dt, series = extract jv
    return series |> Seq.map parser
}

After downloading the JSON, exec extracts the series and transforms it into domain entities. To do its job, exec needs to know two things: the parser to be used with the series (json2Instrument or json2Indicator), and the request (call) to forward to Alpha Vantage to retrieve the JSON. We now have all the components to implement our API:

let getStockAsync symbol timeframe = async {
    let! marketData =
        match timeframe with
        | M1 | M5 | M15 | M30 | M60 -> AVF_TIME_SERIES_INTRADAY (symbol, timeframe)
        | _ -> AVF_TIME_SERIES_DAILY symbol
        |> exec json2Instrument

    return {Symbol = symbol; MarketData = marketData} |> Instrument
}

let getStock symbol timeframe = getStockAsync symbol timeframe |> Async.RunSynchronously

let getIndiAsync symbol timeframe indicator = async {
    let! marketData =
        match indicator with
        | SMA (period, price) -> AVF_SMA (symbol, timeframe, period, price)
        |> exec json2Indicator

    return { Instrument = symbol; Definition = indicator; MarketData = marketData } |> Indicator
}

let getIndi symbol timeframe indicator = getIndiAsync symbol timeframe indicator |> Async.RunSynchronously

Further definitions and the buildUrl function can be found in the source code on GitHub.
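
For readers who do not want to open the repository, here is a guess at what a buildUrl function could look like. It is a sketch under assumptions, not the author's implementation: the Timeframe and Call types are reconstructed from the way they are used in this post, the query parameter names are those I recall from the Alpha Vantage documentation and should be checked against it, and "demo" is a placeholder for a real API key.

```fsharp
// Assumed shapes, inferred from how the values are used in this post.
type PriceBarPart = Open | High | Low | Close
type Timeframe = M1 | M5 | M15 | M30 | M60 | DAY

type Call =
    | AVF_TIME_SERIES_INTRADAY of string * Timeframe
    | AVF_TIME_SERIES_DAILY of string
    | AVF_SMA of string * Timeframe * int * PriceBarPart

// Placeholder key; replace with your own.
let apiKey = "demo"

// Map a timeframe to the interval string expected in the query.
let interval tf =
    match tf with
    | M1 -> "1min"
    | M5 -> "5min"
    | M15 -> "15min"
    | M30 -> "30min"
    | M60 -> "60min"
    | DAY -> "daily"

// Build the request URL for a given call.
let buildUrl call =
    let query =
        match call with
        | AVF_TIME_SERIES_INTRADAY (symbol, tf) ->
            sprintf "function=TIME_SERIES_INTRADAY&symbol=%s&interval=%s" symbol (interval tf)
        | AVF_TIME_SERIES_DAILY symbol ->
            sprintf "function=TIME_SERIES_DAILY&symbol=%s" symbol
        | AVF_SMA (symbol, tf, period, price) ->
            sprintf "function=SMA&symbol=%s&interval=%s&time_period=%d&series_type=%s"
                symbol (interval tf) period ((sprintf "%A" price).ToLower())
    sprintf "https://www.alphavantage.co/query?%s&apikey=%s" query apiKey

let smaUrl = buildUrl (AVF_SMA ("MSFT", DAY, 10, Close))
```

Modelling each request as a union case keeps the URL details in one place, so exec only needs to be handed a call and a parser.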

With this API, getting market data is as easy as

let msft = getStock "MSFT" DAY
let geti = getIndi "MSFT" DAY
let sma10 = SMA (10, Close) |> geti
let sma20 = SMA (20, Close) |> geti
let sma50 = SMA (50, Close) |> geti

Getting 20 years of daily history for MSFT (Microsoft) takes a few seconds, as this F# Interactive screenshot shows:

[Screenshot: F# Interactive session (F#Inter_QuanTif1)]
