Automated trading and investing using RStudio on AWS

Investing rameritrade etrader Packages

Set up a trading server in minutes. R offers multiple packages that connect to trading APIs. Trading scripts can be deployed and fully automated, for 12 months free and $3 a month thereafter.

Exploring Finance https://exploringfinance.github.io/
11-14-2020

3 Key Takeaways

  1. AWS allows RStudio setup in minutes at little to no cost
  2. Choose the trading API that best aligns with your needs
  3. Once the AWS free tier expires, reduce costs using Lambda, GitHub, and a custom server

Setting up a server on AWS

Amazon Web Services (AWS) offers robust step-by-step documentation that makes it easy for anyone to leverage the services available. I will link to what is available to avoid duplicating documentation.

To start using AWS, you first need to create an account. There is a vast array of free services that offer everything needed to get started with RStudio for at least one year. After creating an account, the first thing to do is launch an instance.

Under the EC2 (Elastic Compute Cloud) panel, select ‘Launch Instance’. In the search bar, enter ‘Rstudio’ and select ‘Community AMIs’. An AMI (Amazon Machine Image) is a pre-configured server image that allows EC2 instances to be launched quickly. I personally find Ubuntu the easiest OS to work with. In the picture below, you can see the option I prefer, which comes with R 4.0 and RStudio 1.3.

Figure 1: Ubuntu 18, with R version 4

After clicking ‘Select’, be sure to choose ‘t2.micro’ to stay within the free tier. This will provide 12 months of free computing without having to stop and start the instance to control cost. Click ‘Review and Launch’. In this next section, you will need to modify security groups.

In security groups, you can set up a new security group and whitelist only certain IP addresses. This will provide an extra layer of protection beyond passwords. When you create a security group, you will need at least two rules for the server to work. Select ‘ssh’ and ‘Custom TCP Rule’. Port 22 will auto-populate for SSH; then enter port 80 for the Custom TCP rule to access RStudio. Select ‘My IP’ to auto-populate your current IP address. You can also select ‘Anywhere’ to access the server from any IP address if security is not a big concern.

Note: If you select ‘My IP’, you will only be able to access this server from your current IP address. If your IP address changes regularly, take this into account when setting up your security group.

Figure 2: Setting up Security Groups
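
For readers who prefer to script this step, the two rules described above can be expressed as data. The following is a minimal Python sketch, not part of the original setup: the helper name `build_ingress_rules` and the placeholder address `203.0.113.10` are assumptions for illustration. It only builds the rule list and does not call AWS.

```python
import ipaddress

def build_ingress_rules(my_ip, rstudio_port=80):
    """Return SSH and RStudio ingress rules restricted to one IPv4 address."""
    # Validate the address and express it as a /32 CIDR block (a single host).
    cidr = f"{ipaddress.IPv4Address(my_ip)}/32"
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": label}],
        }
        for port, label in ((22, "SSH"), (rstudio_port, "RStudio over HTTP"))
    ]

# 203.0.113.10 is a documentation-range placeholder, not a real address.
rules = build_ingress_rules("203.0.113.10")
for rule in rules:
    print(rule["FromPort"], rule["IpRanges"][0]["CidrIp"])
```

The list has the shape that the `IpPermissions` argument of boto3's `authorize_security_group_ingress` expects, so it could be passed there together with the `GroupId` of your own security group. Selecting ‘Anywhere’ in the console corresponds to the CIDR block `0.0.0.0/0` instead of a `/32`.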

The other option to change is storage. Unfortunately, this AMI requires 30GB, which is probably far more than needed for a basic RStudio setup. The good news is that the 30GB is free for a year. Below I will discuss tactics to reduce ongoing costs after the 12 free months.

After clicking ‘Launch’, you will need to create a key pair. This will not be necessary to use RStudio, but you still need to create the key pair and download it. This allows you to ssh to the server if needed. RStudio offers a terminal window, so again, this will most likely not be needed. After downloading, click ‘Launch Instance’. Congratulations! You now have a personal AWS server. Time to set it up for trading.

Accessing RStudio

Once the instance has launched, you can click into the instance details by clicking the ‘Instance Id’. This will display the ‘Public IPv4 DNS’ which is the address of your server. Because you have port 80 open in the security groups, you can copy this directly into the browser. Based on the picture below, I am pasting ‘http://ec2-52-91-85-61.compute-1.amazonaws.com/’ directly into my browser window.

Figure 3: Instance Details

This will bring up an RStudio Server login screen. The username is ‘rstudio’ and the password is the full Instance ID, in this case ‘i-003518d6097341ed8’. Once logged in, I would recommend changing your password by following the instructions in the Welcome.R script. I also prefer a dark background instead of a white one, so under ‘Tools > Global Options > Appearance’ I select a new background.

Selecting a Trade API

There are currently multiple R packages that provide simplified access to different APIs for trading. Obviously, if you already have an account with one of the brokerage firms below, I would start there. However, if you are deciding who to go with, below is a brief description of each. All 4 firms offer free trading, so that is not listed as a ‘pro’. Interactive Brokers and its R package are not listed because you need the IB Trader Workstation open and running to use the API, which eliminates the ability to automate the trading on a remote server.

  1. RobinHood was a pioneer in offering free trading, but once the other major firms cut commissions to zero, the biggest benefit was lost. The RobinHood R package provides easy access to the API. Customer base: 13 million client accounts with estimated $20 billion in AUM.
    • Pros: Very easy to use mobile app. API login is simple, only requiring username and password. Free Stock on sign-up. Fractional share trading. Cryptocurrencies available.
    • Cons: RobinHood is not a full service broker. You can only have one account and IRAs are not offered. No Mutual Funds. Account Transfers are not accepted so the account must be funded with cash.
  2. TD Ameritrade has recently been acquired by Charles Schwab but you can still set up accounts and trade. rameritrade provides access to the API, but the initial setup takes a few steps as detailed in a previous article. Customer base: 11 million client accounts with over $1 trillion in AUM.
    • Pros: Full service brokerage firm. OAuth 2.0 tokens allow you to maintain indefinite access after the initial authentication is completed.
    • Cons: The initial authentication is a bit complicated. Mutual Funds are not available through the API. The future status of the API is unknown due to the Schwab acquisition.
  3. Alpaca is a fintech firm that is built for the trading API. They offer tons of technical integration options and algo trading. Full disclosure: I have never used AlpacaforR, but the documentation seems very straightforward. Customer base: unknown.
    • Pros: Alpaca is built around the API rather than having an API built on top of a brokerage platform. Truly designed as a technical trading platform for algo traders and quant funds.
    • Cons: Not a full service brokerage firm, so no IRAs. No user friendly interface exists because of the focus on the API.
  4. ETrade is another of the major discount brokerage firms. Although it was recently acquired by Morgan Stanley, multiple sources confirm the company will stay mostly intact, including the API. etrader provides access to the API through R. Customer base: 5.2 million client accounts with $346 billion in AUM.
    • Pros: Full service brokerage firm. Allows Mutual Fund Trading. Might be offering fractional share trading in the near future (unconfirmed).
    • Cons: The authentication process has multiple steps and is more complicated to fully automate compared to the others. The R Package does not include Futures or MMF (though ETRADE does offer these).

I will be using the rameritrade package in the example below. As the author of the package, I am a bit biased. While I think ETRADE offers a better option over TD Ameritrade, the setup for automation is more complicated with etrader as shown in this article.

Creating a trading script

This example will assume you have successfully logged in and obtained a Refresh Token as detailed in Trade on TD Ameritrade with R. Below will be a simple script that logs into TD Ameritrade, checks the account balance, and then purchases one share of SCHB. Please note, I am not recommending SCHB, I am using it in this example because it is an inexpensive way to buy the entire stock market.

# Load rameritrade, plus lubridate for the hour(), with_tz(), and as_datetime() calls below
library(rameritrade)
library(lubridate)

# Pull in keys and Refresh Token from saved location
tdKeys <- readRDS('/home/rstudio/Trading/tdkeys.rds')
ref_tok <- readRDS('/home/rstudio/Trading/TDRefTok.rds')

# Generate an Access Token
acc_tok <- rameritrade::td_auth_accessToken(tdKeys$consumerKey,ref_tok)

# Confirm Market Hours and full trading day
TDMrktHours <- rameritrade::td_marketHours()
MrktOpen <- TDMrktHours$equity$EQ$isOpen
MrktEnd <- TDMrktHours$equity$EQ$sessionHours$regularMarket[[1]]$end
if (is.null(MrktEnd)) {
  MrktClose <- FALSE
} else {
  # Confirm Market is open until 4PM NYC (rules out early-close days)
  MrktClose <- hour(with_tz(as_datetime(MrktEnd), tz = "America/New_York")) == 16
}
if (is.null(MrktOpen)) { MrktOpen <- FALSE }
# Trade only when the market is open for a full session
TradingDay <- MrktOpen && MrktClose


# Get account balances and quote data
AllPositionsDF <- rameritrade::td_accountData()$balances
SCHBQuote <- rameritrade::td_priceQuote('SCHB','list')$SCHB$askPrice
# Depending on the account type, the cash balance may be in other fields. Confirm this argument.
CashBal <- AllPositionsDF$cashAvailableForTrading[1]

# If a trading day and the current cash balance is greater than the quote, execute trade
if (TradingDay & (CashBal > SCHBQuote)) {
  
  # Submit market buy order
  SCHBBuyOrder <- rameritrade::td_placeOrder(accountNumber=tdKeys$account1,
                                            instruction='Buy',
                                            quantity=1,
                                            ticker='SCHB') 
  Sys.sleep(15) # give the order time to fill before requesting details
  
  # Get and save order details
  SCHBBuyResults <- rameritrade::td_orderDetail(SCHBBuyOrder$orderId,SCHBBuyOrder$accountNumber)
  saveRDS(SCHBBuyResults,paste0('/home/rstudio/Orders/SCHBBuy_',format(Sys.time(),'%Y%m%d_%H%M%S'),'.rds'))
  
}

I will save this file to ‘TradeExample.R’ within the home folder. This script is a very simple order entry and flow to demonstrate making a purchase of one share. When dollar cost averaging into a position, it is good to buy small amounts frequently. If you want to get fancier, you can use the gmailr package to send an email confirmation of the transaction.

Automating the transaction with cron

For anyone not familiar with Linux, cron jobs may be foreign. Cron instructs the server to execute a script at a specified time. For this example, we will invest $150-$200 a week; at the current share price, that means buying one share twice a week. To spread out the trades, let’s run the script on Monday and Thursday at 12 noon Eastern time so that it executes during trading hours. To do this, go to the terminal window and enter ‘crontab -e’. This will open your crontab in a text editor (the keystrokes here assume vi). Press the letter ‘i’ to insert new lines. Each cron entry starts with 5 fields (minute, hour, day of month, month, and day of week) that specify exact times down to the minute, followed by the command to execute as shown below. I entered 00 12 * * 1,4 for 12 noon Monday and Thursday followed by a call to R and the trading script. When finished, press ESC and then type ‘:wq’ and press Enter to save.

Figure 4: Cron Entry
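
To make the five schedule fields concrete, here is a minimal Python sketch that checks whether a given time falls on the schedule. It is an illustration only: the `Rscript` call and script path in `CRON_LINE` are assumptions about what the entry in the figure looks like, and the matcher handles only `*`, single numbers, and comma lists, not the full cron syntax.

```python
from datetime import datetime

# Assumed crontab line; the Rscript call and script path are illustrative.
CRON_LINE = "00 12 * * 1,4 Rscript /home/rstudio/TradeExample.R"

def field_matches(field, value):
    """Match one cron field; supports only '*', single numbers and comma lists."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}

def cron_matches(schedule, when):
    """Return True if `when` falls on the five-field cron schedule."""
    minute, hour, day_of_month, month, day_of_week = schedule.split()
    # cron counts Sunday as 0 and Monday as 1; isoweekday() gives Sunday as 7
    weekday = when.isoweekday() % 7
    return (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(day_of_month, when.day)
            and field_matches(month, when.month)
            and field_matches(day_of_week, weekday))

# The first five whitespace-separated tokens are the schedule; the rest is the command.
schedule = " ".join(CRON_LINE.split()[:5])
print(cron_matches(schedule, datetime(2020, 11, 16, 12, 0)))  # a Monday at noon
print(cron_matches(schedule, datetime(2020, 11, 17, 12, 0)))  # a Tuesday at noon
```

The sketch compares against the server's local clock, just as cron does, which is why the server's timezone setting matters for this schedule.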

The next important step will be to change the timezone of your server to Eastern. AWS servers start in the UTC timezone, so you must use sudo privileges to change the timezone to New York. Run the command ‘sudo timedatectl set-timezone America/New_York’ to make the change. This is shown below. You may need to enter the password, which would be the Instance ID if it has not been changed. If cron keeps firing on UTC after the change, restarting the cron service (‘sudo service cron restart’) or rebooting the server should make it pick up the new timezone.

Figure 5: Changing Time Zone
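
The reason for not simply leaving the server on UTC and adjusting the cron hour is daylight saving time: noon in New York maps to a different UTC hour depending on the time of year. A small Python sketch (the function name is an assumption for illustration) shows the shift:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

EASTERN = ZoneInfo("America/New_York")

def noon_eastern_in_utc(year, month, day):
    """Return the UTC hour that corresponds to 12 noon New York time."""
    noon = datetime(year, month, day, 12, 0, tzinfo=EASTERN)
    return noon.astimezone(timezone.utc).hour

print(noon_eastern_in_utc(2020, 11, 16))  # standard time (EST): 17
print(noon_eastern_in_utc(2021, 7, 15))   # daylight time (EDT): 16
```

A cron entry fixed at one UTC hour would therefore drift by an hour against the market twice a year, while a server set to America/New_York follows the change automatically.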

Congratulations, you have now created an automated trading app! Your app will buy one share of SCHB every Monday and Thursday that the market is open until all available funds in the TD Ameritrade account have been used.

Controlling costs after the free tier expires

If using the AWS free tier, the remainder of this article is less important. The objective of this section is to explain how to run your server only during market hours to avoid paying for unnecessary computing costs. I also address shrinking the storage size from 30GB. Finally, I show how you can back up your work using GitHub instead of server backups.

Lambda functions

AWS Lambda is a really cool product that allows you to execute code without a server. You can set up a cron schedule and link it to a Lambda function to execute code on a specific schedule. The possibilities are endless! Unfortunately, Lambda does not natively support R; otherwise, much of the code above could simply be placed into a Lambda function and a server would not be required. For Python users, this is very possible since Lambda does support Python. Note: you can use R within Lambda, but getting packages installed is way trickier.

One use case for Lambda is stopping and starting your EC2 instance. AWS has detailed instructions available. Once the free tier expires after 12 months, the t2.micro server will cost about $8.50-$9 a month if it runs 24/7. Combined with about $3 a month for the 30GB of storage, the formerly free service jumps to roughly $12 a month. While certainly not breaking the bank, this cost can be cut by at least 75% with simple modifications. Plus, if you are using Docker/Selenium with etrader or running more complex algorithms, the t2.micro server will not have enough horsepower, which can quickly drive costs up. Lambda offers an ongoing free tier up to a certain monthly limit, which would not be reached in this example.
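
The arithmetic behind these figures can be sketched in a few lines of Python. The two unit prices below are assumptions chosen to roughly match the monthly figures quoted above; check the AWS pricing pages for your region before relying on them. The 10GB scenario stands in for a smaller custom volume.

```python
# Assumed on-demand prices; check the AWS pricing pages for your region.
T2_MICRO_PER_HOUR = 0.0116   # USD per instance-hour (assumption)
EBS_PER_GB_MONTH = 0.10      # USD per GB-month of storage (assumption)
HOURS_PER_MONTH = 730        # average hours in a month
WEEKDAYS_PER_MONTH = 21.7    # average number of Monday-Friday days in a month

def monthly_cost(hours_per_weekday, storage_gb):
    """Estimate the monthly cost of an instance that runs only on weekdays.

    Pass hours_per_weekday=None to model a server that never stops.
    Storage is billed whether or not the instance is running.
    """
    if hours_per_weekday is None:
        hours = HOURS_PER_MONTH
    else:
        hours = hours_per_weekday * WEEKDAYS_PER_MONTH
    return round(hours * T2_MICRO_PER_HOUR + storage_gb * EBS_PER_GB_MONTH, 2)

always_on = monthly_cost(None, 30)            # 24/7 with the 30GB volume
market_hours = monthly_cost(7, 30)            # 9am-4pm weekdays, 30GB volume
market_hours_small_disk = monthly_cost(7, 10) # 9am-4pm weekdays, 10GB volume
print(always_on, market_hours, market_hours_small_disk)
```

Under these assumed prices, stopping the instance outside market hours removes most of the compute charge, but storage then dominates the bill, so the overall reduction only approaches 75% once the volume is smaller as well.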

Starting and stopping the EC2 server will require using a Lambda function with CloudWatch to schedule the function. The Lambda instructions are very straightforward, but things get more complicated due to timezones. Everything on AWS uses UTC and does not account for daylight saving time, so the function needs to convert the time to Eastern so that the server starts at 9am and stops at 4pm Eastern. After following the AWS instructions and arriving at the Function Code, the Python code below can be used to account for timezones.

Don’t forget to Deploy and test your code


import boto3
import datetime
import dateutil.tz

# Copy over the instance ID from the EC2 dashboard
instances = ['i-003518d6097341ed8']

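# Ensure the region matches the region of your instance's Availability Zone
# (the example instance above is in us-east-1, per its compute-1 DNS name)
region = 'us-east-1'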

# use AWS package to manage EC2
ec2 = boto3.client('ec2', region_name=region)

# Define a lambda handler that checks the current Eastern time
# Start the instance if the hour is 9am and stop it if the hour is 4pm
def lambda_handler(event, context):
    eastern = dateutil.tz.gettz('US/Eastern')
    ET = datetime.datetime.now(tz=eastern)
    
    if ET.hour == 9:
        ec2.start_instances(InstanceIds=instances)
    if ET.hour == 16:
        ec2.stop_instances(InstanceIds=instances)
    

Once the Lambda function is set up, you need to link it to a CloudWatch Rule. AWS has instructions on scheduling an event. The CloudWatch setup is very straightforward. Use a cron expression and link it to the Lambda function. The cron expression I use is ‘05 13,14,20,21 ? * MON-FRI *’. This will trigger the Lambda function at five minutes past the hour at 13, 14, 20, and 21 UTC, Monday through Friday. The Lambda function converts the time to Eastern and acts only when the Eastern hour is 9am or 4pm, whichever of EST or EDT is in effect at that time of year.
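
To see why four UTC triggers are needed, the handler's decision can be replayed locally for a winter and a summer date. This is a minimal sketch that mirrors the hour checks in the Lambda code without calling AWS; the helper names `action_for` and `daily_actions` are assumptions for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

EASTERN = ZoneInfo("America/New_York")
TRIGGER_HOURS_UTC = (13, 14, 20, 21)  # hours from the cron expression above

def action_for(utc_time):
    """Mirror the handler's decision for one trigger time."""
    eastern_hour = utc_time.astimezone(EASTERN).hour
    if eastern_hour == 9:
        return "start"
    if eastern_hour == 16:
        return "stop"
    return None  # the trigger fires but the handler does nothing

def daily_actions(year, month, day):
    """List the action taken at each of the four UTC triggers on a date."""
    return [action_for(datetime(year, month, day, hour, 5, tzinfo=timezone.utc))
            for hour in TRIGGER_HOURS_UTC]

print(daily_actions(2020, 11, 16))  # EST: [None, 'start', None, 'stop']
print(daily_actions(2021, 7, 15))   # EDT: ['start', None, 'stop', None]
```

On any given day, two of the four triggers do nothing, which costs a couple of idle Lambda invocations in exchange for not having to edit the schedule when the clocks change.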

That’s it! Your server will now only be up and running during market hours. If using the sample script above that runs at noon, you can narrow the window further to only be up for 1 hour a day rather than 7. When working with narrow windows, it might be best to split up your Lambda start/stop functions so that the timezones don’t trigger events incorrectly.

Configuring your own server

The AWS RStudio AMI used in this example requires a 30GB EBS Volume. There is most likely a lot of extra software installed, like LaTeX, that is not needed for a simple trading server. If you are more ambitious, you can launch a blank Ubuntu server at 10-12GB and then install R and RStudio. If you plan to use etrader, you will also need to install Docker and download a Selenium Chrome Docker image. Even with all that installed, the total used hard drive space should not exceed 8GB. Perhaps I am splitting hairs by trying to save $2 a month, but the compounding on the $25 saved a year should be considered! Plus, I like starting with a blank server so I know everything that has been installed. If you run out of space, you can always increase the volume size, but you cannot decrease it.

Back up your work with GitHub

AWS provides backup capabilities and the option to create your own AMIs to launch instances. While it’s always a good idea to back up your work, the actual source code is a few MB, whereas a server backup will be equal to the size of the server (several GB). Considering how easily we set up the server above, is it really worth it to back up several GB of data that can be set up again in a few minutes? Instead, RStudio has Git integration built directly into the UI. You can set up Git and link it to GitHub using these instructions. If all your code is backed up on GitHub, which is free, then backing up your server seems unnecessary, but each case is unique.

Note: Even if your GitHub repos are private, I would NOT store keys, passwords, or other credentials on Git. This information is too sensitive and should be stored locally in a secure location.

Wrapping up

In this article, we covered:

  1. Setting up an RStudio AWS server
  2. Writing a sample trading script using rameritrade
  3. Automating the script using cron
  4. Managing the server with Lambda, reducing storage cost, and backing up the source code

While it’s always fun to trade the market, I think this more importantly allows for a great way to dollar cost average into the stock market using ETFs. In another article, I will discuss the ultimate dollar cost averaging strategy using etrader. While the final impact of this may be unnoticeable over a 30 year investment horizon, there is also something that feels good about investing every day no matter what the market is doing. If the market is down, you are buying more. If the market is up, you bought yesterday and already have gains!

Note: The AWS instance used in this example has been terminated

Disclosure: The content herein is my own opinion and should not be considered financial advice or recommendations. I am not receiving compensation for any materials produced. I have no business relationship with any companies mentioned.