Today I attended a meetup where I heard two fascinating talks.
PyData Vancouver Meetup:
Algorithmic Trading, Machine Learning - Cool things with PyData
Location: Mobify
Algorithmic Trading with Quantopian
- Speaker: François Lucas, founder of the SPY Surfer
- tl;dr: Making data-driven investing decisions
His slides can be found here.
From what I understood, the first speaker uses data mining to obtain Yahoo S&P500 financial data to make monthly investment decisions based on the following algorithm:
Algo:
- At the End of the Month:
- If price > 10-month SMA, go long
- Else Bonds
- Do nothing in between
- “SMA” stands for simple moving average
- “10-month SMA” refers to the average price over the past 10 months
- “long” refers to buying an asset, e.g. a stock, with the expectation that it will rise in value
- “Else Bonds” means buying bonds if the price is not above the 10-month SMA
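To make the month-end rule concrete, here is a minimal Python sketch of how I read it. It is not the speaker's code: the function name `monthly_signal`, the choice to include the current month in the 10-month window, and the price lists are all my own assumptions, and the prices are made up for illustration.

```python
from statistics import mean


def monthly_signal(month_end_prices, window=10):
    """Return "long" or "bonds" for the latest month-end price.

    Assumption: the SMA window includes the current month-end price.
    Returns None when there is less than `window` months of history.
    """
    if len(month_end_prices) < window:
        return None
    sma = mean(month_end_prices[-window:])  # simple moving average
    # Go long only if the latest price is strictly above the SMA.
    return "long" if month_end_prices[-1] > sma else "bonds"


# Made-up month-end prices, oldest first (not real S&P 500 data).
rising = [100, 102, 104, 106, 108, 110, 112, 114, 116, 118]
falling = list(reversed(rising))

print(monthly_signal(rising))   # latest price is above the 10-month average
print(monthly_signal(falling))  # latest price is below the 10-month average
```

The function would be called once at the end of each month, and the position is left alone in between, which matches the “do nothing in between” step.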
He compares his algorithm to Warren Buffett’s exchange-traded fund (ETF) and shows that, on a logarithmic scale (which accounts for value differences due to compounding), the two trading strategies perform comparably.
Learning From Implicit Data
- Speaker: Ben Frederickson, Vancouver Data Products team lead at Flipboard
- tl;dr: Finding similar music using Matrix Factorization
Ben’s talk is essentially described by the following posts:
- #1. Distance Metrics for Fun and Profit
- #2. Faster Implicit Matrix Factorization
- #3. Finding Similar Music with Matrix Factorization
From what I understood, he describes the use of a BM25 distance metric (in comparison with cosine similarity and TF-IDF) in post #1. He goes into matrix factorization (dividing a big matrix into smaller, condensed, and presumably generalized matrices) in post #2. He finds that “The Arcade Fire” ranks highly among the artists similar to “Arcade Fire”, which is amazing because users often mention one without the other. Finally, he touches on a faster implementation than what was demoed in post #3.
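As a rough illustration of the simplest metric mentioned above, here is a small Python sketch of cosine similarity between artists, computed from per-user play counts. This is not Ben's code or the API of his library: the `plays` data, the user ids, and the `cosine` helper are hypothetical and only meant to show the idea.

```python
from math import sqrt

# Hypothetical implicit data: artist -> {user: play count}. Made-up numbers.
plays = {
    "Arcade Fire": {"u1": 12, "u2": 5, "u3": 8},
    "The Arcade Fire": {"u4": 7, "u5": 3},
    "Broken Social Scene": {"u1": 4, "u3": 6, "u4": 2},
}


def cosine(a, b):
    """Cosine similarity between two sparse play-count vectors."""
    shared_users = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared_users)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two artists with listeners in common get a positive score.
print(cosine(plays["Arcade Fire"], plays["Broken Social Scene"]))
# Two artists with no listeners in common get a score of zero.
print(cosine(plays["Arcade Fire"], plays["The Arcade Fire"]))
```

In this toy data no user has played both “Arcade Fire” and “The Arcade Fire”, so a metric based purely on shared listeners scores the pair at zero. That may be part of why it is notable that the matrix factorization approach ranks them as similar.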
At the end of the talk, Ben points to the following resources:
- https://github.com/benfred/implicit (location of code & presentation)
- http://www.benfrederickson.com/blog/