Change is one of the few constants in the world of quantitative investing. Successful quant strategies and models must continually evolve as market efficiency improves and some market anomalies are arbitraged away. Recent years have witnessed significant growth in data volume (big data), a proliferation of unconventional data types (alternative data), and the application of innovative techniques, such as machine learning and natural language processing, to process and distill this new data. The increasing availability of much cheaper and faster computational power via cloud-based resources is accelerating these changes. And yet despite all the advances, the ground rules of quantitative investing remain the same.
Contrary to popular opinion, the explosion of data has made the role of human judgment more important than ever. Most "big data" sources are largely unstructured, meaning that it takes much more effort and attention to format and "clean" the data for analysis (by removing sources of bias and outliers). New data sources and techniques also create additional data mining pitfalls. Advanced computational techniques such as machine learning can uncover patterns in historical data, particularly non-linear effects and complex interaction effects. But they are less effective at discovering the rationale for those patterns, often identifying spurious and counter-intuitive relationships. For a market anomaly to be exploited consistently in the future, there must be a dependable reason why it exists. Reliable market anomalies usually have a risk-based, behavioral-based, or structural-based explanation. And in order to ensure the robustness of factors, human judgment should govern the signal construction process. Though their track records are short, many of the dedicated artificial intelligence (AI) funds that lack this human insight have underperformed in the last few years.1
That said, there are areas of investment research that may favor these techniques, especially when considering high-volume data sources where the signal-to-noise ratio is particularly low (such as web-scraping and textual analysis). At Causeway, we believe we have the tools, data, and expertise to harness the potential alpha of these sources. We are continuously analyzing new data and new techniques for use across our strategies. Over the last several years, we have augmented the Causeway quantitative team with talented colleagues trained in these techniques. And our fundamental colleagues provide us a unique advantage in helping validate the effectiveness of new data sources. Nevertheless, wary of the pitfalls, we maintain high standards for adding new signals to our quantitative stock selection models and for utilizing alternative data to inform fundamental investment decisions. Clients should expect to see steady and incremental enhancements to our investment processes rather than any abrupt shifts. In the discussion that follows, we highlight four recent research projects that have demonstrated promise, two of which concern big data (and the techniques to process them) and two of which deal with alternative data used for "nowcasting."
- Predict Election Results using Twitter
One area of our research into big data sources focuses on analyzing sentiment from social media data, specifically Twitter. It goes without saying that there is quite a lot of "noise" in tweets, but when aggregated they may also represent a timely "poll" of popular opinion. Elections, referendums, and other cases where popular opinion directly determines an outcome are therefore promising applications of Twitter analysis. Geopolitical risk is usually difficult to navigate in situations when electoral results can have significant sway over a country's public markets. With traditional third-party polls, pollsters attempt to mitigate a myriad of potential biases before collecting opinions, but they frequently predict incorrectly (e.g., Brexit, 2016 US election). And by the time pollsters have collected a statistically significant number of opinions, their polling results may be stale. With Twitter feed analysis, we can collect tweets in real-time and then seek to weed out any potential biases after collecting the data.
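The "collect first, correct for bias afterwards" idea can be illustrated with a minimal Python sketch. It is not Causeway's method. The word lists, the field names (`user`, `region`, `text`), and the `region_weights` input are all hypothetical, and a real pipeline would use a trained sentiment model rather than keyword counts. The sketch scores each tweet, averages scores per user so that prolific accounts do not dominate, and then reweights regional averages by an assumed population share.

```python
from collections import defaultdict

# Hypothetical toy lexicon (assumption); stands in for a real sentiment model.
POSITIVE = {"support", "win", "great", "vote"}
NEGATIVE = {"against", "lose", "corrupt", "fail"}


def score_tweet(text):
    """Return +1, -1, or 0 based on a naive count of lexicon words."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos > neg) - (pos < neg)


def poll_from_tweets(tweets, region_weights):
    """Aggregate tweets into one net-sentiment figure between -1 and +1.

    tweets: list of dicts with hypothetical keys 'user', 'region', 'text'.
    region_weights: assumed population share per region, used to correct
    for regions that are over- or under-represented in the sample.
    """
    # Step 1: average per user, so one voice counts once per person.
    scores_by_user = defaultdict(list)
    region_of_user = {}
    for tweet in tweets:
        scores_by_user[tweet["user"]].append(score_tweet(tweet["text"]))
        region_of_user[tweet["user"]] = tweet["region"]

    # Step 2: collect the per-user averages by region.
    users_by_region = defaultdict(list)
    for user, scores in scores_by_user.items():
        users_by_region[region_of_user[user]].append(sum(scores) / len(scores))

    # Step 3: weight each region's average by its assumed population share.
    total_weight = sum(region_weights[r] for r in users_by_region)
    weighted = sum(
        region_weights[r] * sum(vals) / len(vals)
        for r, vals in users_by_region.items()
    )
    return weighted / total_weight


sample_tweets = [
    {"user": "a", "region": "north", "text": "great win"},
    {"user": "a", "region": "north", "text": "vote support"},
    {"user": "b", "region": "north", "text": "corrupt fail"},
    {"user": "c", "region": "south", "text": "will lose"},
]
net_sentiment = poll_from_tweets(sample_tweets, {"north": 0.5, "south": 0.5})
```

In this toy sample, user "a" tweets twice but counts once, which is the kind of post-collection adjustment the paragraph above describes.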
India's 2019 parliamentary election was Causeway's first foray into predicting election results using Twitter data. At year-end 2019, India represented the fourth largest market in the MSCI Emerging Markets Index (EM Index). The Lok Sabha (lower house of parliament) has 543 elected members, so 272 seats are required for a majority. Since Prime Minister Narendra Modi's election in May 2014, his Bharatiya Janata Party (BJP) had held a slim majority in parliament. The broader BJP-led coalition, the National Democratic Alliance (NDA), had used this majority to enact a variety of market-friendly legislation focused on infrastructure spending, tax reform, and rural reflation. The market rewarded these efforts: The Indian equity market outperformed the EM Index by over 27% from May 2014 to December 2018.
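The majority figure quoted above follows from simple arithmetic: a majority is the smallest whole number of seats greater than half the chamber. A short Python sketch makes the calculation explicit (the function and constant names are illustrative only):

```python
def majority_threshold(total_seats: int) -> int:
    """Smallest number of seats strictly greater than half of the chamber."""
    return total_seats // 2 + 1


# Elected Lok Sabha seats, as stated in the text.
LOK_SABHA_ELECTED_SEATS = 543
seats_needed = majority_threshold(LOK_SABHA_ELECTED_SEATS)
```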