The answer to this question may seem obvious if you read the breathless proclamations of AI luminaries, but good quantitative investors should be hype-immune. We want to carefully compare ChatGPT’s unsatisfactory responses to a couple of the prompts outlined in our book (responses that were mostly generated in early 2024) with its responses now (April 2025). In addition, if ChatGPT’s response is still not satisfactory, we want to compare the responses of several popular LLMs, some of which weren’t even available in early 2024 or hadn’t yet launched in production mode. Of course, testing all the LLMs (e.g. ChatGPT, Copilot, DeepSeek, Gemini, Grok, Perplexity, …) is very time-consuming, so we also use Chatbot Arena to randomly pick two for each prompt (see https://www.bloomberg.com/news/articles/2025-02-18/before-deepseek-blew-up-one-website-announced-its-arrival).
Let’s try the first and very simple task: plotting the annualized returns of an ETF. Shockingly, ChatGPT failed at this task previously, as you can read in our book. It committed the egregious error of annualizing returns by multiplying by sqrt(252), a scaling factor that applies to daily volatility, not to daily returns.
User Prompt
Write a Python program to download the adjusted closing prices of the ETF BILL, and plot its annualized daily returns chart. We only need YYYY as x-axis label.
Unfortunately, many LLMs (such as ChatGPT, Perplexity, gemma-3-4b-it by Google, and command-a-03-2025 by Cohere) misinterpreted the question and produced complicated calculations of rolling 252-day compound returns. The only chatbot that correctly and simply multiplied the daily returns by 252 and plotted the result was Microsoft’s Copilot. Bravo Mr. Nadella!
Now let’s proceed to another task that ChatGPT failed miserably at before: summarizing a trading strategy based on Simon and Campasano (2014).
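To make the distinction concrete, here is a minimal sketch of the calculation we were looking for. It is not the code produced by any of the chatbots: the function names and the price series are made up for illustration, the download and plotting steps are omitted, and we assume the usual convention of 252 trading days per year.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # common convention, assumed here


def daily_returns(prices):
    """Simple daily returns from a sequence of adjusted closing prices."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def annualize_returns(returns):
    """Annualize each daily return by multiplying by the number of trading days."""
    return [r * TRADING_DAYS_PER_YEAR for r in returns]


def annualize_volatility(daily_vol):
    """sqrt(252) scaling is appropriate for volatility, not for returns."""
    return daily_vol * math.sqrt(TRADING_DAYS_PER_YEAR)


# Hypothetical adjusted closing prices, for illustration only.
prices = [100.0, 100.01, 100.03, 100.02]
rets = daily_returns(prices)
annualized = annualize_returns(rets)
```

In a full program, the `annualized` series would then be plotted against dates, with the x-axis formatted to show only the year.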
User Prompt
Pretend you are a quantitative finance expert. Please summarize the trading strategies described in this paper. [Author Note: We attach the pdf.]
ChatGPT Response
Absolutely—here’s a summary of the trading strategies described in the paper "The VIX Futures Basis: Evidence and Trading Strategies" by David P. Simon and Jim Campasano:
🧠 Core Insight:
The VIX futures basis—the difference between VIX futures prices and the VIX spot index—does not predict changes in the VIX spot index, but does predict changes in VIX futures prices. This creates trading opportunities due to the convergence of futures prices to spot at expiry.
📊 Main Strategy:
Exploit the "Roll" in the VIX Futures Curve
Short VIX futures when the curve is in contango (futures > spot) beyond a threshold.
Buy VIX futures when the curve is in backwardation (futures < spot) beyond a threshold.
These trades take advantage of the fact that VIX futures converge toward the spot VIX price as they approach settlement.
🔧 Implementation Details:
Daily Roll Signal: The basis is divided by the number of business days to settlement.
Enter short if daily roll > $100
Enter long if daily roll < –$100
Exit criteria:
Daily roll falls below $50 (short) or rises above –$50 (long), or
Maximum holding period of 9 business days
⚖️ Hedging Equity Risk:
Because VIX futures prices move inversely to equity markets, trades are hedged using mini-S&P 500 futures.
Short VIX = Short S&P hedge
Long VIX = Long S&P hedge
Hedge ratios are computed using regressions of VIX futures price changes on S&P futures returns, updated daily out-of-sample.
💵 Profitability Summary (2007–2011):
Strategy | Mean P&L | Sortino Ratio | Notes
Short VIX (hedged) | $792 | 1.26 | 62 trades, 2:1 win/loss
Long VIX (hedged) | $1,018 | 1.03 | 40 trades
Unhedged versions | Slightly higher P&L, but significantly higher downside risk
Transaction costs included full bid-ask spread + brokerage fees.
Strategies showed robust performance even out-of-sample and across sub-periods (pre/post-2009).
🧾 Key Takeaway:
The profitability is mainly driven by the roll yield (the premium or discount in futures relative to spot), and much of the risk can be hedged using equity index futures, enabling statistically significant and economically meaningful returns.
Let me know if you’d like this turned into a code simulation, visual walkthrough, or executive-style slide!
This is a much better response than the one recorded in our book. We truly can’t ask for a better answer. Let’s take ChatGPT up on its offer and turn it into code. The results can be found here: https://chatgpt.com/canvas/shared/680a8810a5808191aaa35e4b31d0a813. Looks great, doesn’t it? What we don’t know, however, is whether ChatGPT used our previous prompt as part of the training data for this version, so this isn’t a completely fair benchmark.
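As a rough illustration of the entry and exit rules in ChatGPT’s summary above, here is a minimal sketch of our own. It is not the code behind the shared link, and it leaves out the S&P hedge and transaction costs. The function names, the position encoding, and the $1,000 contract multiplier used to convert VIX points into dollars are our assumptions.

```python
ENTRY_THRESHOLD = 100.0  # dollars of daily roll needed to enter, per the summary
EXIT_THRESHOLD = 50.0    # dollars of daily roll at which to exit, per the summary
MAX_HOLDING_DAYS = 9     # maximum holding period in business days, per the summary


def daily_roll(futures_price, spot_vix, business_days_to_settlement, multiplier=1000.0):
    """Basis in dollars per contract, spread over the business days to settlement.

    The multiplier of 1000 is an assumption used to convert VIX points to dollars.
    """
    return (futures_price - spot_vix) * multiplier / business_days_to_settlement


def next_position(position, days_held, roll):
    """Return the new position: -1 (short VIX futures), 0 (flat), or +1 (long)."""
    if position == -1 and (roll < EXIT_THRESHOLD or days_held >= MAX_HOLDING_DAYS):
        return 0  # contango has faded or the holding period is used up
    if position == 1 and (roll > -EXIT_THRESHOLD or days_held >= MAX_HOLDING_DAYS):
        return 0  # backwardation has faded or the holding period is used up
    if position == 0:
        if roll > ENTRY_THRESHOLD:
            return -1  # steep contango: short the futures
        if roll < -ENTRY_THRESHOLD:
            return 1   # steep backwardation: buy the futures
    return position


# Hypothetical numbers: futures at 22, spot VIX at 20, 10 business days to settlement.
roll = daily_roll(22.0, 20.0, 10)
position = next_position(0, 0, roll)
```

With these made-up inputs the daily roll is $200, which is above the entry threshold, so the sketch goes short.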
TL;DR
Yes, the answers got much better over the last year. But not all LLMs give equally satisfactory answers: sometimes one has to try quite a few to find a suitable one. We suggest you try out other prompts in our book and see if the answers have improved!