Info - PortfolioPrediction

1. Core Concepts in Portfolio Optimization

Portfolio optimization is a mathematical approach to constructing an investment portfolio by selecting the proportion of various assets. The primary goal is to achieve the best possible balance between maximizing expected returns and minimizing investment risk, often based on historical data.

Key Metrics:

Expected Return (Mean): The return a portfolio is anticipated to generate over a given period, commonly estimated as the average of its historical returns. It measures the potential profit from an investment.
Risk (Volatility/Variance): The degree to which an investment's returns vary over time, reflecting uncertainty about its future value. A higher variance indicates higher risk.
Efficient Frontier: This is a fundamental concept in Modern Portfolio Theory (MPT). It is a curve that represents the set of "optimal" portfolios that offer the highest possible expected return for a given level of risk, or the lowest possible risk for a given expected return. Portfolios below the frontier are sub-optimal, and those above are unattainable.
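
As a short worked formula, the two metrics above combine at the portfolio level as follows. The symbols are introduced here for illustration and are not part of the original text: $ w $ is the vector of asset weights (summing to 1), $ \mu $ the vector of expected asset returns, and $ \Sigma $ the covariance matrix of asset returns.

$$ E[R_p] = w^\top \mu, \qquad \sigma_p^2 = w^\top \Sigma w $$

For two assets the variance expands to $ \sigma_p^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \text{Cov}(R_1, R_2) $, which shows why a low or negative covariance between assets reduces portfolio risk.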

2. Return Estimation Methods

These methods estimate the expected future returns for assets.

Name	Formula	Variable Meaning	Significance
Mean Historical Return	$$ \bar{R} = \frac{1}{N} \sum_{i=1}^{N} R_i $$	$ R_i $: return in period i, $ N $: total periods	Simple average of past returns; assumes future returns mirror past performance.
Exponentially Weighted Mean (EWM)	$$ \bar{R}_{EWM} = \sum_{i=1}^{N} w_i R_i, \quad w_i = (1-\lambda)\lambda^{N-i} $$	$ \lambda $: decay factor (0<λ<1), $ R_i $: historical returns	Gives higher weight to recent data, adapting to current market trends.
CAPM Expected Return	$$ E[R_i] = R_f + \beta_i (E[R_m] - R_f) $$	$ R_f $: risk-free rate, $ \beta_i $: asset’s market sensitivity, $ E[R_m] $: expected market return	Theoretical model explicitly linking systematic risk and expected return.
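
The three estimators in the table can be sketched in plain Python as below. This is a minimal, library-free illustration and not the application's own code (which relies on PyPortfolioOpt). The function names, the default decay factor of 0.94, and the sample returns are assumptions made for the example. The EWM weights are renormalized to sum to 1, a choice made here because a finite sample leaves the raw weights summing to slightly less than 1.

```python
def mean_historical_return(returns):
    """Simple arithmetic mean of per-period returns."""
    return sum(returns) / len(returns)


def ewm_return(returns, lam=0.94):
    """Exponentially weighted mean; the most recent return gets the largest weight.

    Raw weights follow w_i = (1 - lam) * lam**(N - i) and are then
    renormalized so that they sum to 1 (an assumption of this sketch).
    """
    n = len(returns)
    weights = [(1 - lam) * lam ** (n - i) for i in range(1, n + 1)]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, returns)) / total


def capm_expected_return(risk_free, beta, market_return):
    """CAPM: E[R_i] = R_f + beta_i * (E[R_m] - R_f)."""
    return risk_free + beta * (market_return - risk_free)


# Made-up monthly returns, oldest first.
monthly = [0.01, -0.02, 0.03, 0.015, 0.04]
print(mean_historical_return(monthly))
print(ewm_return(monthly, lam=0.5))
print(capm_expected_return(risk_free=0.03, beta=1.2, market_return=0.08))
```

Because the most recent returns in the sample are the highest, the exponentially weighted estimate comes out above the simple mean.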

3. Covariance Estimation Methods

Covariance indicates how two assets move relative to each other, impacting diversification and overall portfolio risk.

Name	Formula	Variable Meaning	Significance
Sample Covariance	$$ \text{Cov}(X,Y) = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}) $$	$ X_i, Y_i $: asset returns, $ \bar{X}, \bar{Y} $: mean returns	Direct estimate of co-movement from historical data; noisy for small datasets.
Exponentially Weighted Covariance	$$ \text{Cov}_{EWM} = \sum_{i} w_i (X_i - \bar{X})(Y_i - \bar{Y}) $$	$ w_i $: exponentially decaying weights, $ \bar{X}, \bar{Y} $: weighted means	Adapts to changing market conditions by emphasizing recent data.
Semicovariance	$$ \text{SemiCov}(X,Y) = \frac{1}{N} \sum_{i=1}^{N} \min(0, X_i - \bar{X}) \min(0, Y_i - \bar{Y}) $$	$ X_i, Y_i $: asset returns, $ \bar{X}, \bar{Y} $: mean returns	Counts only periods in which both assets fall below their means; with X = Y it reduces to the semivariance of a single asset. Useful for investors sensitive to losses.
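
A minimal plain-Python sketch of the three estimators follows. It is illustrative only and not the application's implementation. The function names and sample data are assumptions, and the exponential weights are renormalized to sum to 1 for a finite sample.

```python
def sample_covariance(x, y):
    """Unbiased sample covariance with the 1/(N-1) normalization."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def ewm_covariance(x, y, lam=0.94):
    """Exponentially weighted covariance around the weighted means."""
    n = len(x)
    raw = [lam ** (n - i) for i in range(1, n + 1)]  # newest weight is 1
    total = sum(raw)
    w = [v / total for v in raw]  # renormalized weights (sketch assumption)
    mx = sum(wi * a for wi, a in zip(w, x))
    my = sum(wi * b for wi, b in zip(w, y))
    return sum(wi * (a - mx) * (b - my) for wi, a, b in zip(w, x, y))


def semicovariance(x, y):
    """Downside co-movement: only below-mean deviations contribute."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum(min(0.0, a - mx) * min(0.0, b - my) for a, b in zip(x, y)) / n


# Made-up returns for two assets over four periods.
x = [0.02, -0.01, 0.03, -0.04]
y = [0.01, -0.02, 0.02, -0.03]
print(sample_covariance(x, y))
print(ewm_covariance(x, y, lam=0.9))
print(semicovariance(x, y))
```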

4. Risk Metrics

Risk metrics assess the volatility, tail-risk, and drawdown potential of a portfolio.

Name	Formula	Variable Meaning	Significance
Variance	$$ \sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (R_i - \bar{R})^2 $$	$ R_i $: returns, $ \bar{R} $: mean return	Classical measure of total risk (volatility).
Value at Risk (VaR)	$$ \text{VaR}_\alpha = -F_R^{-1}(\alpha) $$	$ F_R^{-1} $: inverse CDF (quantile function) of returns, $ \alpha $: tail probability (e.g., 0.05 for a 95% confidence level)	Loss threshold over a time period that is exceeded only with probability α.
Conditional VaR (CVaR)	$$ \text{CVaR}_\alpha = -E[R \mid R \leq -\text{VaR}_\alpha] $$	$ R $: returns, $ \alpha $: tail probability	Expected loss given that the loss is at least VaR; better captures extreme downside risk.
Sharpe Ratio	$$ S = \frac{E[R_p - R_f]}{\sigma_p} $$	$ R_p $: portfolio return, $ R_f $: risk-free rate, $ \sigma_p $: portfolio volatility	Measures risk-adjusted performance; higher is better.
Max Drawdown	$$ \text{MDD} = \max_{t} \left( \frac{P_{peak}(t) - P_t}{P_{peak}(t)} \right), \quad P_{peak}(t) = \max_{s \leq t} P_s $$	$ P_t $: portfolio value at time t, $ P_{peak}(t) $: highest value reached up to time t	Shows the largest drop from a peak to a subsequent trough; key for downside assessment.
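
The tail-risk and drawdown metrics can be sketched with the historical (empirical) method as below. This is an illustration under stated assumptions and not the application's code. The function names and sample data are hypothetical, alpha is treated as the tail probability, and losses are reported as positive numbers. Other quantile conventions, such as interpolating between observations, would give slightly different VaR values.

```python
import math


def historical_var(returns, alpha=0.05):
    """Historical VaR as a positive loss: the negated empirical alpha-quantile."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))  # number of tail observations
    return -ordered[k - 1]


def historical_cvar(returns, alpha=0.05):
    """Average loss over the worst alpha fraction of observations."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return -sum(ordered[:k]) / k


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation (per period)."""
    n = len(returns)
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / n
    variance = sum((e - mean) ** 2 for e in excess) / (n - 1)
    return mean / math.sqrt(variance)


def max_drawdown(values):
    """Largest fractional drop from a running peak to a later value."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


# Made-up data: ten period returns and a short portfolio value path.
rets = [-0.05, -0.03, -0.01, 0.0, 0.01, 0.02, 0.02, 0.03, 0.04, 0.05]
path = [100, 120, 90, 110, 80]
print(historical_var(rets, alpha=0.2), historical_cvar(rets, alpha=0.2))
print(sharpe_ratio(rets), max_drawdown(path))
```

With these conventions CVaR is never smaller than VaR at the same alpha, since it averages the losses at or beyond the VaR threshold.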

5. Optimization Goals

Our application supports two main approaches to portfolio optimization:

Mean-Variance Optimization:
This traditional approach, exposed through the /mean_variance endpoint, takes your current portfolio (stocks and weights) as input. The system calculates its Variance, VaR, CVaR, Sharpe Ratio, and Max Drawdown, then finds optimized portfolios under two criteria:
- Return-Optimized for User's Risk: Holding risk at the level of your current portfolio (e.g., its variance), it finds the portfolio that achieves the maximum possible return.
- Risk-Optimized for User's Return: Holding expected return at the level of your current portfolio, it finds the portfolio that minimizes a chosen risk metric (e.g., variance, VaR, or CVaR).
It also generates efficient frontiers (plots showing the trade-off between risk and return) for Variance, VaR, CVaR, Sharpe Ratio, and Max Drawdown.

Risk Optimization (Maximize Return for Target Risk):
Accessed via the /risk_opt endpoint, this feature allows you to specify a target level for a chosen risk metric (e.g., "I want a maximum VaR of 0.02"). The system then optimizes a portfolio to achieve the highest possible return while staying within your specified risk tolerance.
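
The idea of maximizing return within a risk budget can be illustrated with a brute-force search over a two-asset, long-only portfolio. This is only a sketch of the concept: the application uses PyPortfolioOpt rather than a grid search, and the function names, expected returns, and covariance values below are all hypothetical.

```python
def portfolio_stats(w1, mu, cov):
    """Expected return and variance of a two-asset portfolio with weights (w1, 1 - w1)."""
    w2 = 1.0 - w1
    ret = w1 * mu[0] + w2 * mu[1]
    var = w1 ** 2 * cov[0][0] + 2 * w1 * w2 * cov[0][1] + w2 ** 2 * cov[1][1]
    return ret, var


def max_return_for_risk(mu, cov, max_variance, steps=1000):
    """Scan w1 over [0, 1] and keep the highest-return portfolio within the variance cap.

    Returns (w1, expected_return, variance), or None if no weight is feasible.
    """
    best = None
    for k in range(steps + 1):
        w1 = k / steps
        ret, var = portfolio_stats(w1, mu, cov)
        if var <= max_variance and (best is None or ret > best[1]):
            best = (w1, ret, var)
    return best


# Hypothetical inputs: asset 2 has the higher return and the higher variance.
mu = (0.08, 0.12)
cov = [[0.04, 0.006], [0.006, 0.09]]
print(max_return_for_risk(mu, cov, max_variance=0.04))
```

In this example the search moves as much weight as the variance cap allows into the higher-return asset. A cap below the minimum achievable variance yields no feasible portfolio, which the sketch signals by returning None.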

6. Backtesting and Time Series Analysis

To evaluate the robustness of optimization strategies, our platform incorporates backtesting:

Train/Test Split: The available historical data is split into "train" and "test" periods (e.g., 36 months for training, 3 months for testing). The optimization is performed exclusively on the "train" data to derive optimal weights.
Out-of-Sample Evaluation: The optimized weights are then applied to the "test" (out-of-sample) data to simulate how the portfolio would have performed in a period not seen during optimization. This provides a more realistic assessment of the strategy's effectiveness.
Backtrader Integration: We utilize the Backtrader library to run these simulations. This generates cumulative return plots for various portfolio types (user, variance-optimized, VaR-optimized, etc.) over different test horizons (e.g., 1-month, 2-month, 3-month backtests).
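
The train/test procedure described above can be sketched without any backtesting library. The platform itself runs these simulations through Backtrader, so the code below is only a simplified stand-in. The function names are hypothetical, the equal-weight rule is a placeholder for a real optimizer, the returns are made up, and the portfolio is assumed to be rebalanced to its target weights every period.

```python
def train_test_split(rows, train_len, test_len):
    """Return the train window and the test window that immediately follows it."""
    if train_len + test_len > len(rows):
        raise ValueError("not enough history for the requested windows")
    return rows[:train_len], rows[train_len:train_len + test_len]


def equal_weight(train_rows):
    """Placeholder 'optimizer': ignores the data and weights all assets equally."""
    n_assets = len(train_rows[0])
    return [1.0 / n_assets] * n_assets


def cumulative_return(test_rows, weights):
    """Compound a fixed-weight portfolio over the out-of-sample periods."""
    growth = 1.0
    for row in test_rows:
        period_return = sum(w * r for w, r in zip(weights, row))
        growth *= 1.0 + period_return
    return growth - 1.0


# Each row holds one period's returns for two assets (made-up numbers).
history = [
    [0.01, 0.02],
    [0.00, 0.01],
    [0.02, -0.01],
    [0.10, 0.00],
    [-0.05, 0.05],
]
train, test = train_test_split(history, train_len=3, test_len=2)
weights = equal_weight(train)  # weights are derived from the train window only
print(cumulative_return(test, weights))
```

Keeping weight estimation and evaluation in separate functions makes it harder to leak test-period data into the optimization step.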

7. Backend Technologies

The application is built using a combination of Python libraries:

Flask: A lightweight web framework used to handle web requests, routes, and render HTML templates.
PyPortfolioOpt (`pypfopt`): A library for portfolio optimization, providing efficient implementations of Mean-Variance, CVaR, and other optimization techniques, along with various return and risk model estimators.
Backtrader: A Python framework for backtesting trading strategies, allowing realistic simulations of portfolio performance over historical data.
Plotly: An interactive graphing library used to generate the efficient frontier plots and backtesting results, allowing users to hover over points for detailed information.
Pandas & NumPy: Fundamental libraries for data manipulation and numerical operations, used for handling financial time series data and calculations.

8. Data Source

The historical stock price data used for analysis is sourced from Yahoo Finance. Each file typically contains daily Open, High, Low, Close, and Volume data for a single stock. The application reads these files automatically to list the available stocks and compute historical metrics. For the CAPM estimate of expected return, we use the NSE index as the market portfolio.

9. Additional References

Capiński, M. J., & Kopp, E. (2014). Portfolio Theory and Risk Management. Cambridge University Press. https://doi.org/10.1017/cbo9781139017398
