
Python For Quant Interviews: Probability Distributions
Whether you’re tackling prop trading roles, quantitative research, or data science interviews, understanding and implementing key probability distributions is a must. In this comprehensive guide, we’ll break down core distributions—Bernoulli, Binomial, Poisson, Exponential, Uniform, and Gaussian—with intuitive explanations, common quant interview questions, and practical Python examples using scipy.stats.
Why Probability Distributions Matter in Quant Interviews
Probability distributions form the foundation of quantitative modeling. Interviewers expect you to not only understand the math, but also to translate intuition into code. Financial data is inherently stochastic, and modeling risk, returns, and market behavior requires a deep grasp of these concepts. Showing you can implement and interpret distributions in Python is often a deciding factor in landing a quant role.
Table of Contents
- Bernoulli Distribution
- Binomial Distribution
- Poisson Distribution
- Exponential Distribution
- Uniform Distribution
- Gaussian (Normal) Distribution
- Summary Table
- Conclusion
Bernoulli Distribution
Intuition
The Bernoulli distribution models a single trial with two possible outcomes: success (1) or failure (0), with probability \( p \) of success.
- Think of flipping a biased coin.
- The simplest discrete distribution, and the foundation for many others.
Probability Mass Function (PMF):
\[ P(X = x) = p^x (1-p)^{1-x}, \quad x \in \{0, 1\} \]
| Parameter | Description |
|---|---|
| p | Probability of success (0 ≤ p ≤ 1) |
Quant Interview Questions
- What is the expectation and variance of a Bernoulli random variable?
- How can you simulate a single coin toss in Python?
- How does the Bernoulli relate to the Binomial distribution?
Python Example with scipy.stats
from scipy.stats import bernoulli
import numpy as np
# Parameters
p = 0.7
# PMF for 0 and 1
print("P(X=0):", bernoulli.pmf(0, p))
print("P(X=1):", bernoulli.pmf(1, p))
# Random sample of 10 Bernoulli trials
samples = bernoulli.rvs(p, size=10)
print("Samples:", samples)
# Expectation and Variance
mean, var = bernoulli.stats(p)
print("Mean:", mean, "Variance:", var)
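To address the last interview question above, here is a quick simulation sketch (sample sizes and parameters chosen arbitrarily) showing that summing independent Bernoulli trials reproduces the Binomial PMF:

```python
from scipy.stats import bernoulli, binom
import numpy as np

# Sum of n independent Bernoulli(p) trials is Binomial(n, p).
# Compare the empirical distribution of such sums to binom.pmf.
n, p = 10, 0.7
# 100,000 experiments, each summing n Bernoulli trials
sums = bernoulli.rvs(p, size=(100_000, n), random_state=42).sum(axis=1)

empirical = np.bincount(sums, minlength=n + 1) / len(sums)
theoretical = binom.pmf(np.arange(n + 1), n, p)
print("Max abs difference:", np.max(np.abs(empirical - theoretical)))
```

The maximum gap between the empirical frequencies and the theoretical PMF shrinks as the number of experiments grows, which is the relationship the question is probing.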
Binomial Distribution
Intuition
The Binomial distribution counts the number of successes in \( n \) independent Bernoulli trials, each with probability \( p \) of success.
- e.g., Number of heads in 10 coin tosses.
- Models discrete events over a fixed number of trials.
Probability Mass Function (PMF):
\[ P(X = k) = {n \choose k} p^k (1-p)^{n-k} \]
| Parameter | Description |
|---|---|
| n | Number of trials (integer) |
| p | Probability of success per trial (0 ≤ p ≤ 1) |
Quant Interview Questions
- How do you compute the probability of getting at least 3 heads in 5 coin tosses?
- What are the mean and variance of the Binomial distribution?
- How does the Binomial relate to the Poisson distribution?
Python Example with scipy.stats
from scipy.stats import binom
# Parameters
n = 10 # number of trials
p = 0.5 # probability of success
# Probability of exactly 3 successes
print("P(X=3):", binom.pmf(3, n, p))
# Probability of at least 3 successes
prob_at_least_3 = 1 - binom.cdf(2, n, p)
print("P(X>=3):", prob_at_least_3)
# Random sample
samples = binom.rvs(n, p, size=10)
print("Samples:", samples)
# Mean and Variance
mean, var = binom.stats(n, p)
print("Mean:", mean, "Variance:", var)
Poisson Distribution
Intuition
The Poisson distribution models the number of events occurring in a fixed interval of time or space, given a constant average rate \( \lambda \) and independence between events.
- e.g., Number of trades per second on an exchange.
- Approximates Binomial when n is large, p is small, and np = λ.
Probability Mass Function (PMF):
\[ P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} \]
| Parameter | Description |
|---|---|
| \(\lambda\) | Average rate of occurrence (λ > 0) |
Quant Interview Questions
- What is the probability of 2 trades in a second if the average rate is 3 trades/sec?
- How does the Poisson distribution arise as a limit of the Binomial?
- What are the properties of the Poisson process?
Python Example with scipy.stats
from scipy.stats import poisson
# Parameters
lam = 3 # average rate
# Probability of exactly 2 events
print("P(X=2):", poisson.pmf(2, lam))
# Probability of at most 2 events
print("P(X<=2):", poisson.cdf(2, lam))
# Random sample
samples = poisson.rvs(lam, size=10)
print("Samples:", samples)
# Mean and Variance
mean, var = poisson.stats(lam)
print("Mean:", mean, "Variance:", var)
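The Binomial-to-Poisson limit from the questions above can be verified numerically: holding np = λ fixed while n grows, the Binomial PMF converges to the Poisson PMF. A minimal sketch:

```python
from scipy.stats import binom, poisson
import numpy as np

# Binomial(n, p) -> Poisson(lambda) as n grows with np = lambda held fixed.
lam = 3
k = np.arange(10)
for n in [10, 100, 10_000]:
    p = lam / n
    max_gap = np.max(np.abs(binom.pmf(k, n, p) - poisson.pmf(k, lam)))
    print(f"n={n:>6}: max |Binomial - Poisson| = {max_gap:.6f}")
```

The printed gap shrinks monotonically as n increases, which is the content of the limit theorem.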
Exponential Distribution
Intuition
The Exponential distribution models the waiting time between independent Poisson events. It is continuous and memoryless.
- e.g., Time until next trade arrives.
- Key property: memorylessness, \( P(T > s + t \mid T > s) = P(T > t) \).
Probability Density Function (PDF):
\[ f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 \]
| Parameter | Description |
|---|---|
| \(\lambda\) | Rate parameter (\(\lambda > 0\)) |
Quant Interview Questions
- Explain the memoryless property. Why is it important?
- How is the exponential related to the Poisson process?
- How would you simulate waiting times between trades?
Python Example with scipy.stats
from scipy.stats import expon
# Parameters
lam = 2 # rate parameter
# PDF at x = 1
print("f(x=1):", expon.pdf(1, scale=1/lam))
# Probability X > 2
print("P(X>2):", expon.sf(2, scale=1/lam))
# Random sample
samples = expon.rvs(scale=1/lam, size=10)
print("Samples:", samples)
# Mean and Variance
mean, var = expon.stats(scale=1/lam)
print("Mean:", mean, "Variance:", var)
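The memoryless property from the interview questions can be checked numerically with the survival function sf; a minimal sketch, with s and t chosen arbitrarily:

```python
from scipy.stats import expon

# Numerical check of memorylessness: P(T > s + t | T > s) = P(T > t).
lam = 2            # same rate as the example above
s, t = 1.0, 0.5
scale = 1 / lam    # scipy parameterizes the exponential by scale = 1/lambda

conditional = expon.sf(s + t, scale=scale) / expon.sf(s, scale=scale)
unconditional = expon.sf(t, scale=scale)
print("P(T > s+t | T > s):", conditional)
print("P(T > t):          ", unconditional)
```

Both quantities agree to floating-point precision, regardless of how much time s has already elapsed.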
Uniform Distribution
Intuition
The Uniform distribution models equal probability for all outcomes in a given interval.
- e.g., Random number between 0 and 1.
- Continuous (or discrete) with constant density.
Probability Density Function (PDF):
\[ f(x; a, b) = \frac{1}{b-a}, \quad a \leq x \leq b \]
| Parameter | Description |
|---|---|
| a | Lower bound |
| b | Upper bound (b > a) |
Quant Interview Questions
- How do you generate a random float between 5 and 10?
- What are the mean and variance of a uniform distribution?
- Why is the uniform distribution used in simulations?
Python Example with scipy.stats
from scipy.stats import uniform
# Parameters
a = 5
b = 10
# PDF at x = 7
print("f(x=7):", uniform.pdf(7, loc=a, scale=b-a))
# Probability X < 8
print("P(X<8):", uniform.cdf(8, loc=a, scale=b-a))
# Random sample
samples = uniform.rvs(loc=a, scale=b-a, size=10)
print("Samples:", samples)
# Mean and Variance
mean, var = uniform.stats(loc=a, scale=b-a)
print("Mean:", mean, "Variance:", var)
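One reason uniforms matter in simulation (the third question above) is inverse transform sampling: applying a distribution's ppf to Uniform(0, 1) draws yields samples from that distribution. A sketch, using an Exponential target as an arbitrary illustration:

```python
import numpy as np
from scipy.stats import uniform, expon, kstest

# Inverse transform sampling: if U ~ Uniform(0, 1), then F^{-1}(U) follows
# the distribution with CDF F. Here we turn uniforms into Exponential(lam=2).
lam = 2
u = uniform.rvs(size=10_000, random_state=42)   # U ~ Uniform(0, 1)
samples = expon.ppf(u, scale=1/lam)             # F^{-1}(U) via ppf

# Kolmogorov-Smirnov test: a small statistic means the samples are
# consistent with the target Exponential distribution.
stat, pvalue = kstest(samples, 'expon', args=(0, 1/lam))
print("KS statistic:", stat, "p-value:", pvalue)
```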
Gaussian (Normal) Distribution
Intuition
The Gaussian (Normal) distribution is the cornerstone of probability and statistics, modeling continuous data with a bell-shaped curve. Many natural and financial phenomena approximate normality, especially due to the Central Limit Theorem.
- e.g., Daily returns of a stock (approximate), measurement errors.
- Defined by mean (\( \mu \)) and standard deviation (\( \sigma \)).
Probability Density Function (PDF):
\[ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \]
| Parameter | Description |
|---|---|
| \(\mu\) | Mean |
| \(\sigma^2\) | Variance (\(\sigma\) is std. dev.) |
Quant Interview Questions
- What is the 95% confidence interval for a standard normal variable?
- How do you generate standard normal samples in Python?
- State and explain the Central Limit Theorem.
- How can you compute the probability that a normal variable falls between two values?
Python Example with scipy.stats
from scipy.stats import norm
# Parameters
mu = 0
sigma = 1
# PDF at x = 0
print("f(x=0):", norm.pdf(0, mu, sigma))
# Probability X < 1.96 (standard normal, 97.5th percentile)
print("P(X<1.96):", norm.cdf(1.96, mu, sigma))
# Probability between -1 and 1
p = norm.cdf(1, mu, sigma) - norm.cdf(-1, mu, sigma)
print("P(-1 < X < 1):", p)
# Random sample
samples = norm.rvs(mu, sigma, size=10)
print("Samples:", samples)
# Mean and Variance
mean, var = norm.stats(mu, sigma)
print("Mean:", mean, "Variance:", var)
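A quick empirical illustration of the Central Limit Theorem mentioned above: means of n Uniform(0, 1) draws concentrate around 0.5 with standard deviation \( \sqrt{1/(12n)} \). A sketch with arbitrarily chosen sample and repetition counts:

```python
import numpy as np

# CLT sketch: averages of n Uniform(0, 1) samples are approximately
# Normal(0.5, 1/(12n)) once n is moderately large.
rng = np.random.default_rng(42)
n, reps = 50, 100_000
means = rng.random((reps, n)).mean(axis=1)

print("Sample mean of means:", means.mean())
print("Sample std of means: ", means.std())
print("Theoretical std:     ", np.sqrt(1 / (12 * n)))
```

The empirical mean and standard deviation of the averages match the CLT prediction closely, even though the underlying uniform distribution looks nothing like a bell curve.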
Summary Table: Distributions Comparison
| Distribution | Type | Parameters | Support | Mean | Variance | Python (scipy.stats) |
|---|---|---|---|---|---|---|
| Bernoulli | Discrete | p | 0, 1 | p | p(1 - p) | bernoulli |
| Binomial | Discrete | n, p | 0, 1, ..., n | np | np(1 - p) | binom |
| Poisson | Discrete | λ | 0, 1, 2, ... | λ | λ | poisson |
| Exponential | Continuous | λ | x ≥ 0 | 1/λ | 1/λ² | expon |
| Uniform | Continuous | a, b | a ≤ x ≤ b | (a + b)/2 | (b − a)²/12 | uniform |
| Gaussian (Normal) | Continuous | μ, σ | −∞ < x < ∞ | μ | σ² | norm |
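The mean and variance columns above can be cross-checked against scipy.stats directly; a sketch with arbitrary parameter values:

```python
from scipy.stats import bernoulli, binom, poisson, expon, uniform, norm

# Cross-check the mean/variance columns of the summary table against scipy.
p, n, lam, a, b, mu, sigma = 0.3, 10, 4.0, 5.0, 10.0, 0.0, 1.0
checks = {
    "Bernoulli": (bernoulli.stats(p), (p, p * (1 - p))),
    "Binomial": (binom.stats(n, p), (n * p, n * p * (1 - p))),
    "Poisson": (poisson.stats(lam), (lam, lam)),
    "Exponential": (expon.stats(scale=1/lam), (1/lam, 1/lam**2)),
    "Uniform": (uniform.stats(loc=a, scale=b - a), ((a + b)/2, (b - a)**2/12)),
    "Normal": (norm.stats(mu, sigma), (mu, sigma**2)),
}
for name, ((m, v), (m_ref, v_ref)) in checks.items():
    print(f"{name:>12}: mean {float(m):.4f} vs {m_ref:.4f}, "
          f"var {float(v):.4f} vs {v_ref:.4f}")
```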
Distribution Visualizations in Python
Visualizing probability distributions is highly recommended for developing intuition and for explaining concepts in interviews. Here’s how you can plot the PMFs or PDFs of the major distributions discussed above:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import bernoulli, binom, poisson, expon, uniform, norm
fig, axs = plt.subplots(2, 3, figsize=(18, 10))
# Bernoulli
p = 0.7
x = [0, 1]
axs[0, 0].bar(x, bernoulli.pmf(x, p))
axs[0, 0].set_title('Bernoulli PMF (p=0.7)')
axs[0, 0].set_xticks([0, 1])
# Binomial
n, p = 10, 0.5
x = np.arange(0, n+1)
axs[0, 1].bar(x, binom.pmf(x, n, p))
axs[0, 1].set_title('Binomial PMF (n=10, p=0.5)')
# Poisson
lam = 3
x = np.arange(0, 10)
axs[0, 2].bar(x, poisson.pmf(x, lam))
axs[0, 2].set_title('Poisson PMF (λ=3)')
# Exponential
lam = 2
x = np.linspace(0, 3, 100)
axs[1, 0].plot(x, expon.pdf(x, scale=1/lam))
axs[1, 0].set_title('Exponential PDF (λ=2)')
# Uniform
a, b = 5, 10
x = np.linspace(a-1, b+1, 100)
axs[1, 1].plot(x, uniform.pdf(x, loc=a, scale=b-a))
axs[1, 1].set_title('Uniform PDF (a=5, b=10)')
# Gaussian
mu, sigma = 0, 1
x = np.linspace(-4, 4, 100)
axs[1, 2].plot(x, norm.pdf(x, mu, sigma))
axs[1, 2].set_title('Gaussian PDF (μ=0, σ=1)')
for ax in axs.flat:
    ax.grid(True)
plt.tight_layout()
plt.show()
Python Interview Tips for Probability Distributions
In quant interviews, you’ll often be asked to:
- Write code to simulate or analyze distributions.
- Compute probabilities, quantiles, or expectations using Python.
- Visualize distributions or draw conclusions from sample data.
- Discuss the intuition and applications of each distribution in finance.
Best Practices:
- Always specify the random_state parameter in rvs() for reproducibility.
- Use pmf for discrete and pdf for continuous distributions.
- Understand cdf (cumulative distribution function) and ppf (percent point function, i.e., inverse cdf).
- Know how to vectorize computations for efficiency (avoid Python loops on large data).
- Be ready to explain why a particular distribution is appropriate for a given problem.
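To illustrate the cdf/ppf pairing from the best practices above: the two methods are inverses, and ppf gives quantiles such as the 1.96 bound behind the standard 95% interval. A minimal sketch:

```python
from scipy.stats import norm

# cdf maps a value to a probability; ppf maps a probability back to a value.
q = norm.cdf(1.96)                 # close to 0.975
print("cdf(1.96):", q)
print("ppf(q):   ", norm.ppf(q))   # recovers 1.96
print("ppf(0.975):", norm.ppf(0.975))  # the familiar 95% two-sided bound
```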
Common Quant Interview Problems Involving Distributions
- Bernoulli/Binomial: “If the probability of a stock closing up is 0.55, what’s the probability it closes up exactly 6 times in 10 days?”
- Poisson: “If trades arrive at an average rate of 5 per minute, what’s the probability of observing at least 7 trades in a minute?”
- Exponential: “What is the probability that you wait more than 3 seconds for the next trade if λ = 0.5 per second?”
- Uniform: “How do you simulate a random execution price between $10 and $15?”
- Gaussian: “Assuming daily returns are normal with μ=0, σ=1%, what is the probability of a loss greater than 2%?”
Practice these types of problems, and make sure you can translate them into efficient Python code during your quant interview.
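As one worked example, the Poisson problem above ("at least 7 trades at an average rate of 5 per minute") can be sketched with the survival function:

```python
from scipy.stats import poisson

# P(X >= 7) for X ~ Poisson(5). Note sf(k) = P(X > k) = P(X >= k + 1),
# so "at least 7" is sf(6).
lam = 5
prob = poisson.sf(6, lam)
print("P(X >= 7):", prob)
```

The off-by-one in sf versus "at least" is a classic interview stumbling point, so it is worth stating the identity out loud when you use it.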
Applications of Probability Distributions in Quantitative Finance
Understanding the real-world uses of each distribution will impress interviewers:
- Bernoulli/Binomial: Model binary outcomes (e.g., up/down days, option exercises).
- Poisson/Exponential: Model event arrivals (trade times, order arrivals).
- Uniform: Simulate random samples, initialization in Monte Carlo methods.
- Gaussian: Model returns, risk, portfolio theory, Black-Scholes option pricing.
Demonstrate not only coding but also how each distribution fits into market modeling, derivative pricing, signal processing, or risk assessment.
Distribution Coding Challenges: Practice for Quant Interviews
Try solving these hands-on Python tasks similar to those in real interviews:
- Simulate a sequence of 100 Bernoulli trials with p=0.3 and plot the cumulative sum.
from scipy.stats import bernoulli
import numpy as np
import matplotlib.pyplot as plt

p = 0.3
trials = bernoulli.rvs(p, size=100, random_state=42)
cumsum = np.cumsum(trials)
plt.plot(cumsum)
plt.title('Cumulative Sum of Bernoulli Trials (p=0.3)')
plt.xlabel('Trial')
plt.ylabel('Cumulative Successes')
plt.show()
- Estimate the probability of getting more than 8 heads in 12 tosses of a fair coin, using both the binomial formula and simulation.
from scipy.stats import binom
import numpy as np

# Analytical probability
n, p = 12, 0.5
prob_analytical = 1 - binom.cdf(8, n, p)
print("P(X>8) analytic:", prob_analytical)

# Simulation
simulations = 100000
samples = binom.rvs(n, p, size=simulations, random_state=42)
prob_simulated = np.mean(samples > 8)
print("P(X>8) simulated:", prob_simulated)
- Generate random inter-arrival times for trades (λ=4 per minute), and plot the histogram and exponential PDF.
from scipy.stats import expon
import numpy as np
import matplotlib.pyplot as plt

lam = 4
n_samples = 1000
samples = expon.rvs(scale=1/lam, size=n_samples, random_state=42)
plt.hist(samples, bins=30, density=True, alpha=0.6, label='Histogram')
x = np.linspace(0, 2, 100)
plt.plot(x, expon.pdf(x, scale=1/lam), 'r-', label='Exponential PDF')
plt.title('Inter-arrival Times (λ=4)')
plt.xlabel('Time')
plt.ylabel('Density')
plt.legend()
plt.show()
- Given stock returns are normal (μ=0.1%, σ=2%), what's the probability of a daily return below −3%?
from scipy.stats import norm

mu, sigma = 0.001, 0.02
prob = norm.cdf(-0.03, mu, sigma)
print("Probability of return < -3%:", prob)
FAQs on Probability Distributions in Python Quant Interviews
- Should I use numpy.random or scipy.stats?
Answer: Use scipy.stats for statistical computations (pmf, pdf, cdf, etc.) and numpy.random for fast sampling in simulations. In interviews, scipy.stats is generally preferred for clarity and its statistical functions.
- How do I check if data is normally distributed?
Answer: Use visualizations (histogram, QQ plot) and statistical tests such as scipy.stats.shapiro or scipy.stats.normaltest.
- What's the fastest way to compute cumulative probabilities?
Answer: Use the cdf method of scipy.stats distributions.
- Can I fit a distribution to data in scipy.stats?
Answer: Yes, use the fit method for continuous distributions (e.g., norm.fit(data)).
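The fitting answer above can be sketched as follows, using synthetic data so the recovered parameters are known (the "returns"-like values are chosen arbitrarily):

```python
from scipy.stats import norm
import numpy as np

# Fit a normal to sample data with norm.fit, which returns the
# maximum-likelihood estimates of loc (mu) and scale (sigma).
rng = np.random.default_rng(0)
data = rng.normal(loc=0.001, scale=0.02, size=5_000)  # synthetic "returns"

mu_hat, sigma_hat = norm.fit(data)
print("Fitted mu:   ", mu_hat)     # close to 0.001
print("Fitted sigma:", sigma_hat)  # close to 0.02
```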
Conclusion: Master Probability Distributions for Quant Success
A solid command of probability distributions—and their implementation in Python—will set you apart in quant interviews. Always pair mathematical intuition with code proficiency. Practice interpreting questions in terms of distributions, coding up solutions using scipy.stats, and explaining your reasoning both mathematically and practically.
Keep this guide as a reference as you prepare for your next quant interview. With strong fundamentals and Python skills, you’ll be ready to tackle the toughest probability questions on the spot!
Good luck on your quant interview journey!
Further Reading & References
- Scipy.stats Documentation
- Introduction to Probability
- Heard on the Street: Quant Interview Prep
- QuantStart Interview Questions
