
Python For Quant Interviews: Probability Distributions

Whether you’re tackling prop trading roles, quantitative research, or data science interviews, understanding and implementing key probability distributions is a must. In this comprehensive guide, we’ll break down core distributions—Bernoulli, Binomial, Poisson, Exponential, Uniform, and Gaussian—with intuitive explanations, common quant interview questions, and practical Python examples using scipy.stats.


Why Probability Distributions Matter in Quant Interviews

Probability distributions form the foundation of quantitative modeling. Interviewers expect you to not only understand the math, but also to translate intuition into code. Financial data is inherently stochastic, and modeling risk, returns, and market behavior requires a deep grasp of these concepts. Showing you can implement and interpret distributions in Python is often a deciding factor in landing a quant role.


Bernoulli Distribution

Intuition

The Bernoulli distribution models a single trial with two possible outcomes: success (1) or failure (0), with probability \( p \) of success.

  • Think of flipping a biased coin.
  • The simplest discrete distribution, foundation for many others.

Probability Mass Function (PMF):

\[ P(X = x) = p^x (1-p)^{1-x}, \quad x \in \{0, 1\} \]

| Parameter | Description |
| --- | --- |
| p | Probability of success (0 ≤ p ≤ 1) |

Quant Interview Questions

  • What is the expectation and variance of a Bernoulli random variable?
  • How can you simulate a single coin toss in Python?
  • How does the Bernoulli relate to the Binomial distribution?

Python Example with scipy.stats


from scipy.stats import bernoulli
import numpy as np

# Parameters
p = 0.7

# PMF for 0 and 1
print("P(X=0):", bernoulli.pmf(0, p))
print("P(X=1):", bernoulli.pmf(1, p))

# Random sample of 10 Bernoulli trials
samples = bernoulli.rvs(p, size=10)
print("Samples:", samples)

# Expectation and Variance
mean, var = bernoulli.stats(p)
print("Mean:", mean, "Variance:", var)
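The last interview question above asks how the Bernoulli relates to the Binomial: the sum of \( n \) independent Bernoulli(\( p \)) trials is exactly Binomial(\( n, p \)). A quick sketch checks this empirically (the parameters n = 10, p = 0.7 and the point k = 7 are illustrative choices):

```python
from scipy.stats import bernoulli, binom
import numpy as np

n, p = 10, 0.7

# Sum n Bernoulli(p) draws many times; the sums should follow Binomial(n, p)
trials = bernoulli.rvs(p, size=(100_000, n), random_state=42)
sums = trials.sum(axis=1)

empirical = np.mean(sums == 7)   # fraction of experiments with exactly 7 successes
analytical = binom.pmf(7, n, p)  # exact Binomial probability

print("Empirical P(X=7): ", empirical)
print("Analytical P(X=7):", analytical)
```

With 100,000 repetitions the empirical frequency should land within a few thousandths of the exact PMF value.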

Binomial Distribution

Intuition

The Binomial distribution counts the number of successes in \( n \) independent Bernoulli trials, each with probability \( p \) of success.

  • e.g., Number of heads in 10 coin tosses.
  • Models discrete events over a fixed number of trials.

Probability Mass Function (PMF):

\[ P(X = k) = {n \choose k} p^k (1-p)^{n-k} \]

| Parameter | Description |
| --- | --- |
| n | Number of trials (integer) |
| p | Probability of success per trial (0 ≤ p ≤ 1) |

Quant Interview Questions

  • How do you compute the probability of getting at least 3 heads in 5 coin tosses?
  • What are the mean and variance of the Binomial distribution?
  • How does the Binomial relate to the Poisson distribution?

Python Example with scipy.stats


from scipy.stats import binom

# Parameters
n = 10  # number of trials
p = 0.5 # probability of success

# Probability of exactly 3 successes
print("P(X=3):", binom.pmf(3, n, p))

# Probability of at least 3 successes
prob_at_least_3 = 1 - binom.cdf(2, n, p)
print("P(X>=3):", prob_at_least_3)

# Random sample
samples = binom.rvs(n, p, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = binom.stats(n, p)
print("Mean:", mean, "Variance:", var)

Poisson Distribution

Intuition

The Poisson distribution models the number of events occurring in a fixed interval of time or space, given a constant average rate \( \lambda \) and independence between events.

  • e.g., Number of trades per second on an exchange.
  • Approximates the Binomial when n is large and p is small, with λ = np held fixed.

Probability Mass Function (PMF):

\[ P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} \]

| Parameter | Description |
| --- | --- |
| \(\lambda\) | Average rate of occurrence (λ > 0) |

Quant Interview Questions

  • What is the probability of 2 trades in a second if the average rate is 3 trades/sec?
  • How does the Poisson distribution arise as a limit of the Binomial?
  • What are the properties of the Poisson process?

Python Example with scipy.stats


from scipy.stats import poisson

# Parameters
lam = 3 # average rate

# Probability of exactly 2 events
print("P(X=2):", poisson.pmf(2, lam))

# Probability of at most 2 events
print("P(X<=2):", poisson.cdf(2, lam))

# Random sample
samples = poisson.rvs(lam, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = poisson.stats(lam)
print("Mean:", mean, "Variance:", var)
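The interview question about the Poisson arising as a limit of the Binomial can be checked numerically: hold λ = np fixed and let n grow while p shrinks. A sketch (λ = 3 and the evaluation point k = 2 are illustrative):

```python
from scipy.stats import binom, poisson

# Poisson(λ) as the limit of Binomial(n, p) with λ = np held fixed:
# for large n and small p, the two PMFs nearly coincide.
lam = 3
po = poisson.pmf(2, lam)
for n in (10, 100, 10_000):
    p = lam / n
    b = binom.pmf(2, n, p)
    print(f"n={n:>6}: Binomial P(X=2)={b:.6f}  vs Poisson {po:.6f}")
```

By n = 10,000 the Binomial probability agrees with the Poisson value to several decimal places.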

Exponential Distribution

Intuition

The Exponential distribution models the waiting time between independent Poisson events. It is continuous and memoryless.

  • e.g., Time until next trade arrives.
  • Key property is memorylessness: \( P(T > s + t \mid T > s) = P(T > t) \).

Probability Density Function (PDF):

\[ f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 \]

| Parameter | Description |
| --- | --- |
| \(\lambda\) | Rate parameter (\(\lambda > 0\)) |

Quant Interview Questions

  • Explain the memoryless property. Why is it important?
  • How is the exponential related to the Poisson process?
  • How would you simulate waiting times between trades?

Python Example with scipy.stats


from scipy.stats import expon

# Parameters
lam = 2 # rate parameter

# PDF at x = 1
print("f(x=1):", expon.pdf(1, scale=1/lam))

# Probability X > 2
print("P(X>2):", expon.sf(2, scale=1/lam))

# Random sample
samples = expon.rvs(scale=1/lam, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = expon.stats(scale=1/lam)
print("Mean:", mean, "Variance:", var)
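The memoryless property from the intuition section can also be verified numerically with the survival function `sf` (the choices s = 1.0, t = 0.5 are arbitrary; any nonnegative s and t work):

```python
from scipy.stats import expon

# Numerical check of memorylessness:
# P(T > s + t | T > s) should equal P(T > t) for any s, t >= 0.
lam = 2
s, t = 1.0, 0.5

lhs = expon.sf(s + t, scale=1/lam) / expon.sf(s, scale=1/lam)  # conditional survival
rhs = expon.sf(t, scale=1/lam)                                 # unconditional survival

print("P(T > s+t | T > s):", lhs)
print("P(T > t):          ", rhs)
```

Both quantities equal \( e^{-\lambda t} \), which is exactly why the exponential "forgets" how long you have already waited.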

Uniform Distribution

Intuition

The Uniform distribution models equal probability for all outcomes in a given interval.

  • e.g., Random number between 0 and 1.
  • Can be continuous or discrete; here we use the continuous case, with constant density on [a, b].

Probability Density Function (PDF):

\[ f(x; a, b) = \frac{1}{b-a}, \quad a \leq x \leq b \]

| Parameter | Description |
| --- | --- |
| a | Lower bound |
| b | Upper bound (b > a) |

Quant Interview Questions

  • How do you generate a random float between 5 and 10?
  • What are the mean and variance of a uniform distribution?
  • Why is the uniform distribution used in simulations?

Python Example with scipy.stats


from scipy.stats import uniform

# Parameters
a = 5
b = 10

# PDF at x = 7
print("f(x=7):", uniform.pdf(7, loc=a, scale=b-a))

# Probability X < 8
print("P(X<8):", uniform.cdf(8, loc=a, scale=b-a))

# Random sample
samples = uniform.rvs(loc=a, scale=b-a, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = uniform.stats(loc=a, scale=b-a)
print("Mean:", mean, "Variance:", var)
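The last interview question above (why the uniform underpins simulations) comes down to inverse transform sampling: if \( U \sim \text{Uniform}(0, 1) \), then \( F^{-1}(U) \) follows the distribution with CDF \( F \). A sketch using the exponential, whose inverse CDF is \( -\ln(1-u)/\lambda \) (λ = 2 is an illustrative choice):

```python
from scipy.stats import uniform, expon, kstest
import numpy as np

# Inverse transform sampling: turn Uniform(0, 1) draws into Exponential(λ) draws.
lam = 2
u = uniform.rvs(size=100_000, random_state=42)
x = -np.log(1 - u) / lam

# Compare against scipy's own exponential with a Kolmogorov-Smirnov test.
stat, p_value = kstest(x, expon(scale=1/lam).cdf)
print("KS statistic:", stat, "p-value:", p_value)
print("Sample mean:", x.mean(), "(theory: 1/λ =", 1/lam, ")")
```

A small KS statistic (and a non-tiny p-value) indicates the transformed uniforms are indistinguishable from genuine exponential draws.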

Gaussian (Normal) Distribution

Intuition

The Gaussian (Normal) distribution is the cornerstone of probability and statistics, modeling continuous data with a bell-shaped curve. Many natural and financial phenomena approximate normality, especially due to the Central Limit Theorem.

  • e.g., Daily returns of a stock (approximate), measurement errors.
  • Defined by mean (\( \mu \)) and standard deviation (\( \sigma \)).

Probability Density Function (PDF):

\[ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \]

| Parameter | Description |
| --- | --- |
| \(\mu\) | Mean |
| \(\sigma^2\) | Variance (\(\sigma\) is std. dev.) |

Quant Interview Questions

  • What is the 95% confidence interval for a standard normal variable?
  • How do you generate standard normal samples in Python?
  • State and explain the Central Limit Theorem.
  • How can you compute the probability that a normal variable falls between two values?

Python Example with scipy.stats


from scipy.stats import norm

# Parameters
mu = 0
sigma = 1

# PDF at x = 0
print("f(x=0):", norm.pdf(0, mu, sigma))

# Probability X < 1.96 (standard normal, 97.5th percentile)
print("P(X<1.96):", norm.cdf(1.96, mu, sigma))

# Probability between -1 and 1
p = norm.cdf(1, mu, sigma) - norm.cdf(-1, mu, sigma)
print("P(-1 < X < 1):", p)

# Random sample
samples = norm.rvs(mu, sigma, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = norm.stats(mu, sigma)
print("Mean:", mean, "Variance:", var)
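Two of the interview questions above (the 95% interval and the Central Limit Theorem) can be sketched directly. The CLT demo below averages uniform draws (the sample size of 30 per mean is an illustrative choice):

```python
from scipy.stats import norm, uniform

# 95% interval for a standard normal via the inverse CDF (ppf):
z = norm.ppf(0.975)
print(f"95% of mass lies in [-{z:.4f}, {z:.4f}]")  # the familiar ±1.96

# Central Limit Theorem sketch: means of uniform samples look normal.
sample_means = uniform.rvs(size=(50_000, 30), random_state=42).mean(axis=1)
print("Mean of sample means:", sample_means.mean())  # near 0.5
print("Std of sample means: ", sample_means.std())   # near sqrt(1/12)/sqrt(30)
```

Plotting a histogram of `sample_means` against a fitted normal PDF makes the CLT visually obvious, even though each underlying draw is uniform.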

Summary Table: Distributions Comparison

| Distribution | Type | Parameters | Support | Mean | Variance | Python (scipy.stats) |
| --- | --- | --- | --- | --- | --- | --- |
| Bernoulli | Discrete | p | {0, 1} | p | p(1 − p) | bernoulli |
| Binomial | Discrete | n, p | 0, 1, ..., n | np | np(1 − p) | binom |
| Poisson | Discrete | λ | 0, 1, 2, ... | λ | λ | poisson |
| Exponential | Continuous | λ | x ≥ 0 | 1/λ | 1/λ² | expon |
| Uniform | Continuous | a, b | a ≤ x ≤ b | (a + b)/2 | (b − a)²/12 | uniform |
| Gaussian (Normal) | Continuous | μ, σ | −∞ < x < ∞ | μ | σ² | norm |
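The mean and variance columns of this table can be cross-checked against each distribution's stats() method; a quick sketch (the parameter values are arbitrary illustrations):

```python
from scipy.stats import bernoulli, binom, poisson, expon, uniform, norm

# Cross-check the mean/variance columns of the comparison table.
p, n, lam, a, b = 0.3, 10, 3, 5, 10

checks = {
    "bernoulli": (bernoulli.stats(p),              (p, p * (1 - p))),
    "binom":     (binom.stats(n, p),               (n * p, n * p * (1 - p))),
    "poisson":   (poisson.stats(lam),              (lam, lam)),
    "expon":     (expon.stats(scale=1/lam),        (1/lam, 1/lam**2)),
    "uniform":   (uniform.stats(loc=a, scale=b-a), ((a + b) / 2, (b - a)**2 / 12)),
    "norm":      (norm.stats(loc=0, scale=1),      (0, 1)),
}
for name, ((m, v), (m_th, v_th)) in checks.items():
    print(f"{name:>9}: mean {float(m):.4f} vs {m_th:.4f}, var {float(v):.4f} vs {v_th:.4f}")
```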

Distribution Visualizations in Python

Visualizing probability distributions is highly recommended for developing intuition and for explaining concepts in interviews. Here’s how you can plot the PMFs or PDFs of the major distributions discussed above:


import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import bernoulli, binom, poisson, expon, uniform, norm

fig, axs = plt.subplots(2, 3, figsize=(18, 10))

# Bernoulli
p = 0.7
x = [0, 1]
axs[0, 0].bar(x, bernoulli.pmf(x, p))
axs[0, 0].set_title('Bernoulli PMF (p=0.7)')
axs[0, 0].set_xticks([0, 1])

# Binomial
n, p = 10, 0.5
x = np.arange(0, n+1)
axs[0, 1].bar(x, binom.pmf(x, n, p))
axs[0, 1].set_title('Binomial PMF (n=10, p=0.5)')

# Poisson
lam = 3
x = np.arange(0, 10)
axs[0, 2].bar(x, poisson.pmf(x, lam))
axs[0, 2].set_title('Poisson PMF (λ=3)')

# Exponential
lam = 2
x = np.linspace(0, 3, 100)
axs[1, 0].plot(x, expon.pdf(x, scale=1/lam))
axs[1, 0].set_title('Exponential PDF (λ=2)')

# Uniform
a, b = 5, 10
x = np.linspace(a-1, b+1, 100)
axs[1, 1].plot(x, uniform.pdf(x, loc=a, scale=b-a))
axs[1, 1].set_title('Uniform PDF (a=5, b=10)')

# Gaussian
mu, sigma = 0, 1
x = np.linspace(-4, 4, 100)
axs[1, 2].plot(x, norm.pdf(x, mu, sigma))
axs[1, 2].set_title('Gaussian PDF (μ=0, σ=1)')

for ax in axs.flat:
    ax.grid(True)
plt.tight_layout()
plt.show()

Python Interview Tips for Probability Distributions

In quant interviews, you’ll often be asked to:

  • Write code to simulate or analyze distributions.
  • Compute probabilities, quantiles, or expectations using Python.
  • Visualize distributions or draw conclusions from sample data.
  • Discuss the intuition and applications of each distribution in finance.

Best Practices:

  • Always specify the random_state parameter for reproducibility in rvs().
  • Use pmf for discrete and pdf for continuous distributions.
  • Understand cdf (cumulative distribution function) and ppf (percent point function, i.e., inverse cdf).
  • Know how to vectorize computations for efficiency (avoid Python loops on large data).
  • Be ready to explain why a particular distribution is appropriate for a given problem.
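As a sketch of the cdf/ppf round trip mentioned in the best practices (the inputs x = 1.5 and the 1% level are illustrative):

```python
from scipy.stats import norm

# ppf is the inverse of cdf: ppf(cdf(x)) == x (up to floating point).
x = 1.5
q = norm.cdf(x)       # quantile level of x under the standard normal
x_back = norm.ppf(q)  # invert back to x
print("cdf(1.5):", q)
print("ppf(cdf(1.5)):", x_back)

# ppf is also how you find VaR-style cutoffs, e.g. the 1% left tail:
print("1% quantile:", norm.ppf(0.01))  # roughly -2.33
```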

Common Quant Interview Problems Involving Distributions

  • Bernoulli/Binomial: “If the probability of a stock closing up is 0.55, what’s the probability it closes up exactly 6 times in 10 days?”
  • Poisson: “If trades arrive at an average rate of 5 per minute, what’s the probability of observing at least 7 trades in a minute?”
  • Exponential: “What is the probability that you wait more than 3 seconds for the next trade if λ = 0.5 per second?”
  • Uniform: “How do you simulate a random execution price between $10 and $15?”
  • Gaussian: “Assuming daily returns are normal with μ=0, σ=1%, what is the probability of a loss greater than 2%?”

Practice these types of problems, and make sure you can translate them into efficient Python code during your quant interview.
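As a sketch, four of the problems above translate into scipy.stats one-liners (parameter values taken directly from the problem statements; the uniform case is just `uniform.rvs(loc=10, scale=5)`):

```python
from scipy.stats import binom, poisson, expon, norm

# Binomial: P(exactly 6 up-closes in 10 days), p = 0.55
print("P(X=6):", binom.pmf(6, 10, 0.55))

# Poisson: P(at least 7 trades in a minute), rate 5/min
print("P(X>=7):", poisson.sf(6, 5))

# Exponential: P(wait > 3 s), λ = 0.5 per second
print("P(T>3):", expon.sf(3, scale=1/0.5))

# Gaussian: P(loss > 2%) with μ=0, σ=1% daily returns
print("P(R<-2%):", norm.cdf(-0.02, 0, 0.01))
```

Note the use of `sf` (the survival function, 1 − cdf) for "at least" and "more than" questions, which is both cleaner and more numerically accurate in the tails.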


Applications of Probability Distributions in Quantitative Finance

Understanding the real-world uses of each distribution will impress interviewers:

  • Bernoulli/Binomial: Model binary outcomes (e.g., up/down days, option exercises).
  • Poisson/Exponential: Model event arrivals (trade times, order arrivals).
  • Uniform: Simulate random samples, initialization in Monte Carlo methods.
  • Gaussian: Model returns, risk, portfolio theory, Black-Scholes option pricing.

Demonstrate not only coding but also how each distribution fits into market modeling, derivative pricing, signal processing, or risk assessment.


Distribution Coding Challenges: Practice for Quant Interviews

Try solving these hands-on Python tasks similar to those in real interviews:

  1. Simulate a sequence of 100 Bernoulli trials with p=0.3 and plot the cumulative sum.
    
    from scipy.stats import bernoulli
    import numpy as np
    import matplotlib.pyplot as plt
    
    p = 0.3
    trials = bernoulli.rvs(p, size=100, random_state=42)
    cumsum = np.cumsum(trials)
    plt.plot(cumsum)
    plt.title('Cumulative Sum of Bernoulli Trials (p=0.3)')
    plt.xlabel('Trial')
    plt.ylabel('Cumulative Successes')
    plt.show()
    
  2. Estimate the probability of getting more than 8 heads in 12 tosses of a fair coin, using both the binomial formula and simulation.
    
    from scipy.stats import binom
    import numpy as np
    
    # Analytical probability
    n, p = 12, 0.5
    prob_analytical = 1 - binom.cdf(8, n, p)
    print("P(X>8) analytic:", prob_analytical)
    
    # Simulation
    simulations = 100000
    samples = binom.rvs(n, p, size=simulations, random_state=42)
    prob_simulated = np.mean(samples > 8)
    print("P(X>8) simulated:", prob_simulated)
    
  3. Generate random inter-arrival times for trades (λ=4 per minute), and plot the histogram and exponential PDF.
    
    from scipy.stats import expon
    import numpy as np
    import matplotlib.pyplot as plt
    
    lam = 4
    n_samples = 1000
    samples = expon.rvs(scale=1/lam, size=n_samples, random_state=42)
    
    plt.hist(samples, bins=30, density=True, alpha=0.6, label='Histogram')
    x = np.linspace(0, 2, 100)
    plt.plot(x, expon.pdf(x, scale=1/lam), 'r-', label='Exponential PDF')
    plt.title('Inter-arrival Times (λ=4)')
    plt.xlabel('Time')
    plt.ylabel('Density')
    plt.legend()
    plt.show()
    
  4. Given stock returns are normal (μ=0.1%, σ=2%), what’s the probability of a daily return below -3%?
    
    from scipy.stats import norm
    
    mu, sigma = 0.001, 0.02
    prob = norm.cdf(-0.03, mu, sigma)
    print("Probability of return < -3%:", prob)
    

FAQs on Probability Distributions in Python Quant Interviews

  • Should I use numpy.random or scipy.stats?
    Answer: Use scipy.stats for statistical computations (pmf, pdf, cdf, etc.) and numpy.random for fast sampling in simulations. In interviews, scipy.stats is generally preferred for clarity and statistical functions.
  • How do I check if data is normally distributed?
    Answer: Use visualizations (histogram, QQ plot) and statistical tests like scipy.stats.shapiro or scipy.stats.normaltest.
  • What’s the fastest way to compute cumulative probabilities?
    Answer: Use the cdf method in scipy.stats distributions.
  • Can I fit a distribution to data in scipy.stats?
    Answer: Yes, use the fit method for continuous distributions (e.g., norm.fit(data)).
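A sketch of the fit method mentioned in the last answer, recovering μ and σ from simulated returns (the "true" parameters here are arbitrary illustrations):

```python
from scipy.stats import norm

# Fit a normal distribution to simulated data and recover the parameters.
true_mu, true_sigma = 0.001, 0.02
data = norm.rvs(true_mu, true_sigma, size=50_000, random_state=42)

mu_hat, sigma_hat = norm.fit(data)  # maximum-likelihood estimates (loc, scale)
print("Fitted mu:   ", mu_hat)
print("Fitted sigma:", sigma_hat)
```

With 50,000 observations the fitted values land very close to the true parameters; in an interview, pairing `fit` with a QQ plot is a strong way to argue whether the normal assumption holds.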

Conclusion: Master Probability Distributions for Quant Success

A solid command of probability distributions—and their implementation in Python—will set you apart in quant interviews. Always pair mathematical intuition with code proficiency. Practice interpreting questions in terms of distributions, coding up solutions using scipy.stats, and explaining your reasoning both mathematically and practically.

Keep this guide as a reference as you prepare for your next quant interview. With strong fundamentals and Python skills, you’ll be ready to tackle the toughest probability questions on the spot!

Good luck on your quant interview journey!

