
Python For Quant Interviews: Probability Distributions

Whether you’re tackling prop trading roles, quantitative research, or data science interviews, understanding and implementing key probability distributions is a must. In this comprehensive guide, we’ll break down core distributions—Bernoulli, Binomial, Poisson, Exponential, Uniform, and Gaussian—with intuitive explanations, common quant interview questions, and practical Python examples using scipy.stats.


Why Probability Distributions Matter in Quant Interviews

Probability distributions form the foundation of quantitative modeling. Interviewers expect you to not only understand the math, but also to translate intuition into code. Financial data is inherently stochastic, and modeling risk, returns, and market behavior requires a deep grasp of these concepts. Showing you can implement and interpret distributions in Python is often a deciding factor in landing a quant role.


Bernoulli Distribution

Intuition

The Bernoulli distribution models a single trial with two possible outcomes: success (1) or failure (0), with probability \( p \) of success.

  • Think of flipping a biased coin.
  • The simplest discrete distribution, foundation for many others.

Probability Mass Function (PMF):

\[ P(X = x) = p^x (1-p)^{1-x}, \quad x \in \{0, 1\} \]

| Parameter | Description |
| --- | --- |
| p | Probability of success (0 ≤ p ≤ 1) |

Quant Interview Questions

  • What is the expectation and variance of a Bernoulli random variable?
  • How can you simulate a single coin toss in Python?
  • How does the Bernoulli relate to the Binomial distribution?

Python Example with scipy.stats


from scipy.stats import bernoulli
import numpy as np

# Parameters
p = 0.7

# PMF for 0 and 1
print("P(X=0):", bernoulli.pmf(0, p))
print("P(X=1):", bernoulli.pmf(1, p))

# Random sample of 10 Bernoulli trials
samples = bernoulli.rvs(p, size=10)
print("Samples:", samples)

# Expectation and Variance
mean, var = bernoulli.stats(p)
print("Mean:", mean, "Variance:", var)
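The last interview question above asks how the Bernoulli relates to the Binomial: the sum of \( n \) independent Bernoulli(\( p \)) trials is exactly Binomial(\( n, p \)). A quick sketch checks this empirically (the parameters n = 10, p = 0.7 and the point k = 7 are illustrative choices):

```python
from scipy.stats import bernoulli, binom
import numpy as np

n, p = 10, 0.7

# Sum n Bernoulli(p) draws many times; the sums should follow Binomial(n, p)
trials = bernoulli.rvs(p, size=(100_000, n), random_state=42)
sums = trials.sum(axis=1)

empirical = np.mean(sums == 7)   # fraction of experiments with exactly 7 successes
analytical = binom.pmf(7, n, p)  # exact Binomial probability

print("Empirical P(X=7): ", empirical)
print("Analytical P(X=7):", analytical)
```

With 100,000 repetitions the empirical frequency should land within a few thousandths of the exact PMF value.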

Binomial Distribution

Intuition

The Binomial distribution counts the number of successes in \( n \) independent Bernoulli trials, each with probability \( p \) of success.

  • e.g., Number of heads in 10 coin tosses.
  • Models discrete events over a fixed number of trials.

Probability Mass Function (PMF):

\[ P(X = k) = {n \choose k} p^k (1-p)^{n-k} \]

| Parameter | Description |
| --- | --- |
| n | Number of trials (integer) |
| p | Probability of success per trial (0 ≤ p ≤ 1) |

Quant Interview Questions

  • How do you compute the probability of getting at least 3 heads in 5 coin tosses?
  • What are the mean and variance of the Binomial distribution?
  • How does the Binomial relate to the Poisson distribution?

Python Example with scipy.stats


from scipy.stats import binom

# Parameters
n = 10  # number of trials
p = 0.5 # probability of success

# Probability of exactly 3 successes
print("P(X=3):", binom.pmf(3, n, p))

# Probability of at least 3 successes
prob_at_least_3 = 1 - binom.cdf(2, n, p)
print("P(X>=3):", prob_at_least_3)

# Random sample
samples = binom.rvs(n, p, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = binom.stats(n, p)
print("Mean:", mean, "Variance:", var)

Poisson Distribution

Intuition

The Poisson distribution models the number of events occurring in a fixed interval of time or space, given a constant average rate \( \lambda \) and independence between events.

  • e.g., Number of trades per second on an exchange.
  • Approximates the Binomial when n is large and p is small, with λ = np held fixed.

Probability Mass Function (PMF):

\[ P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} \]

| Parameter | Description |
| --- | --- |
| \(\lambda\) | Average rate of occurrence (λ > 0) |

Quant Interview Questions

  • What is the probability of 2 trades in a second if the average rate is 3 trades/sec?
  • How does the Poisson distribution arise as a limit of the Binomial?
  • What are the properties of the Poisson process?

Python Example with scipy.stats


from scipy.stats import poisson

# Parameters
lam = 3 # average rate

# Probability of exactly 2 events
print("P(X=2):", poisson.pmf(2, lam))

# Probability of at most 2 events
print("P(X<=2):", poisson.cdf(2, lam))

# Random sample
samples = poisson.rvs(lam, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = poisson.stats(lam)
print("Mean:", mean, "Variance:", var)
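The interview question about the Poisson arising as a limit of the Binomial can be checked numerically: hold λ = np fixed and let n grow while p shrinks. A sketch (λ = 3 and the evaluation point k = 2 are illustrative):

```python
from scipy.stats import binom, poisson

# Poisson(λ) as the limit of Binomial(n, p) with λ = np held fixed:
# for large n and small p, the two PMFs nearly coincide.
lam = 3
po = poisson.pmf(2, lam)
for n in (10, 100, 10_000):
    p = lam / n
    b = binom.pmf(2, n, p)
    print(f"n={n:>6}: Binomial P(X=2)={b:.6f}  vs Poisson {po:.6f}")
```

By n = 10,000 the Binomial probability agrees with the Poisson value to several decimal places.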

Exponential Distribution

Intuition

The Exponential distribution models the waiting time between independent Poisson events. It is continuous and memoryless.

  • e.g., Time until next trade arrives.
  • Key property is memorylessness: \( P(T > s + t \mid T > s) = P(T > t) \).

Probability Density Function (PDF):

\[ f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \geq 0 \]

| Parameter | Description |
| --- | --- |
| \(\lambda\) | Rate parameter (\(\lambda > 0\)) |

Quant Interview Questions

  • Explain the memoryless property. Why is it important?
  • How is the exponential related to the Poisson process?
  • How would you simulate waiting times between trades?

Python Example with scipy.stats


from scipy.stats import expon

# Parameters
lam = 2 # rate parameter

# PDF at x = 1
print("f(x=1):", expon.pdf(1, scale=1/lam))

# Probability X > 2
print("P(X>2):", expon.sf(2, scale=1/lam))

# Random sample
samples = expon.rvs(scale=1/lam, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = expon.stats(scale=1/lam)
print("Mean:", mean, "Variance:", var)
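The memoryless property from the intuition section can also be verified numerically with the survival function `sf` (the choices s = 1.0, t = 0.5 are arbitrary; any nonnegative s and t work):

```python
from scipy.stats import expon

# Numerical check of memorylessness:
# P(T > s + t | T > s) should equal P(T > t) for any s, t >= 0.
lam = 2
s, t = 1.0, 0.5

lhs = expon.sf(s + t, scale=1/lam) / expon.sf(s, scale=1/lam)  # conditional survival
rhs = expon.sf(t, scale=1/lam)                                 # unconditional survival

print("P(T > s+t | T > s):", lhs)
print("P(T > t):          ", rhs)
```

Both quantities equal \( e^{-\lambda t} \), which is exactly why the exponential "forgets" how long you have already waited.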

Uniform Distribution

Intuition

The Uniform distribution models equal probability for all outcomes in a given interval.

  • e.g., Random number between 0 and 1.
  • Can be continuous or discrete; here we use the continuous case, with constant density on [a, b].

Probability Density Function (PDF):

\[ f(x; a, b) = \frac{1}{b-a}, \quad a \leq x \leq b \]

| Parameter | Description |
| --- | --- |
| a | Lower bound |
| b | Upper bound (b > a) |

Quant Interview Questions

  • How do you generate a random float between 5 and 10?
  • What are the mean and variance of a uniform distribution?
  • Why is the uniform distribution used in simulations?

Python Example with scipy.stats


from scipy.stats import uniform

# Parameters
a = 5
b = 10

# PDF at x = 7
print("f(x=7):", uniform.pdf(7, loc=a, scale=b-a))

# Probability X < 8
print("P(X<8):", uniform.cdf(8, loc=a, scale=b-a))

# Random sample
samples = uniform.rvs(loc=a, scale=b-a, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = uniform.stats(loc=a, scale=b-a)
print("Mean:", mean, "Variance:", var)
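The last interview question above (why the uniform underpins simulations) comes down to inverse transform sampling: if \( U \sim \text{Uniform}(0, 1) \), then \( F^{-1}(U) \) follows the distribution with CDF \( F \). A sketch using the exponential, whose inverse CDF is \( -\ln(1-u)/\lambda \) (λ = 2 is an illustrative choice):

```python
from scipy.stats import uniform, expon, kstest
import numpy as np

# Inverse transform sampling: turn Uniform(0, 1) draws into Exponential(λ) draws.
lam = 2
u = uniform.rvs(size=100_000, random_state=42)
x = -np.log(1 - u) / lam

# Compare against scipy's own exponential with a Kolmogorov-Smirnov test.
stat, p_value = kstest(x, expon(scale=1/lam).cdf)
print("KS statistic:", stat, "p-value:", p_value)
print("Sample mean:", x.mean(), "(theory: 1/λ =", 1/lam, ")")
```

A small KS statistic (and a non-tiny p-value) indicates the transformed uniforms are indistinguishable from genuine exponential draws.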

Gaussian (Normal) Distribution

Intuition

The Gaussian (Normal) distribution is the cornerstone of probability and statistics, modeling continuous data with a bell-shaped curve. Many natural and financial phenomena approximate normality, especially due to the Central Limit Theorem.

  • e.g., Daily returns of a stock (approximate), measurement errors.
  • Defined by mean (\( \mu \)) and standard deviation (\( \sigma \)).

Probability Density Function (PDF):

\[ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \]

| Parameter | Description |
| --- | --- |
| \(\mu\) | Mean |
| \(\sigma^2\) | Variance (\(\sigma\) is std. dev.) |

Quant Interview Questions

  • What is the 95% confidence interval for a standard normal variable?
  • How do you generate standard normal samples in Python?
  • State and explain the Central Limit Theorem.
  • How can you compute the probability that a normal variable falls between two values?

Python Example with scipy.stats


from scipy.stats import norm

# Parameters
mu = 0
sigma = 1

# PDF at x = 0
print("f(x=0):", norm.pdf(0, mu, sigma))

# Probability X < 1.96 (standard normal, 97.5th percentile)
print("P(X<1.96):", norm.cdf(1.96, mu, sigma))

# Probability between -1 and 1
p = norm.cdf(1, mu, sigma) - norm.cdf(-1, mu, sigma)
print("P(-1 < X < 1):", p)

# Random sample
samples = norm.rvs(mu, sigma, size=10)
print("Samples:", samples)

# Mean and Variance
mean, var = norm.stats(mu, sigma)
print("Mean:", mean, "Variance:", var)
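Two of the interview questions above (the 95% interval and the Central Limit Theorem) can be sketched directly. The CLT demo below averages uniform draws (the sample size of 30 per mean is an illustrative choice):

```python
from scipy.stats import norm, uniform

# 95% interval for a standard normal via the inverse CDF (ppf):
z = norm.ppf(0.975)
print(f"95% of mass lies in [-{z:.4f}, {z:.4f}]")  # the familiar ±1.96

# Central Limit Theorem sketch: means of uniform samples look normal.
sample_means = uniform.rvs(size=(50_000, 30), random_state=42).mean(axis=1)
print("Mean of sample means:", sample_means.mean())  # near 0.5
print("Std of sample means: ", sample_means.std())   # near sqrt(1/12)/sqrt(30)
```

Plotting a histogram of `sample_means` against a fitted normal PDF makes the CLT visually obvious, even though each underlying draw is uniform.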

Summary Table: Distributions Comparison

| Distribution | Type | Parameters | Support | Mean | Variance | Python (scipy.stats) |
| --- | --- | --- | --- | --- | --- | --- |
| Bernoulli | Discrete | p | {0, 1} | p | p(1 − p) | bernoulli |
| Binomial | Discrete | n, p | 0, 1, ..., n | np | np(1 − p) | binom |
| Poisson | Discrete | λ | 0, 1, 2, ... | λ | λ | poisson |
| Exponential | Continuous | λ | x ≥ 0 | 1/λ | 1/λ² | expon |
| Uniform | Continuous | a, b | a ≤ x ≤ b | (a + b)/2 | (b − a)²/12 | uniform |
| Gaussian (Normal) | Continuous | μ, σ | −∞ < x < ∞ | μ | σ² | norm |
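The mean and variance columns of this table can be cross-checked against each distribution's stats() method; a quick sketch (the parameter values are arbitrary illustrations):

```python
from scipy.stats import bernoulli, binom, poisson, expon, uniform, norm

# Cross-check the mean/variance columns of the comparison table.
p, n, lam, a, b = 0.3, 10, 3, 5, 10

checks = {
    "bernoulli": (bernoulli.stats(p),              (p, p * (1 - p))),
    "binom":     (binom.stats(n, p),               (n * p, n * p * (1 - p))),
    "poisson":   (poisson.stats(lam),              (lam, lam)),
    "expon":     (expon.stats(scale=1/lam),        (1/lam, 1/lam**2)),
    "uniform":   (uniform.stats(loc=a, scale=b-a), ((a + b) / 2, (b - a)**2 / 12)),
    "norm":      (norm.stats(loc=0, scale=1),      (0, 1)),
}
for name, ((m, v), (m_th, v_th)) in checks.items():
    print(f"{name:>9}: mean {float(m):.4f} vs {m_th:.4f}, var {float(v):.4f} vs {v_th:.4f}")
```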

Distribution Visualizations in Python

Visualizing probability distributions is highly recommended for developing intuition and for explaining concepts in interviews. Here’s how you can plot the PMFs or PDFs of the major distributions discussed above:


import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import bernoulli, binom, poisson, expon, uniform, norm

fig, axs = plt.subplots(2, 3, figsize=(18, 10))

# Bernoulli
p = 0.7
x = [0, 1]
axs[0, 0].bar(x, bernoulli.pmf(x, p))
axs[0, 0].set_title('Bernoulli PMF (p=0.7)')
axs[0, 0].set_xticks([0, 1])

# Binomial
n, p = 10, 0.5
x = np.arange(0, n+1)
axs[0, 1].bar(x, binom.pmf(x, n, p))
axs[0, 1].set_title('Binomial PMF (n=10, p=0.5)')

# Poisson
lam = 3
x = np.arange(0, 10)
axs[0, 2].bar(x, poisson.pmf(x, lam))
axs[0, 2].set_title('Poisson PMF (λ=3)')

# Exponential
lam = 2
x = np.linspace(0, 3, 100)
axs[1, 0].plot(x, expon.pdf(x, scale=1/lam))
axs[1, 0].set_title('Exponential PDF (λ=2)')

# Uniform
a, b = 5, 10
x = np.linspace(a-1, b+1, 100)
axs[1, 1].plot(x, uniform.pdf(x, loc=a, scale=b-a))
axs[1, 1].set_title('Uniform PDF (a=5, b=10)')

# Gaussian
mu, sigma = 0, 1
x = np.linspace(-4, 4, 100)
axs[1, 2].plot(x, norm.pdf(x, mu, sigma))
axs[1, 2].set_title('Gaussian PDF (μ=0, σ=1)')

for ax in axs.flat:
    ax.grid(True)
plt.tight_layout()
plt.show()

Python Interview Tips for Probability Distributions

In quant interviews, you’ll often be asked to:

  • Write code to simulate or analyze distributions.
  • Compute probabilities, quantiles, or expectations using Python.
  • Visualize distributions or draw conclusions from sample data.
  • Discuss the intuition and applications of each distribution in finance.

Best Practices:

  • Always specify the random_state parameter for reproducibility in rvs().
  • Use pmf for discrete and pdf for continuous distributions.
  • Understand cdf (cumulative distribution function) and ppf (percent point function, i.e., inverse cdf).
  • Know how to vectorize computations for efficiency (avoid Python loops on large data).
  • Be ready to explain why a particular distribution is appropriate for a given problem.
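As a sketch of the cdf/ppf round trip mentioned in the best practices (the inputs x = 1.5 and the 1% level are illustrative):

```python
from scipy.stats import norm

# ppf is the inverse of cdf: ppf(cdf(x)) == x (up to floating point).
x = 1.5
q = norm.cdf(x)       # quantile level of x under the standard normal
x_back = norm.ppf(q)  # invert back to x
print("cdf(1.5):", q)
print("ppf(cdf(1.5)):", x_back)

# ppf is also how you find VaR-style cutoffs, e.g. the 1% left tail:
print("1% quantile:", norm.ppf(0.01))  # roughly -2.33
```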

Common Quant Interview Problems Involving Distributions

  • Bernoulli/Binomial: “If the probability of a stock closing up is 0.55, what’s the probability it closes up exactly 6 times in 10 days?”
  • Poisson: “If trades arrive at an average rate of 5 per minute, what’s the probability of observing at least 7 trades in a minute?”
  • Exponential: “What is the probability that you wait more than 3 seconds for the next trade if λ = 0.5 per second?”
  • Uniform: “How do you simulate a random execution price between $10 and $15?”
  • Gaussian: “Assuming daily returns are normal with μ=0, σ=1%, what is the probability of a loss greater than 2%?”

Practice these types of problems, and make sure you can translate them into efficient Python code during your quant interview.
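As a sketch, four of the problems above translate into scipy.stats one-liners (parameter values taken directly from the problem statements; the uniform case is just `uniform.rvs(loc=10, scale=5)`):

```python
from scipy.stats import binom, poisson, expon, norm

# Binomial: P(exactly 6 up-closes in 10 days), p = 0.55
print("P(X=6):", binom.pmf(6, 10, 0.55))

# Poisson: P(at least 7 trades in a minute), rate 5/min
print("P(X>=7):", poisson.sf(6, 5))

# Exponential: P(wait > 3 s), λ = 0.5 per second
print("P(T>3):", expon.sf(3, scale=1/0.5))

# Gaussian: P(loss > 2%) with μ=0, σ=1% daily returns
print("P(R<-2%):", norm.cdf(-0.02, 0, 0.01))
```

Note the use of `sf` (the survival function, 1 − cdf) for "at least" and "more than" questions, which is both cleaner and more numerically accurate in the tails.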


Applications of Probability Distributions in Quantitative Finance

Understanding the real-world uses of each distribution will impress interviewers:

  • Bernoulli/Binomial: Model binary outcomes (e.g., up/down days, option exercises).
  • Poisson/Exponential: Model event arrivals (trade times, order arrivals).
  • Uniform: Simulate random samples, initialization in Monte Carlo methods.
  • Gaussian: Model returns, risk, portfolio theory, Black-Scholes option pricing.

Demonstrate not only coding but also how each distribution fits into market modeling, derivative pricing, signal processing, or risk assessment.


Distribution Coding Challenges: Practice for Quant Interviews

Try solving these hands-on Python tasks similar to those in real interviews:

  1. Simulate a sequence of 100 Bernoulli trials with p=0.3 and plot the cumulative sum.
    
    from scipy.stats import bernoulli
    import numpy as np
    import matplotlib.pyplot as plt
    
    p = 0.3
    trials = bernoulli.rvs(p, size=100, random_state=42)
    cumsum = np.cumsum(trials)
    plt.plot(cumsum)
    plt.title('Cumulative Sum of Bernoulli Trials (p=0.3)')
    plt.xlabel('Trial')
    plt.ylabel('Cumulative Successes')
    plt.show()
    
  2. Estimate the probability of getting more than 8 heads in 12 tosses of a fair coin, using both the binomial formula and simulation.
    
    from scipy.stats import binom
    import numpy as np
    
    # Analytical probability
    n, p = 12, 0.5
    prob_analytical = 1 - binom.cdf(8, n, p)
    print("P(X>8) analytic:", prob_analytical)
    
    # Simulation
    simulations = 100000
    samples = binom.rvs(n, p, size=simulations, random_state=42)
    prob_simulated = np.mean(samples > 8)
    print("P(X>8) simulated:", prob_simulated)
    
  3. Generate random inter-arrival times for trades (λ=4 per minute), and plot the histogram and exponential PDF.
    
    from scipy.stats import expon
    import numpy as np
    import matplotlib.pyplot as plt
    
    lam = 4
    n_samples = 1000
    samples = expon.rvs(scale=1/lam, size=n_samples, random_state=42)
    
    plt.hist(samples, bins=30, density=True, alpha=0.6, label='Histogram')
    x = np.linspace(0, 2, 100)
    plt.plot(x, expon.pdf(x, scale=1/lam), 'r-', label='Exponential PDF')
    plt.title('Inter-arrival Times (λ=4)')
    plt.xlabel('Time')
    plt.ylabel('Density')
    plt.legend()
    plt.show()
    
  4. Given stock returns are normal (μ=0.1%, σ=2%), what’s the probability of a daily return below -3%?
    
    from scipy.stats import norm
    
    mu, sigma = 0.001, 0.02
    prob = norm.cdf(-0.03, mu, sigma)
    print("Probability of return < -3%:", prob)
    

FAQs on Probability Distributions in Python Quant Interviews

  • Should I use numpy.random or scipy.stats?
    Answer: Use scipy.stats for statistical computations (pmf, pdf, cdf, etc.) and numpy.random for fast sampling in simulations. In interviews, scipy.stats is generally preferred for clarity and statistical functions.
  • How do I check if data is normally distributed?
    Answer: Use visualizations (histogram, QQ plot) and statistical tests like scipy.stats.shapiro or scipy.stats.normaltest.
  • What’s the fastest way to compute cumulative probabilities?
    Answer: Use the cdf method in scipy.stats distributions.
  • Can I fit a distribution to data in scipy.stats?
    Answer: Yes, use the fit method for continuous distributions (e.g., norm.fit(data)).
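A sketch of the fit method mentioned in the last answer, recovering μ and σ from simulated returns (the "true" parameters here are arbitrary illustrations):

```python
from scipy.stats import norm

# Fit a normal distribution to simulated data and recover the parameters.
true_mu, true_sigma = 0.001, 0.02
data = norm.rvs(true_mu, true_sigma, size=50_000, random_state=42)

mu_hat, sigma_hat = norm.fit(data)  # maximum-likelihood estimates (loc, scale)
print("Fitted mu:   ", mu_hat)
print("Fitted sigma:", sigma_hat)
```

With 50,000 observations the fitted values land very close to the true parameters; in an interview, pairing `fit` with a QQ plot is a strong way to argue whether the normal assumption holds.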

Conclusion: Master Probability Distributions for Quant Success

A solid command of probability distributions—and their implementation in Python—will set you apart in quant interviews. Always pair mathematical intuition with code proficiency. Practice interpreting questions in terms of distributions, coding up solutions using scipy.stats, and explaining your reasoning both mathematically and practically.

Keep this guide as a reference as you prepare for your next quant interview. With strong fundamentals and Python skills, you’ll be ready to tackle the toughest probability questions on the spot!

Good luck on your quant interview journey!

