Python Practice Problems for Data Interviews
Problem Statement: Write a Python function that computes the dot product of a matrix and a vector. The function should return a list representing the resulting vector, or -1 if the number of matrix columns does not match the length of the vector.
import numpy as np  # Import the NumPy library for numerical operations

def matrix_vector_dot(matrix, vector):
    """
    Computes the dot product of a matrix and a vector.

    Parameters:
        matrix (list of lists): The input matrix.
        vector (list): The input vector.

    Returns:
        list: The resulting vector after the dot product,
        or -1 if the dimensions are incompatible.
    """
    # The number of columns in the matrix must equal the vector length
    if len(matrix[0]) != len(vector):
        return -1
    # Compute the dot product and convert the result to a list
    return np.dot(matrix, vector).tolist()
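To make the row-by-row arithmetic behind matrix_vector_dot explicit, here is a minimal pure-Python sketch of the same computation. The name matrix_vector_dot_pure is hypothetical, and returning -1 for an empty matrix is an assumption of this sketch rather than behavior of the NumPy version above.

```python
def matrix_vector_dot_pure(matrix, vector):
    """Hypothetical pure-Python counterpart of matrix_vector_dot."""
    # Assumption: an empty matrix is treated as a dimension mismatch.
    # Every row must have exactly as many entries as the vector.
    if not matrix or any(len(row) != len(vector) for row in matrix):
        return -1
    # Each output entry is the sum of element-wise products of one row with the vector.
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


result = matrix_vector_dot_pure([[1, 2], [3, 4]], [5, 6])
print(result)    # [17, 39]  (1*5 + 2*6 = 17, 3*5 + 4*6 = 39)

mismatch = matrix_vector_dot_pure([[1, 2], [3, 4]], [5, 6, 7])
print(mismatch)  # -1
```

Working one small case by hand like this is a quick way to check the NumPy result.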
Problem Statement: Write a Python function that multiplies a matrix by a scalar and returns the result.
import numpy as np  # Import the NumPy library for numerical operations

def scalar_multiply(matrix, scalar):
    """
    Multiplies each element of the matrix by a scalar.

    Parameters:
        matrix (list of lists): The input matrix.
        scalar (float): The scalar value to multiply with.

    Returns:
        list of lists: The resulting matrix after scalar multiplication.
    """
    # Convert the matrix to a NumPy array, multiply by the scalar, and convert back to a list
    return (np.array(matrix) * scalar).tolist()
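The same element-wise operation can be written without NumPy as a nested list comprehension. This is only a sketch for comparison; scalar_multiply_pure is a hypothetical name, not part of the solution above.

```python
def scalar_multiply_pure(matrix, scalar):
    """Hypothetical pure-Python counterpart of scalar_multiply."""
    # Build a new matrix; the input rows are left unmodified.
    return [[element * scalar for element in row] for row in matrix]


scaled = scalar_multiply_pure([[1, 2], [3, 4]], 2)
print(scaled)  # [[2, 4], [6, 8]]
```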
Problem Statement: Write a Python function that uses the Jacobi method to solve a system of linear equations given by Ax = b. The function should iterate n times.
import numpy as np  # Import the NumPy library for numerical operations

def jacobi_method(A, b, n):
    """
    Solves the system of linear equations Ax = b using the Jacobi method.

    Parameters:
        A (list of lists): Coefficient matrix.
        b (list): Constant terms vector.
        n (int): Number of iterations.

    Returns:
        list: Approximate solution vector after n iterations.
    """
    A = np.asarray(A, dtype=float)  # Work with float arrays so integer inputs are not truncated
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)  # Initialize the solution vector with zeros
    D = np.diag(A)  # Extract the diagonal elements of A
    R = A - np.diagflat(D)  # Compute the remainder matrix (A without the diagonal)
    for _ in range(n):
        x = (b - np.dot(R, x)) / D  # Update the solution vector
    return x.tolist()  # Convert the result to a list and return
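The Jacobi iteration is not guaranteed to converge for every matrix. Strict diagonal dominance of A is a sufficient (though not necessary) condition, so it is worth checking before iterating. The sketch below repeats the update rule from jacobi_method so that it runs on its own; the names is_strictly_diagonally_dominant and jacobi_sketch and the example system are illustrative assumptions.

```python
import numpy as np


def is_strictly_diagonally_dominant(A):
    """Return True if |a_ii| > sum of |a_ij| over j != i holds in every row."""
    A = np.asarray(A, dtype=float)
    diagonal = np.abs(np.diag(A))
    off_diagonal = np.abs(A).sum(axis=1) - diagonal
    return bool(np.all(diagonal > off_diagonal))


def jacobi_sketch(A, b, n):
    """Run n Jacobi iterations starting from the zero vector."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    D = np.diag(A)           # diagonal entries of A
    R = A - np.diagflat(D)   # A with its diagonal zeroed out
    x = np.zeros_like(b)
    for _ in range(n):
        x = (b - R @ x) / D
    return x


# Example system chosen so that the exact solution is x = [1, 2].
A = [[4, 1], [2, 5]]
b = [6, 12]

dominant = is_strictly_diagonally_dominant(A)
approx = jacobi_sketch(A, b, 50)
print(dominant)                         # True
print(np.allclose(approx, [1.0, 2.0]))  # True

not_dominant = is_strictly_diagonally_dominant([[1, 3], [2, 1]])
print(not_dominant)                     # False
```

For a matrix that fails the check, the iteration may still converge, but the number of iterations needed (or whether it converges at all) has to be established some other way.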
Problem Statement: Write a Python function that performs linear regression using gradient descent. The function should take NumPy arrays X (features with a column of ones for the intercept) and y (target values), a learning rate, and the number of iterations.
import numpy as np  # Import the NumPy library for numerical operations

def linear_regression(X, y, learning_rate, iterations):
    """
    Performs linear regression using gradient descent.

    Parameters:
        X (numpy.ndarray): Feature matrix with a column of ones for the intercept.
        y (numpy.ndarray): Target values vector.
        learning_rate (float): Learning rate for gradient descent.
        iterations (int): Number of iterations.

    Returns:
        numpy.ndarray: Coefficients vector after training.
    """
    m, n = X.shape  # Get the number of samples (m) and features (n)
    theta = np.zeros(n)  # Initialize the coefficients vector with zeros
    for _ in range(iterations):
        gradient = np.dot(X.T, np.dot(X, theta) - y) / m  # Compute the gradient
        theta -= learning_rate * gradient  # Update the coefficients
    return theta  # Return the trained coefficients
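The gradient used above, X^T (X theta - y) / m, is the gradient of the mean squared error cost (1 / 2m) * ||X theta - y||^2. One way to sanity-check a gradient descent solution is to compare it with the least-squares solution from np.linalg.lstsq. The sketch below repeats the update rule so that it runs on its own; the function name gradient_descent_fit, the synthetic data, and the chosen learning rate and iteration count are assumptions for illustration.

```python
import numpy as np


def gradient_descent_fit(X, y, learning_rate, iterations):
    """Same update rule as linear_regression above, repeated so this block runs alone."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        gradient = X.T @ (X @ theta - y) / m
        theta -= learning_rate * gradient
    return theta


# Synthetic, noise-free data following y = 2 + 3 * x (illustrative values).
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])  # first column of ones for the intercept
y = 2.0 + 3.0 * x

theta_gd = gradient_descent_fit(X, y, learning_rate=0.5, iterations=5000)
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)  # direct least-squares solution

print(np.allclose(theta_gd, theta_ls, atol=1e-6))  # True
```

If the learning rate is too large the updates diverge, and if it is too small many more iterations are needed, so a disagreement with the least-squares solution usually points at one of those two settings.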
Problem Statement: Write a Python function that implements the k-Means clustering algorithm. The function should take the data points, the number of clusters k, and the number of iterations, and return a list with the cluster assignment of each data point.
import numpy as np  # Import the NumPy library for numerical operations

def k_means_clustering(data, k, iterations):
    """
    Performs k-Means clustering on the given data.

    Parameters:
        data (numpy.ndarray): The input data points.
        k (int): Number of clusters.
        iterations (int): Number of iterations.

    Returns:
        list: Cluster assignments for each data point.
    """
    data = np.asarray(data, dtype=float)  # Work in floats so centroid means are not truncated
    centroids = data[np.random.choice(len(data), k, replace=False)]  # Initialize centroids randomly
    labels = np.zeros(len(data), dtype=int)  # Defined up front so iterations=0 still returns a result
    for _ in range(iterations):
        distances = np.linalg.norm(data[:, np.newaxis] - centroids, axis=2)  # Compute distances to centroids
        labels = np.argmin(distances, axis=1)  # Assign clusters based on closest centroid
        for i in range(k):
            points = data[labels == i]  # Get all points assigned to cluster i
            if points.size:
                centroids[i] = points.mean(axis=0)  # Update centroid to mean of assigned points
    return labels.tolist()  # Convert the result to a list and return
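Because the initial centroids are drawn at random, repeated runs can number the clusters differently, and on some data sets they can even settle on different partitions. The sketch below adds a seed parameter (an addition for reproducibility, not part of the function above) and runs the same algorithm on two well-separated groups of one-feature points, where the outcome is easy to verify by hand. The name k_means_sketch and the data are illustrative assumptions.

```python
import numpy as np


def k_means_sketch(data, k, iterations, seed=0):
    """k-Means as above, with a seeded random generator (the seed is an added parameter)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Pick k distinct data points as the initial centroids.
    centroids = data[rng.choice(len(data), k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    for _ in range(iterations):
        # distances[p, c] is the Euclidean distance from point p to centroid c.
        distances = np.linalg.norm(data[:, np.newaxis] - centroids, axis=2)
        labels = np.argmin(distances, axis=1)
        for i in range(k):
            points = data[labels == i]
            if points.size:  # leave a centroid in place if its cluster is empty
                centroids[i] = points.mean(axis=0)
    return labels.tolist()


# Two well-separated groups, one feature per point (shape (6, 1)).
points = np.array([[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]])
labels = k_means_sketch(points, k=2, iterations=10)

same_left = len(set(labels[:3])) == 1    # first three points share a label
same_right = len(set(labels[3:])) == 1   # last three points share a label
print(same_left, same_right, labels[0] != labels[3])  # True True True
```

Which group receives label 0 and which receives label 1 depends on the initial draw, so comparisons between runs should look at the grouping rather than at the label values themselves.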