
Top NatWest Group Data Scientist Interview Questions and Answers

In this comprehensive guide, we solve and explain common NatWest Group Data Scientist interview questions, breaking down each concept in detail. Whether you are brushing up on graph theory, clustering differences, probability puzzles, or linear algebra transformations, this resource will help you master the material and ace your interview.

NatWest Group Data Scientist Interview Questions


How many edges does a fully connected (or complete) graph have?

A fully connected graph, also known as a complete graph, is a simple undirected graph in which every pair of distinct vertices is connected by a unique edge.

Suppose the graph has \( n \) vertices. Each vertex can be connected to \( n - 1 \) other vertices. However, since the graph is undirected, each edge joins two vertices, so counting per vertex would count every edge twice.

Formula Derivation

The total number of possible edges:

  • Each of the \( n \) vertices connects to \( n-1 \) others
  • This counts each edge twice (once from each end)

So the total number of edges is:

\[ \frac{n(n-1)}{2} \]

Example Calculation

For a complete graph with 5 vertices:

\[ \frac{5 \times (5-1)}{2} = \frac{5 \times 4}{2} = 10 \]

So, there are 10 edges in a fully connected graph with 5 vertices.
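As a quick sanity check on the formula, the sketch below enumerates every unordered pair of vertices and compares the count with \( \frac{n(n-1)}{2} \). The function name `complete_graph_edges` is a hypothetical helper introduced only for this illustration.

```python
from itertools import combinations

def complete_graph_edges(n):
    """Enumerate the edges of a complete graph on vertices 0..n-1."""
    # Each unordered pair of distinct vertices is exactly one edge.
    return list(combinations(range(n), 2))

for n in range(1, 8):
    edges = complete_graph_edges(n)
    # The brute-force count should match the closed form n(n-1)/2.
    assert len(edges) == n * (n - 1) // 2

print(len(complete_graph_edges(5)))  # 10
```

Using unordered pairs mirrors the "avoid double-counting" step: the pair (a, b) and the pair (b, a) are the same edge.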


Multiple Choice Questions

1. What is the fundamental difference between k-means clustering and KNN?

Let's clarify the key difference between these two popular algorithms:

  • K-means Clustering is an unsupervised learning algorithm. It is used to group unlabeled data into clusters based on similarity, by minimizing intra-cluster variance.
  • KNN (K-Nearest Neighbors) is a supervised learning algorithm. It predicts the label of a new data point from the majority label among its k nearest neighbors in the training set (or, for regression, from the average of their target values).

Fundamental difference: K-means is used for unsupervised clustering, while KNN is used for supervised classification or regression.
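The contrast can be made concrete with a minimal, pure-Python sketch on one-dimensional data. It is an illustration under simplifying assumptions (1-D points, hand-picked initial centroids, a fixed number of iterations), not how a production library implements either algorithm; `knn_predict` and `kmeans_1d` are hypothetical names.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Supervised: train is a list of (value, label) pairs."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    # Majority vote among the labels of the k nearest training points.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def kmeans_1d(points, centroids, iterations=10):
    """Unsupervised: points carry no labels; centroids are initial guesses."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its closest centroid.
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

labelled = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
            (8.0, "high"), (9.0, "high"), (9.5, "high")]
print(knn_predict(labelled, 2.5))  # low

# k-means sees only the values, never the labels.
unlabelled = [value for value, _ in labelled]
centroids, clusters = kmeans_1d(unlabelled, centroids=[0.0, 10.0])
print(clusters)  # [[1.0, 1.5, 2.0], [8.0, 9.0, 9.5]]
```

Note that `knn_predict` cannot run without labels, while `kmeans_1d` never receives any: that is the supervised versus unsupervised distinction in code form.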


2. Which command would be used to ensure any local tracked changes are also shared amongst remote repositories?

This question refers to Git version control commands.

  • git commit saves changes to the local repository.
  • git push uploads local commits to the remote repository.
  • git pull fetches and merges changes from the remote repository into the local repository.

Correct Answer: Once the tracked changes have been committed locally with git commit, share them with the remote repository using:

git push

3. Given 2 fair 6-sided dice and a fair coin, what is the probability of rolling a combined score of 7 with the dice and having the coin show heads after tossing?

Let's break this down step by step:

Step 1: Probability of rolling a combined score of 7 with two dice

There are \( 6 \) faces on each die, so there are \( 6 \times 6 = 36 \) possible outcomes.

To get a sum of 7, the possible pairs are:

  • (1,6)
  • (2,5)
  • (3,4)
  • (4,3)
  • (5,2)
  • (6,1)

So, there are 6 favorable outcomes.

\[ P(\text{sum} = 7) = \frac{6}{36} = \frac{1}{6} \]

Step 2: Probability of tossing a coin and getting heads

Since the coin is fair:

\[ P(\text{heads}) = \frac{1}{2} \]

Step 3: Combined Probability

Assuming independence (rolling dice does not affect the coin):

\[ P(\text{sum} = 7 \ \text{and} \ \text{heads}) = P(\text{sum} = 7) \times P(\text{heads}) = \frac{1}{6} \times \frac{1}{2} = \frac{1}{12} \]
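The result can be double-checked by brute force. The sketch below enumerates all equally likely (die, die, coin) outcomes and counts the favorable ones; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

dice = range(1, 7)
coin = ("H", "T")

# Every (die1, die2, coin) triple is equally likely: 6 * 6 * 2 = 72 outcomes.
outcomes = list(product(dice, dice, coin))

# Favorable: the dice sum to 7 and the coin shows heads.
favorable = [o for o in outcomes if o[0] + o[1] == 7 and o[2] == "H"]

probability = Fraction(len(favorable), len(outcomes))
print(probability)  # 1/12
```

Enumerating the joint sample space gives the same answer as multiplying the two marginal probabilities, which is exactly what independence means here.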


4. Given data matrix X of shape n x d and transformation matrix sigma of shape d x k, which matrix multiplications will produce a transformed data matrix Xdash of shape n x k?

  • X: shape \( n \times d \)
  • \(\Sigma\): shape \( d \times k \)
  • Desired shape for \( X' \): \( n \times k \)

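In matrix multiplication, the number of columns in the first matrix must match the number of rows in the second, and the result has the rows of the first and the columns of the second. So:

\[ \underbrace{X}_{n \times d} \; \underbrace{\Sigma}_{d \times k} = \underbrace{X'}_{n \times k} \]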

Correct multiplication: \( X' = X \cdot \Sigma \)
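The shape rule can be checked with a small pure-Python sketch. The helpers `shape` and `matmul` are hypothetical and written only for this illustration; in practice a library such as NumPy would be used instead.

```python
def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    (n, d), (d2, k) = shape(a), shape(b)
    if d != d2:
        raise ValueError(f"inner dimensions differ: {d} vs {d2}")
    return [[sum(a[i][t] * b[t][j] for t in range(d)) for j in range(k)]
            for i in range(n)]

n, d, k = 4, 3, 2
X = [[1] * d for _ in range(n)]      # n x d data matrix
Sigma = [[1] * k for _ in range(d)]  # d x k transformation matrix

X_dash = matmul(X, Sigma)
print(shape(X_dash))  # (4, 2)

# Reversing the order fails here because the inner dimensions (k and n) differ.
try:
    matmul(Sigma, X)
except ValueError as err:
    print("Sigma times X is undefined:", err)
```

With these example sizes the reversed product is rejected because \( k \neq n \), which is why the order \( X \cdot \Sigma \) matters.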


5. The sigmoid function is defined. What does its value tend to as its input x moves towards negative infinity?

The sigmoid function is commonly defined as:

\[ \sigma(x) = \frac{1}{1 + e^{-x}} \]

As \( x \to -\infty \), \( e^{-x} \to \infty \), so:

\[ \lim_{x \to -\infty} \sigma(x) = \lim_{x \to -\infty} \frac{1}{1 + e^{-x}} = 0 \]

Conclusion: The value of the sigmoid function tends to 0 as \( x \) moves towards negative infinity.
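The limit can also be observed numerically. The sketch below evaluates the sigmoid at increasingly negative inputs; it uses an algebraically equivalent form for negative inputs so that `math.exp` is never called on a large positive number, which is a common stability trick rather than anything the question requires.

```python
import math

def sigmoid(x):
    # For x >= 0, exp(-x) is at most 1, so the textbook form is safe.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For x < 0, rewrite 1 / (1 + e^{-x}) as e^{x} / (1 + e^{x}) to avoid overflow.
    z = math.exp(x)
    return z / (1.0 + z)

for x in (0, -1, -10, -100):
    print(x, sigmoid(x))
```

The printed values shrink towards 0 as the input decreases, matching the analytical limit.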


6. You are playing a card game with three cards. Both sides of one card are black, both sides of another are white, and the remaining card has one black side and one white side. You pick a card at random and see it has a black side face-up. What is the probability that the other side of the card is white?

This is a classic conditional probability puzzle.

Step 1: List the cards and their faces

  • BB: black, black
  • WW: white, white
  • BW: black, white

Each card has 2 sides, so there are 6 faces in total: B, B, W, W, B, W. Because the card is picked at random and either of its sides is equally likely to be face-up, each of these 6 faces is equally likely to be the one you see.

Step 2: List all possible black faces

  • BB card: both sides are black – 2 faces
  • BW card: 1 black face
  • WW card: 0 black faces

So, there are 3 black faces in total:

  • BB (face 1)
  • BB (face 2)
  • BW (black face)

Step 3: Calculate conditional probability

Given that you see a black face, what is the probability that the other side is white?

  • If you picked BB (either face), the other side is black
  • If you picked BW (black face), the other side is white

So, out of the 3 possible black faces, only 1 (the black face of BW) has a white reverse.

\[ P(\text{other side is white} \mid \text{black face up}) = \frac{1}{3} \]
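The face-counting argument can be verified by enumerating every equally likely (face-up, face-down) pair. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

cards = [("B", "B"), ("W", "W"), ("B", "W")]

# Each card is equally likely, and either side is equally likely to be face-up,
# so the 6 (face-up, face-down) pairs below are equally likely.
draws = []
for side_a, side_b in cards:
    draws.append((side_a, side_b))
    draws.append((side_b, side_a))

# Condition on seeing black, then count how often the hidden side is white.
black_up = [d for d in draws if d[0] == "B"]
white_under = [d for d in black_up if d[1] == "W"]

p = Fraction(len(white_under), len(black_up))
print(p)  # 1/3
```

The common wrong answer of 1/2 comes from counting cards instead of faces: the BB card contributes two of the three black faces.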


 

Summary Table

Question | Answer | Explanation
--- | --- | ---
How many edges in a complete graph? | \(\frac{n(n-1)}{2}\) | Each pair of vertices is connected by a unique edge, avoiding double-counting.
Difference between k-means and KNN? | Unsupervised vs. supervised | K-means clusters unlabeled data; KNN predicts from known labels.
Git command to share changes? | git push | Uploads local commits to the remote repository.
Probability: sum 7 (dice) & heads (coin)? | \(\frac{1}{12}\) | Independent probabilities: \(\frac{1}{6} \times \frac{1}{2}\).
Matrix multiplication for transforming data? | \( X' = X \cdot \Sigma \) | Matrix shapes allow multiplication to produce an \( n \times k \) result.
Sigmoid as \( x \to -\infty \)? | 0 | \(\sigma(x) = \frac{1}{1 + e^{-x}}\) tends to 0 as \( x \to -\infty \).
Card game: other side is white? | \(\frac{1}{3}\) | Only 1 out of 3 black faces has a white reverse.

Conclusion

Mastering these NatWest Group Data Scientist interview questions will give you a strong foundation for your interview. Focus on understanding core principles, practicing mathematical reasoning, and applying concepts to practical scenarios. Whether you are solving for the number of edges in a graph, distinguishing between machine learning algorithms, working with matrices, or handling probability puzzles, thorough preparation is the key to success.

Best of luck in your NatWest Group Data Scientist interview!
