Goldman Sachs visits our campus every year for both campus placements and internships. Of late, their process has become pretty stable: they generally conduct a three-part online test, followed by interviews on day 1 (December 1). I applied via the campus placement cell, was shortlisted for interviews, and was eventually given an offer (which I accepted). This is an overview of both the test and the interview process I went through.

**Online test :**

The test is held on Hackerrank, and you are allowed to switch between sections. Each section has a separate timer, which stops when you switch away. One strategy could be to sacrifice some section (ML, since it was shit) and use that time for the other questions. Two sections can be done fully if you have some luck and manage your time properly. Also note that GS will not use the entire test result to shortlist. Different teams look at different things, so if you do really well in one or two sections, you have a good shot at an interview.

### CS :

60 minutes, 5 MCQs, 2 coding questions. Each MCQ was +10/-3. The coding questions were worth +20 and +30 with partial marking (it wasn’t specified how much, but Hackerrank does show how many test cases passed).

(30) points

You are given a list of n tourist bookings (start date, duration), and the total number of tourists that can simultaneously be in the country. When processing a booking, you have to check whether the number of tourists in the country would exceed the number allowed, and if so, deny the booking. O(n^2) was obviously giving TLE; it is possible to do it in O(nlogn).
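One way to reach O(nlogn) is a min-heap of departure dates. This is only a sketch under assumptions the problem statement above doesn't confirm: each booking is a single tourist, bookings are processed in order of start date, and a tourist who leaves on day `start + duration` no longer counts on that day. The name `process_bookings` is made up for illustration.

```python
import heapq


def process_bookings(bookings, capacity):
    """Return one True/False per booking (accepted or denied).

    Assumes `bookings` is a list of (start, duration) pairs sorted by start,
    and that each booking is exactly one tourist (both are assumptions).
    """
    departures = []  # min-heap of departure dates of accepted bookings
    decisions = []
    for start, duration in bookings:
        # Drop everyone who has already left by this start date.
        while departures and departures[0] <= start:
            heapq.heappop(departures)
        if len(departures) < capacity:
            heapq.heappush(departures, start + duration)
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions
```

Each booking is pushed and popped at most once, so the total work is O(nlogn).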

(20) points

You are given a number (in the form of a string) and an integer k. You have to output the maximum palindromic number that can be formed from the given number by using at most k changes (replacing a digit is the only allowed operation). Output -1 if it is not possible to get a palindrome.
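A two-pass greedy is one common approach to this kind of problem (not necessarily the intended solution): first spend changes to fix mismatched pairs, then spend whatever is left upgrading pairs to 9, starting from the most significant digit. The function name is hypothetical.

```python
def max_palindrome(number: str, k: int) -> str:
    digits = list(number)
    n = len(digits)
    changed = [False] * n  # pairs already touched in the first pass
    used = 0

    # Pass 1: make it a palindrome as cheaply as possible,
    # copying the larger digit of each mismatched pair.
    for i in range(n // 2):
        j = n - 1 - i
        if digits[i] != digits[j]:
            digits[i] = digits[j] = max(digits[i], digits[j])
            changed[i] = True
            used += 1
    if used > k:
        return "-1"

    # Pass 2: upgrade pairs to 9 from the left. A pair fixed in pass 1
    # costs one extra change, an untouched pair costs two.
    spare = k - used
    for i in range(n // 2):
        j = n - 1 - i
        if digits[i] == "9":
            continue
        cost = 1 if changed[i] else 2
        if spare >= cost:
            digits[i] = digits[j] = "9"
            spare -= cost

    # The middle digit of an odd-length number can be raised for one change.
    if n % 2 == 1 and spare >= 1:
        digits[n // 2] = "9"
    return "".join(digits)
```

For example, `max_palindrome("3943", 1)` gives `"3993"`, and `max_palindrome("0011", 1)` gives `"-1"` because two mismatched pairs need two changes.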

MCQs :

- You are given a BST (filled with some 6-7 values), and the question was that how many input orders could be given such that you end up with the same BST. (As in, you are given a stream of input and you make a BST out of them normally without balancing etc, how many streams will give same BST)
- We wish to solve Sudoku filling using graph coloring, by putting an edge between nodes that can’t have the same number. What is the number of edges in this graph? Ans: 810 (the graph has 81 nodes, each adjacent to 20 others, so 81*20/2)
- Given a number (n bits long), what is the complexity of finding if the number is a power of 2? Ans: **O(n)**
- What is the extra space complexity for maintaining the next min element in a stack? Ans: O(1)
- What is a possible output for n |= n>>1, n|= n>>2, n|= n>>4, … n|= n>>16, cout<<(n>>1) Ans: 127
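Two of these are easy to sanity-check by brute force. The sketch below assumes the standard 9x9 Sudoku graph (one node per cell, an edge between two cells that share a row, column, or 3x3 box) and reads the bit question as the usual shifts of 1, 2, 4, 8, 16. Both function names are made up.

```python
from itertools import combinations


def sudoku_conflict_edges() -> int:
    """Count pairs of cells that may not hold the same digit."""
    cells = [(r, c) for r in range(9) for c in range(9)]

    def conflict(a, b):
        same_row = a[0] == b[0]
        same_col = a[1] == b[1]
        same_box = (a[0] // 3, a[1] // 3) == (b[0] // 3, b[1] // 3)
        return same_row or same_col or same_box

    return sum(1 for a, b in combinations(cells, 2) if conflict(a, b))


def smear_then_halve(n: int) -> int:
    """Apply n |= n >> s for s = 1, 2, 4, 8, 16, then return n >> 1."""
    for shift in (1, 2, 4, 8, 16):
        n |= n >> shift
    return n >> 1
```

The smear sets every bit below the highest set bit, so the result is always of the form 2^k - 1. Any input from 128 to 255 gives 127.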

### ML :

10 questions, all MCQs, each +10/-3. Duration was 30 minutes.

- X1 ~ N(0,1), X2 ~ N(2, 4), what is KL (X1||X2)
- Let w be an unbiased estimator for theta parameter in Unif(0,theta). Given n samples, what is an unbiased estimator for variance of w
- X ~ N(0,1). Let p denote cdf of x. Define y is random variable, y = s(log(p) – log(1-p)). What is cdf of y? Ans: 1/(1+e^(-y/s))
- Given n samples of a vector (X,Y), what is an unbiased estimator for cov(X,Y). The options were in terms of H, defined as the covariance of the sample.
- 3 people Alice, Bob and Charlie. Alice can shoot with probability 0.2, Bob with 0.5 and Charlie with 1. What is the probability of Bob surviving if they all were shooting in cyclic order. Ans: 13/30
- What kind of normalisation(mean, min-max) is applied before cosine similarity of word vectors. Ans – nothing, as it would lead to information loss (tentative answer)
- In time series, which method is used for testing? Window method, Shuffling method or k-fold cross validation. Ans – should be window I think, because everything else will destroy the sequential nature of the data
- Another question on cosine similarity: given an MxN matrix, does the similarity lie in [0,1] or [0,1), depending on whether the matrix has rank N with M>N?
- Given that there are two coins of bias p and q, you define “event” as choosing a random coin among the two, and then tossing them thrice. Given outcomes as {HTT} and {TTH}, do expectation maximization once to find values of p, q. Start with p = 0.4, q = 0.8
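The KL and EM questions can be worked through numerically. This is a sketch under two assumptions: the second parameter in N(2, 4) is a variance (not a standard deviation), and the EM question uses equal prior weight 1/2 for each coin. Function names are hypothetical.

```python
import math


def kl_gaussians(mu1, var1, mu2, var2):
    """KL( N(mu1, var1) || N(mu2, var2) ) for univariate Gaussians."""
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1)


def em_step(sequences, p, q):
    """One EM iteration for two coins with head probabilities p and q."""
    heads_p = tosses_p = heads_q = tosses_q = 0.0
    for seq in sequences:
        heads = seq.count("H")
        tails = len(seq) - heads
        like_p = p ** heads * (1 - p) ** tails
        like_q = q ** heads * (1 - q) ** tails
        # E-step: posterior that this sequence came from coin p
        # (the equal 1/2 priors cancel out).
        resp_p = like_p / (like_p + like_q)
        resp_q = 1.0 - resp_p
        heads_p += resp_p * heads
        tosses_p += resp_p * len(seq)
        heads_q += resp_q * heads
        tosses_q += resp_q * len(seq)
    # M-step: weighted fraction of heads for each coin.
    return heads_p / tosses_p, heads_q / tosses_q


kl = kl_gaussians(0.0, 1.0, 2.0, 4.0)  # ln 2 + 1/8, about 0.818
new_p, new_q = em_step(["HTT", "TTH"], 0.4, 0.8)
```

Since both observed sequences have one head in three tosses, the weighted head fraction is 1/3 for both coins after a single step, whatever the responsibilities are.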

### Quant :

10 questions. 9 MCQs and one numerical type, all were +10, the MCQs had -3, numerical had no negatives.

- Straws weigh a random amount in unif(0,1). A camel can take a total weight of 1 before its back breaks. What is the expected weight of the last straw that breaks the camel’s back? Ans : 2 - e/2 = 0.64
- What is the expected number of straws needed to break the camel’s back (counting the straw that breaks it)? Ans : e
- Geometry question, obtuse triangle ABC was given (B being the obtuse angle). D was midpoint of BC. Angle ADB = 45, Angle ACB = 30. Find tan B Ans: -2-sqrt(3)
- Matrix was given, entries in first row were cos 1, cos 2, cos 3, … and so on, for n^2 entries. What is the limit as n tends to infinity of the determinant. Ans: 0 (https://math.stackexchange.com/questions/1003453/a-limit-determinant-question)
- M, N are drawn from unif(1,100) integers. What is the probability that 7^m + 7^n is divisible by 5. Ans : 0.25
- What is the probability that the first toss was heads given that r heads were observed in n tosses of a fair coin? Ans : r/n
- A and B play a game with each other. P(A wins) = 2/3. The loser of each round gives the winner 1$. What is the expected number of rounds they will play if A starts with 1$ and B starts with 2$. Ans : 15/7
- Another determinant simplification problem, you had to do basic R1 -> R1 – R2 type operations and extract common elements.
- What is the minimum size of a set of integers that guarantees some non-empty subset has a sum divisible by 11? Ans : 11 https://math.stackexchange.com/questions/1939620/prove-that-there-is-at-least-one-subset-of-11-numbers-whose-sum-is-divisible-by
- x^2 + 2bx +c = 0 – what is the probability that this has real roots, given that b, c are drawn uniformly randomly from [-1, 1]. (Real distribution) Ans – 2/3
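A couple of the answers above can be checked exactly with rational arithmetic. The sketch below enumerates the 7^m + 7^n question directly and solves the two-state gambler's ruin recurrence by substitution. The function names are made up, and the ruin formula is specific to a total of 3 dollars with A starting at 1.

```python
from fractions import Fraction


def prob_sum_divisible_by_5(limit: int = 100) -> Fraction:
    """P(7^m + 7^n is divisible by 5) for m, n uniform on 1..limit."""
    hits = sum(
        1
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
        if (pow(7, m, 5) + pow(7, n, 5)) % 5 == 0
    )
    return Fraction(hits, limit * limit)


def expected_rounds(p_a: Fraction) -> Fraction:
    """Expected game length when A holds 1 dollar out of a total of 3.

    With E1, E2 the expected rounds from A holding 1 or 2 dollars:
        E1 = 1 + p * E2
        E2 = 1 + (1 - p) * E1
    Substituting the second into the first gives the closed form below.
    """
    return (1 + p_a) / (1 - p_a * (1 - p_a))


print(prob_sum_divisible_by_5())        # 1/4
print(expected_rounds(Fraction(2, 3)))  # 15/7
```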

Shortlists are based both on your resume as well as test scores. Depending on how well you did in each section, different teams will shortlist you for interviews. For me, I had attempted 5 out of 5 CS MCQs (and gotten one wrong), and 9 out of 10 quant questions (all 9 correct). I got a few test cases right on the first coding question, which may have given me some partial marks. ML section was totally garbage. This was enough to get me a shortlist in both the teams that look for coding skills (GS Technology) and teams that look for quant skills.

**Interviews :**

As mentioned, depending on how many teams look at you, you may have multiple shortlists. I had 2 coding interviews (by the Technology team) and 4 quant interviews. I believe of each kind, the first is a screening test, and the second is a selection test. My interview order was Qp/Q1/Q2/Tp/T1/Q3, where Q/T indicate Quant/Tech, and p indicates the prelims round whereas the number indicates a particular team-specific round. At each interview, try to catch the name of the division the interviewers work in, and try to ask them about the work there. If your interviews are going moderately well, you may be given a shot at multiple teams, and it is a good way to figure out what team interests you the most, and try and make that clear during the interviews.

### Technology team :

Interview one :

- Tell me about yourself
- Given a linked list, how will you find a loop in it (I told him I had seen the problem before, so I just gave a brief outline of the solution and we moved on to the next one)
- Given an array of integers, find the minimum missing natural number. Had to eventually give an O(n) time, O(1) space solution, starting from the naive O(nlogn) solution
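Both coding questions have well-known textbook solutions. The sketch below uses Floyd's slow/fast pointer check for the loop and an in-place "put value v at index v-1" pass for the missing number. It assumes natural numbers start at 1 and that the input array may be modified. `Node` and the function names are illustrative, not from the interview.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def has_loop(head):
    """Floyd's cycle detection: the fast pointer laps the slow one in a loop."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


def first_missing_natural(nums):
    """Smallest missing number from 1, 2, 3, ... in O(n) time, O(1) extra space."""
    n = len(nums)
    for i in range(n):
        # Keep swapping nums[i] into its home slot until it is out of
        # range or its home slot already holds the right value.
        while 1 <= nums[i] <= n and nums[nums[i] - 1] != nums[i]:
            target = nums[i] - 1
            nums[i], nums[target] = nums[target], nums[i]
    for i in range(n):
        if nums[i] != i + 1:
            return i + 1
    return n + 1
```

Each swap places at least one value in its final position, so the nested loop still does O(n) work in total.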

The rest of the interview was a brief overview of the team, what they did. I asked some questions about the kind of scale they faced. The interviewer talked about a project he was working on, which was ensuring synchronisation on a global scale in milliseconds. This seemed pretty neat to me, but I had explained already that I had no systems exposure, and I would prefer a quant team over a tech team.

Interview two :

- Given a matrix, give an algorithm to rotate the matrix in place. (Responded with doing vertical + horizontal mirroring)
- Given a set of strings, find the longest common prefix of all of them. (Responded with sorting all the strings lexicographically + comparing the first and last string. Another approach I mentioned was to take any two strings randomly, check their LCP (let this be A), and continue checking with each remaining string, setting A = LCP(A, string_i))
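Here is a rough sketch of both answers. Note that mirroring both vertically and horizontally yields a 180-degree rotation; a 90-degree rotation is usually done as a transpose followed by reversing each row. Which rotation the interviewer wanted isn't recorded above, so both are shown. For the prefix question, `min` and `max` stand in for "sort and compare the first and last string" without the full sort. All names are made up.

```python
def rotate_180_in_place(matrix):
    """Vertical mirror followed by horizontal mirror."""
    matrix.reverse()
    for row in matrix:
        row.reverse()


def rotate_90_clockwise_in_place(matrix):
    """Transpose, then reverse each row (square matrices only)."""
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
    for row in matrix:
        row.reverse()


def longest_common_prefix(strings):
    """LCP of all strings equals the LCP of the lexicographic min and max."""
    if not strings:
        return ""
    lo, hi = min(strings), max(strings)
    i = 0
    while i < len(lo) and i < len(hi) and lo[i] == hi[i]:
        i += 1
    return lo[:i]
```

The pairwise approach mentioned above (fold `A = LCP(A, string_i)` over the list) gives the same answer and also avoids sorting.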

In both interviews, the interviewer will talk about their teams. This is a good opportunity to ask specific questions. As I didn’t have any background in databases or networks, I couldn’t really appreciate the scale or the impact their team was having, so I just asked generic questions about work etc.

### Quant teams :

Interview one :

Basic probability puzzles. I got them to skip all the puzzles I had come across, so it wasn’t really a big deal.

- Explain what PCA is. Give an example, will it always report a linear combination?
- Josephus problem (I didn’t know of this, question was asked for 100, I misheard and did it for n and then made some small errors when calculating for 100)
- Given an array of random numbers, what is the expected number of local minima (I initially didn’t check the end points, and said n/4, then corrected it immediately)
- Given a polynomial x^2 + ax + b, what is the probability that it has real roots (a, b are drawn from unif[0,1]) (I mentioned that it was already asked in the test, so they asked me to solve it and show what I had done)
- Given that a stick is broken into 3 parts, what is the expected length of the maximum part? (This was straightforward. Find the cdf of X = max(X1, X2, X3) and integrate)
- Given that a stick is broken into 3 parts, what is the expected length of the minimum part? (This I got stuck on, especially when asked to generalise to n breaks) Later on asked to comment on order, show how you could solve this using different methods
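For reference, the stick and local-minima questions can be checked with exact arithmetic. This sketch assumes the stick has unit length and is cut at independent uniform points, and that the array entries are i.i.d. continuous values (so every ordering is equally likely) with an endpoint counting as a local minimum when it is below its single neighbour. The closed form used for the ordered pieces is the standard result for uniform spacings. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def expected_ordered_pieces(pieces: int):
    """Expected lengths of the pieces, smallest first, for a unit stick.

    E[k-th smallest of n pieces] = (1/n) * sum_{i=0}^{k-1} 1/(n - i).
    """
    n = pieces
    result = []
    running = Fraction(0)
    for i in range(n):
        running += Fraction(1, n - i)
        result.append(running / n)
    return result


def expected_local_minima(n: int) -> Fraction:
    """Average number of local minima over all orderings of n distinct values."""
    total = 0
    count = 0
    for perm in permutations(range(n)):
        count += 1
        for i in range(n):
            below_left = i == 0 or perm[i] < perm[i - 1]
            below_right = i == n - 1 or perm[i] < perm[i + 1]
            if below_left and below_right:
                total += 1
    return Fraction(total, count)


print(expected_ordered_pieces(3))  # smallest 1/9, middle 5/18, largest 11/18
print(expected_local_minima(5))    # 2, matching (n + 1) / 3
```

The general case falls out of the same formula: with n pieces the smallest has expected length 1/n^2 and the largest H_n/n, where H_n is the n-th harmonic number.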

Interview two :

Another round of probability puzzles. Again, mostly easy stuff. I don’t remember any specific puzzle from this, because this happened in between other interviews and I was pretty much on auto-pilot and have no clue what happened. Just make sure you approach the questions from the basics, and talk to the interviewer all the time.

- Chameleon puzzle : http://www.cut-the-knot.org/blue/Chameleons.shtml (I noted down the differences between the colors at the start and after each exchange, and said something. With some hints, I was able to reduce it to the divisibility by 3 form, and they were happy with it)
- Given two coin toss sequences : HTH, HTT, which will take fewer tosses to observe in expectation? (I initially had no clue what to do with this, but after some talking, figured out that for HTH a failure at the 3rd toss sends you back to the start, whereas for HTT a failure at the 3rd toss is an H, so you can resume with the first H already in place. Hence the second would take fewer tosses in expectation)
- What would you say is your strength, and give an example of that.
- What is one example of what your weaknesses are, and how did that affect you?
- Would you say you’re an individual contributor more or a team player (and mentioned there was no right answer, like what!)
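The HTH versus HTT comparison can be made exact with the overlap formula for fair-coin patterns (often attributed to Conway). This is a different route from the restart argument given above, not what was done in the interview. For every length k where the pattern's first k tosses equal its last k tosses, add 2^k. The function name is made up.

```python
def expected_tosses(pattern: str) -> int:
    """Expected number of fair-coin tosses until `pattern` first appears."""
    return sum(
        2 ** k
        for k in range(1, len(pattern) + 1)
        if pattern[:k] == pattern[-k:]
    )


print(expected_tosses("HTH"))  # 10: overlaps at k = 1 ("H") and k = 3
print(expected_tosses("HTT"))  # 8: only the full-length overlap at k = 3
```

The extra 2 for HTH comes from the pattern overlapping itself on the trailing H.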

Interview three :

This was more of a discussion session. I was asked one puzzle, and the rest was a discussion about the work at the interviewer’s team.

- Given that a robot starts on the real line at some integer X, and moves with some integer speed Y, and you are allowed to query any integral point on the real line every second. How will you find where the robot is, provably in finite steps? I first asked if I could start with X = 0. In this case, it is easy to figure out that you have to query i^2 at every second i, and you can always find the robot. This is essentially just checking all integers in sequence. Now the extension to X, Y is immediate, you just need to check tuples of (X, Y) and this essentially reduces to checking all rational numbers in sequence (or all points in integer positive grid). Again, since both sequences are countable, you can find the robot in finite amount of time.
- Given a bunch of yield values, and fertiliser used values, try to model how the yield would look given an amount of fertiliser. This took some prodding and talking out loud, eventually settled for modelling yield and fertiliser as jointly Gaussian random variables(Y, F), with unknown co-variance. He then asked how I would estimate the profit(measured by P*Y – C*F) made, over this distribution (straightforward integration). He then asked me how to find expectation of the profit, given that the profit is greater than some known quantity (integrate with proper limits). We then had some discussion about the integration in this case (he had misread my handwriting and notation for something else, and thought I was doing something wrong). He then asked how I would compute the expectation given samples from the distributions (compute the quantity and average it). Again some discussion followed on how to do this, why this would be accurate, etc.
- He then asked me how I could generate these two jointly distributed Gaussian random variables, when given access to only a generation function that generates from Unif[0,1]. Here, I asked to break it into smaller steps. The first step I explained was generating two correlated random variables from two uncorrelated Gaussian random variables (Do y = a + p*b, where a and b are standard Gaussians and p is your required covariance). I was used to speaking about covariance, so made the mistake of not normalizing this properly. He pointed this error out, and I quickly corrected for correlation and not covariance. He then asked how I would generate this Gaussian from the uniform, to which I replied that one could use the inverse-cdf method. Some more questions followed, and it seemed like he was suitably happy with my performance.
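Two of the ideas above lend themselves to a short sketch: enumerating all (X, Y) pairs to catch the robot, and building a correlated Gaussian pair from uniforms via the inverse cdf. Assumptions: the robot's position at second t is X + Y*t with t starting at 0, and both Gaussians are standard, so the second variable is `rho*a + sqrt(1 - rho^2)*b` (the normalised version of `a + p*b`). All names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def integer_pairs():
    """Enumerate every (x, y) in Z^2 exactly once, in growing square rings."""
    yield (0, 0)
    r = 1
    while True:
        for x in range(-r, r + 1):
            for y in range(-r, r + 1):
                if max(abs(x), abs(y)) == r:
                    yield (x, y)
        r += 1


def seconds_to_find(start: int, speed: int) -> int:
    """Query x + y*t at second t for the t-th guessed pair (x, y)."""
    for t, (x, y) in enumerate(integer_pairs()):
        if x + y * t == start + speed * t:
            return t


def standard_normal(rng) -> float:
    """Inverse-cdf sampling from a Unif[0,1) source."""
    u = rng.random()
    while u == 0.0:  # inv_cdf is undefined at exactly 0
        u = rng.random()
    return _STD_NORMAL.inv_cdf(u)


def correlated_pair(rho: float, rng):
    """Two standard Gaussians with correlation rho."""
    a = standard_normal(rng)
    b = standard_normal(rng)
    return a, rho * a + math.sqrt(1.0 - rho * rho) * b
```

Here `rng` is any object with a `random()` method, for example `random.Random(0)`. The search terminates because the true pair shows up at some finite position in the enumeration, and a query may also hit the robot earlier by coincidence.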

He then explained the work of his team, and what sort of models they used. This was a very interesting conversation for me, I did not realise the scale of operations of this particular team, and I managed to ask a few questions about it because it was genuinely intriguing.

Interview four :

This was a puzzle round, coupled with some generic questions about what my weaknesses and strengths were, along with discussion on the team’s work. Some questions that I recall :

- Let us take 100. You break it into two parts, say 50 and 50. Let A1 = 50*50. Now break each 50 into two more parts, say 25, 25 and 25, 25. A2 = 25*25, A3 = 25*25. You basically partition the number and construct a term by multiplying the two parts of the partition. This way you can build a binary tree. Now add up all the terms you have thus defined (all the products). Prove / disprove that the sum is constant irrespective of the partitioning you choose at every node. (I was initially stumped by this, so I tried using n = 4, n = 5 instead of 100. I then claimed it would work irrespective of the value, so tried doing it for general N and failed miserably. Then he asked how I would go about proving such a thing, I said it looked like induction. So now you can basically induct on each child partition of the binary tree formed. I first said that the sum should be n*(n-1)/2, then using induction was able to prove it, and thus find the value)
- Find the k smallest elements in an array (Started with naive O(nlogn) solution, then said it could be improved to O(nlogk). He asked if any better was possible, I said no because that might imply sorting could be faster than O(nlogn). Not sure if that was right, but he seemed happy)
- Asked if I was familiar with ML and statistics (big mistake, ML != statistical inference). Asked why we compute standard deviation by dividing with n-1 instead of n. (Didn’t know the answer to it, so made up something about how the mean itself is not true mean, so we should probably not estimate it exactly. Needed a bunch of prompts for this one, finally the answer turned out to be along these lines)
- (Toughest question I had in the day) Given three uniformly randomly chosen points in the interior of a circle, what is the probability that the triangle formed contains the center of the circle? (I first doodled for a bit, drawing a line between two random points and figuring out where the third could lie to be relevant. This led me to join each point to the center and extend each of these lines to the circle, so now you would get a sector of angle theta, where theta is the angle between those two points. I said that this was the relevant region, and we’d need to integrate it. They asked how many parameters I needed to integrate, I said radius of both points, and relative angle between them. They then asked if radius mattered, turns out it didn’t really. So I said the answer would just be the average value of theta, then I made a mistake in saying that the theta would range from 0 – 2pi, corrected to 0 – pi. This came out to be 1/4, and the interviewers were happy with my solution and approach, having avoided integration completely)
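The first two puzzles in this round are easy to check in code. The sketch assumes the splitting continues until every leaf is 1, in which case the induction gives a total of n*(n-1)/2, since a*b + a(a-1)/2 + b(b-1)/2 = (a+b)(a+b-1)/2. The k-smallest function is the O(nlogk) heap approach. Function names are made up.

```python
import heapq
import random


def split_product_sum(n: int, rng) -> int:
    """Split n at random down to 1s, summing the product at every split."""
    if n <= 1:
        return 0
    a = rng.randint(1, n - 1)
    b = n - a
    return a * b + split_product_sum(a, rng) + split_product_sum(b, rng)


def k_smallest(values, k: int):
    """k smallest elements using a max-heap of size k (stored negated)."""
    heap = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, -v)
        elif k > 0 and v < -heap[0]:
            heapq.heapreplace(heap, -v)
    return sorted(-x for x in heap)


# Different random splittings of 100 all give 100 * 99 / 2 = 4950.
totals = {split_product_sum(100, random.Random(seed)) for seed in range(20)}
print(totals)  # {4950}
```

On the "can we do better than O(nlogk)" question: if the k elements don't have to come out sorted, selection algorithms such as quickselect find them in expected linear time. That doesn't contradict the sorting lower bound, because the selected elements are left unordered.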

In all the quant interviews (barring the third), I was also asked a lot of other puzzles. Most of these were things I had seen earlier (tiling with squares with shaded triangles, breaking a stick and forming a triangle, ants colliding on a triangle etc). For all of these, I asked them to change to a different question. I initially thought this would reflect badly on me as a candidate, but I’m not too sure now. I think it is good if you tell them beforehand that you have seen a puzzle. In the worst case, they will ask you to show them the solution; in the best case, you will get a different puzzle to solve. Always try and clarify all aspects of the question with your interviewer, and never hesitate to think out loud or give partial solutions. They are more interested in how you get from a simple, naive, inefficient solution to an optimised one.