Cloudera Inc. Interview Experience | Software Engineer (Internship + Full Time/On Campus)

About the company

Cloudera Inc. is a US-based Big Data solutions company (founded in 2008) that provides enterprise software running on top of the open source Apache Hadoop framework, and streamlines information retrieval and data mining with integrated ML and AI frameworks. It is the leading provider of enterprise distributions based on Apache Hadoop and recently merged with Hortonworks, which was its biggest competitor before the merger.

Work location: Bangalore (India)
Interview location: Manipal Institute of Technology, Manipal (India)

Summary:

There were 5 rounds in total: a computer-based online test, followed by 4 one-on-one technical, managerial and HR interviews for the shortlisted students. I cleared all the rounds and accepted the offer, and here is my interview experience. Loads of luck to the reader for their upcoming placements, as luck does play a small but vital role in placement interviews.

Round 1: Online Test

Unlike a lot of other companies, the online test for Cloudera Inc. did not feature a single multiple choice question, only five coding questions divided into two sections. Three of those questions were compulsory and belonged to section 1, while section 2 offered a choice of any one of the remaining two questions, which were classified as advanced. The test was hosted on HackerRank and lasted 120 minutes. Eligibility was subject to CGPA and branch criteria, and the test was open only to 7th semester students.

Section 1: (All mandatory)

The first question was related to data structures and algorithms. You are given an array of strings sorted by time of arrival, where every string is a request for a username. Your job is to give out a unique username for every request, processing the requests in the given order. If the requested username has already been assigned, you have to give out another username by concatenating the requested username with the smallest positive integer that produces a unique username. If the requested username has not been assigned before, you assign it as it is. Solution here: Program for assigning usernames using Trie

Example:
input: {a1, a, a, a1, b, b}
output: {a1, a, a2, a11, b, b1}

I implemented it using the Trie data structure, but there is an easier implementation using STL maps, which saves you from writing a ton of code.
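The map-based approach mentioned above can be sketched in a few lines of Python. This is only a minimal sketch, not the reference solution: the function name `assign_usernames` and the helper dictionary `next_suffix` are my own, and it assumes the suffix is appended directly to the requested string, as in the example.

```python
def assign_usernames(requests):
    """Return the username granted for each request, in arrival order."""
    taken = set()       # every username handed out so far
    next_suffix = {}    # requested name -> smallest suffix worth trying next
    granted = []
    for name in requests:
        if name not in taken:
            result = name
        else:
            # Names are never released, so suffixes below k stay taken
            # and the search can resume where it stopped last time.
            k = next_suffix.get(name, 1)
            while name + str(k) in taken:
                k += 1
            result = name + str(k)
            next_suffix[name] = k + 1
        taken.add(result)
        granted.append(result)
    return granted


print(assign_usernames(["a1", "a", "a", "a1", "b", "b"]))
# prints ['a1', 'a', 'a2', 'a11', 'b', 'b1']
```

Remembering the last suffix per name avoids rescanning from 1 on every repeated request.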

The second question was related to DBMS and SQL. Given the schema of a simple relational database, you had to query some tuples based on the mentioned condition. The question was pretty easy, the only catch being the need for an ORDER BY clause. The 1 visible test case passed without ORDER BY, while several hidden test cases required the correctly ordered output. I would highly recommend the GeeksForGeeks MySQL thread for preparation and revision of SQL related problems.

The third question was related to string manipulation and file handling. Given a text file containing HTTP requests as formatted strings (1 request per line) on a Linux system, find the number of requests that send/receive more than a given number of bytes and the total number of bytes exchanged. Write both results to an output file on separate lines. Although I used C to solve the problem, I would highly recommend Python if it is allowed, which it was.
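To show why Python is convenient here, below is a rough sketch of the log-processing task. The exact line format from the test is not reproduced in this article, so the sketch rests on two assumptions: the byte count is the last whitespace-separated field of each line (as in the Common Log Format, where "-" means no body), and the total is summed over the requests that exceed the threshold. The function names are hypothetical.

```python
def summarize_large_requests(lines, threshold):
    """Count requests larger than `threshold` bytes and sum their sizes."""
    count = 0
    total = 0
    for line in lines:
        fields = line.split()
        # ASSUMPTION: the byte count is the last field; skip blank lines
        # and lines whose last field is "-" or otherwise not a number.
        if not fields or not fields[-1].isdigit():
            continue
        size = int(fields[-1])
        if size > threshold:
            count += 1
            total += size
    return count, total


def write_summary(in_path, out_path, threshold):
    """Read the log at `in_path` and write both results on separate lines."""
    with open(in_path) as src:
        count, total = summarize_large_requests(src, threshold)
    with open(out_path, "w") as dst:
        dst.write(f"{count}\n{total}\n")
```

If the problem instead wants the total over all requests, only the position of the `total += size` line changes.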

Section 2: (Select any one)

The fourth question was a problem from the game theory paradigm, the famous Game of Nim problem. Solution here: Combinatorial Game Theory | Set 2 (Game of Nim)
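For reference, the standard result for Nim is that the player to move wins exactly when the XOR of all pile sizes is non-zero. The sketch below assumes the normal-play variant (the player who takes the last object wins), since the exact variant asked in the test is not stated here. The function name is my own.

```python
from functools import reduce
from operator import xor


def first_player_wins(piles):
    """True if the player to move wins normal-play Nim with optimal play."""
    # The nim-sum is the XOR of all pile sizes. A zero nim-sum is a
    # losing position for the player about to move.
    return reduce(xor, piles, 0) != 0


print(first_player_wins([3, 4, 5]))  # prints True  (3 ^ 4 ^ 5 == 2)
print(first_player_wins([1, 2, 3]))  # prints False (1 ^ 2 ^ 3 == 0)
```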

The fifth question was related to data structures and algorithms. There is a 2D grid of iron bars in a jail separated by a unit distance each, with M horizontal and N vertical rods mounted into the walls, ceiling and floor that surrounds them. The 1st and last iron bars in both dimensions are at unit distance from the walls, ceiling and floor.
A grid with N = 8 vertical bars and M = 4 horizontal bars. The asterisks (*) represent the walls, the commas (, ) represent the ceiling and the backticks/backquotes (`) represent the floor.

,,,,,,,,,,,,,,,,,,,
*_|_|_|_|_|_|_|_|_*
*_|_|_|_|_|_|_|_|_*
*_|_|_|_|_|_|_|_|_*
*_|_|_|_|_|_|_|_|_*
* | | | | | | | | *
```````````````````

Everything is separated by unit distances and the square cells are of unit area. You are given such a grid with some of the iron bars missing in both dimensions. Find the maximum area of a rectangle with no iron bar in it. For example, if all iron bars are removed, the maximum area would be (M + 1) * (N + 1). The input contains two arrays with the indices of the bars that remain in the grid. The idea is to sort the arrays in O(n*log(n)) time and find the maximum difference between consecutive elements of each array in O(n) time, treating the walls, ceiling and floor as bars at positions 0 and M + 1 (or N + 1) so that the gaps at the edges are counted too. Then multiply the two differences to get the maximum area.
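The sort-and-scan idea can be sketched as follows. This assumes bars are numbered from 1 to M (horizontal) and 1 to N (vertical), and that the surrounding walls, ceiling and floor act as fixed bars at positions 0 and M + 1 or N + 1. The function names are mine, not from the test.

```python
def largest_gap(bar_count, remaining):
    """Largest distance between neighbouring bars, boundaries included."""
    # ASSUMPTION: bars are indexed 1..bar_count; the enclosing surfaces
    # sit at positions 0 and bar_count + 1.
    positions = [0] + sorted(remaining) + [bar_count + 1]
    return max(b - a for a, b in zip(positions, positions[1:]))


def max_open_area(m, n, horizontal_remaining, vertical_remaining):
    """Area of the largest bar-free rectangle in the grid."""
    height = largest_gap(m, horizontal_remaining)  # gap between horizontal bars
    width = largest_gap(n, vertical_remaining)     # gap between vertical bars
    return height * width


print(max_open_area(4, 8, [], []))         # prints 45, i.e. (4 + 1) * (8 + 1)
print(max_open_area(4, 8, [2], [3, 7]))    # prints 12
```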

My overall experience of the online round was positive. It was a fairly easy test, meant to check basic coding ability rather than advanced skills and to filter candidates for the interviews. It covered the areas needed to work on a Hadoop cluster: programming, database knowledge and file handling, all at a very basic level. I cleared the round and made it to the interviews with 34 others, an 11.5% shortlist percentage.

Round 2: Technical Round

The first interview round was taken by a single guy from their engineering department and lasted for about 45 minutes. It started with two puzzles followed by some data structures and networking questions.

Puzzle 1: Puzzle 12 | (Maximize probability of White Ball)

Puzzle 2: Puzzle 4 | (Pay an employee using a 7 units gold rod?)

What happens when you type in a URL on your browser? It was a very open ended question and I answered it around the TCP/IP model, DNS servers and routing. Other possible answers could revolve around data link layer protocols or on the client server architecture. One of the possible answers here: what happens when you type a URL?
As one would expect, a long discussion followed this question, in which he subtly checked all my Computer Networks knowledge. In the end he wasn’t entirely convinced by my answer, although he admitted that there is no single right answer.

The fourth question was a fairly simple one. Given a binary tree, determine if it is a binary search tree. I gave the simple space-optimized in-order traversal solution, which marked the end of round 2. Solution here: A program to check if a binary tree is BST or not
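A minimal sketch of the in-order approach: instead of storing the whole traversal, keep only the previously visited key and check that every key is larger than it. The `Node` class is a hypothetical helper, and the sketch assumes keys must be strictly increasing (no duplicates).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(root):
    """Check the BST property with an iterative in-order traversal."""
    stack = []
    prev = None          # key visited just before the current one
    node = root
    while stack or node is not None:
        # Walk as far left as possible, remembering the path.
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        if prev is not None and node.key <= prev:
            return False  # in-order sequence is not strictly increasing
        prev = node.key
        node = node.right
    return True


valid = Node(4, Node(2, Node(1), Node(3)), Node(6))
invalid = Node(4, Node(2, Node(1), Node(5)), Node(6))
print(is_bst(valid), is_bst(invalid))  # prints True False
```

Extra space is proportional to the tree height (the stack), not to the number of nodes.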

14 out of 35 people cleared this round.

Round 3: Technical Round

The second interview started only about five minutes after the first. It was also taken by a single guy from their engineering department and lasted for about 45 minutes. The first half was a CV based discussion where I was asked a great deal about my projects, and especially about my open source contributions, as one would expect. The second half was all about databases.

The second half of the interview was structured as daisy-chained DBMS questions, where the answer to one gave rise to the next.

The first question was to represent inheritance in a relational database. I gave the well-known Living Beings -> Animals, Plants -> Mammals, Amphibians -> Dogs, Humans example, and wrote down the schema for the same. Here is some reference: Enhanced ER Model
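One common way to map such a hierarchy to tables is one table per class, with each subclass table reusing the parent's primary key as both its own primary key and a foreign key. The sketch below uses Python's built-in sqlite3 module. It is not the schema I wrote in the interview: the table and column names (`legs`, `gestation_days`) are invented for illustration, and only one branch of the hierarchy is shown.

```python
import sqlite3

# ASSUMPTION: hypothetical tables and columns, one table per class.
SCHEMA = """
CREATE TABLE living_being (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE animal (
    id   INTEGER PRIMARY KEY REFERENCES living_being(id),
    legs INTEGER NOT NULL
);
CREATE TABLE mammal (
    id             INTEGER PRIMARY KEY REFERENCES animal(id),
    gestation_days INTEGER NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one row per level."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO living_being VALUES (1, 'dog')")
    conn.execute("INSERT INTO animal VALUES (1, 4)")
    conn.execute("INSERT INTO mammal VALUES (1, 63)")
    conn.commit()
    return conn


def load_mammal(conn, being_id):
    """Rebuild the full 'object' by joining up the hierarchy."""
    return conn.execute(
        "SELECT lb.name, a.legs, m.gestation_days "
        "FROM mammal m "
        "JOIN animal a ON a.id = m.id "
        "JOIN living_being lb ON lb.id = a.id "
        "WHERE m.id = ?",
        (being_id,),
    ).fetchone()
```

The trade-off is that reading a leaf entity needs one join per level. A single wide table with a type column avoids the joins at the cost of many NULL columns.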

The second question was related to database security and ACID properties. He also took the opportunity to ask me to write a C++/Java class modelling security and atomicity in a bank database, through which he tested my OOP skills as well. This question stretched into a long discussion: he kept finding loopholes in my solution and I kept removing them iteratively, until it was fairly correct.
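To give a flavour of that discussion, here is a small in-memory sketch (in Python rather than C++/Java) of a class that hides account state and makes a transfer all-or-nothing. It is not the class I wrote in the interview. The names `Bank` and `InsufficientFunds` are hypothetical, and a real system would rely on database transactions and proper authentication rather than a single lock.

```python
import threading


class InsufficientFunds(Exception):
    """Raised when a transfer would overdraw the source account."""


class Bank:
    def __init__(self):
        self._balances = {}             # private: reachable only via methods
        self._lock = threading.Lock()   # serialises all updates

    def open_account(self, owner, balance=0):
        with self._lock:
            if owner in self._balances or balance < 0:
                raise ValueError("invalid account or opening balance")
            self._balances[owner] = balance

    def balance(self, owner):
        with self._lock:
            return self._balances[owner]

    def transfer(self, src, dst, amount):
        """Move `amount` from src to dst, or change nothing at all."""
        with self._lock:
            # Validate everything before mutating, so a failure can never
            # leave the debit applied without the matching credit.
            if amount <= 0:
                raise ValueError("amount must be positive")
            if src not in self._balances or dst not in self._balances:
                raise KeyError("unknown account")
            if self._balances[src] < amount:
                raise InsufficientFunds(src)
            self._balances[src] -= amount
            self._balances[dst] += amount
```

Usage: after `bank.open_account("alice", 100)` and `bank.open_account("bob", 20)`, the call `bank.transfer("alice", "bob", 70)` leaves balances of 30 and 90, while a further transfer of 50 raises `InsufficientFunds` and leaves both balances untouched.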

The third question was to optimize storage and searching in a database, where I straight up mentioned B+ Trees, which was a big mistake. He asked me to write pseudocode for deletion from B+ Trees, which I could not write. But instead of wasting time on trying, I drew an example B+ Tree and showed the process of deleting a node on the diagram, which convinced him. Here is some reference: Introduction of B+ Tree
He then asked me how distributed clusters help in reducing this problem even further.

There were a few other deviations from the mainline DBMS from where he asked questions throughout the interview, but nothing was unanswerable or difficult and the interviewer was very helpful and cool. The most important takeaway from this round was to be able to justify your CV very thoroughly.

A few more questions on my projects and some interview feedback marked the end of round 3.

10 out of 14 people cleared this round.

Round 4: Techno-Managerial Round

The third interview was taken by a single guy, a project manager from their management department, and lasted for about 35 minutes. Despite being the shortest interview, it seemed to pack in the most content, and for fair reasons. This interview had CV based questions about my open source projects, core OS questions, System Design and Managerial questions.

For the first 5 minutes, he talked to me about my previous interview experiences, my general college experience and my future plans and dreams. Pretty trivial stuff, much expected.

Then he started asking OS questions. He asked me about thrashing and paging by asking example questions. He then asked me to explain terms like race condition, mutual exclusion and deadlock, and also asked me to write code for Peterson’s Solution to the critical section problem.
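For reference, Peterson’s Solution for two threads can be sketched as below. This is a teaching sketch in Python, not what I wrote on paper: it relies on CPython’s global interpreter lock to make the plain reads and writes behave as if sequentially consistent. On real hardware, or in C/C++, the algorithm needs atomic variables or memory fences to be correct. The class and function names are my own.

```python
import threading
import time


class PetersonLock:
    """Mutual exclusion for exactly two threads, with ids 0 and 1."""

    def __init__(self):
        self.flag = [False, False]  # flag[i]: thread i wants to enter
        self.turn = 0               # which thread must wait on a tie

    def acquire(self, me):
        other = 1 - me
        self.flag[me] = True        # announce interest
        self.turn = other           # give priority to the other thread
        while self.flag[other] and self.turn == other:
            time.sleep(0)           # busy-wait, yielding the interpreter

    def release(self, me):
        self.flag[me] = False


def run_demo(iterations=500):
    """Two threads increment a shared counter inside the critical section."""
    lock = PetersonLock()
    counter = [0]

    def worker(me):
        for _ in range(iterations):
            lock.acquire(me)
            counter[0] += 1         # critical section
            lock.release(me)

    threads = [threading.Thread(target=worker, args=(i,)) for i in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


print(run_demo())  # expected to print 1000 when mutual exclusion holds
```

Setting `turn` to the other thread before waiting is what prevents both threads from entering together: whoever writes `turn` last is the one that waits.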

He asked me to write code for the Readers Writers problem, and suggest design changes to prevent starvation of writers when there is an infinite stream of readers.
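One possible design change is to make the lock writer-preferring: as soon as a writer is waiting, newly arriving readers are held back until it has finished. A minimal sketch using a condition variable follows. The class name and fields are my own, and note that this variant can in turn starve readers under a continuous stream of writers.

```python
import threading


class WriterPreferringRWLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._active_readers = 0
        self._writer_active = False
        self._waiting_writers = 0

    def acquire_read(self):
        with self._cond:
            # New readers back off while any writer is active OR waiting,
            # so an endless stream of readers cannot starve a writer.
            while self._writer_active or self._waiting_writers > 0:
                self._cond.wait()
            self._active_readers += 1

    def release_read(self):
        with self._cond:
            self._active_readers -= 1
            if self._active_readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._waiting_writers += 1
            while self._writer_active or self._active_readers > 0:
                self._cond.wait()
            self._waiting_writers -= 1
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()
```

Readers already inside are allowed to finish, and only new arrivals queue behind the waiting writer. A fair variant would serve readers and writers in arrival order instead.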

He asked me some basic questions on job scheduling while further going through my CV, and finally asked me a few but detailed questions on Real Time Operating Systems since that was on my CV. We then talked a little bit about some informal stuff and one of my previous internships, which marked the end of round 4.

7 out of 10 people cleared this round.

Round 5: HR Round

The last round was taken by a single guy, who was from their HR department and it lasted for a short 15 minutes. He also asked me about my projects but this time I knew I had to provide answers in “the HR’s way” of things.

He asked me about my future plans, regarding higher studies and/or corporate roles and cleared some logistical doubts out of the way. He took some verbal confirmations from me regarding job location, job description and offer acceptance.

That was pretty much it, after which we patiently waited for the results, which were declared within the same hour.

3 out of 7 people cleared this round and were offered the job (Internship + Full Time). All three of us accepted it.

Conclusion:

The entire process was very neat, complete and sufficient in terms of time, content, difficulty and results. Unlike some students, I didn’t have to wait between my interviews and the process was very streamlined for me. I am looking forward to starting work from Jan 2020.
A few pieces of advice for the reader:

  • Needless to say, treat GFG as your Bible. Practice competitive coding on InterviewBit, HackerRank and CodeChef.
  • Do not panic at any point in the process. A lot depends on how smartly and fluently you can communicate. Remember, there’s no point in being talented if you cannot tell that to your interviewer.
  • Answer something, if not everything. Don’t try to either jump to the answer or just stay where you are. Instead, try to crawl your way through the solution slowly and steadily.
  • Some HR questions are staple questions with staple answers. Do not answer these emotionally, by which I mean there is no point in being “bold” and “frank” when all you have to do is say a simple yes or no.
  • Google the answers to those questions which you missed, in the time you get between interviews. If you get the chance to, ask for a brief interview feedback and try to close any open question in the interview room itself, by asking for directional hints (not solutions), which shows your interviewer that you are eager to solve a problem.
  • Dress properly and follow the code of conduct strictly. They are watching you!
  • Do not lie on your CV. Seriously, don’t. Ever.

Wish you all the luck in life!
