# Cloudera Inc. Interview Experience | Software Engineer (Internship + Full Time/On Campus)

Example:
```
input: {a1, a, a, a1, b, b}
output: {a1, a, a2, a11, b, b1}
```
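The full problem statement is not reproduced here, so the rule in the sketch below is inferred from the example alone: a name is kept if it has not been used yet, otherwise the smallest positive integer `k` is appended such that `name + k` is still unused. Under that assumption, a map-based approach might look like this in Python, with a `dict` and a `set` standing in for STL maps. The function name `make_unique` is hypothetical.

```python
def make_unique(names):
    """Make names unique under the rule inferred from the example (an assumption)."""
    used = set()       # every name handed out so far
    next_suffix = {}   # original name -> smallest suffix still worth trying
    result = []
    for name in names:
        if name not in used:
            candidate = name
        else:
            # Suffixes below next_suffix[name] were already taken earlier,
            # and names are never released, so the scan can resume there.
            k = next_suffix.get(name, 1)
            while f"{name}{k}" in used:
                k += 1
            candidate = f"{name}{k}"
            next_suffix[name] = k + 1
        used.add(candidate)
        result.append(candidate)
    return result


print(make_unique(["a1", "a", "a", "a1", "b", "b"]))
# prints ['a1', 'a', 'a2', 'a11', 'b', 'b1']
```

Remembering the next suffix per name avoids rescanning from 1 every time the same name repeats.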
I implemented it using the Trie data structure, but there is an easier implementation using STL maps, which saves you from writing a ton of code.

The second question was related to DBMS and SQL. Given the schema of a simple relational database, you had to query some tuples based on the stated condition. The question was pretty easy, the only catch being that the query needed an `ORDER BY` clause. The one visible test case passed without `ORDER BY`, while several hidden test cases required the exact query. I would highly recommend the GeeksForGeeks MySQL thread for preparation and revision of SQL-related problems.

The third question was related to string manipulation and file handling. Given a text file containing HTTP requests as formatted strings (one request per line) on a Linux system, find the number of requests that send/receive more than a given number of bytes and the total number of bytes exchanged. Write both results to an output file on separate lines. Although I used C to solve the problem, I would highly recommend Python if it is allowed, which it was.

Section 2: (Select any one)

The fourth question was a game theory problem, the famous Game of Nim. Solution here: Combinatorial Game Theory | Set 2 (Game of Nim)

The fifth question was related to data structures and algorithms. There is a 2D grid of iron bars in a jail, separated by a unit distance each, with M horizontal and N vertical rods mounted into the walls, ceiling and floor that surround them. The first and last iron bars in both dimensions are at unit distance from the walls, ceiling and floor. Below is a grid with `N = 8` vertical bars and `M = 4` horizontal bars. The asterisks (*) represent the walls, the commas (,) represent the ceiling and the backticks/backquotes (`) represent the floor.
~~~
,,,,,,,,,,,,,,,,,,,
*_|_|_|_|_|_|_|_|_*
*_|_|_|_|_|_|_|_|_*
*_|_|_|_|_|_|_|_|_*
*_|_|_|_|_|_|_|_|_*
* | | | | | | | | *
```````````````````
~~~
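For the task posed on this grid (stated in the text that follows: given the bars that remain, find the largest bar-free rectangle), here is a minimal Python sketch of the sort-and-largest-gap idea. It assumes bars are indexed `1..M` and `1..N`, with the surrounding walls, ceiling and floor acting as boundaries at `0` and `M + 1` / `N + 1`. That indexing is an assumption chosen to agree with the `(M + 1) * (N + 1)` area when every bar is removed. The function names are hypothetical.

```python
def largest_gap(remaining, total):
    """Largest distance between neighbouring obstacles along one dimension.

    remaining: indices (assumed 1..total) of the bars still in place.
    The boundaries at 0 and total + 1 are added as sentinels (an assumption).
    """
    positions = [0] + sorted(remaining) + [total + 1]
    return max(b - a for a, b in zip(positions, positions[1:]))


def max_free_area(m, n, horizontal, vertical):
    """Area of the largest rectangle containing no iron bar."""
    return largest_gap(horizontal, m) * largest_gap(vertical, n)


print(max_free_area(4, 8, [], []))        # all bars removed: prints 45
print(max_free_area(4, 8, [2], [3, 4]))   # prints 15
```

Sorting dominates the cost, so the whole computation stays at `O(n*log(n))`.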
Everything is separated by unit distances and the square cells are of unit area. You are given such a grid with some of the iron bars missing in both dimensions. Find the maximum area of a rectangle with no iron bar in it. For example, if all iron bars are removed, the maximum area would be `(M + 1) * (N + 1)`. The input contains two arrays with the indices of the bars that remain in the grid. The idea is to sort each array in `O(n*log(n))` time and find the maximum difference between consecutive elements of each array in `O(n)` time, counting the surrounding walls, ceiling and floor as boundaries so that the gaps at the edges are included. Then multiply the two differences to get the maximum area.

My overall experience of the online round was positive. It was a pretty easy test, meant to check the students' basic coding ability rather than advanced skills and to filter for candidates who can get things done. It covered all the areas required to work on a Hadoop cluster: basic programming, database knowledge and file handling, at a very basic level. I cleared the round and made it to the interviews with 34 others, an 11.5% shortlist percentage.

Round 2: Technical Round

The first interview round was taken by a single interviewer from their engineering department and lasted about 45 minutes. It started with two puzzles followed by some data structures and networking questions.

Puzzle 1: Puzzle 12 | (Maximize probability of White Ball)

Puzzle 2: Puzzle 4 | (Pay an employee using a 7 units gold rod?)

What happens when you type a URL into your browser? It was a very open-ended question, and I built my answer around the TCP/IP model, DNS servers and routing. Other possible answers could revolve around data link layer protocols or the client-server architecture. One of the possible answers is here: what happens when you type a URL? As one would expect, a long discussion followed, in which he subtly checked all my Computer Networks knowledge. In the end he wasn’t entirely convinced by my answer, although he admitted that there is no right answer.
The fourth question was a fairly simple one. Given a binary tree, determine if it is a binary search tree. I gave the simple space-optimized in-order traversal solution, which marked the end of round 2. Solution here: A program to check if a binary tree is BST or not

14 out of 35 people cleared this round.

Round 3: Technical Round

The second interview started only about five minutes after the first. It was also taken by a single interviewer from their engineering department and lasted about 45 minutes. The first half was a CV-based discussion where I was asked a great deal about my projects, and especially about my open source contributions, as one would expect. The second half was all about databases, structured as daisy-chained DBMS questions where the answer to one gave rise to the next.

The first question was to represent inheritance in a relational database. I gave the classic `Living Beings -> Animal, Plants -> Mammals, Amphibians -> Dogs, Humans` example and wrote down the schema for it. Here is some reference: Enhanced ER Model

The second question was related to database security and ACID properties. He also took the opportunity to ask me to write a C++/Java class modelling security and atomicity in a bank database, through which he tested my OOP skills as well. This stretched into a long discussion: he kept finding loopholes in my solution and I iteratively removed them, until the point where it was fairly correct.

The third question was to optimize storage and searching in a database, where I straight up mentioned B+ Trees, which was a big mistake. He asked me to write pseudocode for deletion from a B+ Tree, which I could not write. Instead of wasting time trying, I drew an example B+ Tree and showed the process of deleting a node on the diagram, which convinced him.
Here is some reference: Introduction of B+ Tree. He then asked me how distributed clusters help in reducing this problem even further.

He deviated from mainline DBMS a few other times during the interview, but nothing was unanswerable or difficult, and the interviewer was very helpful and cool. The most important takeaway from this round was to be able to justify your CV very thoroughly. A few more questions on my projects and some interview feedback marked the end of round 3.

10 out of 14 people cleared this round.

Round 4: Techno-Managerial Round

The third interview was taken by a single interviewer, a project manager from their management department, and lasted about 35 minutes. Despite being the shortest technical interview, it seemed to pack in the most content, and for fair reasons. It covered CV-based questions about my open source projects, core OS questions, system design and managerial questions.

For the first 5 minutes, he talked to me about my previous interview experiences, my general college experience and my future plans and dreams. Pretty trivial stuff, much as expected.

Then he started asking OS questions. He asked me about thrashing and paging through example questions. He then asked me to explain terms like race condition, mutual exclusion and deadlock, and to write code for Peterson’s Solution to the critical section problem. He asked me to write code for the Readers Writers problem and to suggest design changes that prevent starvation of writers when there is an infinite stream of readers. He asked me some basic questions on job scheduling while going further through my CV, and finally asked me a few detailed questions on Real Time Operating Systems, since that was on my CV.

We then talked a little about some informal stuff and one of my previous internships, which marked the end of round 4.

7 out of 10 people cleared this round.
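The writer-starvation question from this round can be illustrated with a short sketch. It assumes a writer-preference policy: once a writer is waiting, newly arriving readers queue behind it, so an endless stream of readers can no longer lock writers out. This is one common design change, not necessarily the one discussed in the interview, and the class and method names are hypothetical.

```python
import threading


class WriterPreferenceRWLock:
    """Readers-writers lock in which waiting writers block new readers."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_readers = 0
        self._writer_active = False
        self._waiting_writers = 0

    def acquire_read(self):
        with self._cond:
            # The key change: readers also wait while a writer is merely
            # queued, not only while one is actively writing.
            while self._writer_active or self._waiting_writers > 0:
                self._cond.wait()
            self._active_readers += 1

    def release_read(self):
        with self._cond:
            self._active_readers -= 1
            if self._active_readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._waiting_writers += 1
            # A writer needs exclusive access: no readers and no other writer.
            while self._writer_active or self._active_readers > 0:
                self._cond.wait()
            self._waiting_writers -= 1
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()
```

The trade-off is that the starvation risk moves to the other side: with a continuous stream of writers, readers may now wait indefinitely, so a fair variant would serve requests in arrival order.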
Round 5: HR Round

The last round was taken by a single interviewer from their HR department and lasted a short 15 minutes. He also asked me about my projects, but this time I knew I had to answer in “the HR’s way” of things. He asked me about my future plans regarding higher studies and/or corporate roles, and cleared some logistical doubts out of the way. He took verbal confirmations from me regarding job location, job description and offer acceptance. That was pretty much it, after which we patiently waited for the results, which were declared within the same hour.

3 out of 7 people cleared this round and were offered the job (Internship + Full Time). All three of us accepted it.

Conclusion: The entire process was very neat, complete and sufficient in terms of time, content, difficulty and results. Unlike some students, I didn’t have to wait between my interviews, and the process was very streamlined for me. I am looking forward to starting work from Jan 2020.

A few pieces of advice for the reader:
• Needless to say, treat GFG as your Bible. Practice competitive coding on InterviewBit, HackerRank and CodeChef.
• Do not panic at any point in the process. A lot depends on how smartly and fluently you can communicate. Remember, there’s no point in being talented if you cannot tell that to your interviewer.