Robert Bosch Data Scientist interview 2019

Telephonic :-
Q.1. What is a Decision Tree? How does it choose a split? How does a decision tree work?
Q.2. What does each node contain in a Decision Tree?
Q.3. What are Entropy and the Gini Index, and how do they help?
Q.4. What is Random Forest? What is random in Random Forest? How is the OOB (out-of-bag) error calculated?
Q.5. How does random forest work?
Q.6. Explain the entire process from the point you get the data until you reach the final stage of prediction.
Q.7. How does kNN work? Which distance metric should be used in kNN when the data is categorical?
Q.8. You have 10 documents. Each document has been tagged with a topic. When a new document comes in, how do you tag it with one of those topics?
Primary focus : The candidate should be good at coding and should also have sound knowledge of ML algorithms.
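
For Q.3 above, a minimal sketch of the two impurity measures and how a split is scored with them. The labels and the candidate split are made up for illustration, and the helper names (`entropy`, `gini`, `weighted_impurity`) are hypothetical, not taken from the interview.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_impurity(left, right, impurity):
    """Size-weighted impurity of the two child nodes of a split."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)

# Made-up node with 5 "yes" and 5 "no" labels, and one candidate split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

# A split is better the more it reduces impurity relative to the parent.
info_gain = entropy(parent) - weighted_impurity(left, right, entropy)
gini_gain = gini(parent) - weighted_impurity(left, right, gini)
print(info_gain, gini_gain)
```

A tree builder would evaluate many candidate splits this way and keep the one with the largest reduction.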

Face to Face :-
Coding round in R
1. Create a data frame of this form
Date Value
01/01/2019 12:00 xx
. .
. .
. .
31/01/2019 11:59 .

Value can be randomly generated
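
The round itself was in R. The sketch below uses plain Python only to illustrate the logic. It assumes one row per hour for January 2019, reads the 12:00 start as midnight, and draws values uniformly from 0 to 100. None of these details are stated in the task.

```python
import random
from datetime import datetime, timedelta

random.seed(42)  # fixed seed so the sketch is repeatable

start = datetime(2019, 1, 1, 0, 0)
end = datetime(2019, 2, 1, 0, 0)  # exclusive upper bound

# Build the "data frame" as a list of rows, one per hour.
rows = []
ts = start
while ts < end:
    rows.append({
        "Date": ts.strftime("%d/%m/%Y %H:%M"),
        "Value": round(random.uniform(0, 100), 2),  # assumed value range
    })
    ts += timedelta(hours=1)

print(len(rows), rows[0]["Date"], rows[-1]["Date"])
```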

2. Transpose the data frame into this form
Date Hour1 Hour2 Hour3 . . . Value
01/01/2019 12:00 13:00 14:00 . . . xx
02/01/2019 12:00 13:00 14:00 . . . xx
. . . . . .
. . . . . .
. . . . . .
31/01/2019 12:00 13:00 14:00 . . . xx
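
One way to read this task is as a long-to-wide reshape: one row per date, one column per hour, each cell holding that hour's value. That reading is an assumption, since the target layout above is only sketched. A minimal Python illustration (the interview used R), with hourly data regenerated so the block stands alone:

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta

random.seed(0)  # fixed seed so the sketch is repeatable

# Long format: one (timestamp, value) pair per hour of January 2019.
start = datetime(2019, 1, 1)
long_rows = [
    (start + timedelta(hours=h), round(random.uniform(0, 100), 2))
    for h in range(31 * 24)
]

# Wide format: group by calendar date, place each value in its hour slot.
wide = defaultdict(lambda: [None] * 24)
for ts, value in long_rows:
    wide[ts.strftime("%d/%m/%Y")][ts.hour] = value

# Column names Hour1..Hour24 follow the header shown in the task.
header = ["Date"] + [f"Hour{h + 1}" for h in range(24)]
table = [[date] + values for date, values in wide.items()]

print(header[:4])
print(table[0][:4])
```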

Technical Interview
Q.1. If I want to find a relationship between Price and Sales, should I use regression or correlation?
Answer : Simple linear regression can be used to model how the dependent variable (Sales) changes with the independent variable (Price).
Assumption : No other variables influence Sales.
The correlation coefficient, i.e. the standardized covariance (-1 ≤ r ≤ 1), tells us :
1. Whether the correlation is positive or negative.
2. The strength of the linear relationship between the 2 variables.
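
To make the difference concrete, here is a minimal sketch that computes both quantities from the same sums. The price and sales figures are made up for illustration.

```python
import math

# Made-up data: sales fall as price rises.
price = [10, 12, 14, 16, 18, 20]
sales = [200, 185, 170, 150, 140, 120]

n = len(price)
mean_x = sum(price) / n
mean_y = sum(sales) / n

# Sums of squares and cross-products around the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(price, sales))
sxx = sum((x - mean_x) ** 2 for x in price)
syy = sum((y - mean_y) ** 2 for y in sales)

# Correlation: direction and strength of the linear relationship.
r = sxy / math.sqrt(sxx * syy)

# Simple linear regression: Sales = intercept + slope * Price.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(r, slope, intercept)
```

Correlation is symmetric and unitless, while the regression slope says how many units Sales changes per unit of Price, so the two answer slightly different questions.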

Q.2. If I have multiple features in my dataset, how do I know which ones to include when building my model?
Answer : Check the coefficient of determination, i.e. R squared. It is the proportion of the variation in the y variable that is explained by the x variable.
If R squared is 0, x explains none of the variation in y.
If R squared is 1, y can be predicted from x without any error.
I had answered with a dimensionality reduction technique such as Principal Component Analysis.
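
A minimal sketch of R squared for a one-feature least-squares line, showing the two extremes mentioned above. The data is made up, and `r_squared` is a hypothetical helper name.

```python
def r_squared(x, y):
    """R squared of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    ss_tot = sum((yi - my) ** 2 for yi in y)                        # total
    return 1 - ss_res / ss_tot

x = [1, 2, 3, 4, 5]
perfect = [3, 5, 7, 9, 11]   # exactly y = 2x + 1
unrelated = [4, 1, 5, 1, 4]  # chosen so the fitted slope is zero

print(r_squared(x, perfect), r_squared(x, unrelated))
```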

Q.3. Questions on SSE, RMSE, MAPE.
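
For reference, a minimal sketch of the three error metrics using their standard definitions. The actual and predicted values are made up.

```python
import math

def sse(actual, predicted):
    """Sum of squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(sse(actual, predicted) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    return 100 / len(actual) * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    )

actual = [100, 200, 400]
predicted = [110, 190, 400]

print(sse(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```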

Q.4. More questions on the end-to-end process of data analysis.
Q.5. I was asked a few questions on practical scenarios :
a) If I want to improve traffic conditions, what data would I ask for?
b) Questions of the "which algorithm to use when" kind.
