# Random Forest Regression in Python

A Random Forest is an ensemble technique that can perform both regression and classification tasks. It uses multiple decision trees together with a technique called Bootstrap Aggregation, commonly known as bagging. The basic idea is to combine the outputs of many decision trees to determine the final output, rather than relying on any individual tree.

Approach:

1. Pick K data points at random from the training set (bagging draws them with replacement).
2. Build the decision tree associated with those K data points.
3. Choose the number Ntree of trees you want to build and repeat steps 1 and 2 for each tree.
4. For a new data point, have each of the Ntree trees predict the value of Y, and assign the new data point the average of all the predicted Y values.
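The steps above can be sketched by hand with plain decision trees. This is a minimal illustration, not the implementation scikit-learn uses: the toy data, the variable names (`n_trees`, `k`, `bagged_predict`) and the choice of K equal to the training set size are all assumptions made for the example. A full random forest may also restrict each split to a random subset of the features; with a single feature, as here, that makes no difference.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data (not the article's Salaries.csv): y = x^2 plus noise.
rng = np.random.default_rng(0)
X_train = np.linspace(1, 10, 40).reshape(-1, 1)
y_train = X_train.ravel() ** 2 + rng.normal(0, 2, size=40)

n_trees = 50          # "Ntree" in the steps above
k = len(X_train)      # "K": assumed equal to the training set size

trees = []
for _ in range(n_trees):
    # Step 1: draw K row indices at random, with replacement.
    idx = rng.integers(0, len(X_train), size=k)
    # Step 2: build a decision tree on that bootstrap sample.
    tree = DecisionTreeRegressor(random_state=0)
    tree.fit(X_train[idx], y_train[idx])
    trees.append(tree)  # Step 3: repeat until n_trees trees exist.


def bagged_predict(X_new):
    """Step 4: average the predictions of all trees."""
    per_tree = np.stack([t.predict(X_new) for t in trees])
    return per_tree.mean(axis=0)


bagged_prediction = bagged_predict(np.array([[6.5]]))
print(bagged_prediction)
```

Because each tree sees a different bootstrap sample, the individual predictions differ, and averaging them smooths out the variance of any single tree.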

Below is the step-by-step Python implementation.

Step 1 : Import the required libraries.

```python
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

Step 2 : Import and print the dataset

```python
data = pd.read_csv('Salaries.csv')
print(data)
```

Step 3 : Select all rows of column index 1 as x and all rows of column index 2 as y

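```python
# The slice 1:2 keeps x two-dimensional (n_samples x 1),
# which is the shape scikit-learn expects for features
x = data.iloc[:, 1:2].values
print(x)

# y is a one-dimensional array of target values
y = data.iloc[:, 2].values
```

Step 4 : Fit Random Forest regressor to the dataset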

```python
# Fitting Random Forest Regression to the dataset
# import the regressor
from sklearn.ensemble import RandomForestRegressor

# create regressor object
regressor = RandomForestRegressor(n_estimators=100, random_state=0)

# fit the regressor with x and y data
regressor.fit(x, y)
```

Step 5 : Predicting a new result

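```python
# predict() expects a 2D input of shape (n_samples, n_features),
# so the single value is wrapped in a nested list
y_pred = regressor.predict([[6.5]])  # test the output by changing values
```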
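To connect this prediction back to the averaging step of the approach, the sketch below checks that a fitted forest's prediction matches the mean of its individual trees' predictions. The data here is a hypothetical stand-in (levels 1 to 10 with made-up target values), since the contents of `Salaries.csv` are not shown, and the names `forest`, `x_demo` and `y_demo` are chosen only for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical stand-in data: levels 1..10 with made-up target values.
x_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_demo = 1000.0 * x_demo.ravel() ** 2

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(x_demo, y_demo)

# A single query point, shaped (n_samples, n_features) = (1, 1).
query = np.array([[6.5]])

# Prediction of the whole forest.
forest_pred = forest.predict(query)

# Prediction of each fitted tree, averaged by hand.
tree_preds = np.array([tree.predict(query)[0] for tree in forest.estimators_])
manual_mean = tree_preds.mean()

print(forest_pred[0], manual_mean)
```

The two printed values should agree up to floating-point rounding, which shows that the regressor's output is simply the average over its trees.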

Step 6 : Visualising the result

```python
# Visualising the Random Forest Regression results

# arange creates values from the minimum to the maximum of x
# in steps of 0.01 (x.min() and x.max() return plain scalars)
X_grid = np.arange(x.min(), x.max(), 0.01)

# reshape into a len(X_grid) x 1 array,
# i.e. make a single column out of the X_grid values
X_grid = X_grid.reshape((len(X_grid), 1))

# Scatter plot for original data
plt.scatter(x, y, color='blue')

# Plot predicted data
plt.plot(X_grid, regressor.predict(X_grid), color='green')
plt.title('Random Forest Regression')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
```
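The fine 0.01 grid is useful because a forest of trees predicts a step function: the prediction only changes where some tree has a split. The sketch below illustrates this by counting the distinct predicted values on such a grid. It uses hypothetical stand-in data (levels 1 to 10 with made-up targets), not the article's dataset, and the names `forest`, `grid` and `n_levels` are assumptions for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical stand-in data: levels 1..10 with made-up target values.
x_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_demo = 1000.0 * x_demo.ravel() ** 2

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(x_demo, y_demo)

# Fine grid between the smallest and largest level, as one column.
grid = np.arange(x_demo.min(), x_demo.max(), 0.01).reshape(-1, 1)
grid_pred = forest.predict(grid)

# Count how many distinct prediction values occur along the grid.
n_levels = len(np.unique(grid_pred))
print(len(grid), n_levels)
```

The grid has hundreds of points but only a small number of distinct predictions, which is why the plotted curve looks like a staircase rather than a smooth line.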
