Machine Learning in C++

Many of us learn C++ as a first language, but for data analysis and machine learning Python tends to become the go-to choice because of its simplicity and its large collection of ready-made libraries.
But can C++ be used for machine learning too? And if so, how?

Pre-requisites:

  1. C++ Boost library: a widely used collection of C++ libraries that covers, among other things, advanced mathematical operations.
    See the official Boost documentation for installation instructions.
  2. mlpack C++ library: a small and scalable C++ machine learning library.
    See the official mlpack documentation for installation instructions.
    Note: set USE_OPENMP=OFF when building mlpack (typically passed to CMake as -DUSE_OPENMP=OFF); the mlpack build guide explains how to do this.
  3. Sample CSV data file: mlpack does not ship with a built-in sample dataset, so we have to supply our own.

Our Model

The code we are writing takes a simple dataset of vectors and finds the nearest neighbour of each data point.

The "training" step is the construction of the NeighborSearch object; it is marked by a comment in the code.

Input: a file named data.csv containing a dataset of vectors, one vector per line.
The file contains the following data:
3, 3, 3, 3, 0
3, 4, 4, 3, 0
3, 4, 4, 3, 0
3, 3, 4, 3, 0
3, 6, 4, 3, 0
2, 4, 4, 3, 0
2, 4, 4, 1, 0
3, 3, 3, 2, 0
3, 4, 4, 2, 0
3, 4, 4, 2, 0
3, 3, 4, 2, 0
3, 6, 4, 2, 0
2, 4, 4, 2, 0
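
Each line of data.csv is one 5-dimensional point. mlpack stores data column-major, with one point per column, and as far as its documentation describes, data::Load transposes a CSV file on loading so that every row of the file becomes a column of the matrix. The following is a minimal sketch of that layout in plain standard C++ (no mlpack or Armadillo involved). The names parseCsv and toColumnsAsPoints are hypothetical helpers invented for this illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: parse comma-separated text.
// Each non-empty input line becomes one point (one inner vector).
Matrix parseCsv(const std::string& text)
{
    Matrix points;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line))
    {
        if (line.empty())
            continue;
        std::vector<double> point;
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ','))
            point.push_back(std::stod(field)); // stod skips leading spaces
        points.push_back(point);
    }
    return points;
}

// Hypothetical helper: rearrange so that result[d][p] holds
// coordinate d of point p, i.e. every point is a column.
Matrix toColumnsAsPoints(const Matrix& points)
{
    if (points.empty())
        return {};
    const std::size_t dims = points[0].size();
    Matrix columns(dims, std::vector<double>(points.size()));
    for (std::size_t p = 0; p < points.size(); ++p)
    {
        // Assumption: every row has the same number of fields.
        assert(points[p].size() == dims);
        for (std::size_t d = 0; d < dims; ++d)
            columns[d][p] = points[p][d];
    }
    return columns;
}
```

Applied to the 13 lines above, toColumnsAsPoints(parseCsv(...)) would give 5 rows and 13 columns, which is why the mlpack program reports one neighbour for each of 13 points rather than for each of 5.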

Code:

#include <iostream>
#include <mlpack/core.hpp>
#include <mlpack/methods/neighbor_search/neighbor_search.hpp>
  
using namespace std;
using namespace mlpack;
// NeighborSearch and NearestNeighborSort
using namespace mlpack::neighbor;
// ManhattanDistance
using namespace mlpack::metric;
  
void mlModel()
{
    // Armadillo is a C++ linear algebra library; 
    // mlpack uses its matrix data type.
    arma::mat data;
  
    /*
    data::Load imports a file into a matrix that mlpack can use.
    It takes 3 parameters:
        1. filename = name of the file to read
        2. matrix = matrix that receives the data
        3. fatal = true to throw an exception if loading fails
    mlpack treats each column as one point, so every row of
    the CSV file becomes a column of the matrix.
    */
    data::Load("data.csv", data, true);
  
    /*
    Create a NeighborSearch model. Its behaviour is set by
    two template parameters:
        1. Sorting method: "NearestNeighborSort" - This
        class sorts by increasing distance.
        2. Distance metric: "ManhattanDistance" - The
        L1 distance, the sum of absolute differences.
    The constructor takes the reference dataset (the
    vectors to be searched through).
     */
    NeighborSearch<NearestNeighborSort, ManhattanDistance> nn(data);
    // The line above is the "training" step: the model
    // indexes the reference data so it can be searched.
    // Next, we query it.
  
    arma::Mat<size_t> neighbors; // Matrices to hold
    arma::mat distances; // the results
  
    /*
    Find the nearest neighbors. Arguments are:
        1. k = 1, the number of neighbors to find
        2. Matrices to hold the result, in this case,
        neighbors and distances
    Because no query set is passed, each point is searched
    against the others and is not returned as its own neighbor.
    */
    nn.Search(1, neighbors, distances);
    // The line above finds the nearest neighbor of every point.
  
    // Print out each neighbor and its distance.
    for (size_t i = 0; i < neighbors.n_elem; ++i)
    {
        std::cout << "Nearest neighbor of point " << i << " is point "
                  << neighbors[i] << " and the distance is " 
                  << distances[i] << ".\n";
    }
}
  
int main()
{
    mlModel();
    return 0;
}



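Save the program above as knn_example.cpp, then compile it from a terminal. The flags below match mlpack 3.x, which links against Boost serialization; newer mlpack releases may need different flags.

g++ knn_example.cpp -o knn_example -std=c++11 -larmadillo -lmlpack -lboost_serialization

Then run it with

./knn_example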

Output:
Nearest neighbor of point 0 is point 7 and the distance is 1.
Nearest neighbor of point 1 is point 2 and the distance is 0.
Nearest neighbor of point 2 is point 1 and the distance is 0.
Nearest neighbor of point 3 is point 10 and the distance is 1.
Nearest neighbor of point 4 is point 11 and the distance is 1.
Nearest neighbor of point 5 is point 12 and the distance is 1.
Nearest neighbor of point 6 is point 12 and the distance is 1.
Nearest neighbor of point 7 is point 10 and the distance is 1.
Nearest neighbor of point 8 is point 9 and the distance is 0.
Nearest neighbor of point 9 is point 8 and the distance is 0.
Nearest neighbor of point 10 is point 9 and the distance is 1.
Nearest neighbor of point 11 is point 4 and the distance is 1.
Nearest neighbor of point 12 is point 9 and the distance is 1.
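
The distances in this output can be cross-checked without mlpack. The following is a minimal brute-force sketch in plain standard C++: it compares every point with every other point using the Manhattan distance and keeps the smallest value. It is not how mlpack performs the search internally, and the names manhattanDistance, nearestDistances and sampleData are hypothetical helpers made up for this illustration. Only distances are compared, because several points have more than one neighbour at the same distance and the index reported for such ties may differ between implementations.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

// L1 (Manhattan) distance: the sum of absolute coordinate differences.
double manhattanDistance(const Point& a, const Point& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d)
        sum += std::fabs(a[d] - b[d]);
    return sum;
}

// For every point, the distance to its closest *other* point.
// Brute force: O(n^2) comparisons, fine for a 13-point dataset.
std::vector<double> nearestDistances(const std::vector<Point>& points)
{
    std::vector<double> best(points.size(),
                             std::numeric_limits<double>::infinity());
    for (std::size_t i = 0; i < points.size(); ++i)
        for (std::size_t j = 0; j < points.size(); ++j)
            if (i != j) // a point is never its own neighbour
                best[i] = std::min(best[i],
                                   manhattanDistance(points[i], points[j]));
    return best;
}

// The 13 rows of data.csv, in file order.
std::vector<Point> sampleData()
{
    return {
        {3, 3, 3, 3, 0}, {3, 4, 4, 3, 0}, {3, 4, 4, 3, 0},
        {3, 3, 4, 3, 0}, {3, 6, 4, 3, 0}, {2, 4, 4, 3, 0},
        {2, 4, 4, 1, 0}, {3, 3, 3, 2, 0}, {3, 4, 4, 2, 0},
        {3, 4, 4, 2, 0}, {3, 3, 4, 2, 0}, {3, 6, 4, 2, 0},
        {2, 4, 4, 2, 0}
    };
}
```

Calling nearestDistances(sampleData()) returns one distance per point, in file order. For example, points 1 and 2 are identical rows, so their nearest-neighbour distance is 0, while point 0 (3, 3, 3, 3, 0) differs from point 7 (3, 3, 3, 2, 0) in a single coordinate by 1, giving a distance of 1.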

