Career Opportunities in Big Data

This blog presents a high-level view of career opportunities in the Big Data domain and the basic skills they require. Some of the common designations and their responsibilities are described below.

Role – Data Scientist

  • The Big Data scientist needs to be familiar with some of the languages among Python, R, Java, Ruby, Clojure, MATLAB, Pig, or SQL.
  • They need to have an understanding of Hadoop, Hive and/or MapReduce.
  • In addition, they need to be familiar with disciplines such as:
    • Natural language processing: enabling computers to understand and work with human language;
    • Machine learning: building algorithms that learn from and improve with data;
    • Conceptual modeling: being able to articulate and share models of the problem domain;
    • Statistical analysis: to understand and work around possible limitations in models;
    • Predictive modelling: most big data problems come down to being able to predict future outcomes.
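The predictive-modelling item above can be made concrete with a tiny, hypothetical example: fitting a straight line to observed data and using it to predict the next value. This is a minimal pure-Python sketch; real projects would typically use libraries such as scikit-learn or R, and the sales numbers here are invented.

```python
# Minimal predictive-modelling sketch: fit y = slope*x + intercept to data
# with ordinary least squares, then predict a future value.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

# Hypothetical example: predict month 5 from sales in months 1-4
model = fit_line([1, 2, 3, 4], [10.0, 20.0, 30.0, 40.0])
print(predict(model, 5))  # -> 50.0
```

The same idea scales up: in Big Data work the fitting step runs over a cluster, but the modelling concept is identical.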

Role – Big Data Engineer / Big Data Developer / Big Data Architect



  • Step-by-step path for a software engineer who is an expert in Java / C / C++ => Hadoop (APIs, MapReduce coding, ecosystem & administration) => Hive/Pig/Impala/ML => Oozie plus monitoring.
  • Architect, design, and develop Big Data based software from scratch, or upgrade and maintain existing systems.
  • Step-by-step path for a software engineer who is an expert in Oracle / PL/SQL / MS SQL / Teradata / data warehousing => Hadoop (APIs, MapReduce coding, ecosystem & administration) => Hive/Pig/Impala/ML => Oozie plus monitoring tools.
  • Architect, design, and develop a Big Data based data warehouse.
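The MapReduce coding step in the learning paths above is usually first practiced with the classic word count. Below is a minimal sketch of the mapper and reducer logic in Python (the style used with Hadoop Streaming); the shuffle phase that Hadoop performs between the two is simulated locally, so this is an illustration of the programming model, not Hadoop's actual API.

```python
# Word-count sketch in the MapReduce style used with Hadoop Streaming.
# map_phase emits (word, 1) pairs; reduce_phase sums counts per word.
# The shuffle/sort step that Hadoop runs between them is simulated here.
from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    counts = defaultdict(int)
    for word, one in pairs:   # simulated shuffle: group by key, then sum
        counts[word] += one
    return dict(counts)

result = reduce_phase(map_phase(["Big Data big data", "Hadoop"]))
print(result)  # {'big': 2, 'data': 2, 'hadoop': 1}
```

On a real cluster, the same mapper and reducer would run as separate processes over HDFS blocks, which is why the two phases are kept strictly independent.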

Role – Big Data DBA

  • Design and development of data models.
  • Hadoop ecosystem installation and configuration.
  • Disaster recovery / cluster-to-cluster replication; database backup and recovery.
  • Database connectivity and security.
  • Performance monitoring and configuration-based tuning.
  • Disk space management.
  • Software patches and upgrades for Unix as well as Hadoop.
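As one concrete illustration of the disk space management duty above, here is a small, hypothetical Python sketch that warns when a filesystem is nearly full; the checked path and the 85% threshold are assumptions for the example, not Hadoop defaults, and a real DBA would check the volumes holding the HDFS data directories.

```python
# Disk-space check sketch: warn when a filesystem's usage crosses a threshold.
import shutil

def disk_usage_percent(path):
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100

def check_disk(path, warn_at_percent=85.0):
    used = disk_usage_percent(path)
    if used >= warn_at_percent:
        return f"WARNING {path}: {used:.1f}% used"
    return f"OK {path}: {used:.1f}% used"

print(check_disk("/"))
```

A script like this would typically be scheduled via cron and wired into the cluster's alerting, alongside the HDFS-level reports Hadoop itself provides.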

Role – Big Data Admin/Hadoop Administrator

    • Good Linux and shell scripting background.
    • Good knowledge of the Hadoop ecosystem and technologies.
    • Understanding of Hadoop design principles and the factors that affect distributed system performance, including hardware and network considerations.
    • Experience providing infrastructure recommendations, capacity planning, and developing utilities to monitor the cluster.
    • Experience managing large clusters with huge volumes of data.
    • Experience with cluster maintenance tasks such as adding and removing nodes, cluster monitoring, and troubleshooting; managing and reviewing Hadoop log files.
    • Experience installing and implementing security for Hadoop clusters.
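The log-review part of the admin role above can be illustrated with a small sketch that filters a log for ERROR and FATAL lines. The sample log lines are invented for the example; a real administrator would scan files under the Hadoop log directory, often with shell tools like grep or a log aggregator.

```python
# Sketch of reviewing Hadoop log files: pull out ERROR/FATAL lines
# so problems can be spotted quickly.

def find_problems(log_lines, levels=("ERROR", "FATAL")):
    """Return lines whose whitespace-separated log level matches `levels`."""
    return [line for line in log_lines
            if any(lvl in line.split() for lvl in levels)]

# Invented sample lines in a typical "timestamp LEVEL component: message" shape
sample = [
    "2023-05-01 10:00:01 INFO NameNode: heartbeat received",
    "2023-05-01 10:00:02 ERROR DataNode: disk failure on /data/1",
    "2023-05-01 10:00:03 FATAL NameNode: out of memory",
]
for line in find_problems(sample):
    print(line)
```

Splitting on whitespace rather than substring-matching avoids false hits on words like "ERRORS" inside a message body.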


Role – Big Data / Hadoop Operations / Production Support

      • Good Linux and shell scripting background.
      • Good knowledge of the Hadoop ecosystem and technologies.
      • Cluster maintenance.
      • Job management: investigating and restarting failed jobs.
      • Autosys / Oozie integration.
      • Data analysis and data recovery.
      • Cluster-to-cluster data movement.
      • Escalations.
      • Operations management.
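The job-failure / restart / escalation workflow in the list above can be sketched as a simple retry loop. This is an assumption-laden illustration: the job callable here is a hypothetical stand-in for submitting a real Hadoop or Oozie job, and the attempt count and back-off are arbitrary.

```python
# Sketch of production-support job handling: retry a failed job a limited
# number of times, then escalate. `job` stands in for a real job submission.
import time

def run_with_retries(job, max_attempts=3, delay_seconds=0):
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception as exc:
            if attempt == max_attempts:
                # Out of retries: surface the failure so it can be escalated
                raise RuntimeError(
                    f"escalate: job failed {max_attempts} times") from exc
            time.sleep(delay_seconds)  # back off before restarting

# Usage: a hypothetical flaky job that succeeds on the second attempt
attempts = {"n": 0}
def flaky_job():
    attempts["n"] += 1
    if attempts["n"] < 2:
        raise RuntimeError("transient failure")
    return "SUCCEEDED"

print(run_with_retries(flaky_job))  # SUCCEEDED
```

In practice the retry policy usually lives in the scheduler itself (Oozie and Autosys both support retry counts), but the escalate-after-N-failures logic is the same.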

This article is contributed by Sujay Chungath, a Java/J2EE/Big Data architect and founder at Netscientium.com, a knowledge sharing platform built by accomplished Java/J2EE architects and management experts providing high-tech IT training mainly in Big Data (Hadoop, Spark, Scala, Storm), Big Data workshops, Angular JS, JavaScript, and iOS Swift, in classroom and online models in India and the USA. Netscientium is part of Netscitus Corporation.

If you also wish to showcase your blog here, please see GBlog for guest blog writing on GeeksforGeeks.


