Apache Hadoop

  • First, set up SSH, which you can refer to here

  • Install the right version of Java (OpenJDK 8)

    sudo apt-get install openjdk-8-jdk
    java -version   # expect 1.8.x

  • Download Hadoop 3.1.2

  • Extract it to, say, `/opt/binaries/hadoop/`

    # Unpacks into a hadoop-3.1.2/ directory
    tar xzf hadoop-3.1.2.tar.gz
    # Rename the extracted directory (not the tarball)
    mv hadoop-3.1.2 hadoop
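
The extract step can also be wrapped in a small shell function so that it is repeatable. This is only a sketch: the function name `install_hadoop` and the variable `HADOOP_VERSION` are hypothetical names introduced here, and it assumes the release tarball unpacks into a `hadoop-<version>/` directory and that the tarball has already been downloaded.

```shell
# Sketch only: install_hadoop and HADOOP_VERSION are hypothetical names,
# not part of Hadoop itself.
HADOOP_VERSION="${HADOOP_VERSION:-3.1.2}"

install_hadoop() {
    local tarball="$1"   # path to hadoop-<version>.tar.gz
    local root="$2"      # directory that will contain hadoop/

    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    if [ -e "$root/hadoop" ]; then
        echo "refusing to overwrite existing $root/hadoop" >&2
        return 1
    fi

    mkdir -p "$root" || return 1
    # Assumption: the archive unpacks into hadoop-<version>/.
    tar xzf "$tarball" -C "$root" || return 1
    # Rename the extracted directory to a version-independent name.
    mv "$root/hadoop-$HADOOP_VERSION" "$root/hadoop"
}
```

With the tarball in the current directory, `install_hadoop hadoop-3.1.2.tar.gz /opt/binaries` would leave the files under `/opt/binaries/hadoop/`. Writing to `/opt` usually needs root privileges. The function refuses to overwrite an existing install, so re-running it by accident does no harm.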

References

  • https://data-flair.training/blogs/installation-of-hadoop-3-on-ubuntu/

  • https://towardsdatascience.com/a-gentle-introduction-to-apache-arrow-with-apache-spark-and-pandas-bb19ffe0ddae