New in Big Data Discovery 1.2 is the addition of BDD Shell, an integration point with Python. This exposes the datasets and BDD functionality in a Python and PySpark environment, opening up huge possibilities for advanced data science work on BDD datasets, particularly when used in conjunction with Jupyter Notebooks. With the ability to push data modified in this environment back to Hive, and thus to BDD, this is important functionality that will make BDD even more useful for navigating and exploring big data.

The Big Data Lite virtual machine is produced by Oracle for demo and development purposes, and hosts all the components that you'd find on the Big Data Appliance, all configured and integrated for use. Version 4.5 was released recently, which includes BDD 1.2. In this article we'll see how to configure BDD Shell on Big Data Lite 4.5 (along with Jupyter Notebooks), and in a subsequent post dive into how to actually use them.

You can find the BDD Shell installation document here. Log in to Big Data Lite 4.5 (oracle/welcome1) and open a Terminal window.

The first step is to download Anaconda, which is a distribution of Python that also includes "over 100 of the most popular Python, R and Scala packages for data science", as well as Jupyter Notebook, which we'll see in a moment. Run the installer from the Terminal (the bash is part of the command to enter). Accept the licence when prompted, and then select an install location - I used /u01/anaconda2, where the rest of the BigDataLite installs are. Anaconda2 will now be installed into this location:

> /u01/anaconda2

After a few minutes of installation, you'll be prompted as to whether you want to prepend Anaconda's location to the PATH environment variable ("Do you wish the installer to prepend the Anaconda2 install location…"). I opted not to (which is the default), since Python is used elsewhere on the system, and by prepending it Anaconda's Python would take priority and possibly break things.

Now edit the BDD Shell configuration file (/u01/bdd/v1.2.0/BDD-1.2.0.31.813/bdd-shell/nf) in your favourite text editor to add/amend the following line:

SPARK_EXECUTOR_PYTHON=/u01/anaconda2/bin/python

Amend the path if you didn't install Anaconda into /u01.

In the same configuration file, add/amend:

SPARK_EXTRA_CLASSPATH=/usr/lib/oozie/oozie-sharelib-yarn/lib/spark/spark-avro_2.10-1.1.0-cdh5.7.0.jar

Check that the spark-avro JAR exists at that path. Assuming it does, you can now launch the shell:

Downloads]$ /u01/bdd/v1.2.0/BDD-1.2.0.31.813/bdd-shell/bdd-shell.sh
WARNING: User-defined SPARK_HOME (/usr/lib/spark) overrides detected (/usr/lib/spark/).
WARNING: Running spark-class from user-defined location.
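As a quick sanity check before launching the shell, it can be handy to confirm that the settings you amended actually parse the way you expect. The snippet below is a minimal sketch, not part of BDD Shell itself: the parse_config helper is hypothetical, and it simply reads shell-style KEY=VALUE lines of the kind shown above.

```python
# Minimal sketch: parse shell-style KEY=VALUE lines from a BDD Shell
# config file and report the settings amended in this post.
# NOTE: parse_config is an illustrative helper, not a BDD Shell API.

def parse_config(text):
    """Return a dict of KEY=VALUE settings, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

# Sample content mirroring the two lines added to the configuration file
sample = """
# BDD Shell settings amended above
SPARK_EXECUTOR_PYTHON=/u01/anaconda2/bin/python
SPARK_EXTRA_CLASSPATH=/usr/lib/oozie/oozie-sharelib-yarn/lib/spark/spark-avro_2.10-1.1.0-cdh5.7.0.jar
"""

conf = parse_config(sample)
print(conf["SPARK_EXECUTOR_PYTHON"])   # should be the Anaconda Python path
print(conf["SPARK_EXTRA_CLASSPATH"])   # should be the spark-avro JAR path
```

In practice you would read the real configuration file from disk instead of the inline sample, and also check with os.path.exists that both the Python binary and the JAR are present before launching bdd-shell.sh.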