The sandbox is really nice to work with.
That said, here are a few tidbits that helped me and that I want to share:
- There is shell access from Ambari, the UI, but sometimes you want to connect via ssh.
Don't do this:
$ ssh root@127.0.0.1:2222
ssh: Could not resolve hostname 127.0.0.1:2222: nodename nor servname provided, or not known
Do this instead:
urbanlegends-2:~$ ssh -p 2222 root@127.0.0.1
The password should be 'hadoop'.
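The first command fails because ssh treats everything after the @ as the hostname; the port has to be passed with -p. As a small sketch (the helper name ssh_args is made up for illustration, and it does not handle IPv6 literals), a host:port string can be split into the form ssh expects:

```shell
# Hypothetical helper: turn "user@host:port" into arguments ssh understands.
# Not part of ssh itself; does not handle IPv6 literals.
ssh_args() {
  target="$1"
  case "$target" in
    *:*) printf '%s\n' "-p ${target##*:} ${target%:*}" ;;  # split off the port
    *)   printf '%s\n' "$target" ;;                         # no port given
  esac
}

# Example (not run here): ssh $(ssh_args root@127.0.0.1:2222)
```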
- If you want to use Hive and you are installing HDP from scratch, surprise: you cannot use Beeswax out of the box (as of this writing, October 2013), because it is not integrated yet.
So you will need to install Beeswax separately from Ambari.
The documentation is incomplete, and you will need to download it yourself (via yum install beeswax).
- Adding a jar for a SerDe:
Even though you add the jar in the Hue UI File Browser, the jar location may not be picked up properly when using Hive at the command line, and Hue hides the actual path from you.
Workaround: add the jar resource in Beeswax and run your select statement from there. The log will then tell you where the jar was added,
e.g.: Added resource: /tmp/hue_3792@sandbox_201310151419_resources/hive-contrib-0.11.0.2.0.5.0-67.jar
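Once Beeswax has logged the path, it can be passed to ADD JAR at the Hive command line, as long as the file is still there. A minimal sketch, assuming the log line has the "Added resource:" form of the example line above (the function name jar_path_from_log is made up for illustration):

```shell
# Hypothetical helper: read log lines on stdin and print the path that
# follows "Added resource:" (format assumed from the example log line).
jar_path_from_log() {
  sed -n 's/^.*Added resource: *//p'
}

# Example (not run here): feed in the line copied from the Beeswax log, then
# register the jar in a command-line Hive session before querying:
#   jar=$(echo "$log_line" | jar_path_from_log)
#   hive -e "ADD JAR $jar; <your select statement>"
```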
- Installation of Hue:
Documentation:
1. After the creation of the Hue user (step "3. Create a Hue user and either deploy Hue in that user's home directory or under the /usr/share directory."), the documentation omits to say that you actually need to download and install Hue. This step, mentioned in the HDP 1.3 documentation, was forgotten in the HDP 2.0 documentation.
2. When running the daemon via /usr/lib/hue/build/env/bin/supervisor, the IP address needs to remain 0.0.0.0 and the port needs to be a free port (check via netstat). The daemon should then print a startup message, and you can check the UI in the browser. "Desktop" refers to the Hue server (generally the same management node as Ambari).
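The netstat check can be scripted. A minimal sketch, assuming Linux net-tools output where the fourth column is the local address (the function name port_is_free is made up, and 8000 in the usage comment is only an example port):

```shell
# Hypothetical helper: reads `netstat -tln` output on stdin and succeeds
# (exit status 0) when no TCP socket is listening on the given port.
port_is_free() {
  port="$1"
  # awk exits 0 if a listener on the port was found; "!" inverts that.
  ! awk -v p="$port" '$1 ~ /^tcp/ && $4 ~ (":" p "$") { found = 1 } END { exit !found }'
}

# Example (not run here):
#   netstat -tln | port_is_free 8000 && echo "port 8000 is free"
```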
A few notes:
- Installing g++: the package you actually need to install is gcc-c++,
i.e. yum install gcc-c++.
- You can install multiple yum packages at once (in fact,
all of the ones listed in the HDP doc) by putting all their names on the same
yum install line.
Hue Integration: as of HDP 2.0, Ambari and Hue are not integrated
with each other, so their users need to be duplicated in each system. You can
integrate both Hue and Ambari with LDAP (Active Directory); if that is done,
enterprise users who have access to the Linux boxes will be
able to have SSO in Ambari and Hue.
Hue Security: You
need to ensure all users created in Hue have access to create Hive jobs. If
they do not, it could be because the /user/<username> directories do not exist in
HDFS. You have to create the user's directory in HDFS before the user can use Hue, as a
.staging directory is needed for executing MapReduce jobs.
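Creating a missing /user/<username> directory takes a couple of HDFS commands run as the HDFS superuser. A minimal sketch, assuming the superuser account is named hdfs and sudo is available (the function name and the DRY_RUN switch are made up for illustration):

```shell
# Hypothetical helper: create /user/<username> in HDFS and hand it to the user.
# Assumes the HDFS superuser account is "hdfs". With DRY_RUN set to a
# non-empty value, the commands are printed instead of executed.
create_hdfs_home() {
  user="$1"
  run="${DRY_RUN:+echo}"
  $run sudo -u hdfs hdfs dfs -mkdir -p "/user/$user"
  $run sudo -u hdfs hdfs dfs -chown "$user" "/user/$user"
}

# Example (not run here): DRY_RUN=1 create_hdfs_home alice
```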
Beeswax settings:
If there is a specific SerDe jar that has to be used every time and by all
users, you can put it in /usr/lib/hive/lib and restart Hue. The directory is
included in the classpath when Beeswax starts. Check beeswax_server.out
for more details.
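The two steps can be sketched as a small script. The target directory is the one named above; the restart command is an assumption (it presumes Hue is registered as a service called hue) and may differ on your install. The function name and the DRY_RUN switch are made up for illustration:

```shell
# Hypothetical helper: make a SerDe jar available to every Beeswax session.
# "service hue restart" is an assumption about how Hue is restarted here.
# With DRY_RUN set to a non-empty value, the commands are printed only.
install_shared_serde() {
  jar="$1"
  run="${DRY_RUN:+echo}"
  $run cp "$jar" /usr/lib/hive/lib/
  $run service hue restart
}

# Example (not run here): DRY_RUN=1 install_shared_serde /tmp/my-serde.jar
```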
Nice post! For information, the Hue website is gethue.com. You can see the latest updates, docs and help there.
Thanks Romain.