This guide provides essential information for new users of Apache Bigtop, focusing on cluster administration and setup in cloud environments. Key steps include registration for the Bigtop mailing lists, installation of Bigtop on AWS for cloud computing tasks, and a series of hands-on labs to familiarize users with Bigtop components. Labs cover setting up environments, creating patches, running tests, and deploying modules, emphasizing the need for active participation and documentation of findings. Ensure you stay engaged for updates and community support!
Apache Bigtop Working Group: Cluster stuff
Bigtop Administration • Make sure you are signed up on the bigtop-dev mailing list. Lots of info there will never get repeated if you miss it • Lists: bigtop-user, bigtop-dev • Sign up for JIRA • Maybe the component mailing lists also. Not Hadoop, Whirr, etc…
Newbie Slide • Structure: • Registration, Join Biocurious. Pays for space; nobody takes a cut of this • Registration = AWS Credits. Cancelling IntelliJ. • Do labs, labs too easy • Lab 1 Modified to take 1-2 weeks. Update the wiki with your findings • Lab 2 Build Bigtop 0.3.0; too easy now (0.2.0 was harder and better: the compile-native JAVA_HOME bug was good, the current download bug is good) • Lab 2a Create a patch, apply it, do a build to verify it works • Lab 3 Set up a development environment, set up a Maven project using gmaven and surefire • Lab 4 MapReduce program • Lab 5 Run the unit tests under the component downloads • Lab 6 Run the integration tests • Lab 7 Puppet, deploy and run • Lab 8 Port a module • Labs are changing; not a class. Can't show up and expect to learn by listening to what others did. Requires time commitment • Demo, doesn't need to be working; for your benefit, not ours
Lab 1 • Install Bigtop. Web search for Apache Bigtop, go to wiki link http://incubator.apache.org/bigtop/ • https://cwiki.apache.org/confluence/display/BIGTOP/Index • https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop
Lab 1 • Install Bigtop, run all the components: Hive/HBase/Pig/Hadoop/Mahout/Oozie • There are bugs; document them • Add the sample programs in quickstart to the wiki. Not all are included yet
Lab 1 • Update the wiki • Sqoop: open (User group meeting next week) • Flume/Flume NG (open/nothing) • ZooKeeper (open/nothing)
Hadoop Components • Old: Don’t stop at running Pi as test of HDFS • Still missing: Run Terasort in Hadoop, need cluster • https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop • Whirr may need patch depending on where you run it from
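The Pi and Terasort runs mentioned above can be sketched as a small script. This is only an illustration under stated assumptions: the jar path in `EXAMPLES_JAR`, the row count, the HDFS directory names (`tera-in`, `tera-out`, `tera-report`), and the `run`/`DRY_RUN` helper are all hypothetical and not from the slides. The `pi`, `teragen`, `terasort`, and `teravalidate` program names are the ones shipped in the Hadoop examples jar.

```shell
# Assumed location of the examples jar; adjust to wherever your Bigtop
# packages put it (this path is a guess, not taken from the slides).
EXAMPLES_JAR="${EXAMPLES_JAR:-/usr/lib/hadoop/hadoop-examples.jar}"

# Hypothetical helper: print each command, and execute it only when
# DRY_RUN is set to 0. The default is to print without running.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Pi first, then generate, sort, and validate Terasort data.
# The chain stops at the first step that fails.
hadoop_smoke_test() {
  run hadoop jar "$EXAMPLES_JAR" pi 10 1000 &&
  run hadoop jar "$EXAMPLES_JAR" teragen 100000 tera-in &&
  run hadoop jar "$EXAMPLES_JAR" terasort tera-in tera-out &&
  run hadoop jar "$EXAMPLES_JAR" teravalidate tera-out tera-report
}
```

Calling `hadoop_smoke_test` prints the four commands; `DRY_RUN=0 hadoop_smoke_test` would actually submit them, which needs a working cluster as the slide notes.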
Mahout • Don't run the jar like in Hadoop • Scripts handle downloading and clustering, email demo, etc. They are under /examples/bin • Bigtop puts examples/bin under /usr/share/doc/mahout. Is this correct? These are not documentation • Add documentation to wiki • Ticket filed
Oozie • Oozie runs, forget the error message, set to highest version
Flume/Flume NG • New patch check-in for Flume NG • Testing
Whirr • sudo apt-get install whirr • Run as: whirr launch-cluster --config /usr/lib/whirr/recipes/mahout-ec2.properties • If successful, you will see a directory under ~/.whirr • whirr.log • mvn clean install
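The success check described above (a directory appearing under ~/.whirr) can be scripted. A minimal sketch, assuming a POSIX shell; the function name `whirr_launch_ok` and the `WHIRR_HOME` override are hypothetical, introduced only so the check can be tried against a scratch directory.

```shell
# Sketch: report whether a Whirr launch left the state the slide describes.
# WHIRR_HOME is a hypothetical override; by default the check looks at ~/.whirr.
whirr_launch_ok() {
  whirr_home="${WHIRR_HOME:-$HOME/.whirr}"
  # A successful launch creates a directory under ~/.whirr.
  for entry in "$whirr_home"/*/; do
    if [ -d "$entry" ]; then
      echo "found cluster directory: ${entry%/}"
      return 0
    fi
  done
  # Nothing found: point the user at the log the slide mentions.
  echo "no cluster directory under $whirr_home; check whirr.log" >&2
  return 1
}
```

Run `whirr_launch_ok` after `whirr launch-cluster`; a non-zero exit status means no cluster directory was found, and whirr.log is the next place to look.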
Puppet • sudo apt-get install puppet facter fails