
Labs 2: Palabras


dylan

Presentation Transcript


  1. Labs 2: Palabras

  2. Palabras Architecture
     [Diagram: Master1, Master2, Master3 … MasterN; a Directory; Slave1, Slave2, Slave3 … SlaveN; Jobs1, Jobs2, Jobs3 … JobsM]

  3. Step 1: Get Started
     • Login:
       • Username: nombre\cc5212
       • Password on board
     • http://aidanhogan.com/teaching/cc5212-1/mdp-lab2.zip
     • C:/Program Files (x86)/eclipse/ (in Spanish)
     • File > Import > …
     • http://aidanhogan.com/teaching/cc5212-1/mdp-lab2-data/

  4. Step 2: Run Locally
     • ~600,000 abstracts
     • ~52,340,000 non-unique words
     • ~320 MB uncompressed
     • org.mdp.cli.RunWordCountLocally
     • Right Click > Run As > Run Configurations > Arguments
     • -i <path>/abstracts-es.txt.gz -igz -k 500
     • How long will it take? Will it even run?
     • -Xmx256M
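
The local run in Step 2 can be pictured with a minimal sketch like the one below. It is not the lab's org.mdp.cli.RunWordCountLocally, whose source is not shown in the slides; the class name, the whitespace tokenisation, the lower-casing, and the UTF-8 encoding are all assumptions made for illustration. Note that it keeps every distinct word in one in-memory map, which is exactly the kind of design a small heap limit such as -Xmx256M puts under pressure.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPInputStream;

// Hypothetical class name; a sketch, not the lab's implementation.
public class LocalWordCountSketch {

    /** Adds the words of one line to the running counts (whitespace splitting is an assumption). */
    public static void countLine(String line, Map<String, Integer> counts) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word.toLowerCase(), 1, Integer::sum);
            }
        }
    }

    /** Returns up to k entries, most frequent first; ties are broken alphabetically. */
    public static List<Map.Entry<String, Integer>> topK(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries.subList(0, Math.min(k, entries.size()));
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        if (args.length > 0) {
            // Read a gzipped text file line by line (UTF-8 is assumed).
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    new GZIPInputStream(Files.newInputStream(Paths.get(args[0]))),
                    StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    countLine(line, counts);
                }
            }
        } else {
            // Tiny built-in input so the sketch runs without the lab data.
            countLine("la casa y la calle", counts);
            countLine("La casa", counts);
        }
        for (Map.Entry<String, Integer> e : topK(counts, 500)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Passing the path of a gzipped abstracts file as the only argument counts that file; with no argument the sketch counts two hard-coded lines.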

  5. Step 3: Start the Directory
     • I start the directory!
     • vm116.dcc.uchile.cl (172.17.69.190)
     • Port 1985
     • Remind me to set heap-space

  6. Step 4: Prepare Slave
     • org.mdp.cli.StartWordCountSlave
     • Implement openDirectoryStub()
     • Add the slave’s name to the directory
     • Review the other code
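
The two tasks in Step 4 might look roughly like the sketch below, assuming the directory is a Java RMI registry (the slides mention a registry and a stub but do not show the code, so this is an assumption). The WordStore interface, the PrintingWordStore class, and the name "slave-demo" are hypothetical stand-ins, and an in-process registry on localhost takes the place of the real directory host.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Arrays;

// Hypothetical class name; a sketch assuming the directory is an RMI registry.
public class DirectoryStubSketch {

    /** Hypothetical remote interface standing in for the lab's real slave interface. */
    public interface WordStore extends Remote {
        void addWord(String word) throws RemoteException;
    }

    /** Trivial implementation used only for this demo. */
    static class PrintingWordStore implements WordStore {
        @Override
        public void addWord(String word) {
            System.out.println("got " + word);
        }
    }

    /** Obtains a reference to the registry; no connection is made until the first remote call. */
    public static Registry openDirectoryStub(String host, int port) throws RemoteException {
        return LocateRegistry.getRegistry(host, port);
    }

    public static void main(String[] args) throws Exception {
        int port = 1985;
        // Stand-in for the directory server: a registry started inside this JVM.
        Registry local = LocateRegistry.createRegistry(port);

        // Export the slave object so that remote callers can reach it.
        PrintingWordStore store = new PrintingWordStore();
        Remote stub = UnicastRemoteObject.exportObject(store, 0);

        // Open the directory and add the slave under its name.
        Registry directory = openDirectoryStub("localhost", port);
        directory.rebind("slave-demo", stub);
        System.out.println(Arrays.toString(directory.list()));

        // Unexport both objects so the JVM can exit.
        UnicastRemoteObject.unexportObject(store, true);
        UnicastRemoteObject.unexportObject(local, true);
    }
}
```

Using rebind rather than bind means that restarting a slave under the same name replaces the old entry instead of failing; whether the lab expects that behaviour is not stated in the slides.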

  7. Step 5: Run Slave
     • Build the .jar using build.xml (dist)
     • Open cmd and go to directory
     • java -jar -Xmx256M mdp-2.jar StartWordCountSlave -dn vm116.dcc.uchile.cl -dp 1985 -sn <username>

  8. Step 6: Prepare Master
     • org.mdp.cli.StartWordCountMaster
     • Connect to the directory
     • Get the list of slaves from the directory
     • Clear words from the slave for you
     • Choose a slave for each word
     • Send the add-words job to each slave
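
One possible way to "choose a slave for each word" is to hash the word and take the result modulo the number of slaves, so that every occurrence of a word lands on the same slave and its count ends up in one place. The slides do not say which strategy the lab uses, so the hashing scheme, the class name, and the method names below are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name; hash partitioning is an assumed strategy.
public class SlaveChooserSketch {

    /** Picks a slave by hashing the word; the same word always maps to the same slave. */
    public static String chooseSlave(String word, List<String> slaveNames) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(word.hashCode(), slaveNames.size());
        return slaveNames.get(index);
    }

    /** Groups the words into one batch per slave, ready to be sent as add-words jobs. */
    public static Map<String, List<String>> batchBySlave(List<String> words,
                                                         List<String> slaveNames) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (String name : slaveNames) {
            batches.put(name, new ArrayList<>());
        }
        for (String word : words) {
            batches.get(chooseSlave(word, slaveNames)).add(word);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> slaves = Arrays.asList("slave1", "slave2", "slave3");
        List<String> words = Arrays.asList("la", "casa", "y", "la", "calle");
        for (Map.Entry<String, List<String>> batch : batchBySlave(words, slaves).entrySet()) {
            System.out.println(batch.getKey() + " <- " + batch.getValue());
        }
    }
}
```

Batching the words per slave before sending keeps the number of remote calls close to the number of slaves rather than the number of words.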

  9. Step 7: Run Master
     • For small dataset!
     • org.mdp.cli.StartWordCountMaster
     • Right Click > Run As > Run Configurations > Arguments
     • -i <path>\es-abstracts-10k.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500

  10. Step 8: Run Big Master
      • For big dataset!
      • org.mdp.cli.StartWordCountMaster
      • Right Click > Run As > Run Configurations > Arguments
      • -i <path>\es-abstracts.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500

  11. Step 9: Run Distribution Locally
      • Start a directory server
        • Build and use the jar
        • java -jar mdp-2.jar StartRegistryAndServer -n localhost -p 1985 -r -s 1 -sp
      • Start 4 slaves (give different names) in four different CMD windows
        • Use the jar
        • java -jar mdp-2.jar StartSlave -dn localhost -dp 1985 -wn <usernameN>
      • Start a master
        • Can use Eclipse or jar (as preferred)
        • Point it to local directory
        • Use small file (large file if successful)
      • -Xmx256M

  12. Final Step: Teach Me Spanish
      • Ask me words in the top 500!
