Programming Lab 1: Word Count
- Learn how to set up a development environment for Hadoop projects
- Run the “Word Count” application
- Create a new application that counts letters in text documents
Prerequisites
- Java 1.6
- Hadoop and log4j libraries
- NetBeans or Eclipse
Create Project and Link with Libraries
- Copy the provided libraries and Java code from the USB drive
- Create a project in NetBeans or Eclipse (specific instructions follow below)
- Link the project with the libraries
- Create a new class and copy in the provided code
- Modify the input and output directories
- Run the code and examine the result
Create Project with NetBeans
- Click File -> New Project
- Select the Java Application project type
- Set the main class to WordCount
- Name your project HadoopLab1WordCount
- Follow the on-screen instructions
- Open the provided source file in ProgLabs/lab1/original
- Copy everything except the package declaration
- Paste the code into the newly created class, right after the package declaration
Let’s link with the appropriate libraries
- Right-click the project name and select Properties
- Pick the Libraries option and click the Compile tab
- Click the Add JAR/Folder button
- Add everything from the ProgLabs/lib folder
- NetBeans will re-evaluate the dependencies; you should not see any errors at this point
- Adjust the input/output values in the run method to ProgLabs/lab1/<<input|output>> accordingly
- Right-click and run!
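A practical note when running the job more than once: Hadoop jobs that write file output typically refuse to start if the output directory already exists. The following is a minimal plain-Java sketch of clearing that directory between runs. The class name OutputCleanup and the method deleteRecursively are hypothetical helpers, not part of the provided lab code, and the directory to delete is passed in as an argument.

```java
import java.io.File;

// Sketch: remove a previous output directory so the job can be re-run.
// OutputCleanup and deleteRecursively are illustrative names only.
class OutputCleanup {

    // Recursively deletes a file or directory.
    // Returns true if the path no longer exists afterwards.
    static boolean deleteRecursively(File path) {
        File[] children = path.listFiles(); // null for plain files or missing paths
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return !path.exists() || path.delete();
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Usage: java OutputCleanup <output-directory>");
            return;
        }
        File output = new File(args[0]);
        System.out.println("Output directory cleared: " + deleteRecursively(output));
    }
}
```

Deleting the directory by hand in a file manager works just as well; the sketch only shows what that step amounts to.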
Create Project with Eclipse
- Create a new project: click File -> New Project
- Name the project HadoopLab1WordCount
- Select Java 1.6
- Click Next and add the libraries from ProgLabs/lib
- Create a new class named WordCount
- Right-click the project and select New -> Class
- Copy the code from Labs/ProgLabs/lab4/original into the new class
- Right-click and run it!
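To make it easier to follow what the job computes when it runs, here is a rough plain-Java imitation of the word count flow: a map step that emits each word, a grouping step that stands in for Hadoop's shuffle, and a reduce step that sums the ones. It does not use any Hadoop classes, the names WordCountSketch, map, reduce and run are made up for this illustration, and the input lines are hypothetical rather than the lab's input file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the word count job; none of these names are Hadoop APIs.
class WordCountSketch {

    // "Map" step: emit one entry per word in a line (each stands for the pair (word, 1)).
    static List<String> map(String line) {
        List<String> words = new ArrayList<String>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // "Reduce" step: sum the 1s collected for a single word.
    static int reduce(List<Integer> ones) {
        int sum = 0;
        for (int one : ones) {
            sum += one;
        }
        return sum;
    }

    // Driver: group emitted values by word (the "shuffle"), then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : lines) {
            for (String word : map(line)) {
                List<Integer> ones = grouped.get(word);
                if (ones == null) {
                    ones = new ArrayList<Integer>();
                    grouped.put(word, ones);
                }
                ones.add(1);
            }
        }
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            counts.put(e.getKey(), reduce(e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input lines, not the lab's input file.
        List<String> lines = Arrays.asList("Steve Tracy Adam", "Tracy Steve Kim");
        for (Map.Entry<String, Integer> e : run(lines).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The sorted map is used so that words come out in alphabetical order, which resembles the key ordering seen in a job's output file.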
Let’s examine the output directory
You will see two files: one indicating whether the MapReduce job succeeded and one containing the job’s output.
Result
Adam 1
Brandon 1
Graig 1
Kim 1
Marty 1
Mike 1
Nancy 1
Nick 1
Nishani 1
Steve 2
Tracy 2
Vidur 1
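If you want to check a result like the one above programmatically, a small parser is enough. The sketch below assumes each output line holds a word and a count separated by whitespace (a tab is the usual default separator for Hadoop text output) and that words contain no whitespace. ResultParser, parse and total are hypothetical names, and the sample lines are an excerpt of the result listed above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: turn "word<whitespace>count" lines into a map; not part of the lab code.
class ResultParser {

    // Parses each line into (word, count); blank or malformed lines are skipped.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2) {
                continue; // assumption: exactly one word and one count per line
            }
            counts.put(parts[0], Integer.parseInt(parts[1]));
        }
        return counts;
    }

    // Sums all counts, i.e. the total number of words the job saw.
    static int total(Map<String, Integer> counts) {
        int sum = 0;
        for (int n : counts.values()) {
            sum += n;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Excerpt of the result shown above.
        List<String> lines = Arrays.asList("Adam\t1", "Steve\t2", "Tracy\t2");
        Map<String, Integer> counts = parse(lines);
        System.out.println(counts + " total=" + total(counts));
    }
}
```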
Exercise: Let’s Count Letters!
- Modify the word count application to count the letters in the document
- Create another class that implements Reducer and switch the application to use it in the run method
- Hint:
String ch = String.valueOf(line.charAt(i));
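To show how the hint fits into a loop, here is a small plain-Java sketch that counts letters in a single line. It is not the exercise solution and uses no Hadoop classes: in the real job, each `ch` would be emitted as a key with a count of 1 instead of being tallied locally. LetterCountSketch and countLetters are hypothetical names, and skipping non-letter characters and treating upper and lower case as different letters are assumptions made for this example.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: per-letter tally of one line, built around the hint above.
class LetterCountSketch {

    static Map<String, Integer> countLetters(String line) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (int i = 0; i < line.length(); i++) {
            if (!Character.isLetter(line.charAt(i))) {
                continue; // assumption: ignore spaces, digits and punctuation
            }
            String ch = String.valueOf(line.charAt(i)); // the line from the hint
            Integer old = counts.get(ch);
            counts.put(ch, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample text.
        System.out.println(countLetters("Steve and Tracy"));
    }
}
```

If case should not matter, calling `toLowerCase()` on the line before the loop would merge upper- and lower-case letters into one count.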