mapper.sh

    #!/bin/sh
    # mapper.sh: print the number of lines read from stdin
    wc -l

reducer.sh

    #!/bin/sh
    # reducer.sh: echo each per-map count, then print their sum
    sum=0
    while read -r i
    do
            # POSIX arithmetic; `let` is not available in every /bin/sh
            sum=$((sum + i))
            echo "$i"
    done
    echo "$sum"
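Under Hadoop Streaming's default text protocol, a reducer receives each record as a key, a tab, and a value. Because the mapper here prints a bare number, the value is empty and the line may end in a tab. The sketch below, which wraps the reducer logic in a hypothetical shell function named reducer and feeds it made-up counts, suggests that `read` discards that trailing tab so the sum is unaffected.

```shell
#!/bin/sh
# Hypothetical function form of reducer.sh, used only for this illustration.
reducer() {
    sum=0
    # read splits on IFS (space, tab, newline); anything after the first
    # field, including an empty value after the tab, lands in _rest.
    while read -r count _rest; do
        sum=$((sum + count))
        echo "$count"
    done
    echo "$sum"
}

# Made-up input: two counts, each followed by a tab and an empty value.
out=$(printf '3\t\n4\t\n' | reducer)
echo "$out"
```

The two counts are echoed back and the last line printed is their sum.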
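The map step counts the lines in its input split, and the reduce step adds the per-split counts together. The pair can be tested locally, without Hadoop:
cat aa.txt | ./mapper.sh | sort | ./reducer.sh
With a single mapper, the output is that mapper's count, echoed by the reducer, followed by the total: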

    49724
    49724
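The local test can be reproduced end to end on a small generated file. This is a minimal sketch: the temporary directory and the 100-line sample.txt are assumptions standing in for aa.txt, and the scripts are written out with a POSIX-only reducer so they run under any /bin/sh.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing file is overwritten.
workdir=$(mktemp -d)

cat > "$workdir/mapper.sh" <<'EOF'
#!/bin/sh
wc -l
EOF

cat > "$workdir/reducer.sh" <<'EOF'
#!/bin/sh
sum=0
while read -r i; do
    sum=$((sum + i))
    echo "$i"
done
echo "$sum"
EOF

# Both scripts must be executable to be called as ./mapper.sh and ./reducer.sh.
chmod +x "$workdir/mapper.sh" "$workdir/reducer.sh"

# Hypothetical 100-line stand-in for aa.txt.
awk 'BEGIN { for (n = 1; n <= 100; n++) print n }' > "$workdir/sample.txt"

# Same pipeline shape as the local test: map, sort, reduce.
result=$(cat "$workdir/sample.txt" | "$workdir/mapper.sh" | sort | "$workdir/reducer.sh")
echo "$result"

rm -rf "$workdir"
```

As with aa.txt, the single mapper's count appears twice: once echoed by the reducer and once as the total.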
Running the same computation through Hadoop Streaming:
hadoop jar /home/hadoop/hadoop/hadoop-2.6.0/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar \
    -mapper mapper.sh -reducer reducer.sh \
    -input /tmp/datalook -output /shelltest \
    -file /home/hadoop/datacreate/mapper.sh \
    -file /home/hadoop/datacreate/reducer.sh
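When the job is rerun, Hadoop refuses to start if the output directory already exists. The wrapper below is one possible way to handle that, not part of the original setup: the function name run_job and the variable names are hypothetical, the paths are copied from the command above, and the script deletes any previous /shelltest result before submitting.

```shell
#!/bin/sh
# Paths copied from the command above; adjust them for another cluster.
STREAMING_JAR=/home/hadoop/hadoop/hadoop-2.6.0/share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar
SCRIPT_DIR=/home/hadoop/datacreate
INPUT_DIR=/tmp/datalook
OUTPUT_DIR=/shelltest

run_job() {
    # Remove any earlier result first. -f suppresses the error when the
    # directory does not exist. This deletes data, so check OUTPUT_DIR.
    hdfs dfs -rm -r -f "$OUTPUT_DIR"
    hadoop jar "$STREAMING_JAR" \
        -mapper mapper.sh -reducer reducer.sh \
        -input "$INPUT_DIR" -output "$OUTPUT_DIR" \
        -file "$SCRIPT_DIR/mapper.sh" \
        -file "$SCRIPT_DIR/reducer.sh"
}

# Only submit the job where the Hadoop client commands are installed.
if command -v hadoop >/dev/null 2>&1 && command -v hdfs >/dev/null 2>&1; then
    if run_job; then
        job_status=ran
    else
        job_status=failed
    fi
else
    job_status=skipped
fi
echo "job_status=$job_status"
```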
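The result is shown below. /tmp/datalook holds far more lines than aa.txt; the larger input was chosen so that several map tasks run and their individual counts are visible.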

点击(此处)折叠或打开

  1. [hadoop@master datacreate]$ hdfs dfs -cat /shelltest/part-00000
  2. 12201611    
  3. 12201611    
  4. 12201611    
  5. 12201611    
  6. 12201611    
  7. 12201611    
  8. 12201611    
  9. 12201611    
  10. 12201611    
  11. 12201611    
  12. 12201611    
  13. 12201611    
  14. 12201611    
  15. 12201611    
  16. 12201612    
  17. 12201612    
  18. 12201612    
  19. 12201612    
  20. 12201612    
  21. 12201612    
  22. 12201612    
  23. 12201612    
  24. 12201612    
  25. 12201612    
  26. 12201612    
  27. 12201612    
  28. 12201612    
  29. 12201612    
  30. 12201612    
  31. 12201612    
  32. 12201612    
  33. 12201612    
  34. 12201612    
  35. 12201612    
  36. 12201612    
  37. 12201612    
  38. 12201612    
  39. 12201612    
  40. 12201612    
  41. 12201612    
  42. 9175534    
  43. 497240000
The 41 per-map counts add up to the total of 497240000 lines.
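The total can be checked by hand from the listing above. A small sketch, with the three distinct counts and how often each occurs read off the part-00000 output (the variable names are only for illustration):

```shell
#!/bin/sh
# From part-00000: 14 maps reported 12201611 lines, 26 reported 12201612,
# and the final map reported 9175534.
maps=$((14 + 26 + 1))
total=$((14 * 12201611 + 26 * 12201612 + 9175534))
echo "maps=$maps total=$total"
```

This prints maps=41 total=497240000, matching the last line written by the reducer.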

File block distribution

    [hadoop@master datacreate]$ hadoop fsck /tmp/datalook/bb.txt -files -blocks -racks
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    Connecting to namenode via http://master:50070
    FSCK started by hadoop (auth:SIMPLE) from /172.27.28.203 for path /tmp/datalook/bb.txt at Tue Jul 28 05:06:46 EDT 2015
    /tmp/datalook/bb.txt 5469640000 bytes, 41 block(s): OK
    0. BP-617219782-172.27.28.203-1437641470957:blk_1073742111_1287 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    1. BP-617219782-172.27.28.203-1437641470957:blk_1073742112_1288 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    2. BP-617219782-172.27.28.203-1437641470957:blk_1073742113_1289 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    3. BP-617219782-172.27.28.203-1437641470957:blk_1073742114_1290 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    4. BP-617219782-172.27.28.203-1437641470957:blk_1073742115_1291 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    5. BP-617219782-172.27.28.203-1437641470957:blk_1073742116_1292 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    6. BP-617219782-172.27.28.203-1437641470957:blk_1073742117_1293 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    7. BP-617219782-172.27.28.203-1437641470957:blk_1073742118_1294 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    8. BP-617219782-172.27.28.203-1437641470957:blk_1073742119_1295 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    9. BP-617219782-172.27.28.203-1437641470957:blk_1073742120_1296 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    10. BP-617219782-172.27.28.203-1437641470957:blk_1073742121_1297 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    11. BP-617219782-172.27.28.203-1437641470957:blk_1073742122_1298 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    12. BP-617219782-172.27.28.203-1437641470957:blk_1073742123_1299 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    13. BP-617219782-172.27.28.203-1437641470957:blk_1073742124_1300 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    14. BP-617219782-172.27.28.203-1437641470957:blk_1073742125_1301 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    15. BP-617219782-172.27.28.203-1437641470957:blk_1073742126_1302 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    16. BP-617219782-172.27.28.203-1437641470957:blk_1073742127_1303 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    17. BP-617219782-172.27.28.203-1437641470957:blk_1073742128_1304 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    18. BP-617219782-172.27.28.203-1437641470957:blk_1073742129_1305 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    19. BP-617219782-172.27.28.203-1437641470957:blk_1073742130_1306 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    20. BP-617219782-172.27.28.203-1437641470957:blk_1073742131_1307 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    21. BP-617219782-172.27.28.203-1437641470957:blk_1073742132_1308 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    22. BP-617219782-172.27.28.203-1437641470957:blk_1073742133_1309 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    23. BP-617219782-172.27.28.203-1437641470957:blk_1073742134_1310 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    24. BP-617219782-172.27.28.203-1437641470957:blk_1073742135_1311 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    25. BP-617219782-172.27.28.203-1437641470957:blk_1073742136_1312 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    26. BP-617219782-172.27.28.203-1437641470957:blk_1073742137_1313 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    27. BP-617219782-172.27.28.203-1437641470957:blk_1073742138_1314 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    28. BP-617219782-172.27.28.203-1437641470957:blk_1073742139_1315 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    29. BP-617219782-172.27.28.203-1437641470957:blk_1073742140_1316 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    30. BP-617219782-172.27.28.203-1437641470957:blk_1073742141_1317 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    31. BP-617219782-172.27.28.203-1437641470957:blk_1073742142_1318 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    32. BP-617219782-172.27.28.203-1437641470957:blk_1073742143_1319 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    33. BP-617219782-172.27.28.203-1437641470957:blk_1073742144_1320 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    34. BP-617219782-172.27.28.203-1437641470957:blk_1073742145_1321 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    35. BP-617219782-172.27.28.203-1437641470957:blk_1073742146_1322 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    36. BP-617219782-172.27.28.203-1437641470957:blk_1073742147_1323 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    37. BP-617219782-172.27.28.203-1437641470957:blk_1073742148_1324 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    38. BP-617219782-172.27.28.203-1437641470957:blk_1073742149_1325 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    39. BP-617219782-172.27.28.203-1437641470957:blk_1073742150_1326 len=134217728 repl=1 [/default-rack/172.27.28.205:50010]
    40. BP-617219782-172.27.28.203-1437641470957:blk_1073742151_1327 len=100930880 repl=1 [/default-rack/172.27.28.205:50010]

    Status: HEALTHY
     Total size:5469640000 B
     Total dirs:0
     Total files:1
     Total symlinks:0
     Total blocks (validated):41 (avg. block size 133405853 B)
     Minimally replicated blocks:41 (100.0 %)
     Over-replicated blocks:0 (0.0 %)
     Under-replicated blocks:0 (0.0 %)
     Mis-replicated blocks:0 (0.0 %)
     Default replication factor:1
     Average block replication:1.0
     Corrupt blocks:0
     Missing replicas:0 (0.0 %)
     Number of data-nodes:1
     Number of racks:1
    FSCK ended at Tue Jul 28 05:06:46 EDT 2015 in 4 milliseconds

    The filesystem under path '/tmp/datalook/bb.txt' is HEALTHY
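The fsck figures line up with the per-map counts. The sketch below works through the arithmetic under two assumptions that the source does not state outright: that one map task runs per HDFS block, and that every line has the same length. The variable names are illustrative only.

```shell
#!/bin/sh
file_bytes=5469640000    # file size reported by fsck
block_bytes=134217728    # len= of every full block (128 MiB)
total_lines=497240000    # total printed by the reducer

# Number of blocks: full blocks plus one partial block for the remainder.
full_blocks=$((file_bytes / block_bytes))
last_block_bytes=$((file_bytes % block_bytes))
blocks=$(( (file_bytes + block_bytes - 1) / block_bytes ))

# Bytes per line, assuming fixed-length lines (newline included).
bytes_per_line=$((file_bytes / total_lines))
leftover=$((file_bytes % total_lines))

# Whole lines that fit in one full block.
lines_per_full_block=$((block_bytes / bytes_per_line))

echo "full_blocks=$full_blocks last_block_bytes=$last_block_bytes blocks=$blocks"
echo "bytes_per_line=$bytes_per_line leftover=$leftover"
echo "lines_per_full_block=$lines_per_full_block"
```

The result is 40 full blocks plus a final block of 100930880 bytes, 41 in total, which matches both the fsck listing and the 41 counts in part-00000. The file size divides evenly into 11 bytes per line, and a 128 MiB block holds 12201611 whole lines with a few bytes to spare. That is consistent with each full-block map reporting either 12201611 or 12201612 lines, depending on where the line straddling the block boundary is counted.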

09-25 11:36