java - Multiple Input Files In one Mapper Class Hadoop?
I am trying to code the FP-tree algorithm in the MapReduce paradigm. While creating the frequent item set list, I have the following problem:
input:
file1.txt (contains transactions)
123 452 221 12 45 76 987 77 76 123 354 [each line contains items bought in 1 transaction]
file2.txt (contains the items bought, in descending order of count)
[count] [item id]
12 123
6 221
5 77
4 354
output:
output.txt
123 221
123 77 354
[2nd transaction eliminated]
Only the items listed in file2.txt are kept, in descending order of count; all other items are deleted.
Is it possible to take both file1.txt and file2.txt in one mapper class? That would solve the problem.
Or is there another way to perform this operation?
Any help is appreciated.
Look at a MapReduce distributed cache example. This one may be helpful:
http://myhadoopexamples.com/2014/04/16/hadoop-map-side-join-with-distributed-cache-example/
Read the small file (file2.txt) in the mapper's setup method, then use it while mapping each line of file1.txt. The above link gives guidance.
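To make the idea concrete, here is a minimal sketch of the per-transaction logic in plain Java, without the Hadoop wiring. The class name FrequentItemFilter and both method names are hypothetical, and the sample assumes file1.txt splits into three transactions (123 452 221 / 12 45 76 987 / 77 76 123 354), which the question does not state explicitly. In a real job, loadCounts would probably be called once from Mapper.setup() on the cached copy of file2.txt, and filterTransaction once per input line in map().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class: plain Java, no Hadoop dependency.
class FrequentItemFilter {

    // Parse "[count] [item id]" lines (the layout of file2.txt) into item -> count.
    // Assumption: in a Hadoop Mapper this would run once in setup(), on a file
    // registered in the driver with job.addCacheFile(...) and located through
    // context.getCacheFiles().
    static Map<String, Integer> loadCounts(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2) {
                counts.put(parts[1], Integer.parseInt(parts[0]));
            }
        }
        return counts;
    }

    // Keep only the items present in counts, ordered by descending count.
    // Returns an empty string when no item survives, so the caller can skip
    // the transaction instead of writing it out.
    static String filterTransaction(String transaction, Map<String, Integer> counts) {
        List<String> kept = new ArrayList<>();
        for (String item : transaction.trim().split("\\s+")) {
            if (counts.containsKey(item)) {
                kept.add(item);
            }
        }
        // Stable sort: items with equal counts keep their input order.
        kept.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                loadCounts(Arrays.asList("12 123", "6 221", "5 77", "4 354"));
        // Assumed line split of file1.txt.
        List<String> transactions =
                Arrays.asList("123 452 221", "12 45 76 987", "77 76 123 354");
        for (String t : transactions) {
            String filtered = filterTransaction(t, counts);
            if (!filtered.isEmpty()) {
                System.out.println(filtered); // prints "123 221", then "123 77 354"
            }
        }
    }
}
```

This only works well if file2.txt is small enough to fit in each mapper's memory, which is the usual condition for a map-side join through the distributed cache.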