Apache Pig - Count the number of vowels in a file


Can anyone help me with this? Thanks very much. Here is my code:

g = LOAD 'input.txt' AS (line:chararray);
b = FOREACH g GENERATE FLATTEN(STRSPLIT(LOWER(line), '(?<=.)(?=.)')) AS s:chararray;
c = FOREACH b GENERATE FLATTEN(TOBAG(*)) AS letter;
result = FILTER c BY (letter == 'a' OR letter == 'e' OR letter == 'i' OR letter == 'o' OR letter == 'u');
e = GROUP result BY letter;
f = FOREACH e GENERATE group, COUNT(result);
DUMP f;

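First tokenize each line into words, then split the words into characters. Use REPLACE to slice the words into characters: it inserts a delimiter around every character. Instead of using TOBAG(*), use TOKENIZE to split the characters along the inserted delimiter. Filter for a, e, i, o, u, then group by character and get the counts.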

Pig script

a = LOAD 'test4.txt' AS (line:chararray);
b = FOREACH a GENERATE FLATTEN(TOKENIZE(line)) AS words;
c = FOREACH b GENERATE FLATTEN(TOKENIZE(REPLACE(LOWER(words), '', '|'), '|')) AS letter;
d = FILTER c BY (letter == 'a' OR letter == 'e' OR letter == 'i' OR letter == 'o' OR letter == 'u');
e = GROUP d BY letter;
f = FOREACH e GENERATE group AS letter, COUNT(d.letter) AS total;
DUMP f;
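The vowel filter can also be written more compactly. The sketch below is a variant of the same pipeline, not part of the original answer: it assumes the same input file name, 'test4.txt', and replaces the five equality tests with Pig's MATCHES operator, which compares the whole field against a Java regular expression.

```pig
-- Variant sketch: same steps, with a regex filter instead of chained OR tests.
a = LOAD 'test4.txt' AS (line:chararray);
-- Split each line into words.
b = FOREACH a GENERATE FLATTEN(TOKENIZE(line)) AS words;
-- Insert '|' around every character, then split on '|' to get one letter per row.
c = FOREACH b GENERATE FLATTEN(TOKENIZE(REPLACE(LOWER(words), '', '|'), '|')) AS letter;
-- MATCHES must match the entire field, so '[aeiou]' keeps single-vowel rows only.
d = FILTER c BY letter MATCHES '[aeiou]';
e = GROUP d BY letter;
f = FOREACH e GENERATE group AS letter, COUNT(d) AS total;
DUMP f;
```

Since LOWER is applied before the split, uppercase vowels in the input are counted together with their lowercase forms.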


