awk - How to count occurrences no matter its case?
table
    chr10 10482 10484 0 11 + ca
    chr10 10486 10488 0 12 + ca
    chr10 10487 10489 0 13 + ca
    chr10 10490 10492 0 13 + ca
    chr10 10491 10493 0 12 + ct
    chr10 10494 10496 6.66667 15 + ca
    chr10 10495 10497 6.66667 15 + cc

I want to count the number of lines in which "ca" can be found in column 7, regardless of either of the 2 letters being in upper or lower case.
The desired output is 5.
The 2 commands below give empty output:

    cat table | awk ' $7 ==/^[Cc][Aa]/{++count} END {print count}'
    awk 'BEGIN {IGNORECASE = 1} $7==/"ca"/ {++count} END {print count}' table

The command below returns a value of 1:

    awk 'BEGIN {IGNORECASE = 1} END {if ($7=="ca"){++count} {print count}}' table

Note: my actual table is tens of millions of lines long, so I do not want to write the table to an intermediate file in order to count. (I need to repeat this task for other files too.)
There is a little problem in your syntax: it is either var == "string" or var ~ /regexp/, but you are mixing the two by writing var == /"string"/. Using the correct combination makes your commands work:

    $ awk '$7 ~ /^[Cc][Aa]/{++count} END {print count+0}' file
    5
    $ awk 'BEGIN {IGNORECASE = 1} $7=="ca" {++count} END {print count+0}' file
    5

Also, you may want to use toupper() (or tolower()) to check this, instead of using the IGNORECASE flag (which is specific to GNU awk):

    awk 'toupper($7) == "CA" {++count} END {print count+0}' file

Note the trick of printing count + 0 instead of just count. This way, the variable is cast to 0 if it wasn't set before. With this, the command prints 0 whenever there are no matches; if we just print count, it returns an empty string.
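To see the differences side by side, here is a small sketch you can paste into a shell. The sample rows (including the mixed-case values and the extra "cat" row) are made up for illustration and are not the asker's data. It uses tolower() rather than IGNORECASE so it should also run on awks other than gawk, and it adds a trailing $ to the regex, which is my own addition, to show that the unanchored pattern also counts longer values that merely start with "ca".

```shell
# Hypothetical sample data: column 7 holds mixed-case values.
tmp=$(mktemp)
printf '%s\n' \
  'chr10 10482 10484 0 11 + CA' \
  'chr10 10486 10488 0 12 + ca' \
  'chr10 10487 10489 0 13 + Ca' \
  'chr10 10491 10493 0 12 + CT' \
  'chr10 10495 10497 6.66667 15 + cat' > "$tmp"

# Exact, case-insensitive comparison on field 7.
n_exact=$(awk 'tolower($7) == "ca" {++count} END {print count+0}' "$tmp")

# Regex anchored at both ends: equivalent to the exact comparison.
n_regex=$(awk '$7 ~ /^[Cc][Aa]$/ {++count} END {print count+0}' "$tmp")

# Regex anchored only at the start: the "cat" row is counted as well.
n_prefix=$(awk '$7 ~ /^[Cc][Aa]/ {++count} END {print count+0}' "$tmp")

# No matching rows: count+0 prints 0, plain count prints an empty line.
n_zero=$(awk 'tolower($7) == "gg" {++count} END {print count+0}' "$tmp")
n_empty=$(awk 'tolower($7) == "gg" {++count} END {print count}' "$tmp")

echo "exact=$n_exact regex=$n_regex prefix=$n_prefix zero=$n_zero empty='$n_empty'"
rm -f "$tmp"
```

With this made-up input the last line prints exact=3 regex=3 prefix=4 zero=0 empty=''. Since awk reads the file as a stream, none of these variants needs an intermediate file, which matters for tables with tens of millions of lines.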