awk - How to count occurrences no matter its case?
table
chr10 10482 10484 0 11 + ca
chr10 10486 10488 0 12 + ca
chr10 10487 10489 0 13 + ca
chr10 10490 10492 0 13 + ca
chr10 10491 10493 0 12 + ct
chr10 10494 10496 6.66667 15 + ca
chr10 10495 10497 6.66667 15 + cc
I want to count the number of lines in which column 7 contains "ca", regardless of whether the 2 letters are in upper or lower case.
The desired output is 5.
The 2 commands below give empty output:
cat table | awk ' $7 ==/^[cC][aA]/{++count} END {print count}'
awk 'BEGIN {IGNORECASE = 1} $7==/"ca"/ {++count} END {print count}' table
The command below returns a value of 1:
awk 'BEGIN {IGNORECASE = 1} END {if ($7=="ca"){++count} {print count}}' table
Note: the actual table is tens of millions of lines long, and I do not want to write an intermediate table in order to count. (I need to repeat this task on other files too.)
There is a little problem in the syntax: you either say var == "string" or var ~ regexp, but your commands say var == /"string"/, which compares with == against a regexp literal. Using the correct combination makes the command work:
$ awk '$7 ~ /^[cC][aA]/{++count} END {print count+0}' file
5
$ awk 'BEGIN {IGNORECASE = 1} $7=="ca" {++count} END {print count+0}' file
5
Also, you may want to use toupper() (or tolower()) for this check instead of the IGNORECASE flag, which is specific to GNU awk:
awk 'toupper($7) == "CA" {++count} END {print count+0}' file
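To see the case normalisation at work, here is a minimal sketch run against a small sample. The sample rows and the `sample` temporary file are hypothetical, made up for illustration with mixed case in column 7; they are not the table from the question.

```shell
# Hypothetical sample: column 7 appears in mixed case (not the original table).
sample=$(mktemp)
printf '%s\n' \
  'chr10 10482 10484 0 11 + CA' \
  'chr10 10486 10488 0 12 + ca' \
  'chr10 10487 10489 0 13 + Ca' \
  'chr10 10491 10493 0 12 + CT' \
  'chr10 10495 10497 6.66667 15 + cc' > "$sample"

# Upper-case column 7, then compare it with a plain upper-case string.
count=$(awk 'toupper($7) == "CA" {++count} END {print count+0}' "$sample")
echo "$count"   # 3 for the sample above: CA, ca and Ca match

rm -f "$sample"
```

Since awk reads the file in a single pass and keeps only the counter, the same command should suit very large inputs without any intermediate file.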
Note the trick of printing count + 0 instead of count. This way, the variable is cast to 0 if it wasn't set before. With this, awk prints 0 whenever there are no matches; if you print count alone, it returns an empty string.
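The difference is easiest to see when nothing matches. The following is a small sketch with a made-up one-line input and a made-up search value "zz"; the variable names `without_cast` and `with_cast` are only for illustration.

```shell
# No line has "zz" in column 7, so count is never assigned in either run.
without_cast=$(printf 'a b c d e f ca\n' | awk '$7 == "zz" {++count} END {print count}')
with_cast=$(printf 'a b c d e f ca\n' | awk '$7 == "zz" {++count} END {print count+0}')

echo "without +0: [$without_cast]"   # the brackets are empty
echo "with +0: [$with_cast]"         # the brackets contain 0
```

Adding 0 forces a numeric context, so an uninitialised variable is printed as 0 rather than as the empty string.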