Monday, May 20, 2013

Using awk to work on data files

To average the numbers in the first column of a file, we can use awk like so:
$ cat meyer-heavy.txt | awk '{ sum+=$1;count++ } END {print sum/count}'
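The one-liner above counts every line, so blank lines are averaged in as zeros, and an empty file typically makes awk abort with a division-by-zero error. Here is a minimal sketch that guards against both. The file sample.txt and the function name average are made up for illustration; they stand in for meyer-heavy.txt and whatever you would call it yourself.

```shell
# Hypothetical sample data standing in for meyer-heavy.txt (note the blank line).
printf '%s\n' 4 8 '' 6 > sample.txt

# average FILE: mean of column 1, ignoring blank lines.
average() {
    # NF is the number of fields on a line; it is 0 for blank lines, so they are skipped.
    # The END block only divides when at least one value was counted.
    awk 'NF { sum += $1; count++ } END { if (count > 0) print sum / count }' "$1"
}

average sample.txt
```

With the sample data this prints 6, since only the three non-empty lines are counted.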
To find the largest value (this prints every new maximum as it is found, so the last line printed is the largest):
$ cat meyer-heavy.txt | awk 'BEGIN{max=-9999} $1 > max {print $1; max=$1}' 
Improved version (only prints largest value, not intermediate ones): 
$ cat meyer-heavy.txt | awk 'BEGIN{max=-9999} $1 > max {max=$1} END{print max}'
To find the smallest value (this prints every new minimum as it is found, and a blank line can be mistaken for the minimum):
$ cat meyer-heavy.txt | awk 'BEGIN{min=9999} $1 < min { print $1; min=$1 }' 
Fixed version (skips blank lines so they are not taken as the minimum, and prints only the final value):
$ cat meyer-heavy.txt | awk 'BEGIN{min=9999} $1 ~ /^-?[0-9]/ && $1 < min { min=$1 } END{print min}'
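The starting values 9999 and -9999 only work while the data stays inside that range. One way around that, sketched here as an assumption rather than a definitive recipe, is to seed both min and max from the first non-empty line. The file big.txt and the function name minmax are made up for the example.

```shell
# Hypothetical sample data: every value is above 9999, which would defeat a 9999 starting value.
printf '%s\n' 12000 '' 15000 10500 > big.txt

# minmax FILE: print the smallest and largest value of column 1 on one line.
minmax() {
    awk 'NF {
        v = $1 + 0                                # adding 0 forces a numeric comparison
        if (!seen) { min = max = v; seen = 1 }    # seed from the first non-empty line
        if (v < min) min = v
        if (v > max) max = v
    }
    END { if (seen) print min, max }' "$1"
}

minmax big.txt
```

With the sample data this prints 10500 15000.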
Here $1 refers to column 1; use $2 for column 2, and so on.
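Instead of editing the program for each column, the column number can be passed in as an awk variable with -v. A small sketch, where cols.txt, the variable name col and the function name colavg are all made up for illustration:

```shell
# Hypothetical two-column sample data.
printf '%s\n' '1 10' '2 20' '3 60' > cols.txt

# colavg FILE COLUMN: mean of the given column.
colavg() {
    # Inside awk, $col means "the field whose number is stored in col".
    # Lines with fewer than col fields (including blank lines) are skipped.
    awk -v col="$2" 'NF >= col { sum += $col; count++ }
                     END { if (count > 0) print sum / count }' "$1"
}

colavg cols.txt 1
colavg cols.txt 2
```

With the sample data this prints 2 for column 1 and 30 for column 2.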
