A Bunch of Linux Baloney: Using awk to work on data files

Monday, May 20, 2013

To average the numbers in a file, we can use awk like this:

$ cat meyer-heavy.txt | awk '{ sum+=$1;count++ } END {print sum/count}'
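The one-liner above also counts blank lines and divides by zero on an empty file. Here is a rough sketch of a more defensive variant, wrapped in a shell function so it can be reused. The name avg is made up for illustration, and it assumes every non-blank line starts with a number:

```shell
# avg: hypothetical helper name; averages column 1 of whatever arrives on stdin
avg() {
    awk 'NF  { sum += $1; count++ }     # NF is 0 on blank lines, so they are skipped
         END { if (count) print sum / count; else print "no data" }'
}
```

It should work the same way as before: $ cat meyer-heavy.txt | avg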

To find the largest value:

~~$ cat meyer-heavy.txt | awk 'BEGIN{max=-9999} $1 > max {print $1; max=$1}'~~

Improved version (only prints largest value, not intermediate ones):

$ cat meyer-heavy.txt | awk 'BEGIN{max=-9999} $1 > max {max=$1} END{print max}'

To find the smallest value:

~~$ cat meyer-heavy.txt | awk 'BEGIN{min=9999} $1 < min { print $1; min=$1 }'~~

Fixed version (skips blank lines so they are not taken as the min value, and accepts positive as well as negative numbers):

$ cat meyer-heavy.txt | awk 'BEGIN{min=9999} $1 ~ /^-?[0-9]/ && $1 < min { min=$1 } END{print min}'
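The commands above start from sentinel values (-9999 and 9999), so they give the wrong answer if every number in the file lies beyond the sentinel. One possible way around that, sketched here under the assumption that every non-blank line starts with a number, is to seed min and max from the first value seen. The function name minmax is just an illustration:

```shell
# minmax: hypothetical helper name; prints "min max" for column 1 of stdin
minmax() {
    awk 'NF {
             v = $1 + 0                       # force a numeric comparison
             if (!seen || v < min) min = v    # first value seeds min
             if (!seen || v > max) max = v    # first value seeds max
             seen = 1
         }
         END { if (seen) print min, max }'
}
```

Usage would look like: $ cat meyer-heavy.txt | minmax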

Here $1 refers to column 1; use $2 for column 2, and so on.
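Instead of editing the awk program each time, the column number can be passed in with awk's -v option. This is only a sketch: colavg is a made-up name, and it assumes whitespace-separated columns holding numbers.

```shell
# colavg: hypothetical helper; usage: colavg COLUMN [FILE...]
colavg() {
    col=$1; shift
    # $col inside the single quotes is awk's field reference, not a shell variable
    awk -v col="$col" 'NF >= col { sum += $col; count++ }   # skip lines too short to have that column
                       END { if (count) print sum / count }' "$@"
}
```

For example, $ colavg 2 meyer-heavy.txt would average the second column. Lines with fewer fields than the requested column are ignored.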
