Loop through an array in Bash

The Bash shell is available on Unix, Linux, and OS X, and will soon be available on Microsoft Windows as well: Microsoft has announced support for Bash in the Windows 10 Anniversary Update, expected to ship in the summer of 2016. Bash supports arrays, a commonly used data type for storing collections of elements, and with its for loop you can iterate over the items in an array that you create.
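As a minimal illustration of those two features, the sketch below declares an array and loops over it. The array name projects and its values are hypothetical and are not taken from the script discussed later.

```shell
#!/usr/bin/env bash
# Hypothetical project names; note that one element contains a space.
projects=( "Alpha" "Beta Team" "Gamma" )

# "${projects[@]}" expands to one word per element, so "Beta Team" stays intact.
for p in "${projects[@]}"
do
   echo "Project: $p"
done

# ${#projects[@]} gives the number of elements in the array.
echo "Count: ${#projects[@]}"
```

The double quotes around the expansion are what keep an element containing a space together as a single item.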

For example, I have a comma-separated values (CSV) file that contains information on pending work requests. Each line in the file contains a "Project" field, and I want to count the number of pending requests per project. I can easily do so with a Bash script using utilities commonly found on Linux, Unix, and Apple's OS X systems.

# If no arguments appear on the command line, display usage information
if [ $# -eq 0 ]
then
   echo "Usage: ./count_requests filename.csv"
   exit 1
else
   fn=$1
fi

grep -v "ProjectName" "$fn" | cut -d"," -f8 | cut -d'"' -f 2 | sort | uniq -c | sort -n

# Provide a total number for all requests in the pending removal state
echo "-----"
grep -v "ProjectName" "$fn" | wc -l | tr -d ' '

I can run the script, count_requests, from a Bash shell prompt with ./count_requests filename, where filename is the location and name of the CSV file containing the data. The script checks that the file name argument was included on the command line and prints a usage message if it was omitted. Otherwise, the variable "fn" is set to the first argument on the command line.

Since the first line of the CSV file is a header in which one of the fields is "ProjectName", I filter out that line with grep, using the -v option to ignore any line containing "ProjectName". Because the fields are separated by commas, I pipe the file's contents, minus the header line, into the cut utility with the comma set as the delimiter via -d",". The project name is the 8th comma-separated field on a line, so -f8 extracts just that field. The project name is surrounded by double quotes in the CSV file, so I pipe the results into another cut command that sets the delimiter to the double quote character and outputs the 2nd field on the line, which is the project name without the surrounding quotes. I then use the sort utility to sort the lines alphabetically and pipe its output into the uniq command, which, with the -c option, collapses the duplicates of each project name and prefixes each unique name with the number of times it occurred. That will give me output like the following:

   5 IPAM
   8 MMOC
  15 MMS
   5 SDO
   4 Wind
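To make the stages of that pipeline concrete, here is a small sketch that runs them against a made-up CSV file. The file contents, the placeholder column names, and the variable names csv, names, and counts are all hypothetical; only the position of the project name in the 8th field follows the description above.

```shell
#!/usr/bin/env bash
# Build a tiny sample file (hypothetical data) with the project name in field 8.
csv=$(mktemp)
cat > "$csv" <<'EOF'
"ID","A","B","C","D","E","F","ProjectName"
"1","a","b","c","d","e","f","MMS"
"2","a","b","c","d","e","f","Wind"
"3","a","b","c","d","e","f","MMS"
EOF

# Drop the header, take the 8th comma-separated field, then strip the quotes
# by splitting on the double quote character and keeping the 2nd piece.
names=$(grep -v "ProjectName" "$csv" | cut -d"," -f8 | cut -d'"' -f 2)

# Sort so that duplicates are adjacent, count them, and trim the leading
# padding that uniq -c adds (its width differs between implementations).
counts=$(printf '%s\n' "$names" | sort | uniq -c | sed 's/^ *//')
echo "$counts"

rm -f "$csv"
```

Splitting on commas this way only works when none of the first eight fields contains a comma inside its quotes; a file with embedded commas would need a real CSV parser.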

Finally, I sort the above output from uniq numerically with sort -n, so that a line with a count of 15 is listed after any line with a count of 2. After the count for each project is output, I count the total number of requests by piping the output of the grep command that excludes the header line into the wc command, which counts the number of lines. I'll then have output similar to the following, which shows there are 330 pending work requests:

   1 CD Manager
   1 DSN
   1 EO1
   1 Enterprise Services
<text snipped>
   5 SD1
   8 MMOC
   9 MAVE
   9 SLAC
  11 IO PM
  12 DISCOVER
  15 MMS
  18 LADE
  21 TRMM
  26 MMOC
  54 GLASS
  86 IPnoc
-----
 330
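As a possible shortcut for the total, grep can count lines itself: the -c option prints the number of selected lines, and combining it with -v counts the lines that do not match. The sketch below tries this on a hypothetical three-row file; the file contents and the variable names csv and total_requests are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: one header line followed by three requests.
csv=$(mktemp)
cat > "$csv" <<'EOF'
"ID","A","B","C","D","E","F","ProjectName"
"1","a","b","c","d","e","f","MMS"
"2","a","b","c","d","e","f","Wind"
"3","a","b","c","d","e","f","MMS"
EOF

# -v selects the lines that do NOT contain "ProjectName";
# -c prints how many lines were selected instead of the lines themselves.
total_requests=$(grep -vc "ProjectName" "$csv")

echo "-----"
echo "$total_requests"

rm -f "$csv"
```

Because grep -c prints a bare number, there is no column padding to trim afterwards.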

Most of those pending requests are for internal projects rather than for external projects, however, and I'd like to know how many of the 330 requests are for internal projects. I can determine that number using Bash's "array" and "for loop" capabilities by adding the following code to the bottom of the above script:

# Internal requests

intrequests=( "CD Manager" "Enterprise Services" "IPAM" "IPnoc" "Section1" )

echo ""
echo "Internal Requests"
echo ""

for i in "${intrequests[@]}"
do :
   grep "$i" "$fn" | cut -d"," -f8 | cut -d'"' -f 2 | sort | uniq -c
done

The array is created by setting a variable, intrequests, to the list of values between the parentheses. I can then loop through that array with for i in "${intrequests[@]}", putting the steps that I want performed within the loop between do : and done. (The colon after do is Bash's null command; it does nothing and can be omitted.) The result will be additional lines of output where the counts are displayed just for the internal projects. I will see something like the following displayed:

Internal Requests

   1 CD Manager
   1 Enterprise Services
   5 IPAM
  86 IPnoc
   8 Section1
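The double quotes in "${intrequests[@]}" matter because some project names contain spaces. The following sketch, which uses a shortened version of the array and two hypothetical counter variables, compares how many times the loop body runs with and without the quotes. It assumes the default value of IFS.

```shell
#!/usr/bin/env bash
intrequests=( "CD Manager" "Enterprise Services" "IPnoc" )

# Quoted: one iteration per array element.
quoted=0
for i in "${intrequests[@]}"
do
   quoted=$((quoted + 1))
done

# Unquoted: each element is split on whitespace, so "CD Manager"
# becomes two separate words, "CD" and "Manager".
unquoted=0
for i in ${intrequests[@]}
do
   unquoted=$((unquoted + 1))
done

echo "quoted: $quoted, unquoted: $unquoted"
```

With the quotes the loop sees three project names; without them it sees five separate words, and a grep for "CD" or "Manager" alone would match the wrong lines.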

I'd also like the script to total the internal requests for me, though in this example I can add them fairly quickly in my head. To have the script perform the calculation, I read the file again and loop through the array again, but this time add the count for each internal project to a variable, total. Since the file will be at most a few hundred lines, reading it several times happens so quickly that I'm not concerned about optimizing the performance. The width of the count column that uniq -c prints varies between implementations, so rather than cutting a fixed range of columns I use awk '{print $1}' as the last operation to extract just the count for each element in the array intrequests, without the project name that follows it on the line.

# Display total count for internal requests

echo "-----"
total=0
for i in "${intrequests[@]}"
do :
   count=$(grep "$i" "$fn" | cut -d"," -f8 | cut -d'"' -f 2 | sort | uniq -c | awk '{print $1}')
   let "total = total + count"
done
echo "$total"

The output for the internal requests will then look like the following:

Internal Requests

   1 CD Manager
   1 Enterprise Services
   5 IPAM
  86 IPnoc
   8 Section1
-----
101
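One limitation of searching whole lines is that a project name appearing in some other field of a line is counted as well. A possible variation, which is not the approach used in the script above, is to extract the project column once and count exact matches within it. The sample data and the variable names csv and projects_col below are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; request 3 mentions IPnoc but belongs to MMS.
csv=$(mktemp)
cat > "$csv" <<'EOF'
"ID","A","B","C","D","E","F","ProjectName"
"1","a","b","c","d","e","f","IPnoc"
"2","a","b","c","d","e","f","IPnoc"
"3","IPnoc link","b","c","d","e","f","MMS"
"4","a","b","c","d","e","f","CD Manager"
EOF

intrequests=( "CD Manager" "IPnoc" )

# Read the file once and keep only the unquoted project names.
projects_col=$(grep -v "ProjectName" "$csv" | cut -d"," -f8 | cut -d'"' -f 2)

total=0
for i in "${intrequests[@]}"
do
   # -F treats the name as a fixed string, -x requires the whole line to
   # match, and -c prints the number of matching lines (0 if there are none).
   count=$(printf '%s\n' "$projects_col" | grep -cxF -- "$i")
   echo "$count $i"
   total=$((total + count))
done
echo "-----"
echo "$total"

rm -f "$csv"
```

Here request 3 is not counted for IPnoc even though the text "IPnoc" appears on that line, and a project with no requests contributes 0 rather than an empty string to the sum.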


 
