
Tools of the Trade

Shell scripts often have to analyze information in a configuration file or a text-based database file. The shell does not have built-in features for parsing such data, so shell scripts rely on UNIX commands to parse and transform data from one format to another. In this lecture, you will learn about five of these commands, in addition to sed and grep, which we covered in the last lecture.


cut, paste, tr, sort, and uniq

Objectives

  1. Use commands for extracting fields from a database file: cut
  2. Use commands for combining columns of data into a multicolumn file: paste
  3. Use the tr command for translating files from one format to another.
  4. Use the sort and uniq commands for sorting data and removing duplicate entries.
The cut command
Extracts one or more fields from a database file.
Good for pulling account information out of the /etc/passwd file:
cut -d: -f1,5 /etc/passwd
or extracting columns of data from the output of another command, where n and m are the first and last character positions to keep:
ls -l | cut -cn-m
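As a minimal sketch of both forms, the following uses a small made-up file laid out like /etc/passwd, so the results do not depend on the accounts on your system. The file contents and the variable names (sample, fields, chars) are assumptions for illustration only.

```shell
# Build a small colon-delimited sample file (hypothetical data, passwd-like layout).
sample=$(mktemp)
printf '%s\n' \
  'alice:x:1001:1001:Alice Adams:/home/alice:/bin/bash' \
  'bob:x:1002:1002:Bob Brown:/home/bob:/bin/sh' > "$sample"

# Field mode: keep fields 1 and 5, using ':' as the delimiter.
fields=$(cut -d: -f1,5 "$sample")
echo "$fields"

# Character mode: keep characters 1 through 5 of each line.
chars=$(cut -c1-5 "$sample")
echo "$chars"

rm -f "$sample"
```

Note that in field mode cut keeps the delimiter between the selected fields, while character mode ignores delimiters entirely.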
 
The paste command
Combines columns of data into a single multi-column file.
Used for building reports with just the information you need.
paste file1 file2 ...
The default delimiter used is tab; use the -d option to specify otherwise.
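The sketch below pastes two single-column files side by side, first with the default tab delimiter and then with a colon. The file names (users, shells) and their contents are made up for illustration.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d)

# Two single-column files with matching line order (hypothetical data).
printf '%s\n' alice bob > "$dir/users"
printf '%s\n' /bin/bash /bin/sh > "$dir/shells"

# Default: columns are joined with a tab.
tabbed=$(paste "$dir/users" "$dir/shells")
echo "$tabbed"

# With -d, columns are joined with the given character instead.
colon=$(paste -d: "$dir/users" "$dir/shells")
echo "$colon"

rm -rf "$dir"
```

paste matches lines by position, so line 1 of the first file is joined to line 1 of the second, and so on.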
 
The tr command
This command is great for deleting specific characters, e.g. carriage returns, from a file or for changing the delimiters of a database file.
tr -d '\r' < input > output # removes carriage returns from file
Or use it this way to fold all alphabetic characters from lower to upper case (quote the ranges so the shell does not expand them as filename patterns):
tr 'a-z' 'A-Z' < input > output
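The three uses of tr described above can be tried on literal strings instead of files. This is a small sketch; the input strings and variable names are made up for illustration.

```shell
# Delete carriage returns from DOS-style text.
unix_text=$(printf 'one\r\ntwo\r\n' | tr -d '\r')
echo "$unix_text"

# Fold lowercase letters to uppercase.
upper=$(printf 'hello, world\n' | tr 'a-z' 'A-Z')
echo "$upper"

# Change the delimiter of a record from ':' to ','.
csv=$(printf 'alice:x:1001\n' | tr ':' ',')
echo "$csv"
```

tr reads only from standard input, which is why each example feeds it through a pipe rather than naming a file as an argument.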
 
The sort and uniq commands
The sort command sorts its input according to the collating sequence of the current locale, which is set by the LC_COLLATE environment variable (or by LC_ALL or LANG). It's usually fairly close to dictionary order. Sorting is performed on a per-line basis: by default the whole line is the sort key, and the -k option selects a particular field as the key.
  1. The output of the who command is normally sorted by tty. The following command sorts it by username:
    who | sort
  2. The output of the ps command is usually sorted by the PID number. The following command sorts it by the name of the process:
    ps -e | sort -k4
The output of the sort command is often piped to the uniq command for purposes of removing identical entries.
who | cut -d' ' -f1 | sort | uniq
Removes the duplicate names of users logged in more than once.
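The following sketch shows why sort comes before uniq: uniq only removes duplicate lines that are adjacent. The login list is made-up data standing in for the output of who, and LC_ALL=C is set on the keyed sort only to make the ordering predictable across systems.

```shell
# Hypothetical stand-in for the first two columns of who output.
logins=$(printf '%s\n' 'carol pts/0' 'alice pts/1' 'carol pts/2' 'bob tty1')

# Without sort, the two carol lines are not adjacent, so uniq keeps both.
unsorted=$(printf '%s\n' "$logins" | cut -d' ' -f1 | uniq)
echo "$unsorted"

# With sort, duplicates become adjacent and uniq removes them.
unique_users=$(printf '%s\n' "$logins" | cut -d' ' -f1 | sort | uniq)
echo "$unique_users"

# Keyed sort: order the lines by the second field (the tty) instead.
by_tty=$(printf '%s\n' "$logins" | LC_ALL=C sort -k2)
echo "$by_tty"
```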

Exercise 1: Using the cut and paste commands

How would you use the cut command to:
  1. Extract the username and uid from the /etc/passwd file.
  2. Extract the username and the comment field from the output of the who command.
  3. Create a file called lastnames that contains the last name of everyone who has a cis98 account on opus.
  4. Create a file called firstnames that contains the first name of everyone who has a cis98 account on opus.
  5. Put the above two files together into one file called names such that each line is of the form: lastname,firstname

Exercise 2: Using the tr command

Use the tr command to:
  1. Create a comma separated list of all the files in the /tmp directory.
  2. Translate the spaces in the /etc/crontab file to tab characters.
  3. Delete all tabs in the /etc/auto.master file.
  4. List all the files in your current directory so that their names are uppercase.
The output of the above commands should all go to stdout.

Exercise 3: Using the sort and uniq commands

Use the sort command to:
  1. Sort the /etc/passwd file by username.
  2. Sort the current processes on the system alphabetically by name of process.
  3. Sort the current processes on the system numerically by size of process (SZ).
  4. Create a list of users currently logged on to the system, and ensure that there are no duplicate entries in the list.