Thursday, April 03, 2008

pathlint

In my day job I spend a fair amount of time working on computers where I'm not the sysadmin. That's fine, as I'm glad not to have to administer a 90-node cluster, but it means that I have to put up with some things that I don't here on dear, old, slow, Hal. (Hey! Sorry.)

For example, different machines have different locations for the Fortran compiler and libraries, so the sysadmin helpfully locates them for you by adding the appropriate directories to your path. Typically, you'll be told to put a statement of the form

source /usr/sysadmin/cshrc

into your .cshrc or .bashrc file, where that file says something like:

setenv PATH ${PATH}:/opt/intel/bin

(in csh, of course) which adds the Intel directory to your path.

That's fine the first time you open up an xterm. But suppose you use that xterm to launch another xterm? The new shell inherits the path and then sources the file again, so now you've got two copies of /opt/intel/bin in your path, one after another. That doesn't do much harm, but it makes it difficult to look at your path and see which directories are there.
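Here is a small sh sketch of how the duplicate arises. DEMO_PATH and source_sysadmin_file are made-up names standing in for the real PATH and for sourcing the sysadmin's file, so nothing here touches your actual environment:

```shell
# DEMO_PATH stands in for PATH so the sketch leaves the real one alone
DEMO_PATH=/bin:/usr/bin

# Hypothetical stand-in for what sourcing the sysadmin's file does
source_sysadmin_file() {
    DEMO_PATH="$DEMO_PATH:/opt/intel/bin"
}

source_sysadmin_file   # the first xterm's shell
source_sysadmin_file   # a second xterm inherits the path, then appends again
echo "$DEMO_PATH"
```

The echo shows /opt/intel/bin twice at the end of the list.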

So I wrote a little Perl script called pathlint to take care of this:

#! /usr/bin/perl

# Learn about $ENV in "Learning Perl," 2nd ed., pp. 143-4
# split is on pp. 89-90:

# Duplicate-removal idiom from http://www.perl.com/doc/FAQs/FAQ/oldfaq-html/Q5.4.html:

# This could almost be made into a one-liner:

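# %saw counts how often each directory has appeared; grep keeps
#  only the first occurrence, so the original order is preserved

my %saw;
my @out = grep { !$saw{$_}++ } split(/:/, $ENV{"PATH"});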

# The inverse of split is (Learning Perl, p. 90):

$newpath = join(":",@out);

# Note the lack of a newline, because we're going to use
#  this as the argument of a path command

print $newpath;

Note that this doesn't actually reset the path of the shell you ran it from, and it wouldn't help to add something like

# Reset the path:
# $ENV{"PATH"} = $newpath;

That's because the Perl script is a child process, and children can't reset the environment of their parents. Think about it: if they could, then a program like xine could switch your directory so that you would start looking at Aunt Tillie's porn directory – trust me, you don't want to go there.
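You can check the parent/child rule from any sh-style shell. This is just a sketch; /tmp/only-in-child is a made-up value, and the variable name before is mine:

```shell
# Remember the parent's path
before=$PATH

# The child shell changes and exports its own PATH, then exits
sh -c 'PATH=/tmp/only-in-child; export PATH'

# Back in the parent, nothing has changed
if [ "$PATH" = "$before" ]; then
    echo "parent PATH unchanged"
fi
```

The child's assignment dies with the child, which is why pathlint can only print the cleaned-up path and leave the actual assignment to the calling shell.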

So what pathlint does is to print out the current path, without any duplicate directory names, e.g.

$ echo $PATH
/bin:/usr/bin/:/opt/intel/bin:/opt/intel/bin

$ pathlint
/bin:/usr/bin/:/opt/intel/bin

without a trailing newline. To actually change the path, just run

$ setenv PATH `pathlint`

(those are backquotes) if you are using csh or one of its derivatives, or

PATH=`pathlint`

from bash, dash, sh, ksh, etc. If you put the appropriate line at the bottom of your .cshrc or .bashrc file, opening up a new xterm will always start you with an unduplicated path.
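If Perl isn't available on one of those machines, the same first-occurrence-wins trick can be sketched in sh with awk. This is not the pathlint script above, only a rough equivalent; the function name dedupe_path is hypothetical, and it assumes a POSIX awk:

```shell
# Print the colon-separated list in $1 with later duplicates removed.
# seen[] plays the role of %saw in the Perl version.
dedupe_path() {
    printf '%s' "$1" |
        awk -v RS=: '!seen[$0]++ { printf "%s%s", sep, $0; sep = ":" }'
}

dedupe_path "/bin:/usr/bin/:/opt/intel/bin:/opt/intel/bin"
```

With that function defined in your .bashrc, the last line of the file would be PATH=`dedupe_path "$PATH"` instead of the pathlint call.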
