Friday, December 30, 2011

Stanford's Online Courses: ml-class, ai-class and db-class

About three months ago, I signed up for Stanford's first-of-its-kind, free, online courses on Artificial Intelligence, Machine Learning and Databases. I have now successfully completed all three courses!
  • The Artificial Intelligence class (ai-class), led by Sebastian Thrun and Peter Norvig, covered probability, Bayes networks, machine learning, planning, Markov decision processes (MDPs), particle filters, game theory, computer vision, robotics and natural language processing. There were in-class quizzes, homework exercises and two exams.
  • The Machine Learning class (ml-class), led by Professor Andrew Ng, covered supervised learning (linear regression, logistic regression, neural networks, support vector machines), unsupervised learning (k-means clustering), anomaly detection and recommender systems. There were in-class quizzes, review questions and programming exercises.
  • The Introduction to Databases class (db-class), led by Professor Jennifer Widom, covered relational databases, relational algebra, XML, XPath, SQL, UML, constraints, triggers, transactions, authorization, recursion and NoSQL systems. There were in-class quizzes and assignments, including practical exercises (such as creating triggers, or writing SQL and XPath expressions to run against a set of data), as well as two exams.
Even though I had studied these topics a long time ago at university, I found it really useful to refresh my memory. The classes took up quite a bit of time (about two hours a week each), but it was definitely worth it.

I am now looking forward to starting some new classes in 2012!

Tuesday, December 27, 2011

Args4j vs JCommander for Parsing Command Line Parameters

In the past, I've always used Apache Commons CLI for parsing command line options passed to programs, and I've found it quite tedious because of all the boilerplate code involved. Just take a look at their Ant example and you will see how much code is required to create each option.

As an alternative, I have recently been evaluating two annotation-based command line parsing frameworks: Args4j and JCommander.

I'm going to use the Ant example to illustrate how to parse command line options using these two libraries. I'll only implement a few of the options, since the rest are handled in much the same way. Here is an extract of Ant's help output, which I will be aiming to replicate:
ant [options] [target [target2 [target3] ...]]
Options:
  -help, -h              print this message
  -lib <path>            specifies a path to search for jars and classes
  -buildfile <file>      use given buildfile
    -file    <file>              ''
    -f       <file>              ''
  -D<property>=<value>   use value for given property
  -nice  number          A niceness value for the main thread:
Args4j (v2.0.12)
The class below demonstrates how to parse command line options for Ant using Args4j. The main method parses some sample arguments and also prints out the usage of the command.
import static org.junit.Assert.*;
import java.io.File;
import java.util.*;
import org.kohsuke.args4j.*;

/**
 * Example of using Args4j for parsing
 * Ant command line options
 */
public class AntOptsArgs4j {

  @Argument(metaVar = "[target [target2 [target3] ...]]", usage = "targets")
  private List<String> targets = new ArrayList<String>();

  @Option(name = "-h", aliases = "-help", usage = "print this message")
  private boolean help = false;

  @Option(name = "-lib", metaVar = "<path>",
          usage = "specifies a path to search for jars and classes")
  private String lib;

  @Option(name = "-f", aliases = { "-file", "-buildfile" }, metaVar = "<file>",
          usage = "use given buildfile")
  private File buildFile;

  @Option(name = "-nice", metaVar = "number",
          usage = "A niceness value for the main thread:\n"
          + "1 (lowest) to 10 (highest); 5 is the default")
  private int nice = 5;

  private Map<String, String> properties = new HashMap<String, String>();
  @Option(name = "-D", metaVar = "<property>=<value>",
          usage = "use value for given property")
  private void setProperty(final String property) throws CmdLineException {
    String[] arr = property.split("=");
    if(arr.length != 2) {
        throw new CmdLineException("Properties must be specified in the form:"+
                                   "<property>=<value>");
    }
    properties.put(arr[0], arr[1]);
  }

  public static void main(String[] args) throws CmdLineException {
    final String[] argv = { "-D", "key=value", "-f", "build.xml",
                            "-D", "key2=value2", "clean", "install" };
    final AntOptsArgs4j options = new AntOptsArgs4j();
    final CmdLineParser parser = new CmdLineParser(options);
    parser.parseArgument(argv);

    // print usage
    parser.setUsageWidth(Integer.MAX_VALUE);
    parser.printUsage(System.err);

    // check the options have been set correctly
    assertEquals("build.xml", options.buildFile.getName());
    assertEquals(2, options.targets.size());
    assertEquals(2, options.properties.size());
  }
}
Running this program prints:
 [target [target2 [target3] ...]] : targets
 -D <property>=<value>            : use value for given property
 -f (-file, -buildfile) <file>    : use given buildfile
 -h (-help)                       : print this message
 -lib <path>                      : specifies a path to search for jars and classes
 -nice number                     : A niceness value for the main thread:
                                    1 (lowest) to 10 (highest); 5 is the default
JCommander (v1.13)
Similarly, here is a class which demonstrates how to parse command line options for Ant using JCommander.
import static org.junit.Assert.*;
import java.io.File;
import java.util.*;
import com.beust.jcommander.*;

/**
 * Example of using JCommander for parsing
 * Ant command line options
 */
public class AntOptsJCmdr {

  @Parameter(description = "targets")
  private List<String> targets = new ArrayList<String>();

  @Parameter(names = { "-help", "-h" }, description = "print this message")
  private boolean help = false;

  @Parameter(names = { "-lib" },
             description = "specifies a path to search for jars and classes")
  private String lib;

  @Parameter(names = { "-buildfile", "-file", "-f" },
             description = "use given buildfile")
  private File buildFile;

  @Parameter(names = "-nice", description = "A niceness value for the main thread:\n"
        + "1 (lowest) to 10 (highest); 5 is the default")
  private int nice = 5;

  @Parameter(names = { "-D" }, description = "use value for given property")
  private List<String> properties = new ArrayList<String>();

  public static void main(String[] args) {
    final String[] argv = { "-D", "key=value", "-f", "build.xml",
                            "-D", "key2=value2", "clean", "install" };
    final AntOptsJCmdr options = new AntOptsJCmdr();
    final JCommander jcmdr = new JCommander(options, argv);

    // print usage
    jcmdr.setProgramName("ant");
    jcmdr.usage();

    // check the options have been set correctly
    assertEquals("build.xml", options.buildFile.getName());
    assertEquals(2, options.targets.size());
    assertEquals(2, options.properties.size());
  }
}
Running this program prints:
Usage: ant [options]
 targets
  Options:
    -D                      use value for given property
                            Default: [key=value, key2=value2]
    -buildfile, -file, -f   use given buildfile
    -help, -h               print this message
                            Default: false
    -lib                    specifies a path to search for jars and classes
    -nice                   A niceness value for the main thread:
1 (lowest) to
                            10 (highest); 5 is the default
                            Default: 5
Args4j vs JCommander
As you can see from the implementations above, both frameworks are very similar. There are a few differences though:
  1. JCommander does not have an equivalent of Args4j's metaVar, which allows you to display the value that an option takes. For example, if you have an option called "-f" which takes a file, you can set metaVar="<file>" and Args4j will display -f <file> when it prints the usage. This is not possible in JCommander, so it is difficult to see which options take values and which don't.

  2. JCommander's @Parameter annotation can only be applied to fields, not methods, which makes it slightly more restrictive. In Args4j, you can place the annotation on a "setter" method, which lets you tweak the value before it is set. In JCommander, you would have to write a custom converter instead.

  3. In the example above, JCommander was unable to place the -D property=value options into a map. It could only save them into a list, leaving you to post-process the elements into key-value pairs yourself. Args4j, on the other hand, put the properties straight into a map, because its annotation can sit on a setter method.

  4. JCommander's usage output is not as pretty as Args4j's. In particular, the description of the "nice" option is not aligned correctly.
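For point 3, the JCommander post-processing can be done with plain JDK code. The class below is a sketch of my own (PropsToMap is not part of either library); it splits on the first '=' only, so that values may themselves contain '=':

```java
import java.util.*;

public class PropsToMap {

    /**
     * Converts JCommander's captured "-D" values, e.g. "key=value",
     * into a map, splitting each entry on the first '=' only.
     */
    static Map<String, String> toMap(final List<String> properties) {
        final Map<String, String> map = new HashMap<String, String>();
        for (final String property : properties) {
            final String[] arr = property.split("=", 2);
            if (arr.length != 2) {
                throw new IllegalArgumentException(
                    "Properties must be specified in the form: <property>=<value>");
            }
            map.put(arr[0], arr[1]);
        }
        return map;
    }

    public static void main(String[] args) {
        final Map<String, String> props =
            toMap(Arrays.asList("key=value", "key2=value2"));
        System.out.println(props.size());  // prints 2
    }
}
```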

Based purely on this example, the winner is Args4j. Note, however, that JCommander has features which Args4j lacks, such as parameter validation and password-type parameters. Read the documentation to find out which one is better suited to your needs. One thing is quite clear though: annotation-based command line parsing is the way forward!

Saturday, December 24, 2011

Guava Cache

Google's Guava Cache is a lightweight, thread-safe Java cache that provides some nice features such as:
  • eviction of least recently used entries when a maximum size is breached
  • eviction of entries based on time since last access or last write
  • notification of evicted entries
  • performance statistics e.g. hit and miss counts
To create the cache, you use a CacheBuilder, which lets you specify the eviction policy and other features such as the concurrency level and soft or weak values. You also supply a CacheLoader, which the cache invokes automatically to load the value for any key that is not already present.

The following code demonstrates how to create a cache:

// Create the cache. Only allow a max of 10 entries.
// Old entries will be evicted.
final Cache<String, String> cache = CacheBuilder.newBuilder()
    .maximumSize(10)
    .removalListener(new RemovalListener<String, String>() {
        @Override
        public void onRemoval(RemovalNotification<String, String> n) {
            System.out.println("REMOVED: " + n);
        }
    })
    .build(new CacheLoader<String, String>() {
        @Override
        public String load(String key) throws Exception {
            System.out.println("LOADING: " + key);
            return key + "-VALUE";
        }
    });

// Get values from the cache.
// If a key does not exist, it will be loaded.
for (int i = 0; i < 10; i++) {
  System.out.println(cache.get("Key" + i));
}
for (int i = 9; i >= 0; i--) {
  System.out.println(cache.get("Key" + i));
}

//Print out the hit counts.
System.out.println(cache.stats());
The output of this program is:
LOADING: Key0
LOADING: Key1
LOADING: Key2
LOADING: Key3
LOADING: Key4
LOADING: Key5
LOADING: Key6
REMOVED: Key0=Key0-VALUE
LOADING: Key7
REMOVED: Key3=Key3-VALUE
LOADING: Key8
LOADING: Key9
LOADING: Key3
REMOVED: Key7=Key7-VALUE
LOADING: Key0
REMOVED: Key6=Key6-VALUE
CacheStats{hitCount=8, missCount=12, loadSuccessCount=12, loadExceptionCount=0, 
totalLoadTime=563806, evictionCount=4}
It is important to note that entries were evicted BEFORE the maximum size of 10 was reached. In this case, an entry was removed when the cache had 7 entries in it.

The cache stats show that out of 20 calls to the cache, 12 missed and had to be loaded. However, 8 were successfully retrieved from the cache. 4 entries were evicted.

Saturday, December 03, 2011

Using XStream to Map a Single Element

Let's say you have the following XML which has a single element containing an attribute and some text:
<error code="99">This is an error message</error>
and you would like to convert it, using XStream, into an Error object:
public class Error {
    String message;
    int code;

    public String getMessage() {
        return message;
    }

    public int getCode() {
        return code;
    }
}
It took me a while to figure this out. It was easy getting the code attribute set in the Error object, but it just wasn't picking up the message.

Eventually, I found the ToAttributedValueConverter class which "supports the definition of one field member that will be written as value and all other field members are written as attributes."

The following code shows how ToAttributedValueConverter is used. You specify which instance variable maps to the value of the XML element (in this case, message). All other instance variables are automatically mapped to attributes, so you don't need to annotate them with @XStreamAsAttribute explicitly.

@XStreamAlias("error")
@XStreamConverter(value=ToAttributedValueConverter.class, strings={"message"})
public class Error {

  String message;

  @XStreamAlias("code")
  int code;

  public String getMessage() {
      return message;
  }

  public int getCode() {
      return code;
  }

  public static void main(String[] args) {
      XStream xStream = new XStream();
      xStream.processAnnotations(Error.class);

      String xmlResponse="<error code=\"99\">This is an error message</error>";

      Error error = (Error)xStream.fromXML(xmlResponse);
      System.out.println(error.getCode());
      System.out.println(error.getMessage());
  }
}

Saturday, November 05, 2011

Regular Expressions in Bash

Traditionally, external tools such as grep, sed, awk and perl have been used to match a string against a regular expression, but the Bash shell has this functionality built into it as well!

In Bash, the =~ operator matches the string on its left against the extended regular expression on its right, returning 0 if the string matches the pattern and 1 otherwise. Capturing groups are saved in the array variable BASH_REMATCH, with the first element, group 0, holding the entire match.
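For example, this minimal snippet pulls the parts out of a date string:

```shell
#!/bin/bash
# Extract year, month and day from an ISO date using =~ and BASH_REMATCH
s="2011-12-30"
if [[ $s =~ ^([0-9]{4})-([0-9]{2})-([0-9]{2})$ ]]; then
    echo "year=${BASH_REMATCH[1]} month=${BASH_REMATCH[2]} day=${BASH_REMATCH[3]}"
fi
```

This prints year=2011 month=12 day=30.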

The following script matches a string against a regex and prints out the capturing groups:

#!/bin/bash

if [ $# -lt 2 ]
then
    echo "Usage: $0 regex string" >&2
    exit 1
fi

regex=$1
input=$2

if [[ $input =~ $regex ]]
then
    echo "$input matches regex: $regex"

    #print out capturing groups
    for (( i=0; i<${#BASH_REMATCH[@]}; i++))
    do
        echo -e "\tGroup[$i]: ${BASH_REMATCH[$i]}"
    done
else
    echo "$input does not match regex: $regex"
fi
Example usage:
sharfah@starship:~> matcher.sh '(.*)=(.*)' foo=bar
foo=bar matches regex: (.*)=(.*)
    Group[0]: foo=bar
    Group[1]: foo
    Group[2]: bar

Sunday, October 23, 2011

Finding the Maximum using Relational Algebra

We know that if you want to get the maximum value of a column in SQL, you can simply use the MAX function as shown below:
SELECT MAX(value) FROM T
You can also do it without the MAX function as follows:
SELECT T.* FROM T
MINUS
SELECT T.* FROM T, T as T2 WHERE T.value<T2.value
or:
SELECT T.* FROM T
LEFT JOIN T as T2 ON T.value<T2.value
WHERE T2.value IS NULL
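As a quick check, the join-based query can be run against an in-memory SQLite database (a sketch of my own; note that SQLite spells MINUS as EXCEPT, so the LEFT JOIN form is used here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (value INTEGER)")
con.executemany("INSERT INTO T VALUES (?)", [(3,), (7,), (5,)])

# A row survives only if no other row has a larger value,
# i.e. its T2 match under "T.value < T2.value" is NULL.
row = con.execute("""
    SELECT T.value FROM T
    LEFT JOIN T AS T2 ON T.value < T2.value
    WHERE T2.value IS NULL
""").fetchone()

print(row[0])  # prints 7, the maximum
```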
Relational Algebra:
Using Relational Algebra (RA) syntax, this would be:
\project_{value}(T)
\diff
\project_{value} (
    \select_{value < value2}(
      \project_{value}(T)
      \cross
      \rename_{value2}(\project_{value}(T))
    )
)
where:
  • \cross is the relational cross-product operator
  • \diff is the relational diff operator
  • \project_{attr_list} is the relational projection operator
  • \rename_{new_attr_name_list} is the relational renaming operator
  • \select_{cond} is the relational selection operator

Sunday, October 16, 2011

Validating XML with xmllint

The following commands show you how to validate an XML file against a DTD or XSD using xmllint.

To validate an XML file against:

  • a DTD stored in the same file:
    xmllint --valid --noout fileWithDTD.xml
  • a DTD stored in a separate file:
    xmllint --dtdvalid DTD.dtd --noout fileWithoutDTD.xml
  • an XSD stored in a separate file:
    xmllint --schema schema.xsd --noout file.xml
The --noout option suppresses xmllint's normal output of the parsed document, so only errors are printed.

Example
To validate:

<countries>
  <country name="Afghanistan" population="22664136" area="647500">
    <language percentage="11">Turkic</language>
    <language percentage="35">Pashtu</language>
    <language percentage="50">Afghan Persian</language>
  </country>
  <country name="Albania" population="3249136" area="28750"/>
  <country name="Algeria" population="29183032" area="2381740">
    <city>
      <name>Algiers</name>
      <population>1507241</population>
    </city>
  </country>
</countries>
against:
<!ELEMENT countries (country*)>
<!ELEMENT country (language|city)*>
<!ATTLIST country name CDATA #REQUIRED>
<!ATTLIST country population CDATA #REQUIRED>
<!ATTLIST country area CDATA #REQUIRED>
<!ELEMENT language (#PCDATA)>
<!ATTLIST language percentage CDATA #REQUIRED>
<!ELEMENT city (name, population)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT population (#PCDATA)>
use the command:
xmllint --dtdvalid countries.dtd --noout countries.xml
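For comparison, an equivalent XSD for the same document might look like this (my own sketch, not part of the original example; validate with xmllint --schema countries.xsd --noout countries.xml):

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="countries">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="country" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="country">
    <xs:complexType>
      <!-- mirrors the DTD's (language|city)* content model -->
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element ref="language"/>
        <xs:element ref="city"/>
      </xs:choice>
      <xs:attribute name="name" type="xs:string" use="required"/>
      <xs:attribute name="population" type="xs:string" use="required"/>
      <xs:attribute name="area" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="language">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="percentage" type="xs:string" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
  <xs:element name="city">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="population" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```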

Saturday, October 08, 2011

Splitting a large file into smaller pieces

If you have a large file and want to break it into smaller pieces, you can use the Unix split command. You tell it what the prefix of each piece should be, and it appends an alphabetic (or numeric) suffix to each name.

In the example below, I split a file containing 100,000 lines. I instruct split to use numeric suffixes (-d), put 10,000 lines in each split file (-l 10000) and use suffixes of length 3 (-a 3). As a result, ten split files are created, each with 10,000 lines.

$ ls
hugefile

$ wc -l hugefile
100000 hugefile

$ split -d -l 10000 -a 3 hugefile hugefile.split.

$ ls
hugefile                hugefile.split.005
hugefile.split.000      hugefile.split.006
hugefile.split.001      hugefile.split.007  
hugefile.split.002      hugefile.split.008
hugefile.split.003      hugefile.split.009
hugefile.split.004

$ wc -l *split*
 10000 hugefile.split.000
 10000 hugefile.split.001
 10000 hugefile.split.002
 10000 hugefile.split.003
 10000 hugefile.split.004
 10000 hugefile.split.005
 10000 hugefile.split.006
 10000 hugefile.split.007
 10000 hugefile.split.008
 10000 hugefile.split.009
100000 total
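Because the numeric suffixes sort in order, the pieces can later be glued back together with a plain cat. A quick sketch, using a 100-line file for brevity:

```shell
# Create a small file, split it into ten 10-line pieces...
seq 1 100 > hugefile
split -d -l 10 -a 3 hugefile hugefile.split.

ls hugefile.split.* | wc -l   # 10 pieces

# ...then reassemble them; the suffixes sort lexicographically,
# so the shell glob expands in the right order
cat hugefile.split.* > rejoined
cmp hugefile rejoined && echo "files are identical"
```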

Sunday, October 02, 2011

Better Bash Completion for Tmux

In my previous post, I wrote about how awesome tmux is for managing multiple terminals. However, even though it is widely used, I haven't been able to find a good Bash completion script for it. The tmux package does come with bash_completion_tmux.sh, but it does not complete command options or command aliases. So I wrote a better version, which completes tmux commands, aliases and their options. There is still room for improvement: it would be nice if it could complete session and window names too, but I haven't found the time to implement that yet.

Here is a demo:

$ tmux lis[TAB]
list-buffers   list-clients   list-commands
list-keys      list-panes     list-sessions  list-windows

$ tmux list-windows -[TAB]
-a -t

$ tmux list-windows -a
sharfah:0: less [180x82] [layout f0de,180x82,0,0]
sharfah:1: tmp [180x82] [layout f0de,180x82,0,0] (active)
sharfah:2: isengard [180x82] [layout f0de,180x82,0,0]
sharfah:3: java [180x82] [layout f0de,180x82,0,0]
My completion script is shown below. You need to source it in your Bash profile. Alternatively, save it to your Bash completion directory, e.g. ~/.bash/.bash_completion.d, and it should automatically get picked up.

The script is also available in my GitHub dotfiles repository. If you can improve it, fork it and send me a pull request!

#
# tmux completion
# by Fahd Shariff
#
_tmux() {
  # an array of commands and their options
  declare -A tmux_cmd_map
  tmux_cmd_map=( ["attach-session"]="-dr -t target-session" \
                 ["bind-key"]="-cnr -t key-table key command arguments" \
                 ["break-pane"]="-d -t target-pane" \
                 ["capture-pane"]="-b buffer-index -E end-line -S start-line -t target-pane" \
                 ["choose-buffer"]="-t target-window template" \
                 ["choose-client"]="-t target-window template" \
                 ["choose-session"]="-t target-window template" \
                 ["choose-window"]="-t target-window template" \
                 ["clear-history"]="-t target-pane" \
                 ["clock-mode"]="-t target-pane" \
                 ["command-prompt"]="-I inputs -p prompts -t target-client template" \
                 ["confirm-before"]="-p prompt -t target-client command" \
                 ["copy-mode"]="-u -t target-pane" \
                 ["delete-buffer"]="-b buffer-index" \
                 ["detach-client"]="-P -s target-session -t target-client" \
                 ["display-message"]="-p -c target-client -t target-pane message" \
                 ["display-panes"]="-t target-client" \
                 ["find-window"]="-t target-window match-string" \
                 ["has-session"]="-t target-session" \
                 ["if-shell"]="shell-command command" \
                 ["join-pane"]="-dhv -p percentage|-l size -s src-pane -t dst-pane" \
                 ["kill-pane"]="-a -t target-pane" \
                 ["kill-server"]="kill-server" \
                 ["kill-session"]="-t target-session" \
                 ["kill-window"]="-t target-window" \
                 ["last-pane"]="-t target-window" \
                 ["last-window"]="-t target-session" \
                 ["link-window"]="-dk -s src-window -t dst-window" \
                 ["list-buffers"]="list-buffers" \
                 ["list-clients"]="-t target-session" \
                 ["list-commands"]="list-commands" \
                 ["list-keys"]="-t key-table" \
                 ["list-panes"]="-as -t target" \
                 ["list-sessions"]="list-sessions" \
                 ["list-windows"]="-a -t target-session" \
                 ["load-buffer"]="-b buffer-index path" \
                 ["lock-client"]="-t target-client" \
                 ["lock-server"]="lock-server" \
                 ["lock-session"]="-t target-session" \
                 ["move-window"]="-dk -s src-window -t dst-window" \
                 ["new-session"]="-d -n window-name -s session-name -t target-session -x width -y height command" \
                 ["new-window"]="-adk -n window-name -t target-window command" \
                 ["next-layout"]="-t target-window" \
                 ["next-window"]="-a -t target-session" \
                 ["paste-buffer"]="-dr -s separator -b buffer-index -t target-pane" \
                 ["pipe-pane"]="-o -t target-pane command" \
                 ["previous-layout"]="-t target-window" \
                 ["previous-window"]="-a -t target-session" \
                 ["refresh-client"]="-t target-client" \
                 ["rename-session"]="-t target-session new-name" \
                 ["rename-window"]="-t target-window new-name" \
                 ["resize-pane"]="-DLRU -t target-pane adjustment" \
                 ["respawn-pane"]="-k -t target-pane command" \
                 ["respawn-window"]="-k -t target-window command" \
                 ["rotate-window"]="-DU -t target-window" \
                 ["run-shell"]="command" \
                 ["save-buffer"]="-a -b buffer-index" \
                 ["select-layout"]="-np -t target-window layout-name" \
                 ["select-pane"]="-lDLRU -t target-pane" \
                 ["select-window"]="-lnp -t target-window" \
                 ["send-keys"]="-t target-pane key " \
                 ["send-prefix"]="-t target-pane" \
                 ["server-info"]="server-info" \
                 ["set-buffer"]="-b buffer-index data" \
                 ["set-environment"]="-gru -t target-session name value" \
                 ["set-option"]="-agsuw -t target-session|target-window option value" \
                 ["set-window-option"]="-agu -t target-window option value" \
                 ["show-buffer"]="-b buffer-index" \
                 ["show-environment"]="-g -t target-session" \
                 ["show-messages"]="-t target-client" \
                 ["show-options"]="-gsw -t target-session|target-window" \
                 ["show-window-options"]="-g -t target-window" \
                 ["source-file"]="path" \
                 ["split-window"]="-dhvP -p percentage|-l size -t target-pane command" \
                 ["start-server"]="start-server" \
                 ["suspend-client"]="-t target-client" \
                 ["swap-pane"]="-dDU -s src-pane -t dst-pane" \
                 ["swap-window"]="-d -s src-window -t dst-window" \
                 ["switch-client"]="-lnp -c target-client -t target-session" \
                 ["unbind-key"]="-acn -t key-table key" \
                 ["unlink-window"]="-k -t target-window" )

   declare -A tmux_alias_map
   tmux_alias_map=( ["attach"]="attach-session" \
                  ["detach"]="detach-client" \
                  ["has"]="has-session" \
                  ["lsc"]="list-clients" \
                  ["lscm"]="list-commands" \
                  ["ls"]="list-sessions" \
                  ["lockc"]="lock-client" \
                  ["locks"]="lock-session" \
                  ["new"]="new-session" \
                  ["refresh"]="refresh-client" \
                  ["rename"]="rename-session" \
                  ["showmsgs"]="show-messages" \
                  ["source"]="source-file" \
                  ["start"]="start-server" \
                  ["suspendc"]="suspend-client" \
                  ["switchc"]="switch-client" \
                  ["breakp"]="break-pane" \
                  ["capturep"]="capture-pane" \
                  ["displayp"]="display-panes" \
                  ["findw"]="find-window" \
                  ["joinp"]="join-pane" \
                  ["killp"]="kill-pane" \
                  ["killw"]="kill-window" \
                  ["lastp"]="last-pane" \
                  ["last"]="last-window" \
                  ["linkw"]="link-window" \
                  ["lsp"]="list-panes" \
                  ["lsw"]="list-windows" \
                  ["movew"]="move-window" \
                  ["neww"]="new-window" \
                  ["nextl"]="next-layout" \
                  ["next"]="next-window" \
                  ["pipep"]="pipe-pane" \
                  ["prevl"]="previous-layout" \
                  ["prev"]="previous-window" \
                  ["renamew"]="rename-window" \
                  ["resizep"]="resize-pane" \
                  ["respawnp"]="respawn-pane" \
                  ["respawnw"]="respawn-window" \
                  ["rotatew"]="rotate-window" \
                  ["selectl"]="select-layout" \
                  ["selectp"]="select-pane" \
                  ["selectw"]="select-window" \
                  ["splitw"]="split-window" \
                  ["swapp"]="swap-pane" \
                  ["swapw"]="swap-window" \
                  ["unlinkw"]="unlink-window" \
                  ["bind"]="bind-key" \
                  ["lsk"]="list-keys" \
                  ["send"]="send-keys" \
                  ["unbind"]="unbind-key" \
                  ["set"]="set-option" \
                  ["setw"]="set-window-option" \
                  ["show"]="show-options" \
                  ["showw"]="show-window-options" \
                  ["setenv"]="set-environment" \
                  ["showenv"]="show-environment" \
                  ["confirm"]="confirm-before" \
                  ["display"]="display-message" \
                  ["clearhist"]="clear-history" \
                  ["deleteb"]="delete-buffer" \
                  ["lsb"]="list-buffers" \
                  ["loadb"]="load-buffer" \
                  ["pasteb"]="paste-buffer" \
                  ["saveb"]="save-buffer" \
                  ["setb"]="set-buffer" \
                  ["showb"]="show-buffer" \
                  ["if"]="if-shell" \
                  ["lock"]="lock-server" \
                  ["run"]="run-shell" \
                  ["info"]="server-info" )

   local cur="${COMP_WORDS[COMP_CWORD]}"
   local prev="${COMP_WORDS[COMP_CWORD-1]}"
   COMPREPLY=()

   # completing an option
   if [[ "$cur" == -* ]]; then
     #tmux options
     if [[ "$prev" == "tmux" ]]; then
         COMPREPLY=( $( compgen -W "-2 -8 -c -f -L -l -q -S -u -v -V" -- $cur ) )
     else
         #find the tmux command so that we can complete the options
         local cmd="$prev"
         local i=$COMP_CWORD
         while [[ "$cmd" == -* ]]
         do
             cmd="${COMP_WORDS[i]}"
             ((i--))
         done

         #if it is an alias, look up what the alias maps to
         local alias_cmd=${tmux_alias_map[$cmd]}
         if [[ -n ${alias_cmd} ]]
         then
             cmd=${alias_cmd}
         fi

         #now work out the options to this command
         local opts=""
         for opt in ${tmux_cmd_map[$cmd]}
         do
              if [[ "$opt" == -* ]]; then
                  len=${#opt}
                  i=1
                  while [ $i -lt $len ]; do
                      opts="$opts -${opt:$i:1}"
                      ((i++))
                  done
              fi
         done
         COMPREPLY=($(compgen -W "$opts" -- ${cur}))
     fi
   else
     COMPREPLY=($(compgen -W "$(echo ${!tmux_cmd_map[@]} ${!tmux_alias_map[@]})" -- ${cur}))
   fi
   return 0
}
complete -F _tmux tmux
Related posts:
  • Managing Multiple Terminals with Tmux
  • Writing your own Bash Completion Function

Saturday, October 01, 2011

Managing Multiple Terminals with Tmux

I've started using tmux, which is a "terminal multiplexer", similar to screen. It allows you to manage a number of terminals from a single screen. So, for example, instead of having 5 PuTTY windows cluttering up your desktop, you have only one window containing 5 terminals. If you close this window, you can simply open a new one and "attach" to your running tmux session, to get all your terminals back in the same state you left them.

There are lots of cool things you can do with tmux. For example, you can split a terminal window horizontally or vertically into "panes". This allows you to look at files side by side, or simply watch a process in one pane while you do something else in another.

I took the following screenshot of tmux in action:


The status bar along the bottom shows that I have 5 terminal windows open. I am currently in the one labelled "1-demo" and within this window I have 4 panes, each running a different command.

There are quite a few key bindings to learn, but once you have mastered them you will be able to jump back and forth between windows, move them around and kill them without lifting your hands off the keyboard. You can also set your own key bindings for things you do frequently. For example, my Ctrl-b / binding splits my window vertically and opens up a specified man page on the right. My Ctrl-b S binding allows me to SSH to a server in a new window.

Here is my tmux configuration taken from ~/.tmux.conf which shows my key bindings and colour setup. You can download this file from my GitHub dotfiles repository.

bind | split-window -h
bind - split-window -v
bind _ split-window -v
bind R source-file ~/.tmux.conf \; display-message "tmux.conf reloaded!"

bind / command-prompt -p "man" "split-window -h 'man %%'"
bind S command-prompt -p "ssh" "new-window -n %1 'exec ssh %1'"
bind h split-window -h  "man tmux"

set -g terminal-overrides 'xterm*:smcup@:rmcup@'

set -g history-limit 9999

# Terminal emulator window title
set -g set-titles on
set -g set-titles-string '#S:#I.#P #W'

# notifications
setw -g monitor-activity on
setw -g visual-activity on

# auto rename
setw -g automatic-rename on

# Clock
setw -g clock-mode-colour green
setw -g clock-mode-style 24

# Window status colors
setw -g window-status-bg colour235
setw -g window-status-fg colour248
setw -g window-status-alert-attr underscore
setw -g window-status-alert-bg colour235
setw -g window-status-alert-fg colour248
setw -g window-status-current-attr bright
setw -g window-status-current-bg colour235
setw -g window-status-current-fg colour248

# Message/command input colors
set -g message-bg colour240
set -g message-fg yellow
set -g message-attr bright

# Status Bar
set -g status-bg colour235
set -g status-fg colour248
set -g status-interval 1
set -g status-left '[#H]'
set -g status-right ''

set -g pane-border-fg white
set -g pane-border-bg default
set -g pane-active-border-fg white
set -g pane-active-border-bg default

Sunday, September 25, 2011

Speeding up Bash Profile Load Time

I started noticing a considerable delay whenever opening a new terminal or connecting to another server. After profiling my Bash profile with a few time commands, I discovered that the slowest part was the loading of the completion file:
$ time  ~/.bash/.bash_completion

real    0m0.457s
user    0m0.183s
sys     0m0.276s
The Bash completion script I use is from http://bash-completion.alioth.debian.org. I found that there is an existing bug report for this issue (#467231: bash_completion is big and loads slowly; load-by-need proposed), and someone has submitted a script called dyncomp.sh to speed up Bash completion load time.

This is a one-time script, which only needs to be run when you install your Bash completions or modify them. It loads your completions and moves the completion functions out of the script and into a separate directory. They are only loaded when needed. This speeds up the load time considerably and new terminal windows open up instantly!

$ time  ~/.bash/.bash_dyncompletion

real    0m0.020s
user    0m0.018s
sys     0m0.002s
You can visit my GitHub dotfiles repository for the latest version of my Bash profile.

Saturday, September 17, 2011

Faster SSH with Multiplexing

OpenSSH allows you to speed up multiple SSH connections to the same server using "multiplexing". The first connection acts as the "master" and any other connections reuse the master instance's network connection rather than initiating new ones.

In order to set this up add the following to your ~/.ssh/config file:

Host *
ControlMaster auto
ControlPath /tmp/%r@%h:%p
ControlMaster auto will use a master if one exists, or start a master otherwise. ControlPath is the path to the control socket used for connection sharing. %r, %h and %p are replaced with your username, host to which you are connecting and the port respectively.

In addition, you may want to add Ciphers arcfour in order to use the arcfour cipher, which is faster than the default (aes128-cbc). The transfer rate of arcfour is about 90 MB/s, aes128-cbc is about 75 MB/s and the slowest is 3des-cbc, at 19 MB/s.
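Putting both together, here is a sketch of what the relevant section of ~/.ssh/config might look like (scope the Host pattern to specific servers if you don't want these settings everywhere):

```
Host *
    ControlMaster auto
    ControlPath /tmp/%r@%h:%p
    Ciphers arcfour
```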

Sunday, August 28, 2011

Profiling Perl Code

I've started using NYTProf to profile and optimise my Perl code. It produces a very useful report showing how much time is spent in each subroutine and statement.

Usage:

perl -d:NYTProf script.pl
or
perl -d:NYTProf -I/path/to/Devel-NYTProf/4.06/lib/perl5 script.pl
The output of the profiler is written to ./nytprof.out.

Use the following command to create HTML reports from the profiler's results:

/path/Devel-NYTProf/4.06/bin/nytprofhtml --file nytprof.out -out ./htmlReports --delete

Wednesday, August 24, 2011

Play My Code: Chain Reaction

I've just published a new game called Chain Reaction on Play My Code.

Play My Code is a place where you can create your own online games easily using the site's Quby language. The games are compiled into native JavaScript, compatible with all modern HTML5-compliant browsers. Once you've written a game, you can embed it on your website or blog, like I have done below. It's like YouTube, but for games!

Here's Chain Reaction! The aim of the game is to start a chain reaction (by clicking on the screen) and explode as many atoms as possible. See how many levels you can complete!

Click here to see all my games.

Sunday, August 21, 2011

Java 7: ThreadLocalRandom for Concurrent Random Numbers

The ThreadLocalRandom class in JDK 7 allows you to generate random numbers from multiple threads. It is more efficient than using shared Random objects and will result in better performance, as there is less overhead and contention.

In addition, this class also provides "bounded" generation methods.

For example, the statement below generates a random number between 1 (inclusive) and 100 (exclusive).

int random = ThreadLocalRandom.current().nextInt(1,100);
Wait, there's a BUG!
While trying out this class, I noticed that the SAME random numbers were being produced across all my threads. I then discovered this bug which reported the same issue I was having. It appears that the seed is never initialised so the same random numbers are produced every time. I wouldn't recommend using this class until the bug is fixed. The following code illustrates the issue:
//3 threads
for(int i = 0; i < 3; i++) {
    final Thread thread = new Thread() {
        @Override
        public void run() {
            System.out.print(Thread.currentThread().getName() + ":");

            //each thread prints 3 random numbers
            for(int j = 0; j < 3; j++) {
                final int random = ThreadLocalRandom.current().nextInt(1, 50);
                System.out.print(random + ",");
            }
            System.out.println();
        }
    };
    thread.start();
    thread.join();
}
prints:
Thread-0:1,5,24,
Thread-1:1,5,24,
Thread-2:1,5,24,
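Until the bug is fixed, a simple workaround is to give each thread its own java.util.Random instance; the default constructor seeds each instance differently, so the threads produce independent sequences. A minimal sketch (class and variable names are my own):

```java
import java.util.Random;

public class PerThreadRandomDemo {
    public static void main(String[] args) throws InterruptedException {
        //3 threads
        for (int i = 0; i < 3; i++) {
            final Thread thread = new Thread() {
                @Override
                public void run() {
                    //each thread creates its own Random, which gets
                    //a distinct seed, so the sequences differ
                    final Random random = new Random();
                    final StringBuilder sb = new StringBuilder();
                    sb.append(Thread.currentThread().getName()).append(":");

                    //each thread prints 3 random numbers
                    for (int j = 0; j < 3; j++) {
                        //random number between 1 (inclusive) and 50 (exclusive)
                        sb.append(1 + random.nextInt(49)).append(",");
                    }
                    System.out.println(sb);
                }
            };
            thread.start();
            thread.join();
        }
    }
}
```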

Saturday, August 20, 2011

LESSOPEN Powers Up Less

A really useful feature of the Unix less pager is LESSOPEN, the "input preprocessor" for less. This is a script, defined in the LESSOPEN environment variable, which is invoked before the file is opened. It gives you the chance to modify the way the contents of the file are displayed. Why would you want to do this? The most common reason is to uncompress files before you view them, allowing you to less GZ files. But it also allows you to list the contents of zip files and other archives. I like to use it to format XML files and to view Java class files by invoking jad.

You can download a really useful LESSOPEN script from http://sourceforge.net/projects/lesspipe/ and then extend it if necessary.

To use it, simply add export LESSOPEN="|/path/to/bin/lesspipe.sh %s" to your bashrc.

You can then less:

  • directories
  • compressed files
  • archives, to list the files contained in them
  • files contained in archives e.g. less foo.zip:bar.txt
  • binary files

Sunday, August 14, 2011

Useful Eclipse Templates for Faster Coding

I wrote about my Eclipse code templates a few years ago and since then I've made quite a few changes to them. I've added a few new templates to help with JUnit tests and XML parsing. I've also updated my existing file IO templates to use Java 7 features.

Templates are simply "magic words" or shortcuts to standard blocks of code or text. They are very handy because once you have them set up, you don't have to waste time writing boilerplate code any more! An example of a pre-defined template in Eclipse is sysout which expands to System.out.println();. All you have to do is type sysout followed by Ctrl+Space to insert the statement into your Java source file.

To see what templates are defined in Eclipse:

  • Open your Preferences dialog by going to Window > Preferences
  • On the navigation tree on the left, go to Java > Editor > Templates
  • You will see a list of pre-defined templates
  • You can add new ones by pressing the "New..." button
My templates are shown below. They can also be downloaded from my GitHub repository and then imported into Eclipse.

General Utility Templates:

Name: if
Context: Java statements
Description: if null
Pattern:
if (${var} == null){
    ${cursor}
}
Name: if
Context: Java statements
Description: if not null
Pattern:
if (${var} != null){
    ${cursor}
}
Name: for
Context: Java statements
Description: iterate over map
Pattern:
${:import(java.util.Map.Entry)}
for(Entry<${key:argType(map,0)},${value:argType(map,1)}> entry :
                    ${map:var(java.util.Map)}.entrySet()) {
    ${key} key = entry.getKey();
    ${value} value = entry.getValue();
    ${cursor}
}
Name: strf
Context: Java
Description: format string
Pattern:
String.format("${word_selection}${}",${var}${cursor})
Name: sysf
Context: Java statements
Description: print formatted string to standard out
Pattern:
System.out.printf("${word_selection}${}",${var}${cursor});
Name: static_final
Context: Java type members
Description: static final field
Pattern:
${visibility:link(
              public,
              protected,
              private)} static final ${type} ${NAME} = ${word_selection}${};

File IO Templates:
The following templates are useful for reading or writing files. They use Java 7 features such as try-with-resources to automatically close files. They also use methods from NIO.2 to obtain a buffered reader and read the file.

Name: readfile
Context: Java statements
Description: read text from file
Pattern:
${:import(java.nio.file.Files,
          java.nio.file.Paths,
          java.nio.charset.Charset,
          java.io.IOException,
          java.io.BufferedReader)}
try (BufferedReader in = Files.newBufferedReader(Paths.get(${fileName:var(String)}),
                                                 Charset.forName("UTF-8"))) {
    String line = null;
    while ((line = in.readLine()) != null) {
        ${cursor}
    }
} catch (IOException e) {
    // ${todo}: handle exception
}
Name: readfile
Context: Java statements
Description: read all lines from file as a list
Pattern:
${:import(java.nio.file.Files,
          java.nio.file.Paths,
          java.nio.charset.Charset,
          java.io.IOException,
          java.util.List,
          java.util.ArrayList)}
List<String> lines = new ArrayList<>();
try {
    lines = Files.readAllLines(Paths.get(${fileName:var(String)}),
                               Charset.forName("UTF-8"));
} catch (IOException e) {
    // ${todo}: handle exception
}
${cursor}
Name: writefile
Context: Java statements
Description: write text to file
Pattern:
${:import(java.nio.file.Files,
          java.nio.file.Paths,
          java.nio.charset.Charset,
          java.io.IOException,
          java.io.BufferedWriter)}
try (BufferedWriter out = Files.newBufferedWriter(Paths.get(${fileName:var(String)}),
                                                  Charset.forName("UTF-8"))) {
    out.write(${string:var(String)});
    out.newLine();
    ${cursor}
} catch (IOException e) {
    // ${todo}: handle exception
}

XML Templates:
The following templates are used to read XML files or strings and return a DOM.

Name: parsexml
Context: Java statements
Description: parse xml file as Document
Pattern:
${:import(org.w3c.dom.Document,
          javax.xml.parsers.DocumentBuilderFactory,
          java.io.File,
          java.io.IOException,
          javax.xml.parsers.ParserConfigurationException,
          org.xml.sax.SAXException)}
Document doc = null;
try {
	doc = DocumentBuilderFactory.newInstance()
			.newDocumentBuilder()
			.parse(new File(${filename:var(String)}));
} catch (SAXException | IOException | ParserConfigurationException e) {
	// ${todo}: handle exception
}
${cursor}
Name: parsexml
Context: Java statements
Description: parse xml string as Document
Pattern:
${:import(org.w3c.dom.Document,
          javax.xml.parsers.DocumentBuilderFactory,
          org.xml.sax.InputSource,
          java.io.StringReader,
          java.io.IOException,
          javax.xml.parsers.ParserConfigurationException,
          org.xml.sax.SAXException)}
Document doc = null;
try {
	doc = DocumentBuilderFactory.newInstance()
			.newDocumentBuilder()
			.parse(new InputSource(new StringReader(${str:var(String)})));
} catch (SAXException | IOException | ParserConfigurationException e) {
	// ${todo}: handle exception
}
${cursor}

Logging Templates:
The templates below are useful for creating a logger and logging messages. I use SLF4J, but they could easily be tweaked to use any other logging framework.

Name: logger
Context: Java type members
Description: create new logger
Pattern:

${:import(org.slf4j.Logger,
          org.slf4j.LoggerFactory)}
private static final Logger LOGGER =
       LoggerFactory.getLogger(${enclosing_type}.class);
Name: logd
Context: Java statements
Description: logger debug
Pattern:
if(LOGGER.isDebugEnabled())
     LOGGER.debug(${word_selection}${});
${cursor}
Name: logi
Context: Java statements
Description: logger info
Pattern:
LOGGER.info(${word_selection}${});
${cursor}
Name: logerr
Context: Java statements
Description: logger error
Pattern:
LOGGER.error(${word_selection}${}, ${exception_variable_name});
Name: logthrow
Context: Java statements
Description: log error and throw exception
Pattern:
LOGGER.error(${word_selection}${}, ${exception_variable_name});
throw ${exception_variable_name};
${cursor}

JUnit Templates:
The templates below assist in writing JUnit tests.

Name: before
Context: Java type members
Description: junit before method
Pattern:
${:import (org.junit.Before)}

@Before
public void setUp() {
    ${cursor}
}
Name: after
Context: Java type members
Description: junit after method
Pattern:
${:import (org.junit.After)}

@After
public void tearDown() {
    ${cursor}
}
Name: beforeclass
Context: Java type members
Description: junit beforeclass method
Pattern:
${:import (org.junit.BeforeClass)}

@BeforeClass
public static void oneTimeSetUp() {
    // one-time initialization code
    ${cursor}
}
Name: afterclass
Context: Java type members
Description: junit afterclass method
Pattern:
${:import (org.junit.AfterClass)}

@AfterClass
public static void oneTimeTearDown() {
    // one-time cleanup code
    ${cursor}
}

Do YOU have any useful templates? If so, share them in the comments section!

Changing Java Library Path at Runtime

The java.library.path system property instructs the JVM where to search for native libraries. You have to specify it as a JVM argument using -Djava.library.path=/path/to/lib and then when you try to load a library using System.loadLibrary("foo"), the JVM will search the library path for the specified library. If it cannot be found you will get an exception which looks like:
Exception in thread "main" java.lang.UnsatisfiedLinkError: no foo in java.library.path
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1734)
	at java.lang.Runtime.loadLibrary0(Runtime.java:823)
	at java.lang.System.loadLibrary(System.java:1028)
The java.library.path is read only once when the JVM starts up. If you change this property using System.setProperty, it won't make any difference.
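You can see this for yourself with a small test (the path and library name below are made up): the property value changes, but the loader still searches the paths cached at JVM startup, so the library is not found.

```java
public class LibraryPathTest {
    public static void main(String[] args) {
        // change the property at runtime...
        System.setProperty("java.library.path", "/tmp/mylibs");
        System.out.println(System.getProperty("java.library.path"));

        // ...but loadLibrary still uses the paths read at startup,
        // so even a library placed in /tmp/mylibs would not be found
        try {
            System.loadLibrary("foo");
        } catch (UnsatisfiedLinkError e) {
            System.out.println("not found: " + e.getMessage());
        }
    }
}
```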

Here is the code from ClassLoader.loadLibrary which shows how the path is initialised:

if (sys_paths == null) {
    usr_paths = initializePath("java.library.path");
    sys_paths = initializePath("sun.boot.library.path");
}
As you can see from the code above, the usr_paths variable is only initialised if sys_paths is null, which only happens once.

So, how can you modify the library path at runtime? There are a couple of ways to do this, both involving reflection. You should only do this if you really have to.

Option 1: Unset sys_paths
If you set sys_paths to null, the library path will be re-initialised when you try to load a library. The following code does this:

/**
 * Sets the java library path to the specified path
 *
 * @param path the new library path
 * @throws Exception
 */
public static void setLibraryPath(String path) throws Exception {
    System.setProperty("java.library.path", path);

    //set sys_paths to null
    final Field sysPathsField = ClassLoader.class.getDeclaredField("sys_paths");
    sysPathsField.setAccessible(true);
    sysPathsField.set(null, null);
}
Option 2: Add path to usr_paths
Instead of having to re-evaluate the entire java.library.path and sun.boot.library.path as in Option 1, you can instead append your path to the usr_paths array. This is shown in the following code:
/**
* Adds the specified path to the java library path
*
* @param pathToAdd the path to add
* @throws Exception
*/
public static void addLibraryPath(String pathToAdd) throws Exception{
    final Field usrPathsField = ClassLoader.class.getDeclaredField("usr_paths");
    usrPathsField.setAccessible(true);

    //get array of paths
    final String[] paths = (String[])usrPathsField.get(null);

    //check if the path to add is already present
    for(String path : paths) {
        if(path.equals(pathToAdd)) {
            return;
        }
    }

    //add the new path
    final String[] newPaths = Arrays.copyOf(paths, paths.length + 1);
    newPaths[newPaths.length-1] = pathToAdd;
    usrPathsField.set(null, newPaths);
}

Saturday, August 13, 2011

Dotfiles in Git

I've added all my dotfiles (including my entire bash profile and vimrc) to my GitHub dotfiles repository. Whenever I make any changes, I will commit them to the repository.

In order to download the latest version, go to my Downloads page. Alternatively, if you have git installed, use the following command to clone my repository:

git clone git://github.com/sharfah/dotfiles.git
This will download them to a directory called dotfiles. You can then copy the files recursively (cp -r) to your home directory (don't forget to backup your original files first!). Alternatively, use symlinks.

Sunday, August 07, 2011

Java 7: WatchService for File Change Notification

The Watch Service API in JDK7 allows you to watch a directory for changes to files and receive notification events when a file is added, deleted or modified. You no longer need to poll the file system for changes which is inefficient and doesn't scale well.

The code below shows how you would use the Watch Service API. First, you have to create a WatchService for the file system and then register the directory you want to monitor with it. You have to specify which events (create, modify or delete) you are interested in receiving. Then start an infinite loop to wait for events. When an event occurs, a WatchKey is placed into the watch service's queue and you have to call take to retrieve it. You can then query the key for events and print them out.

/**
 * Watch the specified directory
 * @param dir the directory to watch
 * @throws IOException
 * @throws InterruptedException
 */
public static void watchDir(String dir)
    throws IOException, InterruptedException{

  //create the watchService
  final WatchService watchService = FileSystems.getDefault().newWatchService();

  //register the directory with the watchService
  //for create, modify and delete events
  final Path path = Paths.get(dir);
  path.register(watchService,
            StandardWatchEventKinds.ENTRY_CREATE,
            StandardWatchEventKinds.ENTRY_MODIFY,
            StandardWatchEventKinds.ENTRY_DELETE);

  //start an infinite loop
  while(true){

    //remove the next watch key
    final WatchKey key = watchService.take();

    //get list of events for the watch key
    for (WatchEvent<?> watchEvent : key.pollEvents()) {

      //get the filename for the event
      final WatchEvent<Path> ev = (WatchEvent<Path>)watchEvent;
      final Path filename = ev.context();

      //get the kind of event (create, modify, delete)
      final Kind<?> kind = watchEvent.kind();

      //print it out
      System.out.println(kind + ": " + filename);
    }

    //reset the key
    boolean valid = key.reset();

    //exit loop if the key is not valid
    //e.g. if the directory was deleted
    if (!valid) {
      break;
    }
  }
}
Demo:
$ java Watcher &
$ touch foo
ENTRY_CREATE: foo
$ echo hello >> foo
ENTRY_MODIFY: foo
$ rm foo
ENTRY_DELETE: foo

Java 7: Working with Zip Files

The Zip File System Provider in JDK7 allows you to treat a zip or jar file as a file system, which means that you can perform operations, such as moving, copying, deleting, renaming etc, just as you would with ordinary files. In previous versions of Java, you would have to use ZipEntry objects and read/write using ZipInputStreams and ZipOutputStreams which was quite messy and verbose. The zip file system makes working with zip files much easier!

This post shows you how to create a zip file and extract/list its contents, all using a zip file system.

Constructing a zip file system:
In order to work with a zip file, you have to construct a "zip file system" first. The method below shows how this is done. You need to pass in a properties map with create=true if you want the file system to create the zip file if it doesn't exist.

/**
 * Returns a zip file system
 * @param zipFilename to construct the file system from
 * @param create true if the zip file should be created
 * @return a zip file system
 * @throws IOException
 */
private static FileSystem createZipFileSystem(String zipFilename,
                                              boolean create)
                                              throws IOException {
  // convert the filename to a URI
  final Path path = Paths.get(zipFilename);
  final URI uri = URI.create("jar:file:" + path.toUri().getPath());

  final Map<String, String> env = new HashMap<>();
  if (create) {
    env.put("create", "true");
  }
  return FileSystems.newFileSystem(uri, env);
}
Once you have a zip file system, you can invoke methods of the java.nio.file.FileSystem, java.nio.file.Path and java.nio.file.Files classes to manipulate the zip file.
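For example, renaming a file inside a zip is just a Files.move between two paths of the zip file system. A minimal sketch (the zip and entry names are my own):

```java
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class ZipRenameDemo {
    public static void main(String[] args) throws Exception {
        final Path zip = Paths.get("demo.zip");
        Files.deleteIfExists(zip);

        //create=true so the file system creates the zip file
        final Map<String, String> env = new HashMap<>();
        env.put("create", "true");

        final URI uri = URI.create("jar:" + zip.toUri());
        try (FileSystem zipFileSystem = FileSystems.newFileSystem(uri, env)) {
            //create a file inside the zip
            Files.write(zipFileSystem.getPath("/old.txt"),
                        "hello".getBytes("UTF-8"));

            //rename it, just as you would on an ordinary file system
            Files.move(zipFileSystem.getPath("/old.txt"),
                       zipFileSystem.getPath("/new.txt"));

            System.out.println(Files.exists(zipFileSystem.getPath("/new.txt")));
        }
        Files.delete(zip);
    }
}
```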

Unzipping a Zip File:
In order to extract a zip file, you can walk the zip file tree from the root and copy files to the destination directory. Since you are dealing with a zip file system, extracting a directory is exactly the same as copying a directory recursively to another directory. The code below demonstrates this. (Note the use of the try-with-resources statement to close the zip file system automatically when done.)

/**
 * Unzips the specified zip file to the specified destination directory.
 * Replaces any files in the destination, if they already exist.
 * @param zipFilename the name of the zip file to extract
 * @param destDirname the directory to unzip to
 * @throws IOException
 */
public static void unzip(String zipFilename, String destDirname)
    throws IOException{

  final Path destDir = Paths.get(destDirname);
  //if the destination doesn't exist, create it
  if(Files.notExists(destDir)){
    System.out.println(destDir + " does not exist. Creating...");
    Files.createDirectories(destDir);
  }

  try (FileSystem zipFileSystem = createZipFileSystem(zipFilename, false)){
    final Path root = zipFileSystem.getPath("/");

    //walk the zip file tree and copy files to the destination
    Files.walkFileTree(root, new SimpleFileVisitor<Path>(){
      @Override
      public FileVisitResult visitFile(Path file,
          BasicFileAttributes attrs) throws IOException {
        final Path destFile = Paths.get(destDir.toString(),
                                        file.toString());
        System.out.printf("Extracting file %s to %s\n", file, destFile);
        Files.copy(file, destFile, StandardCopyOption.REPLACE_EXISTING);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult preVisitDirectory(Path dir,
          BasicFileAttributes attrs) throws IOException {
        final Path dirToCreate = Paths.get(destDir.toString(),
                                           dir.toString());
        if(Files.notExists(dirToCreate)){
          System.out.printf("Creating directory %s\n", dirToCreate);
          Files.createDirectory(dirToCreate);
        }
        return FileVisitResult.CONTINUE;
      }
    });
  }
}
Creating a Zip File:
The following method shows how to create a zip file from a list of files. If a directory is passed in, it walks the directory tree and copies files into the zip file system:
/**
 * Creates/updates a zip file.
 * @param zipFilename the name of the zip to create
 * @param filenames list of filename to add to the zip
 * @throws IOException
 */
public static void create(String zipFilename, String... filenames)
    throws IOException {

  try (FileSystem zipFileSystem = createZipFileSystem(zipFilename, true)) {
    final Path root = zipFileSystem.getPath("/");

    //iterate over the files we need to add
    for (String filename : filenames) {
      final Path src = Paths.get(filename);

      //add a file to the zip file system
      if(!Files.isDirectory(src)){
        final Path dest = zipFileSystem.getPath(root.toString(),
                                                src.toString());
        final Path parent = dest.getParent();
        if(Files.notExists(parent)){
          System.out.printf("Creating directory %s\n", parent);
          Files.createDirectories(parent);
        }
        Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
      }
      else{
        //for directories, walk the file tree
        Files.walkFileTree(src, new SimpleFileVisitor<Path>(){
          @Override
          public FileVisitResult visitFile(Path file,
              BasicFileAttributes attrs) throws IOException {
            final Path dest = zipFileSystem.getPath(root.toString(),
                                                    file.toString());
            Files.copy(file, dest, StandardCopyOption.REPLACE_EXISTING);
            return FileVisitResult.CONTINUE;
          }

          @Override
          public FileVisitResult preVisitDirectory(Path dir,
              BasicFileAttributes attrs) throws IOException {
            final Path dirToCreate = zipFileSystem.getPath(root.toString(),
                                                           dir.toString());
            if(Files.notExists(dirToCreate)){
              System.out.printf("Creating directory %s\n", dirToCreate);
              Files.createDirectories(dirToCreate);
            }
            return FileVisitResult.CONTINUE;
          }
        });
      }
    }
  }
}
Listing the contents of a zip file:
This is the same as extracting a zip file except that instead of copying the files visited, we simply print them out:
/**
 * List the contents of the specified zip file
 * @param zipFilename the name of the zip file to list
 * @throws IOException
 */
public static void list(String zipFilename) throws IOException{

  System.out.printf("Listing Archive:  %s\n",zipFilename);

  //create the file system
  try (FileSystem zipFileSystem = createZipFileSystem(zipFilename, false)) {

    final Path root = zipFileSystem.getPath("/");

    //walk the file tree and print out the directory and filenames
    Files.walkFileTree(root, new SimpleFileVisitor<Path>(){
      @Override
      public FileVisitResult visitFile(Path file,
          BasicFileAttributes attrs) throws IOException {
        print(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult preVisitDirectory(Path dir,
          BasicFileAttributes attrs) throws IOException {
        print(dir);
        return FileVisitResult.CONTINUE;
      }

      /**
       * prints out details about the specified path
       * such as size and modification time
       * @param file
       * @throws IOException
       */
      private void print(Path file) throws IOException{
        final DateFormat df = new SimpleDateFormat("MM/dd/yyyy-HH:mm:ss");
        final String modTime= df.format(new Date(
                             Files.getLastModifiedTime(file).toMillis()));
        System.out.printf("%d  %s  %s\n",
                          Files.size(file),
                          modTime,
                          file);
      }
    });
  }
}
Further Reading:
Zip File System Provider

Saturday, August 06, 2011

Eclipse: Default to "File Search" Tab in Search Dialog

I don't like the fact that Eclipse defaults to "Java Search" whenever you open the Search Dialog (Ctrl+H), because I almost ALWAYS need "File Search". I then need to press Ctrl+Page Up to switch to the "File Search" tab. The good news is that there is a way to make Eclipse default to "File Search" by changing your key bindings. This is how:
  • Go to Window > Preferences and navigate to General > Keys
  • Type Open Search Dialog in the filter text box to search for the command. You should see it bound to Ctrl+H. Click on the command and press Unbind Command
  • Now type File Search and click on the command. In the Binding text field enter Ctrl+H
  • Press Apply
Now, whenever you hit Ctrl+H, the Search Dialog will show the File Search tab by default!

Another thing...
You can clean up the Search Dialog by removing those tabs that you don't need. Do this by opening the Search Dialog (Ctrl+H) and then clicking on Customize.... Deselect the ones you don't need e.g. Task Search, JavaScript Search, Plug-in Search, Spring Pointcut Matches etc.

My Bash Profile - Part VI: Inputrc

inputrc is the name of the readline startup file. You can set key bindings and certain variables in this file. One of my favourite key bindings is Alt+L, which runs ls -ltrF. I also have bindings which allow you to go back and forth across words using the Ctrl+Left/Right Arrow keys.

To take a look at all your current key bindings execute the command bind -P or bind -p. Check out the man pages for more information.

Update: My dotfiles are now in Git. For the latest version, please visit my GitHub dotfiles repository.

Here is my INPUTRC:

set bell-style none
set completion-ignore-case On
set echo-control-characters Off
set enable-keypad On
set mark-symlinked-directories On
set show-all-if-ambiguous On
set show-all-if-unmodified On
set skip-completed-text On
set visible-stats On

"\M-l": "ls -ltrF\r"
"\M-h": "dirs -v\r"

# If you type any text and press Up/Down,
# you can search your history for commands starting
# with that text
"\e[B": history-search-forward
"\e[A": history-search-backward

# Use Ctrl or Alt Arrow keys to move along words
"\C-[OD" backward-word
"\C-[OC" forward-word
"\e\e[C": forward-word
"\e\e[D": backward-word

"\M-r": forward-search-history
If you have any useful bindings, please share them in the comments section below.

More posts on my Bash profile:

Tuesday, August 02, 2011

Java 7: Deleting a Directory by Walking the File Tree

The Java 7 NIO library allows you to walk a file tree and visit each file in the tree. You do this by implementing a FileVisitor and then calling Files.walkFileTree using the visitor. The visitor has four methods:
  • visitFile: Invoked for a file in a directory.
  • visitFileFailed: Invoked for a file that could not be visited.
  • preVisitDirectory: Invoked for a directory before entries in the directory are visited.
  • postVisitDirectory: Invoked for a directory after entries in the directory, and all of their descendants, have been visited.

Recursively Delete all Files in a Directory:
The following code shows how you can recursively delete a directory by walking the file tree. It does not follow symbolic links. I have overridden the visitFile and postVisitDirectory methods in SimpleFileVisitor so that when a file is visited, it is deleted and after all the files in the directory have been visited, the directory is deleted.

import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import static java.nio.file.FileVisitResult.*;

public class DeleteDirectory {

    public static void main(String[] args) {
        Path dir = Paths.get("/tmp/foo");
        try {
            Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {

                @Override
                public FileVisitResult visitFile(Path file,
                        BasicFileAttributes attrs) throws IOException {

                    System.out.println("Deleting file: " + file);
                    Files.delete(file);
                    return CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path dir,
                        IOException exc) throws IOException {

                    System.out.println("Deleting dir: " + dir);
                    if (exc == null) {
                        Files.delete(dir);
                        return CONTINUE;
                    } else {
                        throw exc;
                    }
                }

            });
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
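To try the visitor out safely, you could first build a small tree under a temporary directory and then delete it with the same visitor pattern as above (the directory and file names here are arbitrary):

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class DeleteTreeDemo {

    // Recursively deletes dir using the same visitor pattern as above.
    static void deleteTree(Path dir) throws IOException {
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }
            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) throw exc;
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Build a small tree: <tmp>/sub/file.txt
        Path root = Files.createTempDirectory("walkdemo");
        Path sub = Files.createDirectory(root.resolve("sub"));
        Files.write(sub.resolve("file.txt"), "hello".getBytes());

        deleteTree(root);
        System.out.println("Deleted: " + !Files.exists(root));
    }
}
```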

Sunday, July 31, 2011

Java 7: Try-With-Resources

The try-with-resources statement, also called Automatic Resource Management (ARM), allows you to declare one or more resources in a try statement. When the statement completes, whether normally or abruptly, all of its resources are closed automatically. You no longer need to close resources manually in a finally block, which makes resource leaks far less likely.

Here is a method to make a copy of a file, using the Try-With-Resources statement. There are two resources defined in the try statement, which are automatically closed when the statement completes.

public static void copyFile(String src, String dest) throws IOException  {
  try (BufferedReader in = new BufferedReader(new FileReader(src));
       BufferedWriter out = new BufferedWriter(new FileWriter(dest))){
      String line;
      while((line = in.readLine()) != null) {
          out.write(line);
          out.write('\n');
      }
  }//no need to close resources in a "finally"
}
Here is another example, in which I have created my own resources by implementing the AutoCloseable interface:
class ResourceA implements AutoCloseable{
  public void read() throws Exception{
    throw new Exception("ResourceA read exception");
  }
  @Override
  public void close() throws Exception {
    throw new Exception("ResourceA close exception");
  }
}

class ResourceB implements AutoCloseable{
  public void read() throws Exception{
    throw new Exception("ResourceB read exception");
  }
  @Override
  public void close() throws Exception {
    throw new Exception("ResourceB close exception");
  }
}

//a test method
public static void test() throws Exception{
  try (ResourceA a = new ResourceA();
       ResourceB b = new ResourceB()) {
    a.read();
    b.read();
  } catch (Exception e) {
    throw e;
  }
}
When this code is executed, a.read() throws an exception. The two resources are automatically closed, first B and then A (in the reverse order of their creation). The "read" exception is propagated out of the method, and the two "close" exceptions are "suppressed". You can retrieve the suppressed exceptions by calling the Throwable.getSuppressed method on the exception thrown by the try block. The complete stack trace is shown below:
java.lang.Exception: ResourceA read exception
  at ResourceA.read(Dummy.java:48)
  at Dummy.test(Dummy.java:18)
  at Dummy.main(Dummy.java:38)
  Suppressed: java.lang.Exception: ResourceB close exception
    at ResourceB.close(Dummy.java:63)
    at Dummy.test(Dummy.java:20)
    ... 1 more
  Suppressed: java.lang.Exception: ResourceA close exception
    at ResourceA.close(Dummy.java:52)
    at Dummy.test(Dummy.java:20)
    ... 1 more
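The suppressed exceptions travel with the primary exception, so a caller can inspect them programmatically. Here is a minimal sketch (FailingResource is a hypothetical class, similar to ResourceA above):

```java
class FailingResource implements AutoCloseable {
    public void read() throws Exception {
        throw new Exception("read failed");
    }
    @Override
    public void close() throws Exception {
        throw new Exception("close failed"); // suppressed by the read exception
    }
}

public class SuppressedDemo {
    // Runs a try-with-resources block and returns the messages of any
    // suppressed exceptions attached to the primary exception.
    static String[] suppressedMessages() {
        try (FailingResource r = new FailingResource()) {
            r.read();
        } catch (Exception e) {
            Throwable[] suppressed = e.getSuppressed();
            String[] messages = new String[suppressed.length];
            for (int i = 0; i < suppressed.length; i++) {
                messages[i] = suppressed[i].getMessage();
            }
            return messages;
        }
        return new String[0];
    }

    public static void main(String[] args) {
        for (String m : suppressedMessages()) {
            System.out.println("Suppressed: " + m);
        }
    }
}
```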
Further Reading:
The try-with-resources Statement

Java 7: Precise Rethrow

Previously, rethrowing an exception was treated as throwing the declared type of the catch parameter. For example, suppose your try block could throw a ParseException or an IOException. To intercept all exceptions and rethrow them, you would have to catch Exception and declare your method as throwing Exception. This is an "imprecise rethrow", because you are throwing the general Exception type (instead of the specific ones), and statements calling your method have to catch this general Exception.

This is illustrated below:

//imprecise rethrow.
//must use "throws Exception"
public static void imprecise() throws Exception{
	try {
		new SimpleDateFormat("yyyyMMdd").parse("foo");
		new FileReader("file.txt").read();
	} catch (Exception e) {
		System.out.println("Caught exception: " + e.getMessage());
		throw e;
	}
}
However, in Java 7, you can be more precise about the exception types being rethrown from a method. If you rethrow an exception from a catch block, you are actually throwing an exception type which:
  • the try block can throw,
  • no previous catch clause handles, and
  • is a subtype of one of the types in the declaration of the catch parameter
This leads to improved checking for rethrown exceptions. You can be more precise about the exceptions being thrown from the method and you can handle them a lot better at the calling site.
//java 7: precise rethrow.
//no longer "throws Exception"
public static void precise() throws ParseException, IOException{
	try {
		new SimpleDateFormat("yyyyMMdd").parse("foo");
		new FileReader("file.txt").read();
	} catch (Exception e) {
		System.out.println("Caught exception: " + e.getMessage());
		throw e;
	}
}

//this example handles ParseException
public static void precise2() throws IOException{
	try {
		new SimpleDateFormat("yyyyMMdd").parse("foo");
		new FileReader("file.txt").read();
	} catch(ParseException e){
	    System.out.println("Parse Exception");
	}catch (Exception e) {
		System.out.println("Caught exception: " + e.getMessage());
		throw e;
	}
}
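At the calling site, the specific types can now be handled individually. Here is a minimal sketch reusing the precise() method above (the classify helper is hypothetical, added only to make the behaviour observable):

```java
import java.io.FileReader;
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class PreciseRethrowDemo {

    public static void precise() throws ParseException, IOException {
        try {
            new SimpleDateFormat("yyyyMMdd").parse("foo");
            new FileReader("file.txt").read();
        } catch (Exception e) {
            throw e; // precise rethrow: compiler knows the possible types
        }
    }

    // Hypothetical helper: reports which exception type the call produced.
    static String classify() {
        try {
            precise();
            return "none";
        } catch (ParseException e) {
            return "parse";
        } catch (IOException e) {
            return "io";
        }
    }

    public static void main(String[] args) {
        // parse("foo") fails first, so a ParseException is thrown
        System.out.println(classify());
    }
}
```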
(Note: early documentation of this feature states that you have to use a final modifier on the catch parameter, but this restriction was lifted later on, so it is not necessary.)

Further Reading
Rethrowing Exceptions with More Inclusive Type Checking

Saturday, July 30, 2011

Java 7: Safe Varargs Method Invocation

In previous versions of Java, when you invoke a varargs method with a non-reifiable varargs type, the compiler generates a warning on the calling statement. Consider the code:
//varargs method
public static <T> void print(T... a) {
  for (T t : a) {
      System.out.println(t);
  }
}

//calling method
public static void main(String[] args){

  print("Hello", "World"); //this is fine

  print(new Pair<Integer,String>(1,"One"), new Pair<Integer,String>(2,"Two"));
  //WARNING: Type safety : A generic array of Pair<Integer,String>
  //is created for a varargs parameter
}
This is because the compiler tries to create an array of type Pair<Integer,String>[] to hold the varargs, which is not permitted because Pair<Integer,String> is erased at runtime to just Pair. (More information here.) To suppress this warning, you need to add @SuppressWarnings("unchecked") to each method that calls the varargs method.

In JDK7, the warning has been moved from the call site to the varargs method declaration and you can annotate the varargs method with @SafeVarargs in order to suppress it. This reduces the total number of warnings reported and those that have to be suppressed.

@SafeVarargs
// WARNING SUPPRESSED: Type safety: Potential heap pollution via varargs parameter a
public static <T> void print(T... a) {
  for (T t : a) {
      System.out.println(t);
  }
}

public static void main(String[] args){
  print("Hello", "World");
  print(new Pair<Integer,String>(1,"One"), new Pair<Integer,String>(2,"Two"));
  //no warnings :)
}
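The Pair class used in these examples is not part of the JDK; a minimal generic version might look like this:

```java
public class Pair<A, B> {
    private final A first;
    private final B second;

    public Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    public A getFirst()  { return first; }
    public B getSecond() { return second; }

    @Override
    public String toString() {
        return "(" + first + ", " + second + ")";
    }
}
```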
You can see this annotation used in JDK7's Arrays.asList method:

@SafeVarargs
public static <T> List<T> asList(T... a) {
   return new ArrayList<>(a);
}
Eclipse support:
Quick fix gives you the option to Add @SafeVarargs to your method.

Further Reading:
Improved Compiler Warnings and Errors When Using Non-Reifiable Formal Parameters with Varargs Methods

Java 7: Strings in Switch

Strings in switch gives you the ability to switch on String values, just as you can on primitives. Previously, you would have had to chain if-else tests for string equality or introduce an enum; neither is necessary anymore.

The strings-in-switch feature works by first switching on the hashCode of the String and then performing an equals test.
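This two-step translation can be sketched by hand. The following is a hand-written equivalent, not actual compiler output; the hash constants were computed with String.hashCode:

```java
public class SwitchDesugarDemo {
    // Roughly how the compiler translates:
    //   switch (s) { case "foo": ...; case "bar": ...; }
    static int index(String s) {
        int index = -1;
        switch (s.hashCode()) {
            case 101574: // "foo".hashCode()
                if (s.equals("foo")) index = 0;
                break;
            case 97299:  // "bar".hashCode()
                if (s.equals("bar")) index = 1;
                break;
        }
        // a second switch on index would then run the original case bodies
        return index;
    }

    public static void main(String[] args) {
        System.out.println(index("foo"));
        System.out.println(index("bar"));
        System.out.println(index("baz"));
    }
}
```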

Here is an example of a method which returns the number of days in a month. It uses if-else statements and string equality:
public static int getDaysInMonth(String month, int year) {
    if("January".equals(month) ||
       "March".equals(month)   ||
       "May".equals(month)     ||
       "July".equals(month)    ||
       "August".equals(month)  ||
       "October".equals(month) ||
       "December".equals(month))
        return 31;
    else if("April".equals(month)    ||
            "June".equals(month)     ||
            "September".equals(month)||
            "November".equals(month))
        return 30;
    else if("February".equals(month))
        return ((year % 4 == 0 && year % 100 != 0) ||
                 year % 400 == 0) ? 29 : 28;
    else
        throw new IllegalArgumentException("Invalid month: " + month);
}
The same code can be rewritten neatly using strings-in-switch as follows:
public static int getDaysInMonth2(String month, int year) {
    switch(month) {
        case "January":
        case "March":
        case "May":
        case "July":
        case "August":
        case "October":
        case "December":
            return 31;
        case "April":
        case "June":
        case "September":
        case "November":
            return 30;
        case "February":
            return ((year % 4 == 0 && year % 100 != 0) ||
                     year % 400 == 0) ? 29 : 28;
        default:
            throw new IllegalArgumentException("Invalid month: " + month);
    }
}
Further Reading:
Strings in switch Statements

Java 7: Underscores in Numbers and Binary Literals

To aid readability, you can now place underscores between the digits of a numeric literal, but the literal must start and end with a digit. For example:
int million = 1_000_000;
float pi = 3.14_159f;
JDK7 also allows you to express integer literals in binary form. This makes it easier to read code which uses bitwise operations. In order to do this, prefix your binary sequence with 0b or 0B. For example:
//previously, you would have had to do it like this:
int x = Integer.parseInt("1000", 2);

//in jdk7, you can create a binary literal like this:
int eight = 0b1000;

//easier to read bitwise operations
int four = 0b1000>>1;
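Both forms are ordinary int values at runtime, as a quick check confirms (the constant names here are arbitrary):

```java
public class LiteralsDemo {
    static final int MILLION = 1_000_000;   // underscores for readability
    static final int EIGHT   = 0b1000;      // binary literal
    static final int FOUR    = 0b1000 >> 1; // bitwise ops on binary literals

    public static void main(String[] args) {
        System.out.println(MILLION == 1000000); // true
        System.out.println(EIGHT == 8);         // true
        System.out.println(FOUR == 4);          // true
    }
}
```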
Further Reading:
Underscores in Numeric Literals
Binary Literals

Java 7: Multi-catch

You can now catch more than one exception in a single catch clause which removes redundant code.

Here is an example of a statement which throws multiple exceptions:

try {
    Class.forName("java.lang.Object").newInstance();
} catch (ClassNotFoundException e) {
    e.printStackTrace();
} catch (InstantiationException e) {
    e.printStackTrace();
} catch (IllegalAccessException e) {
    e.printStackTrace();
}
In JDK7, you can collapse the catch blocks into a single one:
try {
    Class.forName("java.lang.Object").newInstance();
} catch (ClassNotFoundException |
         InstantiationException |
         IllegalAccessException e) {
    e.printStackTrace();
}
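As another small, self-contained illustration (the classify method is hypothetical), the single catch parameter is bound to whichever of the listed exceptions was actually thrown:

```java
public class MultiCatchDemo {
    // Parses the first argument, reporting which exception (if any) occurred.
    static String classify(String[] args) {
        try {
            return String.valueOf(Integer.parseInt(args[0]));
        } catch (NumberFormatException | ArrayIndexOutOfBoundsException e) {
            // e is implicitly final; its static type is the common
            // supertype of the alternatives (here, RuntimeException)
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new String[] {"42"}));
        System.out.println(classify(new String[] {"abc"}));
        System.out.println(classify(new String[] {}));
    }
}
```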
Eclipse Support:
  • If you have multiple catch clauses, quick fix gives you the option to Combine catch blocks provided all the catch bodies are the same

  • Conversely, if you have a multi-catch clause, quick fix gives you the option to Use separate catch blocks so that you can handle each exception separately

  • Quick fix gives you the option to Move exception to separate catch block which allows you to take exceptions out of the multi-catch in case you want to handle them separately

  • Quick fix also gives you the option to Surround with try/multi-catch
Further Reading
Handling More Than One Type of Exception

Java 7: Diamond Operator

The diamond operator (<>) removes the need for explicit type arguments in constructor calls to generic classes, thereby reducing visual clutter. For example:
//previously:
Map<Integer, List<String>> map = new HashMap<Integer, List<String>>();

//in jdk7, use the diamond operator. Saves typing!
Map<Integer, List<String>> map2 = new HashMap<>();

List<?> list = new ArrayList<>();
The compiler infers the type arguments on the right-hand side from the declaration on the left. So if you declare a List<?>, the compiler will infer a list of Object.
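A quick check that the inferred types behave as expected (the names here are arbitrary):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DiamondDemo {
    static Map<Integer, List<String>> build() {
        // Type arguments inferred from the declared type on the left
        Map<Integer, List<String>> map = new HashMap<>();
        List<String> names = new ArrayList<>();
        names.add("one");
        map.put(1, names);
        return map;
    }

    public static void main(String[] args) {
        // String methods are available on the elements, as expected
        System.out.println(build().get(1).get(0).length());
    }
}
```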

Eclipse support:

  • Eclipse Content Assist (Ctrl + Space), auto-completes using a diamond instead of explicit type arguments. So, in the example above, when you type new HashM[Ctrl+Space], Eclipse will insert new HashMap<>();.

  • You can also configure Eclipse to warn you if you use explicit type arguments instead of a diamond. To do this, go to your Preferences and navigate to Java > Compiler > Errors/Warnings. In the Generic types section, set Redundant type arguments to Warning. Eclipse will then offer you a quick fix to remove the type arguments if you accidentally put them in.
Further Reading:
Type Inference for Generic Instance Creation