Colin's Devlog

Saturday, November 12, 2011

Lambdas in Java: Playing with the snapshot binary

The first binary snapshots of the Project Lambda repository were made available yesterday. I hadn't bothered to check out and build the code before now, but with only having to download the snapshot I decided to give it a try. I'd been following the mailing list and looking at the code changes to the JDK libraries, so I more or less knew what to expect.

In general, things worked nicely. While I wasn't a big fan of it at first, I do like how concise the new syntax is. It's similar to the C# syntax:

people.filter(p -> p.getAge() < 50)
    .map(p -> p.getFirstName() + " " + p.getLastName())
    .forEach(n -> { System.out.println(n); });

It's pretty much what you'd expect, and obviously nothing revolutionary... just long overdue in my opinion. What I'm really excited about is the improvements to APIs (and new APIs) that concise lambda expressions will allow in Java. I experimented with just a few examples, including:

A table model that uses a map function to derive column values from objects.
A logging API that can take a Supplier for the message so that it only builds the message if needed.
A version of Guava's Optional that has a map function. Far too clunky to be very useful with anonymous inner classes, very nice with a concise lambda expression:
A framework for async tasks with callbacks based on Guava's ListenableFuture code that I've been playing with. It's so much nicer when there's so little syntactic overhead for both submitting tasks (since lambdas can be used for Runnable and Callable) and adding callbacks (since they can be SAM types too).

Extension methods should also be a huge benefit. While it could be considered unfortunate that they don't allow you to add methods to others' APIs, I think that's a reasonable decision for Java. Giving API authors the ability to extend interfaces will allow APIs to evolve over time in ways they couldn't before.

There were a couple issues I noticed, but they weren't a big deal and are to be expected with something that's very much in progress.

Extension methods only work for JDK classes out of the box because currently a bytecode weaving tool is used to simulate VM support for extension methods (which isn't there yet). That tool was run as part of the snapshot build, but has to be run separately on any other classes that you want to support extension methods. For example, an AbstractMethodError is thrown if you try to call filter on Guava's ImmutableList... the compiler knows that you should be able to call it since filter is defined on Iterable, but without the VM support or having run the weaver for ImmutableList, it fails at runtime. (Mike Duigou explains how to use the weaver here.)

Support for method references seems kind of shaky. The compiler threw an exception in one situation where I tried to use a method reference and I got compile errors in a number of other situations where I thought a method reference should work. In some other situations, they worked. A final syntax hasn't been decided on for method references, so I imagine support for them just isn't as far along.

There were some unfortunate overload resolution shortcomings. For example, neither of the following statements compile:

Comparator<String> c = Comparators.comparing(s -> s.length());
Comparator<String> c = Comparators.comparing(
    (String s) -> s.length());

You have to write one of the following, so it know which kind of Mapper you want the lambda to be:

Comparator<String> c = Comparators.<String, Integer>comparing(
    s -> s.length()); // Mapper<String, Integer>
Comparator<String> c = Comparators.comparing(
    (IntMapper<String>) s -> s.length()); // IntMapper<String>

I hope it will be possible to change the resolution to choose the most specific interface (in this case IntMapper<String>) so I don't have to specify it unless I want to.

Anyway, it was fun to be able to actually write and run code using lambda expressions in Java. Between Project Lambda and JSR-310 (the new date/time API), I'm really looking forward to JDK 8... too bad it isn't coming until 2013!

Monday, June 20, 2011

Using libraries

I occasionally get flak for recommending Guava as a solution on relatively simple questions on StackOverflow. The argument generally goes: "this is so easy, don't add a library just for that!". The key phrase there is "just for that": it seems that some people on StackOverflow tend to take the viewpoint that whatever is being asked in the question is all the person who asked will need to do and that as such the only valid answer is some standalone code they can copy and paste in. I don't believe that a whole library is worth it to save yourself from writing one 4-line method... I just believe code doesn't exist in isolation and that in almost any project that's more than a few classes, Guava will be more than worth its weight.

The point of Guava is that it makes a lot of things easier, from fairly easy things that take several lines of code (but should just take one) to very hard things that would be easy to get wrong. In aggregate, using its facilities instead of writing extra code yourself can make your application clearer and more maintainable. There's also the benefit that people who have used the library before will know where to look for things instead of having to figure out what class in your project (if any) you copy/pasted some certain utility method into.

I also believe that it's up to the person who asked the question to evaluate whether they want to add a library or not. Someone almost always posts an alternative answer that does not use a library, so that option is generally going to be on the table. My hope is that they'll say "oh, nice that someone has made something that fixes this little pain point for me", take a look at what else it offers and recognize the value it'll provide them. I hope this because I genuinely believe they'll be better off if they do.

I'm aware that there are various concerns such as company policy that can get in the way of using a 3rd party library and that opinions differ on what libraries are good to use, but in general I think answers that make use of libraries are perfectly valid even when they aren't necessarily a big savings over writing the code yourself for the problem at hand.

Thursday, April 7, 2011

Lambdas in Java: constructor references

One thing I'm extremely excited about is the addition of support for lambda expressions in JDK 8. I've been following the Project Lambda mailing list since its creation and watching as things have evolved (and as it slipped from JDK 7, sadly... though there really wasn't time for it). Anyway, since I'm constantly thinking "Gahhh! If only Java had lambdas already!" when I'm working, I thought I should do some posts talking about what I'm looking forward to. Most of them will probably involve Guava (since it's one of my favorite things in Java) and how things you can do currently with it will be easier and read nicer using lambdas.

Today, I want to talk about constructor references! Basic support for them was added to the lambda repository of OpenJDK back in January.

A constructor reference is, as its name implies, a reference to a class constructor. Creating such a reference gives you an object implementing some interface and having a single method that, when called, will call the constructor and return the new instance that is created. For a simple example of how constructor references can help make code that's currently verbose nice and simple, consider the following code:

Multimap<String, String> multimap = Multimaps.newMultimap( new TreeMap<String, Collection<String>>(), new Supplier<List<String>>() { @Override public List<String> get() { return new ArrayList<String>(); } });

This code creates a Multimap sorted by keys but with unsorted ArrayLists holding values. One thing stands out here: the Supplier. That's a lot of code just to say "I want to make an ArrayList for each collection". But there isn't really any better way to do that in Java. Now take a look at how a constructor reference improves this:

The whole bulky Supplier from the first example has been replaced with ArrayList<String>#new. Just that! The reference to the no-arg constructor for ArrayList will be translated to a Supplier<ArrayList<String>> since it has a compatible method signature (no arguments and returns an ArrayList<String>). This is a huge improvement! Not only does it cut out 2/3 of the lines of code here (and a lot of ugly indenting, etc.), it's much clearer what we're doing.

What if you want to specify the initial capacity for the lists that are created? This is just as easy: use ArrayList<String>#new(100) instead. This will call the constructor that takes an int argument each time, passing the value 100.

Of course, even without constructor references all this would be relatively easy using lambda expressions: ArrayList<String>#new would instead be something like:

#{ -> new ArrayList<String>() }

It's nice to see constructors are getting the same treatment methods are, though.

As a side note, boy do I ever wish that I could use markdown on this blog. If writing a nicely formatted blog post were as easy as writing a StackOverflow answer... well, maybe I'd do it a little more often!

Wednesday, February 16, 2011

Deploying documentation to GitHub pages with Jenkins / Hudson

GitHub Pages is a neat feature of GitHub that allows you to serve static files by using a special gh-pages branch in a repository. Having got Jenkins set up to build my jdbc-persist project, I thought it would be neat to set up something to take the Javadoc generated during a build and automatically push it to the gh-pages branch when the build finishes.

My project uses Maven for building, and I didn't see any way to have the same job that builds the project handle pushing the documentation on a separate branch, so I set up a second job (which I called jdbc-persist-site) to do that.

The process isn't terribly elegant. The first job uses the Maven site:site goal to generate both Javadoc and a whole Maven project site. The second project first pulls my gh-pages branch. It then runs the following batch script (wouldn't be much different as a .sh script):

This basically just grabs the generated site from the other job manually and overwrites the previous version of the files. I actually had to split the three git commands into 3 separate batch file steps in Jenkins because a single script would stop after just 1 git call.

The final step was to push the new commit to gh-pages so that the updated files would show up. This was a simple matter of selecting the Git Publisher post-build action and telling it to publish to the gh-pages branch on origin. I also set up a trigger to run the jdbc-persist-site job after any successful jdbc-persist build, ensuring that any changes I make are automatically reflected on the site each time.

I find an approach like this to be far better than the approach of storing generated Javadoc files in the same branch as the source code the way a number of projects I like (Guava and Guice for example) do. Directories containing artifacts like Javadoc that can be derived from the source clutter up the project structure, clutter up commit history with changes to derived files and add to the volume of files that someone checking out the project has to download. I think the ability to do this sort of thing easily speaks well for the ease and power of Git's branching model and for GitHub's awesome features!

Tuesday, February 15, 2011

Git clone error on Jenkins/Hudson on Windows

Recently I set up a new instance of Jenkins (formerly Hudson) running on my Windows 7 desktop computer. I tried to set up a job that would pull from a GitHub repository and do a build but (like every other time I've tried this) was foiled by the job simply hanging at the step where it tries to clone or fetch from GitHub. Cancelling the job leads to the following errors in the console output:

First, I tried the usual suspects:

Ensuring that Jenkins was running as the correct user.
Ensuring that the user's .ssh directory and keys were in place and checked out.
Trying the full path to git.exe in Git's bin directory rather than just using git as the path.

I did a search for the error, but as in the past it turned up no solutions that actually fixed the problem for me. I then tried running a job that just executed a Windows batch command to run git clone on the same repository... which worked fine!

The solution

Double-checking things, I noticed that my PATH contained two directories for my Git installation (msysgit): one for Git/bin and another for Git/cmd, which I hadn't been aware of before. Looking in Git/cmd, I noticed that it contained a file git.cmd which appeared to be some kind of script wrapping git.exe. So, I went into the Jenkins configuration and changed the path to the git executable to git.cmd. And that fixed everything!

Based on what I've seen when searching for this error, it seems like there are quite a few potential causes for it. But this is what worked for me in my situation and I haven't seen this solution elsewhere, so I thought I'd write it down... if nothing else to help me remember if it happens again.

Friday, June 18, 2010

Property interfaces and Guava functional programming

Something that I’ve been thinking about lately is how I can make it easier to use Guava’s methods that operate on Functions, Predicates, etc.

One simple use for Functions is to get properties of objects, allowing you to (among other things) create a transformed view of a collection containing a certain property from each element. For example, transforming a list of Person objects to a String collection of names. Any time you want to do something with a collection of objects based on a certain property of those objects, this is useful.

However, it's rather ugly and awkward to have to create an anonymous inner class inline when calling a method that uses a Function. While you can create static final instances of such functions and methods that return instances of functions, with many classes you could end up with many such instances and methods. One good way to reduce the number of Function objects you need to create and make it easy to work with certain common properties (such as IDs, names, etc.) on many objects is to create interfaces that each expose a single such property:

Not only does this help add consistency to your classes, it allows you to collect behavior that can work on any type of object that has the property the interface exposes. I'd probably name such classes as the plural of the property name ("Ids" and "Names", for example).

class Ids { private Ids() {} public static final Function<HasId, Long> GET = new Function<HasId, Long>() { public Long apply(HasId hasId) { return hasId.getId(); } }; public static Iterable<Long> of( Iterable<? extends HasId> items) { return Iterables.transform(items, GET); } }

This makes working with the IDs of a list of things with IDs as easy as possible:

The GET function is exposed to allow its use for anything you might want a function for, not just tranform. For example, with a HasName interface and Names class similar to the code for IDs, I could do something like this:

This would of course filter a list of people to an iterable that only includes people with the name Bob. It uses static imports (because they make this stuff read a lot nicer), and the methods are:

I think this approach can make working with the properties of objects using Functions a lot easier and cleaner, while also encouraging more consistency between classes in general, which is nice.

Thursday, March 25, 2010

Get your UI out of my logic

I really don't like seeing something like this used in a domain object.

I've also seen a list of objects retrieved from the database and put into an Object array with "Select" as array[0] and the rest of the objects being actual objects someone might care about. Let's just have everyone remember what type of objects are in this array, and they should also remember that the first element isn't one of those!

I can't imagine what people find so hard about producing a new list or whatever to base their combo box model on at the time that the combo box is actually created. Furthermore, with something like the enum above, the enum constants use lowercase letters because it'll look nicer when they're rendered in the UI using the default toString(). The use of toString() to produce the text for the UI in general bothers me. I want it to produce text that's helpful to me in error messages... use some other means for rendering it to users.

I guess what I want to say is: keep user interface concerns where they belong, near the user interface!