Monday, 13 September 2010

On Maven 3 Parallel builds and why smaller modules are good for you

I recently noticed that parallel builds are now a feature of Maven 3. I decided to try it out on the project I'm working on (calculating some taxes for the Flemish government), and I noticed a big speedup:

Default maven:

mvn clean install
...

[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 7:05.842s
[INFO] Finished at: Mon Sep 13 19:23:29 CEST 2010
[INFO] Final Memory: 193M/1216M
[INFO] ------------------------------------------------------------------------

Using the parallel builds feature:

mvn clean install -T 4
...

[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 5:04.982s (Wall Clock)
[INFO] Finished at: Mon Sep 13 19:29:02 CEST 2010
[INFO] Final Memory: 135M/1326M
[INFO] ------------------------------------------------------------------------

This is on my quad-core i7 system. The amount of parallelism you can achieve depends on the structure of your project. If you have a project where C depends on B, which in turn depends on A, you will not get a speedup, since nothing can be built in parallel. If, on the other hand, B and C both depend only on A, you can build B and C in parallel once A is done.
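
To make the effect of the dependency structure concrete, here is a minimal sketch that assigns each module to the earliest "wave" in which it could be built. It is not how Maven's scheduler is implemented; the class name BuildLevels and the method waves are made up for this illustration, and it assumes the dependency graph has no cycles.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BuildLevels {

    /**
     * Returns, for each module, the earliest wave in which it can be built:
     * 0 for modules without dependencies, otherwise 1 + the highest wave
     * of its dependencies. Assumes an acyclic dependency graph.
     */
    static Map<String, Integer> waves(Map<String, List<String>> dependsOn) {
        Map<String, Integer> result = new HashMap<>();
        for (String module : dependsOn.keySet()) {
            wave(module, dependsOn, result);
        }
        return result;
    }

    private static int wave(String module, Map<String, List<String>> dependsOn,
                            Map<String, Integer> memo) {
        Integer known = memo.get(module);
        if (known != null) {
            return known;
        }
        int level = 0;
        List<String> deps = dependsOn.getOrDefault(module, Collections.emptyList());
        for (String dep : deps) {
            level = Math.max(level, wave(dep, dependsOn, memo) + 1);
        }
        memo.put(module, level);
        return level;
    }

    public static void main(String[] args) {
        // C depends on B depends on A: three waves, nothing runs in parallel.
        Map<String, List<String>> chain = new LinkedHashMap<>();
        chain.put("A", Collections.emptyList());
        chain.put("B", Arrays.asList("A"));
        chain.put("C", Arrays.asList("B"));
        System.out.println(waves(chain));

        // B and C both depend only on A: they share wave 1.
        Map<String, List<String>> fan = new LinkedHashMap<>();
        fan.put("A", Collections.emptyList());
        fan.put("B", Arrays.asList("A"));
        fan.put("C", Arrays.asList("A"));
        System.out.println(waves(fan));
    }
}
```

Modules that end up in the same wave have no dependencies on each other, so they are candidates for building at the same time. The fewer waves a project needs relative to its number of modules, the more a parallel build can help.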

There's also an option (Weave Mode) to run different phases of different modules in parallel (for example compiling B while still running tests of A), but that doesn't yet work reliably on this project. I guess it could allow for an even greater speedup, since even dependency trees like C depends on B depends on A could then be parallelized.

The largest and slowest module of this project takes about 2 minutes to build (it has more than 800 scenario tests). Now, since that module actually implements 2 different taxes, it can be further split up into three parts (one common and 2 for the specific taxes). Doing that, I can now build those three parts in parallel, taking only 1.40 minutes instead of two. That's why multiple smaller modules are better than one larger as far as build speed is concerned (and you want multiple modules to enforce your modularity anyway).

This kind of feature, and the ease with which you can use it out of the box, is what I absolutely love about Maven. OK, the strict Maven model limits you in some ways at times, and the pom.xmls can become very verbose, but you gain so much back.


P.S.

Does anyone know whether any other build systems (Gradle, SBT, ...) have anything like this? (I can't imagine trying to do anything like this by myself in an Ant build.)

Posted by cvf at 9:28 PM in Java

Monday, 3 May 2010

The importance of a fluent interface for building testdata

Today, I was talking to someone about how we made a fluent interface/dsl that allows us to construct data needed for our scenario tests in a readable and maintainable way. This allows us (the developers) to quickly create and understand tests, and even allows us to explain them to a business analyst when discussing a requirement/bug/current behaviour.

He asked me to clarify what I meant by readable, and since talking about code without seeing any is pretty hard, I'm showing an example here.

The core of our datamodel needed for calculating immovable property taxes consists of the following connected entities (a simplification of the real model):
As you can see, that's already quite a bit of data we will need to set up. Since this datamodel is also bitemporal (data has a validity and a record dimension to it) and sourced (meaning several sources/reporters can report the same kind of data), setting things up gets complex and verbose very quickly.

This is what a (partial) setup would look like in our raw entity model API:


Person landlord = Person.create();
landlord.getPersonalEvent(Reporter.RR, PersonalEventType.BIRTH)
		.set(new PersonalEvent(PersonalEventType.BIRTH, new Day(7, 6, 1980), Reporter.RR),
				ValidityRange.fromToday());
landlord.getPersonName(Reporter.RR).set(new PersonName("landlord", Reporter.RR));
landlord.getExternalIdentification(Reporter.AKRED, ExternalIdentificationType.AKRED_NUMBER).set(
		Collections.singleton(new ExternalIdentification(ExternalIdentificationType.AKRED_NUMBER,
				"1234567890X", Reporter.AKRED)));
landlord.getOccupiedSpace(Reporter.VLABEL).set(new OccupiedSpace(someSpace, landlord, Reporter.VLABEL));


CadastralArticle article = CadastralArticle.create(someCadastralDepartment, 1);

Collection<RightState> rightstates = CollectionFactory.newList();
RightState rs = new RightState(landlord, 1, new Percentage(100));
rightstates.add(rs);
article.getTimeSlice(Reporter.AKRED, null).set(new TimeSlice(Reporter.AKRED, rightstates),
		ValidityRange.wholeYear(2009));

ImmovableProperty ip = ImmovableProperty.create(someCadastralDepartment, "some-long-code");

Collection<ImmovablePropertyIncome> incomes = CollectionFactory.newList();
ImmovablePropertyIncomeBuilder builder = BuilderFactory.createBuilder(ImmovablePropertyIncomeBuilder.class);
builder.setCadastralIncome(new MonetaryAmount(100));
builder.setFiscalStatus(FiscalStatusType.NORMAL);
builder.setSequenceNumber(1);
builder.setType(ImmovablePropertyIncomeType.NORMAL_BEBOUWD);
incomes.add(builder.build());
ip.getIncomes(Reporter.AKRED).set(incomes, ValidityRange.wholeYear(2009));

// and it just goes on and on

As you can see, that's quite some code to set up the data (and it's just a trivial scenario). Just imagine trying to maintain hundreds of testcases written this way. Now, this is what it looks like in our simplified API:

// we need the finals since they're being used in the anonymous inner classes we create.
final Person landlord = new TestPersonBuilder() {
	{
		bornOn(new Day(7, 6, 1980));
		name("landlord");
		eid("1234567890X");
		occupies(someSpace);
	}
}.build();

final CadastralArticle article = new TestCaBuilder() {
	{
		articleNumber(1);
		fullyOwnedBy(landlord);
	}
}.build();


ImmovableProperty house = new TestIpBuilder() {
	{
		plotCode("some-long-code");
		normalIncome(100);
		includedIn(article, sequence(1));
		equivalentWith(someSpace);
	}
}.build();

This is already quite a bit more readable, and at the same time offers more functionality. Sane defaults are being used behind the scenes, but you can still override them where necessary. By using these TestDataBuilders (as we called those classes), we've been able to implement hundreds of testcases in a readable and maintainable way, and it was well worth the effort of coming up with them.
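
To show the mechanics behind such a builder, here is a minimal sketch of the idea. The Person class, its two fields and the default values are made up for this illustration and are much simpler than our real model; only the pattern (defaults in the builder, overrides in an instance initializer of an anonymous subclass) is the same.

```java
class TestDataBuilderSketch {

    /** Hypothetical, heavily simplified entity; the real model is far richer. */
    static final class Person {
        final String name;
        final int birthYear;

        Person(String name, int birthYear) {
            this.name = name;
            this.birthYear = birthYear;
        }
    }

    /** Builder with defaults; a test overrides only what matters to its scenario. */
    static class TestPersonBuilder {
        private String name = "anonymous"; // assumed default, for illustration only
        private int birthYear = 1970;      // assumed default, for illustration only

        protected void name(String name) {
            this.name = name;
        }

        protected void bornIn(int year) {
            this.birthYear = year;
        }

        Person build() {
            return new Person(name, birthYear);
        }
    }

    public static void main(String[] args) {
        // The inner braces are an instance initializer of an anonymous subclass,
        // which is why the builder methods can be called without a receiver.
        Person landlord = new TestPersonBuilder() {
            {
                name("landlord");
            }
        }.build();
        System.out.println(landlord.name + " " + landlord.birthYear); // prints "landlord 1970"
    }
}
```

The test only states that the person is called "landlord"; the birth year silently falls back to the default. That is what keeps scenarios short: every line in the initializer block is something the test actually cares about.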

Posted by cvf at 11:01 PM in Java

Tuesday, 20 April 2010

Scala / TestNG gotcha with result type inference

I recently ran into a Scala / TestNG gotcha that proves how much of a Scala noob I still am (but it also shows the kind of problems you can have with the type inference features).

Scala has two nice features that can make your method definitions very succinct, but as is the case with many of such features, they can cause some problems of their own if you're not careful.
The two features are:

  • result type inference
  • optional return statement

The result type inference allows you not to specify the result type of a method, allowing the scala compiler to infer it:


// the result type is explicitly defined
def getFooWithResultTypeSpecified() : String = {
    return "123"
}

//the result type will be inferred
//(Scala only infers the result type when the body has no explicit return statement)
def getFoo() = {
    "123"
}

val foo = getFoo()
// foo will be inferred to be a String, as getFoo was inferred to return a String.
assertEquals(foo,"123")

The optional return statement is exactly that: it allows you to leave out the actual return statement. The compiler will use the value of the last expression as the result, and its type as the result type.

def getFoo() = {
    "123"
}

val foo = getFoo()
assertEquals(foo,"123")

The gotcha I had was in the combination of these features. TestNG considers test methods to be those methods that are both public and void returning. So let's say I start out with the following method:

def testFoo() = {
    val foo = new Foo()
    // if doSomeVoidMethod doesn't fail, it's good
    foo.doSomeVoidMethod()
}

The result type of testFoo is Unit (void in Java terms), so the method will be executed by TestNG. Then you change foo.doSomeVoidMethod to foo.doSomeStringReturningMethod. Now testFoo is inferred to return a String, and the method is no longer executed by TestNG. This was exactly the problem I had, and I spent quite some time figuring out why some of my test methods were no longer being run.
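
To see why the changed method silently drops out, here is a small Java sketch of a "public and void returning" filter based on reflection. This is not TestNG's actual discovery code; the class names VoidMethodFilter and SampleTests are made up, and the two methods simply stand in for what the Scala compiler would generate before and after the change.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

class VoidMethodFilter {

    /** Stand-in for a compiled test class: one void method, one that returns a String. */
    public static class SampleTests {
        public void testReturnsNothing() {
        }

        public String testReturnsString() {
            return "123";
        }
    }

    /** Names of the methods declared on the class that are public and return void. */
    static Set<String> publicVoidMethods(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && m.getReturnType() == void.class) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(publicVoidMethods(SampleTests.class)); // prints [testReturnsNothing]
    }
}
```

Nothing fails and nothing is reported for the String-returning method: it just does not match the filter, which is what makes this kind of problem so easy to miss.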


One protection against this is to always declare explicitly that your test methods return Unit (void), like this:

def testFoo() : Unit = {
    val foo = new Foo()
    // if doSomeStringReturningMethod doesn't fail, it's good
    foo.doSomeStringReturningMethod()
}

Another protection against this is to use procedures instead of functions, like this:

def testFoo() {
    val foo = new Foo()
    // if doSomeStringReturningMethod doesn't fail, it's good
    foo.doSomeStringReturningMethod()
}
Note the missing = that makes all the difference. Thanks to Joey for making that clear.

Posted by cvf at 7:55 AM in Java

Monday, 5 April 2010

There's been an explosion in the ASCII factory.

There's been an explosion in the ascii factory.

I always thought this sentence was reserved for Perl, but I recently discovered that it can unfortunately also be applied to Scala. Scala supports operator overloading (although technically it isn't overloading: operators are just methods with symbolic names), so we see both the good and the bad sides of this.

Representing the good side, you have the parser combinators library that comes with Scala. It implements a DSL that looks very natural if you're already familiar with BNF.

The dubious honour for representing the bad side in this case goes to the Dispatch library. I was looking for a library that would allow me to make some calls to a restful service (couchdb in this case), and came upon it.

It has some of the worst abuses of operators I've seen in a while. Look at the following lines, which you can find in the introduction docs:


import dispatch._
val http = new Http
val req = :/("example.com") / "path"

val rhead = req <:> Map("Cache-control" -> "no-cache")
http(req / "somefile.xml" <> { _ \\ "book" })
val rauth = req as ("user", "secret")
val rquery = req <<? Map("key" -> "value")
val rform = req << Map("key" -> "value")
http(req >~ { _.getLines.foreach(println) })
val rsec = req.secure

Now, I left out the comments above each line that explain what the code does, but that is my whole point. There is no longer any way for someone not already very familiar with Dispatch to have any idea whatsoever about what that code does. Even if I were to write code using this library myself, I'm certain that I'd have trouble figuring out exactly what it does only a week later.

It really feels to me that the author of this library has an innate hatred of the alphabetic part of his keyboard. There's i.m.o. no other justification for letting "<<?" mean "add a query string to the request".
Even a Java-style req.setQuery(Map(...)) would have been much better: more memorable and more readable.
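
For comparison, this is roughly what I mean by a named method. It is only a sketch: the Request class below is made up and is not part of Dispatch or any other real library, and it skips details such as URL encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NamedRequestSketch {

    /** Hypothetical request type; not part of Dispatch or any real library. */
    static final class Request {
        private final String url;
        private final Map<String, String> query = new LinkedHashMap<>();

        Request(String url) {
            this.url = url;
        }

        /** Named counterpart of the "<<?" operator: adds query string parameters. */
        Request setQuery(Map<String, String> params) {
            query.putAll(params);
            return this;
        }

        /** Renders the URL; no URL encoding is done in this sketch. */
        String toUrl() {
            if (query.isEmpty()) {
                return url;
            }
            StringBuilder sb = new StringBuilder(url).append('?');
            boolean first = true;
            for (Map.Entry<String, String> e : query.entrySet()) {
                if (!first) {
                    sb.append('&');
                }
                sb.append(e.getKey()).append('=').append(e.getValue());
                first = false;
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("key", "value");
        Request req = new Request("http://example.com/path").setQuery(params);
        System.out.println(req.toUrl()); // prints http://example.com/path?key=value
    }
}
```

It is more typing than "<<?", but a reader who has never seen the library can still guess what setQuery does.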

If libraries like Dispatch represent how idiomatic Scala will be written, I'll be looking for another language to succeed Java.

A small bonus :)

("@ , There's been
°!&  0 an explosion
#^a /|\ at the ASCII
`;<  |  factory!!!!
@a  / \

Posted by cvf at 10:29 PM in Java

Tuesday, 4 March 2008

Dynamically generating Builder implementations, part 1.

In my current project, we have a lot of immutable values (a side effect of the fact that we have a bitemporal domain [1][2], but that is a story for another time). We decided to actually enforce that immutability, so those values have no methods that mutate their state (a.k.a. setters). All properties must therefore be set at creation time using the constructor.

Now, some of these value classes have 10+ properties, so instantiating everything using the constructor isn't an attractive or readable option. Ideally, you'd use some kind of Builder Pattern to set those properties and create the actual instance once you're done setting them.

The only part that sucks about using Builders is that you have to code, test and maintain them. This is the part where dynamic proxies and code generation can come to the rescue. In the way I implemented this, we have three components in the solution:

  • The immutable value class (OrganizationState in this example).
  • Builder, an interface that defines a Builder, which is a class that knows how to build instances. This one should be extended for each concrete type you want to build, containing setters for each of the properties you want to set.
  • BuilderFactory, a concrete class that can generate an implementation of the right Builder. This is done using JDK proxies.

Using those components looks like this:

The value class and its accompanying Builder interface:

public class OrganizationState implements Serializable, Reported {

	private Day foundedOn;
	private Day endedOn;
	private OrganizationEndReason endReason;
	private OrganizationType type;
	private EstablishmentType establishmentType;
	private KboStatus kboStatus;
	private Reporter authenticReporter;

	public OrganizationState(Day foundedOn, Day endedOn,
			OrganizationEndReason endReason, OrganizationType type,
			EstablishmentType establishmentType, KboStatus kboStatus,
			Reporter authenticReporter) {
		this.foundedOn = foundedOn;
...
public interface OrganizationStateBuilder extends
 Builder<OrganizationState> {
	void setFoundedOn(Day foundedOn);
	void setEndedOn(Day endedOn);
	void setEndReason(OrganizationEndReason endReason);
	void setType(OrganizationType type);
	void setEstablishmentType(EstablishmentType establishmentType);
	void setKboStatus(KboStatus kboStatus);
	void setAuthenticReporter(Reporter authenticReporter);
}

As you can see, we have 7 properties that need to be set. Using a constructor would lead to unwieldy, unreadable code, especially when you have lots of properties of the same type. The Builder interface itself looks like this:

public interface Builder<V> {

	/**
	 * Initialize the V instance being built by the builder
	 * with the given prototype instance.
	 * @param prototype the instance that provides the initial property values
	 */
	void fromPrototype(V prototype);

	/**
	 * Build the product (a V instance) and return it.
	 */
	V build();
}

Now the client code for using a Builder looks like this:

 
//create an implementation of the OrganizationStateBuilder, using JDK proxies.
OrganizationStateBuilder builder = BuilderFactory.createBuilder(OrganizationStateBuilder.class);

//set the properties on the builder; 'from' is a transfer object (TO) that comes from the frontend.
builder.setAuthenticReporter(from.getOrganizationInfo().getAuthenticReporter());
builder.setFoundedOn(from.getOrganizationInfo().getStartDate());
builder.setEndedOn(from.getOrganizationInfo().getEndDate());
//set some more properties
...

//build the actual instance.
OrganizationState state = builder.build();

It is also possible to create an instance that takes its initial values from another instance. This is especially useful if you only need to change a single property:

OrganizationState prototype = ....;
...
OrganizationStateBuilder builder = BuilderFactory.createBuilder(OrganizationStateBuilder.class);
builder.fromPrototype(prototype);

builder.setFoundedOn(Day.today());
//build the actual instance.
OrganizationState state = builder.build();

As you can see, this code looks much easier on the eyes than the equivalent constructor-based version would. In the next part, we'll look at how the BuilderFactory is implemented, and I'll talk about how this code is generated and tested.
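
Until then, here is a rough sketch of the general JDK proxy idea. It is not our actual BuilderFactory: the Point and PointBuilder types are made up, fromPrototype is left out to keep it short, and instead of finding the right constructor itself, this version simply takes a function that turns the recorded property values into the product.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ProxyBuilderSketch {

    /** Same shape as the Builder interface above, minus fromPrototype. */
    interface Builder<V> {
        V build();
    }

    /** Hypothetical immutable value with two properties. */
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical builder interface; no hand-written implementation exists. */
    interface PointBuilder extends Builder<Point> {
        void setX(int x);

        void setY(int y);
    }

    /** Creates a proxy that records every setXxx call and calls the factory on build(). */
    static <V, B extends Builder<V>> B createBuilder(Class<B> type,
            Function<Map<String, Object>, V> factory) {
        Map<String, Object> values = new HashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.length() > 3 && name.startsWith("set") && args != null && args.length == 1) {
                // setFoundedOn -> "foundedOn"
                String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                values.put(property, args[0]);
                return null;
            }
            if (name.equals("build") && args == null) {
                return factory.apply(values);
            }
            // toString, equals, hashCode etc. are not handled in this sketch
            throw new UnsupportedOperationException(name);
        };
        Object proxy = Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
        return type.cast(proxy);
    }

    public static void main(String[] args) {
        PointBuilder builder = createBuilder(PointBuilder.class,
                v -> new Point((Integer) v.get("x"), (Integer) v.get("y")));
        builder.setX(3);
        builder.setY(4);
        Point p = builder.build();
        System.out.println(p.x + "," + p.y); // prints 3,4
    }
}
```

The point of the sketch is that PointBuilder never gets a hand-written implementation: the proxy derives the property name from the setter name and keeps the values until build() is called.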




  [1] My colleague Erwin Vervaet gave an excellent presentation at The Spring Experience about this subject.

  [2] Martin Fowler has also written on the subject at http://www.martinfowler.com/eaaDev/timeNarrative.html.

Posted by cvf at 11:14 PM in Java

Sunday, 3 February 2008

"Overriding" a third-party method in javascript

One of my colleagues recently had an issue with a javascript validation method that was being generated (and called) by trinidad. He wanted to change the behavior of the method (basically adding a guard clause in the beginning), and did this by copy pasting the existing method, modifying and defining it again. He knew this was asking for trouble anytime the trinidad implementation changes, so I helped him work out a better solution.

Since functions are full-fledged objects in javascript, you can hold references to them (something that escapes most programmers who only have a basic knowledge of javascript). In our case, that provides everything we needed to solve our problem. So I suggested something like this instead:

//keep a reference to the original function
//notice we don't use (), otherwise we'd execute the function.
var origThirdPartyFunction = nameOfTrinidadFunction;
//make the name of the original function point to our version
nameOfTrinidadFunction = patched;
	
//our version of the function
function patched() {
	//check some stuff before
	if(someCondition) {
		return;
	}
	//call original trinidad method, passing along the original arguments and return value
	return origThirdPartyFunction.apply(this, arguments);
}

The trick is in the first line: you can refer to a function by assigning it to another var if you don't actually call it by using round brackets. In the next line, we're doing the same, but now with the function we defined ourselves. In that function, we call the original function with the name we gave it.

So now, anytime trinidad calls the function under its original name, our version will get executed instead. This way, we could add the guard clause without potential breakage should the trinidad implementation change, since we no longer copy its implementation.

Now, there are also other ways to implement this, like using AOP with javascript :)

Posted by cvf at 5:15 PM in Java
