As Docker gathers more interest from the big IT companies, it looks like a perfect way to deliver applications to the production environment. If until now the Maven artifact was the jar or war, it now seems natural for the artifact to be the Docker image. In fact this strategy is not new: Netflix did, and still does, deliver its applications as AMI images tagged with the version and other build information.

With Docker this approach becomes easier, so that everyone can adopt it without the big Netflix infrastructure. The question now is: how do we make our build generate the Docker image for us? If we are using Maven, the Spotify docker-maven-plugin comes in handy here. It is able to build an image, tag it, and push it to a public or private Docker registry.

It provides three goals:

  • build
  • tag
  • push

It is also possible to push from the tag or build goal, and to tag from the build goal. Under the hood it uses the Spotify docker-client library, backed by the Jersey client.

Let's take a look at how to apply this plugin to the Maven lifecycle. We need three phases:

  • package
  • install
  • deploy

First of all we have to add the plugin block under our build/plugins block:

<plugin>
  <groupId>com.spotify</groupId>
  <artifactId>docker-maven-plugin</artifactId>
  <version>0.1.2</version>
  <executions>
    <!-- ... -->
  </executions>
</plugin>  

Similarly to what happens with the jar, we would like our artifact to be built as part of the package phase, so let's bind an execution to it. Assuming that the build generates a packaged application myapp.tar.gz and that our Dockerfile lives in src/main/docker:

<execution>
    <id>build-docker-image</id>
    <phase>package</phase>
    <goals>
        <goal>build</goal>
    </goals>

    <configuration>
        <imageName>myuser/myapp</imageName>
        <dockerDirectory>${project.basedir}/src/main/docker</dockerDirectory>
        <!-- The `tagInfoFile` is required apparently because of a bug in the `tag` mojo. -->
        <tagInfoFile>${project.build.directory}/docker_image_info.json</tagInfoFile>
        <resources>
            <resource>
                <targetPath>/</targetPath>
                <directory>${project.build.directory}</directory>
                <include>myapp.tar.gz</include>
            </resource>
        </resources>
    </configuration>

</execution>

The resources block can be used to add all the resources we need in our Docker image. There is also an alternative way to build the image, using XML to define the Docker commands, but I prefer using an external Dockerfile, as it looks more familiar to people already accustomed to Docker.
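For reference, the inline variant replaces the dockerDirectory with elements describing the image directly in the POM. The element names below follow the plugin's README of that era and should be treated as an assumption rather than gospel:

```xml
<configuration>
    <imageName>myuser/myapp</imageName>
    <baseImage>java</baseImage>
    <entryPoint>["java", "-jar", "/myapp.jar"]</entryPoint>
</configuration>
```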

At the end of the package phase we will have the myuser/myapp:latest image built in Docker. However, we want our image tagged with the project version, don't we? We could achieve that in the same build goal, but in my opinion the install phase is a better fit.

Before the install phase, we can actually use the generated image for our functional or integration tests, perhaps using the Spotify docker-client library. What does the install phase look like? Here it is:

<execution>
    <id>tag-docker-image</id>
    <phase>install</phase>
    <goals>
        <goal>tag</goal>
    </goals>

    <configuration>
        <tagInfoFile>${project.build.directory}/docker_image_info.json</tagInfoFile>
        <image>myuser/myapp</image>
        <newName>myuser/myapp:${project.version}</newName>
    </configuration>
</execution>

The image parameter should match the build goal's imageName parameter; if the tag is omitted, latest is assumed.

At the end of the install phase, we will have our image built and tagged on the Docker host. Now we can decide whether to push it to a remote Docker registry. To push to a remote registry we actually need to rename the image, because we need to add the registry namespace in front of it. So we are going to reuse the tag goal again:

<execution>
    <id>push-docker-image</id>
    <phase>deploy</phase>
    <goals>
        <goal>tag</goal>
    </goals>

    <configuration>
        <tagInfoFile>${project.build.directory}/docker_image_info.json</tagInfoFile>
        <image>myuser/myapp:${project.version}</image>
        <newName>quay.io/myuser/myapp:${project.version}</newName>
        <pushImage>true</pushImage> <!-- We are also pushing this time -->
    </configuration>
</execution>

Since version 0.1.2, released today, it is possible to specify the remote registry credentials. At the moment this can only be done through Maven properties, but they are flexible enough to let you use system environment variables on your CI server, for example.

I must say Spotify has done a great job delivering this plugin, in perfect accordance with the open source spirit.

This is a solution I wrote because a colleague was struggling to write code that collects one element from each list in order, accepting lists of different sizes.

The problem was collecting 50 images from a third-party supplier. Each record has a set of albums, each an ordered list of images, and a set of featured images picked from the albums. We want to collect 50 images, starting from the featured ones and then going through the albums in a round-robin fashion (the first image of each album, then the second, and so on). The albums may have different sizes.

So if we have a feed containing these image ids:

featuredImages: [1, 5, 4, 9]
albumOne: [1, 2, 3, 4]
albumTwo: [5, 6, 7, 8, 9, 10]
albumThree: [11]
albumFour: [12, 13, 14]

And if we want to collect 10 images, we get:

images: [1, 5, 4, 9, 5, 11, 12, 2, 6, 13]

The simplest solution I could think of was to use Guava's Iterables to limit the images to 10.

Iterable<Image> images = limit(concat(featuredImages, albumImages), 10)
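Without Guava, the same concat-and-limit idea can be sketched with plain java.util.stream (a hypothetical helper, not the post's code; the numbers are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ConcatLimit {

    // Concatenate the featured images with the album images and keep only
    // the first n, mirroring limit(concat(featuredImages, albumImages), n).
    static List<Integer> collect(List<Integer> featured, List<Integer> albumImages, int n) {
        return Stream.concat(featured.stream(), albumImages.stream())
                     .limit(n)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collect(List.of(1, 2), List.of(3, 4, 5, 6), 5)); // prints [1, 2, 3, 4, 5]
    }
}
```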

But how can I obtain the albumImages in the right order (round-robin)? I need a custom Iterable, so I have written this one:

public class RoundRobinIterable<T> implements Iterable<T> {
 
    private final Iterable<Iterable<T>> iterables;
 
    public RoundRobinIterable(Iterable<Iterable<T>> iterables) {
 
        checkNotNull(iterables);
 
        this.iterables = iterables;
    }
 
    @Override
    public Iterator<T> iterator() {
 
        final Iterator<Iterator<T>> it = cycle(filter(
                FluentIterable.from(iterables)
                .transform(RoundRobinIterable.<T>iteratorFn())
                .toList(),
                hasNextPr()
        ));
 
        return new UnmodifiableIterator<T>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }
 
            @Override
            public T next() {
                return it.next().next();
            }
        };
 
    }
 
    private enum HasNext implements Predicate<Iterator<?>> {
        INSTANCE;
 
        @Override
        public boolean apply(Iterator<?> input) {
            return input.hasNext();
        }
    }
 
    private static Predicate<Iterator<?>> hasNextPr() {
        return HasNext.INSTANCE;
    }
 
    private static <T> Function<Iterable<T>, Iterator<T>> iteratorFn() {
        return new Function<Iterable<T>, Iterator<T>>() {
            @Override
            public Iterator<T> apply(Iterable<T> input) {
                return input.iterator();
            }
        };
    }

    @SafeVarargs
    public static <T> RoundRobinIterable<T> of(Iterable<T>... iterables) {
        // Convenience varargs factory, used by the test below.
        return new RoundRobinIterable<T>(Arrays.<Iterable<T>>asList(iterables));
    }
}

It leverages the lazy evaluation of Iterators.cycle and Iterators.filter, so the resulting iterator will cycle over the given iterators, skipping the exhausted ones on the fly.
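For comparison, the same round-robin interleaving can be written eagerly without Guava, using a queue of iterators (a hypothetical sketch, not the gist's code):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class RoundRobin {

    // Take one element from each source in turn, dropping a source
    // as soon as its iterator is exhausted.
    static <T> List<T> interleave(List<? extends Iterable<T>> sources) {
        Deque<Iterator<T>> queue = new ArrayDeque<>();
        for (Iterable<T> source : sources) {
            Iterator<T> it = source.iterator();
            if (it.hasNext()) {
                queue.add(it);
            }
        }
        List<T> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            Iterator<T> it = queue.poll();
            out.add(it.next());
            if (it.hasNext()) {
                queue.add(it); // still has elements: back to the end of the queue
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> lists = List.of(List.of(1, 2, 3, 9), List.of(4, 5), List.of(6, 7, 8));
        System.out.println(interleave(lists)); // prints [1, 4, 6, 2, 5, 7, 3, 8, 9]
    }
}
```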

And that's it. Well actually no, we want a test, don't we?

public class RoundRobinIterableTest {
 
    @Test
    public void iteratorShouldWorkAsExpected() {
 
        List<Integer> numbers = newArrayList(RoundRobinIterable.of(asList(1,2,3,9), asList(4,5), asList(6,7,8)));
        List<Integer> expected = newArrayList(1, 4, 6, 2, 5, 7, 3, 8, 9);
 
        assertThat(numbers, equalTo(expected));
    }
}

The full code is available at the RoundRobinIterable.java gist

Happy coding!

I started to develop a web crawler as part of a bigger project, so I had to choose which HTML parser library to use. I had used NekoHTML in the past and it was pretty good, but it doesn't have any helpers to select DOM elements: you have to use XPath, which is very flexible but not so easy.

I have found JSoup to be a very cool library: its code is well written and clean, and the interface is powerful. I love it. However, I was writing a Scala crawler, and although the JSoup interface is pretty cool, it is very Java-ish. I preferred a better integration with Scala, so I wrote my first Pimp My Library pattern.

Let the code talk:

object Jsoup extends Jsoup

trait Jsoup {

  protected[Jsoup] def withDocument[T](url: String)(d: Document => T) = {
    try {
      Right(d(JJSoup.connect(url).get()))
    }
    catch {
      case e: IOException => Left(e)
    }
  }

  implicit def enrichNode(n: Node) = new RichNode(n)

  implicit def enrichElement(x: Element) = new RichElement(x)

  implicit def enrichDocument(x: Document) = new RichDocument(x)

  implicit def enrichElements(xs: Elements) = new RichElements(xs)

}

class RichNode(value: Node) {

  def nextSibling: Option[Node] = value.nextSibling match {
    case null => None
    case x => Some(x)
  }

}


class RichElement(value: Element) extends RichNode(value) {

  def getElementById(id: String): Option[Element] = value.getElementById(id) match {
    case null => None
    case x => Some(x)
  }

  def getElementsByTag(tag: String): Iterable[Element] = 
    Jsoup.enrichElements(value.getElementsByTag(tag))

  def getElementsByClass(tag: String): Iterable[Element] = 
    Jsoup.enrichElements(value.getElementsByClass(tag))

  def getElementsByAttribute(tag: String): Iterable[Element] = 
    Jsoup.enrichElements(value.getElementsByAttribute(tag))

  def getElementsByAttributeStarting(tag: String): Iterable[Element] = 
    Jsoup.enrichElements(value.getElementsByAttributeStarting(tag))

  def getElementsByAttributeValue(k: String, v: String): Iterable[Element] = 
    Jsoup.enrichElements(value.getElementsByAttributeValue(k, v))

  def apply(name: String): Option[String] =
    // jsoup's attr returns "" for a missing attribute, so map empty to None
    Option(value.attr(name)).filter(_.nonEmpty)

}

class RichDocument(value: Document) extends RichElement(value) {

  def head = value.head match {
    case null => None
    case x => Some(x)
  }

  def body = value.body match {
    case null => None
    case x => Some(x)
  }

  def select(query: String) = new RichElements(value.select(query))

}

class RichElements(target: Elements) extends Iterable[Element] {

  def iterator: Iterator[Element] = {
    target.asScala.iterator
  }

}

The code has been uploaded to the GitHub SSoup repository.

Happy coding!

The Google Web Toolkit (GWT) allows you to write plain Java code and then translate it to client-side JavaScript, which means you can reuse your domain objects on the client side. This promise is amazing, but it isn't the whole truth: using GWT without being aware of the transformation can produce poor performance and, worse, an unmaintainable project with disastrous results for your business.

I would like to spend some time to write a post about the best practices I have learnt using GWT and which common pitfalls to avoid.

When I started to use GWT, the first problems I encountered were in the development of the RPC services:

  1. They need two interfaces.
  2. They should be implemented by a Java Servlet.
  3. The serialization of objects is difficult to manage.
  4. Their interfaces should not follow common Java best practices.

The 1st point is solvable using the maven-gwt-plugin: this will generate the Async interface as well as the Servlet mapping in the web.xml descriptor (actually, it doesn't work well with generics).
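For context, the two interfaces of point 1 look like this pair. The sketch below simplifies GWT's AsyncCallback to a local interface so it is self-contained (the real one lives in com.google.gwt.user.client.rpc), and the callAsync adapter is hypothetical, added only to show the calling convention:

```java
import java.util.ArrayList;

public class AsyncPairSketch {

    // Simplified stand-in for com.google.gwt.user.client.rpc.AsyncCallback.
    public interface AsyncCallback<T> {
        void onSuccess(T result);
        void onFailure(Throwable caught);
    }

    // Synchronous interface, implemented on the server side.
    public interface ContactService {
        ArrayList<String> findByName(String name);
    }

    // The Async twin the GWT compiler expects on the client side: same
    // method names, void return type, trailing AsyncCallback parameter.
    public interface ContactServiceAsync {
        void findByName(String name, AsyncCallback<ArrayList<String>> callback);
    }

    // Hypothetical adapter showing how the async convention maps onto
    // the synchronous interface.
    public static void callAsync(ContactService service, String name,
                                 AsyncCallback<ArrayList<String>> callback) {
        try {
            callback.onSuccess(service.findByName(name));
        } catch (RuntimeException e) {
            callback.onFailure(e);
        }
    }

    public static void main(String[] args) {
        callAsync(name -> new ArrayList<>(java.util.List.of(name)), "bob",
                new AsyncCallback<ArrayList<String>>() {
                    public void onSuccess(ArrayList<String> result) { System.out.println(result); }
                    public void onFailure(Throwable caught) { caught.printStackTrace(); }
                });
    }
}
```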

The drawback of the 2nd point is that if you are using Spring, the Servlets are instantiated outside the Spring context, so you cannot apply any aspects to them (e.g. they can't be transactional). If you want the GWT service to be managed by the Spring context, you need to create two implementations of the same interface:

  • A Spring bean
  • A Servlet that delegates all method implementations to the Spring bean.

This is a possible implementation of that:

public abstract class GWTService {

    protected HttpServletRequest currentRequest() {
        return GWTThreadLocals.instance().getRequest();
    }

    protected HttpServletResponse currentResponse() {
        return GWTThreadLocals.instance().getResponse();
    }
}

public final class GWTThreadLocals {

    private static final GWTThreadLocals INSTANCE = new GWTThreadLocals();

    protected ThreadLocal<HttpServletRequest> perThreadRequest;
    protected ThreadLocal<HttpServletResponse> perThreadResponse;

    private GWTThreadLocals() {
        perThreadRequest = new ThreadLocal<HttpServletRequest>();
        perThreadResponse = new ThreadLocal<HttpServletResponse>();
    }

    public static GWTThreadLocals instance() {
        return INSTANCE;
    }

    public void setRequest(HttpServletRequest request) {
        perThreadRequest.set(request);
    }

    public HttpServletRequest getRequest() {
        return perThreadRequest.get();
    }

    public void setResponse(HttpServletResponse response) {
        perThreadResponse.set(response);
    }

    public HttpServletResponse getResponse() {
        return perThreadResponse.get();
    }
}

public abstract class GWTRemoteServiceServlet extends RemoteServiceServlet {

    @Override
    protected void onBeforeRequestDeserialized(String serializedRequest) {
        super.onBeforeRequestDeserialized(serializedRequest);
        setGwtThreadLocals();
    }

    protected void setGwtThreadLocals() {
        GWTThreadLocals.instance().setRequest(getThreadLocalRequest());
        GWTThreadLocals.instance().setResponse(getThreadLocalResponse());
    }
}

So if we need to implement a GWT service interface, we extend GWTService and implement our interface in a Spring bean. Then we have to write a Servlet that extends GWTRemoteServiceServlet and implements the same interface, delegating all method implementations to the Spring bean.

This is an example for a ContactService interface:

@RemoteServiceRelativePath("ContactService")
public interface ContactService extends RemoteService {

  ArrayList<Contact> findByName(String name);

}

@Service("contactService")
public class ContactServiceImpl extends GWTService implements ContactService {


    @Override
    @Transactional(readOnly = true)
    public ArrayList<Contact> findByName(String name) {

       // Implementation calling the DB or whatever do you want
       return new ArrayList<Contact>();
    }
}

public class ContactServiceServlet extends GWTRemoteServiceServlet implements ContactService {

    private ContactService delegate;

    @Override
    public void init(ServletConfig config) throws ServletException {
        super.init(config);

        delegate = getRequiredWebApplicationContext(config.getServletContext()).getBean(ContactService.class);
    }


    @Override
    public ArrayList<Contact> findByName(String name) {
       // Call the delegate implementation
       return delegate.findByName(name);
    }
}

Writing this code for each service is a very boring job, of course, but your IDE can do it for you.

As you can notice, the service exposes the concrete type (ArrayList) instead of the interface (List). This is about point 4: exposing an interface makes GWT generate one snippet of code for each possible implementation of it, which produces an oversized codebase and a longer compilation time. Indeed, this goes against common Java best practice, but you have to keep in mind that GWT is not Java.

About the serialization of objects:

  • They should implement Serializable or IsSerializable
  • They should have an empty default constructor
  • All transient or final (crazy, I know) fields will be ignored.
  • All dependent objects should follow the same rules.

Of course GWT needs the source code of the type of each non-final, non-transient dependent field. The trick is to put the sources as a dependency of the Maven module that contains the GWT plugin. However, that is not enough, because not all standard Java classes are supported (no Calendar, for example), so even if you include a library's source code, it is unlikely to be supported. Some libraries offer a separate module for GWT support, as Google Guava does, but they are not common. Alternatively you can write the serialization for a particular class yourself, but that is not trivial.
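As a minimal sketch, a bean satisfying those rules could look like this (the class and field names are invented for illustration):

```java
import java.io.Serializable;

public class ContactDto implements Serializable {

    private String name;                // plain, mutable, non-final field: serialized
    private transient String cachedKey; // transient: ignored by GWT serialization

    // The empty default constructor GWT requires.
    public ContactDto() {
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```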

As we have seen, there is no simple solution for point 3: having a serializable object is complex and unlikely to integrate well into our domain. If, for example, we are using Hibernate as ORM, serializing a Hibernate entity is the worst thing you can do: in the worst case it will serialize your whole DB. In many cases you have to deep-copy the object anyway, because GWT is not able to serialize the Hibernate custom collections, and you usually don't need them on the client side.

In the end, using GWT RPC requires writing a GWT service layer with its own domain objects. They will likely contain already preprocessed data, like formatted dates (no Joda-Time on GWT, nor Calendar). At the end of the day, the promise of using your domain objects on the client side is a lie.
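A sketch of such a service-layer object, with the date preformatted on the server so the client needs no Calendar or Joda-Time (all names here are hypothetical):

```java
import java.io.Serializable;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ContactView implements Serializable {

    public String name;
    public String birthDate; // preformatted on the server: the client just displays it

    // Empty default constructor required by GWT serialization.
    public ContactView() {
    }

    // Server-side mapping from the (hypothetical) entity fields to the view object.
    public static ContactView of(String name, Date birthDate) {
        ContactView view = new ContactView();
        view.name = name;
        view.birthDate = new SimpleDateFormat("yyyy-MM-dd").format(birthDate);
        return view;
    }
}
```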

In the next post I will describe a way to use a JSON REST API with GWT.

The complete code is available at GWT RPC Spring Scaffolding gist.
