Distributed Builds: Part 4 (of 4+)

Quick Recap

This is the fourth in my series on adding distributed builds to CruiseControl.NET. Previously I’ve covered some of the issues, a basic design and some scenarios that need to be covered.

In this post I’ll start to look at how we can possibly configure a distributed build scenario.

Starting Off Easy

Of the six scenarios in my last post (here) the simplest one to implement is the single remote build scenario (well, the local build scenario is easier, but then nothing is distributed!) So I’ll look at how I can configure a project to be distributed.

NOTE: this is a work in progress and reflects only my current ideas of how to implement it. Hopefully with feedback we will be able to improve it over time.

Machine Definitions

The configuration needs two sections. First we need to define some remote machines. These definitions provide the connection and metadata information about the remote machine. For example, we can define the following machines:

<remoteMachine name="buildAgent1">
  <address>http://winMachine</address>
  <configuration>
    <namedValue name="OS" value="Windows7" />
    <namedValue name="Browser" value="IE8" />
    <namedValue name="Browser" value="Firefox" />
  </configuration>
</remoteMachine>
<remoteMachine name="buildAgent2">
  <address>http://linuxMachine</address>
  <configuration>
    <namedValue name="OS" value="Ubuntu" />
    <namedValue name="Browser" value="Firefox" />
  </configuration>
</remoteMachine>

buildAgent1 has a target address of http://winMachine (at some future point we need to do some more work around the addresses, but it will do for now.) When a project builds it will send the request to this address. It also has some configuration values – it defines an OS and two browsers. buildAgent2 is very similar, but it has a different address and configuration values.

The address element will be mandatory (after all it is very hard to send a request to the machine if its address is unknown!) while the configuration element will be optional. The configuration element is basically to provide hints to the project on which machines it will choose.

These elements will be top-level elements in the configuration (i.e. direct children of <cruisecontrol>.)
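As a rough illustration of the rules above (mandatory address, optional configuration, repeatable named values), here is a minimal sketch in Python rather than the real CruiseControl.NET configuration loader. The function name `load_machines` and the returned dictionary shape are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Sample based on the definitions above, wrapped in a <cruisecontrol> root.
# buildAgent2 deliberately omits <configuration> to show that it is optional.
CONFIG = """
<cruisecontrol>
  <remoteMachine name="buildAgent1">
    <address>http://winMachine</address>
    <configuration>
      <namedValue name="OS" value="Windows7" />
      <namedValue name="Browser" value="IE8" />
      <namedValue name="Browser" value="Firefox" />
    </configuration>
  </remoteMachine>
  <remoteMachine name="buildAgent2">
    <address>http://linuxMachine</address>
  </remoteMachine>
</cruisecontrol>
"""


def load_machines(xml_text):
    """Return {machine name: {"address": str, "configuration": {name: [values]}}}."""
    machines = {}
    for node in ET.fromstring(xml_text).findall("remoteMachine"):
        name = node.get("name")
        address = (node.findtext("address") or "").strip()
        if not address:
            # The address is mandatory: without it no request can be sent.
            raise ValueError(f"remoteMachine '{name}' has no address")
        values = {}
        # A name may repeat (e.g. two browsers), so keep a list per name.
        for item in node.findall("configuration/namedValue"):
            values.setdefault(item.get("name"), []).append(item.get("value"))
        machines[name] = {"address": address, "configuration": values}
    return machines


machines = load_machines(CONFIG)
print(machines["buildAgent1"]["configuration"]["Browser"])  # ['IE8', 'Firefox']
```

Keeping a list of values per name is one way to let a machine advertise several browsers without inventing a new element.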

Project Changes

Once the machines have been configured, we then need to hook up the project to the machine. This can be configured like so:

<project name="distributedTest">
  <buildTarget xsi:type="singleTarget" name="buildAgent1" />
  <!-- Other configuration goes here -->
</project>

This defines the build target as a single remote machine with the name of “buildAgent1”. When the project builds it will look in the list of machines and find the machine with the name of “buildAgent1”. This machine will then receive the request to build.

This can also be expanded to allow configuring the local build:

<project name="distributedTest">
  <buildTarget xsi:type="singleTarget" name="buildAgent1" runLocal="OnFailure" />
  <!-- Other configuration goes here -->
</project>

At this point I am thinking of allowing three runLocal options:

  • Never – never run a local build
  • OnFailure – run the build locally if the remote build fails (e.g. the machine cannot be contacted or rejects the request)
  • Always – always run a local build

The default would be OnFailure.
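The three options boil down to a small decision made once the remote attempt is known to have succeeded or failed. The sketch below is in Python and only illustrates that decision; the names `RunLocal`, `parse_run_local` and `should_run_locally` are hypothetical and not part of CruiseControl.NET.

```python
from enum import Enum


class RunLocal(Enum):
    NEVER = "Never"
    ON_FAILURE = "OnFailure"
    ALWAYS = "Always"


def parse_run_local(value=None):
    """Map the attribute text to an option; OnFailure is the proposed default."""
    return RunLocal(value) if value is not None else RunLocal.ON_FAILURE


def should_run_locally(run_local, remote_succeeded):
    """Decide whether a local build is needed after the remote attempt."""
    if run_local is RunLocal.ALWAYS:
        return True
    if run_local is RunLocal.ON_FAILURE:
        # "Failure" covers an unreachable machine or a rejected request.
        return not remote_succeeded
    return False  # RunLocal.NEVER


print(should_run_locally(parse_run_local(), remote_succeeded=False))  # True
```

An unknown attribute value raises `ValueError` from the enum lookup, which seems a reasonable way to surface a configuration typo.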

Handling Multiple Machines

Once we have the simple scenario, we can look at expanding this to handle multiple build targets. This would be done by adding a new build target type:

<project name="distributedTest">
  <buildTarget xsi:type="multipleTarget" runLocal="OnFailure" number="All">
    <targets>
      <singleTarget name="buildAgent1" />
      <singleTarget name="buildAgent2" />
    </targets>
  </buildTarget>
  <!-- Other configuration goes here -->
</project>

This expands the original singleTarget by allowing multiple targets to be specified. The targets are specified with the same singleTarget definition (allowing re-use.) The runLocal attribute will work in the same way as runLocal on a singleTarget (note, runLocal will be ignored on the child targets.)

The number attribute specifies how many target machines to run on. This can be either a number or the keyword “All”. When “All” is specified the build will be run on all the specified targets.
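One possible reading of the number attribute, sketched in Python: walk the targets in configuration order, keep the machines that accept the build, and stop once enough have been found. The `is_available` callback is a hypothetical stand-in for asking a machine whether it will take the build, and picking the first available machines is an assumption, not a decided design.

```python
def select_targets(targets, number, is_available):
    """Pick the machines to build on for a multipleTarget.

    targets      -- machine names in configuration order
    number       -- the number attribute: a positive integer or "All"
    is_available -- hypothetical callback: does this machine accept the build?
    Returns (chosen machines, whether enough machines were found).
    """
    required = len(targets) if number == "All" else int(number)
    if not 1 <= required <= len(targets):
        raise ValueError(f"number must be between 1 and {len(targets)}")
    chosen = [name for name in targets if is_available(name)][:required]
    return chosen, len(chosen) == required


agents = ["buildAgent1", "buildAgent2"]
print(select_targets(agents, "All", lambda name: True))
# (['buildAgent1', 'buildAgent2'], True)
print(select_targets(agents, 1, lambda name: name == "buildAgent2"))
# (['buildAgent2'], True)
```

When the second value is False, the caller would fall back to whatever runLocal says.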

Using Configured Machines

The above configuration handles most of the base scenarios; the only one not yet covered is selecting machines by configuration. This will be a new target type:

<project name="distributedTest">
  <buildTarget xsi:type="configurationTarget" runLocal="OnFailure">
    <configuration>
      <namedValue name="Browser" value="Firefox" />
    </configuration>
  </buildTarget>
  <!-- Other configuration goes here -->
</project>

When the project runs it will try to match a machine with the same configuration (note there are no limits on what configuration values can be specified, so it is up to the administrator to decide what values are available.) This can then be combined with the multipleTarget to handle different configuration combinations:

<project name="distributedTest">
  <buildTarget xsi:type="multipleTarget" runLocal="OnFailure" number="1">
    <targets>
      <configurationTarget>
        <configuration>
          <namedValue name="Browser" value="Firefox" />
        </configuration>
      </configurationTarget>
      <configurationTarget>
        <configuration>
          <namedValue name="OS" value="Windows*" />
        </configuration>
      </configurationTarget>
    </targets>
  </buildTarget>
  <!-- Other configuration goes here -->
</project>
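Matching a configurationTarget against the machine definitions could work roughly like the Python sketch below. It assumes that a trailing `*` (as in "Windows*") is a shell-style wildcard, which the configuration above suggests but does not spell out, and it uses the two example machines from earlier in this post. The helper names are made up for the example.

```python
from fnmatch import fnmatchcase

# The two example machines, as {value name: [values]}.
MACHINES = {
    "buildAgent1": {"OS": ["Windows7"], "Browser": ["IE8", "Firefox"]},
    "buildAgent2": {"OS": ["Ubuntu"], "Browser": ["Firefox"]},
}


def matches(machine_values, required):
    """True if every required name has at least one matching machine value."""
    return all(
        any(fnmatchcase(value, pattern) for value in machine_values.get(name, []))
        for name, pattern in required.items()
    )


def find_machines(machines, required):
    """Return the names of all machines that satisfy the required values."""
    return [name for name, values in machines.items() if matches(values, required)]


print(find_machines(MACHINES, {"Browser": "Firefox"}))  # ['buildAgent1', 'buildAgent2']
print(find_machines(MACHINES, {"OS": "Windows*"}))      # ['buildAgent1']
```

A machine with no value for a required name never matches, which keeps an under-described machine from being picked by accident.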

Configuration Wrap-up

These are my basic plans for the configuration for distributed builds. These configuration elements cover all my basic scenarios, so in theory we can build a project in a number of different ways. Also this allows for people to add their own targets (round-robin, multiple-attempt, etc.)

However, there is still a side of the configuration that has not been covered – how to configure the remote machine? In my next post I will look into how I am thinking of doing that.

Stay tuned…

Distributed Builds: Part 3 (of 3+)

16 April, 2010

A Quick Recap

Previously (here and here) I started outlining my plan for adding distributed builds and some of the issues. I was planning on writing about how I think it can be configured, but I wanted to return to the design for a little bit.

Some Scenarios

The end goal is to try and make a flexible system that other people can expand if desired (similar to the underlying design for most of CruiseControl.NET.) To help with this, I’ve put together a few scenarios of how I see people using this functionality.

Scenario #1: Local Build

This is pretty much how CruiseControl.NET currently works. A build is triggered on the server and runs locally. I’ve included this scenario since we need to ensure that it still works!!

Scenario #2: Single Remote Build

In this scenario a project is mapped to one or more remote machines. The first machine that is available builds the project, otherwise the project will be built locally. This looks something like:

image

The remote server can be a list of servers, but it is only ever triggered on one remote or local server.

Scenario #3: Multiple Remote Builds (Pick x of y)

This scenario involves multiple remote servers all building at the same time. The project is mapped to y servers, and it expects at least x to build. This looks something like:

image

The project does not care which servers build the project, just that at least x servers build it.

Scenario #4: Multiple Remote Builds (All)

This is really an extension of the above scenario, but stating that all the remote servers must build the project. But it is slightly different from the above as the project needs to check that all the servers have built it.

Scenario #5: Multiple Remote plus Local

Again, another extension of scenario #3, the difference being the local server is included in the list of servers doing the build. This looks like:

image

There are two variations on this scenario, but they are reasonably minor. The first variation is the local server is included as an optional server (e.g. it is included as one of the x servers.) The second variation is the local server must always build (e.g. it is not counted in the x servers.)

Scenario #6: Multiple Builds (At least one of each configuration)

This scenario is a specialised case of #4 – the remote servers have different configurations (e.g. Windows, Linux, Mac, etc.). The build must run on at least one of each configuration. This looks something like:

image

The key here is that for each configuration the build must run at least once. There can be multiple machines with the same configuration (but there does not have to be.) The local machine can also be included as one configuration (optional.)
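The completion check for this scenario can be sketched in a few lines of Python. It assumes each finished build is reported together with the configuration label it ran for; that reporting shape and the function name are hypothetical.

```python
def missing_configurations(required, results):
    """Return the required configurations that have no successful build yet.

    required -- configuration labels that must each build at least once
    results  -- (configuration label, succeeded) pairs, one per finished build
    """
    built = {label for label, succeeded in results if succeeded}
    return [label for label in required if label not in built]


results = [("Windows", True), ("Linux", False), ("Linux", True), ("Windows", True)]
print(missing_configurations(["Windows", "Linux", "Mac"], results))  # ['Mac']
```

The build as a whole would only count as complete once this list is empty; extra machines with the same configuration do no harm.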

Extended Scenarios

While I see these as my initial scenarios, they can also be chained together. For example, a remote server could pass a request onto other remote servers:

image

This would allow people to build not only build farms, but also build pipelines where builds are shipped out over a potentially large distribution, with (hopefully) a simplified set of configuration options.

Sidestep Finished

Anyway, that’s enough of my sidestep. In my next post I WILL start looking into the configuration.

Stay tuned…

Distributed Builds: Part 2 (of 2+)

14 April, 2010

A Quick Recap

In my last post (here) I started exploring distributed builds in CruiseControl.NET. In it I listed a number of issues to investigate, plus two scenarios that I’m planning on implementing. These scenarios are:

  1. A server sends out a request to another server to build an entire project
  2. A project sends out a request to a server to build part of itself

Both of these will use the same underlying communications model, they will just call it in slightly different ways.

The Plan

.NET already has a few communications options built into it, from simple sockets (a.k.a. roll-your-own) through .NET Remoting right up to WCF. Previously we’ve used .NET Remoting, but this is a technological dead-end (just ask Microsoft!) It also has a number of issues, some of which have hurt us before (and continue to do so).

So, what is the recommended framework? WCF (at least according to Microsoft). Previously this has not been an option for us as Mono has not supported it (yes, there was Project Olive, but this was not progressing when we last checked.) But times have changed: Silverlight has come and nudged Mono into the realms of WCF, so now Mono at least supports the Silverlight subset of functionality (plus there is talk of implementing most of the WCF stack.)

So with this in mind I’m not going to stress over communications, instead I’m going to use WCF (and yes we can review this decision later.) This helps in resolving some issues (secure channels, communications through firewalls, etc.)

Next, I’m going to treat the remote machine as a dumb server – it will not know anything about the caller. The caller will need to do all the work for the communications. While this is not necessarily the best solution, it is the simplest and allows me to test out some basics (like what to pass around.)

At the moment the remote machine will just be another CruiseControl.NET instance – this means all the logic for performing builds will already be there. The majority of the work will focus on the logic flow and the communications.

The Messages

There will be the following messages in the system (at least initially):

image

The calling server will hold the master copy of the configuration, while the remote server will have the individual build processes. Each server will have its own logs – the remote server will have just the logs for the builds, while the calling server will have its own log plus a merged copy of the remote log.

The initial message in the sequence is “Check build…”. This message is basically a request to the remote server asking can you build this project? It is up to the remote server to handle the logic of which builds are accepted.

If the first message is accepted, the calling server then asks the remote server to perform a build (the “Build…” message). At the same time it will pass the configuration for what is needed to build. The remote server will accept this configuration, parse it and then build. It will return a unique build identifier to the calling server – this allows the calling server to check on the status of a build.

The calling server will then periodically poll the remote server for an update (via “Check status…”.) The remote server will return a status report containing all the necessary information (overall status, project item statuses, etc.) The calling server then uses this to update its own internal picture of what the remote build is doing.

If the “Check status…” returns that the remote build has finished then the calling server will request the log (via “Get log…”.) This log is then merged with the main build log.

The final message is more of a “what-if” message. If the build is cancelled on the calling server, then it needs some way to tell the remote server to cancel as well. This will not immediately cancel the build; instead it will call the logic on the remote server to cancel a build (in a similar way to abort build in the current system.) The calling server will continue to poll until the remote server says it has finished.
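To make the message sequence concrete, here is a small Python sketch of the calling side talking to an in-memory stand-in for the remote server. It is not WCF and not CruiseControl.NET code: the class, method names, status strings, log format and the five-second default poll interval are all assumptions chosen for the example.

```python
import itertools
import time


class FakeRemoteServer:
    """In-memory stand-in for the remote server; a real one would sit behind WCF."""

    def __init__(self, polls_until_done=2):
        self._ids = itertools.count(1)
        self._remaining = {}
        self._polls = polls_until_done

    def check_build(self, project):
        """"Check build...": the remote server decides which builds it accepts."""
        return True

    def build(self, project, configuration):
        """"Build...": accept the configuration and return a unique build id."""
        build_id = f"{project}-{next(self._ids)}"
        self._remaining[build_id] = self._polls
        return build_id

    def check_status(self, build_id):
        """"Check status...": report progress; finishes after a few polls."""
        self._remaining[build_id] -= 1
        return "Finished" if self._remaining[build_id] <= 0 else "Building"

    def get_log(self, build_id):
        """"Get log...": return the remote build log."""
        return f"<build id='{build_id}' />"

    def cancel_build(self, build_id):
        """Ask the remote build to stop; the caller still polls until it finishes."""
        self._remaining[build_id] = 0


def run_remote_build(server, project, configuration, poll_interval=5.0):
    """Run the check, build, poll, get-log sequence; return the log or None."""
    if not server.check_build(project):
        return None  # rejected: the caller may fall back to a local build
    build_id = server.build(project, configuration)
    while server.check_status(build_id) != "Finished":
        time.sleep(poll_interval)
    return server.get_log(build_id)


print(run_remote_build(FakeRemoteServer(), "distributedTest", "<project />", poll_interval=0))
# <build id='distributedTest-1' />
```

Because the remote server is "dumb", all of the sequencing lives in `run_remote_build` on the calling side.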

Hacking the Integration Loop

CruiseControl.NET already has an integration loop (with triggers, queues, events, etc.), so all I need to do is hack into this loop and add the remote integration stuff. This can be illustrated as follows:

image

The blue boxes are the already existing processes (much simplified). The purple boxes are the new changes to get distributed builds working, with the matching red boxes being the new changes on the remote server.

Basically the current logic will be unchanged; there will just be some additions. The current project loop will be used, but there will be an extra step to check if the build runs locally or not. If the build is not run remotely, then the process will be unchanged.

If the build is run remotely, it will pass off control to the remote server. This will be similar to running a project locally, except with some infrastructure around handling the communications between the two servers. The actual running of the project will be the same whether it is a local or remote project (the Run locally box), the only difference will be which machine it is running locally on.

To Be Continued…

These are the basics of how I’m going to hack into CruiseControl.NET to allow distributed builds. In my next post I’ll look into the configuration required, and then move on to the actual implementation.

Stay tuned…

Distributed Builds

12 April, 2010

An Objective

One of the nice features that CruiseControl.NET lacks is the ability to perform distributed builds. A distributed build is where some or all of a build is performed on another machine – thus opening up the possibility of having Continuous Integration “farms”.

Unfortunately there is a simple reason why CruiseControl.NET does not have distributed builds functionality – it is hard to implement and even harder to do both securely and simply.

A single CI server is reasonably easy to secure – since it is up to the program to decide what inputs to accept (note, I’m not talking about the security features that were recently added to CruiseControl.NET, this is looking at a deeper level, how does the program function internally.) As soon as a system becomes distributed it needs a way for its components to communicate. This communications channel then becomes a possible attack site for hackers!

The second challenge – simplicity – also ties into security. It can require some complex settings to secure a system so it cannot be hacked – yet at the same time the system needs to be simple enough that an administrator can configure the system without too many headaches (this is even more critical when it is an open source system without anyone producing professional documentation!)

Finally, to round things off, a distributed system is not just a single system running in multiple locations – there are a whole range of ancillary functions that are needed. Is there a single controller or is it a collaborative system? How do you log and monitor the various components? Are artefacts passed around the system or stored in a central location? How do the components get updated? And so on and so on!

But I’m a sucker for punishment, so I am going to give it a try. At this time I have no intention of resolving all the issues (since I’m only doing this for fun!) I’m looking at putting in the base ground-work for this functionality, with the hope that other people will also want to contribute. As such my work in this area will start off small and (hopefully) grow in size to handle all the challenges.

You have been warned…

Some Basics

There are different ways that we can implement distributed builds. Unfortunately, the way the code is structured within CruiseControl.NET limits some of these options. So to start with I’m only going to consider two scenarios:

  1. A server sends out a request to another server to build an entire project
  2. A project sends out a request to a server to build part of itself

Scenario 1 is a typical client/server association. There is a centralised server (potentially one of many) that hosts the standard integration cycle within CruiseControl.NET. This server would poll the triggers and handle the initial queuing, etc. The change will occur when the project is ready to build – first it will check whether (a) the project should run on a remote machine and (b) a remote machine is available. If both of these conditions are met, the build request will be passed to the remote machine. Otherwise the project will run locally.

Scenario 2 is the more interesting scenario. Here a project is triggered (either as currently or via a remote build – scenario 1). It will progress through its tasks as per normal, but there will be a new task – runRemote (or something similar). When this task is hit, it will bundle up all the sub-tasks and send them to the remote server to run.

Both of these are simple in concept, but they do have a few issues to cover:

Auditing

Every build generates an audit (or build) log. This log then gets stored in the artefacts folder in an XML format. For a distributed build the log needs to be shipped back to the calling server and merged with the main log.

CruiseControl.NET also has a nasty issue where the logs also contain all the results from the various tasks. This means the logs can be massive – so the last thing we want to do is load them into memory (or at least no more than we need to!)
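One way to keep memory use flat is to copy the remote log into the main log in fixed-size chunks instead of loading it as a string. The Python sketch below shows the idea with in-memory streams; the wrapping `<remoteBuild>` element is a made-up name, and treating the merge as a simple wrap-and-append is an assumption (a real merge may need more care, for example if the remote log starts with an XML declaration).

```python
import io
import shutil


def append_remote_log(main_log, remote_log, chunk_size=64 * 1024):
    """Copy remote_log into main_log chunk by chunk, wrapped in one element.

    Both arguments are binary file-like objects; at most chunk_size bytes
    of the remote log are held in memory at any time.
    """
    main_log.write(b"<remoteBuild>")  # hypothetical wrapper element
    shutil.copyfileobj(remote_log, main_log, chunk_size)
    main_log.write(b"</remoteBuild>")


main = io.BytesIO()
main.write(b"<cruisecontrol>")
append_remote_log(main, io.BytesIO(b"<build/>"))
main.write(b"</cruisecontrol>")
print(main.getvalue().decode())
# <cruisecontrol><remoteBuild><build/></remoteBuild></cruisecontrol>
```

In practice the two streams would be a file on disk and the response stream from the remote server rather than `BytesIO` objects.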

Artefacts

Generally the objective of a build is to produce some sort of artefact. Some of these artefacts are ignored (if you only want to ensure the code builds, you don’t care about the results), while other artefacts are needed later on in the build (code analysis might run on the binaries produced by compilation.) Some artefacts are merged into the logs, while others are separate (e.g. text/xml vs. binaries).

So we need some way to specify which artefacts are to be passed around and some way of doing it (again without overloading memory).

Configuration

There needs to be some way to pass the configuration around between the different components. Worst case scenario is the administrator configures each server individually, but I think that approach is far too error prone! Instead the calling server should pass the necessary configuration to the callee. This way the correct configuration will always be available.

We also need to consider how to configure both the server and the client applications so they know to talk to each other, preferably without screwing each other up.

Security

How can we limit communications so only authorised components can communicate with each other? How do we prevent external parties from eavesdropping on the communications and intercepting/modifying them? Finally, how do we make the communications open enough to go through firewalls, etc. without rewriting an entire communications stack!

Yes it is going to be fun :)

That’s All… For Now!

This is a start on looking into this issue. So far I’ve mostly listed some of my issues and concerns, plus areas that need to be investigated. In my next post on this topic I’ll start to look at some proof of concept work I’ve done in this area.

Stay tuned…

Integrating Remote Builds

1 December, 2009

The Current State

One of the limitations of CruiseControl.NET (CC.NET) is its handling of distributed builds. CC.NET does allow people to remotely trigger a build, but that’s about it. Additionally we are only allowed to use .NET Remoting, so if the remote CC.NET server is behind a firewall, there is no distributed functionality.

“That’s ok, at least we can trigger builds” you say. The problem is, that’s it – no results, no monitoring whether the build was successful or not – just a trigger! Ideally, it would be nice to trigger a build, wait for it to finish and then import the results into the triggering project.

Now, there are some very valid reasons why triggering is the limit of what is currently offered. Valid, but not very nice! First, there is the queued nature of CC.NET – when a build is triggered it is added to a queue. There may be a long time period between the trigger and the project actually starting to build. Next, there is the issue of builds taking a long time to run – a build could start and not finish for a long time. Finally there is the issue of importing the results – the remote build has its own build log, and just importing it into the log can cause some interesting results (although to be fair this is more of an issue with some of the XSL-T). And what if there are additional data fields that were not included in the log?

So, basically, we have a number of issues with the fundamental way that CC.NET works that prevent integrating remote builds. Or do we?

A Remote Project Task

With the changes for CC.NET 2.0, some of these issues are minimised, others can be worked around in the configuration. The stream changes mean the output from builds is now in separate files. Additionally the build log format has been changed to be more of an index, rather than a complete data store. Plus we have the ability to import any data files (or at least any that are in the artefacts folder). So these take care of the results issues – I’ll talk more about the other issues later. Finally, we have a new remote client API that simplifies accessing servers over different protocols. So we can add an improved remote build task.

I have added a new task to the codebase called remoteProject. This task triggers a build on a remote machine, waits for the build to finish and then imports the results. This uses the new build log format and the data transfer mechanism, so all the results can be imported!

As a basic start, here is an example of the task:


<tasks>
  <remoteProject projectName="Test" serverAddress="tcp://localhost:21234/CruiseManager.rem" />
</tasks>

This tells CC.NET to trigger the Test project on the server at the address tcp://localhost:21234/CruiseManager.rem. Any valid server address can be used – including HTTP and custom protocols. Additionally, the task supports a target attribute, so it can be used to trigger projects that sit behind a web dashboard.

How Does It Work?

Ok, this is where we show the nasty side of the task. The task basically does three things:

  1. Sends off a force build request to the remote server
  2. Waits for the remote build to finish
  3. Imports the results from the remote server

Step 1 is pretty simple – it’s almost exactly the same as the existing force build publisher. The main difference is that it uses the new Remote API and checks the result of the request. If the request fails, the task will also fail (although it won’t error out like the publisher).

Step 2 is the nasty part – it enters into a polling loop while it waits for the remote project to finish. Every five seconds it polls the status of the build and waits for it to finish. This effectively means the build thread is locked while waiting for the remote build to finish. So, if the project is in a queue, no other projects will build until the remote build is finished! Told you it is nasty!

Step 3 is pretty simple, it gets the log from the remote build, iterates through all the results and imports them. Finally, the rebased log index is imported into the build log. I say rebased as all the files that were imported will have their filenames changed to the new location (i.e. on the local server).
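Rebasing is easier to see with an example. The Python sketch below assumes the log index is just a list of file paths under the remote artefacts folder; the real index format, the folder layout and the function name are not taken from the CC.NET codebase.

```python
from pathlib import PurePosixPath


def rebase_index(index, remote_root, local_root):
    """Rewrite each indexed file path from the remote folder to the local one.

    Raises ValueError if an entry does not live under remote_root.
    """
    remote_root = PurePosixPath(remote_root)
    local_root = PurePosixPath(local_root)
    return [
        str(local_root / PurePosixPath(path).relative_to(remote_root))
        for path in index
    ]


remote_index = ["/remote/artefacts/Test/nunit.xml", "/remote/artefacts/Test/logs/build.txt"]
print(rebase_index(remote_index, "/remote/artefacts/Test", "/local/artefacts/Main"))
# ['/local/artefacts/Main/nunit.xml', '/local/artefacts/Main/logs/build.txt']
```

Keeping the path relative to the artefacts folder means sub-folders survive the move unchanged.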

Getting Around the Polling Limitation

At the moment, there is no way around this limitation! If we want results, we have to wait :-( But there are some hacks that can minimise this wait:

  • First, put the remote projects in their own queues (where possible). This means when a build is triggered it will start almost immediately.
  • Second, try not to make the remote builds too time consuming. If necessary, split them into smaller builds and call multiple projects concurrently (see the next point).
  • Third, try to call remote builds concurrently using the parallel task. This means the current project is waiting for multiple remote builds to finish, instead of just one.
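The benefit of the third point is that the waits overlap rather than add up. This is not the parallel task itself, just a Python sketch of the same idea using a thread pool; `wait_for_remote_build` is a hypothetical stand-in for "trigger the remote project and poll until it finishes", and the project names and durations are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_for_remote_build(project, duration):
    """Stand-in for triggering a remote build and polling until it is done."""
    time.sleep(duration)
    return project, "Success"


def run_concurrently(builds):
    """Wait for several remote builds at once; total time is about the longest one."""
    with ThreadPoolExecutor(max_workers=len(builds)) as pool:
        futures = [pool.submit(wait_for_remote_build, name, secs) for name, secs in builds]
        return dict(future.result() for future in futures)


print(run_concurrently([("UnitTests", 0.2), ("Installer", 0.1)]))
# {'UnitTests': 'Success', 'Installer': 'Success'}
```

The calling build thread is still blocked while it waits, so this only shortens the wait; it does not remove it.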

In the future I hope to add a new “Paused” project status. This means other builds in the queue can continue building and when the remote builds have finished, the paused build will resume. However like a lot of things, this involves breaking CC.NET in some major ways so it won’t be an easy task.

I’m also planning on adding a call-back mechanism. This means instead of polling, the project will be paused. When the remote project has finished, it will tell the paused project to continue (no need to poll!)

So, in the short term there is no way around polling, other than minimising the delay. In the future this will be smoother.

Use At Your Own Risk

At the moment this task is still very much in its infancy. In reality, it is more of a proof-of-concept. Yes, we can run remote builds and import the results, no it is not production ready. So let me know what you think, plus send me your ideas for how it can be improved. Also, it would be good to know about any limitations you can think of with this task.

And in case you are wondering, this is only a baby step. I still have other plans for making CC.NET distributed, including distributed tasks, automatic upgrades, configuration propagation, etc. But they take time, especially to do them properly ;-)

So, stay tuned…

I’m Back: Holiday’s Over, Time to Write!

17 November, 2009

Back to Life, Back to Reality

I have returned from my annual visit to see the in-laws in China, and once again I have worked on some nice goodies to add to CruiseControl.NET (CC.NET).

This time I was looking at two specific areas:

  • Converting to streams for results
  • A Silverlight RIA

Stream-based Results (Project Ares)

The streams functionality is to try and resolve a long-time issue with running out of memory. Currently CC.NET performs all of its result processing in-memory as strings. This means when a task runs, it generates an in-memory version of the results (e.g. from stdout/stderr or from importing file results). These results are then appended to an ever-increasing copy of the log file, which is finally written out to disk. Now this is fine if 1) the results are small or 2) you have lots of memory, but this can cause problems if these conditions are not met! Additionally the same problem applies not only to generating the results, but retrieving the results!

The solution is simple in concept – instead of writing to memory, write directly to disk – hence changing to using streams instead of strings. However like any non-trivial modification, this has a number of far-reaching implications, so it was not as easy as just changing from generating strings to writing to streams.

However, the good news is I got it working, although it still needs a bit of polish to get it working nicely. And the other good news is I documented what I did along the way :-) So over the next few weeks I’ll be reviewing and publishing this documentation on my blog. These posts will be published under Project Ares.

Silverlight Client (Project Capricorn)

The other area I played with was writing a Silverlight 3.0 client for CruiseControl.NET. This was more of a fun project to see what was possible (although it didn’t help that I was offline and had to go by trial and error).

For this project I wanted a similar type of interface to the current dashboard. This allows people to view information at four levels: farm (all monitored servers), server, project and build. More challengingly it allows people to develop their own plug-ins (either via code or XSL-T) and use these within the UI.

As well as implementing a basic dashboard-like UI, I wanted to leverage some of the rich functionality that is available in Silverlight – things that are possible in HTML/CSS/JavaScript, but are challenging to do.

So, I have put together a very rough implementation of a Silverlight client, although it is very much at a prototype stage. This allows for the basics of the UI (layout, navigation, etc.) plus a plug-in infrastructure for adding new plug-ins. Unfortunately both need work to get up to release level.

So, once I have finished writing about the stream changes I’ll write up about the Silverlight client under Project Capricorn. Hopefully there will be some interest in it, so we can look at completing the project and including it in the official codebase (probably for CC.NET 2.0). Otherwise I’ll move it to the FastForward.NET project and work on it as I have time.

But Wait, There’s More!

Another area that I’ve been slowly working towards for a while now is the ability to make CC.NET distributed. CC.NET as it currently stands has some distributed elements, but it doesn’t really work as a distributed application. Some of the changes I’ve been working on (messaging, hot-upgrades, etc.) have been pieces of the distributed puzzle. The streaming work added a couple more pieces of the puzzle, plus showed a few more challenges to be resolved!

So I’ll be adding a few posts on enhancing CC.NET to be distributed – either some of the issues involved or some of the pieces that have been added. Hopefully by the time we (finally) get to the 2.0 release CC.NET will be able to work as a distributed application :-)

That’s All For Now

So that’s what I’m planning on writing up over the next few weeks. At the same time I’ll be adding the source to SourceForge (under the CCNet2 branch) and hopefully spending some time with the other devs on getting CC.NET ready for the “official” 1.5 release.

Stay tuned…
