Automated Coder

Exploring the Code of CruiseControl.Net

An Experimental Branch for CC.NET

Posted by Craig Sutherland on 28 September, 2009

Recently I’ve been exploring a few options for refactoring CruiseControl.NET to handle the out of memory exceptions that we have been getting. Unfortunately a lot of the paths I’ve investigated have turned out to be red herrings – they either have wide-reaching repercussions or they involve significant changes to the server. I’ve even tried profiling the application using dotTrace, without finding any obvious areas for improvement.

So, it’s time to get a little more drastic. Since we are aiming to release the 1.5 version sometime this year (hopefully), I have started a new “experimental” branch. In this branch I’m planning on looking at some more drastic refactoring to resolve the memory issues, plus a couple of other ideas I want to try out for a “CruiseControl.NET 2.0”.

And in case you are wondering, here are some of the ideas I want to try out:

  • Distributed builds: being able to take a single project and distribute it over multiple machines
  • Build agents/build distributions: extend CruiseControl.NET so there can be master/slave instances, with the master being responsible for distributing build requests across multiple machines
  • Data storage layer: move all of the file I/O into a common layer – this is in preparation for adding database persistence

Now, these are just ideas at the moment – with my current amount of free time it’ll be a while before any of these come to fruition. But I’ll be documenting things as I go, as usual, so if you are wondering why I am writing about these things, this is why.

Posted in CruiseControl.Net

FastForward.NET: Beta 4 Release

Posted by Craig Sutherland on 19 September, 2009

It’s been a while – I’ve been sidetracked trying to track down some performance issues with CC.NET. I have just posted the binaries for the fourth beta, although most of the changes have been around for a few weeks. The following items have been added/fixed:

  • Swapped Ok and Cancel buttons
  • Cleaned up the settings dialog so the name of the tab is not on each button within the tab
  • Added a double-click action to the all projects grid – the user can choose which action to perform
  • Added servers and projects to the system tray – clicking on a project triggers the double click action

I am running this as my CC.NET monitor and it seems to be working. I know there are a couple of issues with configuration that need to be resolved, otherwise it is ready for release :)

The binaries can be downloaded from https://www.ohloh.net/p/FastForwardNET/download.

Posted in FastForward.NET

FastForward.NET Relocated

Posted by Craig Sutherland on 18 September, 2009

I’ve been having a number of issues with SourceForge – mainly with its performance. And I’m not the only one – one of the other developers for CruiseControl.NET has been so frustrated that he has set up his own Subversion server, together with a Trac instance.  Additionally, he has been kind enough to let me host FastForward.NET.

So, everything has been moved from SourceForge. I will keep an eye on SourceForge for any issues that people may raise, but I won’t be updating it at all.

So, here are the links for FastForward.NET:

Many thanks to Daniel Nauck for letting me use his servers.

Posted in FastForward.NET

Memory Issues with CruiseControl.NET

Posted by Craig Sutherland on 16 September, 2009

The Problem Defined

I’ve been spending a bit of time trying to resolve a number of outstanding issues in JIRA about running out of memory with CruiseControl.NET:

These are all different examples of getting an OutOfMemoryException when performing a build. Basically a task generates a large output, which CC.NET then attempts to merge into the standard build result. Unfortunately some of these issues have been around a long time (CCNET-819 was first raised in Jan 2007!) which implies that this is a fairly deep rooted issue.

This post contains what I’ve found out so far, and some of my changes to try and reduce the memory usage. Unfortunately, I say “try” as this is both a hard problem to replicate and a hard problem to resolve!

Some Investigation

Looking at the stack traces, the basic issue is with strings. CC.NET is loading the entire build log into memory and manipulating it. Even worse, it can be getting various parts of the results and manipulating them, before writing them into the build log.

When a task executes, it can generate multiple ITaskResult instances. These instances have a Data property which is a string. As you can imagine, this leads to the data being loaded into the various implementations and sticking around until the build has finished. So, if a task generates a 20MB output, this is added to memory and held for the entire remaining duration of the build. Actually, it’s probably held even longer, as the memory will not be released until a garbage collection is performed.

But, don’t worry, things are even worse! As ITaskResult is an interface, there are a number of different implementations. The implementations I have found are:

  • DataTaskResult
  • FileTaskResult
  • ProcessTaskResult

DataTaskResult is the simplest of the three – it just provides a backing field for the property that contains the string. When this class is initialised the string is loaded and held there until the result is cleaned up. While this is the default result generated, from what I can see in the code it is not actually used (except in the null task.)

FileTaskResult is a view onto a file. Typically a task will generate a file (e.g. when an external application is called) and then this result type is generated to reference the file. Now for the bad news – when this result is generated it opens the file and loads the entire file into memory! So if the task generates a huge file (e.g. NCover results for a large code base, etc.) the file is loaded into memory and hangs around like a bad smell :-(

The final result, ProcessTaskResult, is the most complex of the three. It is also the cause of one of the issues. When an external task is executed it normally uses the ProcessExecutor class. The output of this class is a ProcessResult, which contains all the output written to StandardOutput and StandardError. And yes, these are both stored as strings. ProcessTaskResult manipulates these strings to generate the final output, and that’s where the problems start coming in.

Some Background

If you are wondering why I am picking on strings at this point, here is some background. In .NET strings are immutable. Once a string has been allocated it cannot be changed. But what about the string manipulation functionality? Unlike C++, these functions do not modify the string. Instead they generate a new instance of the string (with the modifications of course), which is then another immutable string in memory.

So for example, if we had the string “This is a test   ” and we wanted to remove the extra whitespace we would call string.Trim(). At this point we now have two strings in memory: “This is a test   ” and “This is a test”. Even if we assigned the new string to the old string variable, this is still the case.
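The Trim() example can be checked directly. This is only a small sketch (the class and method names are made up for illustration), but it shows that the original string is untouched and that the trimmed result is a separate object:

```csharp
using System;

// Hypothetical demo class, not part of CC.NET.
public static class ImmutableStringDemo
{
    // Returns true when Trim() left the original untouched and
    // handed back a second, separate string object.
    public static bool TrimAllocatesNewString()
    {
        string original = "This is a test   ";
        string trimmed = original.Trim();

        bool originalUnchanged = original == "This is a test   ";
        bool differentObjects = !object.ReferenceEquals(original, trimmed);

        return originalUnchanged && differentObjects && trimmed == "This is a test";
    }
}
```

Both strings stay in memory until the garbage collector decides that nothing references the old one any more.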

So, when is the old string removed from memory? When garbage collection occurs. So on a heavily loaded machine where garbage collection is running slowly, these strings can hang around for a while.

Of course, for my short example, this isn’t really a problem – most machines could hold millions of strings like these without any problems. But, imagine if the string is 20MB in length (this isn’t too out of the ordinary for some processes). All of a sudden the memory will be chewed up very quickly (especially as the OS likes to take a fair chunk).

At this point, I should mention most OSs nowadays can use disk swapping to extend the amount of free memory. However problems occur if the OS is unable to find a large enough contiguous block of free space to allocate – this is typically what is causing the out-of-memory errors. RAM has been filled and the OS is unable to swap out some memory.

So, short of playing around with garbage collection (which I have no intention of even attempting), our best approach is to reduce the number of strings we are generating.

Starting Small

As I mentioned earlier, FileTaskResult loads the entire file into memory when the class is initialised. The first change is to load the file only when it is needed, and not store a reference to the string. Garbage collection works by detecting whether an object has been orphaned – that is, whether there are any references to the object. If there are no references, then garbage collection will remove it (at least that is my understanding).

So now file results will only be loaded when they are needed, and then disposed of as soon as possible (again depending on garbage collection). This gives us a little more room to manoeuvre. But, it’s only the tip of the iceberg.
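To make the idea concrete, here is a rough sketch of a file result that loads on demand. It is not the actual CC.NET code: ISimpleTaskResult and LazyFileTaskResult are hypothetical names, and the real ITaskResult interface has more to it than a single property.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the interface described above.
public interface ISimpleTaskResult
{
    string Data { get; }
}

// Hypothetical result type that only remembers where the file is.
public class LazyFileTaskResult : ISimpleTaskResult
{
    private readonly string path;

    public LazyFileTaskResult(string path)
    {
        // Only the path is stored; the file is not opened here.
        this.path = path;
    }

    // Reads the file on every access and keeps no reference to the contents,
    // so the string can be collected once the caller has finished with it.
    public string Data
    {
        get { return File.ReadAllText(path); }
    }
}
```

The trade-off is that every read of Data hits the disk again, so a caller that needs the contents more than once pays for the extra I/O instead of the extra memory.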

Strings upon Strings upon Strings

Looking at the way ProcessTaskResult works, and how the instances are generated, there are a lot of strings being generated.

As an example, in ExecutableTask it needs to check if there is any output from the executable. This output can be from either standard out or standard error. The literal line is:

if (!StringUtil.IsWhitespace(processResult.StandardOutput + processResult.StandardError))

This combines the two outputs together to generate a new string – hence twice the memory allocation. So, this can be changed to:

if (!StringUtil.IsWhitespace(processResult.StandardOutput) || !StringUtil.IsWhitespace(processResult.StandardError))

which means the original strings are used instead of generating a new string.

Next, digging a little deeper, this is how StringUtil.IsWhitespace() works:

return value.Trim().Length == 0;

As you can see, this is generating another string! If the string is whitespace, then there is no overhead – it will just generate an empty string, no matter how large the original was. But, if the original had 20MB of non-whitespace, there is now a second 20MB string generated!

I’ve replaced this approach with a slightly more complex approach. First I check if the string is null or empty – if it is then the string is considered whitespace. If it is not null or empty, then the new routine iterates through every character in the string and checks if the character is a whitespace character (using char.IsWhiteSpace()). If the character is not whitespace, the loop is exited and false is returned. Otherwise, it will continue through the entire string and return true if no non-whitespace characters are encountered.

I’m not sure on the performance loss with this change, but I imagine the string data type will be doing something similar with its Trim() method, so it shouldn’t be too bad. Plus this has the advantage of not generating a new string – hence lower memory usage.
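The routine described above looks roughly like this. Treat it as a sketch of the approach and not the exact code that went into CC.NET; the class name StringChecks is just a placeholder so it isn't confused with the real StringUtil.

```csharp
using System;

// Placeholder class name; the real method lives in CC.NET's StringUtil.
public static class StringChecks
{
    // Returns true for null, empty, or all-whitespace strings
    // without allocating a trimmed copy.
    public static bool IsWhitespace(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return true;
        }

        for (int index = 0; index < value.Length; index++)
        {
            if (!char.IsWhiteSpace(value[index]))
            {
                // Found real content, no need to look any further.
                return false;
            }
        }

        return true;
    }
}
```

For typical process output the loop exits on the first non-whitespace character, so it rarely has to walk the whole string.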

There is a third change that I am thinking about, but I haven’t done yet. When the ProcessTaskResult is generated, the caller often calls StringUtil.MakeBuildResult(). This converts the original newline delimited string into an XML structure with an element per line. However I’m not sure exactly whether this will provide any improvements, so I’ve left it for the moment.

Baby Steps

These are my first few baby steps to reducing memory usage in CC.NET. At the moment I’ve just been looking at the server. My initial steps have been to reduce the memory usage by reducing the number of strings generated and held in memory. While I’m hoping this will reduce memory usage, I don’t think it will have too much of an impact (big sigh).

The real problem, and the massive challenge, is to remove the strings from memory as much as possible.  I have tried looking into using streams, but this will be a massive change :-(

Second, this is only looking at the server side. Two of the issues are with the dashboard – which is a whole different area to look into.

Anyway, baby steps, I’ll continue looking into what can be done to improve the memory usage.

Posted in CruiseControl.Net

One Year On

Posted by Craig Sutherland on 23 August, 2009

Happy anniversary! Yes, my first post on this blog was one year ago! That was when I first ventured into the wonderful world of open source and embarked on enhancing and improving CruiseControl.NET.

Since then a lot of things have changed! Here are some of the highlights:

  • I have a son (he’s almost five months old now!)
  • I’ve written 178 posts
  • I’ve changed 60 thousand lines of code (wow!)
  • I’ve added a new application to CC.NET (CCValidator)
  • And I’ve started my own open source project (FastForward.NET)

In terms of CruiseControl.NET, I’ve added the following functionality:

  • Security
  • Dynamic build values and parameters
  • Message-based communications
  • Common communications client
  • Monitor API
  • NDepend and NCover tasks
  • Image and HTML support in the dashboard
  • Packages
  • Hot-swapping
  • Parallel and sequential tasks
  • Task statuses

Plus I’ve gotten to know a great bunch of people, and learnt a lot about coding and development.

So, all in all, one year down the line, I think it has been very much worth it :D

Here’s to another year!!!

Posted in CruiseControl.Net, FastForward.NET

FastForward.NET: Beta 3 Released

Posted by Craig Sutherland on 22 August, 2009

Yes, I’ve managed to stick to my weekly release schedule – I have just posted the binaries for the third beta. As well as expanding the installer to include the build templates and release notes, the following items have been added/fixed:

  • Added a properties display for projects, servers and build queues
  • Added global error handling and reporting
  • Save the position of pop-up windows so they can be restored when re-opened
  • Remove the next build date display if a project does not have a next build time
  • Standardised on how the server name is generated – should now always display the local name
  • Added the server name to the build visualisations
  • Fixed bug where plug-ins can be duplicated
  • Plug-ins can now hide the application instead of closing
  • Added estimated time of completion

This release is now a lot more stable – I have been able to run it for the entire day on my machine without any crashes. However, if you find any issues, let me know :-)

The binaries can be downloaded from https://www.ohloh.net/p/FastForwardNET/download.

Posted in FastForward.NET

FastForward.NET: Error Handling

Posted by Craig Sutherland on 21 August, 2009

Oh No, It’s Crashed!

It’s hard building real-world software. Not that writing software is hard, no it’s the “real-world” part that is hard! Why? Because the real world is messy!

I’m not talking about a little bit of mud or slime that can be easily wiped off, no, I’m talking about your kids haven’t cleaned their bedrooms in a year and they can’t find that check they were supposed to bank three months ago messy. Multiply that by a hundred kids, in different countries, some of whom don’t even understand you!!

Not that I’m advocating sloppy coding, but sometimes, no matter how beautiful the code is, things just break. For no apparent reason (yes, I know it worked on your machine when you wrote it.) That’s why someone coined the term defensive programming. And, like every developer’s code, mine needs defending just as much.

Hence, in the next release I’ve added some error handling code.

You Mean It Isn’t Perfect?

Yes, FastForward.NET isn’t perfect :-(

So I have added a generic handler for all unhandled exceptions (yes, there are places where I expect problems and have error handling in place.) If one of these errors occurs you’ll get a message like this:

image

This tells you there was an error, it logs the details and tells you where and then advises shutting down the application. I could have been mean and just killed the application, but I’ll give you the choice (it is highly recommended though!)

What’s in the Log?

The reason I generate the logs is that it is nice to get some data on what caused the crash. So, here is an example log:

<?xml version="1.0" encoding="utf-8"?>
<errorLog>
  <environment time="2009-08-19T19:23:00.2808000+12:00"
               os="Microsoft Windows NT 6.0.6001 Service Pack 1"
               processors="2"
               clr="2.0.50727.3074" />
  <error type="System.Exception">
    <message>Hey, you've broken something!</message>
    <stack>
      at FastForward.Monitor.Plugins.BuildVisualisationDisplay.OnMouseDown(MouseEventArgs e) in C:\Open Source\FastForward.NET\trunk\project\FastForward.Monitor\Plugins\BuildVisualisationDisplay.cs:line 136
      at System.Windows.Forms.Control.WmMouseDown(Message&amp; m, MouseButtons button, Int32 clicks)
      at System.Windows.Forms.Control.WndProc(Message&amp; m)
      at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message&amp; m)
      at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message&amp; m)
      at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
      at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG&amp; msg)
      at System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
      at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
      at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
      at System.Windows.Forms.Application.Run(ApplicationContext context)
      at FastForward.Monitor.Program.Main() in C:\Open Source\FastForward.NET\trunk\project\FastForward.Monitor\Program.cs:line 46
      at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
      at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
      at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
      at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
      at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
      at System.Threading.ThreadHelper.ThreadStart()
    </stack>
  </error>
</errorLog>

As you can see, there’s nothing confidential :-) So if you get one of these, it would be nice if you could post it in an error report on Trac.

And of course, feel free to add any additional information you think would be helpful. I promise I will try to resolve any issues as fast as possible.
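For anyone curious how a log in the shape shown above could be produced, here is a minimal sketch. It is not the FastForward.NET source: ErrorLogWriter and HookUnhandledExceptions are names invented for this example, and it only hooks AppDomain.CurrentDomain.UnhandledException (a Windows Forms application would normally also listen to Application.ThreadException for errors on the UI thread).

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, written for illustration only.
public static class ErrorLogWriter
{
    // Writes the environment details and the exception as XML.
    public static void Write(Exception error, TextWriter output)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("errorLog");

            writer.WriteStartElement("environment");
            writer.WriteAttributeString("time", DateTime.Now.ToString("o"));
            writer.WriteAttributeString("os", Environment.OSVersion.ToString());
            writer.WriteAttributeString("processors", Environment.ProcessorCount.ToString());
            writer.WriteAttributeString("clr", Environment.Version.ToString());
            writer.WriteEndElement();

            writer.WriteStartElement("error");
            writer.WriteAttributeString("type", error.GetType().FullName);
            writer.WriteElementString("message", error.Message);
            // StackTrace is null for an exception that was never thrown.
            writer.WriteElementString("stack", error.StackTrace ?? string.Empty);
            writer.WriteEndElement();

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }

    // Registers a last-chance handler that writes the log to a file.
    public static void HookUnhandledExceptions(string logPath)
    {
        AppDomain.CurrentDomain.UnhandledException += delegate(object sender, UnhandledExceptionEventArgs args)
        {
            Exception error = args.ExceptionObject as Exception;
            if (error == null) return;

            using (StreamWriter file = new StreamWriter(logPath))
            {
                Write(error, file);
            }
        };
    }
}
```

The XML writer takes care of escaping, which is why characters such as & turn up as &amp;amp; in the stack trace.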

Posted in FastForward.NET

FastForward.VisualStudio

Posted by Craig Sutherland on 19 August, 2009

The Concept

As part of the FastForward.NET suite, I am planning on adding a Visual Studio plug-in. This plug-in will allow developers to interact with one or more CruiseControl.NET servers from the comfort of their IDE :-)

My current plan is to include the following items:

  • Projects view tool window
  • Queues view tool window
  • Server explorer tool window
  • Build report document

Plus I would also like to add some options for working with configuration files:

  • Intellisense
  • A configuration editor (maybe using a DSL)

And maybe even expand the configuration API so it is possible to configure a CruiseControl.NET server from within VSIDE – yes, I definitely have some fun ideas ;-)

A Reality Check

Before everyone starts getting excited (including myself), I must warn people I am not a Visual Studio developer – this will be my first attempt at building a proper plug-in for Visual Studio.

When I first started on FastForward.NET I planned on releasing the Visual Studio and Monitor components at the same time – after all, they can use a lot of the same underlying infrastructure. Unfortunately my experiments with building a Visual Studio plug-in rather discouraged me and so I kind of gave up. COM interop, resource issues, trying to get things appearing in the right place – it all seemed too much for just a hobby project.

However, all was not lost, I discovered VSSDK – a managed API for building plug-ins. So I will be trying to implement the plug-in using this instead of directly working with Visual Studio and COM. This will mean a bit of learning for me, but so far I have had more success than I had previously!

Progress to Date

I’ve got to admit, not a lot :-(

I’ve been playing around with VSSDK in my spare time (in between improving the Monitor and fixing issues.) I like to try and understand what I am doing, as that helps resolve weird issues that come up later if I just copy and paste.

The good news is I finally have a tool window working in Visual Studio:

image

This can be configured to load or remove servers.

Unfortunately, I am missing something somewhere, as all my menu commands are duplicating or worse:

image

*sigh*, some things are not as easy as you would think!

This is not ready to release as part of the installer yet, not even as an “experimental” feature, but the code is in Subversion. So, if anyone is feeling brave, feel free to download it, build it and give it a test run. Of course, I’m more than happy to receive advice on how to do things ;-)

I am aiming to have an initial release early October, so stay tuned…

Posted in FastForward.NET

Why Targets?

Posted by Craig Sutherland on 17 August, 2009

This post is the answer to a question I was asked recently.

The Question

Looking at the add server window, there is an address field and a target server field:

Initial-7a

The question was why are both necessary?

The Answer

The answer to this question lies in the way the different components of CruiseControl.NET (CC.NET) communicate. The following picture shows the components and their interactions:

Communications

What we are really interested in monitoring is the CC.NET server. However, the server is not always what the client connects to. It is only a direct connection if .NET Remoting is used (or a custom protocol). If someone is connecting via the web dashboard, the address they enter is the address of the dashboard, not the CC.NET server.

Since we want to monitor the actual server, not the dashboard, the target server field is required. This tells the dashboard that although it has received a request, the request is for an actual CC.NET server and needs to be passed on.

Now, in CCTray this field is not needed. But then a lot of the functionality in CCTray is only available for a .NET Remoting connection – it will fail on an HTTP connection. And this is exactly the reason why it fails – the dashboard guesses which server the request should be sent to, and it can’t always get the right server!

Related Questions

This leads to a few other questions:

1: What is the target server for a dashboard connection?

In the dashboard, it displays the available target servers:

Dashboard-Farm

The available servers are listed on the left under the heading “Servers”.

2: What is the target server for a .NET Remoting connection?

Currently this is ignored, but in future CC.NET may handle routing to other servers (i.e. in a farm scenario). Therefore, this should be left as “local”. This will tell CC.NET that the request is for the local server and should not be passed on.

3: Is there a way of listing the servers in the monitor?

At the moment CC.NET does not provide a list of possible target servers. This needs to be added to CC.NET first (i.e. in the dashboard) and then I can add the listing to the monitor.

This is on the to-do list, but first I need to resolve some of the other issues in FastForward.NET.

4: If I want to monitor multiple CC.NET servers that are on one dashboard, do I need to add them individually?

Yes. FastForward.NET monitors the individual servers – not the dashboard.

In future I may add the ability to monitor a dashboard instead of the individual servers – but this means the interactions that are currently provided will be more limited (i.e. to what is allowed by the dashboard).

Posted in FastForward.NET

FastForward.Cache – An Experiment in Availability

Posted by Craig Sutherland on 16 August, 2009

The Idea

This is another idea that I am playing around with under the FastForward.NET banner – a cache service. Before I go into how I am implementing the cache, first let’s look at how clients communicate with a CruiseControl.NET server.

Some Background

A CruiseControl.NET (CC.NET) server only natively supports one protocol – .NET Remoting. So, in order to connect to the server a client must use .NET Remoting.

But, you say, what about HTTP? Yes, it appears that CC.NET has an HTTP protocol, but this is actually done via the web dashboard. This is a separate application that runs in a web server, which connects to the actual CC.NET server via .NET Remoting.

Additionally, with CC.NET 1.4.4 or later, it is possible to add a custom extension that adds a new protocol – not that I am aware of anyone doing this yet.

So, the following options are available:

image

Often the web dashboard and the CC.NET server will be on the same machine, but there is no requirement for this. Additionally, a single dashboard can be the front-end for multiple CC.NET servers – providing more functionality and more complexity to the mix!

Now, if all of these are on a single network, which is reasonably fast, this infrastructure is not a problem. But add a bit of network lag, or a large number of servers, and suddenly things will slow right down.

One Scenario

To show what I am proposing as a solution, I want to look at a reasonably simple scenario. In this scenario we have five CC.NET servers that need to be monitored. These are all monitored by a single dashboard, and the client then connects to the dashboard to see the project details:

image

(Yes, I did choose FastForward.NET Monitor as the client :-) )

Whenever the client wants to get the status of the servers, it will send a request to the dashboard. The dashboard then sequentially goes through and sends a request to each of the servers, then finally sends the response to the client. Now imagine there is a 0.25s lag in each direction when connecting to each server – it would take 3s just to get one response back to the client (five round trips of 0.5s is 2.5s for the dashboard to query all the servers, plus a 0.5s round trip between the client and the dashboard)!

Now, expand that out slightly – imagine three clients:

image

The dashboard does not cache any results – so every time a client wants to refresh the status it has to go through the process of querying each server. So, with the default polling interval for FastForward.NET (and CCTray as well), this means the dashboard is going to be asking at least one of the CC.NET servers for its status at any point in time!

Now a 0.25s lag might seem like a lot, but over an Internet connection this is not at all unreasonable (actually, with some of the dashboards I connect to, a 0.25s lag would be very quick!)

The problem in this scenario is the dashboard just acts as a pass-through layer – it doesn’t provide any value on its own. And that’s just for status updates – there are also all the other items that CC.NET provides (e.g. build logs, RSS feeds, etc.)

What I am thinking of is replacing the dashboard as the intermediate server. In its place there would be the FastForward.NET cache service:

image

This looks very similar to the previous picture, but with one major difference – the cache service would have a local database.

Now, if we open up the hood of the cache service, it will work something like this:

image

When a client wants to request some data, the request comes in over a client connection (1). The client connection receives the request and checks the local database to see if the data is there (2). If the data is there, then this data is used (3). Otherwise a request is sent to the server connections for the data (4). This request gets passed on to the server (or servers) (5). When the data is received it is stored in the database (6) as well as being passed back to the client.
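In code, the request flow in steps (1) to (6) might look something like the sketch below. This is not the experimental service itself: ReadThroughCache is a made-up name, an in-memory dictionary stands in for the local database, and the fetchFromServer delegate stands in for the server connections.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the request flow; not thread-safe.
public class ReadThroughCache
{
    // Stand-in for the local database.
    private readonly Dictionary<string, string> localStore = new Dictionary<string, string>();

    // Stand-in for the server connections (steps 4 and 5).
    private readonly Func<string, string> fetchFromServer;

    public ReadThroughCache(Func<string, string> fetchFromServer)
    {
        this.fetchFromServer = fetchFromServer;
    }

    // Step 1: a client request arrives for some piece of data.
    public string Get(string key)
    {
        string cached;
        if (localStore.TryGetValue(key, out cached))
        {
            // Steps 2 and 3: the data is already stored locally, so use it.
            return cached;
        }

        // Steps 4 and 5: ask the CC.NET server for the data.
        string fresh = fetchFromServer(key);

        // Step 6: store it locally before passing it back to the client.
        localStore[key] = fresh;
        return fresh;
    }
}
```

With this shape the second request for the same build log never reaches the CC.NET server, which is exactly where the saving comes from.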

This would be the standard model for most requests – for status requests it would be possible to have an additional disconnect:

image

In this model, there is only the client connection loop (1-3) – if there is no data in the local database then no status information will be returned to the client. Instead, the server connections will be constantly polling the CC.NET servers (a) and then storing the status information in the local database (b). This means when the client needs the information, they just use the local data, which saves having to query one or more servers for their status.
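The disconnected status model could be sketched along these lines. Again this is an illustration under assumptions and not the real service: StatusCache is a hypothetical name, each server is represented by a delegate that returns its status, and PollOnce would be driven by a timer or background thread in practice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the disconnected status model.
public class StatusCache
{
    private readonly object sync = new object();

    // Stand-in for the local database.
    private readonly Dictionary<string, string> statuses = new Dictionary<string, string>();

    // One status query per CC.NET server, keyed by server name.
    private readonly IDictionary<string, Func<string>> servers;

    public StatusCache(IDictionary<string, Func<string>> servers)
    {
        this.servers = servers;
    }

    // (a) and (b): query every server and store the result locally.
    public void PollOnce()
    {
        foreach (KeyValuePair<string, Func<string>> server in servers)
        {
            string status = server.Value();
            lock (sync)
            {
                statuses[server.Key] = status;
            }
        }
    }

    // (1) to (3): clients only ever read the local data.
    // Returns null when nothing has been stored for the server yet.
    public string GetStatus(string serverName)
    {
        lock (sync)
        {
            string status;
            return statuses.TryGetValue(serverName, out status) ? status : null;
        }
    }
}
```

Clients never wait on a CC.NET server in this model, but they only ever see what the last poll stored.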

Limitations

Of course, this model is not without its limitations. First, this would only work for retrieving data – action requests (force/abort build, start/stop project, etc.) would pass through directly to the CC.NET servers.

Second, this does not take into account security – all secure requests would again pass right through (especially if they are encrypted).

Third, it means the data the client is getting can potentially be very out-of-date. For some data this is not an issue – e.g. build logs are only written once and then never change. Other data could be potentially very wrong – e.g. the project statuses.

Any Comments

I have already started working on a service to do this – although at the moment it is very experimental. However I do think this could have a lot of value, especially for large installations and even more so for those with slow networks (or slow Internet connections).

Does anyone have any comments?

Posted in FastForward.NET