The Problem
This is a continuation of my earlier post on memory issues (read it here). In that post I made some minor changes to reduce memory usage (mainly around string handling). However, those changes didn’t fix the underlying problem – CruiseControl.NET is working with large strings in memory.
Now normally the OS and .NET will allow us to play around with large objects in memory without too many issues. Normally…
Sometimes it is possible to use up all the available memory (including swap files, etc.). Why? Because we put too many big strings into memory – and that’s exactly what CruiseControl.NET does. To make things worse, the way CruiseControl.NET works compounds the problem.
So, how does CruiseControl.NET work?
First off, we have the server. This is basically a polling application – every five seconds it checks if a build can start and, if so, triggers the build (actually it’s a little bit worse than that: each project has its own thread which polls every five seconds, but that’s another matter altogether). As the build runs it builds up a set of task results, which then get merged together into one big log file.
So, it is possible to have multiple projects, each having multiple large task results in memory. Then, when these get merged into the build log, that’s even more memory used. But wait, there’s more!
As well as the project threads, the server is also responsible for serving results to anyone who requests one of these build logs. This is done by loading the log into memory and then returning the entire log (via .NET Remoting) to the client.
So, as well as having multiple task results and multiple build logs in memory, it is also possible to have multiple instances of the build logs – one per request handler (although in theory these would be quick requests and shouldn’t hold onto memory.) And yes, there’s still more!
The CruiseControl.NET server only allows .NET Remoting clients – which isn’t very helpful if you need to go through a firewall. Plus, people need some way of seeing the results from the tasks – CCTray doesn’t display them. So to handle these situations there is the Web Dashboard. This functions as a client application and does all sorts of things – including manipulating build logs in memory (normally an XSLT transform).
And where does the Web Dashboard normally sit? On the same machine as the server (unless you’ve changed the default install and installed the dashboard on a separate machine).
So, this gives us a picture of having large strings in memory in several places. The project threads might be okay by themselves, but when you add the communications side, it is very easy to start consuming large amounts of memory with big build logs. Especially when there are a sizeable number of people trying to get results from a build (and what’s the first thing people do when they see a broken build – they go to the server to see what’s broken!)
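To get a feel for how quickly these copies add up, here is a back-of-envelope sketch. Every number in it is an assumption picked for illustration (log size, number of requests, and the guess that a building project or a dashboard request holds roughly two copies of the log); the only hard fact is that .NET strings are UTF-16, so each character costs two bytes.

```python
# Rough estimate of how much build-log text can be live in memory at once.
# All inputs are illustrative assumptions, not measurements of CruiseControl.NET.

def estimate_peak_bytes(log_chars, building_projects, server_requests,
                        dashboard_requests, bytes_per_char=2):
    """Return an approximate number of bytes held by build-log strings."""
    # .NET strings are UTF-16, so one char occupies 2 bytes.
    log_bytes = log_chars * bytes_per_char
    # Assumption: a building project holds its task results plus the merged log.
    building = building_projects * 2 * log_bytes
    # Each server-side request handler loads one full copy of the log.
    serving = server_requests * log_bytes
    # Assumption: a dashboard request holds the received log plus a parsed copy.
    dashboard = dashboard_requests * 2 * log_bytes
    return building + serving + dashboard


peak = estimate_peak_bytes(log_chars=50_000_000, building_projects=3,
                           server_requests=10, dashboard_requests=10)
print(f"{peak / 1024**2:.0f} MiB of log text")
```

The exact multipliers matter less than the shape of the formula: the request terms scale with the number of people looking at a broken build, which is exactly when the log tends to be biggest.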
Pruning Some Memory
Now that we’ve quickly reviewed the possible places where build logs are stored, what can we do to reduce the amount of memory?
Two initial approaches come to mind:
- Reduce the size of the log file
- Reduce the number of times each log file is loaded into memory
Option #1 can be done either in CruiseControl.NET or externally. There are three ways to go about it: first, we could split the mega-log into smaller logs; second, we could write less data; third, we could compress the data.
I’ve been looking at what I can do to split the mega-log file into log files specific to each build result, so a client could just grab the data they need (e.g. just the NAnt/NUnit/MSBuild/etc. results) instead of everything at once. Unfortunately this is a MASSIVE change that will need to modify both the server and the clients in a large number of places. So for the moment I have spun off a “play” branch in the Subversion repository where I will attempt to do this (oh, and resolve all the associated problems with having multiple result files). So this is a no-go for the current release.
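To make the idea of splitting concrete, here is a minimal sketch of what splitting could look like. It is not the code in the “play” branch and it is not the real CruiseControl.NET log schema: the `buildlog`, `nant` and `nunit` element names and the `split_log` function are made up for the example.

```python
import xml.etree.ElementTree as ET


def split_log(merged_xml: str) -> dict[str, str]:
    """Split a merged build log into one XML string per top-level section."""
    root = ET.fromstring(merged_xml)
    sections: dict[str, str] = {}
    for child in root:
        # Sections sharing a tag are concatenated under that tag.
        piece = ET.tostring(child, encoding="unicode")
        sections[child.tag] = sections.get(child.tag, "") + piece
    return sections


# Hypothetical merged log with one section per tool.
merged = "<buildlog><nant>compile ok</nant><nunit>2 tests failed</nunit></buildlog>"
parts = split_log(merged)
print(parts["nunit"])
```

Note that this sketch still parses the whole log once. The saving would only come from doing the split a single time when the build finishes, so that later requests load just the section they ask for.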
Next, writing less data – well CruiseControl.NET doesn’t put that much CruiseControl.NET-related data into the log files. Most of the data comes from the external tools. So if this needs to be done, then people need to reduce the amount of data the tools generate – hence this becomes an external task.
So, that leaves compressing the data. Personally I like this approach as it would also reduce the amount of network traffic. Unfortunately there are a couple of gotchas – there is no guarantee that compression would actually save space, and it requires changes to the clients to decompress the data. But I still think the idea has merit.
Returning to the two initial approaches, the other one is to reduce the number of times each log is loaded. This is the approach I want to investigate some more.
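Both sides of the compression trade-off are easy to demonstrate. The sketch below uses gzip from the Python standard library purely as a stand-in (I have not picked a compression scheme for CruiseControl.NET), and the log content is invented. Repetitive, XML-like text shrinks dramatically, while data that is already random or compressed does not shrink at all.

```python
import gzip
import os


def compress_log(text: str) -> bytes:
    """Compress a log for transfer; the client must call decompress_log."""
    return gzip.compress(text.encode("utf-8"))


def decompress_log(payload: bytes) -> str:
    """Client-side counterpart of compress_log."""
    return gzip.decompress(payload).decode("utf-8")


# Build logs tend to be repetitive, which compresses very well.
repetitive = '<message level="Info">Compiling...</message>\n' * 10_000
raw = repetitive.encode("utf-8")
packed = compress_log(repetitive)
print(len(raw), "->", len(packed))

# Random bytes stand in for data that is already compressed: no saving here.
noise = os.urandom(100_000)
packed_noise = gzip.compress(noise)
print(len(noise), "->", len(packed_noise))
```

The second case is the "no guarantee" gotcha, and `decompress_log` is the client-side change that every consumer of the log would need.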
Log Viewing Usage Patterns
From my experience with CruiseControl.NET, there are two general usage patterns:
- THE BUILD HAS BROKEN – what’s wrong?
- What happened to an older build?
Typically the second pattern also happens as part of pattern #1, when people try to find out what has changed to cause the build to break. Very rarely do people go and look at a historical build for its own sake.
The following picture sums up this pattern, with the size of the arrow indicating how much usage a build log would get:
So, most people only view the logs when something is broken and they primarily focus on the log of the build that actually failed. A smaller set of people would also check the previous successful build to see if there is anything pertinent to the failure. Finally a very small set of people might go through older logs to see what has happened (your manager is checking up on the amount of work you’ve been doing, etc.)
This sounds like a very good scenario for caching. Since there is a (potentially) small amount of data, this could be cached on the server, the dashboard or even both.
Bring on the Cache(s)
On the server side, we could just cache the log files – although we wouldn’t want to cache too many big ones.
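One way to avoid caching too many big logs is to bound the cache by total size rather than by number of entries, and to refuse to keep any single log over a threshold. The following is only a sketch of that idea under my own assumptions: the `LogCache` class, its limits and the `load_from_disk` loader are all hypothetical, and sizes are counted in characters for simplicity.

```python
from collections import OrderedDict


class LogCache:
    """Least-recently-used cache bounded by total characters, not entry count."""

    def __init__(self, loader, max_total_chars, max_entry_chars):
        self._loader = loader            # callable: key -> log text (hypothetical)
        self._max_total = max_total_chars
        self._max_entry = max_entry_chars
        self._entries = OrderedDict()    # oldest entry first
        self._total = 0

    def __contains__(self, key):
        return key in self._entries

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)   # mark as most recently used
            return self._entries[key]
        text = self._loader(key)
        if len(text) > self._max_entry:
            return text                      # too big: serve it but do not keep it
        self._entries[key] = text
        self._total += len(text)
        while self._total > self._max_total:
            _, evicted = self._entries.popitem(last=False)  # least recently used
            self._total -= len(evicted)
        return text


loads = []

def load_from_disk(key):
    """Stand-in loader that records which logs were actually read."""
    loads.append(key)
    return "x" * 400

cache = LogCache(load_from_disk, max_total_chars=1000, max_entry_chars=500)
for key in ["build-1", "build-2", "build-1", "build-3"]:
    cache.get(key)
print(loads)
```

In this run the second request for `build-1` is served from the cache, and adding `build-3` pushes out `build-2`, which is the least recently used entry. That matches the usage pattern above, where the latest broken build gets nearly all the traffic.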
The dashboard offers us a few different possibilities for caching. First, we could take the same approach as for the server and cache the log files. Second, we could cache the parsed XML documents (since the entire document must be loaded and parsed before it can be transformed). Finally, we could cache the transformed output.
However, like anything in the dashboard, the way the build logs are transformed is not a simple process. Instead we have a number of interfaces and their implementations to go through. The build report generation is a plug-in, which generates multiple actions. Each action can have one or more style sheets, which actually get loaded and processed by a class in the Core library.
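Ignoring the plug-in and action layers for a moment, the second and third dashboard options (caching the parsed document and caching the transformed output) can be sketched as two layered caches. Everything here is a stand-in: `FAKE_LOGS`, `fetch_log`, `parsed_log` and `rendered_report` are hypothetical names, and the "transform" just counts elements because the Python standard library has no XSLT engine.

```python
from functools import lru_cache
import xml.etree.ElementTree as ET

# Hypothetical log store standing in for a call to the server.
FAKE_LOGS = {
    "build-42": "<cruisecontrol><build error='true'/><test name='a'/></cruisecontrol>",
}
parse_count = 0


def fetch_log(build_id):
    """Stand-in for fetching the raw log text from the server."""
    return FAKE_LOGS[build_id]


@lru_cache(maxsize=8)
def parsed_log(build_id):
    """Cache the parsed document so each log is parsed at most once."""
    global parse_count
    parse_count += 1
    return ET.fromstring(fetch_log(build_id))


@lru_cache(maxsize=32)
def rendered_report(build_id, report_name):
    """Cache the transformed output per (build, report) pair."""
    root = parsed_log(build_id)
    # Stand-in for an XSLT transform: report how many elements the log has.
    element_count = sum(1 for _ in root.iter())
    return f"<p>{report_name}: {element_count} elements</p>"


print(rendered_report("build-42", "summary"))
print(rendered_report("build-42", "tests"))
print(rendered_report("build-42", "summary"))
print("parsed", parse_count, "time(s)")
```

Two different reports on the same build share one parse, and a repeated report skips the transform entirely. The cost is that the parsed documents and rendered pages now stay in memory, so the cache sizes would need the same care as the log sizes.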
To Be Continued…
I’ve gone back and reviewed the memory usage scenarios for CruiseControl.NET with a view to reducing the usage on the server. This meant taking a step back and looking at the wider picture: everything that CruiseControl.NET is doing on a (server) machine. Unlike my previous post, this takes into account multiple users doing similar things at the same time.
I’ve come up with a couple of ideas for reducing memory usage – caching and compression – and both of them have some gotchas that need to be investigated.
So rather than making this an even longer post, I’m going to stop at this point and look into caching and compression in some future posts.
So stay tuned, more fun to come.