A Multi-OS Challenge
CruiseControl.NET has gone through a few iterations of file transfer. Prior to version 1.4, the only files that could be transferred were the build log and other special files like the statistics or RSS feed. The Remote API had special methods for these files, and all they did was transfer the entire file across the network as a single string (serialised via .NET Remoting).
With CruiseControl.NET 1.4 we added the ability to transfer any file located in the artefacts folder of a project. This was a new method in the Remote API that allowed the remote client to ask for a file. If the file was there, it was transferred across the network. And this is where things got interesting!
Originally, this method generated a MarshalByRef-based object that would connect back to the server and transfer the data a block at a time. This ensured that only small blocks of data were transferred, thus reducing the memory usage on the server.
Unfortunately, Windows Server 2008 did not like this approach! IPSec blocked the object from connecting back to the server, so no data was transferred (yes, it worked on every other OS we tried; just this one stopped it). After several frustrating attempts to resolve the issue, we gave up and fell back to transferring the entire file as a single byte[] buffer over the network. Once again, memory usage went up.
So, for CruiseControl 2.0 we are back to the issue of how can we transfer large files across the network?
Let’s Chat
The problem with transferring a single large block of data across the network is it needs to be loaded into memory first, and then transferred across the network. Even .NET Remoting does not like transferring large blocks across the network, but it does it. Under the hood .NET Remoting takes the large block of data and breaks it into smaller blocks, which then get transferred across the network. This is what our original approach did, but IPSec did not like it!
So, how can we go back to the block-based approach and get past IPSec? The answer is to move from our nice chunky interface, where .NET Remoting handles the blocks, to a more chatty interface (yes, I know, normally not a good practice, but sometimes we don’t live in an ideal world!).
In the current situation the client makes a single call and .NET Remoting handles the blocking internally (the red arrow in the diagram). The new approach will look like this:
Yes, it will be a multi-step approach. The steps are:
- Open the file on the server
- Transfer each block of data
- Close the file on the server
This approach requires tighter coupling between the client and the server (sigh). When the client opens the file, a file identifier is returned. This file identifier is what the client then passes to the server for all subsequent operations.
So, why a file identifier? The identifier is used to identify which instance of the opened file is being used (the file is opened in read-mode so it can be opened multiple times). The server then maps the identifier to the stream instance (which is only held on the server). When each block is transferred, the stream is repositioned so the next block is ready. This means the client does not care where it is in the process; it only cares that there is more data to fetch.
The other downside of this approach is that the file now needs to be closed on the server; otherwise the number of open streams will slowly increase (i.e. a memory leak!). So closing the file cleans up after the transfer has completed.
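The open / transfer / close steps and the identifier-to-stream mapping can be sketched roughly as follows. This is only a minimal sketch under my own assumptions, not the actual CruiseControl.NET code: the class name FileTransferSessions, its method names, the use of a GUID string as the identifier and the configurable block size are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical server-side helper; all names are illustrative only.
public class FileTransferSessions
{
    private readonly object sync = new object();
    private readonly Dictionary<string, FileStream> openFiles =
        new Dictionary<string, FileStream>();
    private readonly int blockSize;

    public FileTransferSessions(int blockSize)
    {
        this.blockSize = blockSize;
    }

    // Number of streams currently held open (useful for spotting leaks).
    public int OpenCount
    {
        get { lock (sync) { return openFiles.Count; } }
    }

    // Step 1: open the file read-only and hand back an identifier.
    public string Open(string path)
    {
        // FileShare.Read lets several transfers read the same file at once.
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
        string id = Guid.NewGuid().ToString("N");
        lock (sync)
        {
            openFiles.Add(id, stream);
        }
        return id;
    }

    // Step 2: return the next block; an empty array means end of file.
    // Assumes a single client drives each identifier sequentially.
    public byte[] ReadNextBlock(string id)
    {
        FileStream stream;
        lock (sync)
        {
            if (!openFiles.TryGetValue(id, out stream))
                throw new ArgumentException("Unknown file identifier: " + id);
        }
        var buffer = new byte[blockSize];
        // Reading advances the stream position, so the next call carries on from here.
        int read = stream.Read(buffer, 0, buffer.Length);
        if (read < buffer.Length) Array.Resize(ref buffer, read);
        return buffer;
    }

    // Step 3: forget the identifier and dispose of the stream.
    public void Close(string id)
    {
        FileStream stream = null;
        lock (sync)
        {
            if (openFiles.TryGetValue(id, out stream)) openFiles.Remove(id);
        }
        if (stream != null) stream.Dispose();
    }
}
```

Because the stream keeps its own position, the client never sends an offset; it just keeps asking until it gets an empty block, and then calls Close so the entry is removed from the dictionary.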
The other approach I looked at was for the client to pass in the starting position. But in the end I decided against this approach for two reasons:
- It would require the client to know what data it wanted (i.e. the position within the file). If it requested a position beyond the file length, then the server would need some error handling to not fail.
- It would require opening the file, positioning to the correct location and closing the file for every data fetch! I’m not sure on the performance of this, but it sounds slow.
Of course, if we run into issues with my current approach we can switch to the alternate and test it out, but for now, it’s the approach we are using.
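For comparison, the rejected position-based alternative might look something like the sketch below. Again, this is an assumption-laden illustration rather than code from CruiseControl.NET: OffsetBasedReader and ReadBlock are hypothetical names, and returning an empty block for an out-of-range position is just one possible way to handle that case.

```csharp
using System;
using System.IO;

// Hypothetical stateless alternative: the client supplies the position each time.
public static class OffsetBasedReader
{
    public static byte[] ReadBlock(string path, long position, int blockSize)
    {
        if (position < 0) throw new ArgumentOutOfRangeException("position");

        // The file is opened, positioned and closed again on every single fetch.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            // A position at or past the end is treated as "no more data" rather than an error.
            if (position >= stream.Length) return new byte[0];

            stream.Seek(position, SeekOrigin.Begin);
            var buffer = new byte[blockSize];
            int read = stream.Read(buffer, 0, buffer.Length);
            if (read < buffer.Length) Array.Resize(ref buffer, read);
            return buffer;
        }
    }
}
```

The upside of this shape is that nothing is held open on the server between calls, so there is nothing to leak; the cost is the repeated open/seek/close and the extra bookkeeping on the client.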
Being Helpful
Now, rather than forcing the clients to implement their own versions of file transfer (and potentially duplicate their code), I’ve added a new method to CruiseServerClient in the Remote API. This method is called TransferFile() and it literally does that – it handles all the logic of transferring a file from the server into a local stream.
Using this method is as simple as the following code:
client.TransferFile(projectName, fileName, outputStream);
Where client is an instance of CruiseServerClient (this can be generated from a CruiseServerClientFactory).
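To give a feel for what a helper like TransferFile() has to do, here is a rough sketch of the client-side loop. It is not the real implementation: IFileTransferChannel, its three methods, TransferSketch and InMemoryChannel are all hypothetical names I've made up to stand in for the remote calls, and the convention that an empty block signals the end of the file is an assumption.

```csharp
using System;
using System.IO;

// Hypothetical view of the three remote calls; not the actual Remote API.
public interface IFileTransferChannel
{
    string OpenFile(string projectName, string fileName);
    byte[] ReadNextBlock(string fileId);   // empty array = no more data (assumed)
    void CloseFile(string fileId);
}

public static class TransferSketch
{
    public static void TransferFile(IFileTransferChannel channel,
        string projectName, string fileName, Stream outputStream)
    {
        string fileId = channel.OpenFile(projectName, fileName);
        try
        {
            byte[] block;
            // Keep fetching until the server reports there is nothing left.
            while ((block = channel.ReadNextBlock(fileId)).Length > 0)
            {
                outputStream.Write(block, 0, block.Length);
            }
        }
        finally
        {
            // Always close, even on failure, so the server does not leak the stream.
            channel.CloseFile(fileId);
        }
    }
}

// In-memory stand-in for the server, only for trying the loop out locally.
public class InMemoryChannel : IFileTransferChannel
{
    private readonly byte[] data;
    private readonly int blockSize;
    private int position;

    public bool Closed { get; private set; }

    public InMemoryChannel(byte[] data, int blockSize)
    {
        this.data = data;
        this.blockSize = blockSize;
    }

    public string OpenFile(string projectName, string fileName)
    {
        position = 0;
        return "file-1";
    }

    public byte[] ReadNextBlock(string fileId)
    {
        int count = Math.Min(blockSize, data.Length - position);
        var block = new byte[count];
        Array.Copy(data, position, block, 0, count);
        position += count;
        return block;
    }

    public void CloseFile(string fileId)
    {
        Closed = true;
    }
}
```

The important detail is the try/finally: whatever happens during the transfer, the close call still goes out, which is what keeps the number of open streams on the server from creeping up.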
I have gone through and removed the old file transfer mechanism (in both the web dashboard and CCTray) and changed them to use this method. This means all file transfers will now use the new approach and should use less memory. However, in a future post I’ll do some tests to see how much (if any) memory is being saved.
Coming Up
Now that we have a file transfer mechanism, the next step is to look at handling the new build log format. In my next post (or posts) I’ll look at the modifications to retrieve the build data and the changes to the dashboard.
Stay tuned…