Automated Coder

Exploring the Code of CruiseControl.Net

Reducing Strings 2: Getting to the Root

Posted by Craig Sutherland on 15 October, 2009

Continuation

In my last post on this subject (read it here) I added the concept of a task context. This is a context that the task runs within and stores all the output from a task. The next step is to start writing to this context.

One of the core concepts of the context is that it generates streams that can be used by the task. The task is then responsible for managing the stream, but the context is responsible for managing the references to the stream. So, the trick is to generate the streams from the context, pass them through to where they are needed in the tasks, and then clean up when the task has finished with the streams.
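To make that split of responsibilities concrete, here is a minimal sketch in Python rather than the project's C#. It is not the CruiseControl.Net code: the class layout, the method names (create_result_stream, clean_up) and the file naming are all assumptions made for illustration.

```python
import os
import shutil
import tempfile


class TaskContext:
    """Hypothetical stand-in for the task context: hands out streams and tracks references."""

    def __init__(self):
        # Assumption: results live as files in a per-context temporary directory.
        self._directory = tempfile.mkdtemp(prefix="task-context-")
        self._references = []  # paths of every result file handed out so far

    def create_result_stream(self, task_name, task_type):
        # The context records the reference; the caller owns the stream and must close it.
        file_name = "{}-{}-{}.log".format(task_name, task_type, len(self._references))
        path = os.path.join(self._directory, file_name)
        self._references.append(path)
        return open(path, "wb")

    @property
    def references(self):
        return list(self._references)

    def clean_up(self):
        # Remove every result file once the task has finished with its streams.
        shutil.rmtree(self._directory, ignore_errors=True)
        self._references.clear()


context = TaskContext()

# The task manages the stream itself (here via a with-block that closes it).
with context.create_result_stream("build", "stdout") as stream:
    stream.write(b"compiling...\n")

# The context still knows where the output went.
with open(context.references[0], "rb") as saved:
    saved_bytes = saved.read()

context.clean_up()
```

The point of the sketch is only that the stream's lifetime belongs to the task while the list of references belongs to the context.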

And of course, that’s where it gets tricky!

The Current Situation

90% of the time, a task does not directly execute an external application. Instead it calls through a number of layers. This allows a number of cross-cutting functions to be built in, but at the same time it makes changes from strings to streams harder.

Here is the current way it works:

[Image: diagram of the current call chain]

The calling task calls the TryToRun() method on BaseExecutableTask, which passes it onto ProcessExecutor and finally RunnableProcess. RunnableProcess internally creates two StringBuilder instances – one for standard output (StdOut) and one for standard error (StdErr). RunnableProcess also provides all the functionality necessary for getting the data from StdOut and StdErr and putting it into these two instances.

When the Run() method on RunnableProcess has finished, it generates a ProcessResult and stores the strings from the two StringBuilder instances into there. From here on, the StdOut and StdErr are stored as strings in memory.

So, the issue now becomes one of where should the streams be initialised? How many streams should be generated? And what do we want to store for future usage?

The New Situation

The main change I am making is moving the instantiation of StdOut and StdErr from RunnableProcess to BaseExecutableTask. These will be instantiated as streams and passed through to RunnableProcess, where it is an easy enough change to write to the streams instead of a StringBuilder.

Additionally, I’m going to get BaseExecutableTask to generate two streams – one for StdOut and one for StdErr. These get passed down the chain. When the Execute() method on ProcessExecutor has completed, these two streams will be merged into one.
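As a rough illustration of the idea (in Python, not the actual C# classes), the caller can create the two streams and hand them to whatever launches the process, so the output never has to sit in memory as one string. The run_process function and the child script below are hypothetical; only subprocess.run and tempfile.TemporaryFile are real library calls.

```python
import subprocess
import sys
import tempfile


def run_process(arguments, stdout_stream, stderr_stream):
    """Run a child process, sending its output straight to the supplied streams."""
    completed = subprocess.run(arguments, stdout=stdout_stream, stderr=stderr_stream)
    # The exit code is still returned separately, much as a result object would carry it.
    return completed.returncode


# The caller owns both streams: it creates them, passes them down, and closes them.
with tempfile.TemporaryFile() as std_out, tempfile.TemporaryFile() as std_err:
    child_script = "import sys; print('built'); print('warning', file=sys.stderr); sys.exit(3)"
    exit_code = run_process([sys.executable, "-c", child_script], std_out, std_err)

    # Rewind before reading back what the child wrote.
    std_out.seek(0)
    std_err.seek(0)
    captured_out = std_out.read()
    captured_err = std_err.read()
```

Reading the streams back in full is only done here to show that the data arrived; the real benefit comes from processing them incrementally.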

So, this shows how I am changing things:

[Image: diagram of the new call chain]

On a side note, a ProcessResult will still be generated, as this contains additional information needed for the tasks (e.g. exit codes, time-out details, etc.).

Based on this plan, most of the work is in the BaseExecutableTask, with minor changes to the other two classes. Additionally, I’m going to do some work around the merging of StdErr and StdOut in TaskContext, as this class is responsible for managing the references.

The Actual Changes

Most of the changes are straightforward and reasonably boring (except when I make a mistake and have to debug it!).

RunnableProcess and ProcessExecutor were both modified to accept streams and use them (instead of the internal StringBuilders.) Reasonably simple change – just needed to remember to close the StreamWriters I was using.

BaseExecutableTask got a new overload of TryToRun() that uses the new streams functionality. This overload takes the task name and type (required for creating a result). Additionally, I added a couple of new protected virtual methods to allow people to override some of the functionality, namely creating the result stream and merging the results (more on this below).

Finally TaskContext got a new method – MergeResultStreams(). This will literally merge two or more streams into a single stream. It also manages the references – the old references are removed and a new reference is added for the merged file. The merging is handled by a delegate, so individual tasks can define how the results are merged. The default merge is a binary merge – copy all the bytes from each of the streams into a single stream.
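A small Python sketch of the same shape may help. It is an assumption-laden illustration, not the real MergeResultStreams(): the reference dictionary, the create_stream helper and the function signatures are invented here, and a plain function stands in for the delegate.

```python
import shutil
import tempfile


def binary_merge(sources, destination):
    """Default merge: copy every byte of each source, in order, into the destination."""
    for source in sources:
        shutil.copyfileobj(source, destination)


class TaskContext:
    """Hypothetical context that tracks named streams and can merge them."""

    def __init__(self):
        self.references = {}  # reference name -> file-backed stream

    def create_stream(self, name):
        stream = tempfile.TemporaryFile()
        self.references[name] = stream
        return stream

    def merge_result_streams(self, merged_name, source_names, merge=binary_merge):
        # Drop the old references and rewind the sources so they can be read.
        sources = [self.references.pop(name) for name in source_names]
        for source in sources:
            source.seek(0)

        merged = tempfile.TemporaryFile()
        merge(sources, merged)  # the merge strategy is pluggable, like a delegate

        for source in sources:
            source.close()
        self.references[merged_name] = merged  # one new reference replaces the old ones
        return merged


context = TaskContext()
context.create_stream("stdout").write(b"built\n")
context.create_stream("stderr").write(b"warning\n")

merged = context.merge_result_streams("result", ["stdout", "stderr"])
merged.seek(0)
merged_bytes = merged.read()
remaining_references = sorted(context.references)
merged.close()
```

Because the copy goes stream to stream in chunks, the merged output never has to exist as a single in-memory string.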

BaseExecutableTask defines a custom merge delegate. This merge delegate will merge all the results into an XML format – similar to how ExecutableTask does it currently. However this uses the streams to handle the merging and formatting, instead of directly manipulating strings in memory – so we shouldn’t have the out of memory exceptions :-)
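For a feel of what a streaming XML merge could look like, here is a Python sketch. The element and attribute names (results, line, source) are made up for this example and are not the format ExecutableTask produces; the function name xml_merge is likewise hypothetical.

```python
import io
from xml.sax.saxutils import escape


def xml_merge(labelled_sources, destination):
    """Write each source line as an XML element, one line at a time."""
    destination.write("<results>\n")
    for label, source in labelled_sources:
        # Iterating a stream yields one line at a time, so memory use stays small.
        for line in source:
            text = escape(line.rstrip("\r\n"))  # escape &, < and > for XML
            destination.write('  <line source="{}">{}</line>\n'.format(label, text))
    destination.write("</results>\n")


# In-memory text streams are used here only to keep the example self-contained.
std_out = io.StringIO("built\n")
std_err = io.StringIO("1 < 2 warning\n")
merged = io.StringIO()

xml_merge([("stdout", std_out), ("stderr", std_err)], merged)
merged_xml = merged.getvalue()
```

With file-backed streams in place of the StringIO objects, the same function would format output of any size without holding it all in memory.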

And that’s all there is to it, at least for this phase. At this point the new code breaks a number of other tasks: those that rely on the data being in a single massive string. Plus, there are a number of additional overloads that I have added temporarily to reduce the number of breaking changes.

So, What’s Next?

Looking over my previous what’s next list, I realise I’ve skipped over a couple of points, all of which are part of changing the tasks. I have modified Project to both generate the task context and associate it with tasks, but I think a bit more work is needed there. Additionally, I need to look at the container tasks (e.g. parallel task, sequential task, etc.) to see what needs to change to pass on child contexts.

So, in my next post I plan on covering the “plumbing” for the contexts, and then I’ll return to modifying tasks.

Stay tuned…
