Tuesday, January 6, 2009

Malevich, an introduction

For the last several weeks I have been working on a code review system. I got sucked into it gradually: similar software that I used at Google took Guido van Rossum - a much better developer than me - almost a year to implement as his starter project. You can read about it here: http://code.google.com/p/rietveld/.

So I didn't expect that I would be able to do it, at least not in the time left over after work.

But the team needed a code review system, and I was playing with ASP.NET. At some point I wondered how hard it really is to display a diff view of a file. Two hours later I had a working prototype. A localized success.

This got me thinking. Displaying the diff must be about the hardest part of the system, mustn't it? But what about entering the comments? It turned out that I had not completely forgotten JavaScript yet, and an hour or so later I was able to click on my ASP.NET-generated page to enter comments in a text box, and have them show permanently on the page after clicking a Submit button, or disappear on Cancel or Remove.



OK, so far so good. Now I needed:

  1. Free time.

  2. A database.

  3. A source control interface.



The first one was solved quickly - the team had finished our first milestone, a lot of people went on their Christmas vacations, and Seattle was thoroughly snowed in, all at the same time. So the evenings became much longer. There was, at the very minimum, an opportunity to take the project further.

One big problem is that I am a database noob. I had used SQL Server once before, and not in a production environment, but rather for another pet project of mine - a status reporting application for the team. I did learn how to create tables, stored procedures, and security objects back then, but I have no idea how good (or, rather, how bad) my SQL really is.

Luckily, I did discover the database extension for Visual Studio 2008 in my previous run-in with SQL Server. It is called Visual Studio 2008 Database Edition GDR (German Democratic Republic? I have no idea what GDR is), and you can download it from here: http://www.microsoft.com/downloads/details.aspx?FamilyID=bb3ad767-5f69-4db9-b1c9-8f55759846ed&displaylang=en.

The really nice feature of this GDR thing is that it makes updating the database really, really easy: you modify T-SQL that creates the database objects, and hit Deploy, after which it figures out what NEEDS to be changed, and changes the database accordingly, without destroying the database contents. It even does data conversions automagically!

The downside of editing database code inside GDR vs. SQL Server Management Studio is that IntelliSense does not work.

The database took some time - mostly T-SQL in the stored procedures. But eventually it worked, and I needed to start storing real data in it.

Most Microsoft teams use Source Depot - a Perforce derivative - as their source control system. I don't know if Perforce has an API today. Source Depot was split off from Perforce a while ago, and the API available for it is beyond horrific. Really, truly terrible. A command is issued by calling a function with the p4 command line, and you get back semi-parsed data. Except some data gets dumped into standard output anyway.

So I ended up just running the command-line client, capturing the output and parsing it.

By the way, before I forget, here's a way to capture a process's output in C# that actually works (simply reading the standard handles one after the other blocks forever if there is a lot of output on both the stdout and stderr channels):

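// Requires: using System; using System.Diagnostics; using System.Threading;
// The process must be started with UseShellExecute = false and with both
// RedirectStandardOutput and RedirectStandardError set to true.
// Delegate.BeginInvoke is supported on .NET Framework only.
private delegate string StringDelegate();

// Returns everything the process wrote to stdout; stderr goes to errorMessage.
// (eatFirstLine is not used in this excerpt.)
public static string ReadProcessOutput(Process proc,
    bool eatFirstLine, ref string errorMessage)
{
    // Read both streams concurrently so that neither pipe fills up and blocks.
    StringDelegate outputStreamAsyncReader =
        new StringDelegate(proc.StandardOutput.ReadToEnd);
    StringDelegate errorStreamAsyncReader =
        new StringDelegate(proc.StandardError.ReadToEnd);
    IAsyncResult outAsyncResult =
        outputStreamAsyncReader.BeginInvoke(null, null);
    IAsyncResult errAsyncResult =
        errorStreamAsyncReader.BeginInvoke(null, null);

    // WaitHandle.WaitAll does not work in STA.
    if (Thread.CurrentThread.GetApartmentState() ==
        ApartmentState.STA)
    {
        // Poll until both readers have finished.
        while (!(outAsyncResult.IsCompleted &&
            errAsyncResult.IsCompleted))
            Thread.Sleep(500);
    }
    else
    {
        WaitHandle[] handles = {
            outAsyncResult.AsyncWaitHandle,
            errAsyncResult.AsyncWaitHandle
        };
        if (!WaitHandle.WaitAll(handles))
        {
            Console.WriteLine("Execution aborted!");
            return null;
        }
    }

    // Collect the results of both asynchronous reads.
    string results = outputStreamAsyncReader.EndInvoke(
        outAsyncResult);
    errorMessage = errorStreamAsyncReader.EndInvoke(
        errAsyncResult);

    proc.WaitForExit();

    return results;
}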
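The function above relies on delegate BeginInvoke, which is only available on the .NET Framework. For what it's worth, here is a minimal sketch of the same idea (drain stdout and stderr at the same time so neither pipe can fill up) written with tasks instead. This is not the code Malevich uses: the class name ProcessOutput and the method name Run are made up for illustration, and it assumes .NET 4.5 or later, where StreamReader.ReadToEndAsync exists.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper, not part of Malevich.
public static class ProcessOutput
{
    // Runs a command, returns its stdout, and puts its stderr in errorMessage.
    public static string Run(string fileName, string arguments,
        out string errorMessage)
    {
        ProcessStartInfo startInfo = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,       // required for stream redirection
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (Process proc = Process.Start(startInfo))
        {
            // Start reading both streams before waiting for the process,
            // so a full stderr pipe cannot stall a stdout read (or vice versa).
            Task<string> stdoutTask = proc.StandardOutput.ReadToEndAsync();
            Task<string> stderrTask = proc.StandardError.ReadToEndAsync();

            proc.WaitForExit();
            Task.WaitAll(stdoutTask, stderrTask);

            errorMessage = stderrTask.Result;
            return stdoutTask.Result;
        }
    }
}
```

A call would look something like ProcessOutput.Run("p4", "changes -m 5", out err), with the returned string then parsed line by line; the command shown is only an example of the kind of thing one would run.
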
Anyway, dealing with the source control turned out to be the nastiest part of the whole project. But in the end, it finally worked, and I could parse the change descriptions, get file versions, and so on.


At this point, I was very, very close. Now I needed a name.

Google's code review system was called Mondrian - for Piet Mondrian, an early 20th-century Dutch painter (http://en.wikipedia.org/wiki/Piet_Mondrian). So Malevich (http://en.wikipedia.org/wiki/Kazimir_Malevich) - a Russian painter from approximately the same era - was a logical choice. It also had an added benefit: his most famous painting was the "Black Square". Here it is:

[Image: Kazimir Malevich, "Black Square"]

I love "Black Square" because, being really, really, really bad at anything that involves any sort of art, and doubly bad with painting, images, and graphical design (as you can see by looking at the screen shots), this is the one image that even I can reproduce. So if this software project ever needs an icon, or a logo, or any sort of graphical representation, I revel in knowing that I will be able to do it :-).

Anyway, after the name was picked, two other things remained - a program to upload files, and finishing the web site.

I've got to say - I LOVE ASP.NET. I have no idea how good it is in terms of performance, but man it is easy to develop on! Very logical API, easy to learn and use, and IntelliSense absolutely rocks! Without much previous experience, I was able to put out roughly 2000 lines of code that implemented the web site and the web service that talks to the JavaScript code in the browser in perhaps a couple of weekends.



So where are we now? The code is deployed, I demoed it to the team, and a few reviews have flowed through it. A couple of bugs were found and fixed within minutes, complete with redeployment of the app. I changed the database to accommodate - in the future - the addition of TFS. I've added Perforce support; this took less than one evening.



All in all, I've spent less than 80 hours working on it, which I think is not bad. I can say that Visual Studio, SQL, LINQ and ASP.NET mattered a lot - there's no way in hell I could have produced anything even close to it, even in twice the time, without these tools. Windows is not free, but man it makes developers productive :-).

I've uploaded Malevich to CodePlex (http://www.codeplex.com), but have not published the project yet - I think I will add TFS support before I do. The rule on CodePlex is that it must be published within a month, so by the 1st of February it has to be up, or it will get deleted. Check back before then for the link...

Update: It's been published now: http://www.codeplex.com/Malevich. And we have successfully used it for a month and a half now on my team for more than 200 reviews, 2000 files, and 3000 comments. Give it a whirl!

2 comments:

nathan3700 said...

Nice work. I like how you can view review comments in line with the code.

About the most sophisticated thing I do these days is bring up Xemacs on a projector and write comments in the code itself.

Julien Couvreur said...

What was Guido doing, spending a whole year on that stuff? ;-)