Wednesday, November 28, 2007

My post about Gmail's new features on the official Gmail blog

You can read it here:

Group Chat is OUT!

**This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.**

My first independent project at Google - group chat in Gmail - is being deployed RIGHT NOW! It has already shown up on my private Gmail accounts, which means that it is no longer a secret and I can talk about it.

To use group chat, select it from the Options menu in the regular chat, add people, and BOOM! you are in your own chat room with the people you have invited! It's that easy.

As an added bonus, you get more advanced support for emoticons - my intern's project.

Gmail rocks!

Monday, November 26, 2007

High-performance JavaScript

An interesting presentation on performance in AJAX apps can be found here:

I do not agree with everything (e.g., the author argues against data encapsulation) - while most or all of his suggestions are beneficial in terms of pure perf, taken as a set they can lead to code that is very hard to maintain. You have to balance high performance against extensibility and the cost of maintenance. If not for these trade-offs, managed languages would never have existed :-).

But overall, extremely useful overview.

Other considerations about performance/JS development that I learned in Gmail:
- use regular expressions whenever possible - they are very fast in all browsers
- on IE6, 'a' + 'b' + 'c' + 'd' is excruciatingly slow, and IE7 is better but still not ideal. ['a', 'b', 'c', 'd'].join('') is much faster - create a Java-like StringBuilder object for string accumulation.
- have a library that abstracts functionality that browser runtimes implement differently, and compensate for the differences in that single layer, as opposed to throughout the whole app
- have a preprocessor that minimizes function names, inlines accessors, etc. - a smaller code base really does make a huge difference.
- use standard programming techniques with JavaScript. Yes, it's a scripting language, but you will not be able to create a big AJAX app if the code is not readable and well abstracted, with an extensible architecture. Do not cut corners just because it is called script.

Sunday, November 25, 2007

The evil of lowered expectations

"Do not be evil" is Google's motto. I am not quite qualified to analyze Google's culture in depth with respect to this credo because I have only worked here for 6 months and did not have wide exposure to a lot of aspects of corporate life. All this is yet to come.

There is one evil, however, that I have observed aplenty both at my previous workplace and in everyday life, and which I do not see much around here - artificially lowered expectations of the customers.

My last job at MSFT was running the dev team for Windows Home Server. I was the very first dev and then the dev manager, so I was lucky to participate in the product life cycle starting with it being a mere concept and to the point where it was ready to ship. And that meant customer research, designing (and redesigning, and redesigning) the features, and cutting, cutting, cutting.

One of the things I liked least about my job was when a feature was cut because "the users wouldn't understand it". Sometimes this just meant a dumbed-down UI with less information given to the user, but quite a few useful features did not make it into the product because they were deemed too hard for the customers to grasp.

Truth be told, it is extremely hard to say whether this is true for the vast majority of features. First, it is a tautology that for every feature there exists a non-empty set of customers who will find the feature very hard to understand. Second, defining a customer (and his or her capabilities) is a deeply subjective process.

For example, in the case of Windows Home Server, approximately 12 months into the process we realized that we did not know whether our customers understood the concept of a share. (Just imagine designing a FILE SERVER for people who do not know what a FILE SHARE is!)

Also, the argument against advanced features can seem plausible on the surface because it is easily commingled with the argument against feature bloat - how many MS Word features did you ever use?

Why should a software project shoot for a power user at all? There are several reasons.

First, the people who build the product are usually power users of the product themselves. Having the team LOVE the product is one of the easiest ways to maximize its productivity, and the difference between productive and unproductive teams in software engineering is huge - often more than an order of magnitude (at least binary :-)).

Second - and this is super important for startups - the early adopters, the users who pay a premium for an early version both in money spent and bugs found, and who move the product from obscurity into the mainstream, are almost always the power users. Without them, the product is doomed to failure.

Finally, the product team's knowledge of the customer is usually flawed. Think how much easier it would have been for the Windows Home Server team to just assume that the customers do know what file sharing is. We could have skipped the unproductive time spent inventing newer ("easier") primitives, all of which failed anyway. Because when the product finally shipped - after much blood had been spilled - it shipped to people who do understand file sharing. Moreover, judging from all the feedback I have read, it was adopted primarily by people who know not just how to share files, but how to program :-).

So you underestimate the customer at your own peril - you alienate key constituents, expend unnecessary effort, and most likely come out wrong anyway.

Avoiding the dumbing down of a product is easier (and, I claim, the result is vastly more successful) when the product team is designing the product for themselves, and when there is an effective firewall between the people who want to sell as many copies as possible (and so have an incentive to dumb the product down to make it accessible to the broadest audience) and the people who are charged with the actual work.

If you are tempted to scoff now and say "yes, then everything will be designed for geeks and normal people will never be able to use anything", or "OK, but how do we make money?", I've got a few examples for you.

Think about Gmail. Look up the key accelerators for navigating your mail - they come directly from vi. Clearly, this product was designed for geeks, and by geeks.

Compare the Google and the original MSN search pages. The Google start page was extremely small and clean. It was designed by people who cared mostly about functionality. The MSN search page looked like a tabloid. It took a long time to load (forget about using it on a mobile device or over a 56K modem connection), and it was trying to sell you printer ink and teach you how to please your boyfriend. It was designed by marketing people for a soccer mom.

The problem: soccer moms (at least at the time) did not shop on the internet. The ones that did use the internet used it primarily over narrowband connections. Microsoft doesn't know any more about soccer moms than Google does. They were building a search site for a fictitious, dumbed-down audience, with the primary goal of extracting as much revenue as possible and a secondary (distant) goal of producing (some) service. Google engineers were building a service they could use themselves, which could also make money.

Consider Halo vs. Gears of War. The Gears of War protagonist is basically a thug - lots of muscles, lots of tattoos, foul language, etc. Gears of War was designed with a certain vision of a hardcore gamer in mind - basically, a dumb thug. The reality is, while some dumb thugs are probably playing computer games, for the most part they are living those lives out in the ghettos of LA. The actual hardcore gaming audience sympathizes with the classy Master Chief of Halo - which has been proven by sales numbers time and again. In that, he is probably close to the engineering team that created the game.

If you are tempted to think that Microsoft has simply become too big and too dumb, and so treats the customers as an image of itself (or, if you are a crazy Linux fanatic, that it is evil and is simply working on making more dumb people who can only use Windows), there are plenty of examples outside Microsoft. In fact, they are the norm in American society.

Why do spaceships in the movies make a sound as they move past you? Sound does not propagate in vacuum, so in reality spaceships are perfectly silent. The sound is not required for artistic expression - a huge imperial battleship moving past you in complete silence is much more formidable (and believable) than one making funny noises. I can, however, easily visualize a marketing exec previewing the movie and saying: hey, why is it silent? Customers won't get it. Can we add a whooshing sound?

I am betting that it was for the same reason (and through the same process) that the American edition of Harry Potter was adapted for the US reader. Most of the changes are small (such as replacing "food trolley" with "food cart"), but the most egregious is the renaming of the first book from the Philosopher's Stone to the Sorcerer's Stone. The philosopher's stone (lapis philosophorum) is a term that dates back (at least) to the alchemical exploits of the Middle Ages. The Sorcerer's Stone is something designed by Scholastic's (!) marketing department, which thought that American readers would not associate the word "philosopher" with a magical theme.

Did you know that J. K. Rowling publishes under J. K. rather than Joanne (she has no middle name) because her publisher (Bloomsbury, UK - in case you're tempted to blame the US for everything) felt that readers would not be interested in a book written by a woman?

- Let the team build the product for themselves.
- Do not think that your customer is dumber than you are.
- People care about functionality at least as much as they care about appearance.
- Erect a firewall between the people who finance the project (and care about nothing but short-term return) and the people who build it.

Wednesday, November 21, 2007

Guerilla guide to native memory management

DISCLAIMER: The vast majority of this article is about fighting the heap fragmentation issues that plague native environments. Managed code that uses generational (or generational-like) garbage collectors is not as susceptible to heap fragmentation, and a lot of what is described here does not apply.

Once upon a time I was working on (initially, just managing) a project that, to protect the guilty, shall remain nameless. The project was a very graphics-intensive application running on Windows CE. It was a port of an identical feature in regular Windows. Very soon after RTM (release to manufacturing), a huge customer came back and told us that there was no way they were going to upgrade to the new version of the product - it was at least 3 times slower than the previous version in their tests.

As a side note, there is an important lesson here - one needs performance testing as part of the release criteria. But that was aeons ago, in a relatively small team, and our processes were not nearly as sophisticated as they should have been.

Anyway, we did run their test, and - lo and behold - version 2 took 3 minutes to complete, and version 3 took over 10 minutes. WTF?

Between v2 and v3 we had ported the next version of the app from Windows NT. Both versions of the product ran just fine in that environment. After poking around, we found that graphics calls - specifically, the creation of graphics primitives - were becoming extremely slow after a while, and the test created a lot of them. That was actually the main difference between v2 and v3: v2 reused the objects (brushes, pens, DCs, fonts, and the other things you use in normal Windows graphics), whereas v3 was recreating them.

OK, it was time to look at the OS code. Here's what I found - most of the graphics primitives were implemented as very small (16-24 byte) structures. For example, a brush might hold a color, an index to a pattern, and a size. That takes ~12 bytes. The objects were allocated from the regular heap of the video subsystem using LocalAlloc.

If you allocate and free a lot of small objects in random succession, the heap experiences a phenomenon called "heap fragmentation". This means that the heap contains a lot of small objects interspersed with a lot of small empty spaces. Naive memory allocation algorithms (like the one we had in Windows CE) take time that is roughly linear in the number of objects, because they look through a linked list of blocks to find an empty block of sufficient size. If you have 10,000 objects of 12-28 bytes, and between them another 10,000 empty spaces of 12-28 bytes, it takes a while to find a place (at the end of this list) for a 512-byte block.
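To make that linear search concrete, here is a toy first-fit allocator over a fixed buffer - entirely my own illustration, not the actual Windows CE allocator. It returns the number of blocks visited per request, which is exactly the cost that grows as small blocks and holes accumulate:

```c
#include <assert.h>
#include <stddef.h>

// Toy first-fit allocator over a fixed buffer. Every allocation
// walks the block list from the start, so the more blocks (used
// or free) accumulate, the slower each call gets. Request sizes
// are assumed to be multiples of sizeof(void *) to keep the
// headers aligned.
#define HEAP_SIZE 4096

typedef struct Block {
    size_t size;        // payload size in bytes
    int free;
    struct Block *next;
} Block;

static unsigned char heap[HEAP_SIZE];
static Block *head;

static void heap_init(void) {
    head = (Block *)heap;
    head->size = HEAP_SIZE - sizeof(Block);
    head->free = 1;
    head->next = NULL;
}

// Returns the number of blocks visited, to expose the linear cost.
static int toy_alloc(size_t size, void **out) {
    int visited = 0;
    for (Block *b = head; b; b = b->next) {
        ++visited;
        if (b->free && b->size >= size) {
            // Split off the remainder if there is room for a header.
            if (b->size >= size + sizeof(Block) + sizeof(void *)) {
                Block *rest = (Block *)((unsigned char *)(b + 1) + size);
                rest->size = b->size - size - sizeof(Block);
                rest->free = 1;
                rest->next = b->next;
                b->next = rest;
                b->size = size;
            }
            b->free = 0;
            *out = b + 1;
            return visited;
        }
    }
    *out = NULL;  // out of memory
    return visited;
}
```

Each new small allocation has to walk past every block already handed out; fragment the heap with thousands of small blocks and holes, and this search dominates everything else.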

When I replaced LocalAlloc with a fixed-block allocation algorithm (see below), the test went back to 3 minutes, erasing ALL of the performance difference between the two versions.

Before we start on techniques that can be used to avoid or minimize heap fragmentation, it is important to list the conditions under which the heap fragmentation occurs.

Specifically, heap fragmentation matters if and only if a program allocates and frees a lot of small objects AND stays resident in memory for a long period of time.

Let me give an example where heap fragmentation does NOT matter. Suppose you are converting a document from one format to another. This involves parsing the source format, and then generating the output. Let's suppose that you have to parse the entire document before starting the conversion because there are certain globally-computed parameters in the output format that are only known based on the entire document. Let's also assume that the task runs as a single process that exits when the job is done.

In this example we are clearly allocating a lot of small memory blocks. However, (1) we do not free them, so they are all allocated contiguously in the heap without creating holes, and (2) the program does not stay resident for long. So allocations take as little time as we can make them anyway, and heap fragmentation does not occur.

Now suppose the same task runs as a web service, processing a lot of simultaneous requests from different users. Now one thread can be allocating at the same time as another thread is freeing, and the heap can survive for a long, long time. At a minimum, we have to consider heap fragmentation as one of the possible performance bottlenecks.

So how does one fight heap fragmentation? By minimizing or containing small allocations. Large allocations are less of a problem because, when freed, they leave holes in the heap big enough to accommodate a variety of subsequent allocations.

Containing allocations

Containing allocations means replacing many small allocations with fewer large pools, and managing the small objects out of these pools. The way the pools are constructed depends on the specifics of the task.

Fixed block allocator

One technique for containment is called a fixed heap, or a fixed-size allocation pool. This technique is widely used by compilers and OS kernels. Here, if you expect to need a lot of small objects of equal size, you pre-allocate a number of them (a pool), dispense the objects one by one, and allocate another pool as you run out. The code looks roughly as follows:

// Free code! Free for all! Public domain!
// No need for attribution!
// That is - if you debug it. I have not.
// I just typed it right into the blog.
#include <stddef.h>  // offsetof
#include <stdlib.h>  // malloc, free

struct Pool {
  Pool *next;
  void *free_objects;
  int used_objects;
  union {
    unsigned char memory[0];
    void *__enforce_ptr_alignment;
  };
};

struct PoolDescriptor {
  Pool *pools;
  size_t object_size;
  size_t alloc_step;
};

void *GetPool(size_t size, size_t alloc_step) {
  PoolDescriptor *descr = (PoolDescriptor *)malloc(sizeof(PoolDescriptor));
  if (!descr)
    return NULL;

  descr->pools = NULL;

  // ensure pointer-sized alignment since we will be
  // cross-linking the empty blocks
  descr->object_size = (size + sizeof(void *) - 1) &
      ~(sizeof(void *) - 1);
  descr->alloc_step = alloc_step;
  return descr;
}

void *PoolAlloc(void *pool_descr) {
  PoolDescriptor *descr = (PoolDescriptor *)pool_descr;
  Pool *pool = descr->pools;
  while (pool && !pool->free_objects)
    pool = pool->next;

  if (!pool) {
    pool = (Pool *)malloc(offsetof(Pool, memory) +
        descr->object_size * descr->alloc_step);

    if (!pool)
      return NULL;

    pool->next = descr->pools;
    descr->pools = pool;
    pool->used_objects = 0;

    // Populate the free list. Initial allocation
    // is done in order to mimic the cache locality
    // of the regular heap.
    pool->free_objects = (void *)pool->memory;
    unsigned char *ptr = pool->memory;
    for (size_t i = 0; i < descr->alloc_step - 1; ++i) {
      unsigned char *next = ptr + descr->object_size;
      *(unsigned char **)ptr = next;
      ptr = next;
    }
    *(void **)ptr = NULL;
  }

  // pop one from the free list
  void *result = pool->free_objects;
  pool->free_objects = *(void **)pool->free_objects;
  ++pool->used_objects;
  return result;
}

static int PoolContains(Pool *pool, void *block,
    size_t size) {
  return (void *)pool->memory <= block
      && (void *)(pool->memory + size) > block;
}

static int NumFreePools(PoolDescriptor *descr) {
  Pool *pool = descr->pools;
  int count = 0;
  while (pool) {
    if (pool->used_objects == 0)
      ++count;
    pool = pool->next;
  }
  return count;
}

void PoolFree(void *pool_descr, void *block) {
  PoolDescriptor *descr = (PoolDescriptor *)pool_descr;
  size_t size = descr->alloc_step * descr->object_size;
  Pool *pool = descr->pools;
  while (pool && !PoolContains(pool, block, size))
    pool = pool->next;
  if (!pool)
    return;

  *(void **)block = pool->free_objects;
  pool->free_objects = block;
  --pool->used_objects;

  // Unlink (and free) one pool if more than one becomes free
  if (pool->used_objects == 0 &&
      NumFreePools(descr) > 1) {
    Pool *runner = descr->pools;
    Pool *parent = NULL;
    while (runner && runner->used_objects != 0) {
      parent = runner;
      runner = runner->next;
    }
    if (parent)
      parent->next = runner->next;
    else
      descr->pools = runner->next;
    free(runner);
  }
}

void DeletePool(void *pool_descr) {
  // ... left as an exercise for the reader
}

As you can see, allocations from the fixed heap are almost always O(1), and frees are O(N). (It is actually easy to make frees O(1) as well if you do not need to collect free pools: in that case, instead of keeping a per-pool free list, keep a global one in the PoolDescriptor.)

Arena allocator

Another widely used technique is called an 'arena allocator'. Arenas are used whenever you expect to make a lot of small allocations of different sizes and to throw them all away as a block. In the document-conversion web service example above, one conversion requires many allocations, none of which need to be freed until the processing of a document is completely done. The same is true in many other cases - rendering an HTML page, printing a document, rendering a map, etc.

Arena allocators allocate a big block of memory and then give away parts of this block as requested. Because there is no freeing of individual objects, an arena allocator does not need to keep around any data about allocated objects - only the current pointer and the size of the remainder of the allocated block (it still needs to heed the alignment requirements of the blocks it hands out, i.e. memory that contains doubles needs to be aligned on 8-byte boundaries, etc.). This makes arenas extremely space-efficient.
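That alignment bookkeeping boils down to the standard align-up bit trick, which also shows up in the allocator code in this article: adding align - 1 pushes n past the next boundary unless n is already aligned, and masking off the low bits rounds back down to the boundary. Sketched on its own:

```c
#include <assert.h>
#include <stddef.h>

// Round n up to the next multiple of align, where align must be a
// power of two. The same expression appears inline wherever the
// allocators in this article compute aligned sizes and offsets.
static size_t align_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}
```

For example, align_up(13, 8) is 16, while an already-aligned align_up(16, 8) stays 16.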

The code for an arena allocator might look like this example.

// Free code! Free for all! Public domain!
// No need for attribution!
// That is - if you debug it. I have not.
// I just typed it right into the blog.

#include <assert.h>
#include <stddef.h>  // offsetof
#include <stdlib.h>  // malloc, free

struct Arena {
  Arena *next;
  size_t bytes_allocated;
  size_t bytes_used;
  char memory[0];
};

struct ArenaDescriptor {
  Arena *arenas;
  size_t alloc_step;
};

// alloc_step should be big, e.g. a MB
void *GetArena(size_t alloc_step) {
  ArenaDescriptor *descr = (ArenaDescriptor *)
      malloc(sizeof(ArenaDescriptor));
  if (!descr)
    return NULL;
  descr->alloc_step = alloc_step;
  descr->arenas = NULL;
  return descr;
}

void *ArenaAlloc(void *arena_descr, size_t size,
    int align) {
  // align must be a power of 2
  assert(align && (align & (align - 1)) == 0);
  ArenaDescriptor *descr = (ArenaDescriptor *)arena_descr;
  Arena *a = descr->arenas;
  while (a) {
    if (((a->bytes_used + align - 1) & ~(align - 1)) +
        size <= a->bytes_allocated)
      break;  // found an arena with enough room
    a = a->next;
  }

  if (!a) {
    size_t bytes = size + align - 1;
    if (bytes < descr->alloc_step)
      bytes = descr->alloc_step;
    a = (Arena *)malloc(offsetof(Arena, memory) + bytes);
    if (!a)
      return NULL;
    a->next = descr->arenas;
    descr->arenas = a;
    a->bytes_allocated = bytes;
    a->bytes_used = 0;
  }

  size_t offset = (a->bytes_used + align - 1)
      & ~(align - 1);
  a->bytes_used = offset + size;
  return a->memory + offset;
}

void DeleteArena(void *arena_descr) {
  // ...left as an exercise for the reader
}

Sounds way too complicated? Well, efficiency is not free :-). However, there is a simpler way if you are prepared to pay a slightly higher per-object cost in memory.

Starting with Windows XP, NT implements a so-called "low-fragmentation heap", which behaves like a block heap. See the description here: LFH behaves in a way that is similar to a fixed-block allocator; it just spends more time (and memory) managing the objects, because an object of any size can be allocated out of it and the heap still has to service the request. With the fixed allocator, we avoid this complexity by restricting the problem space.

Also, for a poor man's arena implementation, consider constructing a new heap (HeapCreate on NT) whenever you need an arena. It is not as memory-efficient, because the heap still keeps around all the data necessary to delete individual blocks; but as long as you never delete them, there is no fragmentation, and at the end of your processing you can throw away the entire heap as one block.

Minimizing allocations

Using programming practices that minimize the number of small allocations can go a long way towards a healthy heap even without the help of advanced allocators. Here are several tips:

(1) Try to embed the data in your structures instead of using pointers.
For example, this:

struct A {
  char name[256];
};

is preferable to

struct B {
  char *name;
};

Yes, if the string is small, the second version can save a few bytes. Remember, however, that allocating from the heap costs 8-16 bytes just in alignment losses, because heap-allocated objects must be aligned for the most restrictive data type on your computer - the heap manager does not know what you are going to store there and whether it requires natural alignment. There are another 8-16 bytes for the heap header. Plus you store a pointer in your structure (another 4-8 bytes). So the savings are actually smaller than they might appear.

This is actual code that I have seen in a commercial product, and it should never, ever have been written:

struct A {
  int num_elements;
  int *array;
};

where array was always malloc'ed to contain either 1 or 2 integers. This is what it should have been - always cheaper in every respect (time and memory) than the code above:

struct A {
  int num_elements;
  int array[2];
};
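To put rough numbers on the savings, here is a back-of-the-envelope comparison of the two layouts. The 16-byte per-allocation heap overhead is my assumption for illustration, not a measured figure:

```c
#include <assert.h>
#include <stddef.h>

// Compares the total memory cost of the pointer-based layout
// (the struct plus a separate heap allocation) against the
// embedded-array layout (just the struct).
struct WithPointer { int num_elements; int *array; };
struct Embedded    { int num_elements; int array[2]; };

enum { kAssumedHeapOverhead = 16 };  // assumed header + alignment loss

static size_t pointer_layout_cost(void) {
    return sizeof(struct WithPointer)
         + 2 * sizeof(int)            // the separately malloc'ed array
         + kAssumedHeapOverhead;      // paid on every allocation
}

static size_t embedded_layout_cost(void) {
    return sizeof(struct Embedded);   // no extra allocation at all
}
```

On a typical 64-bit target the pointer version costs several times more per object, and it adds the extra small allocation that fragments the heap in the first place.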

(2) Use stack allocation whenever possible. Stack allocation is very quick, since it is handled by the compiler, and is cleaned up automatically once the function exits.

For example, this:

#define ARRLEN(c) (sizeof(c)/sizeof(c[0]))

void func(WCHAR *dir) {
  WCHAR mask[_MAX_PATH];
  StringCchCopyW(mask, ARRLEN(mask), dir);
  StringCchCatW(mask, ARRLEN(mask), L"\\*.txt");
  // ... use mask ...
}

is always better than this:

void func(WCHAR *dir) {
  size_t cch = wcslen(dir) + 7;  // "\\*.txt" plus the terminator
  WCHAR *mask = (WCHAR *)malloc(sizeof(WCHAR) * cch);
  StringCchCopyW(mask, cch, dir);
  StringCchCatW(mask, cch, L"\\*.txt");
  // ... use mask, then free it ...
  free(mask);
}

Yes, it takes more space, but it's temporary space on the stack, which has no implications for the long-term run time of the program.

Sometimes it is hard to predict the maximum size of the buffer, but possible to predict it for the vast majority of cases. Consider the following rewrite of the function above, which is no longer limited to _MAX_PATH characters in the file name (so it can support long file names on NTFS):

void func(WCHAR *dir) {
  WCHAR mask_buffer[_MAX_PATH]; // 99% of our cases
  WCHAR *mask = mask_buffer;
  size_t cchNeeded = wcslen(dir) + 7;
  if (cchNeeded > ARRLEN(mask_buffer))
    mask = (WCHAR *)malloc(cchNeeded * sizeof(WCHAR));
  if (!mask)
    return;
  StringCchCopyW(mask, cchNeeded, dir);
  StringCchCatW(mask, cchNeeded, L"\\*.txt");
  // ... use mask ...
  if (mask != mask_buffer)
    free(mask);
}
This uses stack space most of the time, except in really exceptional cases.

(3) Finally, do not be afraid to trade slightly higher memory consumption for less fragmentation. For example, allocate bigger string buffers so that you are less likely to have to reallocate them multiple times if user input requires it.

Other considerations

If you've read this far, you may be wondering why on Earth I omitted the most frequently used way to "optimize" allocation speed - free lists. For those who do not know, a free list is a technique whereby malloc and free are overridden in such a way that free does not actually free (up to a point), but instead puts memory blocks on a list of free blocks. The malloc then returns the closest bigger match to the requested size from that list.
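Here is a minimal sketch of that technique for a single block size (the names are mine, purely for illustration):

```c
#include <assert.h>
#include <stdlib.h>

// Minimal free-list sketch for one block size. Freed blocks go on a
// list instead of back to the heap; the next allocation reuses the
// head of that list and only falls through to malloc when the list
// is empty.
typedef struct FreeNode {
    struct FreeNode *next;
} FreeNode;

static FreeNode *free_list = NULL;
#define BLOCK_SIZE 64  /* must be >= sizeof(FreeNode) */

void *fl_alloc(void) {
    if (free_list) {
        void *block = free_list;
        free_list = free_list->next;
        return block;           // reuse a previously freed block
    }
    return malloc(BLOCK_SIZE);  // fall through to the real heap
}

void fl_free(void *block) {
    FreeNode *node = (FreeNode *)block;
    node->next = free_list;
    free_list = node;           // never actually returned to the heap
}
```

Freeing a block and immediately allocating again hands back the very same pointer - which is exactly why it is fast, and also why it only papers over the underlying heap behavior.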

One reason is that the simplest implementations of the free list are quite inefficient. For example, a block on a free list can be much larger than the block that is requested, leading to quite a bit of wasted memory. This can, of course, be overcome by a smart heuristic.

But the main reason is that a free list does not really help heap fragmentation in cases where there are a lot of small allocations. When the free list runs out of objects, they are allocated from the heap (and when the free list gets big, they are freed to the heap), leading to the same set of problems that we set out to solve at the beginning of this article.

Tuesday, November 20, 2007

What If Gmail Had Been Designed by Microsoft?

Another jab at Microsoft UI, this time applied to Gmail

If you remember the original Microsoft Designs the iPod package, this one is not dissimilar, but equally hilarious.

The core lesson here: once engineering (AKA the "reality-based community") loses control, appearance starts to dominate and substance diminishes.

Thursday, November 15, 2007

Look, it's possible to make a new version of an OS...

...that is not slower than the previous version on the same h/w.

Unfortunately, it's a Mac :-(.

Monday, November 12, 2007

If Microsoft wants to stay competitive in the browser wars, it needs its own version of Firebug.

There are no two ways of saying it - IE is just not friendly to AJAX developers.

Just take a very small thing - runtime error pop-ups. Because I write web applications and need to debug them in browsers, I have JavaScript error checking enabled, both in IE and in Firefox.

So when I go to the New York Times, the very first thing I see in IE is this:

So I have to click on the dialog box to dismiss it. Then it comes up again, so I have to click again. Then and only then does the front page show.

When I use Firefox (on which I have Firebug installed), instead of a big ugly pop-up I see a small, unobtrusive indication of an error on the page in the lower canvas. If I double-click there, I get an explanation of the error, a JavaScript stack trace, and an opportunity to set a breakpoint.

The list goes on, and on, and on - have you figured out how to set a breakpoint in JavaScript from Microsoft Script Editor or Visual Studio 2005 yet? I have not. Whereas in Firefox with Firebug, the whole debugging experience is integrated in the browser almost as well as debugging is integrated in a modern IDE.

This is the reason most developers I know use Firefox to develop their apps, and then port them to IE (for a bunch of them, that IE is running in a virtual machine on a Mac - see my post about Vista).

Microsoft is probably one of the companies best positioned to appreciate what the allegiance of developers (or the lack thereof) is worth. And it is losing it fast in the AJAX space, because of the terrible tools.

Vista free...

Generally, I am very passionate about running the latest and greatest - I am a very conservative dogfooder, but once the software ships, I feel really bad if I am not running the latest version. I installed every version of NT from 3.1 to Server 2003 inclusive within a day or two of RTM. The same was true for Office and Visual Studio (which I faithfully upgraded even during the time it was shipping quarterly). Vista was the first exception to this rule...

When I left Microsoft, I felt compelled to spend my balance in the company store and bought a bunch of copies of Vista Ultimate - 7 or 8 boxes. I have tons of computers, most of them less than a year old, so I started installing it: on my daughter's PC, my wife's new ThinkPad T60p, my game box (where Vista was required because Halo 2 for PC will not run on anything else), and 2 of my Media Center machines.

Then, as time went on, I started gradually getting rid of Vista and replacing it with XP.

The Media Center machines were the first to go back to the XP lineage.

As it turned out, Vista wouldn't play a bunch of video clips that I already had (and Media Player and Media Center played different subsets, too). (Yes, I know how to install codecs. It's the file format that is the problem. Who tested this?)

Nero 6 did not work at all, so I'd have had to buy an upgrade, and UareU Workstation (fingerprint recognition s/w which I use on my media centers to avoid fumbling with the keyboard when you want to watch a DVD) did not work, and there was no upgrade. The same was true for my video capture device and the software that came with it.

Finally, I have my entire DVD collection ripped to a RAID array on my server. On Media Center 2004-2005, all I had to do was place a shortcut to the server share in the My Videos folder, and bingo! - the share would show up under Videos in the MCE shell, I could navigate the file structure, and it recognized the folders that contained VIDEO_TS folders and would play the DVD when such a folder was selected.

On Vista, this does not work. Instead, there's a concept of a DVD Gallery (which one has to enable in the registry), and then it shows all your DVDs in one list, sorted alphabetically. All 200-300 of them. In one big list. How stupid is this?

Then my 8-year-old daughter started complaining about the same basic things - that it's impossible to watch DVDs, that the fingerprint sensor does not work anymore, that it hangs all the time (XP slept fine on her machine, but Vista crashed every time the PC was supposed to go to sleep), and that printing from IE does not produce WYSIWYG results.

I spent hours on that last one, by the way, trying to figure out what I could possibly have done wrong with the printer driver - web pages printed from the admin account looked fine, but pages printed from a non-admin user's account were formatted incorrectly and had parts missing - until I spoke to a co-worker who was a test manager for Vista printing. As it turned out, that was by design, due to some security problem with UAC that they could not reconcile.

My daughter was willing to suffer a bit longer because Vista played Geometry Wars and XP wouldn't, but the moment it was released on Steam, Vista was gone that same day.

Then there was my Dad, who got his HP laptop pre-loaded with Vista. Most of the software he used did not work on it, so he asked me for a copy of XP and reinstalled his machine about 3 weeks ago.

My wife's suffered the longest. She's a dev at Microsoft, and she worked on Vista. That was her product. But it just did not work - from VMWare to even Microsoft's very own solution for logging into corp network remotely. She threw in the towel 2 weeks ago, and now her laptop is 10 times faster, running XP.

Speed is another thing with Vista. That "security" stuff they put in is not free, as Mark Russinovich points out on his blog here and here. I have a gaming PC that is an absolute speed demon - 3GHz Core 2 Duo E6840 running on 1333MHz motherboard with 4GB 800MHz dual channel RAM and Nvidia 8800GTS 640MB video card. Vista runs OK on it - but only OK - takes a while to boot, then a bit more to load the sidebar, then clicks on the sidebar take 10-15 seconds to show the link, etc. XP just plain flies. Boots in something like 10 seconds. UI is so snappy it feels like it responds BEFORE I click :-)...

So net net is...

Cons for Vista:
- slow
- media support is horrendous
- a bunch of s/w does not work, including a lot of Microsoft's own
- a bunch of h/w does not have drivers
- UI is different (I don't know if it's better, but in many places it's different seemingly without a reason, so you have to re-learn it again; and there's no "classic" mode)
- UAC mode is just plain stupid - it causes more harm than good (the registry entry to turn it off was the most frequently asked question about UAC/LUA at Microsoft for quite some time)

Pros for Vista:
- eventually, we will all get used to it and it will be the future. Or maybe we'll migrate to Macs, as a whole bunch of former Microsoft people now working at Google did...

I guess my family and I will be staying with XP for a while. Good for me that I bought a whole bunch of copies of XP in the company store as well :-)...

Saturday, November 10, 2007

Domain languages

When I told my intern that I was moving to Maps from Gmail, she smiled and said - well, of course, you want to code in C++ rather than in Java/JavaScript which is what the majority of Gmail is written in.

She was not the first to remark on that. Somehow I exude an aura of disdain for managed languages.

In reality, I use Java and C# quite a bit, and often in an environment where one would never use a language one hates - at home. I have a system that downloads podcasts (mostly NPR radio shows), which is written in C#. Another program that I wrote periodically records radio shows, mostly those not available for download, such as This American Life (yes, I do know it's available on iTunes; I don't use iTunes). It is also written in C#. A family of programs that I use to index web sites and download the parts of them that match specific criteria is written in Java. And I am working on a bunch of arcade games that will be available off my web site - those are written in JavaScript.

Truth be told, nothing that I wrote for pleasure in the last 2-3 years is written in C++.

I do think that languages have their domains of applicability. I doubt many people will argue with that. What may be controversial is where the boundaries of these domains are.

For the purposes of this article I will combine Java and C# into one entity called 'managed languages'. This may offend purists, but it is a very convenient shortcut.

Anyway, let's start with what managed languages are NOT good for. I don't think they are good for desktop applications. There are four problems.

First, the resources that they require to run, specifically, the startup time and memory footprint. If you ever ran Office Communicator, for example, or ATI control center, you know what I mean. On my (super-powerful) laptop, the latter takes 15 seconds to start. On my (older) desktop, the former made the system unusable for the first MINUTE or so after reboot while it loaded.

Second, it is impossible to create a .NET application that does anything more than print a line of text and uses less than 50MB of RAM. Just check the memory consumption of the aforementioned Communicator. And it's not just .NET - my Java indexer is running right now on one of my computers and takes 51MB of RAM. The program is barely 1000 lines of code, and it is not super memory intensive - all it does is load text from the web and do a regex match looking for a specific pattern.
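For scale, the indexer's inner loop is essentially nothing more than the following sketch. The page text and the pattern here are made-up stand-ins (the real program pulls its text off the web first); the point is how little code it takes to consume 51MB:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Indexer {
    public static void main(String[] args) {
        // In the real indexer this text comes from the web; inlined here.
        String page = "<a href=\"show1.mp3\">Show 1</a> <a href=\"notes.html\">Notes</a>";
        // Hypothetical pattern: find links to mp3 files.
        Pattern p = Pattern.compile("href=\"([^\"]+\\.mp3)\"");
        Matcher m = p.matcher(page);
        while (m.find()) {
            System.out.println(m.group(1)); // prints show1.mp3
        }
    }
}
```

A thousand lines of this kind of logic is a trivial workload, which makes the 51MB resident footprint all the more striking.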

Here's a bigger (but much better tuned) program - Eclipse. 100MB and 40 seconds to start on my laptop. By comparison, Visual Studio Professional 2005, a much bigger system, on the same computer - 60MB and 10 seconds to start, with a C++ project loaded.

While it's fine for one program to be written in managed code, imagine an environment where there are say 10-20-30 managed applications and services running simultaneously. How about 10 minutes for such a system to boot? And 2GB of RAM just to do nothing? There actually is an example of just such a system - "Origami" Ultra-Mobile PC from Samsung. This thing takes ~5 minutes to boot, of which about 30 seconds is Windows, and the rest is the shell implemented in .NET framework.

Third, deployment issues. When you are distributing your managed app, you have to distribute the .NET framework and/or the Java runtime with it. This adds to installation time. It may require reboots. It may conflict with a beta version of the same runtime that was installed on the user's computer with some other app. Finally, it leaves the distinct taste of "crap" being installed on the user's computer. How many installation programs that deploy runtime environments clean them up after the program is uninstalled? How many runtime uninstallation programs clean up after themselves?

Finally, the UI of managed programs does not behave quite like native UI. This is especially true for Java, which looks and behaves like it was written for Unix, but the differences rear their ugly head in WinForms as well, although that environment has made a lot of progress. The differences are subtle - the behavior of focus, tabs, z-order - but the net result is: I can tell, and I am sure many users can, too. Except they might not know what the cause is.

Another area where managed languages should not be used is education. Most schools have now switched their curricula to Java, with disastrous results for the industry. I have been interviewing new grads for the last 10 years, and the vast majority of people coming out of contemporary CS programs do not understand pointers, do not know how to manage memory, and have no concept of implementation efficiency. Here's a recent example - an interviewee created an ArrayList to store elements inside a quicksort partitioning loop.
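To make the point concrete, here is the kind of code one hopes to see instead - a minimal in-place partition (the standard Lomuto scheme), which needs no auxiliary containers at all:

```java
public class Partition {
    // Lomuto partition: rearranges a[lo..hi] around the pivot a[hi]
    // and returns the pivot's final index. No auxiliary storage needed.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
            }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }

    public static void main(String[] args) {
        int[] a = {3, 7, 1, 9, 5};
        int p = partition(a, 0, a.length - 1);
        System.out.println(p); // pivot 5 lands at index 2
    }
}
```

An ArrayList here buys nothing and costs boxing, indirection, and garbage - exactly the implementation-efficiency blindness the interview exposed.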

Of course all these concepts are required in any language, including managed ones, where you can never ignore memory management issues - you can only postpone them to the "tune" stage of your project, a stage that grows progressively longer the longer you ignore them during development.

And students coming out of school during the mid-to-late 90s had all these concepts - perhaps not quite on par with industry veterans, but they had enough to get started.

Another reason why Java and C# should never be taught in school is that they are manifestly "trade" languages - they are designed to make the craft part of programming easy, unlike C, which exposes the user to the details of computer architecture, or Scheme/LISP/ML, which teach advanced concepts of computer science. Joel has a wonderful article on this here:

Where do managed languages succeed? If you examine the problems I listed above, the answer becomes obvious - wherever the environment is fully controlled by the developer (i. e. deployment does not matter), wherever it serves a single purpose (so there are no multiple applications running concurrently, and startup time and resources do not matter), and wherever there is no UI. Which is to say, on a server.

Here you really reap the benefits of easier development - automatic memory management, rich runtimes, more agile development schedules, and fewer bugs - without paying the costs (except maybe having to deploy more servers).

Another area, of course, is single-purpose applications ("scripts") that do batch processing and rely heavily on existing OS/app infrastructure. For example, a task that runs at 3am and downloads podcasts. These programs are mostly produced by technical enthusiasts to automate routine tasks. Here managed languages are great because of the richness of their runtimes, which allows the developer to complete big tasks with relatively little effort.
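A task of the 3am-scheduled variety - say, pruning podcasts older than 30 days - fits in a screenful of Java thanks to the runtime. This is a sketch, not my actual downloader; the directory name and the 30-day cutoff are invented for illustration:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class PrunePodcasts {
    // Delete mp3 files last modified before the cutoff; returns how many went.
    static int prune(Path dir, Instant cutoff) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.mp3")) {
            for (Path f : files) {
                FileTime t = Files.getLastModifiedTime(f);
                if (t.toInstant().isBefore(cutoff)) {
                    Files.delete(f);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "podcasts");
        Instant cutoff = Instant.now().minus(30, ChronoUnit.DAYS);
        System.out.println("Deleted " + prune(dir, cutoff) + " old files");
    }
}
```

Directory traversal, timestamps, and deletion all come straight from the runtime library; an unmanaged version of the same chore is several times the code.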

C# is especially good because of its superbly done COM interop layer - it allows access to all the Windows and Office infrastructure far more easily than unmanaged languages do. In effect, the whole of Windows and Office becomes its runtime. To illustrate this concept, here's a C# program that will read you a Project Gutenberg book aloud:

using System;
using SpeechLib;

namespace speak {
    class Program {
        static void Main(string[] args) {
            if ((args.Length != 2)
                || ((!args[0].Equals("say"))
                    && (!args[0].Equals("read")))) {
                System.Console.WriteLine("Usage: speak "
                    + "{say \"Sentence\" | read file_name}");
                return;
            }
            SpVoice voice = new SpVoice();
            if (args[0].Equals("say")) {
                voice.Speak(args[1], SpeechVoiceSpeakFlags.SVSFDefault);
            } else {
                try {
                    System.IO.StreamReader reader =
                        new System.IO.StreamReader(args[1]);
                    while (reader.Peek() > 0) {
                        string s = reader.ReadLine();
                        voice.Speak(s, SpeechVoiceSpeakFlags.SVSFDefault);
                    }
                    reader.Close();
                } catch (Exception e) {
                    System.Console.WriteLine("Error "
                        + e.ToString());
                }
            }
        }
    }
}
In other words, there are no good or bad languages, there are languages that are good or bad for a particular task.

Intergroup cooperation, or why Google 2008 != Microsoft 1998

If you've ever been a manager at a big technology company, you know it's tough. In fact, I am willing to bet that intergroup cooperation is somewhere within the top 3 things that you consider hardest in your job.

The reason it is tough is that the high-tech environment consists of people who are very creative, and creative people need to personally believe in a cause to work on it (or be helpful to it). Heck, their own manager has probably worked her tail off trying to get them excited about the task they are currently working on. What chance do you, a mere mortal coming from the outside, have of getting them to help you?

Here are the criteria that you need to achieve to make your intergroup cooperation project successful at Microsoft:

(1) You need to get your own people to believe that this is possible, and that team A will in fact make feature B, which your product needs, a part of their upcoming release. Your team consists mostly of smart and experienced people. They have seen countless cases where this did not happen, and they are yet to see a case where a project like that was successful. They also may have ideas on how to do what you need without feature B.

If you have managed to do (1), you can proceed to the next part.

(2) You need to persuade the PM of team A that feature B is better than a backlog of his or her own pet features that will never be implemented because there's not enough time.

(3) You need to persuade the dev of team A of the same, against the same considerations.

(4) You need to persuade the tester of team A to spend time testing it. The same tester who has just helped kill a similar feature because there's no time for it on the test schedule. (As a side note, there really is not enough time to adequately test even what shipped a couple of releases back, let alone new features.)

If you think it will help that your product has just been declared the New Hope(TM) of the company by Bill or Steve, or that the management chain throughout believes this is important, you are wrong. At best it will not hurt. Line workers very rarely agree with the execs on the set of priorities (more about that later), and if they have not fully bought in, nothing will happen regardless of what management thinks.

You may have an urge to skip (2-4) and head straight to the management. There is no better way to kill the project, because, again, without the cooperation of the people who actually do the work NOTHING WILL HAPPEN.

Getting the feature team buy-in is by far the hardest. But if you did achieve that, now you, hopefully with their help...

(5) ...need to get the assent of the PUM (product unit manager) and the discipline managers (dev manager, test manager, and GPM). The bad part is that the PUM and the management team never believed in the schedule in the first place. You're proposing to make a late (and undertested) project later (and buggier). The good part is that they have just returned from the management training where the idea that they have to trust their employees with decisions has just been reinforced. And the feature team is with you.

As you can see, there are way too many moving parts, a lot of cooks, and so it rarely works, at least not in an environment where the schedule rules (more on that later).

When I was working in Windows CE, a big part of which consists of code ported from other Microsoft products, most notably, Windows, I had to do these dances more or less for a living. I used to bribe people from other teams by giving them cool gadgets for dogfooding.

If you work in NT base, you usually have to debug 10 layers of code at the same time to figure out anything at all (it is tens of millions of lines of interdependent code in one unprotected address space, where anything influences everything), so people skills (as in getting other people to help you debug your code) are paramount to your success there, too.

In Windows Home Server, I tried my best to build only on existing, shipped technology, and never, EVER depend on anything that was not done. The one exception, forced on us by a manager who was obviously excited by the possibility of claiming an intergroup cooperation success (and who pre-sold this success to Bill), came damn near killing the entire project when untested, poorly designed and engineered code blew up from under us.

At Microsoft, the answer to the yearly MSPoll question about intergroup cooperation has, year after year, garnered an approval rate in the mid-30% range.

So when I was reading the freshly released Google Geist numbers (the yearly Google poll measuring employee happiness) and reached the intergroup cooperation question, I could not believe my eyes. I rubbed them, then looked again. There it still was - around 90% of all Googlers are HAPPY with the amount of support they get from their peers and other teams at Google. This was among the highest numbers in the survey, and it was not because Google employs more excitable people than Microsoft (for example, both employee bodies agreed on the attractiveness of their compensation packages to within 5%).

So here's another argument to why Google 2008 != Microsoft 1998. Yes, the internal environment is - somewhat - similar. But only somewhat. There are differences in employee bodies that run very, very deep.

**This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.**

Halo 3: not so good

I am spending about an hour every day playing Halo 2 on my PC while walking on the treadmill, while the Xbox 360 sits on the side, collecting dust.


(1) Matchmaking on Xbox Live is terrible! This is not actually a feature of Halo 3 per se; it was the same towards the end with Halo 2. How do I know? Because most of the games I play end up with extremely unbalanced teams. I am willing to bet that if anyone at Microsoft cared to look at the stats, they would discover that maybe only 10-20% of the games end with close scores (2:3, 4:5, 3:5, even), whereas >50% end with one of the teams scoring 0.

And when you either get beaten all the time, or win all the time, it's just not fun. This situation has actually been getting worse as time went on, from the initial matchmaking when Halo 2 just shipped, which was not indecent, to what it is now.

One might also think that creating a matchmaking environment where teams are balanced, as evidenced by the scores, would make a very decent, and measurable, performance goal. Apparently not.

On the PC there is no automatic matchmaking. One joins whatever game he or she likes from the list. If there are more strong players on one team than the other, people tend to migrate to the other team until things balance out.

(2) Boy, does Halo 3 have weapons! When Halo 1 came out, it had 10 Spartan-usable weapons, including 2 types of grenades. Now in Halo 3 there are a whopping 12 human and 14 Covenant weapons, and that does not include the vehicles!

When you have a matrix of 26x26, it is impossible to talk about balance.

The power here is NOT in numbers!

(3) They changed the controller button mapping - and in a completely unreasonable way. There was one notable problem with the old Xbox button mapping - to punch, one needed to move the finger off the direction thumbstick, thus not being able to turn while trying to punch.

With the new set of buttons close to the triggers, it looked like a perfect opportunity to fix that and place the punch button where it actually can be accessed. Instead they randomly moved weapon loading functions to these buttons. Why?

(4) Halo 1 was an extremely innovative game because instead of putting the player into an enfilade of rooms where you have to kill all the monsters in Room 1 before getting to Room 2, it gave the player a relatively open physical space governed by certain laws, where there is an unlimited number of solutions to every problem.

Halo 2 was very innovative because it went a very long way towards solving typical networking problems of a multiplayer game - it handled lag an order of magnitude better than anything before that, and the matchmaking algorithm was not that bad from the start.

Why would anyone call Halo 3 an innovative game? Because they did not (completely) break anything that did not need fixing?

The article in Wired about Halo 3 usability hints that the development team was not particularly proud of Halo 2 release - they felt that the story was dropped on the floor at the very end. I am wondering if the Bungie team is proud of Halo 3. Judging from their recent separation from Microsoft - maybe not...

Power in numbers: Google engineering

One thing that is very different at Google is that engineers seem to run everything. At Microsoft, every decision needed to be made by a trifecta of dev, PM, and test representatives. If any of the disciplines was not brought in from the start, getting its assent became exponentially harder as time went on. And if consensus was not secured, any discipline could - and often did - sabotage the process.

At Google this conflict either does not seem to exist, or takes a much milder form. Why? Because there are fewer, much fewer people in the other disciplines.

At Microsoft, the official position was 1:2:3 - for every 1 PM there were supposed to be 2 devs and 3 testers. In reality I have never seen this hold: there usually were fewer PMs than that (though not by an order of magnitude), and the ratio of test to dev was usually smaller than 1:1 (but not by much). To within an order of magnitude, though, all 3 disciplines were approximately equal in size.

There was one principal source of conflict between dev and PM organizations at Microsoft - basically control of the product. There are 3 things that define typical product:
1) What goes in it? (features)
2) When it gets shipped? (schedule)
3) How does it work? (architecture)

PMs are principally responsible for (1), and devs are principally responsible for (3). However, when we hire devs, we want them to be passionate about the user (more on hiring later). So it is extremely difficult to tell devs to just shut up and code, and not participate as full co-owners of the product design. (And of course product implementation does depend on selected features).

Likewise, when we hire PMs, we want them to be passionate about technology (and technical). So it is extremely difficult to keep them out of the implementation of the product. (And of course the feasibility of features depends on implementation details!)

Therein lies the conflict - two disciplines with very different approaches vie for control of basically the same area.

One of my mentors at Microsoft used to say that a strong dev team can never coexist with a strong PM team - one ends up yielding.

In my previous projects at Microsoft, there was a very strong correlation between the PM:dev ratio and the amount of conflict between the disciplines. The lowest was in my NT base team (1 PM to 6 devs), the second lowest was Windows CE (~1 to 4), and the highest was in Windows Home Server (1 to 2) - between myself and our GPM, we spent probably 20-30% of our time mitigating this conflict, and it was really sapping productivity.

At Google, there are very few PMs - not even 1 for every 10 developers. The PM organization is nowhere near the levels of manpower necessary to assert control. Which means that they end up being responsible for relatively high-level decisions, but there's still enough meat in product design for developers to enjoy.

A similar conflict exists between PMs and devs on one side and testers on the other with respect to schedule - most of the time it is impossible to adequately test the product in the time given, so testers push to compress the dev schedule (and the number of features) to expand the test schedule - and PMs and devs push in the opposite direction.

At Google, the test org is not anywhere near as small as the PM org, but the ratio is still way smaller than at Microsoft. And a lot of testing is done by devs - the culture of writing unit tests is extremely strong here. As a result, testers at Google are nowhere near wielding as much power over shipping decisions as they were/are at Microsoft.

Talk about power being in numbers!

**This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.**

About myself

My name is Sergey Solyanik. I live in Seattle and work at Microsoft.

Here's an abbreviated version of my resume:

Where I went to school:
  • University of Washington, '01-'02, MBA, Technology Management
  • University of Pennsylvania '95-'97, MS, Computer Science
  • Moscow Institute of Physics and Technology, '97-'82, MS, Physics

Where I work(ed):
  • Microsoft, 2008-present, Development Manager
    • Leading a team of 20+ developers on a startup project in datacenter management.
  • Google, 2007-2008, Software Engineer
    • Served as a tech lead on a new business voice project
    • Implemented road traffic incidents in Google Maps
    • Implemented multi-user chat in Gmail
    • (Re)Implemented most of the client portion of Gmail spellchecker
    • Led Google Readability team; Served as JavaScript readability reviewer
    • Served on Seattle Hiring committee
  • Microsoft, 1998-2007, Software Design Engineer, Development Lead, Development Manager
    • Ran development for Windows Home Server v1
    • Ran development for an effort to port Windows NT to a new CPU
    • Ran Windows CE Middleware team
    • Implemented Bluetooth stack for Windows Mobile
    • Implemented Windows CE version of MSMQ
    • Ported DCOM to Windows CE
    • Implemented CMD and console subsystem for Windows CE
  • Bentley Systems, Inc., 1993-1997, Software Developer
    • Implemented JIT compiler and optimizing compiler backend for Java port to MicroStation
    • Ported MicroStation to PowerMac, OS/2, Solaris x86, and several Windows NT architectures
    • Implemented several device drivers for graphics, printing, and input.

What I am interested in:
  • Coding
  • Halo
  • Economics
  • Politics

**This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.**

First things first

**This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.**