1-800-MAGIC: February 2008

Friday, February 29, 2008

Testing disk backup

Now that I've written a simple disk backup problem, I need to ensure that it is correct. After all, we're dealing with the user data here - terrible things will happen if it will read (or write) a wrong cluster once in a while!

So, obviously, before I can use it, I need to persuade myself that it won't eat my data.

What makes a great test? The best test of all runs the software on all possible data sets in all possible environments, and verifies that it ran correctly. In our case, verification means that the backup can be restored, and the restored files match the files that were backed up perfectly.

Verification is actually is quite straightforward - we'll take a backup of one partition, restore it to another, and then compare the files.

Much harder is to generate all combinations of file systems and all environments (or partition sizes, in this case).

When there's a test that needs a lot of various combination, and I can't foresee the "interesting" data sets, I like the random number generator. I like it for two reasons. First, given enough time, it will come up with things I could not foresee. Second, most random number generators aren't really random - they generate a fixed sequence of numbers, given the starting number (a seed). For testing, this is actually an asset - this means that the test that is driven by a pseudorandom number generator is REPRODUCIBLE. Why is this good (actually, necessary)? Because when the test does not work, you want to repeat the conditions under which the software broke.

What about running in all possible environments? For us, the environment is the partition size. Here we will have to yield to reality - there is no way we can test a lot of partition configurations. We'll just have to go with the sequence I consider "interesting" - partition sizes of powers of two, powers of two +1, powers of two -1, and a random size in between.

Ok, then, the test. It was written in python and run from the same directory where the source for the program is.

"""Test for discio.

Usage:
  python.exe test.py disk_no letter1 letter2
Where:
  disk_no The number of the disk to use.
      The disk must contain no information.
      Everything on the disk will be wiped out!
  letter1 The letter for the first volume to be
      created on the test disk. Must not be
      an existing volume, or else it will be
      wiped out.
  letter2 The letter for the second volume to be
      created on the test disk. Must not be
      an existing volume, or else it will be
      wiped out.

  WARNING: If run incorrectly, this test can
      be enormously destructive! It cleans disks,
      formats volumes etc. If you call it with an
      incorrect disk number, that disk will be gone!
"""

import os
import random
import sys
import tempfile
import time

class Error(Exception):
  """Base error for the test."""
  pass


class GenericFailure(Error):
  """Test has failed. The string describes the failure."""
  pass


__DISKPART = 'c:\\windows\\system32\\diskpart.exe'
__FORMAT = 'c:\\windows\\system32\\format.com'
__DISKIO = 'Release\\discio.exe'
__BACKUPFILE = 'test.bkf'
__SVI = 'System Volume Information'

A function that compares directories and files. If given two files, it just keeps reading them in 2K chunks, and comparing the results. If given two directories, it lists them, sorts them, and then compares the sorted lists in the merge-sort fashion: if the names are the same, match the files (or directories). If not, say that the smaller (in lexicographical order) does not exists, and the advance the pointer to the smaller one.

def Compare(f1, f2):
  """Compares two filesystem objects.

  Prints out the differences.
  Returns:
    True if the files compare OK.
  """
  ret = True

  if (__SVI in f1 and
      __SVI in f2):
    return ret

  if os.path.isfile(f1) and os.path.isfile(f2):
    fd1 = open(f1, 'rb')
    fd2 = open(f2, 'rb')
    while True:
      s1 = fd1.read(2048)
      s2 = fd2.read(2048)
      if s1 != s2:
         ret = False
         break
      if not s1:
         break
    fd1.close()
    fd2.close()
    if not ret:
      print '%s and %s are different' % (f1, f2)
    return ret

  if os.path.isfile(f2):
    print '%s is a dir, %s is a file' % (f1, f2)
    return False

  if os.path.isfile(f1):
    print '%s is a dir, %s is a file' % (f2, f1)
    return False

  #print "%s <-> %s" % (f1, f2)

  files1 = os.listdir(f1)
  files2 = os.listdir(f2)

  files1.sort()
  files2.sort()

  i = 0
  j = 0

  while i < len(files1) and j < len(files2):
    if files1[i] == files2[j]:
      if not Compare(os.path.join(f1, files1[i]),
                     os.path.join(f2, files2[j])):
        ret = False
      i = i + 1
      j = j + 1
      continue

    ret = False

    if files1[i] < files2[j]:
      print 'Missing ' + os.path.join(f2, files1[i])
      i = i + 1
    else:
      print 'Missing ' + os.path.join(f1, files2[j])
      j = j + 1

  while i < len(files1):
    print 'Missing ' + os.path.join(f2, files1[i])
    i = i + 1
    ret = False

  while j < len(files2):
    print 'Missing ' + os.path.join(f1, files2[j])
    j = j + 1
    ret = False

  return ret

This one is the biggest bottleneck in the whole system: imagine generating 4G of data and writing it to files in a SCRIPT! Yet that's exactly what it does, computer time be damned!

def GenerateFiles(base, size):
  """Generates a bunch of random files and directories.

  Args:
    base The base directory where the files should be.
        Must end with '\\'.
    size The size in kilobytes of files that it should
        not exceed.
  """
  if not os.path.exists(base):
    os.mkdir(base)

  if size > 100:
    GenerateFiles(os.path.join(base, 'dir1'), size / 6)
    GenerateFiles(os.path.join(base, 'dir2'), size / 6)
    GenerateFiles(os.path.join(base, 'dir3'), size / 6)
    size = size / 2

  file_num = 0
  size = size * 1024

  while size > 0:
    if size < 2000:
      file_size = size
    else:
      file_size = random.randint(0, size)
    fd = open(os.path.join(base, ('file%d.txt' % file_num)),
              'w')

    str = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    str = str + str.lower() + '\n'

    try:
      for i in xrange(0, file_size/len(str)):
        fd.write(str)

      fd.flush()
    finally:
      fd.close()

    file_num = file_num + 1
    size = size - file_size

The reason I remove some files is to generate holes - unused clusters between the used ones. Otherwise I would be backing up a completely filled disk - almost as uninsteresting as testing backup of an empty disk!

def RemoveSomeFiles(base):
  """Removes every third file from a tree."""
  counter = 0
  for (basepath, dirs, files) in os.walk(base):
    if __SVI in basepath:
      continue
    for file in files:
      if counter % 3 == 0:
        full_file_name = os.path.join(basepath, file)
        os.remove(full_file_name)
      counter = counter + 1


def GenerateFileSystem(base, size):
  """Generates a bunch of random files and directories.

  Args:
    base The base directory where the files should be.
        Must end with '\\'.
    size The size in kilobytes of files that it should
        not exceed.
  """
  try:
    GenerateFiles(base, int(size * 9 / 10))
  except IOError:
    pass

  RemoveSomeFiles(base)

Finally, the test itself. We use diskpart to wipe and repartition the disk on every iteration - which means that we need the whole disk to ourselves to run. If you haven't figured this out on your own by now :-).

def RunTest(size, fs, disk, source, dest):
  """The test.

  Creates a two partitions, on the disk,
  formats one, fills it with a bunch of random
  files, backups it, and restores the results
  on the second partition.

  Then verifies that the files are identical
  on both.

  Args:
    size The size of an individual partition
    fs The file system to use, FAT, FAT32,
        or NTFS
    disk The number of the disk to use in testing
        Everything on this disk will be gone!
    source The letter to use for volume on which
        the test data will be generated
    dest The letter to use for volume to which
        the backup will be restored.

  Throws:
    GenericFailure if the test fails

  """
  print 'Running the test.'
  print '  Physical disk: %d ' % disk
  print '  Parition size: %dM' % size
  print '  File system:   %s' % fs

  (fi, script) = tempfile.mkstemp()
  os.close(fi)
  fd = open(script, 'w')
  fd.write('SELE DIS %d\n' % disk)
  fd.write('CLEAN\n')
  fd.write('CREAT PART PRI SIZE=%d\n' % size)
  fd.write('ASSIGN LETTER=%s\n' % source)
  fd.write('CREAT PART PRI SIZE=%d\n' % size)
  fd.write('ASSIGN LETTER=%s\n' % dest)
  fd.write('EXIT\n')
  fd.close()

  print 'Preparing the partitions'

  ret = os.spawnv(os.P_WAIT, __DISKPART,
                  [__DISKPART,
                   '/s',
                   script])

  os.remove(script)

  if ret != 0:
    raise GenericFailure(
        'Could not partition the disk!')

  ret = os.spawnv(os.P_WAIT, __FORMAT,
                  [__FORMAT,
                   source + ':',
                   '/fs:' + fs,
                   '/v:test',
                   '/y'])

  if ret != 0:
    raise GenericFailure(
        'Could not format the test partition')

  print 'Generating test files'

  GenerateFileSystem(source + ':\\', size * 1024)

  time.sleep(20)

  print 'Running backup'

  ret = os.spawnv(os.P_WAIT, __DISKIO,
                  [__DISKIO,
                   source + ':\\',
                   'readto',
                   __BACKUPFILE]);

  if ret != 0:
    raise GenericFailure('Backup failed!')

  print 'Running restore'

  ret = os.spawnv(os.P_WAIT, __DISKIO,
                  [__DISKIO,
                   dest + ':\\',
                   'writefrom',
                   __BACKUPFILE]);

  if ret != 0:
    raise GenericFailure('Restore failed!')

  if not Compare(source + ':\\', dest + ':\\'):
    raise GenericFailure(
        'Restore does not match backup!')

  print 'Test succeeded!'
  print

And finally, the main. First and foremost it is trying to keep you safe: prevent you from wiping out some useful data.

def main(args):
  """Main entry point.

  Args:
    args: Command line arguments. See usage.

  """

  random.seed(0)

  if len(args) < 3:
    print __doc__
    return
    
  try:
    disk = int(args[0])
  except ValueError:
    print 'Disk should be a number!'
    return

  if disk <= 0:
    print 'Disk cannot be a system disk (0).'
    return

  source = args[1].upper()
  dest = args[2].upper()
  if len(source) != 1 or len(dest) != 1:
    print 'Only use 1 letter to specify drives'
    return

  if (dest < 'K' or dest > 'O'
      or source < 'K' or source > 'O'):
    print 'Use letters K through O for test drives.'
    print 'This is to minimize potential of a'
    print 'collision with the real drives.'
    return

  if dest == source:
    print 'Source and destination should be different!'
    return

  if os.path.exists(source + ':\\'):
    print 'Cannot continue: %s:\\ exists!' % source
    return

  if os.path.exists(dest + ':\\'):
    print 'Cannot continue: %s:\\ exists!' % dest
    return

  print 'Running the test on physical disk %d' % disk
  print 'and using drive letters %s and %s' % (source,
                                               dest)

And here we run through a bunch of partition sizes (in Megabytes) I thought were interesting, and run backup test on 3 file systems: FAT16, FAT32, and NTFS.

  try:
    for size in [50, 100, 128, 200, 256, 300, 315,
                 1000, 1024, 2047, 2048, 2049, 4095,
                 4096, 4097]:
      RunTest(size, "FAT", disk, source, dest)
      RunTest(size, "FAT32", disk, source, dest)
      RunTest(size, "NTFS", disk, source, dest)
  except GenericFailure, f:
    print "Test failed: " + f.message

  
if __name__ == '__main__':
  main(sys.argv[1:])

And - voila! This ran overnight and succeeded!

Taking disk image snapshots - programmatically

A while ago I published a quick program to rip software CD images by opening CD device as a file for block access.

Today I will expand it to read (and restore) the images of a regular hard drive.

Technically speaking, just running the code from http://1-800-magic.blogspot.com/2008/02/backing-up-software-cds.html directly would work (sort of, we'll talk about an important gotcha later). But it will result in a very, very big file. Hard drives are huge these days, but mostly empty. So a lot of the backup image would be the empty space.

Luckily, Windows has a handy facility to figure out what blocks on the disks are actually used. Device IO control FSCTL_GET_VOLUME_BITMAP returns a bitmap where 1s correspond to used blocks, and 0s to the clusters that are empty.

Unluckily, this io control only works for NTFS. For FAT it returns the bitmap, but it does not start with the beginning of the disk - rather, with the end of the FAT tables as it seems.

Funnily enough, FSCTL_GET_VOLUME_BITMAP also reports the starting cluster where the bitmap is supposed to start. For FAT, it should have been non-zero - we could then backup everything before it, and then whatever is in the bitmap after, but no luck - it just happily returns zero, and the shifted bitmap.

There's surely a way to figure out the sizes of FATs, but since this filesystem is less and less common, and the drives formatted with FAT are small, so in the interests of simplicity I just backup the whole disk if the filesystem is not NTFS.

Another bit of interesting data is that your partition is actually bigger than your file system by one cluster. This cluster contains a copy of the the boot sector for the partition. The intent, I think, is to use it in disk recovery. We back it up for NTFS only, not for other file systems.

Alright, so here's the code. The usual things first. See this post for implementation of AtoI: http://1-800-magic.blogspot.com/2008/02/down-with-atoi.html.

#define UNICODE 1
#define _UNICODE 1

#include <windows.h>
#include <stdio.h>
#include <ctype.h>
#include <strsafe.h>
#include <assert.h>

#define ARRLEN(c) (sizeof(c)/sizeof(c[0]))

template<class IntType> bool AtoI(const WCHAR *sz, IntType *out) {
...}

GetDeviceName translates the disk number (0-N), a drive letter (A:), or a volume mount point (C:\, or c:\cdrom) to a name that Windows would understand as a block device.

bool GetDeviceName(WCHAR *szDevName,
                   size_t cchDevName,
                   const WCHAR *szInput) {
  unsigned int disk_no = 0;
  if (AtoI<unsigned int>(szInput, &disk_no)) {
    // It's a physical disk
    StringCchPrintfW(szDevName,
                     cchDevName,
                     L"\\\\.\\PHYSICALDRIVE%d", 
                     disk_no);
    return true;
  }

  WCHAR drive_letter = toupper(szInput[0]);
  if (drive_letter >= 'A' && 
      drive_letter <= 'Z' &&
      szInput[1] == ':' &&
      szInput[2] == '\0') {
    // It's a drive letter
    StringCchPrintfW(szDevName,
                     cchDevName,
                     L"\\\\.\\%c:",
                     drive_letter);
    return true;
  }

  WCHAR sz[_MAX_PATH];
  const WCHAR *p = wcsrchr(szInput, '\\');
  if (!p || p[1] != '\0') {
    // Mount point needs to end in backslash
    StringCchCopyW(sz, ARRLEN(sz), szInput);
    StringCchCatW(sz, ARRLEN(sz), L"\\");
    szInput = sz;
  }

  if (!GetVolumeNameForVolumeMountPointW(
      szInput, szDevName, cchDevName))
    return false;

  // This returns a name ending with '\'
  // but CreateFile does not take it, so
  // we need to get rid of it.
  WCHAR *q = wcsrchr(szDevName, '\\');
  if (q && q[1] == '\0')
    *q = '\0';

  return true;
}

Then opening is straightforward. Notice the sharing mode - this is part of the gotcha I will be talking about later. Also notice that I removed the FSCTL_ALLOW_EXTENDED_DASD_IO from here. This IO control allows reading and writing past the end of the volume as understood by the file system. Since we backup the overflow sector only for NTFS, I moved it to NTFS-specific place.

HANDLE OpenBlockDevice(const WCHAR *szDevName,
                       bool fWrite) {
  HANDLE h = CreateFile(szDevName,
    fWrite ? GENERIC_WRITE : GENERIC_READ,
    FILE_SHARE_WRITE, NULL, OPEN_EXISTING,
    FILE_FLAG_NO_BUFFERING |
        (fWrite ? FILE_FLAG_WRITE_THROUGH : 0),
    NULL);
  return h;
}

Now the fun part - let's get the volume bitmap. The idiosyncrasy of this function is that there's no way - at least not a way I know of - to get the size of the bitmap that we need to pre-allocate, first. If you give it the NULL input, as is customary with majority of Windows API, and expect the output to say how many bytes it needs, you will be disappointed - the function just errors out, and returns 0 in the bytes returned. Therefore, I do this ugly dance:

VOLUME_BITMAP_BUFFER *GetVolumeBitmap(
    HANDLE hDev, unsigned int *puiBitmapSize) {
  STARTING_LCN_INPUT_BUFFER sStartLcn;
  sStartLcn.StartingLcn.QuadPart = 0;

  DWORD dwBitmapSize = 0;
  DWORD dwAllocatedSize = 64 * 1024;
  VOLUME_BITMAP_BUFFER *pVolumeBitmap = NULL;
  for ( ; ; ) {
    pVolumeBitmap = (VOLUME_BITMAP_BUFFER *)
        LocalAlloc(LMEM_FIXED, dwAllocatedSize);

    BOOL ret = DeviceIoControl(hDev,
                      FSCTL_GET_VOLUME_BITMAP,
                      &sStartLcn,
                      sizeof(sStartLcn),
                      pVolumeBitmap,
                      dwAllocatedSize,
                      &dwBitmapSize,
                      NULL);
    if (ret) {
      *puiBitmapSize = dwBitmapSize;
      return pVolumeBitmap;
    }

    if (GetLastError() != ERROR_MORE_DATA)
      return NULL;

    dwAllocatedSize *= 2;
  }
}

I want to support both the case where we backup the entire volume as well as the case where we would only backup the used clusters (let's call it sparse backup). I do it by prepending the following header to the sparse backup file - so when I read it back, if the magic string is there, I know that I need to read the volume bitmap, and then write the data accordingly. If it's not there, the file is basically the image and I just blast the whole thing on the disk:. When I restore, I also need the sizes of the cluster and the sector, so I know how to interpret the bitmap. If I were somewhat less lazy, I would also care to check that on restore the sector size of the disk matches that of the backup.

#define SIG "Compressed Disk Image"

struct BackupHeader {
  char signature[sizeof(SIG)];
  unsigned int uiSectorSize;
  unsigned int uiClusterSize;
  unsigned int uiBitmapSize;
};

I put together a generic function that can transfer data from disk to file, and from file to disk according to the volume bitmap if it is given. The small gotcha here is that the size of the volume bitmap is in BITS. Storage API rarely follows standard Windows patters for some reason!

Notice that in the case where pVolumeBitmap is NULL, the function gets reduced to a very simple loop that just reads and writes the entire block device.

// This function assumes the ownership of
// pVolumeBitmap, i. e. it frees it.
// hFrom is the handle from which the data is read.
// hTo is the handle to which the data is written.
// hSeek is always the handle of the device - this
// is the pointer that is being moved to skip empty
// extents.
// For backup: hFrom = hSeek - disk handle
//             hTo - file handle
// For restore: hTo = hSeek - disk handle
//             hFrom - file handle
// If device is not an NTFS volume, pBackupData,
// pVolumeBitmap are both NULL.
DWORD Transfer(HANDLE hFrom,
               HANDLE hTo,
               HANDLE hSeek,
               BackupHeader *pBackupData,
               VOLUME_BITMAP_BUFFER *pVolumeBitmap,
               void *buffer,
               const unsigned int cbBufferSize,
               unsigned __int64 &ui64Bytes) {
  DWORD dwErr = ERROR_SUCCESS;

  // This is used to step through the bitmap
  // of used clusters.
  unsigned char bmpMask = 1;
  unsigned char *bmpIndex = NULL;
  int maxClustersPerRead = 0;
  unsigned __int64 maxClusters = 0;
  if (pVolumeBitmap) {
    bmpIndex = pVolumeBitmap->Buffer;
    maxClusters = pVolumeBitmap->BitmapSize.QuadPart;
    maxClustersPerRead = cbBufferSize /
        pBackupData->uiClusterSize;
  }

  unsigned __int64 cluster = 0;
  bool fTerm = false;

  while (!fTerm) {
    DWORD dwToMove = cbBufferSize;
    if (pVolumeBitmap) {
      // Skip empty. Note that at the end
      // of this loop cluster can be more
      // than maxCluster, because sometimes
      // we step in 8.
      while (cluster < maxClusters) {
        if (!*bmpIndex) {
          cluster += 8;

          ++bmpIndex;
          assert(bmpMask == 1);

          continue;
        }

        if ((*bmpIndex & bmpMask) == 0) {
          ++cluster;

          bmpMask <<= 1;
          if (! bmpMask) {
            bmpMask = 1;
            ++bmpIndex;
          }

          continue;
        }
        break;
      }

      dwToMove = 0;
      int numClusters = 0;
      while (cluster  + numClusters < maxClusters
          && numClusters < maxClustersPerRead
          && (*bmpIndex & bmpMask) != 0) {
        dwToMove += pBackupData->uiClusterSize;
        ++numClusters;

        bmpMask <<= 1;
        if (!bmpMask) {
          bmpMask = 1;
          ++bmpIndex;
        }
      }

      assert(dwToMove <= cbBufferSize);
      if (dwToMove == 0) { // The End?
        // The last sector on the volume
        // contains duplicate of the boot
        // sector, but is not part of the
        // volume map.
        dwToMove = pBackupData->uiSectorSize;

        // Since we might have skipped more
        // clusters than there were because
        // sometimes we count by 8-s, we reset
        // the current cluster count as well:
        cluster = maxClusters;

        // Convenient place to free the bitmap.
        LocalFree(pVolumeBitmap);
        pVolumeBitmap = NULL;
        fTerm = true;
      }

      unsigned __int64 offset = cluster *
        (unsigned __int64)pBackupData->uiClusterSize;

      if (!SetFilePointerEx(hSeek,
                            *(LARGE_INTEGER *)&offset,
                            NULL,
                            FILE_BEGIN)) {
        dwErr = GetLastError();
        wprintf(L"Seek error %d (0x%08x)\n",
            dwErr, dwErr);
      }

      cluster += numClusters;
    }

    DWORD dwRead = 0;
    if (!ReadFile(hFrom, buffer,
        dwToMove, &dwRead, NULL)) {
        dwErr = GetLastError();
        wprintf(L"\nRead error %d (0x%08x)\n",
            dwErr, dwErr);
        break;
    }

    if (dwRead == 0)
      break;

    DWORD dwWrit = 0;
    if (!WriteFile(hTo, buffer,
        dwRead, &dwWrit, NULL) ||
        dwWrit != dwRead) {
      dwErr = GetLastError();
      wprintf(L"\nWrite error %d (0x%08x)\n",
          dwErr, dwErr);
      break;
    }

    ui64Bytes += dwWrit;
    wprintf(L"%I64u\r", ui64Bytes);
  }

  return dwErr;
}

This makes read much shorter than what it would have been:

DWORD ReadToFile(HANDLE hDev,
                 HANDLE hFile,
                 void *buffer,
                 const unsigned int cbBufferSize,
                 unsigned __int64 &ui64BytesRead) {
  DWORD dwErr = ERROR_SUCCESS;

  unsigned int uiBitmapSize = 0;
  VOLUME_BITMAP_BUFFER *pVolumeBitmap = NULL;

  NTFS_VOLUME_DATA_BUFFER sVolumeData;
  DWORD dwRead = 0;
  // For NTFS volumes we can store just the
  // clusters that are used. We get the
  // bitmap where 1 means that the cluster
  // is used here.
  if (DeviceIoControl(hDev,
                      FSCTL_GET_NTFS_VOLUME_DATA,
                      NULL,
                      0,
                      &sVolumeData,
                      sizeof(sVolumeData),
                      &dwRead,
                      NULL)) {
    // For NTFS disable volume boundary checks by FS
    DWORD dwBytes = 0;
    DeviceIoControl(hDev,
                    FSCTL_ALLOW_EXTENDED_DASD_IO,
                    NULL,
                    0,
                    NULL,
                    0,
                    &dwBytes,
                    NULL);

    pVolumeBitmap = GetVolumeBitmap(hDev, &uiBitmapSize);
  }

  BackupHeader h;
  DWORD dwWrit = 0;
  if (pVolumeBitmap) {
    memcpy(h.signature, SIG, sizeof(SIG));
    h.uiBitmapSize = uiBitmapSize;
    h.uiSectorSize = sVolumeData.BytesPerSector;
    h.uiClusterSize = sVolumeData.BytesPerCluster;
    if (!WriteFile(hFile,
                   &h,
                   sizeof(h),
                   &dwWrit,
                   NULL)
        || dwWrit != sizeof(h)) {
      dwErr = GetLastError();
      wprintf(L"Write error %d (0x%08x)\n",
          dwErr, dwErr);
      return dwErr;
    }

    if (!WriteFile(hFile,
                   pVolumeBitmap,
                   uiBitmapSize,
                   &dwWrit,
                   NULL)
        || dwWrit != uiBitmapSize) {
      dwErr = GetLastError();
      wprintf(L"Write error %d (0x%08x)\n",
          dwErr, dwErr);
      return dwErr;
    }
  }

  return Transfer(hDev,
                  hFile,
                  hDev,
                  &h,
                  pVolumeBitmap,
                  buffer,
                  cbBufferSize,
                  ui64BytesRead);
}

And write, even shorter:

DWORD WriteFromFile(HANDLE hDev,
                 HANDLE hFile,
                 void *buffer,
                 const unsigned int cbBufferSize,
                 unsigned __int64 &ui64BytesWritten) {
  DWORD dwRead = 0;
  DeviceIoControl(hDev,
                  FSCTL_ALLOW_EXTENDED_DASD_IO,
                  NULL,
                  0,
                  NULL,
                  0,
                  &dwRead,
                  NULL);

  unsigned int uiBitmapSize = 0;
  VOLUME_BITMAP_BUFFER *pVolumeBitmap = NULL;

  BackupHeader h;

  if (ReadFile(hFile, &h, sizeof(h), &dwRead, NULL)
      && dwRead == sizeof(h)) {
    if (memcmp(h.signature, SIG, sizeof(SIG)) == 0) {
      pVolumeBitmap = (VOLUME_BITMAP_BUFFER *)
          LocalAlloc(LMEM_FIXED, h.uiBitmapSize);
      if (!ReadFile(hFile, pVolumeBitmap, h.uiBitmapSize,
          &dwRead, NULL) || dwRead != h.uiBitmapSize) {
        wprintf(L"Backup corrupted - could not read the "
                L"volume bitmap!\n");
        return ERROR_DATA_NOT_ACCEPTED;
      }
    } else
      SetFilePointer(hFile, 0, NULL, FILE_BEGIN);
  }

  DWORD dwErr = Transfer(hFile,
                         hDev,
                         hDev,
                         &h,
                         pVolumeBitmap,
                         buffer,
                         cbBufferSize,
                         ui64BytesWritten);

  DeviceIoControl(hDev,
                  IOCTL_DISK_UPDATE_PROPERTIES,
                  NULL,
                  0,
                  NULL,
                  0,
                  &dwRead,
                  NULL);

  return dwErr;
}

The main just parses the command line then:

int wmain(int argc, WCHAR **argv) {
  if (argc < 4) {
    wprintf(L"Usage: %s {disk|volume} cmd "
            L"args\n", argv[0]);
    wprintf(L"Where disk is physical number "
            L"of the drive, e. g. 0\n");
    wprintf(L"Volume is a drive letter or a "
            L"full path to \n");
    wprintf(L"the mount point, e. g. a: or "
            L"c:\\mount\\vol\n");
    wprintf(L"Commands:\n");
    wprintf(L"readto filename - read the "
            L"contents of the device into "
            L"a file\n");
    wprintf(L"writefrom filename - write the "
            L"contents of the file onto "
            L"the disk or volume\n");
    return 0;
  }

  UINT uiErrModeSav = SetErrorMode(
      SEM_FAILCRITICALERRORS);

  WCHAR szDevName[_MAX_PATH];
  if (!GetDeviceName(szDevName,
                     ARRLEN(szDevName),
                     argv[1])) {
    wprintf(L"Could not recognize %s\n", argv[1]);
    SetErrorMode(uiErrModeSav);
    return 1;
  }

  const int kBufferSize = 64 * 1024;
  void *buffer = VirtualAlloc(
      NULL,
      kBufferSize,
      MEM_COMMIT | MEM_RESERVE,
      PAGE_READWRITE);
  if (!buffer) {
    DWORD dwErr = GetLastError();
    wprintf (L"Could not reserve memory, "
             L"error %d (0x%08x)\n", dwErr, dwErr);
    SetErrorMode(uiErrModeSav);
    return 3;
  }

  DWORD dwTickStart = GetTickCount();
  unsigned __int64 ui64Bytes = 0;
  DWORD dwErr = ERROR_SUCCESS;
  if (_wcsicmp(argv[2], L"readto") == 0) {
    HANDLE hDev = OpenBlockDevice(szDevName, false);
    if (hDev != INVALID_HANDLE_VALUE) {
      HANDLE hFile = CreateFile(
          argv[3], GENERIC_WRITE, 0, NULL,
          CREATE_ALWAYS, 0, NULL);
      if (hFile != INVALID_HANDLE_VALUE) {
        wprintf(L"%s --> %s:\n", szDevName, argv[3]);
        dwErr = ReadToFile(hDev,
                           hFile,
                           buffer,
                           kBufferSize,
                           ui64Bytes);
        CloseHandle(hFile);
        if (dwErr != ERROR_SUCCESS)
          DeleteFile(argv[3]);
      } else {
        dwErr = GetLastError();
        wprintf (L"Could not open %s, error %d "
                 L"(0x%08x)\n", argv[3],
                 dwErr, dwErr);
      }
      CloseHandle(hDev);
    } else {
      dwErr = GetLastError();
      wprintf(L"Could not open %s, "
              L"error %d (0x%08x)\n",
              argv[1], dwErr, dwErr);
    }
  } else if (_wcsicmp(argv[2], L"writefrom") == 0) {
    HANDLE hDev = OpenBlockDevice(szDevName, true);
    if (hDev != INVALID_HANDLE_VALUE) {
      HANDLE hFile = CreateFile(
          argv[3], GENERIC_READ, 0, NULL,
          OPEN_EXISTING, 0, NULL);
      if (hFile != INVALID_HANDLE_VALUE) {
        wprintf(L"%s --> %s:\n", argv[3], szDevName);
        dwErr = WriteFromFile(hDev,
                              hFile,
                              buffer,
                              kBufferSize,
                              ui64Bytes);
        CloseHandle(hFile);
      } else {
        dwErr = GetLastError();
        wprintf (L"Could not open %s, error %d "
                 L"(0x%08x)\n", argv[3],
                 dwErr, dwErr);
      }
      CloseHandle(hDev);
    } else {
      dwErr = GetLastError();
      wprintf(L"Could not open %s, "
              L"error %d (0x%08x)\n",
              argv[1], dwErr, dwErr);
    }
  } else {
    wprintf (L"Command %s not recognized.\n",
             argv[2]);
    dwErr = ERROR_NOT_SUPPORTED;
  }

  if (dwErr == ERROR_SUCCESS &&
      ui64Bytes != 0) {
    wprintf(L"Transferred %I64u bytes in %d "
            L"seconds.\n", ui64Bytes,
            (GetTickCount() - dwTickStart) / 1000);
  }

  VirtualFree(buffer, 0, MEM_RELEASE);
  SetErrorMode(uiErrModeSav);

  return dwErr == ERROR_SUCCESS ? 0 : 4;
}

Now it WOULD be time to publish the binary and the source code in its intact form, except for one thing. This code works poorly if it is run against your system volume. The problem is that the system volume does not usually exist in the consistent state - it is being constantly written to, with part of the information already on disk, and another part in buffers.

Moreover, while the backup is in progress, another program can add data, thus creating an inconsistency between the disk and the NTFS descriptors (say, we'd read the disk, but not yet the MFT. Then the program would add a file to the area of the disk we have already written, and modify the MFT - we would've gotten an image where there's a file full of garbage).

The solution is called VSS, and I will cover it in another post. Hopefully, at that point we will have the code that would work well with the system volume. But even before that, I owe an article on testing it.

As it stands now, it can be used on secondary volumes (when you're sure nothing is writing on it), unprotected CDs and DVDs, or on a system volume if you're booting an alternative OS.

Wednesday, February 27, 2008

STL vector perf redux

Alright, my exercise in STL vector perf analysis was to a large extent an exercise in n00bness. Nevertheless, I learned a lot of stuff, thanks to the readers, and here's the summary.

If you are looking for the gory details, read this: http://1-800-magic.blogspot.com/2008/02/stl-vector-performance.html and the commentary.

The task: stuff an STL vector with 10000 numbers, measure what it takes per iteration.

The performance of course will vary with the computer. Mine is T2600 @ 2.16GHz.

C reference point - pre-allocated array.
2 clocks per iteration:

int *work = (int *)malloc(10000 * sizeof(int));
for (int j = 0; j < 10000; ++j)
  work[j] = j;

C reference point - dynamically allocated array using STL allocation policy.
70 clocks per iteration:

int size = 20;
int *work = (int *)malloc(size * sizeof(int));
for (int j = 0; j < 10000; ++j) {
  if (j >= size) {
    size = size + size / 2;
    work = (int *)realloc(work,
        size * sizeof(int));
  }
  work[j] = j;
}

STL, dynamically allocated vector, with or without security checks.
140 clocks per iteration.

vector work;
for (int j = 0; j < 10000; ++j)
  work.push_back(j);

STL, pre-allocated, with security checks, push_back.
9 clocks per iteration.

vector work;
work.reserve(10000);
for (int j = 0; j < 10000; ++j)
  work.push_back(j);

STL, pre-allocated, no security checks, push_back.
8 clocks per iteration.

#define _SECURE_SCL 0
vector work;
work.reserve(10000);
for (int j = 0; j < 10000; ++j)
  work.push_back(j);

STL, pre-allocated, with security checks, [].
6 clocks per iteration.

vector work(10000);
for (int j = 0; j < 10000; ++j)
  work[j] = j;

STL, pre-allocated, no security checks, [].
2 clocks per iteration.

#define _SECURE_SCL 0
vector work(10000);
for (int j = 0; j < 10000; ++j)
  work[j] = j;

Resume: know what you're doing. Performance can be from 3x to 70x slower if you're not careful.

For those who didn't read the previous post, and want to argue that none of this is important, read this http://1-800-magic.blogspot.com/2007/12/memory-is-not-free-more-on-vista.html for a fun real life story when you wouldn't think that performance matters until the very end, when it suddenly does, but then it's too late :-).

Another observation: the fact that most STL implementations are not THAT much smaller is an accolade to the modern CPU design.

If you actually look at the generated code, the STL (especially push_back) compiles to much bigger assembly than you could guess from the results above - even without the security check it is ~30 instructions plus the call into reallocating logic.

; for (int j = 0; j < NUM_ITERATIONS; ++j)
mov         edx,dword ptr [esp+3Ch] 
mov         ecx,dword ptr [esp+38h] 
xor         esi,esi 
mov         dword ptr [esp+1Ch],esi 
mov         edi,edi 
; work.push_back(j);
L0:
cmp         ecx,edi 
jne         L1
xor         eax,eax 
jmp         L2
L1:
mov         eax,dword ptr [esp+40h] 
sub         eax,ecx 
sar         eax,2 
L2:
mov         ebx,edx 
sub         ebx,ecx 
sar         ebx,2 
cmp         ebx,eax 
jae         L3
mov         dword ptr [edx],esi 
add         edx,4 
mov         dword ptr [esp+3Ch],edx 
jmp         L4
L3:
lea         ecx,[esp+1Ch] 
push        ecx  
push        edx  
lea         eax,[esp+3Ch] 
call        std::vector<int,std::allocator<int> >::_Insert_n
mov         edx,dword ptr [esp+3Ch] 
mov         ecx,dword ptr [esp+38h] 
L4:
inc         esi  
cmp         esi,2710h 
mov         dword ptr [esp+1Ch],esi 
jl          $LN22+5Ah (4013E0h)

This is only 4 times slower than the C loop, which is so much less:

xor         eax,eax 
lea         esp,[esp]
L0: 
mov         dword ptr [ecx+eax*4],eax 
inc         eax  
cmp         eax,2710h 
jl          L0

This is because all these heaps of STL code execute in parallel, speculatively, whereas the straightforward C loop is not parallelizeable at all.

Tuesday, February 26, 2008

STL vector performance

I was going through an interview package today, where the interviewee returned a whole STL vector as a result of the function. I wondered if this is a reasonable thing to do, and whether it would cause a lot of data copying.

So I cranked up Visual Studio 2008 jotted down a few lines of code, and looked at the disassembly.

Turns out, the compiler does a really good job with the aliasing, so the code that it generates (in optimized version) is very efficient. What drew my attention was nearly 30 instructions that it generated for "vector.push_back(i)". I expected that the optimized code would be a lot less, and started digging.

As it turned out, it was a lot worse. The code generated inline was the tip of an iceberg - another function was called, and not just in the case where array reallocation was required. I stepped through it in assembly several times to make sure.

It looked pretty bad, so I decided to measure how it would compare to non-STL programming idioms. I compared pushing 10000 numbers into pre-allocated STL vector, an auto-expanding STL vector, an expanding C array, and a pre-allocated C array.

The code was compiled in Release (optimized) mode on Visual Studio 2008.

Here is the test. Getting the boring stuff out of the way first:


#include <windows.h>

#include <stdio.h>
#include <math.h>

#include <vector>

using namespace std;

I used Pentium clock counter instruction to minimize overhead. This function returns 64-bit clock count since the CPU was last reset. Thank you, Wikipedia!


__declspec(naked)
unsigned __int64 __cdecl rdtsc(void) {
  __asm {
    rdtsc
    ret
  }
}

However, if the thread is relocated to a different core, the clock count returned is for a different CPU, which makes time measurement completely random. So I restricted the code to a single core, and for a good measure, pre-committed the heap to ensure that paging does not spoil the results:


#define NUM_ITERATIONS 10000
void PrepEnvironment(void) {
  // Move everything to one CPU, we're going
  // to be using its clock counter
  SetThreadAffinityMask(GetCurrentThread(), 1);

  // Commit all the memory.
  void *p = malloc(sizeof(int) * 4 * NUM_ITERATIONS);
  memset(p, 0, sizeof(int) * 4 * NUM_ITERATIONS);
  free(p);
}

For stats, I calculated the average, the median, and the histogram on the log scale:


int __cdecl comparator(const void *p1,
                       const void *p2) {
  const unsigned int *pi1 =
      (const unsigned int*)p1;
  const unsigned int *pi2 =
      (const unsigned int*)p2;
  return (int)(*pi1 - *pi2);
}

#define BUCKETS 20
void DumpStats(unsigned int *clocks,
               unsigned int cptr,
               unsigned int iters) {
  qsort(clocks, cptr,
      sizeof(unsigned int), comparator);

  int buckets[BUCKETS];
  memset(buckets, 0, sizeof(buckets));

  double average = 0;
  double log2 = log(2.0);

  for (unsigned int i = 0 ; i < cptr; ++i) {
    average += (double)clocks[i];
    int ndx = (int)(log((double)clocks[i]
        / 1000.0) / log2);
    if (ndx < 0)
      buckets[0]++;
    else if (ndx < BUCKETS-1)
      buckets[ndx + 1]++;
    else
      buckets[BUCKETS-1]++;
  }

  average /= cptr;

  printf("Median: %d\n",
      (int)clocks[cptr / 2]);
  printf("Median, CPI: %d\n",
      clocks[cptr / 2] / iters);
  printf("Average: %d\n", (int)average);
  printf("Average, CPI: %d\n",
      (int)(average / iters));

  printf("Histogram:\n");
  int prev = 0;
  int curr = 1000;
  for (int i = 0; i < BUCKETS - 1; ++i) {
    printf("%d - %d: %d\n", prev, curr,
        buckets[i]);
    prev = curr;
    curr *= 2;
  }

  printf(" > %d: %d\n", prev,
      buckets[BUCKETS - 1]);
}

Finally, the 4 different pieces of code all filled an array with 10000 numbers, and repeated this operation 1000 times (so the standard error is about 3%). I have made C array reallocation strategy to match STL vector implementation.


#define NUM_SAMPLES 1000
int main(int argc, char **argv) {
  PrepEnvironment();

  static unsigned int clocks[NUM_SAMPLES];
  for (int k = 0; k < 4; ++k) {
    unsigned int cptr = 0;

    switch(k) {
    case 0:
      printf("STL re-alloc\n");
      for (int i = 0; i < NUM_SAMPLES; ++i) {
        vector work;
        unsigned __int64 begin = rdtsc();
        for (int j = 0; j < NUM_ITERATIONS; ++j)
          work.push_back(i);
        unsigned __int64 end = rdtsc();
        clocks[cptr++] =
            (unsigned int)(end - begin);
      }
      break;
    case 1:
      printf("STL pre-alloc\n");
      for (int i = 0; i < NUM_SAMPLES; ++i) {
        vector work;
        work.reserve(NUM_ITERATIONS);
        unsigned __int64 begin = rdtsc();
        for (int j = 0; j < NUM_ITERATIONS; ++j)
          work.push_back(i);
        unsigned __int64 end = rdtsc();
        clocks[cptr++] =
            (unsigned int)(end - begin);
      }
      break;
    case 2:
      printf("C re-alloc\n");
      for (int i = 0; i < NUM_SAMPLES; ++i) {
        int size = 20;
        int *work = (int *)malloc(size * sizeof(int));
        unsigned __int64 begin = rdtsc();
        for (int j = 0; j < NUM_ITERATIONS; ++j) {
          if (j >= size) {
            // Match STL vector allocation algorithm
            size = size + size / 2;
            work = (int *)realloc(work, size
                * sizeof(int));
          }
          work[j] = j;
        }
        unsigned __int64 end = rdtsc();
        clocks[cptr++] =
            (unsigned int)(end - begin);
        free(work);
      }
      break;
    case 3:
      printf("C pre-alloc\n");
      for (int i = 0; i < NUM_SAMPLES; ++i) {
        int *work = (int *)malloc(
            NUM_ITERATIONS * sizeof(int));
        unsigned __int64 begin = rdtsc();
        for (int j = 0; j < NUM_ITERATIONS; ++j)
          work[j] = j;
        unsigned __int64 end = rdtsc();
        clocks[cptr++] =
            (unsigned int)(end - begin);
        free(work);
      }
      break;
    }

    DumpStats(clocks, cptr, NUM_ITERATIONS);
    printf("\n\n");
  }

  return 0;
}

So the results are as follows (this is on my T60p ThinkPad T2600 Core 2 Duo running at 2.16 GHz plugged-in):


STL re-alloc
Median: 1511328
Median, CPI: 151
Average: 1540473
Average, CPI: 154
Histogram:
0 - 1000: 0
1000 - 2000: 0
2000 - 4000: 0
4000 - 8000: 0
8000 - 16000: 0
16000 - 32000: 0
32000 - 64000: 0
64000 - 128000: 0
128000 - 256000: 0
256000 - 512000: 0
512000 - 1024000: 123
1024000 - 2048000: 835
2048000 - 4096000: 35
4096000 - 8192000: 6
8192000 - 16384000: 1
16384000 - 32768000: 0
32768000 - 65536000: 0
65536000 - 131072000: 0
131072000 - 262144000: 0
 > 262144000: 0


STL pre-alloc
Median: 80171
Median, CPI: 8
Average: 82007
Average, CPI: 8
Histogram:
0 - 1000: 0
1000 - 2000: 0
2000 - 4000: 0
4000 - 8000: 0
8000 - 16000: 0
16000 - 32000: 0
32000 - 64000: 0
64000 - 128000: 995
128000 - 256000: 2
256000 - 512000: 1
512000 - 1024000: 2
1024000 - 2048000: 0
2048000 - 4096000: 0
4096000 - 8192000: 0
8192000 - 16384000: 0
16384000 - 32768000: 0
32768000 - 65536000: 0
65536000 - 131072000: 0
131072000 - 262144000: 0
 > 262144000: 0


C re-alloc
Median: 698529
Median, CPI: 69
Average: 728341
Average, CPI: 72
Histogram:
0 - 1000: 0
1000 - 2000: 0
2000 - 4000: 0
4000 - 8000: 0
8000 - 16000: 0
16000 - 32000: 0
32000 - 64000: 0
64000 - 128000: 0
128000 - 256000: 0
256000 - 512000: 0
512000 - 1024000: 972
1024000 - 2048000: 27
2048000 - 4096000: 1
4096000 - 8192000: 0
8192000 - 16384000: 0
16384000 - 32768000: 0
32768000 - 65536000: 0
65536000 - 131072000: 0
131072000 - 262144000: 0
 > 262144000: 0


C pre-alloc
Median: 20124
Median, CPI: 2
Average: 21572
Average, CPI: 2
Histogram:
0 - 1000: 0
1000 - 2000: 0
2000 - 4000: 0
4000 - 8000: 0
8000 - 16000: 0
16000 - 32000: 980
32000 - 64000: 15
64000 - 128000: 2
128000 - 256000: 2
256000 - 512000: 0
512000 - 1024000: 1
1024000 - 2048000: 0
2048000 - 4096000: 0
4096000 - 8192000: 0
8192000 - 16384000: 0
16384000 - 32768000: 0
32768000 - 65536000: 0
65536000 - 131072000: 0
131072000 - 262144000: 0
 > 262144000: 0

The results are quite damning: the most commonly accepted C idiom of pre-allocating the array is 2 clocks per integer inserted, compared to 8 for pre-allocated vector, and 150 for most commonly used dynamically expanding vector. Which is 4 to 75 times(!) faster.

NOTE: The original version of the article contained an error: I pre-allocated the array incorrectly. The problem is now fixed and results updated. The fix resulted in dramatic improvement of STL performance, although it still did not come close to matching raw C performance.

NOTE2: I have tested it with the disabled security (#define _SECURE_SCL 0, and verified that generated assembly did not contain calls to the boundary checker), but it did not change the performance.

Even the dynamically expanding C array is the same speed as the pre-allocated vector, and 2.5 times faster than its dynamically-expanding STL counterpart.

But who cares, right? Computers are fast, we'll optimize the tight loops, and nobody would notice things that are not on the critical path, anyway. Well, yes and no. If your software is slow across the board, you'll be hit with what I call the Vista syndrome - there isn't one loop to optimize if it's all sluggish. More on this here: http://1-800-magic.blogspot.com/2007/12/memory-is-not-free-more-on-vista.html.

For a company like Google, it also means higher CPU utilization, more machines, and ultimately, more energy used. The difference in power by the way is very real - Google clusters do have a certain maximum CPU utilization threshold, beyond which the fuses get blown.

Next on my plate is to measure Java performance, and my gut feel is that it's not going to be much slower than STL. Which brings up the question - why use C++ at all, if the only advantage it has over Java and C# - the performance - is eaten by the runtime?

-----
Update: There IS a way to get STL to match C in the pre-allocated case. Here's how it's done. Ouch!


...
#define _SECURE_SCL 0
#include 
...

  printf("STL pre-alloc\n");
  for (int i = 0; i < NUM_SAMPLES; ++i) {
    vector work(NUM_ITERATIONS);
    work.reserve(NUM_ITERATIONS);
    unsigned __int64 begin = rdtsc();
    for (int j = 0; j < NUM_ITERATIONS; ++j)
      work[j] = j;
    unsigned __int64 end = rdtsc();
    clocks[cptr++] =
        (unsigned int)(end - begin);
  }

...and here are the results:


STL pre-alloc
Median: 20124
Median, CPI: 2
Average: 21668
Average, CPI: 2
Histogram:
0 - 1000: 0
1000 - 2000: 0
2000 - 4000: 0
4000 - 8000: 0
8000 - 16000: 0
16000 - 32000: 979
32000 - 64000: 11
64000 - 128000: 5
128000 - 256000: 4
256000 - 512000: 1
512000 - 1024000: 0
1024000 - 2048000: 0
2048000 - 4096000: 0
4096000 - 8192000: 0
8192000 - 16384000: 0
16384000 - 32768000: 0
32768000 - 65536000: 0
65536000 - 131072000: 0
131072000 - 262144000: 0
 > 262144000: 0

Beware the STL my son! The jaws that bite, the claws that catch!

I used to like C++ a lot. It was a very simple, efficient language, and you could always say exactly how the code you're writing mapped to your CPU.

Of course that was at Microsoft and before, and the dialect that I used was more C than C++ - sprinkled with some syntactic sugar like ability to define variables at the site of the first use, and maybe some classes, and an occasional - very infrequent - template. But still C at the very core.

This is not how Google uses C++ though. Here, C++ means STL. Big time. None of this C style is tolerated (there isn't even readability process for C!). And everything - and I mean everything! is a class.

STL was spooking me for a long time. I never used it at Microsoft because things that I worked on - mostly in the core of the OS - had to be insanely fast, and, most importantly, long-running. So I had to care a lot about things like memory fragmentation (see here: http://1-800-magic.blogspot.com/2007/11/guerilla-guide-to-native-memory.html and here: http://1-800-magic.blogspot.com/2007/12/memory-is-not-free-more-on-vista.html).

And it's kinda hard - possible, but you have to bend over backwards - to make STL use custom allocators that were my bread and butter.

Anyway, at Google I had to learn and start using STL. This, unfortunately, made me like C++ (at least STL-ish C++) less and less.

The first problem that jumped at me was the way you learn things. Most of my career I was learning by reading the source.

With C runtime, even with its Microsoft implementation which has way more levels of indirection than necessary, it is easy - you step into the function, and look what's inside. The code that compiler generates maps directly to the source, so the performance is on the surface - it executes as written.

With STL, the file vector - what can be simpler than that?! - is 2424 lines (Visual Studio 2008). algorithm - basically, sorting, searching, and copying - is 5796 lines. This, by the way, is the longest quicksort implementation I've seen in my life :-)! When you step through the code, it's a level of indirection on top of level of indirection on top of level of indirection.

[As a side note - the description of level 62 in old Microsoft - the level where you stop being a junior developer - had, among others, this as a qualification: "Does not create unnecessary abstractions." I think this was written by a smart, smart person!]

Now, consider the following piece of code:


#include <stdio.h>
#include <vector>

using namespace std;

template<class T> class Outer {
  template<class T> class Inner {
  public:
    int i;
    Y t;
    Inner(int a, T b) {
      i = a;
      t = b;
    }
  };

public:
  void DoWork(T t) {
    vector<Inner<T>> data;
    for (int i = 0; i < 10; ++i)
      data.push_back(Inner<T>(i, t));

    for (vector<Inner<T>>::iterator
        it = data.begin(); it != data.end(); ++it)
      it->i += 10;
  }
};

int main(int argc, char **argv) {
  Outer<int> x;
  x.DoWork(25);
  return 0;
}

[Note, this is just an example. The original code was not it. Yes, I know about STL pair.]

This code compiles and runs just fine on Visual Studio compiler. It does not on GCC, for 3 reasons.

First, GCC requires a space between two closing '>' on a templatized template, so you can't write X<Y<Z>>. It has to be X<Y<Z> >.

Second, the template T in the Inner class declaration on GCC shadows the outer T, so it has to be named something else.

And finally, and the most ridiculous - GCC does not parse vector<Inner<T>>::iterator as a type. You have to prefix it with the keyword typename, like so:


  typename vector<Inner<T>>::iterator it =...

If the compiler can't recognize the type as a type, there's something wrong with the code, don't you think?

Update 03/02/2008: Judging from commentrarry, I failed to communicate the intended point. Yes I know that in the three cases above the behavior is required by the standard, and VS2008 is not standards compliant. VS2008 is probably responsible for compiling at least 60% of the code written on this planet, and if the leading compiler is not compliant, this means only one thing - that the standard is too unwieldy. None of this happens to be the case with C or Java - there's only one way to parse them, and they are parsed the same by all tools.

Today, I started looking at the performance of STL code. But this post is getting too big for that, so the details are in the next articles. Here I'll just say that they are not encouraging.

Dumb and Dumberer

An interesting article in the Washington Post by the author of "The Age of American Unreason".

http://www.washingtonpost.com/wp-dyn/content/article/2008/02/15/AR2008021502901.html

A lot of scary statistics, among them...

"Despite an aggressive marketing campaign aimed at encouraging babies as young as 6 months to watch videos, there is no evidence that focusing on a screen is anything but bad for infants and toddlers. In a study released last August, University of Washington researchers found that babies between 8 and 16 months recognized an average of six to eight fewer words for every hour spent watching videos."

Are Americans really dumber than the rest of the world? I don't think so. I've seen really scary population stratas in Russia, entire schools where students added 1/2 + 3/4 and got 4/6. I think a bell curve is a bell curve everywhere - any country has approximately equivalent percentages of dumb and smart people.

What I do see here is a more direct democracy, where dumb people get to participate a lot more than they do in other country. Sure there were really scary places in Russia, but most people never made a blimp on the political and intellectual landscape of the nation - it was as if they never existed.

Here the 20% of all Americans who believe that Sun revolves around the Sun - they must be the same who also believe that Saddam attacked us on 9/11 and that the Iraq war is going well - get to pick the president. So we get George W. Bush - there's no more direct way to influence not just the US, but the whole world.

As they say, smart people hire people smarter than themselves, dumb people hire people dumber than themselves. Hence our Congress and our "President". And the disproportional role of idiots in public life (compared to the rest of the world).

But the voter participation in Europe is so much higher, you would say. Surely the role that idiots play there must be even greater?

The thing is, Europe has a very strong institution of public press and, most importantly, public TV. So the government has a way to channel information back to the people. If not smarter, they are more educated about the issues they are voting on.

In the US, the idiots are left to themselves and the corporate media whose goal is to extract money, not to educate. And why would the media dig their own grave trying to make the populace more educated? At a minimum, this will cost them the McDonalds advertising revenue...

Monday, February 25, 2008

IQ lies low on college admission committees priorities list

"To those of you who've received honors, awards and distinctions, I say well done, and to the C students... I say, you too can be President of the United States."

-- George W. Bush

My daughter's school had a college admissions weekend for parents, so my wife and I spent last Friday, Saturday, and Sunday on the East Coast.

The event was hosted by the school's college admissions office and featured a bunch of college admissions officials, mostly from here: http://colleges.usnews.rankingsandreviews.com/usnews/edu/college/rankings/brief/t1libartco_brief.php.

I will spare the reader the unimportant details. The most interesting (and useful) exercise was this: the parents were broken into 9 or 10 different rooms, each led by one of the aforementioned college admissions officials (we've got the lady from Harvey Mudd). Each room was given the same 3 applications to judge.

I don't know if the applications were real or fictitious, but they purportedly came from our school, and had familiar classes and grading system on the transcript, and familiar names of the teachers who wrote recommendation letters.

The first application was from an insanely smart girl. Her GPA was between A and A-, she has completed the most advanced calculus by the end of the 10th grade (Calculus BC level), and was taking classes in linear algebra and around in her last two years at school.

In other words, in math and sciences she was operating at the same level I was during my second year at the University. My school which was the Russian version of MIT, by far the most selective school at the time concentrating in math, natural sciences, and technology. Also, as was the custom in Russia, it was completely unburdened by humanity classes - everything was physics and math.

And yet I am sure she would have given me the run for my money.

On top of that, she exhibited the same quality of work in humanities as well - advanced classes in history, with the highest grades earned.

I liked her essay - she was writing about her mother teaching her to read rather than paint her face and dress up. I thought it was an accolade to her mother. As it turned out, most other people thought that she was complaining, and "had issues" with her mother.

The second candidate had a less remarkable GPA a bit over B+, but a track record of leadership in the school's newspaper. In fact, he was credited with a turnaround of the publication, and a huge ($50k) fundraiser that he organized and led. The recommendation letters spoke about maturity. I did not have time to read the essay, but I am sure it was great - an expectation from someone with a career in journalism.

On the negative side, his science grades were pretty bad - B- in physics, and not an advanced physics at that, and C in biology. His GPA also tanked at the very end - from consistent 9.5 to 8 in the Fall term of the last grade.

The third candidate was very weak, in my opinion. His GPA was B. His essay was mostly (>50%) quotes from Shakespeare, and the rest (maybe because of Shakespeare in the background) was absolutely terrible. He was a captain of the school's hockey team, but this should not be confused to student-athletes that get recruited into collegiate athletics - these people are not even going through the application process.

Anyway, after reading all this, I wondered why on Earth did they pick these applications - the decision looked uncontroversial. The girl completely dominated the landscape. I knew that any Google Hiring Committee would have picked her in a flash, too...

Then the parents in my team voted.

Smart girl - my hand shot up and then I looked around. My wife, several of other people - 7 in total.
Newspaperman - a forest of hands (26 by the count).
Hockey player - 10 votes.

Wow... I could not believe my eyes!

Then the groups reconvened in the main auditorium, and the statistics announced. Our group was by far the biggest, because it was a combination of 3 - a couple of college admissions people could not make it because of the snowstorm. But the results were roughly the same. The smart girl won one contest, led by a woman from Tufts. The hockey player got two or three. The rest went for the newspaper editor.

Then the question was asked of the college officials themselves - who would they select. The smart girl had done better here, but only because one of the schools was all-girls, and another (Harvey Mudd) was a math and sciences oriented place, so they were looking for the girls (specifically) who do well in science. But overall, the results largely mimicked the parents "committee".

The rationale given by them was this - there are tons of very smart kids with good grades. What they were looking for was a balanced class, with different types of competencies - IQ being just one of them. And on the balanced scale of priorities it (IQ) lost...

What a relief for the C students everywhere!

Wednesday, February 20, 2008

Backing up software CDs - programmatically

If you're like me, you are drowning in CDs. Even before I signed up for MSDN the sheer number of them has become so unmanageable, that I have given up trying to use CDs and DVDs - when I buy software, I immediately copy it to the server, and put the media in the Big Box(TM) in the basement - in case piracy police comes and demands the evidence that I actually do own it...

This works very well for most of it, but sometimes the CDs are bootable. Like Windows, for example. So just copying the files is not enough. Luckily, there are tons of programs that would burn an image for you, and with the cost of CD/DVD blanks approaching zero, it's easier to have CD images on server and burn them as required rather than keeping physical media handy.

There are, I am sure, tons of programs that would gladly rip a CD into an image file. But we're developers, right? And if a developer can write a tool, (s)he does. Luckily, Windows storage subsystem makes it extremely easy. This post is going to guide you through the process.

So let's start!


#define UNICODE 1
#define _UNICODE 1

#include <windows.h>
#include <stdio.h>
#include <ctype.h>
#include <strsafe.h>

#define ARRLEN(c) (sizeof(c)/sizeof(c[0]))

The essence: the same CreateFile function that we know and love can do much more than just opening plain vanilla files - it can open any device as well.

Every Windows physical disk and volume (partition) is backed by a block device. All we need to know is its name, and then we can address it just like any file - and yes, read and write from it using ReadFile and WriteFile. Completely bypassing the file system. Except of course if the device is a CD or DVD - then we can not WriteFile it (duh!).

Let's figure what the name is. Suppose the user is allowed to pass the number of the physical disk (C: is usually 0, D: is often 1, etc - you can look it up by going to My Computer->Manage->Disk Management), or a drive letter (D:) or a name of the folder where CDROM is mounted (C:\CDROM).

I use the AtoI conversion from here: http://1-800-magic.blogspot.com/2008/02/down-with-atoi.html.


bool GetDeviceName(WCHAR *szDevName,
                   size_t cchDevName,
                   const WCHAR *szInput) {
  unsigned int disk_no = 0;
  if (AtoI<unsigned int>(szInput, &disk_no)) {
    // It's a physical disk
    StringCchPrintfW(szDevName,
                     cchDevName,
                     L"\\\\.\\PHYSICALDRIVE%d", 
                     disk_no);
    return true;
  }

  WCHAR drive_letter = toupper(szInput[0]);
  if (drive_letter >= 'A' && 
      drive_letter <= 'Z' &&
      szInput[1] == ':' &&
      szInput[2] == '\0') {
    // It's a drive letter
    StringCchPrintfW(szDevName,
                     cchDevName,
                     L"\\\\.\\%c:",
                     drive_letter);
    return true;
  }

  WCHAR sz[_MAX_PATH];
  const WCHAR *p = wcsrchr(szInput, '\\');
  if (!p || p[1] != '\0') {
    // Mount point needs to end in backslash
    StringCchCopyW(sz, ARRLEN(sz), szInput);
    StringCchCatW(sz, ARRLEN(sz), L"\\");
    szInput = sz;
  }

  if (!GetVolumeNameForVolumeMountPointW(
      szInput, szDevName, cchDevName))
    return false;

  // This returns a name ending with '\'
  // but CreateFile does not take it, so
  // we need to get rid of it.
  WCHAR *q = wcsrchr(szDevName, '\\');
  if (q && q[1] == '\0')
    *q = '\0';

  return true;
}

Opening the block device is slightly tricky: if you just call CreateFile on it, the file system will try to filter the boundaries on our reads. So it will disallow access to hidden areas on the device which the file system would prefer to keep for itself. We don't want that - we want the whole thing. Hence the scary-looking DeviceIoControl:


HANDLE OpenBlockDevice(const WCHAR *szDevName,
                       bool fWrite) {
  HANDLE h = CreateFile(szDevName,
    fWrite ? GENERIC_WRITE : GENERIC_READ,
    0, NULL, OPEN_EXISTING,
    FILE_FLAG_NO_BUFFERING |
        (fWrite ? FILE_FLAG_WRITE_THROUGH : 0),
    NULL);
  if (h != INVALID_HANDLE_VALUE) {
    // Disable volume boundary checks by FS
    DWORD dwBytes = 0;
    DeviceIoControl(h,
                    FSCTL_ALLOW_EXTENDED_DASD_IO,
                    NULL,
                    0,
                    NULL,
                    0,
                    &dwBytes,
                    NULL);
  }
  return h;
}

Let's get the UI out of the way...


int wmain(int argc, WCHAR **argv) {
  if (argc < 4) {
    wprintf(L"Usage: %s {disk|volume} cmd "
            L"args\n", argv[0]);
    wprintf(L"Where disk is physical number "
            L"of the drive, e. g. 0\n");
    wprintf(L"Volume is a drive letter or a "
            L"full path to \n");
    wprintf(L"the mount point, e. g. a: or "
            L"c:\\mount\\vol\n");
    wprintf(L"Commands:\n");
    wprintf(L"readto filename - read the "
            L"contents of the device into "
            L"a file\n");
    return 0;
  }

If a removable disk is not present, the system will throw up an ugly dialog box. This nice little trick disables it so we can stay command line, and even run this from a service...


  UINT uiErrModeSav = SetErrorMode(
      SEM_FAILCRITICALERRORS);

Translate the user input into device name and open it:


  WCHAR szDevName[_MAX_PATH];
  if (!GetDeviceName(szDevName,
                     ARRLEN(szDevName),
                     argv[1])) {
    wprintf(L"Could not recognize %s\n", argv[1]);
    SetErrorMode(uiErrModeSav);
    return 1;
  }

  HANDLE hDev = OpenBlockDevice(szDevName, false);
  if (hDev == INVALID_HANDLE_VALUE) {
    DWORD dwErr = GetLastError();
    wprintf(L"Could not open %s, "
            L"error %d (0x%08x)\n",
            argv[1], dwErr, dwErr);
    SetErrorMode(uiErrModeSav);
    return 2;
  }

The very very last tricky thing is the memory allocation. To do IO to physical devices the memory must be on 64k boundary. The easiest way to get it is VirtualAlloc:


  const int kBufferSize = 64 * 1024;
  void *buffer = VirtualAlloc(
      NULL,
      kBufferSize,
      MEM_COMMIT | MEM_RESERVE,
      PAGE_READWRITE);
  if (!buffer) {
    DWORD dwErr = GetLastError();
    wprintf (L"Could not reserve memory, "
             L"error %d (0x%08x)\n", dwErr, dwErr);
    CloseHandle(hDev);
    SetErrorMode(uiErrModeSav);
    return 3;
  }

Finally, let's copy some data!


  DWORD dwTickStart = GetTickCount();
  unsigned __int64 ui64BytesRead = 0;
  DWORD dwErr = ERROR_SUCCESS;
  if (_wcsicmp(argv[2], L"readto") == 0) {
    HANDLE hFile = CreateFile(
        argv[3], GENERIC_WRITE, 0, NULL,
        CREATE_ALWAYS, 0, NULL);
    if (hFile != INVALID_HANDLE_VALUE) {
      wprintf(L"%s --> %s:\n", szDevName, argv[3]);
      for ( ; ; ) {
        DWORD dwRead = 0;
        if (!ReadFile(hDev, buffer,
            kBufferSize, &dwRead, NULL)) {
            dwErr = GetLastError();
            wprintf(L"Read error %d (0x%08x)\n",
                dwErr, dwErr);
            break;
        }

        if (dwRead == 0)
          break;

        DWORD dwWrit = 0;
        if (!WriteFile(hFile, buffer,
            dwRead, &dwWrit, NULL) ||
            dwWrit != dwRead) {
          dwErr = GetLastError();
          wprintf(L"Write error %d (0x%08x)\n",
              dwErr, dwErr);
          break;
        }

        ui64BytesRead += dwWrit;
        wprintf(L"%I64u\r", ui64BytesRead);
      }

      CloseHandle(hFile);
      if (dwErr != ERROR_SUCCESS)
        DeleteFile(argv[3]);
    } else {
      dwErr = GetLastError();
      wprintf (L"Could not open %s, error %d "
               L"(0x%08x)\n", argv[3],
               dwErr, dwErr);
    }
  } else {
    wprintf (L"Command %s not recognized.\n",
             argv[2]);
    dwErr = ERROR_NOT_SUPPORTED;
  }

  if (dwErr == ERROR_SUCCESS &&
      ui64BytesRead != 0) {
    wprintf(L"Transferred %I64u bytes in %d "
            L"seconds.\n", ui64BytesRead,
            (GetTickCount() - dwTickStart) / 1000);
  }

As you can see, at this point it is not any different from just copying a file. A little bit of cleanup...


  VirtualFree(buffer, 0, MEM_RELEASE);
  CloseHandle(hDev);
  SetErrorMode(uiErrModeSav);

  return dwErr == ERROR_SUCCESS ? 0 : 4;
}

...and we're done. Just below 200 lines of code is all it takes.

Monday, February 18, 2008

Microsoft's bid for Yahoo, continued...

Here's what Joe Rosenberg has to say about it:

"Ballmer is a great operating man but he lacks financial acumen. He ought to be thinking more of Microsoft employees who own a lot of Microsoft stock and have nothing to show for it in many years. If the stock doesn't start doing better, Microsoft will lose good people."

It rings so true! Apparently market has already discounted Microsoft value by exactly the amount of the merger - $40B - reaffirming the deal as completely worthless.

The Reuters article is here: http://www.reuters.com/article/technologyNews/idUSN1760137520080217.

Also, NYT writes today basically the same thing I was saying always - technologically the companies are so incompatible that it's hard to expect that the marriage will work out: http://www.nytimes.com/2008/02/18/technology/18integrate.html.

It repeats most of my arguments here: http://1-800-magic.blogspot.com/2008/02/50b-down-drain-or-microsoft-bids-for.html.

Sunday, February 17, 2008

It is a song that does not end! It just goes on and on, my friend...


History repeats itself, first as tragedy,
second as farce.
                        -- Karl Marx


A call to the switchboard of the Central Committee
of the Communist Party of the Soviet Union:
Caller: Are you looking for the next Secretary?
Operator: Comrade, are you sick?
Caller: Yes, and very, very old!
                        -- Russian joke circa 1983

The mid eighties were crazy times for people who tracked Soviet politics. The quarter century of Brezhnev's rule came to an end when he died in 1982. He was 76 and barely coherent when giving public speeches.

Another Russian joke from that period: Brezhnev ascends the podium and starts reading "Oh-oh-oh-oh-oh-oh-oh. Oh-oh-oh-oh-oh-oh-oh." Panicking aidees whisper from behind: "Leonid Ilyich, THIS IS NOT THE SPEECH! It's us playing tic-tac-toe!" (Cyrillic 'h' is written as 'x').

What followed was a rapid succession of very old Party Secretaries who died on the job very quickly.

Brezhnev was replaced by Yuri Andropov. He lasted just over a year - assumed office in November 1982 and died in February 1984.

The next was Konstantin Chernenko. He died after one year and 1 month in office at the age of 74. Rumors say he was ill most of the time.

When he died I was in high school. The days of the funerals were declared state days of mourning and schools and workplaces were closed so people could watch the funerals on TV. A fellow student in my class wrote "REST" for that day in his daily planner. You should have seen the reaction of our home room teacher - it was so funny!

Ok, so where am I going with all this?

The presumptive Republican nominee is John McCain, who is 71 years old now, and will be 72 when/if he actually does become the president.

Our current president is barely coherent when giving public speeches.

We may be heading for a re-run...

Bizzare PC hardware problem

Yesterday one of my Media Center PCs started turning itself off. Not spontaneously, no, but every time I would copy a bunch of data from the network, or generate a lot of disk activity, the lights were going out as if it were a power outage.

I experimented with it a bit. Doing things with CPU or video did not seem to be affecting anything. I ran this batch script alongside playing an HD movie:


    @echo off
    start:
    goto start:

This caused CPU (both cores) to go to 80% utilization. Nothing. But the moment I copied the data, the power would go off.

So today I went to Fry's and bought a new motherboard, a thermal compound, and a Cooler Master Real Power Pro 750W power supply.

The power supply I bought mostly because it was unbelievably cheap - $125 - $50 rebate = $75. Deals like these do not happen very often, even at Fry's. [The deal is not on the web site by the way - boo! http://www.frys-electronics-ads.com/]

I did not actually expect that it was a power supply that was going bad because my CPU/video test was working.

However, because changing the power supply is much-much-much easier than changing the motherboard (I was really not looking forward to poring through my records trying to match Windows key to the hardware which was using it), I decided to try that first.

So I did that, and no more shutdowns. The old one was 450W, but it is not a very power-hungry system - 65W CPU, only one hard drive, and the video is a mid-range 7600GS Nvidia card. Be it as it may, the problem seems to be solved.

But why maxing out video card and the CPU was working, whereas disk/network was not - beats me. I'd love to know the answer though.

Down with atoi!

One of the interview questions that I ask is "What are the reasons one should dislike atoi?" atoi of course is an ancient C function that converts strings (a stands for ASCII) to integers:


    int atoi(char *);

The function embodies the first rule in the joke about Unix:

In the early PDP-11 days, Unix programs had the following design parameters:
Rule 1. It didn’t have to be good, or even correct, but:
Rule 2. It had to be small.
Thus the toolkit approach, and so forth. Of course, over time, computer hardware has become progressively more powerful: processors speed up, address spaces move from 16 to 32 bits, memory gets cheaper, and so forth. So Rule 2 has been relaxed.

Classic implementation of atoi is small and fast, and I bet that this is how it was implemented in original C runtime:


    int atoi(char *c) {
      int res = 0;
      while (*c >= '0' && *c <= '9'))
        res = res * 10 + *c++ - '0';
      return res;
    }

The problem is, it only works on correct input, and there's no way to know if the input is incorrect: call it with "boo" and it happily returns 0, which is the same as if it were passed with 0 input. Worse, give it a long-enough string of digits, and the return value would be incorrect because of the integer overflows.

So if you're absolutely guaranteed that the number is well-formatted, you can use atoi. But how often does this happen in real life? In most cases when parsing flags or user input the results of this function are non-deterministic.

These days atoi internal implementation is using strtol, which is a better function:


    long strtol(const char *nptr,
                char **endptr,
                int base);

With strtol it is possible to figure out which part of the input (or none) was actually converted. It also sets errno to indicate an overflow.

However, being a patch over atoi, it exhibits the same terrible interface - you CAN figure out if there are errors, but you have to bend over the backwards to do it. Here's how it should be called to actually process errors:


    errno = 0;
    char *p;
    long num = strtol(str, &p, 10);
    if (errno != 0 || *p != '\0') {
       ...error
    }

For this reason, every system that I've worked on so far had its own internal string to number conversion functions. I even wrote a couple myself :-).

What would be a production-worthy interface? I would argue that the best thing to do is to return an error explicitly, like this:


   // Returns 0 if the input contains parseable
   // number, and error number otherwise
   // Parameters:
   //   str Input string
   //   res Pointer to the resulting parsed integer
   int atoi(char *str, int *res);

Here's the C++-ish code that I use at home. It's not super-fast, (because to detect the overflow it uses division, which is a big performance no-no), but it does thorough input validation.

The advantage of it is that it's readable and easy to understand. I consider it generally acceptable when validating flags or user input - something that either is inherently slow, or happens once per program.

I would probably advise against using it in parsers though.


template<class IntType>
    bool AtoI(WCHAR *sz, IntType *out) {
  int i = 0;
  bool sign = false;
  if (sz[i] == '-') {
    ++i;
    sign = true;
  }

  int base = 10;
  if(sz[i] == '0' && tolower(sz[i+1]) == 'x') {
    i += 2;
    base = 16;
  } else if (sz[i] == '0' && sz[i+1] != '\0') {
    ++i;
    base = 8;
  }

  IntType res = 0;
  int chars = 0;
  while (sz[i] != '\0') {
    WCHAR c = tolower(sz[i++]);
    if (c >= '0' && c <= '9')
      c = c - '0';
    else if (c >= 'a' && c <= 'f')
      c = c - 'a' + 10;
    else
      return false;
    if (c >= base)
      return false;
    IntType res_sav = res;
    res *= (IntType)base;
    if (res / (IntType)base != res_sav) {
      // Overflow!
      return false;
    }
    res_sav = res;
    res += (IntType)c;
    if (res < res_sav) {
      // Overflow!
      return false;
    }
    ++chars;
  }
  if (chars == 0)
    return false;
  *out = sign ? -res : res;
  return true;
}

If you need performance and you are coding on Windows, you can use intsafe.h. It's a library that does overflow-proof data coversion and arithmetic. It is also unbelievably ugly (think names like UnsignedMultiply128).

But it does the job, correctly, and probably much faster than any home-brew approach (with the same level of correctness, that is).

Friday, February 15, 2008

Damned if you do, damned if you don't...

Hearings and Markup before the Subcommittee on Europe and the Middle East of the Committee on Foreign Affairs, House of Representatives, One Hundred First Congress, First Session, p. 66.

Ned Walker, Assistant to the Undersecretary for Middle East Affairs at the U.S. State Department
Lee Hamilton, chairman of the Subcommittee on Europe and the Middle East

Walker: The State Department definition which is included in the terrorism report annually defines it in terms of politically motivated attacks on non-combatant targets.

Hamilton: So an attack on a military unit in Israel will not be terrorism?

Walker: It does not necessarily mean that it would not have a very major impact on whatever we were proposing to do with the PLO.

Hamilton: I understand that, but it would not be terrorism.

Walker: An attack on a military target. Not according to the definition. Now wait a minute; that is not quite correct. You know, attacks can be made on military targets which clearly are terrorism. It depends on the individual circumstances.

Hamilton: Now wait a minute. I thought that you just gave me the State Department definition.

Walker: Non-combatant is the terminology, not military or civilian.

Hamilton: All right. So any attack on a non-combatant could be terrorism?

Walker: That is right.

Hamilton: And a non-combatant could include military?

Walker: Of course.

Hamilton: It certainly would include civilian, right?

Walker: Right.

Hamilton: But an attack on a military unity would not be terrorism?

Walker: It depends on the circumstances.

Hamilton: And what are those circumstances?

Walker: I do not think it will be productive to get into a description of the various terms and conditions under which we are going to define an act by the PLO as terrorism.

Regardless of what you attack, military of civilians, if it's us or our friends, you're a terrorist...

Radical Engineers

Just ran into this article in the January issue of Foreign Policy:

"Osama bin Laden studied engineering. So did lead 9/11 hijacker Mohammed Atta, 9/11 mastermind Khalid Sheikh Mohammed, and Ramzi Yousef, the architect of the 1993 World Trade Center bombing. Exceptions to the rule? Hardly. Most highprofile Islamist terrorists are, in fact, highly educated. And according to new research at Oxford University by sociologists Diego Gambetta and Steffen Hertog, most of them may be engineers.

After compiling educational biographies for nearly 300 known members of violent Islamist groups from 30 countries, Gambetta and Hertog found that the vast majority—69 percent—had attended college. Of those with clear areas of study, nearly half had gone into engineering. Across the Middle East and Southeast Asia, the share of engineers in violent Islamist groups was found to be at least nine times greater than what one might expect, given their proportion of the working male population.

It may be tempting to assume that people with engineering backgrounds are more likely to be recruited for their technical (read: bombmaking) skills. Gambetta and Hertog dismiss this claim. Instead, they argue radicalized engineers are vastly overrepresented in terrorist ranks thanks to thwarted professional ambitions and, most controversially, a unique mindset ripe for extremism."

[Emphasis mine.]

So next time your laptop is confiscated at the airport and held for a few months, you'd know why.

http://www.washingtonpost.com/wp-dyn/content/article/2008/02/06/AR2008020604763.html

Hermann Goering on "good government"

"Why, of course, the people don't want war," Goering shrugged. "Why would some poor slob on a farm want to risk his life in a war when the best that he can get out of it is to come back to his farm in one piece. Naturally, the common people don't want war; neither in Russia nor in England nor in America, nor for that matter in Germany. That is understood. But, after all, it is the leaders of the country who determine the policy and it is always a simple matter to drag the people along, whether it is a democracy or a fascist dictatorship or a Parliament or a Communist dictatorship."

"There is one difference," I pointed out. "In a democracy the people have some say in the matter through their elected representatives, and in the United States only Congress can declare wars."

"Oh, that is all well and good, but, voice or no voice, the people can always be brought to the bidding of the leaders. That is easy. All you have to do is tell them they are being attacked and denounce the pacifists for lack of patriotism and exposing the country to danger. It works the same way in any country."

Gilbert, G.M. Nuremberg Diary.

[Emphasis mine]

Thursday, February 14, 2008

Software is a resilient thing

About 10 years ago I was working on a Just-in-time compiler for a Java-like language. The runtime lived inside Microstation - a huge CAD package - millions and millions of lines of code, a lot of it written in MDL (Microstation Development Language), which was very close to C (and we were migrating it to very close to Java).

Anyway, the compiler generated an intermediate instruction set which I designed and which contained an instruction to copy a structure, so for example, you could write


    Point2d new_point = old_point;

and it would compile to one instruction.

So as I was working on the JITter, I accidentally reversed source and destination in the generated machine code, so in the example above instead of copying old_point into the new_point, it would copy the garbage from the uninitialized new_point to the old_point.

Needless to say, in a CAD application this sort of code is very, very common. There were thousands and thousands copy instructions embedded in compiled MDL code.

However, despite my bug Microstation started and even shown part of the design file - which required compiling about 2 megabytes of code (a lot for 1997! - and it executed a big part of it, too) - before it finally crashed.

It was not an isolated incident. It is not an infrequent occurence when I find bugs - on inspection - and the first thought I have is - how on earth did it ever work? Apparently one has to work really hard to produce unstable software :-)!

Just like Perl Harbor...

http://www.nytimes.com/2008/02/14/books/14dumb.htm

The author of seven other books, she was a fellow at the library when she first got the idea for this book back in 2001, on 9/11.

Walking home to her Upper East Side apartment, she said, overwhelmed and confused, she stopped at a bar. As she sipped her bloody mary, she quietly listened to two men, neatly dressed in suits. For a second she thought they were going to compare that day’s horrifying attack to the Japanese bombing in 1941 that blew America into World War II:

“This is just like Pearl Harbor,” one of the men said.

The other asked, “What is Pearl Harbor?”

“That was when the Vietnamese dropped bombs in a harbor, and it started the Vietnam War,” the first man replied.

Wednesday, February 13, 2008

Terrorism is good as long as it's doing one's own bidding...

From NYT, on the article about car-bombing of a senior Hezbollah commander:

"Gideon Ezra, a minister from the governing Kadima Party in Israel and former deputy chief of the Shin Bet internal intelligence agency, told Israel Radio on Wednesday that many countries had an interest in killing Mr. Mugniyah, but that “Israel too was hurt by him, more than other countries in recent years.”

Mr. Ezra said, “Of course I don’t know who killed him, but whoever did should be congratulated.”"

http://www.nytimes.com/2008/02/14/world/middleeast/14syria.html?hp

Sounds familiar? Osama Bin Ladin and his men were "freedom fighters" when they fought Soviet occupation in Afghanistan. The moment they turned their attention to the US, they became... the terrorists, too.

By the way, the terrorism is defined by State Department thusly: "Premeditated, politically motivated violence perpetrated against noncombatant* targets by subnational groups or clandestine agents, usually intended to influence an audience."

The reason "subnational" is here because the moment you start applying this definition to the nation-state, any war action can be classified as "terrorism". For example, bombing of Hiroshima or fire-bombing of Tokyo or Leipzig. They all were designed to make a political point ("influence" the people of the other countries to stop fighting). That would exclude war as a means of resolving international conflicts, something that the US does not want to do.

Physician, heal thyself

The adventure with MSN trying to cancel the phone account (http://1-800-magic.blogspot.com/2008/02/msn-support.html) got me thinking - so here we have a company with great online aspirations. It has spent innumerable billions of dollars there, and is prepared to spend tens of billions more.

Yet the simplest things in online customer account management are broken (as in obvious bugs disabling basic functionality), and you have to use the phone (which works) and talk to live person to get simplest things resolved. Contrast it with Amazon, for example, where everything is automated - I have recently returned a $400 video card and it took 5 minutes on their web site.

How can one hope to succeed in online services business if the most basic service underlying them all - the customer service - always requires human intervention? Incidentally, the customer service is probably the first online service that ever existed...

The same story happens in search - Microsoft is trying to compete with Google, but while I worked there, I was never successful trying to search for documents on its own internal network. That search is so broken that it is legendary among Microsoft employees - the most often cited example of a broken IT service at Microsoft. How can you lay a claim to organizing the world's information if you cannot organize your own?

The problems seem to be obvious, but somehow completely escape the leadership. I've seen this before.

8 years ago we have released the first "Palm-size PC" OS (the old one with the "Start" button - the pictures are here: http://www.pdagold.com/hardware/list.asp?c=12). The software was terrible - slow, unstable, and visually unappealing. Palm Pilot, which was the dominant PDA at the time, was polished and nimble. So then I read this interview with Bill Gates where he says that one of the things over which he's "loosing sleep at night" is figuring out why customers prefer Palms, and the first thought that just torn through my mind was - "Well, Bill, have you used it? It would be immediately obvious - no need to lose sleep!"

Eat your own damn dogfood!

Tuesday, February 12, 2008

MSN "Support"

My "smart watch" was out of commission for the last 6 months since its wristband broke. I got it at Amazon for $60 or something like that - about the same price I usually pay for watches.

Since not a single watch I owned could survive my constant fiddling for more than a year, I figured this one would be a safe bet - a smarter thing for the same price. The device actually survived for about 6 months, and was lying in the drawer ever since.

So today I got a reminder that I will be billed $40 for the next year some time in March, and I tried to cancel it. Here's how it works.

The email tells you to go to http://billing.microsoft.com, select the service, and click the cancel. When I did, the serivce actually did have the cancel link:

When this link is clicked, however, it does not let you cancel. Another web page opens to the side and offers another link:

This is where it gets interesting. When I click on MSN support, I get this page:

As you can see, the MSN direct is in the list, but it's not clickable. Why? Hovering over the question mark button explains why:

Basically, the account that I have does not grant me the privilege of cancelling it. Googling "cancel MSN direct" produces the page http://chris.pirillo.com/2006/04/27/canceling-msn-direct-is-a-pita/ with the number: 866-658-7032.

Calling this number first tells me that they are experiencing high volume of calls, and then that the office is closed, and I should call between 8 am and 12 am Eastern time.

Well, I'll try tomorrow, but next time I sign up for MSN paid services, I will think twice. Or three times. Actually, scratch that. I just won't do it again...

UPDATE: Called in and after ~10 minutes (5 on hold + 5 spelling my name and email address) got it cancelled. I've spent more than that yesterday just staring at the web site trying to figure out what's wrong. Who could have guessed that the phone works better with a computer company than the web site?

1-800-MAGIC