Wednesday, February 20, 2008

Backing up software CDs - programmatically

If you're like me, you are drowning in CDs. Even before I signed up for MSDN, the sheer number of them had become so unmanageable that I gave up trying to use CDs and DVDs directly - when I buy software, I immediately copy it to the server and put the media in the Big Box(TM) in the basement, in case the piracy police come and demand evidence that I actually do own it...

This works very well for most software, but sometimes the CDs are bootable. Like Windows, for example. So just copying the files is not enough. Luckily, there are tons of programs that will burn an image for you, and with the cost of CD/DVD blanks approaching zero, it's easier to keep CD images on the server and burn them as required than to keep physical media handy.

There are, I am sure, tons of programs that will gladly rip a CD into an image file. But we're developers, right? And if a developer can write a tool, (s)he does. Luckily, the Windows storage subsystem makes it extremely easy. This post is going to guide you through the process.

So let's start!

#define UNICODE 1
#define _UNICODE 1

#include <windows.h>
#include <stdio.h>
#include <ctype.h>
#include <wctype.h>
#include <strsafe.h>

#define ARRLEN(c) (sizeof(c)/sizeof(c[0]))

The essence: the same CreateFile function that we know and love can do much more than just opening plain vanilla files - it can open any device as well.

Every Windows physical disk and volume (partition) is backed by a block device. All we need to know is its name, and then we can address it just like any file - and yes, read from it and write to it using ReadFile and WriteFile, completely bypassing the file system. Except of course if the device is a CD or DVD - then we cannot WriteFile to it (duh!).

Let's figure out what the name is. Suppose the user can pass the number of a physical disk (the disk holding C: is usually 0, the next one is often 1, and so on - you can look the numbers up by going to My Computer->Manage->Disk Management), a drive letter (D:), or the name of the folder where the CD-ROM is mounted (C:\CDROM).

I use the AtoI conversion from here: http://1-800-magic.blogspot.com/2008/02/down-with-atoi.html.

bool GetDeviceName(WCHAR *szDevName,
                   size_t cchDevName,
                   const WCHAR *szInput) {
  unsigned int disk_no = 0;
  if (AtoI<unsigned int>(szInput, &disk_no)) {
    // It's a physical disk
    StringCchPrintfW(szDevName,
                     cchDevName,
                     L"\\\\.\\PHYSICALDRIVE%u",
                     disk_no);
    return true;
  }

  // towupper rather than toupper: the input
  // is a wide character
  WCHAR drive_letter =
      (WCHAR)towupper(szInput[0]);
  if (drive_letter >= L'A' &&
      drive_letter <= L'Z' &&
      szInput[1] == L':' &&
      szInput[2] == L'\0') {
    // It's a drive letter
    StringCchPrintfW(szDevName,
                     cchDevName,
                     L"\\\\.\\%c:",
                     drive_letter);
    return true;
  }

  WCHAR sz[_MAX_PATH];
  const WCHAR *p = wcsrchr(szInput, L'\\');
  if (!p || p[1] != L'\0') {
    // Mount point needs to end in backslash
    StringCchCopyW(sz, ARRLEN(sz), szInput);
    StringCchCatW(sz, ARRLEN(sz), L"\\");
    szInput = sz;
  }

  if (!GetVolumeNameForVolumeMountPointW(
          szInput, szDevName,
          (DWORD)cchDevName))
    return false;

  // This returns a name ending with '\'
  // but CreateFile does not take it, so
  // we need to get rid of it.
  WCHAR *q = wcsrchr(szDevName, L'\\');
  if (q && q[1] == L'\0')
    *q = L'\0';

  return true;
}
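
The AtoI helper lives in the linked post and is not reproduced here. If you just want GetDeviceName to compile, a minimal stand-in might look like the sketch below. The signature is inferred from the call above, and the behavior (unsigned decimal digits only, failure on overflow) is an assumption - the real helper may well differ.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for the AtoI helper used by GetDeviceName.
// Parses an unsigned decimal number. Returns false on empty input,
// on any non-digit character, or on overflow; *out is only written
// on success.
template <typename T>
bool AtoI(const wchar_t *s, T *out) {
  static_assert(std::numeric_limits<T>::is_integer &&
                    !std::numeric_limits<T>::is_signed,
                "this sketch handles unsigned integer types only");
  assert(s != nullptr && out != nullptr);
  if (*s == L'\0')
    return false;

  const T kMax = std::numeric_limits<T>::max();
  T value = 0;
  for (; *s != L'\0'; ++s) {
    if (*s < L'0' || *s > L'9')
      return false;  // "D:" or "C:\CDROM" end up here
    const T digit = static_cast<T>(*s - L'0');
    if (value > (kMax - digit) / 10)
      return false;  // value * 10 + digit would overflow
    value = static_cast<T>(value * 10 + digit);
  }
  *out = value;
  return true;
}
```

Rejecting anything that is not purely digits matters here: it is what lets "D:" and "C:\CDROM" fall through to the drive letter and mount point branches.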

Opening the block device is slightly tricky: if you just call CreateFile on it, the file system will check our reads against what it considers the boundaries of the volume. So it will disallow access to hidden areas on the device which the file system would prefer to keep for itself. We don't want that - we want the whole thing. Hence the scary-looking DeviceIoControl:

HANDLE OpenBlockDevice(const WCHAR *szDevName,
                       bool fWrite) {
  HANDLE h = CreateFile(szDevName,
      fWrite ? GENERIC_WRITE : GENERIC_READ,
      // Opening a volume requires
      // FILE_SHARE_WRITE in the share mode
      FILE_SHARE_READ | FILE_SHARE_WRITE,
      NULL, OPEN_EXISTING,
      FILE_FLAG_NO_BUFFERING |
      (fWrite ? FILE_FLAG_WRITE_THROUGH : 0),
      NULL);
  if (h != INVALID_HANDLE_VALUE) {
    // Disable volume boundary checks by FS.
    // Best effort: the result is ignored.
    DWORD dwBytes = 0;
    DeviceIoControl(h,
                    FSCTL_ALLOW_EXTENDED_DASD_IO,
                    NULL,
                    0,
                    NULL,
                    0,
                    &dwBytes,
                    NULL);
  }
  return h;
}

Let's get the UI out of the way...

int wmain(int argc, WCHAR **argv) {
  if (argc < 4) {
    wprintf(L"Usage: %s {disk|volume} cmd "
            L"args\n", argv[0]);
    wprintf(L"Where disk is physical number "
            L"of the drive, e. g. 0\n");
    wprintf(L"Volume is a drive letter or a "
            L"full path to \n");
    wprintf(L"the mount point, e. g. a: or "
            L"c:\\mount\\vol\n");
    wprintf(L"Commands:\n");
    wprintf(L"readto filename - read the "
            L"contents of the device into "
            L"a file\n");
    return 0;
  }

If a removable disk is not present, the system will throw up an ugly dialog box. This nice little trick disables it so we can stay command line, and even run this from a service...

  UINT uiErrModeSav = SetErrorMode(
      SEM_FAILCRITICALERRORS);

Translate the user input into device name and open it:

  WCHAR szDevName[_MAX_PATH];
  if (!GetDeviceName(szDevName,
                     ARRLEN(szDevName),
                     argv[1])) {
    wprintf(L"Could not recognize %s\n", argv[1]);
    SetErrorMode(uiErrModeSav);
    return 1;
  }

  HANDLE hDev = OpenBlockDevice(szDevName, false);
  if (hDev == INVALID_HANDLE_VALUE) {
    DWORD dwErr = GetLastError();
    wprintf(L"Could not open %s, "
            L"error %d (0x%08x)\n",
            argv[1], dwErr, dwErr);
    SetErrorMode(uiErrModeSav);
    return 2;
  }

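The very last tricky thing is the memory allocation. Because the device is opened with FILE_FLAG_NO_BUFFERING, the buffer address and the size of every read must be multiples of the device's sector size (2048 bytes for a data CD). The easiest way to satisfy both is a 64K buffer from VirtualAlloc, which hands out memory on a 64K boundary: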

  const int kBufferSize = 64 * 1024;
  void *buffer = VirtualAlloc(
      NULL,
      kBufferSize,
      MEM_COMMIT | MEM_RESERVE,
      PAGE_READWRITE);
  if (!buffer) {
    DWORD dwErr = GetLastError();
    wprintf (L"Could not reserve memory, "
             L"error %d (0x%08x)\n", dwErr, dwErr);
    CloseHandle(hDev);
    SetErrorMode(uiErrModeSav);
    return 3;
  }
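
The alignment rule behind this buffer is easy to check in a few lines of portable C++. This is only a sketch: the helper names are made up for illustration, and the sector sizes you would feed it (2048 for a data CD, 512 for many hard disks) are typical values, not something the code queries from the device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if p sits on an `alignment`-byte boundary.
// `alignment` must be a power of two.
bool IsAligned(const void *p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Rounds cb up to the next multiple of the sector size.
std::size_t RoundUpToSector(std::size_t cb, std::size_t cbSector) {
  assert(cbSector != 0);
  return (cb + cbSector - 1) / cbSector * cbSector;
}

// True if reading cb bytes into p satisfies the unbuffered IO rules
// for a device with the given (power of two) sector size: the buffer
// address and the transfer size are both sector multiples.
bool IsValidUnbufferedTransfer(const void *p, std::size_t cb,
                               std::size_t cbSector) {
  return IsAligned(p, cbSector) && cb % cbSector == 0;
}
```

A 64K buffer on a 64K boundary passes this check for every power-of-two sector size up to 64K, which is why the tool never needs to ask the device what its sector size is.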

Finally, let's copy some data!

  DWORD dwTickStart = GetTickCount();
  unsigned __int64 ui64BytesRead = 0;
  DWORD dwErr = ERROR_SUCCESS;
  if (_wcsicmp(argv[2], L"readto") == 0) {
    HANDLE hFile = CreateFile(
        argv[3], GENERIC_WRITE, 0, NULL,
        CREATE_ALWAYS, 0, NULL);
    if (hFile != INVALID_HANDLE_VALUE) {
      wprintf(L"%s --> %s:\n", szDevName, argv[3]);
      for ( ; ; ) {
        DWORD dwRead = 0;
        if (!ReadFile(hDev, buffer,
                      kBufferSize, &dwRead, NULL)) {
          dwErr = GetLastError();
          wprintf(L"Read error %d (0x%08x)\n",
                  dwErr, dwErr);
          break;
        }

        if (dwRead == 0)
          break;

        DWORD dwWrit = 0;
        if (!WriteFile(hFile, buffer,
                       dwRead, &dwWrit, NULL)) {
          dwErr = GetLastError();
        } else if (dwWrit != dwRead) {
          // Short write with no error code:
          // still a failure
          dwErr = ERROR_WRITE_FAULT;
        }
        if (dwErr != ERROR_SUCCESS) {
          wprintf(L"Write error %d (0x%08x)\n",
                  dwErr, dwErr);
          break;
        }

        ui64BytesRead += dwWrit;
        wprintf(L"%I64u\r", ui64BytesRead);
      }

      CloseHandle(hFile);
      if (dwErr != ERROR_SUCCESS)
        DeleteFile(argv[3]);
    } else {
      dwErr = GetLastError();
      wprintf (L"Could not open %s, error %d "
               L"(0x%08x)\n", argv[3],
               dwErr, dwErr);
    }
  } else {
    wprintf (L"Command %s not recognized.\n",
             argv[2]);
    dwErr = ERROR_NOT_SUPPORTED;
  }

  if (dwErr == ERROR_SUCCESS &&
      ui64BytesRead != 0) {
    wprintf(L"Transferred %I64u bytes in %d "
            L"seconds.\n", ui64BytesRead,
            (GetTickCount() - dwTickStart) / 1000);
  }

As you can see, at this point it is not any different from just copying a file. A little bit of cleanup...

  VirtualFree(buffer, 0, MEM_RELEASE);
  CloseHandle(hDev);
  SetErrorMode(uiErrModeSav);

  return dwErr == ERROR_SUCCESS ? 0 : 4;
}

...and we're done. Just below 200 lines of code is all it takes.
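
To see how little of the copy loop is device-specific, here is roughly the same loop written against portable stdio streams. It is a sketch for comparison only: CopyStream and its parameters are made-up names, and it does none of the unbuffered IO setup the real tool needs.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Copies everything from `in` to `out` in chunks of cbChunk bytes.
// Returns true on success and stores the number of bytes copied in
// *pcbCopied; returns false on a read or write error.
bool CopyStream(std::FILE *in, std::FILE *out, std::size_t cbChunk,
                unsigned long long *pcbCopied) {
  assert(in && out && pcbCopied && cbChunk > 0);
  std::vector<unsigned char> buffer(cbChunk);
  unsigned long long cbTotal = 0;
  for (;;) {
    const std::size_t cbRead =
        std::fread(buffer.data(), 1, buffer.size(), in);
    if (cbRead == 0) {
      if (std::ferror(in))
        return false;  // read error
      break;           // end of input
    }
    if (std::fwrite(buffer.data(), 1, cbRead, out) != cbRead)
      return false;    // short write counts as an error
    cbTotal += cbRead;
  }
  *pcbCopied = cbTotal;
  return true;
}
```

The structure is the same as in wmain: read a chunk, stop on zero bytes, write exactly what was read, and treat a short write as a failure.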

8 comments:

Eldar said...

Cool-e-O! Thanks! Really like the article. Funny, I did not know I can open a disk as a file. As an old hacker I always used some low level API... Thank you, really enjoyed the article!

Илья Казначеев said...

That's what people do when they haven't got dd. Perhaps you can have some of your money back.

Also: WIN32 api is sooo long and unwieldy!

Alex Efros said...

dd? You don't need special really powerful tools like dd just to create CD/DVD image, you can use 'cp' or even 'cat':

$ cp /dev/dvd /tmp/dvd.iso
$ file /tmp/dvd.iso
/tmp/dvd.iso: ISO 9660 CD-ROM filesystem data 'GOTHIC3'

Everything should really be a file, not just be 'like a file after 200 lines of code'. BTW, even in *NIX not everything is really a file, but in Inferno it finally is! :)

Anonymous said...

Great post! Cool and fun :-) Do not forget one small thing: CD ISO != CD. Not with this code anyway. It is not that simple. Try to take almost any game CD and make an ISO this way. Then burn it and try to play. Try NERO. It will not do it either. Also, good luck with most Japanese music CDs ;-) Most likely you will not even get an ISO. But, as I said, this is a terrific post and the code should work with non-protected media. Btw, try to run it on venerable VISTA :-) to enjoy its UAC one more time. So, if you hit some issue and do not have enough free time and a team to "productize" these great 200 lines, you may need a shot of good alcohol 120% from this barrel:
http://www.alcohol-soft.com/
They are not charging 44$ for nothing :-)

Sergey Solyanik said...

Ilya: you're right about unwieldy API - that's Cutler's roots in DEC. I have a post on it here: http://1-800-magic.blogspot.com/2007/12/unix-vs-windows.html

Dzembu: yes, this will not rip audio CDs at all. They are not block devices.

Question to Unix people: can you actually rip an audio CD by copying the device to a file?

Alex Efros said...

No, it's not possible to rip an audio CD by copying the device to a file in *NIX.

Илья Казначеев said...

No, it is not possible even theoretically.

That's because reading redbook audio is nondeterministic: audio cds contain far less metadata than data cds, for example, they don't store current position along the track.
So dumping audio CDs at speeds other than 1x requires a lot of processing - basically, you have to check whether a skip or overlay occurred. That's called, I believe, jitter correction.
Things get even funnier when we've got a scratch.

You can get some insight into this from man cdparanoia - look at how many kinds of correctable errors it lists and how complex they sound.

DzembuGaijin said...

Yeah, you are right of course, but even data ("software") CD protection is a fascinating subject. :-)

http://en.wikipedia.org/wiki/CD/DVD_copy_protection#Commercial_CD_protections

I think it is very common for all kinds of commercial products: all Microsoft PC games use it, for example - try the Halo CD that you have.

I guess that internet "activation" ( node locking) makes it less relevant, but ... it should be still very popular.