Saturday, November 26, 2011

"War on militancy"

The attack comes as relations between the United States and Pakistan — its ally in the war on militancy — are already strained following the killing of al-Qaida leader Osama bin Laden by U.S. special forces in a secret raid on the Pakistani garrison town of Abbottabad in May.

Our ministry of self-censorship is clearly still recovering from a Thanksgiving dinner today...

Economics, as understood by Alan Greenspan

State secrets

"By the way, during seven of the eight George W. Bush years, the IRS report on the top 400 taxpayers was labeled a state secret, a policy that the Obama administration overturned almost instantly after his inauguration."

Tuesday, November 22, 2011

Hiring at elite companies

A friend sent me a pointer to this blog post:
The original paper is about hiring practices in financial institutions, but the fact that the best companies prefer applicants from the best universities (much in the same way the top grad schools mostly accept people from the top undergrad schools) is a recurring complaint in the software industry as well.

In all cases the preferential treatment is driven by selectivity of the school, not so much by the quality of skills developed there: the idea is that since colleges are highly selective, you can apply them as the first filter to the applicant pool. The system then continues - the resumes from the candidates working at the top companies draw more attention than the resumes from lesser brands, and so it goes.

Which means that if you missed out on a good school early on, you are kinda screwed; while upward mobility is not impossible, it becomes much, much harder.

A long time ago I experienced something similar myself. When I came to the United States, I took the GRE and applied to several top graduate programs in physics. My GRE scores were far, far above the published average scores for all of these programs except MIT, where they were closer to (but still above) the average.

In the Soviet Union, where I was from, test results were the only allowed criterion for admission. There was no concept of reference letters, legacy status, or other out-of-band information that would feed into the decision-making process. In fact, anything resembling legacy considerations was considered corruption and would earn all involved parties a one-way, all-expenses-paid trip to Siberia's finest labor camps (corruption did of course exist, and when discovered it was dealt with harshly; one of the people on my alma mater's admission committee ended up in jail for bartering admission spots for favors from other well-connected people).

Imagine my surprise when I got the rejection letter from UPenn - UPenn! - where the average subject GRE score was in the 600s! At the time I did not know what a "safety school" was, but if I had, I would have thought of UPenn as one. I only applied to UPenn because I lived in Philadelphia at the time, all our relatives lived there, and I needed to show that I was not dismissing the idea of staying "close to the family" outright.

Completely baffled, I wrote them a letter, pointing out the huge discrepancy in the scores and asking them to explain the decision. Soon I befriended another Russian who had emigrated a year earlier and who filled me in on admission practices in the US. According to him, most of the people in the theoretical physics department at Princeton did not even take the GRE - which was listed as a requirement.

Instead, they were being accepted based strictly on references from their undergrad professors. And because people from the top grad schools knew people from the top undergrad schools, their references were trusted far more.

I, on the other hand, was completely unaware of the relative importance of the references, so I got them from random people who were unknown to the admissions committees, and so my application was roundly rejected - the letters from all the other schools arrived a bit later. (UPenn later reversed its decision and accepted me - they must have read my letter and decided that a guy so naive would end up living under a bridge were they not to save him.)

But I digress.

This system of course leads to a very high rate of false negatives - essentially, one strike and you are out. Also, not surprisingly, it results in some amount of false positives as well - once inside, you have a larger than normal share of opportunities to recover from previous failures - regardless of your skills, that degree from Harvard will keep opening doors for you years from now.

The system is obviously sub-optimal, so over the years we tried our best to fix it. Microsoft pioneered the concept of a coding interview - where people are forced to write code on the whiteboard as part of the interview process, a system that is now standard almost everywhere.

This is infinitely better than hiring people based strictly on their resume and some feel-good conversation during the interview, but it is not entirely fool-proof.

First, it emphasizes a set of skills - specifically, algorithm design - that are not all that frequently required in the actual jobs that people do. Ask yourself, how many times have you had to write quick sort, a hash table, an AVL tree, or a read-write lock implementation at work (and if you did, I would really be interested to know why :-)). The vast majority of people - especially those who work on maintaining very large code bases - like Windows - simply do not get the opportunity to exercise their algorithmic muscle very often.

Yet these are all fair questions during the interview. Don't get me wrong - I BELIEVE that these are fair questions, and I ask them myself: even if you don't need to code hashtables every day, it still helps to know how they work.

For example, I ran into a situation at a hiring committee at Google where an interviewer was unhappy that the interviewee could not find an O(1) solution to a problem that - as far as I was concerned - did not have one. When I asked how exactly one could solve it in O(1) time, the person said - why, by using a hash, of course! Put a person who is convinced that a hash always exhibits O(1) performance on an OS component, and you are in for a number of interesting and intractable performance bugs down the road.

So knowing algorithms and data structures well is very important, but our jobs are not preparing us for that. Which is why I found that hiring a person from college - especially an elite college - is often easier than hiring a person from the industry - they have not yet had time to forget the theory.
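
To make the hash anecdote concrete, here is a small Python sketch (Python is used only for brevity; `CountingKey` and `comparisons_to_build` are names made up for this illustration). It counts key comparisons while building a dict, once with an ordinary hash and once with a degenerate hash that sends every key down the same probe chain. With the degenerate hash each insert has to walk past all the keys inserted before it, so the total work grows roughly quadratically instead of linearly.

```python
class CountingKey:
    """Key with a switchable hash that counts how often it is compared."""
    comparisons = 0

    def __init__(self, value, constant_hash):
        self.value = value
        self.constant_hash = constant_hash

    def __hash__(self):
        # A constant hash makes every key collide with every other key.
        return 0 if self.constant_hash else hash(self.value)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return self.value == other.value


def comparisons_to_build(n, constant_hash):
    """Count the key comparisons needed to insert n distinct keys into a dict."""
    CountingKey.comparisons = 0
    table = {}
    for i in range(n):
        table[CountingKey(i, constant_hash)] = i
    return CountingKey.comparisons


if __name__ == "__main__":
    print("well-spread hash:", comparisons_to_build(1000, constant_hash=False))
    print("constant hash:   ", comparisons_to_build(1000, constant_hash=True))
```

The same reasoning applies to any hash table, not just Python's dict: O(1) is the expected cost under a well-behaved hash function and input, not a guarantee.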

Also, knowing the things that are testable in the coding interview - algorithms, design practices, etc - is necessary, but not sufficient for engineering stardom. I've seen - and hired! - a number of people who were fantastic during the interview, but were very ineffective when they needed to deal with a real engineering problem - such as fixing a complex bug in a large system in a way that does not break existing functionality. To this day I have no idea how to do a practical examination of this skill in an interview setting!

Meanwhile, the candidate pool is huge, and the resumes are mostly BS. I once did an experiment. We needed a contractor for web development - AJAX, Javascript, things like that. So I gave a standardized test to every person, among the resumes the contract agencies sent me, who claimed to be an expert in Web development - about 20 people in all. The test was not very advanced. The questions were "What is a closure in Javascript?", "What is the difference between = 'bar'; and = 'bar';", etc - introductory stuff. The best person on the test scored 25%. The average was below 10%.

Therein lies the cornerstone problem of the software industry, which I will summarize thus:

1) Success of a software company is dependent on hiring great people. A star software engineer is an order of magnitude more productive than an average engineer, but costs only marginally more. Thus there is a huge incentive for companies to hire the best of the best.

2) Many qualities of the best of the best engineers are very hard to measure directly. We can test problem solving, knowledge, and design skills during the interview, but we are forced to rely on a candidate's word when it comes to equally important qualifications such as ability to face ambiguity, passion, leadership, and working with others.

3) There is a huge pool of candidates. The resume databases at the top companies have literally millions of entries. Most of the resumes are "enhanced" for "quality".

4) The vast majority of the candidates are not... well, they are not in the top 10% :-). They may be good enough to perform a lot of the jobs adequately, but they are nowhere near the productivity of the best of the best (see #1), and once the job for which they were hired is done, they may not be easily transferable to a different one.

5) Firing someone from a big company requires a lot of work and is a very lengthy process, so the cost of a false positive is very high - both to the company's bottom line and to the morale of the team.

So... what do we do? How do we even screen the resumes - in a fully meritocratic system that pays no attention to selectivity of the previous places of work or study - given the sheer amount of BS in an average sample?

The existing system, which relies on both the selectivity of the environment and the interview performance, works in the sense that everything else we tried was worse.

After all, at work I do not ever see anything near the stuff people reported on this thread. As a matter of fact, even things worthy of the Daily WTF almost never come up - at least as long as you are staying inside MSFT product development teams.

This does not mean that the system cannot be improved further - but how? Your ideas are welcome in the comments.

Monday, November 21, 2011

How to clean up the Internet (via Reddit)

"She tells me one day her husband is a really great guy because he spends his free time helping to "clean up the internet."
I ask her what she means and she told me she found a bunch of porn in husbands web browser history. He informed her that he goes to porn sites to download the porn off of the internet servers onto his computer so that he can delete it. Apparently there's a lot of porn on the internet, but he was trying to do what he could to remove as much of it as possible - for the children and all...

She actually believes that he is doing this and uses it as a bragging point to show what a great guy her husband is in conversations."

Friday, November 18, 2011

How to sign device drivers with a test certificate

This: has a long, unwieldy explanation.

Here is a much simpler, step-by-step protocol:
1) Run the following from an elevated CMD window (RunAs Administrator):
    bcdedit /set testsigning on
    bcdedit /set nointegritychecks on
2) Reboot
3) Make a certificate. From a DDK command line window, type:
    makecert -r -pe -ss MyTestCertStore -n "CN=MyTestCert" mytestcert.cer
4) From an elevated CMD window
    certmgr -add mytestcert.cer -s -r localmachine root
    certmgr -add mytestcert.cer -s -r localmachine trustedpublisher
5) From certmgr window that just opened in step one or two (or type certmgr):
  a) Right click on Trusted Root Certification Authorities -> All Tasks -> Import
  b) Navigate to the cert file you have just created in step (3) (mytestcert.cer).
  c) Say "yes"
6) To sign the driver:
    signtool sign /s MyTestCertStore /n MyTestCert yourdrivername.sys

Why can't our documentation people produce something similar???
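
For repeated use, the command-line part of the protocol can be collected in a small script. The following is only a sketch in Python, under stated assumptions: `signing_commands` and `run_all` are hypothetical helper names, the script covers steps 3, 4 and 6 only (the bcdedit changes, the reboot, and the GUI import in step 5 still have to be done by hand), and it assumes that signtool's /s option should name the same store that makecert's -ss option wrote the certificate to.

```python
import subprocess


def signing_commands(driver, cert_name="MyTestCert",
                     store="MyTestCertStore", cert_file="mytestcert.cer"):
    """Return the commands for steps 3, 4 and 6 as argument lists."""
    return [
        # Step 3: create a self-signed certificate in the named store.
        ["makecert", "-r", "-pe", "-ss", store, "-n", "CN=" + cert_name, cert_file],
        # Step 4: trust the certificate as a root and as a publisher.
        ["certmgr", "-add", cert_file, "-s", "-r", "localmachine", "root"],
        ["certmgr", "-add", cert_file, "-s", "-r", "localmachine", "trustedpublisher"],
        # Step 6: sign the driver with the certificate from that same store.
        ["signtool", "sign", "/s", store, "/n", cert_name, driver],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print rather than run: the real commands need an elevated DDK prompt.
    for command in signing_commands("yourdrivername.sys"):
        print(" ".join(command))
```

Calling `run_all(signing_commands("yourdrivername.sys"))` from an elevated DDK command window would execute the steps; printing them first is a cheap way to check the store and certificate names before anything is changed.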

Tuesday, November 15, 2011

The new Free Speech protocol

Citizen! It has recently come to our attention that you wish to exercise your first amendment freedoms. In order to ensure compliance with Free Speech Safety standards please obey the following rules to ensure that your protest is conducted properly.
  • You can exercise your rights in a designated Free Speech Zone. Anyone who is caught outside specified zones participating in a free speech action will be beaten and jailed.
  • You must apply for a permit to designate a Free Speech Zone. To apply for a permit please contact the Board of Permitting and Public Safety. It is expected that you will have your sanitation, safety, education, environmental impact and concessions permits before applying. Anyone found participating in a free speech action without a permit will be beaten and jailed.
  • Free Speech Zones operate between the hours of 9am - 5pm, anyone caught participating in a free speech action outside of those times will be beaten and jailed.
  • All citizens participating in free speech actions must be properly dressed to identify themselves to authorities, corporate representatives and interested third parties. These uniforms can be purchased at several Free Speech Distribution Authorities located throughout your community. Anyone caught participating in a free speech action without proper attire will be beaten and jailed.
  • No items will be allowed to be carried into the Free Speech Zone. Anything that is not attached directly to your person or is out of compliance with the standard Free Speech Zone attire protocol will be confiscated before entering the Free Speech Zone. Those caught with foreign items are subject to beatings and possible incarceration at the officer's discretion. Any property confiscated will be promptly destroyed.
The first amendment is important to us, and we hope by obeying these simple rules you can make our community a safer and happier place.

Good luck with your free speech action!

Saturday, November 12, 2011

How I Stopped Worrying and Learned to Love the OWS Protests

"I originally was very uncomfortable with the way the protesters were focusing on the NYPD as symbols of the system. After all, I thought, these are just working-class guys from the Bronx and Staten Island who have never seen the inside of a Wall Street investment firm, much less had anything to do with the corruption of our financial system.

But I was wrong. The police in their own way are symbols of the problem. All over the country, thousands of armed cops have been deployed to stand around and surveil and even assault the polite crowds of Occupy protesters. This deployment of law-enforcement resources already dwarfs the amount of money and manpower that the government "committed" to fighting crime and corruption during the financial crisis. One OWS protester steps in the wrong place, and she immediately has police roping her off like wayward cattle. But in the skyscrapers above the protests, anything goes."

Wednesday, November 9, 2011

Puzzling heap performance in CRT under Visual Studio

So I'm running this app which does a reasonable amount of allocations using standard CRT functions. It's compiled for release, but I was running it under Visual Studio. Absolutely terrible memory allocator performance, especially in free: 89 seconds to free just under a million structures (the app uses just two sizes - around 32000 2056-byte structures and about 900000 36-byte structures).

Generating: 1000000
Adding: 951 ms
Port: 10000, total ip entries: 994937, total memory used: 3673124
    Entries at level 1: 128 (blanket: 0)
    Entries at level 2: 32768 (blanket: 0)
    Entries at level 3: 885384 (blanket: 3975)
Testing the ip ranges... 0
Testing ip ranges: 27378 ms
Total ips allowed across all ports: 2012537
Removing: 71355 ms
Port: 10000, total ip entries: 0, total memory used: 0
    Entries at level 1: 0 (blanket: 0)
    Entries at level 2: 0 (blanket: 0)
    Entries at level 3: 0 (blanket: 0)

Fine, I replaced the allocator with my own, using fixed memory blocks (described here). The object removal time promptly dropped to just above 200ms. Ok. Then, on a hunch, I ran this - with the standard CRT allocator - from the command line:

Generating: 1000000
Adding: 281 ms
Port: 10000, total ip entries: 994937, total memory used: 3673124
    Entries at level 1: 128 (blanket: 0)
    Entries at level 2: 32768 (blanket: 0)
    Entries at level 3: 885384 (blanket: 3975)
Testing the ip ranges... 0
Testing ip ranges: 27831 ms
Total ips allowed across all ports: 2012537
Removing: 249 ms
Port: 10000, total ip entries: 0, total memory used: 0
    Entries at level 1: 0 (blanket: 0)
    Entries at level 2: 0 (blanket: 0)
    Entries at level 3: 0 (blanket: 0)

Come on, Visual Studio guys. Release should mean RELEASE - not a debug heap!

Interestingly, I ran it under the profiler after this, and the profiler does the right thing - it uses the fast heap. Otherwise the results would be really, really screwed. It's a good thing - if you happen to use the profiler, or at least run the app outside the environment, before pursuing a problem that does not even exist!
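
As for the fixed-block allocator mentioned above: the actual implementation is not reproduced in this post, so what follows is only a sketch of the general idea, modeled in Python for brevity (a real one would be C or C++ over raw memory). `FixedBlockPool` and its methods are hypothetical names. The point it illustrates is that with a single block size, both allocate and free reduce to pushing or popping one entry on a free list, with no searching and no coalescing.

```python
class FixedBlockPool:
    """Model of a fixed-size block allocator over one preallocated buffer."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        self.buffer = bytearray(block_size * block_count)
        # Free list: next_free[i] is the block that follows block i on the list.
        self.next_free = list(range(1, block_count)) + [-1]
        self.head = 0 if block_count > 0 else -1
        self.in_use = 0

    def allocate(self):
        """Pop a block off the free list in O(1) and return its index."""
        if self.head == -1:
            raise MemoryError("pool exhausted")
        block = self.head
        self.head = self.next_free[block]
        self.in_use += 1
        return block

    def free(self, block):
        """Push a block back onto the free list in O(1)."""
        self.next_free[block] = self.head
        self.head = block
        self.in_use -= 1

    def view(self, block):
        """Return a writable view of the bytes belonging to a block."""
        start = block * self.block_size
        return memoryview(self.buffer)[start:start + self.block_size]


if __name__ == "__main__":
    pool = FixedBlockPool(block_size=36, block_count=1000)
    blocks = [pool.allocate() for _ in range(1000)]
    for block in blocks:
        pool.free(block)
    print("blocks still in use:", pool.in_use)
```

An app with two structure sizes, like the one above, would simply keep one such pool per size.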