This seems to be markedly different from the rest of the world - even from something as close to home as computer hardware. Your monitor may occasionally have "bugs", but that is an aberration rather than the norm. Yet the rules are somehow different for software. How come?
There are several roots to the problem.
We ship buggy code because we can. The reasons are twofold. First, the software industry is relatively young, so a certain amount of forgiveness can still be expected from users: as long as the software is indispensable and solves their essential problem, they are willing to cope with the lack of quality - at least while there is no alternative.
Second, the software industry is not awfully competitive (which more or less guarantees that there is no viable alternative to a lot of software products). Unlike most physical goods, software has virtually no manufacturing or physical distribution costs - making and shipping 10 million copies is as easy as shipping 10 thousand. This leads naturally to the development of monopolies and oligopolies (I am omitting a whole lot of discussion on how exactly this happens - there are volumes and volumes written on the subject).
Having little competition overall means there is even less competition based on the quality of the product. So there is little incentive to improve it beyond the point where the product is merely usable.
Most projects follow a relatively rigid model of product development: planning, followed by design, followed by development, followed by testing and stabilization.
The final dates are often inflexible, and even when they are not, the initial stages of the project tend to expand to fill the extra time.
The result: when it comes time to ship, the last stage - which happens to be stabilization - is cut short. Bang! The product ships with a bunch of buggy features.
In Windows Home Server we modified this process to work per feature. Every feature had its own design-implement-stabilize pass, and a developer did not get to work on the next feature before the previous one was done and stabilized.
This worked wonders for smaller features, but of course every release has a few features on the critical path - big enough to fill all the time allotted for the entire version. These features also often define the product itself. For them this model does not help at all - their testing phase still ends up cut short.
The sheer magnitude of the problem
"The department's motto was, 'Comprehending the infinity requires infinite time', from which they derived a curious result - why work?" - A. & B. Strugatsky, "Monday Starts on Saturday"
Vista has on the order of 50 million lines of code, which in turn depend on an astounding number of variations of the hardware it runs on and the software that runs on it.
While running QA for Windows Home Server at the beginning of the shipping cycle, I quickly learned (and my experience at Google, where almost all tests are written by engineers, corroborated it) that developing just the unit-test code takes approximately twice the resources as developing the code under test.
If you tack on the costs of all other programmatic testing (stress, long-term, environmental, integration, etc.), that adds another 300% on top. So really, truly exhaustively testing an application programmatically costs at least 5 times as much as writing it in the first place.
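The arithmetic above can be written out as a back-of-the-envelope model - the ratios are the ones from the text, and the cost unit is arbitrary:

```python
# Back-of-the-envelope cost model for the ratios above (illustrative only).
dev_cost = 1.0                       # writing the feature itself (arbitrary unit)
unit_tests = 2.0 * dev_cost          # unit-test code: ~2x the code under test
other_programmatic = 3.0 * dev_cost  # stress, long-term, environmental,
                                     # integration, etc.: another 300%

total_testing = unit_tests + other_programmatic
print(total_testing / dev_cost)      # 5.0 - programmatic testing ~5x development
```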
This, of course, applies where you can test programmatically - manual testing is cheaper up front, but you have to repeat it over, and over, and over again, so it adds up quickly - probably to about the same grand total. Since the long-standing argument between the proponents of automated and manual testing remains unresolved, I suspect there is no statistically proven cost difference between the two methods.
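One way to see why the two approaches can end up at about the same grand total is a toy break-even calculation. The hour figures below are invented purely for illustration - only the shape of the trade-off (high upfront cost vs. high per-run cost) comes from the text:

```python
# Hypothetical costs in hours. Automation: big upfront cost, tiny marginal
# cost per run. Manual: no upfront cost, but the same price every run.
automation_upfront, automation_per_run = 40.0, 0.5
manual_per_run = 4.0

def total_cost(upfront: float, per_run: float, runs: int) -> float:
    return upfront + per_run * runs

# First run count at which automation becomes the cheaper option.
break_even = next(n for n in range(1, 10_000)
                  if total_cost(automation_upfront, automation_per_run, n)
                  < total_cost(0.0, manual_per_run, n))
print(break_even)  # 12 - with these made-up numbers, automation wins after a dozen runs
```

A test pass that gets repeated every milestone crosses the break-even point quickly; a one-off exploratory pass never does - which is roughly why the argument stays unresolved.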
For the bigger projects like Windows this cost becomes truly monumental!
Why is testing so expensive?
It's because of what is called a test matrix - the set of tests that need to be written and run, which is the Cartesian product of all potential code paths (code coverage), all potential data inputs (data coverage), and all potential environments. The code-path dimension alone is a combinatorial explosion, and the input set is, for all practical purposes, infinite for all but the simplest applications.
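To make the combinatorics concrete, here is a sketch with deliberately small, made-up dimensions - real products have far larger numbers in every dimension:

```python
# Made-up, deliberately modest dimensions for a single small module.
code_paths = 1_000    # distinct paths through the code
data_inputs = 500     # representative input values
environments = 50     # OS / hardware / driver combinations

# The full test matrix is the Cartesian product of the three dimensions,
# so its size is the product of their counts.
matrix_size = code_paths * data_inputs * environments
print(f"{matrix_size:,}")  # 25,000,000 test cases - for a toy-sized module
```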
Most test organizations survive by heroic efforts to establish equivalence relationships between test cases, which allow them to prune the test matrix. Then they prioritize what is left, filling the time they have with the most important of the test cases that survived the equivalence pruning.
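A minimal sketch of what that pruning looks like, using a hypothetical age validator as the function under test: instead of running every possible integer through it, you keep one representative per equivalence class plus the boundary values between classes:

```python
# Hypothetical function under test: accepts ages 0..120 inclusive.
def is_valid_age(age: int) -> bool:
    return 0 <= age <= 120

# All integers below 0 behave alike, as do all in 0..120 and all above 120,
# so one representative per class - plus the class boundaries - stands in
# for the entire (practically infinite) input set.
pruned_cases = [-1, 0, 60, 120, 121]
print([is_valid_age(a) for a in pruned_cases])  # [False, True, True, True, False]
```

Five cases instead of every representable integer - that is the kind of reduction that makes the matrix tractable, at the price of trusting that the equivalence classes were drawn correctly.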
This guarantees that the common case is relatively bug-free, but step outside the common usage scenario and - welcome to the bug farm!