On the Art of Debugging Software

Excerpt from Mager, Troubleshooting the Troubleshooting Course, 1982:

A 1979 study by Cutler (Problem Solving in Clinical Medicine) made an observation about the importance of probability information by offering three maxims for diagnosticians:

  • Common diseases occur commonly.
  • Uncommon manifestations of common diseases are more common than common manifestations of uncommon diseases.
  • No disease is rare to the person who has it.
It is interesting to translate these maxims into the language of equipment and troubleshooting. They come out this way:

  • Common troubles occur frequently.
  • Unusual symptoms of common troubles occur more often than common symptoms of uncommon troubles.
  • No trouble is rare to the client who has it.
Mager’s book is mostly about equipment troubleshooting, or more specifically about training courses on troubleshooting, their flaws, and how to fix them. The anecdotes in the book deal with the diagnosis of appliances, manufacturing equipment, radars, etc., but I was delighted by how relevant they are to diagnosing issues in software.
Most software engineers have a crazy bug story or two (or twenty), but it’s rarely a true “crazy bug.” It’s usually a typical bug like a race condition, off-by-one, uninitialized variable, memory leak, misinterpreted API, etc. that manifested itself in an extremely odd way.
Updating Cutler and Mager:
Uncommon behavior resulting from common software defects occurs more often than common behavior of uncommon software defects.
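
To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it is invented for illustration (none comes from Cutler or Mager). Suppose 20% of the bugs you see are a common defect such as a race condition, and such a defect shows up in some bizarre way 5% of the time. Suppose 0.1% are a truly exotic defect, and it shows its textbook symptom 90% of the time.

```latex
\begin{align*}
P(\text{common defect, odd behavior}) &= 0.20 \times 0.05 = 0.010 \\
P(\text{exotic defect, typical behavior}) &= 0.001 \times 0.90 = 0.0009 \\
\frac{0.010}{0.0009} &\approx 11
\end{align*}
```

Under those assumed rates, the weird-looking common bug is about eleven times more likely than the exotic one, so the usual suspects are the better first guess.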


Comcast’s xfinitywifi secondary access point that steals bandwidth and ruins your WIFI experience

About two weeks ago I noticed my wifi performance at home went to garbage. When I first got the router I was getting 20-30Mbps but now I was lucky to get 250kbps. TCP connections were constantly retrying and frequently timing out. Something changed. Figuring my neighbors might have set up a new AP I opened up Mavericks’ Wireless Diagnostics tool and started scanning.

What I saw shocked me. There was a new network, one BSSID away from mine, named ‘xfinitywifi’ broadcasting on the same channel, at the same signal strength. That could only mean one thing: someone had hacked my wifi router and somehow got it to broadcast a second network so they could steal my bandwidth.
Thinking first that I was hacked, I reset the modem to factory settings and picked new passwords for everything. I ran some tests and confirmed that my wifi performance was back to what it was the day I got it. Things were fine for a few hours but then the xfinitywifi AP reappeared.
Google search for ‘xfinitywifi’. I’m not the only one. Lots of angry people. Others were noticing the same thing, and the culprit was Comcast. Searching more I found that supposedly you can disable it:
We encourage all subscribers to keep this feature enabled as it allows more people to enjoy the benefits of XFINITY WiFi around the neighborhood. You will always have the ability to disable the XFINITY WiFi feature on your Wireless Gateway by calling 1-800-XFINITY. You can also visit My Account at https://customer.comcast.com/, click on “Users & Preferences”, and then select “Manage XFINITY WiFi.”

But when I followed these instructions I received the classic lazy programmer’s, “an unexpected error has occurred, please try again later” error message.

So I called the number and was able to speak to a human being after being on hold for 9 minutes. What they said reflected the experience of others who’ve tried the same thing.
At first she denied that this was going on. I said that my router was broadcasting a second BSSID with the network name “xfinitywifi” and I wanted them to disable it. She said (not joking): “Are you sure? How do you know?” I said I have diagnostic tools and when I power the router off the second network goes down with it.
She still didn’t believe me. She wanted to run me through a set of standard “put the O.N.O.F.F. switch in the O.F.F. position then back in the O.N. position” tests. I interrupted and said I knew it was a Comcast configuration, “you have a button on your website to disable this but it’s not working, I get an error.”
Long pause. Then in a tone that said, “why didn’t you say that the first time” she says “Ohhh… the XFINITYWIFI network… Why would you want to disable that? That’s a free service that we provide. You don’t need to disable it.” Then she pauses, expecting me to say something in return.
My turn to pause. Having practiced these types of conversations in the past, I paused for about 5 seconds, then let out a slow, exasperated sigh. I said, “I dunno.. I dunno what to say. I’m speechless… So.. How can you do this? How can you share my bandwidth without my permission? Where did I agree to this? How is this legal?”
She then tells me that she doesn’t have permission to disable it. Only the wifi support department has that capability. I’m put back on hold.
Imagine at this point my frustration. While on hold I go back to the website and keep reloading the page to disable the feature and finally on the 5th try it lets me disable it. The changes take effect after a minute, the BSSID is gone. I hang up the phone.
30 minutes later I check the Wireless Diagnostics tool again. The BSSID ‘xfinitywifi’ is back. Go back to the website, “unexpected error” yet again.
Yup.
So it automatically re-enables itself. Which means the choice is either a) buy a different cable modem that doesn’t support this garbage or b) move to a different Internet provider.

I’m 110% for community wireless initiatives. I believe free or highly affordable public access to the Internet would be good for people and better for the economy. In 2002 I co-founded cafwap.net, a small project that tried to blanket as much of the Corvallis, Oregon area in free wifi as possible.

What bothers me about xfinitywifi is a) they didn’t ask my permission to steal my bandwidth b) broadcasting a second AP on the same channel is just dumb and c) they’re charging for this service. I would rather have an open AP and use MAC filtering to limit bandwidth.

You should cook using cast iron

It’s been about 5 years since my wife and I switched full-time to cast iron cookware. We’ve thrown out all of our Teflon and all that remains that’s not cast iron are a few stainless pots and sauté pans.
If you haven’t made the switch yet, you should:
  • Easy to clean. Just quickly scrub under hot water with a stainless steel scrubber and wipe dry. A few times a week rub a few drops of oil in with a paper towel. Stainless spatulas are also easier to clean than the plastic ones you need to use with Teflon.
  • Easy to cook with. Cast iron heats incredibly evenly, meaning your pan won’t have any hot spots. Apply a little bit of oil before cooking and you’ve got a perfect non-stick surface. Fried eggs, omelets, steaks are all wonderful when cooked in cast iron.
  • No health or environmental concerns. Teflon breaks down over time and gets into your food; stainless can warp or rust if mishandled. Cast iron is virtually indestructible. Even if your cast iron rusts you can scrub the rust off, reseason it and it’s as good as new. Your children’s children will inherit your cast iron.
  • Highly affordable. Cast iron pans are about half to a third the cost of equivalent non-stick pans and when you consider they last forever it’s an even better bargain.
See also http://www.macheesmo.com/2010/07/ten-reasons-for-cast-iron/

Embedding images in HTML email for Outlook

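#!/bin/sh
# email.sh: print a multipart/related message with an inline PNG to stdout.

# Top-level headers. MIME-Version marks the message as MIME (RFC 2045).
echo "MIME-Version: 1.0"
echo "Content-Type: multipart/related; boundary=\"boundary-example\"; type=\"text/html\""
echo

# Part 1: the HTML body. The img src refers to the Content-ID of part 2.
echo "--boundary-example"
echo "Content-Type: text/html"
echo
echo "<h1>Email</h1>"
echo "<img src=\"cid:image.png\" alt=\"image\">"
echo

# Part 2: the image itself, base64-encoded.
echo "--boundary-example"
echo "Content-Location: CID:something"
echo "Content-ID: <image.png>"
echo "Content-Type: image/png"
echo "Content-Transfer-Encoding: BASE64"
echo
base64 /tmp/image.png
echo "--boundary-example--"

Save the script as email.sh and pipe its output to sendmail:

$ ./email.sh | sendmail some@emailaddress.com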

Mount a Windows CD on a Mac like a Windows PC would see it

So you’ve got a PC without a CD-ROM drive. (Err, DVD-ROM, DVD-RW, whatever). Or you bought a new PC motherboard without realizing it lacks an IDE port for your antiquated DVD player. Then you bought some software on physical media like it was the 90’s. (Perhaps you got a great deal on tax software, for example). So you have a Mac that has a DVD drive, and you’d like to copy the installer off the Mac but when you insert the CD into the Mac it mounts the Mac partition, not the PC partition.

Drop to the command line and figure out where the CD is getting mounted:

$ diskutil list

Make a temporary directory to mount the image:

$ mkdir /tmp/mnt

Mount the iso9660 partition at that location:

$ sudo mount -t cd9660 -o nodev,nosuid /dev/disk3s1 /tmp/mnt

Replace /dev/disk3s1 with your CD’s location.

Silent Privilege and Bias

Philip Guo has an excellent article on slate.com today about his experience as an Asian American software engineer.

    Even though I didn’t grow up in a tech-savvy household and couldn’t code my way out of a paper bag, I had one big thing going for me: I looked like I was good at programming.

I admit I’ve had predisposed biases about people before getting to know them. It’s human nature, it’s the way our brains work. The world is a complicated place and our brains need a way to quickly categorize what we experience or else we would be overwhelmed.

As a hiring manager, it’s a part of my brain that I try to shut off when I’m assessing someone’s technical skills–but it’s tough. I’ve even played games in the past where I covered up the person’s name before I reviewed the resume to see if that altered my impression of them. But over the years I’ve encountered enough individuals that violate any kind of stereotypes I had that it’s unwound most of them.

People always talk about race bias and gender bias, but something that surprised me when I first encountered it (in myself and others) was experience or education bias.

Tess Rinearson in this article talks about the “technically entitled,” the programmers that boast about how they’ve been programming since they were 6. You would think that someone who has been coding for that many years would be amazing, right? In my experience that’s not always the case. I’ve had candidates tell me on the phone they’ve been doing C++ since they were in middle school but when you dig into it they can’t answer simple questions about the language. Me personally, I started coding at a very young age but I know quite a few people that didn’t start until they were in college and they are way better software engineers than I am. If you assume there’s a correlation between experience and ability you could run into trouble.

What’s especially surprised me talking with and interviewing folks from different colleges and universities around the country is that it’s dangerous to assume there is a correlation between education (school and/or GPA) and programming capability. You would think Stanford has an amazing computer science program, being in the heart of Silicon Valley. Anyone with a Stanford CS degree must be amazing, right? Well… I’ve interviewed Stanford grads that could not explain some of the most basic concepts about how an operating system works. But I’ve also interviewed Stanford grads that during the interview taught me new things about how operating systems work. So you can’t infer anything about ability from education either.

Race, gender, experience, education.. what inferences can you make about people then? None, really.

GMO food

The New York Times has a great article on the fight over banning GMO crops on the Big Island of Hawaii. It covers the science, the pseudoscience and the hysteria over what GMO food supplies might do to people and the environment.

This prompted me to revisit a blog post I wrote in 2007 on cloned cattle: FDA says cloned livestock is safe to eat. I had written:

    What happens when 10%, 20% or maybe even 50% of our beef comes from the same DNA “mother cow”, or possibly a small genetically similar group of cows? It seems like then it would just be a matter of time until a virus or bacteria strain crops up that has adapted to exploit some weakness of that cow, and then it spreads like wildfire throughout our cattle. But what if that virus was undetectable some how, and turned out to be the next Mad Cow Disease?

Reading that again I sound a little hysterical. To clarify, by “just a matter of time” I was thinking a span of decades or centuries. But I do still have the same concern and I haven’t heard anyone address it yet (at least not in any media sources I follow). If we dramatically decrease the genetic diversity of a crop (animal or plant) could that introduce a single point-of-failure on our food supply?

I’m not convinced there is a health risk with GMO food to a single consumer or group of consumers… but is there a risk to the industry producing that crop?

To illustrate, I will actually get a little hysterical: Imagine 50 years from now Golden Rice is a huge success and it’s being grown everywhere. It now makes up 99% of the world’s rice production. (The other 1% is non-GMO organic sold only at Whole Foods and other stores that only 1%’ers can afford to shop at ;). So basically all of the world’s rice is now genetically very similar. Now imagine there is a random mutation in a pathogen that affects rice (like RGSV, Rice Grassy Stunt Virus), and this mutation affects Golden Rice particularly badly. Because there is so little genetic diversity the virus could probably rip through the world’s rice fields faster than we could control it. This would be an economic and human disaster.

This is probably a far-fetched scenario, but it is my one concern with GMO crop production.

VirtualBox DKMS with custom kernel

I have a custom kernel on my Linux machine (that I apparently did not install correctly) so when I went to use VirtualBox I got an error about DKMS not being ready. The docs say I needed to update my “virtualbox-ose-dkms” package to the latest version but when I did that it failed to install, saying the module could not be built.

Running the reconfigure command manually gave me this error:

$ sudo dpkg-reconfigure virtualbox-dkms
------------------------------
Deleting module version: 4.1.12
completely from the DKMS tree.
------------------------------
Done.
Loading new virtualbox-4.1.12 DKMS files...
Building only for 3.2.37-32corexeon
Module build for the currently running kernel was skipped since the
kernel source for this kernel does not seem to be installed.
* Stopping VirtualBox kernel modules [ OK ]
* Starting VirtualBox kernel modules
* No suitable module for running kernel found [fail]
invoke-rc.d: initscript virtualbox, action "restart" failed.

What you’ll find online if you search for this error is that it’s missing the linux-headers-`uname -r` package. I had it installed, but it still wasn’t finding it.

The issue turned out to be that the build symlink in my /lib/modules/3.2.37-32corexeon directory was pointing to the original location where I had compiled the kernel, but I had since moved those files. I updated this symlink and then it worked:

/lib/modules/3.2.37-32corexeon/build
-> /usr/src/linux-headers-3.2.37-32corexeon/
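
Here is a minimal sketch of that check and fix as a shell function, assuming the headers live under /usr/src as they did on my machine. The function name repoint_build_link and its two arguments are hypothetical, made up for illustration. It only touches the build link when the link is missing or dangling.

```shell
#!/bin/sh
# repoint_build_link MODULES_DIR HEADERS_DIR  (hypothetical helper)
# Checks MODULES_DIR/build and, if it is missing or dangling,
# points it at HEADERS_DIR.
repoint_build_link() {
    modules_dir=$1
    headers_dir=$2
    link="$modules_dir/build"

    # -e follows symlinks, so it is false for a dangling link.
    if [ -e "$link" ]; then
        echo "ok: $link -> $(readlink "$link")"
        return 0
    fi

    if [ ! -d "$headers_dir" ]; then
        echo "error: $headers_dir does not exist" >&2
        return 1
    fi

    # -n replaces the link itself instead of descending into its target.
    ln -sfn "$headers_dir" "$link"
    echo "fixed: $link -> $(readlink "$link")"
}

# Example (needs root; paths are assumptions based on my setup):
#   repoint_build_link "/lib/modules/$(uname -r)" \
#                      "/usr/src/linux-headers-$(uname -r)"
```

After the link is repaired, rerunning sudo dpkg-reconfigure virtualbox-dkms should let the module build find the kernel source.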

Thanks *nix notes!


Theresia Gouw on organization structure

In “Taking Our Company to the Next Level,” a lecture in the Entrepreneurship Through the Lens of Venture Capital series, Theresia Gouw is talking about organization structure in software engineering companies. Her thesis is: have an org structure, but keep decision making flat. There’s a rule of 5 with direct reports, and a rule of 5 with teams.
I cracked up when, in the middle of this, she offers this caveat:
“…well, this is in software engineering. Obviously if you’re building the next spacecraft and you’ve got significant complex systems it would be different.”
Overall, her points do align with my personal experience:
“You get the most out of people when there are around 5 or 6 people on a team, in terms of productivity. And happiness, when you do the surveys. For what it’s worth.”