Dumping a WordPress site
I had a WordPress blog set up at my last job, and it's definitely fun to run a dynamic website. But once I left I didn't feel like maintaining it (in particular, watching for crackers and comment spammers), so I decided to make it static. It's doable with a little MySQL- and unix-fu.
I found an excellent howto on the subject by Ammon. There were a few tweaks (which I left in the comments) but the major work is there.
One big annoyance was that I had moved out before archiving. That was particularly troublesome because the database server for this particular site was a FreeBSD box under my desk. That box isn't connected to the internet at the moment, and even if it were, it wouldn't have the same IP address anymore. Luckily, I routinely backed up that database, so I fired up a temporary MySQL server on my laptop and changed the WordPress configuration files (nice and DRY: I only had to change three lines, and two of them would have been avoidable had I kept the same username and password).
The part I couldn't figure out next was how to change the site to insert boilerplate text on every page explaining that it was an archive and that readers should go to my new site. I think it's a testament to the flexibility of Drupal, my new content management system, that I already know how to do such a thing in that framework: just create a block and assign it to the content region. But WordPress keeps its custom widgets and gewgaws in the sidebar, which wasn't prominent enough for me. I ended up having to hack the theme file to hard-code the message in the page header. I still believe there's a "right" way to do this, but all I was going for was a dump of the site, so I'm not going to apologize for forking. :-)
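The hack itself amounted to splicing a fixed paragraph into the theme's header template. Sketched here against a stand-in header.php; the real filename and markup depend entirely on your theme:

```shell
# Stand-in for a theme's header.php; real themes differ.
cat > header.php <<'EOF'
<body>
<div id="header"><h1>My Old Blog</h1></div>
EOF

# Hard-code the archive notice right after <body>.
perl -pi -e 's|<body>|<body>\n<p class="notice">This site is an archive; new posts are at my new site.</p>|' header.php
```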
(A solution outside of WordPress would be to parse every HTML page after dumping [with XSLT or even a regular expression-enabled scripting language like perl] and insert the boilerplate on every page. I've done that with dynamic sites that have already been archived. But that seems unfair to ask since I had an active, dynamic CMS working for me already.)
After that, wget is your friend. This little command-line tool has dozens of options to fetch web pages and even whole websites. Luckily Ammon had a sample usage that I could copy, although I also needed to add the "-k" flag to correct the links.
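Something along these lines. The "-k" is the flag I had to add; the other options follow common wget mirroring practice rather than Ammon's exact command, and the URL is a placeholder:

```shell
# -m  mirror: recursive download with timestamping
# -k  convert links so the saved pages browse correctly offline
# -p  also fetch page requisites (stylesheets, images)
# -E  save pages with an .html extension
cmd="wget -m -k -p -E"
site="${SITE_URL:-}"   # e.g. http://example.com/blog/ (placeholder)
if [ -n "$site" ]; then
  $cmd "$site"
else
  echo "set SITE_URL to mirror a site: $cmd \$SITE_URL"
fi
```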
Then get rid of the dynamic scripts (after backing them up) and move the archived material into their place. And that's it!
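That last swap is just two moves. A sketch with throwaway directories standing in for the real paths (wget drops its mirror under a directory named after the host):

```shell
# Throwaway directories standing in for the real paths.
mkdir -p demo/blog demo/example.com/blog
printf 'static page\n' > demo/example.com/blog/index.html

mv demo/blog demo/blog.bak           # back up the dynamic install
mv demo/example.com/blog demo/blog   # the wget mirror takes its place
```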
This is the kind of task that you almost feel justified spending five hours on if you can spend another 30 minutes writing a blog post about it. But I'm glad it's done.