a very rambling and awfully incoherent saturday writing, done without the benefit of breakfast, or even lunch.

I worry, perhaps, that the Internet is getting too good at what it is trying to do, which is to say, feeding people interesting information as soon as it happens; putting them in touch with interesting people and offering zero-latency discovery, communication, and dialogue.

Why would I worry about this? Because I feel it deadening my own capacity for deeper reflection upon the world. It’s the very reason for which I do not watch television. Television is simply too effective at capturing my attention – everything in me directs me, forces me to stare at this glass box with moving pictures. I resent such animalistic string-pulling, so I don’t have a TV and don’t watch one. In the same way that the television is graphically titillating, the Internet is information-titillating.

Every single random, weird, or funny thing that happens to anybody, anywhere, all within moments of it happening; it’s infinitely entertaining, and yet, infinitely dull — I do not know these people! And the rate at which the information comes in leaves little opportunity for quiet reflection upon their import.

This is the quintessential divide between knowledge and wisdom.

There is much money to be made in knowledge transfer – people will pay good money to consume information they are interested in. But a deep understanding of how to apply these facts to everyday life, how to actually live a fulfilling and active life – that has very little profit in it, because happy people without needs don’t tend to buy as much as people who are convinced that going shopping really is a good way to relax and enjoy themselves.

The goal would then be to create ever more information that is ever more valuable to a person while, at the same time, giving them so much that they have no time to digest it. “The unexamined life is not worth living,” it is said, and yet this is what we are driven towards as information gluttons.

It is in the production of knowledge that we learn most; a subject is not deeply understood until it can be taught. For every hour we sit passive, consuming, we ought to spend two thinking, producing. Even if no one is to read our words, hear our music, admire our paintings, the creation of them alone and the consideration put into their conception will help us think more critically about that which we consume and, in that endless cacophony of information, will give us a voice by which we may uniquely be heard.

work isnt wolf In a wood will not escape

I had a weird experience today chatting with a hacker on ICQ who had taken over my friend’s account. The conversation’s too weird to have made up:

343907: David do you want a digital copy of PC Magazine?
Think: ?
343907: Who such, why in a cap, why without a cap?
343907: russian joke
Think: er…i’m not sure i get it.
343907: Read the alphabet Study to think on russian
Think: This isn’t Phantom Joe, is it.
Think: why are you hacking his account?
343907: because I am russian HACKER
Think: you are lame
Think: you’re just being rude, not cool.
Think: it’s not like ICQ account-nabbing is difficult, either.
343907: ???? i` dont understend you
Think: you’ve gone and messed up this poor guy’s account
Think: he’s a nice guy – an australian.
343907: i`m badboy from odessa
343907: Terribly?
Think: no, but he’s had to go sign up for a new account and try to remember all of his friends’ UINs. it’s just a pain, that’s all.
Think: there are much cooler ways to use your talents.
Think: you live in Odessa?
343907: yes
Think: that would make you a Ukranian hacker, not a Russian hacker. 😉
343907: In soul I russian !! Odessa only there is on Ukraine and city this Russian
343907: Well what was frightened?
Think: frightened?
343907: Fright
Think: who is frightened?
343907: you
Think: of what?
343907: because I Ukranian HACKER
Think: [sigh]
Think: absolutely shaking, yes.
Think: truth be told, i think you are a little kid who is just desperately trying to impress others.
Think: you’re talking with me because you’ve got nothing better to do with your time.
Think: i think you should just hand the account back over to my friend and put yourself to productive use.
343907: Fuck OFF
Think: i’m sorry, child.
Think: get a job.
Think: there are better ways to make money and friends.
343907: work isnt wolf In a wood will not escape
Think: eh?
343907: [russian text]
Think: i’m sorry, but my client doesn’t display Russian, nor do I understand Russian.
343907: Search the interpreter
Think: i’m sorry, but i’m at work and am a little busy.
Think: i do hope you’ll use your talents in a more productive fashion
Think: than annoying nice people that you don’t know.
343907: ok

Server Issues (2005)

Some of you may have noticed that the site was down last week for several days. Email to me bounced and there was general havoc in my online life. What happened? Did I get slashdotted? No.

I happened to notice that my email wasn’t working Tuesday afternoon. (I had been away on Monday.) I tried going to my website; nothing. I tried pinging my computer — clearly down. I biked over to where I keep my server, and that’s when I noticed that something was really wrong.

The power supply in my server had died hard at about 5:30pm on Monday night after over three years of faithful service. In plainspeak, it was out cold; not even the slightest response to pushing the power button. (This has nothing to do with the power problems on my Compaq client.)

I tore out the hard drives (lacerating myself several times and breaking a spare cable) and biked back to my dorm as fast as could be. I got a spare server out from under my bed (doesn’t everyone have one of those?) and set to work transferring the soul of d.w.o to the new box.

As it turned out, I wasn’t entirely successful in transferring the operating system; just the website itself. This wasn’t too bad, because it forced me to upgrade my server components to the latest versions of things. It took me three days to get the whole shebang up and going, and even now my email is still a little weird (messages don’t get properly time/date stamped!), but the whole joint is (roughly) working.

RPM Find was an absolute godsend at finding the different files that I needed. There seem to be a good number of tools for helping someone update their server, although I’m a little surprised that they haven’t taken the next step in automation. (“Update Server?” *click*)

I spent too much time administering, though, and am now looking to completely outsource all of my hosting. Unfortunately, it’s difficult to find a provider that is cheap and still offers all of the flexibility that I have in running my own server. I’ll let you know when I jump on anything definite. =)

In the meantime, it’s back to writing a cryptographically secure Instant Messenger, writing a book, learning TAPI, and structuring web data!

Bypassing Ad Blocking

This article describes modern ad-blocking techniques, their effectiveness, and how advertisers are likely to work around them.

I’ve had a great deal of success in using the Firefox AdBlock plugin to pretty much wipe all ads from my web browsing experience. When I see an ad now, it’s a bug that I can fix easily – I just right-click on the image or ad frame, pick “AdBlock”, and the URL for the image/frame comes up, usually looking something like:

I’ll then replace all of the ad-specific bits with stars to create a pattern match:


What I’ve done is in one fell swoop eliminate any ads at all from an ad provider. Truth be told, there are only about a dozen big ad providers like this, so with a dozen entries in your filter list, you’ve already blocked a substantial percentage of ads on the Net.
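To make the star-pattern idea concrete, here is a minimal sketch in Python. It is not how the AdBlock plugin itself matches URLs; it just uses the standard library's shell-style wildcard matcher as a stand-in, and every host name and pattern in it is a made-up placeholder.

```python
from fnmatch import fnmatchcase

# Hypothetical filter list: one star pattern per (invented) ad provider.
FILTERS = [
    "*://ads.example-adnetwork.test/*",
    "*://*.example-banners.test/serve/*",
]


def is_blocked(url, filters=FILTERS):
    """Return True if the URL matches any wildcard pattern in the list."""
    # fnmatchcase treats '*' as "any run of characters", slashes included.
    return any(fnmatchcase(url, pattern) for pattern in filters)


if __name__ == "__main__":
    for url in [
        "http://ads.example-adnetwork.test/banner/468x60/12345.gif",
        "http://www.example-site.test/images/photo.jpg",
    ]:
        print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

One pattern per provider covers every ad that provider serves, which is why a short filter list goes such a long way.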

Another class of ads is served by a site from generic ad-serving software. So if you’re on some site, reading some cool content, you might see some ads like:

These are blocked with still fairly simple filters:*

The issues really start coming up when sites deliver pages that intermingle real content along with advertising within the delivered HTML. As long as the client has to do separate and distinguishable work in fetching valuable content versus fetching an ad, it will remain fairly easy for people to write ad blockers.

So what the server really ought to give the client is either ads already baked into the HTML (Yahoo already does this in placing ads for itself on its own properties!) or URLs for ads that are made indistinguishable from the URLs for content.

In the former method, the server performs the ad-fetch itself and injects the results into the returned document as part of a seamless, singular HTML file. This works fine for text-only ads, but to ensure the images render properly, they will have to be located at URLs indistinguishable from “real” images by a simple pattern match.

One solution is to have hashed / random directory and file names, tracking on your server which are ads and which are content. So a site could have images as follows:

the ad:

If you want to block ad images, you’ll have to block all images.

Now, with a finite number of ads, users could still in theory assemble a relatively comprehensive list of content vs. ad images and even coordinate these lists in realtime (so only the first user to see an ad would have to endure it). So ideally, the URLs would be session-based. Namely, every visitor to your website sees a different URL for fetching ad and content images, even though your webserver is internally mapping them all to the same images. As a side note, content images and ad images should be of the same format and size – some rudimentary ad blockers simply block out ad banners that match an industry standard ad format.
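The session-based mapping could be sketched roughly as follows. This is only an illustration under my own assumptions: the `ImageCloak` class, the `/img/...` URL layout, the fixed `.jpg` extension, and the use of an HMAC to derive the opaque names are all invented for the example, not taken from any real ad server.

```python
import hashlib
import hmac
import secrets


class ImageCloak:
    """Hands out per-session opaque image URLs and maps them back to real files."""

    def __init__(self, secret=None):
        # Server-side secret; without it the opaque names cannot be predicted.
        self._secret = secret or secrets.token_bytes(32)
        self._routes = {}  # (session_id, token) -> real path on disk

    def url_for(self, session_id, real_path):
        """Return the URL this session should see for an ad or content image."""
        message = f"{session_id}:{real_path}".encode()
        token = hmac.new(self._secret, message, hashlib.sha256).hexdigest()[:24]
        self._routes[(session_id, token)] = real_path
        # Ads and content share one URL shape, so a pattern cannot split them.
        return f"/img/{token[:8]}/{token[8:]}.jpg"

    def resolve(self, session_id, url):
        """Map a requested URL back to the real file, or None if unknown."""
        parts = url.strip("/").split("/")
        if len(parts) != 3 or parts[0] != "img":
            return None
        token = parts[1] + parts[2].rsplit(".", 1)[0]
        return self._routes.get((session_id, token))


if __name__ == "__main__":
    cloak = ImageCloak()
    print(cloak.url_for("session-1", "ads/banner.jpg"))
    print(cloak.url_for("session-2", "ads/banner.jpg"))  # different URL, same file
    print(cloak.url_for("session-1", "content/photo.jpg"))
```

Because a URL only resolves inside the session it was issued to, a shared blocklist built by one visitor is useless to the next.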

The latter approach of having client-loaded ad content is analogous to the image-cloaking above. Separate frames or loaded JavaScript should be at changing URLs that overlap with the same patterns as content. With JavaScript another option is to bake in the ad code with other critical site scripting (like navigation) in the same .js file. This makes it harder to block.

Ultimately, I think that the web will follow TV trends, where advertising becomes more thematically baked into commercial sites. If advertising is truly visually and programmatically separable, it will be separated, and site operators will end up operating at a tremendous loss. To avoid this, and to save free commercial services, I think we’ll see operators deploying techniques like this commensurately with the rise in popularity of sophisticated ad blocking tools.

In some twisted sense, both sides want this arms race to stop. People who already have a good ad blocking solution don’t really want too many other people to catch on – otherwise the gig’s up as content providers are forced to bypass users’ blocks. They’d like to remain an elite of people getting an ad-free experience. And the advertisers certainly would rather not have to dramatically ramp up their spending on technology to outwit the blockers; they like the status quo just fine. Frankly, as long as quality ad blocking requires pattern matching comprehension and installing Firefox with a plugin, everyone’s probably fine. The issue will be if next-gen browsers try to make ad blocking technologies more accessible – a short-term win but a long-term unpleasant war.

The PS3 & Blu-Ray

The PS3 will be significant for Blu-Ray. And vice versa.

The PS2 sold very well in Japan when it was released for a simple but amazing reason: it was one of the cheapest DVD players on the market. And oh, by the way, it could also play thousands of games and was a next-gen video console.

If Blu-Ray looks to pass muster with the court of public opinion (Goodness knows how people are eager to show off their HDTVs in ways that DVDs just can’t), then the PS3 will probably sell like hotcakes, since the console is likely to be coming out just as the first commercial Blu-Ray players are being launched in the US. This puts those initial non-Sony player manufacturers in the unenviable position of having to undercut the PS3, since it’s not likely that their devices will also be able to play thousands of video games.

Folks considering buying a Blu-Ray player will now consider getting a PS3, just as folks on the fence about a PS3 versus, say, a Nintendo Revolution might be nudged into getting the PS3 just to be able to play Blu-Ray discs.

Microsoft is, of course, playing a similar card with HD-DVD and the XBox360. But I would personally bet on Sony over Microsoft when it comes to consumer hardware: about the only consumer hardware Microsoft has done well has been their mouse. (Apologies to fans of the Microsoft Natural keyboard. And maybe to Halo fans. But no apologies to the clueless monkeys who bought things like the MS cordless phone.)

Like many of you, I’m eager to see what develops when technology giants spend billions to outdo each other in making shiny, fun toys.

David’s Two Rules of Business

I’m still a business newbie, I’ll admit it. I haven’t made a million dollars yet and I haven’t been on the cover of a magazine in a while. (Although I was the subject of a Fortune cover article back in the day.) But it’s been my sense that beyond the typical business schtick that I’m busily trying to catch up on, there are two solid rules for getting ahead with a company that others haven’t really just come out and said. So here goes.

Rule One: Employ Monkey Armies.

As an individual, you are only worth as much value as you create. And there’s only so much value you can create alone! This is why many businesses have more than one employee. To gain leverage, you’ll need minions to do your bidding. Ideally, armies of unpaid servants. The best (and arguably most ethical) way to do this is with servers. Get your hands on some servers to leverage the heck out of yourself. Write scripts to automate your work. Automate everything – let your army of servers take care of things for you.

If tasks cannot be automated by computers and aren’t critical, find bored people on the Internet and make your task into a game. A lot of people like playing games and, like Tom Sawyer, they may end up gladly doing your work for you, producing content and categorizing it, rating it, editing it, all for their amusement and your benefit.

If it’s critical, find someone else to do it. You should deal primarily with either inexpensive people (as contractors) for relatively simple work or people you’d trust with your life for serious partnership. You’d be amazed at the kind of help you can get for $10-$20/hour, even in the Bay Area. In the grand scheme of things, this is not much money. Spend it and give yourself leverage.

Rule Two: “Steal.”

I don’t mean this literally. Don’t run around looting or robbing people. But if there are services, software, or resources that you can use for free, you really should do so. Run on Linux (or *BSD). Use MySQL (or Postgres). Use Perl, PHP, Python, and Ruby. With the tools you will be scaling, make sure they’re free. (It’s okay to spend good money for productivity software or software that doesn’t need to scale, like Photoshop.) Find cheap servers for sale and negotiate a good hosting contract. You’ll be in charge of your own cluster in no time. It’s not as hard as you think – just do it!

Google spends huge amounts of money making AdWords easy to use, and nobody said you have to use them long term. This means that you can very quickly try a dozen different slogans, put them in front of tens or hundreds of thousands of people, see how they fare against each other, and change your messaging accordingly. You’ve paid Google about ten bucks for what otherwise would have been tens of thousands of dollars of market research.
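Seeing how slogans "fare against each other" mostly comes down to comparing click-through rates. A rough sketch, assuming you copy impression and click counts out of your campaign reports by hand; the `rank_slogans` helper and all the numbers are placeholders of mine, and nothing here talks to AdWords itself.

```python
def rank_slogans(results):
    """Rank slogans by click-through rate (clicks / impressions), best first.

    `results` maps each slogan to an (impressions, clicks) pair.
    """
    ranked = []
    for slogan, (impressions, clicks) in results.items():
        ctr = clicks / impressions if impressions else 0.0
        ranked.append((slogan, ctr))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Placeholder figures, invented purely for illustration.
    campaign = {
        "Slogan A": (10000, 120),
        "Slogan B": (10000, 95),
        "Slogan C": (8000, 104),
    }
    for slogan, ctr in rank_slogans(campaign):
        print(f"{slogan}: {ctr:.2%}")
```

With small click counts the gaps between slogans can easily be noise, so it is worth letting each variant collect a decent number of impressions before crowning a winner.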

Why Blade Servers Aren’t Smart

This article explains why thin form-factor servers like blades may not be a good idea.

You may have heard of the superthin form factor of Internet servers called blades. They’re designed to be packed to the gills in a datacenter, so hundreds of servers can be squeezed into a single rack. This sounds like a good idea for people who are short on space.

The truth is that most people are not actually short on space. Space at a colocation facility, while not free, is very cheap. That’s because most of what actually costs money is power and bandwidth. Power has to be charged doubly since every watt of electricity given to a server has to be spent again to pump the server’s heat out of the facility. With the dramatic recent increases in fuel costs, we can probably expect electricity prices to continue climbing at a rate that will make power a significant portion of the total cost.

Since the goal is to save money, someone considering purchasing a blade system should ask if it is likely to save them real estate costs (yes, since you can squeeze more servers in), electrical costs (not really, since blade servers are not necessarily more power efficient on a computations-per-watt basis), acquisition costs (MUCH higher per computational unit than for a standard server), and maintenance costs (MUCH higher, since you’re locked into a rare-vendor solution). Plus, since there’s no real standardization on blade form factors (deliberately!), upgrades are going to be expensive and support might simply vanish a year or two down the road if the vendor decides to drop blades.
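The four questions above can be rolled into one back-of-the-envelope figure: total cost per unit of computation over the life of the hardware. The sketch below is my own simplification. The `ServerOption` fields, the default `cooling_factor` of 2.0 (every watt paid for twice), and the sample numbers are all assumptions; plug in real quotes for each option you are considering.

```python
from dataclasses import dataclass


@dataclass
class ServerOption:
    name: str
    units: int                          # number of servers bought
    compute_per_unit: float             # arbitrary compute units per server
    price_per_unit: float               # acquisition cost per server
    watts_per_unit: float               # average power draw per server
    rack_units_per_server: float        # rack space each server occupies
    yearly_maintenance_per_unit: float  # support contract, spare parts


def total_cost(option, years, price_per_kwh, price_per_rack_unit_year,
               cooling_factor=2.0):
    """Acquisition + maintenance + power (with cooling) + rack space."""
    hours = years * 365 * 24
    kwh = option.units * option.watts_per_unit * hours / 1000.0
    power = kwh * price_per_kwh * cooling_factor
    space = (option.units * option.rack_units_per_server
             * price_per_rack_unit_year * years)
    acquisition = option.units * option.price_per_unit
    maintenance = option.units * option.yearly_maintenance_per_unit * years
    return acquisition + maintenance + power + space


def cost_per_compute_unit(option, **kwargs):
    """Total cost divided by the total compute the option provides."""
    return total_cost(option, **kwargs) / (option.units * option.compute_per_unit)


if __name__ == "__main__":
    # Placeholder inputs only; these are not real prices.
    example = ServerOption("example", units=1, compute_per_unit=1.0,
                           price_per_unit=1000.0, watts_per_unit=100.0,
                           rack_units_per_server=1.0,
                           yearly_maintenance_per_unit=50.0)
    print(cost_per_compute_unit(example, years=1, price_per_kwh=0.10,
                                price_per_rack_unit_year=100.0))
```

Run it once per option with the same `years` and prices, and compare the results. The rack-space term is the only one where blades start with an advantage.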

Blades offer a mild savings in real estate costs for a huge penalty in upfront costs plus a great deal more risk with support and future expansion. While they’re sexy, blades are just not a good idea.

VoIP Colorizing Logger

This article describes a system for making useful transcripts of Voice over IP (VoIP) chats.

If you already have a VoIP-based conferencing system that’s taking and mixing several users’ voice inputs and rebroadcasting the result, it would be handy to have an automated transcriber that could record the conversation in a useful format for later reference. Transcription results could be available on a website after the call, on a website in realtime, streamed over IM in realtime, or emailed out to participants as the call terminates.

A speech parser process is attached to each incoming line, and as each word is completed it is appended to a shared buffer with a tag corresponding to the voice line being parsed. Results are colored per the voice line, and preliminary output could be as follows:

Hi. Jim here. Anyone else on this line yet?
Ted here. John is here too. Let’s begin.
Hi Jim.
Great. Are we a go for the presentation tomorrow?
We’ll need to update the Northeast numbers. They’re off.
Mary said she’d fix that tonight. I think we..
We’ll actually just be presenting West Coast numbers.
Okay, great.

After the conversation has been completed, a third party, like a secretary, could go over the results and fill in which color corresponded to which name. This could be especially useful if subsequent conversations were to be stored in the same database – searches could then be performed, such as finding out what Ted said about Q3 numbers in the last few conversations. Having pre-filled “likely participants” available as a drop-down list selection could make quote assignment particularly easy. It is understood that this system might not work ideally for situations where multiple parties are on a single line. But as offices become increasingly virtualized, it’s more likely than ever that none of the participants are actually in a room together during a meeting.
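The shared buffer, the after-the-fact name assignment, and the search could be sketched as below. Speech recognition itself is assumed to exist elsewhere and to hand over finished words one at a time; the `Transcript` class and its method names are invented for illustration. The lock is there because several parser threads would be appending at once; separate parser processes would need a queue or similar instead.

```python
import threading


class Transcript:
    """Shared buffer of recognized words, each tagged with its voice line."""

    def __init__(self):
        self._lock = threading.Lock()
        self._words = []   # (line_id, word) in arrival order
        self._names = {}   # line_id -> speaker name, filled in after the call

    def append(self, line_id, word):
        """Called by the parser attached to a line as each word completes."""
        with self._lock:
            self._words.append((line_id, word))

    def assign(self, line_id, name):
        """Record which speaker a voice line (a color) belonged to."""
        self._names[line_id] = name

    def utterances(self):
        """Group consecutive words from the same line into (label, text)."""
        grouped = []
        with self._lock:
            for line_id, word in self._words:
                if grouped and grouped[-1][0] == line_id:
                    grouped[-1][1].append(word)
                else:
                    grouped.append((line_id, [word]))
        return [(self._names.get(line_id, f"line {line_id}"), " ".join(words))
                for line_id, words in grouped]

    def search(self, speaker, term):
        """Find what a named speaker said that mentions `term`."""
        term = term.lower()
        return [text for label, text in self.utterances()
                if label == speaker and term in text.lower()]


if __name__ == "__main__":
    call = Transcript()
    for word in "Hi. Jim here.".split():
        call.append(1, word)
    for word in "Ted here. Let's begin.".split():
        call.append(2, word)
    call.assign(1, "Jim")
    for label, text in call.utterances():
        print(f"[{label}] {text}")
```

A display layer would map each line number to a color; until someone assigns names, the line number stands in for the speaker.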