Don Havey

Sustainably harvested information

Archive for July, 2008

Not yet dead  

So I’m crazy busy again, but should have time to do some posting this coming week.

In the meantime, I’m making an open call for project ideas. If anyone out there has a good one that they want me to try my luck at, send me an email or leave a comment. I’m sick of building my own ideas. Think of this as a cheap alternative to going back to school for me, and a low-pressure alternative to being a research professor for you.

Written by Don

July 24th, 2008 at 12:31 pm

Categories: Etcetera

Least-used letter pairs  

Curiobot is in the middle of a structural makeover, and this morning I decided that the code needed a little condensing. Most popular JavaScript applications shorten function and variable names to just a few letters, and some CSS-heavy pages use similarly short identifiers for class names to cut down on bandwidth. This technique also obfuscates your code, if you’re worried about people stealing it and using it for their own (evil) purposes.

In a big old “Web 2.0” (*gag*) application, condensing your code can noticeably reduce your page load times. I think it’s especially important if you’re offering content for mobile users, where every kilobyte counts. Replacing 300 occurrences of 12-character variable names with 2-character names saves 10 × 300 = 3,000 bytes, or about 3 KB, per download. That’s probably 5%, maybe 10%, of your total script/CSS/HTML weight… so it adds up on high-traffic applications. Of course, as most web developers are quick to point out, pity the poor bastard who has to try to understand and/or modify a function called “zx()”. Ever taken a look at Google’s scripts? Totally incomprehensible for that reason.
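The arithmetic above is easy to redo for your own numbers. Here is a minimal Python sketch, assuming one byte per character (plain ASCII identifiers); the function name `bytes_saved` is made up for illustration.

```python
def bytes_saved(occurrences: int, old_len: int, new_len: int) -> int:
    """Bytes saved by shortening one identifier everywhere it appears.

    Assumes ASCII names, so one character costs one byte.
    """
    return occurrences * (old_len - new_len)


# The example from the text: 300 occurrences, 12 characters down to 2.
saved = bytes_saved(300, 12, 2)
print(saved, "bytes")
```

Dividing the result by your page's total script/CSS/HTML size gives the percentage saved, which is where a figure like 5% to 10% would come from.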

Technically, you could name up to 676 (26×26) functions, variables, or classes using two-letter names. The problem is that when your content is not perfectly isolated from your page’s structure, you can’t simply search-and-replace the page to find every occurrence of the classname “ea” without also hitting it in a sentence somewhere (“I went to the beach” or “Welcome to my neat-o website”).
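To make the collision problem concrete, here is a small Python sketch that builds all 676 two-letter names and counts how often a candidate shows up inside some page text. The helper `content_hits` and the sample string are assumptions for illustration only.

```python
import re
from string import ascii_lowercase

# Every possible two-letter name: 26 x 26 combinations.
pairs = [a + b for a in ascii_lowercase for b in ascii_lowercase]


def content_hits(pair: str, text: str) -> int:
    """Count case-insensitive occurrences of `pair` anywhere in `text`.

    A zero-width lookahead is used so overlapping matches are counted too.
    """
    return len(re.findall(f"(?={re.escape(pair)})", text.lower()))


page = "I went to the beach. Welcome to my neat-o website."
print(len(pairs))
print(content_hits("ea", page))  # hits inside "beach" and "neat-o"
print(content_hits("vq", page))  # a pair that never appears in this text
```

Any pair with a nonzero hit count is one that a blind search-and-replace would mangle in the content.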

To minimize content-structure collisions, I want to favor the two-letter combinations that appear least frequently in my content. So I wrote a quick Processing app that examines any given text and returns the full set of letter pairs, sorted by frequency. Here are the results for a few sample texts:

The pairs that are not in bold were not found at all in the sample text (use them for function/class/variable name replacements). Those that were found are displayed by frequency.

Here’s the code. Try inputting text that corresponds to your particular usage for better results.
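For a rough idea of what such a counter does, here is a minimal Python sketch. It is not the Processing app itself: the function names `pair_counts` and `rank_pairs` are made up, and it assumes that only adjacent a-z letters count as a pair (case is ignored, and pairs never span spaces or punctuation).

```python
from collections import Counter
from string import ascii_lowercase


def pair_counts(text: str) -> Counter:
    """Count adjacent letter pairs in `text`, ignoring case."""
    text = text.lower()
    counts = Counter()
    for a, b in zip(text, text[1:]):
        # Skip any pair that includes a space, digit, or punctuation mark.
        if a in ascii_lowercase and b in ascii_lowercase:
            counts[a + b] += 1
    return counts


def rank_pairs(text: str):
    """Return (unused, used) lists of two-letter pairs.

    `unused` holds pairs never seen in the text, in alphabetical order.
    `used` holds the rest, from least to most frequent.
    """
    counts = pair_counts(text)
    all_pairs = [a + b for a in ascii_lowercase for b in ascii_lowercase]
    unused = [p for p in all_pairs if counts[p] == 0]
    used = sorted(
        (p for p in all_pairs if counts[p] > 0),
        key=lambda p: (counts[p], p),
    )
    return unused, used


unused, used = rank_pairs("I went to the beach")
print(len(unused), "pairs are safe to use for this text")
print("found in the text:", used)
```

Pairs in `unused` are the safest replacement names for that particular text; if you run out, the front of `used` is the next best place to look.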

So the next time you’re thinking of renaming “load_excellent_content()” to “le()”… don’t. Try “vq()”.


Written by Don

July 11th, 2008 at 2:06 pm