What comes after the yottabyte?

I was recently reviewing the data storage requirements for a project, which had me talking in terabytes and thinking long-term in petabytes. For those of you who don’t know, tera- and peta- are the prefixes for units of digital information that come after giga- (as in “gigabyte”).

The list of prefixes, which most people started using with the term “kilobyte,” is collectively called the SI prefixes. SI prefixes are defined under the International System of Units (“SI” for short), which is maintained by the International Bureau of Weights and Measures.

But there’s a problem: currently the list has only five more prefixes past giga: tera, peta, exa, zetta, and yotta. Translation: we’re out of prefixes in just a few more generations, probably faster if Moore’s Law has anything to say about it. Indeed, you may have noticed how much quicker the public dialogue went from megabyte to gigabyte than it did from kilobyte to megabyte. What to do?

One answer is to draw from existing lists of unofficial prefixes, currently used mostly by theoretical mathematicians. But given the speed at which we are advancing, it is safe to assume that we’re going to speed through individual prefixes quickly, resulting in more terminology overlap in common vernacular. For example, data centers are already discussing their capacity in the petabyte and exabyte range, and we haven’t even seen those hit mainstream yet.

Therefore I posit that it’s worth considering using a system that has inherent ordering, i.e. an existing list that we can repurpose. This speeds adoption, eliminates confusion, and solidifies the naming convention for a much longer period with minimal effort.

Looking at the problem in this light, I quickly identified the Greek alphabet as a very viable candidate:

  • Current SI prefixes match mnemonically with Greek letters (you can hear the similarity between “gigabyte” and “thetabyte”).
  • The list is universal, eliminating the need to debate on future prefix selection and ordering. Anyone who’s suffered “death by committee” knows what I’m talking about.
  • It can instantly extend the SI prefix list another 24 levels.
  • The Greeks themselves attached numerical values to each letter.
  • There is some precedent: The National Weather Service names tropical storms after Greek letters once the A-Z naming list is exhausted.

Best of all, the doomsayers of the future might point to the Omegabyte as a sign of the end times. I think I know what the Mayans were thinking:

Mayan Calendar Mystery



Write code like they do in Hollywood

Want to look like a badass hacker in front of your friends?

Head over to hackertyper.com and just start pushing buttons.

As long as you avoid mashing your palms against the keyboard, your non-techie friends will look at you like this.

matrix shades


Brian Rolle machine gun celebration

I’m not much of a sports guy, but this sack celebration by Eagles’ Linebacker Brian Rolle in last night’s game against the Giants has to be one of the best defensive celebrations I’ve ever seen.

Brian Rolle machine gun celebration

Equal parts hilarious and badass!

(Apologies for the gif format, the NFL gestapo makes it hard to find real video clips.)


Fix WordPress “Fatal error: Allowed memory size” messages

You may (or may not) have noticed this site was down for the past few days, displaying a blank page no matter what URL was entered. After recalling that I had turned off PHP’s on-screen error output, I switched it back on and was presented with this lovely message:

Fatal error: Allowed memory size of 33554432 bytes exhausted
(tried to allocate XX bytes) in /some/file on line XX

This is a general PHP error, but the exact amount (33554432 bytes) occurs very frequently on WordPress sites. Here’s what’s happening, and how to fix it.

Every installation of PHP has an ini value called memory_limit, which hard-caps how much memory PHP may use on any given request. There are other ini settings that limit specific actions, like post_max_size, but this one is universally applied; you can’t load any more than this value dictates.

I do a lot of stuff with files in PHP, so I blow this value out from its default of 128 MB up to 2 GB. I set this globally in my php.ini file like so:

memory_limit = 2G

I could also set this value from inside an .htaccess file:

php_value memory_limit 2G

Since this value was set before the site even loads, imagine my surprise when PHP told me that the limit was now 33554432 bytes, or 32 MB. The limit was changed by a constant in WordPress called WP_MEMORY_LIMIT.

Here’s where things go awry. WordPress tries to check the existing memory_limit before applying its own terms; however, it assumes the value is set in terms of MB. Look back at my example settings; the PHP memory shorthand allows me to set the value in gigabytes using a “G”. But WordPress does a straight integer comparison to figure out if the existing setting is more or less than 32 MB. Here’s the offending code, a portion of an if statement on line 42 in wp-includes/default-constants.php:

( (int) @ini_get('memory_limit') < abs(intval(WP_MEMORY_LIMIT)) ) )

What you end up with in that sample is essentially if (2 < 32), because the integer cast keeps the leading digits and drops the "G" suffix. That resolves true, and thus WordPress applies its own default limit.
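To see the mismatch in isolation, here is a minimal shell sketch of the two ways of reading a shorthand value. The function names int_cast and to_bytes are hypothetical helpers made up for this illustration; they are not part of PHP or WordPress, and the sketch only handles plain K, M, and G suffixes.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not PHP or WordPress functions.

# Mimic an integer cast on a shorthand string: keep only the leading digits.
int_cast() {
    digits=$(printf '%s' "$1" | sed 's/^\([0-9]*\).*$/\1/')
    printf '%s\n' "${digits:-0}"
}

# Convert shorthand with a K, M, or G suffix into a byte count.
to_bytes() {
    num=$(int_cast "$1")
    case "$1" in
        *[Kk]) echo $((num * 1024)) ;;
        *[Mm]) echo $((num * 1024 * 1024)) ;;
        *[Gg]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}

# What the straight integer comparison sees: 2 versus 32.
echo "naive: $(int_cast 2G) vs $(int_cast 32M)"

# What the settings actually mean in bytes.
echo "bytes: $(to_bytes 2G) vs $(to_bytes 32M)"
```

Comparing the byte counts instead of the bare integers would rank 2G above 32M, which is the behavior I expected in the first place.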

You have three options to rectify the situation: two easy fixes, plus one badass "server-baller" fix. First the easy ones. As WordPress outlines in the article on increasing memory, simply define WP_MEMORY_LIMIT in wp-config.php using MB syntax. If I wanted to set it to 2 GB, it would look like this:

define('WP_MEMORY_LIMIT', '2048M');

Your value will be honored as long as it is an integer value greater than 32, but could break if WordPress changes its memory size comparison calculation.

Alternatively, I could change the memory_limit setting directly in php.ini or .htaccess to use MB syntax:


[php.ini]
memory_limit = 2048M

[.htaccess]
php_value memory_limit 2048M

Realistically I would probably edit php.ini. Well, that is, I would, if I didn't know about the badass server-baller third option, which overrides php.ini AND prevents WordPress (or any other site for that matter) from making any changes to the value.

Look back at my .htaccess setting, and notice that I set the value using the command php_value. PHP has a similar command called php_admin_value, which you can use for setting all the same stuff as php_value.

The difference: php_admin_value can only be set within Apache configurations, i.e. it cannot be used inside of .htaccess files. Furthermore, any value set using php_admin_value cannot be overridden by any later calls to php_value or ini_set. You can override php.ini defaults, and lock that value in for the remainder of the request.

So back to our WordPress example, I opened the VirtualHost for frankkoehl.com and entered this line:

php_admin_value memory_limit 2G

Now the site will always allow a memory limit of 2 GB, no matter what. WordPress can try to set whatever value it likes, but it will never be recognized.

Note that this function requires that you have access to Apache configurations, at least the ones running your site. Most shared hosting packages won't offer this level of access.

It's worth mentioning that WordPress sets a default value of 32 MB, lower than PHP's own default of 128 MB. I disagree that such a low limit is necessary, but they are trying to be good stewards to servers everywhere, so I can see where they're coming from.

I still have no idea why this memory issue popped up in the first place, but it gave me an excuse to share a cool configuration option that has saved me from having to scour code on several bloated, crappy sites to find obscure ini settings. I have a recent real-world example you can read about on Stack Overflow.


Recover hijacked default keyword search engine in Firefox

Earlier today I installed what I thought was an update to the Xvid codec in order to watch a video. I should have been more careful with the source, as their installer proceeded to modify my Firefox installation, adding some junk toolbar called “Start Now” and changing my default search engine to Bing.

(Sidebar: they can plead ignorance all they want, I’m certain the MS overlords are using every back-alley approach they can find to break the Google stranglehold.)

The toolbar was easy enough to remove: find it in the add-ons and push the “Remove” button. Changing my keyword search engine back proved a little more difficult.

A little background: the keyword search engine is the site that comes up when you type a word or phrase (not a URL) into the address bar (not the search bar). For example, if you open Firefox, enter “hot koehl”, and click go, you’ll likely end up on a Google search results page. I disable the search box entirely and use this method for almost all my searching. Very handy.

These asshats modified the Firefox configuration to send me to Bing without any warning or consent. To change it back, open a new tab and type about:config in the address bar. Enter the phrase keyword.url in the search bar that appears near the top.

The default is Google, so we can simply reset it by right-clicking the line and choosing the “Reset” option. Restart your browser and you should be good to go.

This type of adware typically modifies the search bar as well, so you should have a look at the config settings for browser.search while you’re at it. Config entries appear in bold if they’ve been modified from the default value.

Further reading on Firefox’s keyword functionality can be found over at mozillaZine.


Using hashmarks for URL anchors in Apache rewrites

Today I had to make an Apache rewrite that redirected a custom URL not only to a different page, but also to a specific anchor link on the destination page. In other words, /foobar had to actually load /some/other/url#foobar.

However, by default Apache rewrites will escape the hash (#) symbol, converting it to its percent-encoded equivalent, %23.

So this rewrite…

RewriteRule ^/foobar/?$ /some/other/url#foobar [R=301,L]

will produce this URL…

http://example.com/some/other/url%23foobar

In order to make this rewrite work, we must prevent Apache from escaping the hash mark by using the noescape flag. A little tweak to the rewrite, and we’re good to go…

RewriteRule ^/foobar/?$ /some/other/url#foobar [R=301,L,NE]

See that little NE on the end? That’s all we need to make rewrite anchors work.

Documentation on all the rewrite flags can be found in the Apache docs for the RewriteRule Directive.


Remove parent directories from tar archives

You run a Linux web server, and have painstakingly crafted custom backup processes for all your important data. Undoubtedly, the backup copies end up being stored in the form of a tar archive. Everything works great — copies are made, compressed, and sent off-site. There’s just one naggy little issue: whenever you open one of those tar backups, it includes the entire directory tree above the folder you actually need.

This problem always drove me nuts when creating MySQL database backups. I want to make a tar backup of /var/lib/mysql/database_name and call it database_name.tar.gz. If I then unpack the tar under /root, I’ll have a bunch of worthless subfolders to drill down through: /root/var/lib/mysql/database_name. All I really want is the database_name folder, the /var/lib/mysql/ parent directories can all go away.

Here’s how to get rid of those pesky parent folders once and for all. Before running your tar command, you must first use cd to move into the parent directory that contains the folder you want. Then, within your call to tar, leave the parent folders off your target files/folders.

Here’s the command sequence for the previous example:

cd /var/lib/mysql
tar -czf /any/folder/you/want/database_name.tar.gz database_name

The resulting file will unpack directly into a single subfolder called database_name.

If you run your tar process as part of a larger scripting event, your script might not run/maintain a shell interface. In these cases, the cd command will be executed in a vacuum, and won’t have any effect. I run into this problem when I try to perform shell magic inside of PHP scripts.

Fortunately, there’s an easy workaround: you can chain multiple shell commands together using &&. With &&, each command runs to completion before the next one begins, and the next one only runs if the previous one succeeded, so tar never runs from the wrong directory if the cd fails.

Here’s what our MySQL example looks like as a single line:

cd /var/lib/mysql && tar -czf /any/folder/you/want/database_name.tar.gz database_name
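If you would rather skip the cd entirely, tar can change directories on its own with the -C option. Here’s a rough sketch that builds a throwaway copy of the directory layout under a temp folder, so it is safe to run anywhere; the scratch paths and the dummy table.frm file are made up for the demonstration, so swap in your real ones.

```shell
#!/bin/sh
# Sketch only: scratch paths under a temp dir stand in for the real ones.
set -e
work=$(mktemp -d)
mkdir -p "$work/var/lib/mysql/database_name"
echo "dummy" > "$work/var/lib/mysql/database_name/table.frm"

# -C makes tar switch into the parent directory itself, so no cd is needed.
tar -C "$work/var/lib/mysql" -czf "$work/database_name.tar.gz" database_name

# List the archive: entries start at database_name/, not var/lib/mysql/.
listing=$(tar -tzf "$work/database_name.tar.gz")
echo "$listing"

# Clean up the scratch directory.
rm -rf "$work"
```

Because the directory change happens inside tar, this form also works from scripts that don’t keep a shell session between commands.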

Enjoy your tar backups sans worthless parent directories!


Key-based logins for SSH

Here’s the scenario: You have a remote-hosted Linux server that you currently access via SSH by entering a username and password. You would like to use a public/private key pair so that you don’t have to enter a password every time you log in. You do most of your work from a Windows/Mac client, where you generally don’t mess with the command line and would rather not bother to learn.

Here’s how to enable key authentication for the root user on your remote Linux server. Once you’ve gone through the process you’ll be able to easily replicate it for any other user accounts that require SSH access.

1. Create the key pair from your server
I don’t know what your local system looks like, nor do I care. We know the server has everything you need since it already has SSH installed, so let’s just do the heavy lifting from there!

SSH into your server as root, and run the following commands:

cd ~
mkdir -p ~/.ssh
chmod 700 ~/.ssh
ssh-keygen -t rsa

Your output from the call to ssh-keygen will look something like this:

Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.

You may enter a passphrase to encrypt the key, but keep in mind you’ll have to enter that passphrase every time you wish to connect, effectively defeating our purpose for using keys in the first place. Just leave the passphrase lines blank when prompted.

[Sidebar]
In practice, the connection procedures for passphrase’d SSH keys and a standard username/password combo look exactly the same: connect to server > enter password > get access. However, using keys with a passphrase is far more secure than a simple username/password challenge. A would-be attacker needs the key file AND the passphrase in order to get in. This extra layer of connection requirements raises the security bar significantly, especially if you’re careful about where/how you store the file and passphrase.
[/Sidebar]

2. Which keys go where?
If you navigate to /root/.ssh, you’ll find 2 files: id_rsa (the private key) and id_rsa.pub (the public key). Here’s the part that causes the most confusion for newcomers to key-based authentication:

The public key (id_rsa.pub) belongs on your remote server.

The private key (id_rsa) resides on your local client, or whatever machine is connecting to the remote server.

It seems counter-intuitive for some people, especially if you’re familiar with SSL. With SSH keys, the remote server holds the public key and checks every connection request against it, placing the onus on the client (i.e. you) to prove it holds the matching private key.

3. Install the public key on the server
The public key is on the server, but not in the right place yet. We need to place it where SSH will look for it when a key-based connection is made.

cd ~/.ssh
cat id_rsa.pub >> authorized_keys2

SSH looks for potential public key matches in the file authorized_keys2. On many systems, including most current OpenSSH installs, the file is called authorized_keys (no number two) instead. The >> redirect appends the entire key to that file, or creates the file if it doesn’t already exist.
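One thing worth checking after this step is file permissions, since sshd’s StrictModes setting (on by default) can ignore a key file that other users are able to write to. Below is a minimal sketch of the install step with the permissions locked down. It runs against a scratch directory standing in for ~/.ssh, and the key line is a made-up placeholder, not a real key.

```shell
#!/bin/sh
# Sketch only: a scratch directory stands in for ~/.ssh.
set -e
ssh_dir=$(mktemp -d)
keys_file="$ssh_dir/authorized_keys2"   # use authorized_keys if your system expects that

# Placeholder public key line for the demonstration (not a real key).
echo "ssh-rsa AAAAB3PLACEHOLDER root@server" > "$ssh_dir/id_rsa.pub"

# Only the owner may enter the directory or read/write the key list.
chmod 700 "$ssh_dir"
cat "$ssh_dir/id_rsa.pub" >> "$keys_file"
chmod 600 "$keys_file"

# Show the resulting permissions.
ls -l "$keys_file"
```

On the real server, the same two chmod calls against ~/.ssh and the key file itself are all you need.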

At this point the file id_rsa.pub is no longer needed. For the sake of security, you may tuck it away somewhere for safe-keeping, and delete the server copy.

4. Install the private key on your local client
Pull id_rsa (without the “.pub”) down to your local machine using your file transfer method of choice. Once you have a local copy, you should remove the one located on the server, and throw another copy in a dark corner somewhere for safe keeping. This is especially important when setting up key-based authentication for the root user: if you lose the key and disable password logins (discussed below), you will be unable to connect to your server as root. No bueno.

At this point you need to add the key to the SSH connection details of whatever SSH client you use. This is fairly straightforward with nearly any SSH client, save one: PuTTY. For whatever reason, the PuTTY team uses a special format for SSH keys, and we need to convert your private key to that format before we can connect.

5. Convert SSH private key for PuTTY on Windows

  1. Open the PuTTY Key Generator. The executable is called puttygen.exe
  2. From the Menu select Conversions > Import key
  3. Browse to your local copy of id_rsa and choose Open.
  4. Click the Save private key button, listed under Actions on the lower portion of the window.
  5. Save the resulting .ppk file in a permanent location (PuTTY will look for the file in this location every time you connect)
  6. Open PuTTY, and load the saved session for your server
  7. In the configuration menu, navigate to Connection > SSH > Auth
  8. Click the Browse button under Private key file for authentication, find your new .ppk file, and click Open
  9. Be sure to save the changes made to your PuTTY session!

There are plenty of tutorials on PuTTY around the web should you require more information.

6. Harden SSH by disabling password authentication
Using key-based authentication allows you to enforce better protection on your server from SSH-based attacks. Test your connection before going any further.

Connect to your server using your fancy new SSH key, and open /etc/ssh/sshd_config in your favorite editor. Find the entry for PasswordAuthentication, or add the line as follows:

PasswordAuthentication no

Save and close the file, then restart SSH (on Debian/Ubuntu systems the command will be /etc/init.d/ssh restart). That will prohibit all password-based SSH connections while still allowing you to connect as root, but only using your key. Again, make sure your key-based authentication is working properly before applying this change.


Translate signs through video with Word Lens

I was trotting along the tubes today with my trusty steed StumbleUpon, when I came across a new iPhone app called Word Lens. I could try to explain how they use the camera to translate signs and display the translation as video right on your phone in real time, but it wouldn’t do justice to their awesome demo:

Mobile apps now officially amaze me.


PHP IDS website down

PHP IDS is an intrusion detection system written in PHP. It allows developers to incorporate security analysis directly into their systems. Since it’s written in PHP, it can go wherever the site goes, and has minimal server requirements. Plus, since it works at the site level, a developer can pick and choose how and when to apply it to site actions. I love it.

Unfortunately, I went to the PHP IDS site today to grab the latest version, only to discover the site is either temporarily down or closed entirely. The error message is in German, but Google Translate gives the following translation:

This server is no longer in operation.

Please tell the operator that the DNS on the new IP 46.4.40.248 convert their.

schokokeks.org

Anyone know what’s going on?

Update April 4, 2011
Looks like they had some domain issues, and are now back at phpids.org. The lead developer of PHPIDS, .mario, has a post detailing the shutdown and return.

