Celebrate the little victories

Today is something of a landmark for me. Just a few minutes ago, I launched my first full-time commercial venture when I flipped the switches and took Fwd:Vault out of beta. There’s an announcement over at the official blog if you’re interested in the details. Here I’d rather talk about what’s going through my mind, in case any of you are heading down a similar path.

First off, this has been 14 months in the making. I started working on this shortly after the startup I was working for went belly-up in November 2008. Like so many people these days, I found myself facing a lean job market. Starting a business has been a lifelong dream of mine, so after talking it over with my wife — a world-class vet with double my brain power — we agreed that the timing was right for me to pursue my dream.

So many startup publications talk about “taking the plunge,” about overcoming the fear that holds people back from getting started. This was not the case for me, and I’m not sure why it has to be the case for anybody. If you think and plan ahead, you can avoid the worst of the action-paralyzing fear. I’ve wanted to run my own business since I was a kid, which instantly defused fear around the general concept. I kept trying to come up with viable business ideas until I had one that stood up to scrutiny, decreasing some of the fears of failure. I worked on it in my free time until the opportunity to go full-time presented itself, removing the fear of having no income. Knowledge and understanding are the key. If you fear the unknown, know more.

Other people on the entrepreneurial road falter when they look at the work involved. Admittedly, looking back on the last year ‘n change, I’m astounded at how much I’ve done. My Subversion repository had 700 commits when I launched. The site and service cover 1200 files in 175 folders (that doesn’t include framework stuff; I wrote every one of those). I taught myself a library’s worth of new tech, including automated recurring billing, search engines, email syntax, Amazon S3, daemonizing, undocumented PHP functionality, and even more HTML/CSS/JS techniques. On the business side, I registered an LLC, got a business address and phone number, bought servers and domains, began proper bookkeeping practices, won a competition, dealt with consultants, performed basic market research, investigated advertising venues, taught myself basic SEO/SEM, and learned to analyze traffic.

That’s simply a staggering amount of work to think about at once, and I never would’ve gotten any of it done if I had tried to. You simply cannot look at it as a whole all at once and keep your sanity. Every day was just one or two tasks: get a page working, fix an email processing bug, and so on. You know where you are and where you want to go. In between is simply a mountain of very tiny to-do’s. As long as you keep an eye on the prize — launching a business — the task list sorts itself.

Finally, I put the most important exercise in the title of this post. Every time you complete a page, add a feature, piece together another part of your business structure, celebrate it! Relay your latest conquest to your wife, family, friends, whoever will listen. Write a blog post about it (you’ll find tons of posts on this site inspired by my startup efforts).

Even if they don’t care — my wife glazes over every time I get into technical stuff — or nobody listens — this blog averages less than 100 hits/day — you’ll feel energized knowing that you were able to proclaim “I finished something, I took a step.” That’s so crucial, because of all the naysayers you will meet, the worst one is your own self-doubt.

Then, when you finally reach your big goal, mark the calendar, and celebrate that day every year. Savor it when facing your next mountain. And write a blog post, leave a mile marker for the next guy.

My next hill starts tomorrow. For now…
I did it! I started a business!


If you haven’t checked out Fwd:Vault yet…

…I suggest you do so immediately. We’ll have a major announcement by the end of the day, and the perks that come with signing up beforehand will go away at that time. Basically this is your last chance to get into the Fwd:Vault Beta, and enjoy the perks we have planned for our beloved early adopters.

Not-so-subtle hint: Beta users will have the chance to enjoy a serious lifetime discount.


Advice for my teenage self

I ran into the Youth Minister for my high school youth group at a friend’s wedding a few weeks back, and he graciously offered me the opportunity to write a letter to be read aloud at an upcoming retreat to the high school attendees. I was more than happy to oblige.

Below is an excerpt of that letter that I thought would apply to anyone in high school or college, just getting your feet under you, just as life prepares to pull out the rug.

If you’re in that target audience, I truly hope you find something worthwhile here. If, like me, this special time has come and gone for you, what would you tell your past self? I’d love to hear what it is in the comments.


I’d like to pass on a bit of advice. Not the fluffy “seize the future” nonsense you’ll get at your graduation, but real practical advice that you can use today, right now in fact. I asked myself, “If I could go back in time to when I was 15, 16, 17 years old, and give myself that sage knowledge that only shows up in hindsight, what would I say?” This is what I came up with…

  • First things first: the winning Powerball numbers on November 18, 1998 will be
    21 – 25 – 33 – 39 – 46, Powerball 18
  • Frank, at 17, you’re still just a kid. Heck, at 21 you’ll still be just a kid. That’s okay. There’s plenty of time to be an adult. And no, going to school, doing homework, and taking tests does not count as responsibility. It’s nothing compared to the stress you’ll feel when you’ve got bills to pay and a family to care for. Enjoy the complete lack of responsibility while it lasts, just remember that you have just enough knowledge to be dangerous.
  • Here’s the biggest secret of high school: all the other kids are worrying about what everyone else thinks of them too! Everyone is ridiculously self-conscious, and these feelings affect everyone differently. It’s the reason why you see others become divas, bullies, introverts, goths, emos, anti-trenders, etc. Be confident in yourself, because how you feel is exactly how everyone around you feels. That’s the key to genuine popularity in high school, and we all figure it out after the fact. Fortunately, being mindful that we’re all in the same boat is also the key to building long-lasting relationships throughout your life, so the lesson doesn’t go to waste. Still, I always wanted to be popular in high school. For a great example of this mindset in action, go re-watch Ferris Bueller’s Day Off.
  • You end up meeting your wife early senior year of college, right after you take a personal pledge to finally give up worrying about relationships. So quit worrying! Dating is fun and exciting, but dwelling on that relationship can swallow you up, and there’s so much more to do, learn, see, and enjoy right now. Of all the people you’ll meet throughout your life, so far you have only met two couples who knew each other in high school.
  • I know it’s really hard to believe right now, but your parents aren’t as dumb as they seem. They actually get a whole lot of what you’re going through, but talking about it in a candid manner with you right now kinda mucks up the parent-child relationship, since it requires them to bring up their own past mistakes. When you have that conversation with Mom on your first college break about drinking and partying, you’ll understand. For now, trust that they always have your best interests at heart, and never hesitate to ask them questions. As much as I hate telling myself this, the only dumb one in that relationship right now is you.
  • Late senior year, an Adult Leader named Arnold told you and a bunch of your friends that, quote, “By the time you graduate, who you are — your personality — is essentially who you will be for the rest of your life.” It’s some of the best insight you’ll ever get when it comes to dealing with people, including yourself. So if there’s something you’re not happy with — your tendency to be critical of others comes to mind — get working on it right now. You know those jerks in your life, the bullies who you wish would just grow up? Most of them never will, and you’ll meet all new idiots in college and out in the workplace, and they’ll all look the same. These people are a reality, so just start ignoring them right now. On a positive note, the laid back attitude that you’ve fostered will help you through many a tough spot, including an agonizing all-nighter at your first big job when the servers crashed.
  • One of the big reasons Arnold’s advice is so good, is the corollary to it that you discover in college: your decisions matter. All of them. Profoundly. Sex, drugs, drinking, drunk driving, skipping classes, anything illegal…these actions can never truly be undone, and you will carry them with you for the rest of your life. Choose carefully. So far you’ve done okay, but you’ll meet plenty of people who were less fortunate. No, let’s be honest here, “less fortunate” isn’t the right term. “Stupid” is more apt. Don’t be stupid.
  • Don’t fear failure. Be more afraid of missing opportunities due to fear of failure. You’ll see lots of examples where the big difference between wild success and mediocrity is simply showing up. On a related note, you do finally start your own business, just like you’ve always told yourself you would. You’ve been working out of your basement for about a year now, and the business itself isn’t profitable yet, but you’re getting there. The experience is every bit as awesome as you expected.
  • Don’t blink. It’s really easy to keep looking forward to the next milestone. Finishing a school year, getting your driver’s license, being able to vote…life keeps going at the same pace, and later on you only wish it would slow down. Senior years of both high school and college go particularly fast for you. Enjoy every day, even the lousy ones. Like I said, you really don’t have any responsibility right now.
  • One last thing, when you go to return your graduation tuxedo, drive really carefully. This old man abruptly hits the brakes on you and you rear-end him.

That’s what I’d want to tell myself anyway, but hopefully you’ll find something useful in there too.


Run your servers without timezone offsets

I recently made the decision to store times on Fwd:Vault systems in Greenwich Mean Time, or GMT. I decided to do this because I have time-sensitive events happening along several dimensions. Email coming into the system has several timestamps associated with it: the user’s initial delivery, relay from their mail server, and receipt by the Fwd:Vault mail server. Payment receipts come into Fwd:Vault from our billing provider, get stored in my system, and are made available to the user.

Up until now, my server time was set to US Eastern, where both I and the server physically reside. Then I started building the code to display local time based on a user’s selected timezone.

Ugh.

Here’s the problem: displaying local time requires at least one time conversion, from server time to the user’s timezone. If the time is initially set to anything other than no-offset GMT, you have two calculations to do, from the server timezone to GMT, then GMT to user timezone. You can do it, of course, but who really wants to write even more code?
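To make the single-conversion case concrete, here is a minimal sketch, assuming times are stored as plain 'Y-m-d H:i:s' strings with no offset. The function name to_user_time() is made up for illustration; it is not part of the Fwd:Vault code.

```php
<?php
// Hypothetical helper (not from the Fwd:Vault codebase): takes a timestamp
// stored in GMT/UTC and formats it for the user's selected timezone.
function to_user_time($utc_datetime, $user_timezone) {
  // Tell PHP explicitly that the stored value carries no offset.
  $dt = new DateTime($utc_datetime, new DateTimeZone('UTC'));
  // The one and only conversion: GMT to the user's timezone.
  $dt->setTimezone(new DateTimeZone($user_timezone));
  return $dt->format('Y-m-d H:i:s');
}

// Prints 2010-01-15 13:30:00 (US Eastern is five hours behind GMT in January).
echo to_user_time('2010-01-15 18:30:00', 'America/New_York'), "\n";
```

Because the timezone database handles daylight saving, the same call works year-round without any extra arithmetic on your part.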

Now add to this equation the fact that most data-delivery systems have settled on sending time data in GMT. A very good practice, to be sure, but it presents the need to do another timezone conversion when the data come into your systems. Going back to my example, I had to convert payment times from GMT to US Eastern before dropping them into my database.

Finally, add to the mix the potential for time data coming in from more than one source with more than one offset. Again back to my case, payment data is GMT, as is the Twitter feed I store and display on the site. Meanwhile, email was set to US Eastern. This matched the server and MySQL database where all the data ends up residing, so I was still looking at just one time conversion. But what happens down the road, when my server configuration changes, or I move to another timezone?
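One way to tame multiple sources is to normalize everything to GMT on the way in. The sketch below assumes each incoming timestamp either carries its own offset or comes with a known source timezone. The to_utc() name and the sample values are illustrative only.

```php
<?php
// Hypothetical helper: normalizes an incoming timestamp to GMT/UTC for storage.
// $source_timezone is only used when the string has no offset of its own;
// PHP ignores it when the string already includes one (e.g. "-05:00").
function to_utc($timestamp, $source_timezone = 'UTC') {
  $dt = new DateTime($timestamp, new DateTimeZone($source_timezone));
  $dt->setTimezone(new DateTimeZone('UTC'));
  return $dt->format('Y-m-d H:i:s');
}

// Three made-up inputs that all describe the same instant:
echo to_utc('2010-01-15 13:30:00', 'America/New_York'), "\n"; // offset-less, US Eastern
echo to_utc('2010-01-15T13:30:00-05:00'), "\n";               // explicit offset
echo to_utc('Fri, 15 Jan 2010 18:30:00 +0000'), "\n";         // email-style date header
```

With this in place, every source is converted exactly once on arrival, and nothing downstream needs to know where the data came from.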

Tying this information to me makes as much sense as tying it to any one of my users. It’s the same rationale that data service providers use when delivering GMT time data: it applies to me, and it applies to you too.

I’m just too lazy to try and keep all that timezone switching straight in my head.

If you find yourself in the same scenario, save your sanity and your future support efforts. If you run a website that (a) displays time-sensitive data, and (b) allows users to create an account, you really owe it to everyone involved to store time in a neutral fashion and adjust time displays according to the user’s selected timezone.


Open source mentoring

I was quoted in another IT World article last week discussing how mentoring occurs in open source communities. Below is the full text I sent to the author, in case anyone wanted more background on my comments.

In my experience, mentoring in open source projects occurs between the project leaders (i.e. artisans) and their most active community members (i.e. journeymen). In other words, rarely does the complete newbie directly benefit from the knowledge of the team. They must first absorb what the team has to offer in the form of their project code — prove themselves worthy, in a sense — and then join the high-level discussions.

I have a perfect example from my own experience. The Zen Cart project fosters a vibrant support community; however, direct, unscheduled contact with team members is strictly prohibited.

Unfettered developer access to the team is limited to a select few community members who are proactively contacted by the team, instead of the other way around. I was fortunate to fall into this category, but it required serious upfront work.

I taught myself the Zen Cart platform in order to launch my then-employer’s first ecommerce site. During my time launching and maintaining the site, I developed several utilities for the program in order to fill some holes in their functionality. For example, the available accounting reports were lackluster, so I built a custom reporting tool to output all the sales figures our accountant would need. I released this and several other modules back into the community as a “thank you” for all the support they had provided. Attention from the team followed, culminating in an offer to join them as a support team member.

Here (finally!) begins the mentoring. The team shared their private development plans so I could coordinate my modules with the release schedule, and offered direct advice on how to improve my offerings. In turn, I provided feedback on my experience with the program, offering recommendations on future areas of improvement.

But the relationships quickly extended beyond the Zen Cart project itself, and today I consider the team members professional colleagues. I contact the team members any time I need advice or support on my projects, Zen Cart-related or not. I just had a phone call with one of them last week to discuss online billing setups for my current project. Who better to seek help with online payments than the programmer whose checkout code has reached “featured cart” status with PayPal?

In this economy, these relationships are invaluable. They have made themselves available as professional references if I need them which, according to one interviewer, just makes my resume levitate off the pile.

The article is chock full of additional insights and perspectives, so be sure to check it out.


You need to know SEO

I’ll admit it, I was real lazy getting on the SEO train. It took starting my own company for me to finally start paying attention. SEO for my previous major site work was handled for me. ClassicWines had other staff dedicated to the issue, and Destination ImagiNation had such a huge network of affiliate sites that the SEO literally handled itself.

Even this blog went unattended in the SEO category. I had posts tagged, I submitted the site to the search engines and Technorati, I installed Platinum SEO Pack, I figured that was good enough.

It came to me shortly after the early soft launch of Fwd:Vault. I had dutifully installed Google Analytics to monitor traffic. I logged into the service for the first time a few weeks after things were rolling, and my search results sucked. I showed up for one term: “fwd”. I used their keyword tool to see what they were primarily pulling off the site, and most of it was the legalese from the policy pages. That’s when I knew this would require some serious attention.

If you find yourself in the same position, you owe it to yourself to get educated. The benefits for a startup are obvious, and I don’t know an existing company that wouldn’t like more traffic. Plus SEO knowledge/ability is a great resume booster.

On board? Great! Here’s how to get started.

First some reading. These are all SEO-related blogs that currently reside in my Google Reader setup.

Now that you’ve got the info, let’s get our hands on some utility sites. I’m not going to explain how you use these sites, that should be obvious from your self-education outlined above.

So there you go. Take all that stuff, add a few weeks of study, and you’ll know all you need to do a decent SEO job.

Sidebar: hiring outside help
I really do mean “decent.” SEO specialists can claim they know the voodoo better than you, but most of that is smoke these days. SEO is not quite the wild west it was in the late 90’s and early zeros; effective practices have become more standardized and the tools to maximize that effectiveness more available. Speaking practically, most will provide access to network relationships you can leverage for link sharing, subscriptions to the more expensive SEO tools (the Enterprise version of SEM Rush costs $500/month), and their own cocktail of page optimizations.

Nonetheless, they definitely bring a wealth of experience to the table, just as any other expert would. So you should look at hiring an SEO specialist just as you would any other position. Just as small businesses keep their own books until they’re big enough to warrant an accountant, you owe it to yourself (and your wallet) to give it a shot. Look for outside help if your own efforts prove fruitless. If nothing else, you’ll be more educated and ready to negotiate with your SEO specialist.


Philly Startup Leaders

Living in a Philly suburb, I never thought my Philly proximity would have any effect on my startup, Fwd:Vault. However, that was before I discovered the Philly Startup Leaders.

The group comprises small businesses at all stages, run by people of all experience levels and backgrounds, and the mailing list we share is invaluable on its own. Add in access to startup events and conferences in the area, not to mention original events like “Entrepreneurs Unplugged” and the totally unique “Fishbowl” event, and you’ve got a must-have tool for any bootstrapper who considers Philly the closest major city.

If that includes you, what are you waiting for? Go sign up!


Fix emails dropped or blocked by Comcast

As an email-based backup service, Fwd:Vault ran into spam filters pretty quickly. Most of this can be mitigated with proper server configuration and getting records in the right places (i.e. abuse.net). From there it’s simply a matter of reminding users to check the spam folder when things are missing.

However, through the tribulations of one of my testers, I found out that Comcast goes the extra mile for users of their comcast.net webmail. Unlike most setups, where spam is simply redirected to a spam-specific folder, Comcast will delete the message outright, without issuing any kind of notice to the sender or recipient.

Truly, above and beyond (belief).

Of all the lousy IT practices I’ve seen over the years, this one takes the cake. No spam filter is perfect, so it’s guaranteed that they are dropping legitimate emails (case in point: I’m losing Fwd:Vault account emails). Plus it appears they default to a “highly suspicious” mode with newer systems, as fwdvault.com, my IP address, and my DNS records are completely fresh and unblemished.

Finally, the sheer size of their operation means that the odds of getting hold of anyone to actually fix the problem when it happens to you are virtually zero. I’d go so far as to say that they can get away with this nonsense precisely because they are a large ISP. As a former “your company IT guy,” I can imagine getting at least an earful, and at worst a pink slip, if I were caught doing this.

Despite my astonishment, I couldn’t deny reality. Through my logs I watched Fwd:Vault’s mail server find their systems, connect, and deliver the message and get a 250 response code (i.e. all good). Then over in my comcast.net inbox I’d get exactly nada, ditto for the spam folder. Since the actual delivery had no technical issue, I had zero clue as to the cause of the problem. I wasn’t on any blacklists, the IP was static, and my DNS records were in good order, including a reverse DNS record with my hosting service.

Fortunately, it seems that someone in the trenches at Comcast is fighting the good fight, as I took two long-shot attempts today and it seems one of them paid off. Here’s what I did, hopefully it works for you.

1. Use the feedback form at comcastsupport.com
I tried to retrace my steps on how I found this one, but their sites are so damn convoluted I kept going in circles. However I know I started from inside the web mail interface, aka their “SmartZone”.

(See kids? That’s what we call irony. Can you say, “irony?”)

Whatever, here’s the link. You don’t need to log in to use the form:

http://www.comcastsupport.com/forms/net/sccfeedback.asp

I selected Spam or Junk Mail in the checkboxes and wrote something to the effect of:

I am not receiving mail from example.com in my Comcast email. I own and operate the mail server for this domain and have confirmed through my logs that the message is delivered properly (response code 250) to Comcast MX servers.

My tests delivered via the server mx.comcast.net (IP 00.00.00.00). It’s been over 24 hours and I have not received a bounce, nor is anything showing up in my inbox or spam folder.

As I have nothing else to go on, I am looking for help from your end.

I did not receive any reply, however I also took another step…

2. Use their RBL Removal Form
This should only apply if your mail server has actually been blocked by Comcast, in which case you would likely see an error code of 550 in your logs. If your server picks up the full response from Comcast, you may also get additional helpful information as outlined in their list of custom mail delivery error codes.
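If you would rather not eyeball the logs, the distinction between an accepted message (250) and a block (550) can be checked mechanically. This is a rough sketch that only looks at the leading three-digit SMTP reply code. The function name and the sample reply lines are invented for illustration and are not actual Comcast output.

```php
<?php
// Hypothetical helper: classifies a remote server's SMTP reply line by its
// three-digit code. 2xx means accepted, 4xx is a temporary failure (the
// sending server will normally retry), and 5xx is a permanent rejection,
// which is where a block such as a 550 would show up.
function smtp_reply_class($line) {
  if (!preg_match('/^(\d{3})(?:[ -]|$)/', $line, $m)) {
    return 'unknown';
  }
  $code = (int)$m[1];
  if ($code >= 200 && $code < 300) return 'accepted';
  if ($code >= 400 && $code < 500) return 'temporary failure';
  if ($code >= 500 && $code < 600) return 'permanent failure';
  return 'other';
}

// Made-up reply lines, purely for illustration:
echo smtp_reply_class('250 2.0.0 Ok: queued'), "\n";
echo smtp_reply_class('550 5.7.1 Message rejected'), "\n";
```

An "accepted" result with nothing in the inbox or spam folder is exactly the silent-drop situation described above, so the RBL form is unlikely to be the right tool for it.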

None of this applied to me, as the connection and delivery went off without a hitch. Still, I figured it was worth a shot; a bureaucracy this big is bound to have systems running into one another.

I sent in a request to be removed from their RBL by way of this form:

http://www.comcastsupport.com/Forms/NET/blockedprovider.asp

Most of the information will depend on your setup, however I did check the boxes for Implemented technology to filter or prevent transmission of spam and Changed the rDNS records to reflect a consistent and non-dynamic setting just in case. I included text similar to what I outlined earlier in the Issue Description box.

I saw emails coming through less than 30 minutes after sending this message. However, I sent the feedback first, followed by a brief online chat with their support, who directed me to the RBL form. All told it was at least an hour between my first step and the delivered message.

Update: I received this message back in response to my RBL request…

Thank you for contacting Comcast Customer Security Assurance. We have received and reviewed your RBL removal request.

Below each IP address you submitted in your request, we have included the result of our research. Please do not reply to this message.

[IP address(es)]

We have received your request for removal from our inbound blocklist. After investigating the issue, we have found that the IP you provided for removal is currently not on our blocklist.

We need the IP address currently blocked to further investigate this issue. The IP address is a number separated by decimals and is located in an error code starting with “550” in the returned email from Comcast. You can learn more about how to identify a blocked IP by visiting our Frequently Asked Question page at:
http://www.comcast.net/help/faq/index.jsp?faq=SecurityMail_Policy18667

Please verify the IP(s) and resubmit your request to http://www.comcastsupport.com/rbl

So it looks like the RBL request didn’t do anything. Unless it did, and some numb-nut at Comcast was covering for their idiotic policies.

My gut tells me that I caught a particularly helpful support person manning the feedback desk who was able to punch the few keys it took to rectify the problem. If that’s the case, thanks for the help, and I hope the rest of you get to run into him/her as well. I sent the message around 2:00 pm on a Monday.

You can find more helpful information, including a link to the Blacklist Removal Request Form, on the Comcast Postmaster Site.

Best advice I can give: encourage your users to switch to Gmail. :)


Mentioned in recent IT World article

I was recently quoted in an article over at IT World, discussing underused developer tools (e.g. security testers). My quote is on page 2:

http://www.itworld.com/development/74088/developer-tools-you-dont-use-and-why-you-dont-use-them

Also, FYI, I am on vacation the rest of this week; we return to our regular schedule next Monday.


Archive your entire Twitter timeline

My code for displaying Twitter posts on your site is pretty handy, but it does have drawbacks. Each page load involves calling a remote URL, downloading a resulting XML file, and parsing the results, increasing your load times and using bandwidth. To minimize the impact, you can really only display a handful of the most recent posts.

Plus, the downloaded stream is never saved. Google does index Twitter, but the thoroughness and benefit to you are subject to much speculation.

We can solve both problems by locally storing and serving Twitter posts ourselves. Once you have them in your own system, you can display as many of them as you want without expensive external URL lookups. Plus, with the content centrally located on your site, getting Google to index and apply it to your rankings is straightforward.

Note for SEO geeks:
Yes, I am aware that displaying and indexing Twitter posts on your own site does technically fall under the category of duplicate content, so save your typing.

Given the disparate nature of Twitter content and the utter disconnect from my sites, I’m not too concerned about incurring a penalty for it. Your opinion and experience may vary. You should at least familiarize yourself with Google’s rules for duplicate content. If you’re paranoid, consider applying canonicalization to pages that display large portions of a Twitter timeline.

Let’s get started
The end of the post includes a link to download all the code, as well as a link to a live demo.

I am assuming that you’ve got a standard PHP/MySQL stack for your site, ideally running on Linux, super-ideally Debian (Digg uses it for a reason, you know).

I am also assuming that you know how to use it; bring a decent understanding of SQL, PHP, and basic web programming. Here’s your first test: the demo assumes your PHP installation is version 5 and includes the SimpleXML extension.

First, here’s the SQL CREATE TABLE statement for the table that our example will use. Apply this to your database:

CREATE TABLE IF NOT EXISTS twitter (
  `id` bigint(10) unsigned NOT NULL,
  `created_at` datetime NOT NULL,
  `source` varchar(255) NOT NULL,
  `in_reply_to_screen_name` varchar(255) NOT NULL,
  `text` varchar(255) NOT NULL,
  UNIQUE KEY `id` (id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Now let’s have a look at the class, which is the meat of the entire thing:

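class Twitter {
  // Numeric ID of the Twitter account whose timeline gets archived.
  public $id;

  public function __construct($twitter_id) {
    // Cast to int so the ID is safe to drop straight into the API URL.
    $this->id = (int)$twitter_id;
  }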
 
  public function user_timeline($page, $count = '200', $since_id = '') {
    $url = 'http://twitter.com/statuses/user_timeline/' . $this->id . '.xml?count=' . $count . '&page=' . $page;
    if ($since_id && $since_id != '') {
      $url .= '&since_id=' . $since_id;
    }
    $c = curl_init();
    curl_setopt($c, CURLOPT_URL, $url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 3);
    curl_setopt($c, CURLOPT_TIMEOUT, 5);
    $response = curl_exec($c);
    $responseInfo = curl_getinfo($c);
    curl_close($c);
    if ($response != '' && intval($responseInfo['http_code']) == 200) {
      if (class_exists('SimpleXMLElement')) {
        return new SimpleXMLElement($response);
      } else {
        return $response;
      }
    } else {
      return false;
    }
  }
 
  public function rebuild_archive($your_timezone) {
    $orig_tz = date_default_timezone_get();
    date_default_timezone_set('GMT');
    $tz = new DateTimeZone($your_timezone);
    $sql = "SELECT id FROM twitter ORDER BY id DESC LIMIT 1";
    /**
     * INSTALLATION
     * execute $sql on your DB to get the latest twitter post
     * set the value of `id` to a variable named $since_id
     * set $since_id to false if the table is empty (i.e. a new install)
    **/
    $tweet_count = 0;
    for ($page = 1; $page <= 200; ++$page) {
      if ($twitter_xml = $this->user_timeline($page, '200', $since_id)) {
        foreach ($twitter_xml->status as $key => $status) {
          $datetime = new DateTime($status->created_at);
          $datetime->setTimezone($tz);
          $created_at = $datetime->format('Y-m-d H:i:s');
          $sql = "INSERT IGNORE INTO twitter
                    (id, created_at, source, in_reply_to_screen_name, text)
                  VALUES (
                    '" . $status->id . "',
                    '" . $created_at . "',
                    '" . addslashes((string)$status->source) . "',
                    '" . addslashes((string)$status->in_reply_to_screen_name) . "',
                    '" . addslashes((string)$status->text) . "'
                  )";
          /**
           * INSTALLATION
           * Execute $sql over your DB here
          **/
          ++$tweet_count;
        }
      } else {
        break;
      }
    }
    $sql = "ALTER TABLE twitter ORDER BY `id`";
    /**
     * INSTALLATION
     * Execute $sql over your DB here
    **/
    date_default_timezone_set($orig_tz);
    return $tweet_count;
  }
}

Twitter::user_timeline()
This method is a modified version of my previous twitter_status() function.

The big difference is that we’re passing additional arguments to Twitter’s user_timeline API call: count (specifies the number of statuses to retrieve) and page (specifies the page of results to retrieve).
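To see how those arguments end up in the request, here is a small standalone sketch that builds the same URL as user_timeline() using http_build_query(). The function name user_timeline_url() is made up for this example, the endpoint is copied from the class above, and the since_id value is arbitrary.

```php
<?php
// Hypothetical helper mirroring the URL logic in Twitter::user_timeline().
// The endpoint is the one used in the class listing; nothing here contacts Twitter.
function user_timeline_url($twitter_id, $page, $count = 200, $since_id = null) {
  $params = array(
    'count' => (int)$count, // number of statuses per page
    'page'  => (int)$page,  // which page of results to fetch
  );
  if ($since_id) {
    // Only ask for statuses newer than the last one already stored.
    $params['since_id'] = $since_id;
  }
  return 'http://twitter.com/statuses/user_timeline/' . (int)$twitter_id
       . '.xml?' . http_build_query($params, '', '&');
}

echo user_timeline_url(12345678, 1), "\n";
echo user_timeline_url(12345678, 2, 200, '9876543210'), "\n";
```

Letting http_build_query() assemble the query string keeps the encoding of each value out of your hands, which is one less thing to get wrong when more arguments are added.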

Twitter::rebuild_archive()
This method takes the results from user_timeline() and places them in your DB. Its lone argument is the string representation for the timezone of your server. To find out what the string is and why you need it, just read the second post of my twitter series. For me on the US east coast, I use 'America/New_York'.

Quick Warning
Hopefully you noticed several large comment blocks with INSTALLATION in all caps: I didn’t include any code to run SQL over your DB. Every system includes its own wrapper for database calls, including mine, so I’m not wasting time writing out SQL inserts using raw PHP functions that you’ll just remove. Find the three blocks labeled “INSTALLATION” and follow the instructions to execute the listed SQL.
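If you don’t have a wrapper handy, PDO is one possible way to fill in those blocks. The sketch below is an assumption on my part rather than the code that runs on Fwd:Vault: $db is taken to be a PDO connection to the MySQL database holding the `twitter` table, and both function names are invented. Prepared statements also make the addslashes() calls unnecessary.

```php
<?php
// Hypothetical helper for the first INSTALLATION block: returns the newest
// stored tweet ID, or false when the table is empty (i.e. a new install).
function latest_tweet_id(PDO $db) {
  return $db->query('SELECT id FROM twitter ORDER BY id DESC LIMIT 1')->fetchColumn();
}

// Hypothetical helper for the second INSTALLATION block: stores one tweet.
// INSERT IGNORE is MySQL syntax; it silently skips IDs already in the table.
function store_tweet(PDO $db, $id, $created_at, $source, $reply_to, $text) {
  $stmt = $db->prepare(
    'INSERT IGNORE INTO twitter
       (id, created_at, source, in_reply_to_screen_name, text)
     VALUES (?, ?, ?, ?, ?)'
  );
  // Bound parameters are escaped by the driver, so no addslashes() is needed.
  return $stmt->execute(array($id, $created_at, $source, $reply_to, $text));
}
```

Inside rebuild_archive() you would then set $since_id = latest_tweet_id($db) before the loop, call store_tweet() with the string-cast fields of each $status, and run the final ALTER TABLE statement with $db->exec($sql).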

Now we just need to run it.

require('/path/to/twitter.class.php');
$Twitter = new Twitter('12345678');
$Twitter->rebuild_archive('America/New_York');

We instantiate the class and pass the ID number of our Twitter account. You’ll find instructions on getting this number about halfway down my first post on displaying Twitter updates. After that, a single call to Twitter::rebuild_archive() will grab all available updates and store them.

If the `twitter` table is empty, it will grab your entire Twitter timeline, up to 3200 posts. If you have more than 3200 posts, you’re out of luck for the time being, although I’d recommend you take a break from the computer, take a shower, and say “Hi” to the wife and kids.

After the first run, subsequent runs will only grab new posts by way of the API’s since_id argument.

If you have the access, you can easily make this into a cron job:

#!/usr/bin/php5
<?php
require('/path/to/twitter.class.php');
$Twitter = new Twitter('12345678');
$Twitter->rebuild_archive('America/New_York');
?>

Save that last block of code to a file, set it to be executable (chmod 755 usually), and set the job to run hourly. That top line identifies the interpreter that the system should use to read the file. You may need to change it to reflect the location of the PHP executable on your system.

Want to see everything described above in action? Check out the Developer’s Diary on Fwd:Vault.

Don’t worry about cut ‘n paste, just download the zip file with the class and all the examples:
Twitter Archiver (.zip)

Update 08-19-2009: Removed references to function calls specific to my framework.

Update 12-16-2009: The `id` field has been bumped up to a BIGINT. Twitter ID numbers are bigger than what an unsigned INT field can hold.

