Tag: Google

Embed Google Route Maps of Your Flights into Your Web Site

Do me a favor. Paste this URL into your browser’s address bar:

http://jetrecord.com/routes/sea+jfk

And now paste this one:

http://jetrecord.com/routes/sea+jfk.map?width=450&height=300

Oh, yes, my friend! You can now embed route maps from Jetrecord into your web site, assuming your web site or blog allows you to embed content using iframes or something similar. WordPress does. How do I know? Well …

Why would you use this? Just for fun, of course! Are you flying somewhere cool? Do you know the airport codes for your origin and destination airports? Use the URLs above as a guide. Jetrecord will create your route maps on the fly.

For example, the airport code for Denver International is DEN. The airport code for Boston’s Logan International is BOS. Type this into your browser’s address bar: http://jetrecord.com/routes/den+bos

Jetrecord will draw the map for you and create the route page. Scroll to the bottom of the route page below the Google map. There is some code listed there to embed the map. Copy the code, modify the HTML if you want to (you may want to modify the dimensions of the map and the iframe), and paste it into your web site. Or pass the URL around via email, create a link to it, whatever you feel like doing.
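
If you would rather build these URLs in code than type them by hand, here is a rough Ruby sketch. The URL patterns come from the examples above. The method names are made up for illustration, and the iframe markup is only a guess at what an embed snippet looks like, so prefer the code shown on the route page itself.

```ruby
# Base URL taken from the examples above.
JETRECORD_ROUTES = 'http://jetrecord.com/routes'

# Route page URL for two airport codes, e.g. "DEN" and "BOS".
# Lowercasing the codes is an assumption that mirrors the example URLs.
def route_url(origin, destination)
  "#{JETRECORD_ROUTES}/#{origin.downcase}+#{destination.downcase}"
end

# Embeddable map URL, following the .map?width=...&height=... pattern.
def route_map_url(origin, destination, width = 450, height = 300)
  "#{route_url(origin, destination)}.map?width=#{width}&height=#{height}"
end

# Hypothetical iframe markup; the snippet Jetrecord provides may differ.
def route_map_iframe(origin, destination, width = 450, height = 300)
  # Ampersands inside an HTML attribute are written as &amp;
  src = route_map_url(origin, destination, width, height).gsub('&', '&amp;')
  %(<iframe src="#{src}" width="#{width}" height="#{height}"></iframe>)
end

puts route_url('DEN', 'BOS')
puts route_map_iframe('DEN', 'BOS', 600, 400)
```

Passing the same width and height to both the URL and the iframe keeps the map from being clipped or padded inside its frame.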

Have fun! Oh, and be sure to read the terms and conditions at the bottom of the blog post on Jetrecord.

Google Sitemaps with Ruby on Rails, Capistrano, and Cron

This is a slight modification of code originally written by Alastair Brunton. I recently implemented this for Jetrecord and since Alastair was so generous, I decided to share the love as well. I have changed Alastair’s code to generate a sitemap index file plus sitemap files for each model, all of them gzipped to save on bandwidth.

I have also added Capistrano code to copy sitemap files from the previous release to the current release so we don’t lose our sitemap files when we deploy a new release.

Remember, Google sitemaps are for publicly available URLs. They’re for pages that you want Google to find and index. If you don’t want Google to find your CIA Operatives records, don’t tell Google about them!

Let’s just go straight to the code. I am going from the top down in my application’s root directory.

app/models/your_model.rb

You must add this code to each model that you want to generate a sitemap for. Here is an example for Airports on Jetrecord.

# put this inside app/models/airport.rb
def self.get_paths
  path_ar = []
  self.find(:all).each do |model|
    path_ar << {:url => "/airports/#{model.to_param}", :last_mod => model.updated_at.strftime('%Y-%m-%d')}
  end
  path_ar
end
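
To see the shape of the data that get_paths hands to the generator without touching a database, here is a small sketch. FakeAirport and paths_for are hypothetical stand-ins, not part of the app. A real ActiveRecord model supplies its own to_param and updated_at.

```ruby
# Stand-in for an ActiveRecord model: only the two methods get_paths relies on.
FakeAirport = Struct.new(:code, :updated_at) do
  def to_param
    code.downcase
  end
end

# Same output shape as Airport.get_paths, but fed from an array of records.
def paths_for(records)
  records.map do |record|
    { :url => "/airports/#{record.to_param}",
      :last_mod => record.updated_at.strftime('%Y-%m-%d') }
  end
end

records = [
  FakeAirport.new('SEA', Time.utc(2008, 6, 1)),
  FakeAirport.new('JFK', Time.utc(2008, 6, 9))
]

# Each entry is a hash with a site-relative URL and a YYYY-MM-DD date.
paths_for(records).each { |path| p path }
```

The URL is relative on purpose. The generator prepends the base URL later, so the model does not need to know the domain.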

config/sitemap/sitemap_tasks.rb

This is for Capistrano. You probably don’t have a config/sitemap directory. I created one and put my Capistrano sitemap task in it. This tells Capistrano, “After deploying my new release, copy the sitemap files from the previous release and store them in the same location in the current release.”

Capistrano::Configuration.instance(:must_exist).load do
  namespace :sitemap do
 
    desc "Copy the sitemap files after deploy"
    task :copy_sitemap, :roles => :app do
      puts "copying Rails sitemap files"
      sudo "cp #{previous_release}/public/sitemaps/* #{current_release}/public/sitemaps/"
    end
 
    after :deploy, 'sitemap:copy_sitemap'
  end
end

config/deploy.rb

This file usually contains your typical Capistrano recipes. All you have to do is require the sitemap_tasks file we created above.

# At the top of the file, after any other required files
require 'config/sitemap/sitemap_tasks'

lib/google_sitemap.rb

This is the meat of the whole thing. Kudos to Alastair for setting this up. I changed it to use a sitemap index with a separate sitemap for each model because Google allows at most 50,000 URLs per sitemap file. I have 48,000 navigation fixes, 20,000 airports, and 3,000 navaids in Jetrecord, which is 71,000 URLs in all, so I have to split them across several sitemaps.
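
The arithmetic behind that decision can be sketched in a few lines of Ruby. The counts are the ones quoted above. The constant and the helpers files_needed and split_paths are hypothetical names. Note that the generator in this post writes one file per model and does not split a single model, so the slicing helper only shows what you might add if one model ever grew past the limit.

```ruby
MAX_URLS_PER_SITEMAP = 50_000

# Counts quoted in the text above.
counts = { 'Fix' => 48_000, 'Airport' => 20_000, 'Navaid' => 3_000 }

# How many sitemap files a given number of URLs would need (integer ceiling).
def files_needed(count, limit = MAX_URLS_PER_SITEMAP)
  (count + limit - 1) / limit
end

# Split one long list of paths into chunks that each fit in a sitemap file.
def split_paths(paths, limit = MAX_URLS_PER_SITEMAP)
  paths.each_slice(limit).to_a
end

total = counts.values.sum
puts "total URLs: #{total}"
puts "fits in a single sitemap? #{total <= MAX_URLS_PER_SITEMAP}"

counts.each do |model, count|
  puts "#{model}: #{count} URLs, #{files_needed(count)} file(s)"
end
```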

I’m also gzipping the sitemap files because Google can read them and it saves bandwidth. Oh, and the URL to ping Google has changed, as has the XML namespace for their sitemap tags.

require 'net/http'
require 'uri'
require 'zlib' # for Zlib::GzipWriter in save_file
 
# A class specific to the application which generates a google sitemap from the contents of the database.
# Author: Alastair Brunton
# Modified: Harry Love 2008-06-09
class GoogleSitemapGenerator
 
  def initialize(base_url, sources)
    @base_url = base_url
    @sources = sources
  end
 
  # 1. Iterate through each model's #get_paths method
  # 2. Create sitemap file for each model
  # 3. Create sitemap index file
  # 4. Ping Google
  def generate
    @sources.each do |source|
      # initialize the class and call the get_paths method on it.
      path_ar = eval("#{source}.get_paths")
      xml = generate_sitemap(path_ar)
      save_file(source, xml)
    end
    index = generate_sitemap_index(@sources)
    save_file('index', index)
    update_google
  end
 
  # Create a sitemap document for a model
  def generate_sitemap(path_ar)
    xml_str = ""
    xml = Builder::XmlMarkup.new(:target => xml_str)
    xml.instruct!
    xml.urlset(:xmlns => 'http://www.sitemaps.org/schemas/sitemap/0.9') {
      path_ar.each do |path|
        xml.url {
          xml.loc(@base_url + path[:url])
          xml.lastmod(path[:last_mod])
          xml.changefreq('weekly')
        }
      end
    }
    xml_str
  end
 
  # Create a sitemap index document
  def generate_sitemap_index(sitemaps)
    xml_str = ""
    xml = Builder::XmlMarkup.new(:target => xml_str)
    xml.instruct!
    xml.sitemapindex(:xmlns => 'http://www.sitemaps.org/schemas/sitemap/0.9') {
      sitemaps.each do |site|
        xml.sitemap {
          xml.loc(@base_url + "/sitemaps/sitemap_#{site}.xml.gz")
          xml.lastmod(Time.now.strftime('%Y-%m-%d'))
        }
      end
    }
    xml_str
  end
 
  # Save the xml file (gzipped) to disk
  def save_file(source, xml)
    # 'wb' opens the file in binary mode, which is what gzip output needs
    File.open(RAILS_ROOT + "/public/sitemaps/sitemap_#{source}.xml.gz", 'wb') do |f|
      gz = Zlib::GzipWriter.new(f)
      gz.write xml
      gz.close
    end
  end
 
  # Notify Google of the new sitemap index file
  def update_google
    sitemap_uri = @base_url + '/sitemaps/sitemap_index.xml.gz'
    escaped_sitemap_uri = URI.escape(sitemap_uri)
    Net::HTTP.get('www.google.com', '/webmasters/tools/ping?sitemap=' + escaped_sitemap_uri)
  end
end
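
If you want to convince yourself that the gzip step is lossless before pointing Google at the files, here is a minimal round-trip sketch. It uses only Ruby's standard library and compresses into memory rather than to disk. The gzip and gunzip helpers and the sample XML are made up for the example.

```ruby
require 'zlib'
require 'stringio'

# Compress a string with GzipWriter, as save_file does, but into memory.
def gzip(string)
  io = StringIO.new(String.new) # empty binary buffer
  gz = Zlib::GzipWriter.new(io)
  gz.write(string)
  gz.finish # ends the gzip stream without closing the underlying buffer
  io.string
end

# Decompress gzip data back into the original string.
def gunzip(data)
  Zlib::GzipReader.new(StringIO.new(data)).read
end

# Illustrative content only; the real files come from generate_sitemap.
xml = '<urlset><url><loc>http://yourdomain.com/airports/sea</loc></url></urlset>'

compressed = gzip(xml)
puts "original: #{xml.bytesize} bytes, gzipped: #{compressed.bytesize} bytes"
puts "round trip ok? #{gunzip(compressed) == xml}"
```

A tiny document like this one may not shrink at all, since gzip adds a fixed header and trailer. The savings show up on sitemaps with thousands of similar URLs.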

lib/tasks/sitemap.rake

This is the rake task that we’ll call periodically from Cron to generate new sitemap files.

require 'google_sitemap'
namespace :google_sitemap do
  desc "Generate a Google sitemap from the models"
  task(:generate => :environment) do
    # Generate sitemaps for each of the models listed in the array
    sources = %w( Airport Navaid Fix AnotherModel AnotherModel AndAnotherModel EtCetera )
    sitemap = GoogleSitemapGenerator.new('http://yourdomain.com', sources)
    sitemap.generate
  end
end
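
The model names in sources are plain strings, and the generator has to turn each one into a class before it can call get_paths. One way to do that lookup without evaluating a string as code is Object.const_get, sketched below. This is a suggested alternative, not what the listing above does. The Airport class here is a stand-in, and paths_for_source is a hypothetical helper name.

```ruby
# Stand-in model; in the app this would be the ActiveRecord class.
class Airport
  def self.get_paths
    [{ :url => '/airports/sea', :last_mod => '2008-06-09' }]
  end
end

# Resolve a class from its name, then ask it for its paths.
# Object.const_get only looks up a constant; it does not run arbitrary code.
def paths_for_source(source)
  Object.const_get(source).get_paths
end

p paths_for_source('Airport')

# A misspelled model name raises NameError, which is easy to rescue and report.
begin
  paths_for_source('NoSuchModel')
rescue NameError => e
  puts "skipping: #{e.message}"
end
```

Inside a Rails app, source.constantize from ActiveSupport does the same job.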

public/sitemaps

Assuming this directory doesn’t exist already, create it.

Depending on what stack you’re using to deploy your Rails app, you may also need to tell your server not to proxy HTTP requests for this directory. For example, I’m proxying requests to Mongrel via Apache. So, in the Apache virtual host conf file for my app, I had to add a ProxyPass directive so Apache would serve the sitemap files instead of Mongrel.

# Right after the ProxyPass directives for images, stylesheets, and javascripts
ProxyPass /sitemaps !

Don’t forget to restart Apache after you save the new conf file!

Add a Cron Job

Lastly, you need to add a cron job to call the rake task so we can generate new sitemap files from time to time. The frequency is up to you and the requirements of your app.

Unfortunately, I’m not up to date on raw Cron commands. I use a GUI provided by my web host. But here’s the command I’m using on Solaris to call the rake task. You’ll have to edit this to suit the specifics of your application and server environment.

cd /var/www/apps/myapp/current && /opt/local/bin/rake RAILS_ENV=production google_sitemap:generate

Don’t forget to tell Rake to use the production environment. Another potential gotcha: you usually have to give cron the full path to rake. You can find out where it is on your server by logging in as the user you plan to use for the cron job (usually root) and doing “which rake”. If that doesn’t bring it up it means rake isn’t in your PATH. That’s okay. You’ll just have to do a little more digging to find out where rake is installed on your system.

If I’ve left out anything let me know. By the way, this would make a great plugin or gem, if only I knew how to make them.

Automatically Delete Spam from Gmail

Gmail’s spam filter is pretty good. I get roughly one or two false positives a year. However, I think spam notification in Gmail should be redesigned. Currently, when new spam arrives, the folder link turns bold and displays a mail count. This grabs my attention.
