CategoryRuby

All things ruby (the programming language)

Generating a twitter OAuth access key – the semi-manual way

[UPDATE]
Apparently someone at Twitter was listening, or I’m going senile/blind. Let’s call it a combination of both.

Instead of following all the steps below, you could just login with the Twitter account you want to use on http://dev.twitter.com, register your application and then click ‘Edit Details’ on the application overview page at http://dev.twitter.com/apps. Next click the ‘Application detail’ button on the right, followed by the ‘My Access Token’ button in order to get your Access Token and Access Token Secret.

This makes the old post below rather obsolete. Clearly a case of me thinking everything is a nail and ruby is a hammer (don’t they usually say this about java coders?) 😉

[ORIGINAL POST]

OAuth is great! OAuth allows your application to use your user’s data without the need to ask for their password. So Twitter made the API much safer for their and your users. Hurray! Free pizza for everyone!

Unless of course you’re using the Twitter API for your own needs like running your own bot and don’t need access to other user’s data. In such cases a simple username/password combination is more than enough. I can understand however that the Twitter guys don’t really care that much about these exceptions(?). Most such uses for the API are probably rather spammy in nature.

!!! If you have a twitter app that uses the API to access external user’s data: look for another solution. This solution is ONLY meant when you ONLY need access to your own account(s) through the API.

Other Solutions

Mr Dallas Devries posted a solution here which involves requesting and scraping a one-time PIN.
But: I like to minimize the amount of calls I make to twitter’s API or pages to lessen my chances of meeting the fail whale. Also, as soon as the pin isn’t included in a div called anymore, this will fail.

However, mr Devries’ post was a starting point for my solution, so I’m much obliged to him posting his findings.

Authenticating with the Twitter API: old vs new

Acessing The Twitter API the old way:

require 'twitter'
httpauth = Twitter::HTTPAuth.new('my_account','my_secret_password')
client = Twitter::Base.new(httpauth)
client.update('Hurray!')

The OAuth way:

require 'twitter'
oauth = Twitter::OAuth.new('ve4whatafuzzksaMQKjoI', 'KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY')
oauth.authorize_from_access('123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis', 'fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh')
client = Twitter::Base.new(oauth)
client.update('Hurray!')

In the above case,
ve4whatafuzzksaMQKjoI is the ‘consumer key’ (sometimes also referred to as ‘consumer token’) and
KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY is the ‘consumer secret’. You’ll get these from Twitter when you register your app.

123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis is the ‘access token’ and
fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh is the ‘access secret’. This combination gives the registered application access to your account. I’ll show you how to obtain these by following the steps below.

(Basically you’ll need a bunch of keys and you’ll have to jump a bit through hoops to obtain them for your server/bot. )

How to get these keys

1. Surf to the twitter apps registration page

go to http://dev.twitter.com/apps to register your app. Login with your twitter account.

2. Register your application

Enter something for Application name, Description, website,… as I said: they make you jump through hoops.

If you plan on using the API to post tweets, Your application name and website will be used in the ‘5 minutes ago via…’ line below your tweet. You could use the this to point to a page with info about your bot, or maybe it’s useful for SEO purposes.

For application type I choose ‘browser’ and entered http://www.hadermann.be/callback as a ‘Callback URL’. This url returns a 404 error, which is ideal because after giving our account access to our ‘application’ (step 6), it will redirect to this url with an ‘oauth_token’ and ‘oauth_verifier’ in the url. We need to get these from the url. It doesn’t really matter what you enter here though, you could leave it blank because you need to explicitely specify it when generating a request token.

You probably want read&write access so set this at ‘Default Access type’.

3. Get your consumer key and consumer secret

On the next page, copy/paste your ‘consumer key’ and ‘consumer secret’. You’ll need these later on. You also need these as part of the authentication in your script later on:

oauth = Twitter::OAuth.new([consumer key], [consumer secret])
4. Obtain your request token

run the following in IRB to obtain your ‘request token’
Replace my fake consumer key and consumer secret with the one you obtained in step 3. And use something else instead http://www.hadermann.be/callback: although this will only give a 404, you shouldn’t trust me.

irb(main):001:0> require 'oauth'
irb(main):002:0> c = OAuth::Consumer.new('ve4whatafuzzksaMQKjoI',
'KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY',
{:site => 'http://twitter.com'})
irb(main):003:0> request_token = c.get_request_token(:oauth_callback => 'http://www.hadermann.be/callback')
irb(main):004:0> request_token.token
=> "UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1"

This (UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1) is the request token: Copy/paste this token, you will need this next.

5. Authorize your application

surf to https://api.twitter.com/oauth/authorize?oauth_token=[the above token], for example:

https://api.twitter.com/oauth/authorize?oauth_token=UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1

This will bring you to the ‘An application would like to connect to your account’- screen on Twitter where you can grant access to the app you just registered. If you aren’t still logged in, you need to login first. Click ‘Allow’. Unless you don’t trust yourself.

6. Get your oauth_verifier from the redirected url

Your browser will be redirected to your callback url, with an oauth_token and oauth_verifier parameter appended. You’ll need the oauth_verifier.

In my case the browser redirected to:

http://www.hadermann.be/callback?oauth_token=UrperqaukeWsWt3IAlfbxzyBUFpwWIcWkHP94QH2C1&oauth_verifier=waoOhKo8orpaqvQe6rVi5fti4ejr8hPeZrTewyeag

Which returned a 404, giving me the chance to copy/paste my oauth_verifier: waoOhKo8orpaqvQe6rVi5fti4ejr8hPeZrTewyeag

7. Request an access token

Back to irb, use the oauth_verifier to request an access token, as follows:

irb(main):005:0> at = request_token.get_access_token(:oauth_verifier => 'waoOhKo8orpaqvQe6rVi5fti4ejr8hPeZrTewyeag')
irb(main):006:0> at.params[:oauth_token]
=> "123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis"
irb(main):007:0> at.params[:oauth_token_secret]
=> "fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh"

We’re there!

123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis is the access token.
fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh is the access secret.

Try it!

Try the following to post an update:

require 'twitter'
oauth = Twitter::OAuth.new('ve4whatafuzzksaMQKjoI', 'KliketyklikspQ6qYALcuNandsomemored8pQ6qYALIG7mbEQY')
oauth.authorize_from_access('123-owhfmeyAgfozdyt5hDeprSevsWmPo5rVeroGfsthis', 'fGiinCdqtehMeehiddenymDeAsasaawgGeryye8amh')
client = Twitter::Base.new(oauth)
client.update('Cowabunga!')

Now you can go to your twitter page and delete the tweet if you want to.

Twitter gem – undefined method `stringify_keys’

Have you been getting the following errors when running the Twitter gem lately ?


/usr/local/lib/ruby/gems/1.8/gems/httparty-0.4.3/lib/httparty/response.rb:15:in `send': undefined method `stringify_keys' for # (NoMethodError)
from /usr/local/lib/ruby/gems/1.8/gems/httparty-0.4.3/lib/httparty/response.rb:15:in `method_missing'
from /usr/local/lib/ruby/gems/1.8/gems/mash-0.0.3/lib/mash.rb:131:in `deep_update'
from /usr/local/lib/ruby/gems/1.8/gems/mash-0.0.3/lib/mash.rb:50:in `initialize'
from /usr/local/lib/ruby/gems/1.8/gems/twitter-0.6.13/lib/twitter/search.rb:101:in `new'
from /usr/local/lib/ruby/gems/1.8/gems/twitter-0.6.13/lib/twitter/search.rb:101:in `fetch'
from test.rb:26

It’s because Twitter has been sending back plain text errors that are treated as a string instead of json and can’t be properly ‘Mashed’ by the Twitter gem. Also check http://github.com/jnunemaker/twitter/issues#issue/6.

Without diving into the bowels of the Twitter gem or HTTParty, you could ‘begin…rescue’ this error and try again in 5 minutes. I fixed it by overriding the offending code to return nil and checking for a nil response as follows:

module Twitter
  class Search
    def fetch(force=false)

      if @fetch.nil? || force
        query = @query.dup
        query[:q] = query[:q].join(' ')
        query[:format] = 'json' #This line is the hack and whole reason we're monkey-patching at all.

        response = self.class.get('http://search.twitter.com/search', :query => query, :format => :json)

        #Our patch: response should be a Hash. If it isnt, return nil.
        return nil if response.class != Hash

        @fetch = Mash.new(response)
      end

      @fetch
    end

  end
end

(adapted from http://github.com/jnunemaker/twitter/issues#issue/9)

If you have a better solution: speak up!

Twitter traffic might not be what it seems

Are you using bit.ly stats to measure interest in the links you post on twitter?

I’ve been hearing for a while about people claiming to get the majority of their traffic originating from twitter these days.

Now, I’ve been playing with the twitter ruby gem recently, doing various experiments which I’ll not go into detail here because they could be regarded as spamming… if I’d conduct them on a large scale, that is.
It’s scary to see people actually engaging with @replies crafted with some regular expressions and eliza-like trickery on status updates found using the twitter api. I’m wondering how Twitter is going to contain the coming spam-flood.

When posting links I used bit.ly as url shortener, since this one seems to be the de-facto standard on twitter. A nice thing about bit.ly is that it shows some basic stats about the redirects it performs for your shortened links.

To my surprise, most links posted almost immediately resulted in several visitors. Now, seeing that I was posting the links together with some information concerning what the link is about, I concluded that the people who were actually clicking the links should be very targeted visitors.
This felt a bit like free adwords, and I suddenly started to understand why everyone was raving about getting traffic from twitter.

How wrong I was! (and I think several 1000 online marketers with me)

On the destination site I used a traffic logging solution that works by including a little javascript snippet in your pages. It seemed that somehow all visitors disappeared after the bit.ly redirect and before getting to the site, because I was hardly seeing any visitors there. So I started investigating what was happening: by looking at the logfiles of the destination site, and by making my own ‘shortened’ urls by doing redirects using a very short domain name I own. This way, I could check the apache access_log before the redirects.

Most user agents turned out to be bots without a doubt. Here’s an excerpt of user-agents awk’ed from apache’s access_log for a time period of about one hour, right after posting some links:

AideRSS 2.0 (postrank.com)
Java/1.6.0_13
Java/1.6.0_14
libwww-perl/5.816
MLBot (www.metadatalabs.com/mlbot)
Mozilla/4.0 (compatible;MSIE 5.01; Windows -NT 5.0 - real-url.org)
Mozilla/5.0 (compatible; Twitturls; +http://twitturls.com)
Mozilla/5.0 (compatible; Viralheat Bot/1.0; +http://www.viralheat.com/)
Mozilla/5.0 (Danger hiptop 4.6; U; rv:1.7.12) Gecko/20050920
Mozilla/5.0 (X11; U; Linux i686; en-us; rv:1.9.0.2) Gecko/2008092313 Ubuntu/9.04 (jaunty) Firefox/3.5
OpenCalaisSemanticProxy
PycURL/7.18.2
PycURL/7.19.3
Python-urllib/1.17
Twingly Recon
twitmatic
Twitturly / v0.6
Wget/1.10.2 (Red Hat modified)
Wget/1.11.1 (Red Hat modified)

Of the few user-agents that seem ‘real’ at first, half are originating from an ip-address used by Amazon EC2. And I doubt people are setting op proxies on there.

Oh yeah, Googlebot (the real deal, from a legit google owned address) is sucking up posted links like fresh oysters.
I guess google is trying to make sure in advance to never be beaten by twitter in the ‘realtime search’ department. Actually, I think it’d be almost stupid NOT to post any new pages/posts/websites on Twitter, it must be one of the fastest ways to get a Googlebot visit.

Same experiment with a real, established twitter account

Now, because I was posting the url’s either as ‘status’ messages or directed @people, on a test-account with hardly any (human) followers, I checked again using the twitter accounts from a commercial site I’m involved with. These accounts all have between 500 and 1000 targeted (I think) followers. I checked the destination access_logs and also added ‘my’ redirect after the bit.ly redirect: same results, although seemingly a bit higher real visitor/bot ratio.

Btw: one of these account was ‘punished’ with a 1 week lock recently because the same (1 one!) status update was sent that was sent right before using another account. They got an email explaining the lock because the account didn’t act according to their TOS. I can’t find anything in their TOS about it, can you?
I don’t think Twitter is on the right track punishing a legit account, knowing the trickery I had been doing with it’s api went totally unpunished. I might be wrong though, I often am.

On the other hand: this commercial site reported targeted traffic and actual signups from visitors coming from Twitter. The ones that are really real visitors are also very targeted. I’m just not sure if the amount of work involved could hold up against an adwords campaign.

Reposting the same link over and over again helps

On thing I noticed: It helps to keep on reposting the same links with regular intervals.
I guess most people only look at their first page when checking out recent posts of the ones they’re following, or don’t look too far back when performing a search.

Now, this probably isn’t according to the twitter TOS. Actually, it might be spamming but no-one is obligated to follow anyone else of course.

This way, I was getting more real visitors and less bots. To my surprise (when my programmer’s hat is on) there were still repeated visits from the same bots coming from the same ip-addresses. Did they expect to find something else when visiting for a 2nd or 3rd time? (actually,this gave me an idea: you can’t change a link once it’s posted, but you can change where it redirects to)
Most bots were smart enough not to follow the same link again though.

Are you successful in getting real visitors from Twitter?
Are you only relying on bit.ly to provide traffic stats?

Ruby live regex checker

This Ruby regular expression checker/tester is a hidden gem that’s far too low in the serps: http://rubular.com/.

Ruby crc16 implementation

I needed a crc16 in order to feed some data to a dinosaur.
Google couldn’t help me this time so I rolled my own. The pre-calc table was borrowed from a python implementation.

Hope this saves someone else the hassle…


CCITT_16 = [
0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF,
0x1231, 0x0210, 0x3273, 0x2252, 0x52B5, 0x4294, 0x72F7, 0x62D6,
0x9339, 0x8318, 0xB37B, 0xA35A, 0xD3BD, 0xC39C, 0xF3FF, 0xE3DE,
0x2462, 0x3443, 0x0420, 0x1401, 0x64E6, 0x74C7, 0x44A4, 0x5485,
0xA56A, 0xB54B, 0x8528, 0x9509, 0xE5EE, 0xF5CF, 0xC5AC, 0xD58D,
0x3653, 0x2672, 0x1611, 0x0630, 0x76D7, 0x66F6, 0x5695, 0x46B4,
0xB75B, 0xA77A, 0x9719, 0x8738, 0xF7DF, 0xE7FE, 0xD79D, 0xC7BC,
0x48C4, 0x58E5, 0x6886, 0x78A7, 0x0840, 0x1861, 0x2802, 0x3823,
0xC9CC, 0xD9ED, 0xE98E, 0xF9AF, 0x8948, 0x9969, 0xA90A, 0xB92B,
0x5AF5, 0x4AD4, 0x7AB7, 0x6A96, 0x1A71, 0x0A50, 0x3A33, 0x2A12,
0xDBFD, 0xCBDC, 0xFBBF, 0xEB9E, 0x9B79, 0x8B58, 0xBB3B, 0xAB1A,
0x6CA6, 0x7C87, 0x4CE4, 0x5CC5, 0x2C22, 0x3C03, 0x0C60, 0x1C41,
0xEDAE, 0xFD8F, 0xCDEC, 0xDDCD, 0xAD2A, 0xBD0B, 0x8D68, 0x9D49,
0x7E97, 0x6EB6, 0x5ED5, 0x4EF4, 0x3E13, 0x2E32, 0x1E51, 0x0E70,
0xFF9F, 0xEFBE, 0xDFDD, 0xCFFC, 0xBF1B, 0xAF3A, 0x9F59, 0x8F78,
0x9188, 0x81A9, 0xB1CA, 0xA1EB, 0xD10C, 0xC12D, 0xF14E, 0xE16F,
0x1080, 0x00A1, 0x30C2, 0x20E3, 0x5004, 0x4025, 0x7046, 0x6067,
0x83B9, 0x9398, 0xA3FB, 0xB3DA, 0xC33D, 0xD31C, 0xE37F, 0xF35E,
0x02B1, 0x1290, 0x22F3, 0x32D2, 0x4235, 0x5214, 0x6277, 0x7256,
0xB5EA, 0xA5CB, 0x95A8, 0x8589, 0xF56E, 0xE54F, 0xD52C, 0xC50D,
0x34E2, 0x24C3, 0x14A0, 0x0481, 0x7466, 0x6447, 0x5424, 0x4405,
0xA7DB, 0xB7FA, 0x8799, 0x97B8, 0xE75F, 0xF77E, 0xC71D, 0xD73C,
0x26D3, 0x36F2, 0x0691, 0x16B0, 0x6657, 0x7676, 0x4615, 0x5634,
0xD94C, 0xC96D, 0xF90E, 0xE92F, 0x99C8, 0x89E9, 0xB98A, 0xA9AB,
0x5844, 0x4865, 0x7806, 0x6827, 0x18C0, 0x08E1, 0x3882, 0x28A3,
0xCB7D, 0xDB5C, 0xEB3F, 0xFB1E, 0x8BF9, 0x9BD8, 0xABBB, 0xBB9A,
0x4A75, 0x5A54, 0x6A37, 0x7A16, 0x0AF1, 0x1AD0, 0x2AB3, 0x3A92,
0xFD2E, 0xED0F, 0xDD6C, 0xCD4D, 0xBDAA, 0xAD8B, 0x9DE8, 0x8DC9,
0x7C26, 0x6C07, 0x5C64, 0x4C45, 0x3CA2, 0x2C83, 0x1CE0, 0x0CC1,
0xEF1F, 0xFF3E, 0xCF5D, 0xDF7C, 0xAF9B, 0xBFBA, 0x8FD9, 0x9FF8,
0x6E17, 0x7E36, 0x4E55, 0x5E74, 0x2E93, 0x3EB2, 0x0ED1, 0x1EF0
]

def crc16(buf, crc=0)
buf.each_byte{|x| crc = ((crc< <8) ^ CCITT_16[(crc>>8) ^ x])&0xffff}
crc
end

mod_rails – first impressions

  • Installing was easy as pie. I was missing the Apache development headers and the installer told me which ‘yum’ command I should use to install them. Nice.
  • When you don’t have heaps of memory, remember setting RailsMaxPoolSize to a lower value to avoid swapping. I’d experiment with RailsMaxPoolSize and saturating the number of concurrent requests using ApacheBench in any case. More concurrent threads is only better until a certain point.
  • Apache/HAProxy/Mongrel was a bit slower than Apache/mod_rails, which was a bit slower than HAProxy/mongrel (without Apache). The difference was nothing to loose sleep over. RailsMaxPoolSize was set the same as the nr of mongrels.
  • While ApacheBench testing, touching tmp/restart.txt to force a restart could cause a hang. Only restart of Apache helped.
  • I would get an occasional time-out and ab stopped (heavy testing, large nr of concurrent requests). BUT this also happened with Apache/HAProxy/mongrel (NOT when using HAProxy/mongrel ‘direct’) so I’d guess this is rather an Apache issue.

Mod_rails seems to be a surprising stable and finished product, and this for a first public release! I’m anxious to check out their Ruby Enterprise Edition.

The occasional hang when touching tmp/restart.txt makes me a bit weary, but this only seems to happen under heavy ApacheBench testing with a high number of concurrent threads. The hangs of Apache under heavy load on my test system, even when no mod_rails is involved might indicate this isn’t even a mod_rails problem. I don’t see this being an immediate issue under normal circumstances.

Thank you Phusion! (I don’t like the logo though)

mod_rails

I’ve been planning on checking out nginx for a while. Mainly as an alternative for apache in my currently preferred apache/haproxy/mongrel Rails install.

I guess I’ll have to hold on to apache a little while longer and check this one first: http://www.modrails.com/

From where I’m standing this is the most interesting thing to happen to Ruby on Rails deployment since mongrel.

Update your rails application without downtime

It’s often not possible or wanted to take down your rails application for updates. Even if it is, I still regard downtime as something that should be avoided if possible.

When using the absolutely fantastic HAProxy to load-balance between your mongrels, you can take down a mongrel process without any adverse effects. When a mongrel process is not answering anymore, HAProxy just forwards the request to the next available mongrel and no harm is done.

So all you need is a way to stop and restart each mongrel, one after the other. I cooked up a bash script to do this. It’s not perfect, but I’ve been using it on several projects. The script stops each mongrel, waits until it’s really stopped, then restarts and checks if it’s really running.

Why a bash script ? I wanted to sharpen my (limited) bash scripting skills (so if you have any tips: feel free to comment).

Customize where needed. (properly indented version here)

If you are running monit, remember to ‘unmonitor’ before and ‘monitor’ after.


#!/usr/bin/env bash

#rails project home dir
#!!Change acc to your needs
cd /home/rails/le_test/current

#Ports mongrels are running on
#!!Change acc to your needs
for PORT in 8000 8001 8002
do
#the location of our pid-files
#!!Change acc to your needs
PF=/var/run/mongrel/mongrel.$PORT.pid

#pidfile exists ?
if [ -e $PF ]; then
PR=$(cat $PF)

#process still running ?
if ps $PR > /dev/null 2>&1; then
echo "$PF $PR Running - killing"
kill $PR
while ps $PR > /dev/null 2>&1; do
echo "$PR still running"
sleep 1
done
echo "$PR killed"
else
echo "$PF $PR Not Running - deleting pidfile"
rm -f $PF
fi
else
echo "$PF does not exist"
fi

#check if a command containing $PORT.pid is still running
#in case process is still active but pidfile was not found
if ps ax|grep $PORT.[p]id > /dev/null 2>&1; then
echo "!!! But process containing $PORT.pid running !!!"
echo "!!! NOT auto-restarting, Manual intervention necessary !!!"
else
echo "Restarting port $PORT"
#mongrel start cmd
#!!Change acc to your needs
mongrel_rails start -d -e production -p $PORT -P $PF
sleep 1
#check if process is actually running now (could also be done by pidfile based check)
while ! ps ax|grep $PORT.[p]id > /dev/null 2>&1; do
echo "Not Running"
done
fi

done

This should work with other load-balancers too, but since HAProxy had all features I expected to find and is very resource friendly and has a tiny footprint (and irons a shirt in 2 mins), I didn’t look any further (after looking at Pound, which only advantage is that it’s easy to install/configure and cisco css switches which are just a tiny bit expensive for my needs)

Getting Real by 37signals

I recently read ‘Getting Real‘, a book (only available as pdf) by 37 signals. 37 signals are the people behind the much hyped Ruby on Rails and a bunch of webapps based on this RoR (=Ruby on Rails) framework.

Why I wanted to read this book
I use RoR with success for an ongoing project at my daytime job and also used it for an ‘after-hours’ sideproject of which at least the development was a success thanks to Rails. I’ve also used Ruby (it works without Rails too) for more little projects and ‘glue’ than I can count.
Seeing that Getting Real is almost as hyped as RoR itself I figured I’d shell out the $19 and check what Getting Real is all about. After all, Rails has turned out to be a very productive and really quite fun web development framework. (but then, after years of doing VC++/MFC development even a lobotomy would seem ‘fun’… or even cobol… nah… just kidding about the cobol)
Continue reading

Fixing a rake schema.rb oracle dump

El Problemo
When issuing a ‘rake db:schema:dump’ (or a simple ‘rake test’ since it uses db:schema:dump) on the legacy oracle db mentioned in the previous post the schema.rb contained some weird entries like:

t.column "timestamp", :datetime, :default => #<date : 4903089/2,0,2299161>, :null => false

The culprits were basically field defaults (like the above, using ‘sysdate’ as default) that aren’t handled properly (not even sure if they should). Since this is a legacy db accessed by tons of mainly C/C++ code on which I’m tacking a web ui, changing this to something that suits Rails is usually not possible.

Any other rake tasks using the schema.rb stumbled over these entries so I was basically unable to do a simple ‘rake test’.

Unfortunately I only found the occasional mention of this issue but not a real solution (yet).

El (hack’n’slash) Solution
Basically all I needed db:schema:dump to work correctly for was being able to run tests. For a variety of accidental reasons I choose to run these tests on a mysql db on a w2k box. (at work I’m using xwin under cygwin on a wxp installation to connect to a CentOS server for my rails development… so this might not seem a logical decision… but explaining the reason behind choosing mysql to run the tests is really totally out of scope of this post)

Easiest solution was to filter the badies (renegades and outlaws) out of schema.rb before it was used for something else.

I made an extra db:schema:clean task to read/filter/write schema.rb, called it from db:schema:dump and put these in a file called ‘databases.rake’ in my Rails lib/tasks folder.

databases.rake:

namespace :db do
namespace :schema do
desc "Create a db/schema.rb file that can be portably used against any DB supported by AR"
task :dump => :environment do
require 'active_record/schema_dumper'
File.open(ENV['SCHEMA'] || "db/schema.rb", "w") do |file|
ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, file)
end
Rake::Task["db:schema:clean"].invoke
end
desc "Fix the sysdate timestamps from an oracle schema.rb"
task :clean => :environment do
File.open(ENV['SCHEMA'] || "db/schema.rb", 'r+') do |f|
lines = f.readlines
lines.each do |it|
it.gsub!(/:default => #<date : 4903089\/2,0,2299161>,/, '')
it.gsub!(/:default => "EMPTY_BLOB\(\)",/, '')
end
f.pos = 0
f.print lines
f.truncate(f.pos)
end
end
end
end

(blame WordPress for losing indentation… at least I do!)

Ok… probably really basic stuff for the harcore Railsers out there, but I hope this will help some ‘nuby’ Railsers save some time hunting for where/what/how with this great framework.

(2nd post today, I’ve got another one up my sleeve but will stop now before someone asks me to take a ‘doping test’)