GitHub down yet again

Tuesday, February 2nd, 2010

I am experimenting with a new Engine Yard account today and their setup basically requires you to have your projects in git. Since the site came from a private svn repository, I decided I’d go ahead and sign up for a paid GitHub account so I can host my projects there privately instead of having to set up my own git server. I should have trusted my gut, because not more than a few hours later here I am trying to access my GitHub account and the site is broken again.

Rails threading with Spawn plugin

Tuesday, December 8th, 2009

My previous post was about using Thread in Rails which simply doesn’t work properly when you’re doing anything with ActiveRecord despite what anyone else is claiming.

This post will focus on my second attempt at a solution to intra-request threading in Rails. Basically, I have a Rails app where I want to run multiple computations at the same time. Since I had problems with Ruby’s native Thread class previously, I had no intention of going down that route again. I decided to try out the Spawn plugin.

Spawn was able to successfully segregate my multiple threads, or forks in my case. This got around the MySQL errors I reported in my previous post but it created a whole new set of problems all its own.

The first problem I had was that the spawn forks I was creating weren’t able to communicate their changes back to the main Rails request. I was creating forks and they were running fine and even saving data via ActiveRecord. I could confirm this via the console. The problem was that I needed to know, inside the main Rails request, what data was being written by the spawn forks. It was as if the main Rails request didn’t even know that something new had been written to the database. I was able to get around this by forcing a reload from the database on the object I was trying to get association data from.

No big deal, and it seems to make sense to do that anyway. That’s not where the real problem occurs though. After I got past that I realized that, since each fork was essentially using a ‘copy’ of the database and not actually live connections, the data validations randomly failed. Basically, if two forks write data to the database at the same time when there is a validates_uniqueness_of on the data, they both pass validation and both write to the database. The result is a database full of incorrect data which should never have passed validation. I’m still not sure if that’s a problem with Spawn or an inherent problem with ActiveRecord’s connection pool.
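One plausible explanation, offered as a sketch and not a confirmed diagnosis, is the ordinary check-then-insert race: a uniqueness validation looks for an existing row and then writes, and nothing stops a second writer from doing its look in between. The toy below uses a hypothetical in-memory ToyTable in place of a real database (none of these names come from Spawn or ActiveRecord) and forces the unlucky ordering so the result is repeatable. It also shows the usual remedy, which is to let the store itself enforce uniqueness the way a unique index would.

```ruby
# Toy in-memory "table" standing in for a database. All names are hypothetical.
class ToyTable
  class DuplicateKey < StandardError; end

  attr_reader :rows

  def initialize(unique_index: false)
    @rows = []
    @unique_index = unique_index
  end

  # What a uniqueness validation does: look, but do not lock.
  def taken?(value)
    @rows.include?(value)
  end

  # With a unique index the store itself refuses the duplicate.
  def insert(value)
    raise DuplicateKey, value if @unique_index && @rows.include?(value)
    @rows << value
  end
end

# Two workers validate first and write afterwards, which is the
# interleaving that concurrent forks or threads can produce.
def racing_writes(table, value)
  checks = 2.times.map { !table.taken?(value) } # both see "not taken"
  checks.each do |valid|
    next unless valid
    begin
      table.insert(value)
    rescue ToyTable::DuplicateKey
      # the second writer is turned away by the index
    end
  end
  table.rows
end

puts racing_writes(ToyTable.new, "bob@example.com").size
puts racing_writes(ToyTable.new(unique_index: true), "bob@example.com").size
```

Without the index both rows land even though both validations passed. With it, the second insert is rejected no matter how the checks interleave.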

Fun with Rails ActiveRecord and Ruby’s Thread

Tuesday, December 8th, 2009

I’ve been working on threading a Rails application lately and after reading headlines like ‘Rails is thread safe’ I figured how hard could it be. My first discovery was that when people talk about Rails and threads there are two different types of threading in Rails.

      Multiple request threading – This is where Rails itself threads among different requests to your web server and allows ActiveRecord to behave properly without having to keep a copy of Rails in memory for each request.
      Intra-request threading – This is where you have 1 request to your web server and inside the action you want to create multiple threads that run concurrently.

I’ll be talking about intra-request threading. In particular, I want my threads to execute some code, read and write to the database, and play nicely with each other. My first attempt was to use Ruby’s Thread class. This seemed to work somewhat until I started seeing strange errors coming in from MySQL. The problems seemed to occur randomly, and what I determined was that the threads were trying to write to the database at the same time, which caused collisions of sorts and resulted in ‘lock wait timeout exceeded’ errors.

After considerable Googling, I found numerous posts about setting:

ActiveRecord::Base.allow_concurrency = true

The problem is that this setting is deprecated in the newest version of Rails in favor of connection pooling.
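For anyone unfamiliar with the term, connection pooling means each thread borrows its own connection for a unit of work and hands it back afterwards, so threads don’t share one. Here is a rough plain-Ruby sketch of that idea. TinyPool and FakeConnection are made-up names for illustration and this is not how ActiveRecord implements its pool.

```ruby
require "thread" # Queue (already loaded on current Rubies)

# Hypothetical stand-in for a database connection.
FakeConnection = Struct.new(:id)

# Minimal pool: idle connections wait in a thread-safe queue.
class TinyPool
  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << FakeConnection.new(i) }
  end

  # Check a connection out for the duration of the block, then return it.
  def with_connection
    conn = @available.pop # blocks until a connection is free
    yield conn
  ensure
    @available << conn if conn
  end
end

# Start `count` threads that each borrow a connection and record its id.
def run_workers(pool, count)
  used = Queue.new
  threads = count.times.map do
    Thread.new do
      pool.with_connection { |conn| used << conn.id }
    end
  end
  threads.each(&:join)
  Array.new(used.size) { used.pop }
end

ids = run_workers(TinyPool.new(2), 5)
puts ids.size
```

Five threads share two connections here. A thread that finds the pool empty simply waits until another thread returns one.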

The short answer: Don’t use Ruby’s Thread class within Rails when doing anything with ActiveRecord.

The World’s Oyster screencast

Saturday, September 5th, 2009

Here is Micah Friedline with another screencast. This time it’s about The World’s Oyster, a self-updating address book. Stay in touch with friends, business contacts, and more with the easy to use address book that keeps itself updated as you move.

The World’s Oyster

Thursday, August 20th, 2009

It’s been a long year and I’ve had very little time to keep things updated around this site, but here’s a preview of something we’ve been working on with a client. It’s a self-updating address book so you never lose track of your contacts when their information changes. Sign up, try it out, and let us know if you have any ideas for improving it.

The World’s Oyster

Rails HTML Sanitize gem

Saturday, January 17th, 2009

I was recently working on improving the search engine rankings of a site with lots of user generated content and noticed that users were creating 404s through bad links. The users were able to add links to other sites in their comments and such, but sometimes the links were bad. Sometimes they were even local links, so the search engines were effectively seeing a bunch of internal 404s from the user generated content. This was essentially defeating any SEO being done elsewhere on the site and needed to be fixed quickly. My original idea was to use Hpricot to scrub all the anchor tags and add a rel="nofollow" attribute to them all. I was mulling over how to write the Hpricot parsing code when I found the Sanitize gem. It does exactly what I needed and saved me the hassle of writing the parsing code myself. The gist of it is:

Sanitize.clean(html, Sanitize::Config::BASIC)

As an added bonus, it can also scrub out unwanted script tags and more. Now the site won’t be nicked for having internal 404s from the user generated content, since the links will all have rel="nofollow" on them.
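To make the nofollow step concrete, here is a rough plain-Ruby approximation of what happens to each anchor tag. It is only an illustration: add_nofollow is a hypothetical helper, it is not how Sanitize works internally, and a regex like this is easily fooled by unusual markup, which is exactly why a real parser or the gem is the better choice.

```ruby
# Illustration only: add rel="nofollow" to anchor tags that have no rel attribute.
# A regex cannot handle every shape of HTML; prefer a real parser in production.
def add_nofollow(html)
  html.gsub(/<a\b([^>]*)>/i) do
    attrs = Regexp.last_match(1)
    if attrs =~ /\brel\s*=/i
      "<a#{attrs}>"                   # leave existing rel values alone
    else
      "<a#{attrs} rel=\"nofollow\">"  # append the attribute
    end
  end
end

puts add_nofollow('<p>See <a href="/missing-page">this</a></p>')
```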

Ditching Mongrel for mod_rails

Sunday, May 25th, 2008

I build a lot of Rails apps on a regular basis and each one I add to my server takes another bite out of my limited resources. The way I’ve traditionally set up a new Rails app is with a Mongrel cluster. I found it to be a lot more reliable and faster than the fcgi approach people used to use (and some still do). The downside to setting up a few dozen Rails apps on your server, each running a Mongrel cluster, is that it eats up all your memory. One of my sites is starting to get a lot more traffic than it has in the past and it’s putting additional strain on the server. As a result I decided to find an alternative to Mongrel. I’ve tried searching for alternatives in the past but everything sent me back to Mongrel. Until today, of course, when I came across Jamie Flournoy’s blog about mod_rails.

Excited for an alternative to raising a pack of resource hungry mongrels on my server, I installed the gem and tried it out. It was exactly what I was looking for as far as ease of use straight away. All I needed to do was stop a mongrel cluster and simplify its virtual host directive in Apache to leave out the mod_proxy_rewrite and the other wonky rewrite rules. The first app I tested went smoothly, but suddenly the server started misbehaving. Resources were being eaten and it wasn’t clear what was doing it, because the app I was testing with is behind an Apache password and I’m the only user. I ended up having to turn off mod_rails to get my system back under control. The problem turned out to be that by default mod_rails tries to test whether your virtual host directory is a Rails app or not. I have a few apps where I tossed an instance of WordPress into a blog directory inside my Rails app directory. I found it convenient to keep them in the same directory since it’s all the same website. As a result mod_rails was doing a ../ check to see if the blog directory was a Rails app, which it decided it was. That’s where the craziness came in, because it’s a PHP application. Anyway, the quick solution was to move the blog directory out of the Rails app directory.

Other than that, my memory usage is way down. I’ve migrated all my low traffic sites to mod_rails and I’m happy with how they’re performing. There is a little delay on the initial load of the app, but subsequent calls are quick because it’s already loaded. I can wait an extra 2-5 seconds for my low traffic apps to load in exchange for hundreds of extra megs of free memory.

I haven’t moved over my higher traffic money making sites yet and I’m not entirely sure I will until I’ve tested mod_rails a bit more. I’m extremely happy with the results thus far though.

Reducing Rails model callbacks

Tuesday, April 22nd, 2008

I’ve been working with a client to optimize parts of their Rails application. The problem is that a method in the app does some simple updating of a few model objects but because the model has so many relations it goes through a ton of unnecessary callbacks. There are issues related to data concurrency which means you have to do the callbacks but in this particular situation there won’t be any concurrent updates to the data so the callbacks can be omitted. The solution to the problem was trivial when using the save_without_callbacks Rails plugin. Just adding a simple:

some_object.skip_callbacks = true

before the update_attribute reduced the number of SQL UPDATE queries from 90 to 8. Lesson learned. If you need to skip model relation callbacks on save this plugin is for you. Be careful about data concurrency issues though.

Googlebot and redirect_to :back

Thursday, February 28th, 2008

The other day I noticed a pretty significant SEO related problem with using a built in Rails construct. I noticed a problem when I started getting application errors that were letting me know that the user agent was none other than our friendly Googlebot. A closer look at the app shed light on a problem you may not have even expected. When using

redirect_to :back

it will take a look at the HTTP_REFERER header and redirect the user to that url. The problem, however, is that Googlebot doesn’t send a referrer and neither do a whole bunch of other search engine spiders. The result is that when they visit your site they get a nice 500 server error because Rails raises an exception. It doesn’t know what url to redirect to, so it sends a 500 error. Googlebot then sees your site as a bunch of 500 errors in the situations where you’re using redirect_to :back. Take a look at the Rails API and you’ll see the last line clearly mentions this.

The solution is to catch the RedirectBackError that redirect_to raises when there’s no referrer. It’s a simple fix but one you need to be on the lookout for, or else you might end up with a few 500 errors giving you bad mojo with the Google gods.
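The shape of that fix, sketched in plain Ruby so it runs outside a Rails app: the error class and the redirect_back method below are stand-ins I made up to mimic the behavior described above, and redirect_back_or with its fallback path is a hypothetical helper. In a real controller the same rescue would wrap the redirect_to :back call.

```ruby
# Stand-in for the error Rails raises when there is no referrer.
class RedirectBackError < StandardError; end

# Stand-in for redirect_to :back: returns the target url or raises.
def redirect_back(referer)
  raise RedirectBackError, "no referer set" if referer.nil? || referer.empty?
  referer
end

# Go back when we can, otherwise fall back to a known-good url.
def redirect_back_or(referer, fallback)
  redirect_back(referer)
rescue RedirectBackError
  fallback
end

puts redirect_back_or("/articles/5", "/") # browser that sent a referrer
puts redirect_back_or(nil, "/")           # spider that sent none
```

The fallback target is a design choice. The home page is a safe default, but the listing page for the resource in question is often friendlier.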

Rails 2.x rant about project evolution and legacy systems

Friday, February 8th, 2008

I’ve made the leap and I’m starting to use Rails 2.x for all my apps now. Overall it was a pretty smooth transition over to 2.x. One thing I noticed immediately after running the scaffold generator, ./script/generate scaffold Something, was that the layout template was named something.html.erb. What’s this new .erb extension, I thought to myself. I did a little searching and found Ryan’s Scraps where he explains the reason for the change. It’s all about semantics. Notice the post date was almost a year ago.

It makes me want to reflect on Rails in general. Am I really a year behind here? I’ve been developing in Rails for a few years now and love it for all the right reasons. One of the most important things the framework offers is its constant adaptation to better ideas. The Rails project seems to be one of the most fluid projects I’ve ever seen in terms of embracing new and better techniques for doing things. I think that simple fact is what gives it an edge.

In this industry, a developer has to constantly stay ahead of the curve and always be thinking of how to do things better. I end up getting into the trenches and working on a project for a while, but by the time I come up for air I’m doing things the old way already. Here I am upgrading my skills to Rails 2.x and I’m already feeling like I’m old school. Rails apps move quickly and if you don’t pay attention you’ll just as quickly get behind. Other frameworks seem like they’re more consistent over time, and as a programmer there’s often little to learn on new releases. Rails is different. It forces you to reconsider what you already know. Take routes for instance. Now everyone is using RESTful routes. You learn the old way and they reinvent it under your nose. It’s a beautiful dynamic that forces me as a developer to stay on my toes and to constantly improve my own skill set.

One problem that I face, however, is the complexity of managing multiple projects across multiple Rails versions. Rails luckily allows me to freeze the framework into a project, so my problem isn’t incompatibilities. It’s the complexity that arises from having 20 or so projects all running different Rails versions. Some of my projects run older versions of Rails 1.x. Some run the newest 1.x version and even newer projects are now running 2.x. Sure, you can say that I should upgrade all those projects, but that only works in theory. In practice, resources are limited and some projects have priority over others. Time is limited so it’s usually spent where it’s needed most. Smaller projects fall through the cracks and sooner or later they seem like old legacy apps even though they’re only maybe a year old. I suppose it’s just another part of this industry: dealing with legacy systems. Not every legacy system can be upgraded to the latest and greatest. Of course the purist programmers out there want to keep everything current and follow every single best practice. Practicality reigns supreme though, and resources will always remain limited. The best any developer can do is to constantly try to reinvent themselves and never close the door to new ideas. Once you become comfortable in what you know is when you’re going to miss the next big idea.