Sinatra or Rails

My boss recently asked me: if I had to choose a Ruby web framework, which one would it be? This is what I found.

I have only written toy examples in Rails (a long time ago) and the same goes for Sinatra (though I’ve looked at it more recently). However, I’m quite familiar with two frameworks in the Python world: Django (which is roughly equivalent to Rails) and Bottle (which is inspired by Sinatra).

Short answer: really an unfair comparison. Rails is a big framework that provides a lot of features, while Sinatra is stripped down and aims at handling HTTP requests and responses. Rails’ use case is a big website, or one that is intended to become big; Sinatra’s sweet spot is small websites or simple APIs. When it comes to choosing a complete website framework, a more equal comparison would be among Rails, Padrino and Ramaze.

Longer answer:
For me the most compelling distinction between the two frameworks is the amount of code each has. If I check out the GitHub repos for Sinatra and Rails and run cloc over them, I get the following:

    Project    Files    Blank    Comment      Code
    Sinatra       53     1516        358      8268
    Rails       1745    32740      33770    152274

With Sinatra having about 5% of the code of Rails, it should be clear that there’s no way it could hope to be as featureful as Rails. They are therefore addressing very different goals. According to Sinatra’s maintainer:

“…Rails is a framework focused on writing model driven web applications, Sinatra is a library for dealing with HTTP from the server side.”

This is pretty clear from their introductory examples: Rails starts off with scaffolding, introducing directory structure and setting up a database, while Sinatra dives straight in with routing and an explicit GET request.

Because Rails is opinionated software, a lot of design decisions about how to construct a web application have been made for you. In contrast, Sinatra is just a wrapper around Rack and the request/response cycle. All those design decisions (ORM, Ajax, REST, how to handle DB migrations, and so on) still have to be made when developing with Sinatra. However, this can be Sinatra’s advantage: because there is so little code and ‘magic’, Sinatra is really fast. It can be up to 4 times faster than Rails.
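That “wrapper around Rack” claim is quite concrete: a Rack application is any object that responds to `call(env)` and returns a `[status, headers, body]` triple. As a hand-rolled sketch (no gems loaded, not Sinatra’s actual implementation), here is roughly what a Sinatra `get '/hello'` route boils down to:

```ruby
# A Rack application is any object responding to #call(env) that returns
# a [status, headers, body] triple. Sinatra's route DSL ultimately builds
# handlers like this one.
app = lambda do |env|
  if env["REQUEST_METHOD"] == "GET" && env["PATH_INFO"] == "/hello"
    [200, { "Content-Type" => "text/plain" }, ["Hello, world!"]]
  else
    [404, { "Content-Type" => "text/plain" }, ["Not Found"]]
  end
end

status, _headers, body = app.call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/hello")
puts "#{status} #{body.first}"   # prints: 200 Hello, world!
```

Everything Sinatra adds (pattern matching in routes, params, templates) is convenience layered over this one method call.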

What’s more, you don’t need to feel like you’ve made the wrong choice, or feel hamstrung, if your app outgrows Sinatra: as of Rails 3.1, you can embed a Sinatra application within your Rails application.

Ultimately, I think the comparison is probably unfair, given the different goals of the two frameworks. In terms of full-featured web frameworks, a more equal comparison can be made among Rails, Padrino and Ramaze, which aim at the same goals with differing philosophies. The interesting thing about Padrino is that it is built on top of Sinatra. If you don’t like Rails and Sinatra is too stripped down, Padrino represents a nice middle ground: it has a lot of the features you need in a web framework while being as performant as Sinatra.

Delivering files with Spring MVC

When you want to deliver a file to a user with Spring MVC, you can use the built-in FileCopyUtils to copy the file into the OutputStream of the response.

This is an example of such a method in your controller class:

    @RequestMapping(value = "/yourURLHere")
    public void handleFileDownload(HttpServletResponse response) {
        File file = myFileService.getFile();
        response.setContentType("application/xls"); // in my example this was an xls file
        response.setContentLength((int) file.length());
        response.setHeader("Content-Disposition", "attachment; filename=MyFile.csv");
        try {
            // FileCopyUtils.copy also closes both streams when it is done
            FileCopyUtils.copy(new FileInputStream(file), response.getOutputStream());
        } catch (IOException e) {
            // handle the failure as appropriate, e.g. log it and return a 500
            throw new IllegalStateException("Failed to write file to response", e);
        }
    }

CSV to XML with PowerShell

My brother needed to convert a bunch of SNMP data into XML for Zabbix. He was trying to do this in Excel but, fortunately, he asked me if there was a quicker way rather than struggling with crazy string-concatenation formulas. I thought this should be trivial in PowerShell, and it seems I was right.

Here’s an example of what the input CSV “might” look like (input.csv):

    type,key,delay
    4,13.1001,600
    5,13,700

And here’s the PowerShell script (convert.ps1):

    param($csv_file)

    function zabbix_template($csv) {
        write "    <item type=""$($csv.type)"" key=""$($csv.key)"" delay=""$($csv.delay)"" value_type=""1""></item>"
    }

    write "<items>"
    Import-Csv $csv_file | ForEach-Object { zabbix_template $_ }
    write "</items>"

This is obviously not a complete Zabbix template, but it’s not far from a working solution. To execute it from a Windows command line you would do the following:

c:\>powershell .\convert.ps1 input.csv > zabbix.xml

The two main sources of magic are the Import-Csv cmdlet and string interpolation (no need for String.Format here). The great thing about Import-Csv is that it creates an object for each row using the header of the CSV file. Thus the ‘delay’ column in the CSV file becomes a ‘delay’ property on an object for each line. This is brilliant because it means I can pass that object to the function that performs the string interpolation and just reference the columns.
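This header-to-property trick isn’t unique to PowerShell. As a hedged comparison (simplified XML, not a real Zabbix template), Ruby’s stdlib CSV does the same mapping when you enable headers:

```ruby
require "csv"

# With headers enabled, each row becomes an object addressable by column
# name, just like the objects Import-Csv emits per line.
data = "type,key,delay\n4,13.1001,600\n5,13,700\n"

items = CSV.parse(data, headers: true).map do |row|
  %(    <item type="#{row['type']}" key="#{row['key']}" delay="#{row['delay']}"/>)
end

puts "<items>", items, "</items>"
```

The interpolation step is identical in spirit: the template function only needs to know the column names, not the file layout.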

I think this is the day I fell in love with PowerShell.

Where’s the dislike button?

So David Lowe, over on his bigpinots blog, makes this observation:

So why do the ‘social networks’ not address this issue. Facebook has a ‘like’ option and a subsequent ‘unlike’ button; but the latter just takes the user back to a neutral position. Google’s +1 is pretty much the same: I can give something a +1 rating or make it unrated; I can’t give anything a -1.

If we are unhappy with a company or site, shouldn’t we be looking to ‘dislike’ and ‘-1’ them? This would also help to clearly differentiate unpopular sites from those that just have low usage; for example, currently, it’s impossible to know if a small number of ‘likes’ is due to unpopularity or because it is a new site or has few visitors.

Now, in my daily procrastinations on the interwebs, I frequent two link-aggregation sites, namely Reddit and Hacker News. Paul Graham is the guy behind Hacker News, and he makes a connection between downvotes and negative behaviour, citing the broken windows theory as an inspiration for this view. On his site, downvoting a comment is difficult enough that you may as well consider the feature non-existent:

It’s pretty clear now that the broken windows theory applies to community sites as well. The theory is that minor forms of bad behavior encourage worse ones: that a neighborhood with lots of graffiti and broken windows becomes one where robberies occur. I was living in New York when Giuliani introduced the reforms that made the broken windows theory famous, and the transformation was miraculous. And I was a Reddit user when the opposite happened there, and the transformation was equally dramatic.

I’m not criticizing Steve and Alexis. What happened to Reddit didn’t happen out of neglect. From the start they had a policy of censoring nothing except spam. Plus Reddit had different goals from Hacker News. Reddit was a startup, not a side project; its goal was to grow as fast as possible. Combine rapid growth and zero censorship, and the result is a free for all.

On Reddit (and Digg), you can downvote and upvote links and comments, and some would argue that this leads to more incendiary discussions on those sites. You will frequently see people complaining about being downvoted for a comment, which you don’t really see on Hacker News.

On yet another site, Stack Overflow, there’s another take on this:

The problem isn’t downvotes, per se, but encouraging responsible downvoting. That’s why on Stack Overflow, we do it this way:

  • Upvotes add 10 reputation to the post author
  • Downvotes remove 2 reputation from the post author, and 1 from your reputation

The trick here is that downvotes are mostly informational. The cost of a downvote to the users’ reputation (or karma in Slashdot/Reddit parlance) is quite low. It would take a whopping 5 downvotes to equal the effect of a single upvote. And, on top of that, downvotes cost you a tiny bit of reputation. The net effect is that you have to feel very strongly about something to downvote it. Downvotes are serious business, and not to be cast lightly. We designed our system around that maxim.

It doesn’t really surprise me that neither Facebook nor Google has a dislike button, especially in the case of Facebook, where they show the number of votes. There’s no way the Facebook ‘like’ button would be anywhere near as popular if it showed the number of dislikes on a user’s website. Right or wrong, Facebook has optimised for being ubiquitous, not for being a fair assessment of a user’s site. I would guess that Google’s +1 feature is going to go the same way.

You have to be mindful of these ideas when you’re designing an open system and make sure you incentivise the user behaviour that you want. Otherwise, you could end up inadvertently designing for behaviour that you don’t want.

Why you should profile

So I’m a bit behind on my blog reading, and I came across this post by Keyvan Nayyeri via The Morning Brew, in which he compares the performance of DateTime.Now versus DateTime.UtcNow.

Now, to be fair, I did not know that DateTime.Now was that much more expensive than DateTime.UtcNow. I mean, hey, how expensive could it be to add an extra hour to the time value, right? Colour me surprised when, according to the post, it’s 117 times slower. Reading Keyvan’s post, I came across this line:

Most of the developers are careless about all the properties and methods provided by built-in types in a language, so they don’t discover all the details about them. In fact, they use something that just solves their problem regardless of the side-effects.

True, I have been ‘carelessly’ using the DateTime.Now property not fully knowing the performance characteristics. So I decided to examine how much cost we’re talking about. I downloaded the sample code and ran it on my system. Here’s some of the output:

Testing DateTime.Now …

Sample Size: 5000 – Time: 308607

Testing DateTime.UtcNow …

Sample Size: 5000 – Time: 6753

That is a big difference, but what do those bold numbers mean? They actually come from QueryPerformanceCounter, a Win32 API call that returns the value of a high-resolution timer. To convert a tick count into a duration, you divide it by the result of QueryPerformanceFrequency, another Win32 call, which gives the timer’s frequency in ticks per second.
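The conversion is simple arithmetic. The counter frequency is machine-specific; the 14,318,180 Hz used below is an assumed value (a common legacy QPC frequency) that happens to reproduce my figures:

```ruby
# Converting QueryPerformanceCounter tick counts to wall-clock time.
# The frequency is an assumption for illustration; on a real machine,
# QueryPerformanceFrequency reports the actual value.
frequency = 14_318_180.0                             # ticks per second

now_ms      = 308_607 / frequency * 1000             # 5000 DateTime.Now calls
utc_now_ms  = 6_753   / frequency * 1000             # 5000 DateTime.UtcNow calls
per_call_us = (now_ms - utc_now_ms) / 5000 * 1000    # difference per call, in microseconds

puts format("%.2f ms vs %.2f ms, %.2f us extra per call", now_ms, utc_now_ms, per_call_us)
# prints: 21.55 ms vs 0.47 ms, 4.22 us extra per call
```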

On my machine, that means the durations of 5000 calls to DateTime.Now and DateTime.UtcNow are 21.55ms and 0.47ms respectively. So on a per-call basis we’re looking at a difference of 4.22μs. For those of you who don’t know, μs represents microseconds. Now let’s go back and look at the original Twitter exchange that was the impetus for the post:

“I’m looking through a method right now where someone had used DateTime.Now 15 times. In one method.”

Now my point is this: in programming, abstraction is not a bug, it’s a feature, whether it’s the idiosyncrasies of the Windows API hidden behind the .NET Framework or all your data stored in AWS. For some abstractions you can use Reflector to look at how they’re implemented; others are trade secrets. What’s for sure is that you will never know the underlying implementations of them all.

Now, I’m not arguing for ignorance. Please do understand your tools and technologies. But when it comes to performance, there are only two rules:

Rule #1: Measure
Rule #2: Do Your Homework

These rules have served me well over the years, because it’s impossible to predict where your bottlenecks will come from. Use profilers to draw your attention to the problem areas of your code, not the minutiae.

Shaving off a few microseconds with DateTime won’t be half as useful as removing that extra web service call you didn’t know you were making.
