Category: Tutorials

  • Faster PDFs with wicked_pdf and delayed_job (part 3)

    In part 2 we coded our PDF generator as a background job. But the PDF is still being stored on the local file system. Let’s store it in S3 instead and give our users a URL so they can download it.

    First let’s add the AWS SDK gem to our Gemfile:

    gem "aws-sdk", "~> 1.0" # the AWS::S3 calls below use version 1 of the SDK
    

    Let’s define environment variables for our AWS credentials:

    AWS_ACCESS_KEY_ID=abc
    AWS_SECRET_ACCESS_KEY=secret
    

    Next we’ll modify our background job to connect to S3 and upload our PDF file instead of saving it to the local file system:

    class PdfJob < ActiveJob::Base
      def perform(html)
        pdf = WickedPdf.new.pdf_from_string(html)
        s3 = AWS::S3.new
        bucket = s3.buckets['my-bucket'] # replace with your bucket name
        bucket.objects['output.pdf'].write(pdf)
      end
    end
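
    One thing to watch: every run of this job writes to the same output.pdf key, so a second PDF would overwrite the first. A minimal sketch of one way around that is to build a unique key per PDF. The pdf_key helper, the pdfs/ prefix, and the user_id argument are all made up for illustration:

```ruby
require "securerandom"

# Builds a unique S3 object key for one generated PDF.
# pdf_key, the "pdfs/" prefix and user_id are hypothetical names.
def pdf_key(user_id, now = Time.now.utc)
  timestamp = now.strftime("%Y%m%d%H%M%S")
  # The random suffix keeps two PDFs made in the same second apart.
  "pdfs/#{user_id}/#{timestamp}-#{SecureRandom.hex(8)}.pdf"
end

# Inside the job, something along these lines would replace the fixed key:
#   bucket.objects[pdf_key(42)].write(pdf)
```

    Grouping keys under a per-user prefix is just one option; anything that makes the key unique would do.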
    

    Nice! But how do we enable our users to download the file? S3 has several options for this. One option would be to make the bucket publicly accessible. The downside to this approach is that it would allow anyone to download any PDFs stored in the bucket, regardless of who originally uploaded them. Depending on what kind of data is being included in the PDFs, this could be a bad idea.

    A better option is to generate a temporary URL. This URL can be given to a user so they can download the file, but the URL is only usable for the period of time we specify. This reduces the likelihood that the PDF will be exposed publicly. Here’s how it’s done:

    class PdfJob < ActiveJob::Base
      def perform(html)
        # ...
        obj = bucket.objects['output.pdf'].write(pdf)
        url = obj.url_for(:get, expires: 3.minutes.from_now).to_s
      end
    end
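
    If you’re curious how a URL can expire on its own, here’s a rough sketch of the general idea: the expiry time is baked into the URL and protected by a signature, so nobody can extend it without knowing the secret. This is not S3’s actual signing scheme, and every name in it (SECRET, signed_url, url_valid?) is invented for illustration:

```ruby
require "openssl"

# Hypothetical signing key; a real service keeps this private.
SECRET = "demo-secret"

# Signs the path together with the expiry so neither can be altered.
def signature_for(path, expires, secret)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, "#{path}:#{expires}")
end

# Returns the path with the expiry (epoch seconds) and signature appended.
def signed_url(path, expires_at, secret = SECRET)
  expires = expires_at.to_i
  "#{path}?expires=#{expires}&signature=#{signature_for(path, expires, secret)}"
end

# The server accepts a request only if the signature matches
# and the expiry time has not passed yet.
def url_valid?(path, expires, signature, now = Time.now, secret = SECRET)
  now.to_i <= expires.to_i && signature == signature_for(path, expires.to_i, secret)
end
```

    Changing the expires value in the URL breaks the signature, which is why a temporary URL can’t simply be edited to last longer.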
    

    Looks good. But how do we get this URL back to the user? The background job is asynchronous so it’s not like we can generate the PDF and return the string to the user all in the same HTTP request.

    A simple approach is to write the URL back into the database. Let’s introduce a new user param and update the user with the URL (this assumes the column exists on the users table):

    class PdfJob < ActiveJob::Base
      def perform(html, user)
        # ...
        url = obj.url_for(:get, expires: 3.minutes.from_now).to_s
        user.update_attribute(:pdf_url, url)
      end
    end
    

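    Now that the URL is available in the database, we can display it on the user’s profile page. Keep in mind that the link stops working once it expires, so choose an expiry that gives the user enough time to click it.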

    If we want to get even fancier we can write some JavaScript that’s executed immediately after the user requests a PDF. This script would periodically poll an Ajax endpoint in our app to determine if the URL has been written to the users table yet. When it detects the URL, it would redirect the user to the URL. This would make the PDF generation process seamless from the user’s perspective.

    An example in jQuery might look something like this:

    function poll(btn) {
      $.get("http://www.our-app.com/users/123/pdf_url", function(data) {
        if (data.length > 0) {
          window.location = data;
        } else {
          setTimeout(function() { poll(btn) }, 2000);
        }
      });
    }
    

    Our controller action might look like this:

    class UsersController < ApplicationController
      def pdf_url
        user = User.find(params[:id])
        # to_s turns nil (PDF not ready yet) into an empty string for the poller
        render plain: user.pdf_url.to_s
      end
    end
    

    And there you have it. I hope this gave you a good idea of just how easy it can be to generate PDFs in a background job. If your site isn’t getting much traffic, it’s probably not worth going this route. But if it’s a popular site (or you expect it to be one day) it would be well worth investing the time to background this process. It’ll go a long way towards keeping your HTTP response times short, and your app will feel much snappier as a result.

  • Faster PDFs with wicked_pdf and delayed_job (part 2)

    In part 1 we learned why backgrounding is important. Now let’s dive into some code.

    First things first. Add wicked_pdf and delayed_job to your Gemfile:

    gem "wicked_pdf"
    gem "delayed_job"
    

    Now we can generate a PDF from inside our Rails app with this simple command:

    html = "<strong>Hello world!</strong>"
    pdf = WickedPdf.new.pdf_from_string(html)
    IO.write("output.pdf", pdf)
    

    You’ll notice that the more complex the HTML, the longer it takes wicked_pdf to run. That’s exactly why it’s important to run this process as a background job instead of in a web server process. A complex PDF with embedded images can take several seconds to render. That translates into several seconds of unavailability for the web process handling that particular request.

    Let’s move this code into a background job:

    class PdfJob < ActiveJob::Base
      def perform
        html = "<strong>Hello world!</strong>"
        pdf = WickedPdf.new.pdf_from_string(html)
        IO.write("output.pdf", pdf)
      end
    end
    

    Now we can queue the background job from a Rails controller like this:

    class PdfController < ApplicationController
      def generate_pdf
        PdfJob.perform_later
      end
    end
    

    The only problem is, our job isn’t doing anything particularly interesting yet. The HTML is statically defined and we’re writing out to the same file each time the job runs. Let’s make this more dynamic.

    First, let’s consider the HTML we want to generate. In a Rails app, the controller is generally responsible for rendering HTML from a given ERB template using a specific layout. There are ways to render ERB templates outside controllers, but they tend to be messy and unwieldy. In this situation, it’s perfectly reasonable to render the HTML in the controller and pass it along when we queue a job:

    class PdfController < ApplicationController
      def generate_pdf
        html = render_to_string template: "my_pdf"
        PdfJob.perform_later(html)
      end
    end
    

    This assumes an ERB template named “my_pdf.erb” exists and contains the HTML we want to convert into a PDF. Our method definition within our background job then becomes:

    class PdfJob < ActiveJob::Base
      def perform(html)
        pdf = WickedPdf.new.pdf_from_string(html)
        IO.write("output.pdf", pdf)
      end
    end
    

    delayed_job actually persists the HTML passed to the job in a database table so the job can retrieve the HTML when it gets executed. Since the job is executed asynchronously, the HTML has to be stored somewhere temporarily.
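
    One wiring detail the snippets above gloss over: Active Job only hands work to delayed_job if it’s told to. Here’s a minimal sketch of that setup, assuming the ActiveRecord backend (the delayed_job_active_record gem); adjust it if you use a different backend:

```ruby
# Gemfile (assumes the ActiveRecord backend for delayed_job)
gem "delayed_job_active_record"

# config/application.rb, inside the Application class
config.active_job.queue_adapter = :delayed_job
```

    With that backend you’d also run rails generate delayed_job:active_record and rake db:migrate to create the jobs table, then start a worker with rake jobs:work.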

    So far, so good. The job will generate a PDF based on the HTML rendered in the controller. But how do we return this PDF back to the user when it’s ready? It turns out there are a variety of ways to do this. Saving the PDF to the file system in a publicly accessible folder is always an option. But why consume precious storage space on our own server when we can just upload to Amazon S3 instead for a few fractions of a cent?

    What’s nice about S3 is that a bucket can be configured with a lifecycle rule to automatically delete PDFs roughly a day after they’re created. Furthermore, we can generate a temporary URL to allow a user to download a PDF directly from the S3 bucket. This temporary URL expires after a given period of time, greatly reducing the chance that a third party might access sensitive information.

    Next week I’ll demonstrate how to integrate S3 into our background job using the AWS SDK.

  • Faster PDFs with wicked_pdf and delayed_job (part 1)

    What do you get when you combine the slick PDF generation capabilities of wicked_pdf with the elegance and efficiency of delayed_job? A high performance way to convert HTML pages into beautiful PDF documents.

    I’ve been using wicked_pdf to generate high school transcripts from my SaaS app, Teascript, since 2009. Prior to that I had been using Prawn, which ultimately proved to lack the flexibility I needed to produce beautiful PDFs.

    wicked_pdf converts HTML pages into PDF documents using WebKit, the engine behind Apple’s Safari browser (among others). For the past few years, Teascript produced PDFs without any kind of backgrounding in place. This meant that if someone’s PDF took an unusually long time to generate, they were tying up a web server process for that entire duration.

    If multiple users generated PDFs simultaneously, it might prevent other visitors from accessing the site. Not good. Furthermore, if the PDF generation process exceeded the web server’s default timeout, the user might not ever get the PDF, just an error page.

    Any time your web app integrates with a third party API or a system process, it’s a viable candidate for backgrounding. delayed_job to the rescue. By offloading the long-running processes onto background workers, we free our web server to do what it’s best at: serving static HTML and images.

    Backgrounding isn’t a silver bullet, though. It introduces added complexity into the app, making it more vulnerable to failures. This requires writing additional code to handle these failure scenarios gracefully. But at the cost of this added complexity, we can ensure our web server stays fast and lean while our users still get the pretty PDF they want.

    Next week we’ll dive into some actual code. I’ll demonstrate how to integrate wicked_pdf with delayed_job and hook the entire thing up to your Rails app. Don’t touch that remote.

  • Fix Bluetooth in OS X Yosemite

    I love OS X. It’s an incredibly reliable operating system and it’s usually a joy to operate. Unfortunately, since upgrading from OS X Mavericks to Yosemite I had been plagued with Bluetooth connectivity problems:

    • My Apple keyboard would randomly disconnect from the computer. Once this happened, it became impossible to reconnect it again without restarting. Turning the keyboard off and on again wouldn’t fix it.
    • My Magic Mouse’s tracking motion would randomly become jerky and stuttering. This would last for 2 or 3 minutes and then return to normal. Turning the mouse off and on again wouldn’t fix it.
    • Devices that I hadn’t added would show up in Bluetooth Preferences as being permanently “remembered.” Whenever I “forgot” these devices and closed the Preferences window, they would show up again as soon as I reopened Bluetooth Preferences.
    • My mouse and keyboard also showed up in Preferences and could not be “forgotten.” Same as above: as soon as I removed them and closed Preferences, they would reappear the moment I opened Preferences again.

    These problems were incredibly frustrating. I did a lot of research trying to determine how best to resolve them. None of the solutions I found worked. These included:

    • Replacing the batteries in the Bluetooth device
    • Disabling and re-enabling Bluetooth
    • Clearing the PRAM
    • Resetting the SMC
    • Restarting the computer (this temporarily fixed the problems but they always came back)

    However, I believe I’ve finally fixed these strange connectivity problems for good. A couple of days ago I moved the following files to my Desktop and restarted:

    • /Library/Preferences/com.apple.Bluetooth.plist*
    • ~/Library/Preferences/com.apple.Bluetooth.plist*
    • ~/Library/Preferences/ByHost/com.apple.Bluetooth.*

    It’s important to move (not copy) the files. This forces Yosemite to re-create the files on reboot. (I could have just deleted the files but I wanted to keep them around as backups in case something went wrong.) Since doing this, my Bluetooth devices have been happily connecting and disconnecting appropriately and I have no more stuck devices in my Preferences.
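
    If you’d rather script the move than drag files around, here’s a rough Ruby sketch (Ruby ships with OS X). The move_to_backup helper and the backup folder name are my own inventions, and moving the file under /Library will most likely need to be run with sudo:

```ruby
require "fileutils"

# Moves every file matching the glob patterns into backup_dir, recreating
# each file's original directory layout there so that same-named files
# from different folders don't overwrite each other.
def move_to_backup(patterns, backup_dir)
  moved = []
  patterns.each do |pattern|
    Dir.glob(File.expand_path(pattern)).each do |path|
      target = File.join(backup_dir, path)
      FileUtils.mkdir_p(File.dirname(target))
      FileUtils.mv(path, target) # move, not copy
      moved << target
    end
  end
  moved
end

# Example call (hypothetical backup folder on the Desktop):
#   move_to_backup(
#     ["/Library/Preferences/com.apple.Bluetooth.plist*",
#      "~/Library/Preferences/com.apple.Bluetooth.plist*",
#      "~/Library/Preferences/ByHost/com.apple.Bluetooth.*"],
#     File.expand_path("~/Desktop/bluetooth-backup")
#   )
```

    Restart afterwards, just as with the manual approach.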

  • Slides from my API talk

    Thanks to everyone who turned out for my API talk at the Triangle Ruby Brigade. I wasn’t expecting such a large crowd and the resulting Q&A was really good. It was interesting hearing how other developers are using APIs in their projects, and what problems they are encountering and solving. I’ve posted my slide deck for those who are interested. I also recorded audio from the talk and will be posting a link here when that’s online.

  • Building an external HTTP-based API in Rails

    If you’ll be in or near Raleigh the evening of March 12th, consider dropping by the Triangle Ruby Brigade. I’ll be presenting on how to build HTTP-based APIs in Rails, including:

    • Creating an API controller
    • Wiring up versioned routes for your API
    • Protecting your API with authentication
    • Choosing a transport encoding

    The question of which transport encoding to use is critical. If your API will be consumed by iOS devices, choosing binary property lists over XML or JSON can give you a 30% performance boost as well as an associated reduction in bandwidth consumption. Building an API that generates plists is straightforward with the help of a couple of Ruby gems.

    I’ll be sharing code examples from a recent project that surfaced a large, multi-faceted API to hundreds of iOS devices using binary plists. I’ll also have plenty of resources for those interested in learning more. It’s sure to be a great time! Hope to see you there.

  • How to safely transpose Ruby arrays

    Ruby arrays have this handy method called transpose which takes an existing 2-dimensional array (i.e. a matrix) and flips it on its side:

    >> a = [[1,2], [3,4], [5,6]]
    >> puts a.transpose.inspect
    [[1, 3, 5], [2, 4, 6]]
    

    Each row becomes a column, essentially. This is fine and dandy for polite arrays. If one of the rows in the original array is not as long as the others, though, Ruby chunders thusly:

    >> a = [[1,2], [3,4], [5]]
    >> a.transpose 
    IndexError: element size differs (1 should be 2)
    	from (irb):3:in `transpose'
    	from (irb):3
    

    That ain’t pretty, especially if the intent behind using transpose is to render data in a nice columnar fashion. For example, what if we wanted to render a list of high school courses in columns, one column per semester? Grouping the courses by semester and then transposing would do the trick, but only if there were exactly the same number of courses taken each semester. If even one semester differs, Ruby will blow up. What we really want is for Ruby to just ignore the fact that each grouping may have differently sized arrays and transpose anyway, filling in the empty spaces with nils.

    Here’s how to do just that:

    class Array
      def safe_transpose
        result = []
        # Length of the longest row (0 when the array is empty)
        max_size = map(&:size).max || 0
        max_size.times do |i|
          # Each output row needs one slot per input row
          result[i] = Array.new(size)
          each_with_index { |r, j| result[i][j] = r[i] }
        end
        result
      end
    end
    

    Now we call safe_transpose on our matrix of courses and Ruby does the right thing. It calculates the length of the longest row and uses that as the baseline to perform the transposition. So our original example becomes:

    >> a = [[1,2], [3,4], [5]]
    >> puts a.safe_transpose.inspect
    [[1, 3, 5], [2, 4, nil]]
    

    Nice and neat. Caveats: the code above hasn’t been refactored or tested. Your mileage may vary. If you see a better way to do this, let me know and I’ll post an update.
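
    For what it’s worth, here’s one alternative that avoids reopening Array: pad every row with nils up to the longest row, then let the built-in transpose do the work. The padded_transpose name and the course data are just for illustration:

```ruby
# Pads each row with nils to the length of the longest row,
# then hands the now-rectangular matrix to Array#transpose.
def padded_transpose(rows)
  width = rows.map(&:size).max || 0
  rows.map { |row| row + [nil] * (width - row.size) }.transpose
end

# Two semesters with a different number of courses each.
courses_by_semester = [["Algebra I", "Biology"], ["Geometry"]]
columns = padded_transpose(courses_by_semester)
# columns is [["Algebra I", "Geometry"], ["Biology", nil]]
```

    The trade-off is that it builds a padded copy of every row first, which is unlikely to matter for small tables like a course list.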

  • Sending mail from Rails through Google SMTP

    I just ran into a problem configuring a Rails app to deliver email through Google’s SMTP servers. ActionMailer doesn’t support TLS/SSL, which Google requires. Fortunately, the action_mailer_optional_tls plugin provides this functionality.

    I wanted to keep my SMTP settings in an external YAML file so I wouldn’t end up checking my username and password into the repository. (The YAML file is placed on the production server and a symlink is created during each deploy.) For some reason, I kept getting “connection refused” messages whenever I tried to send email with this configuration:

    # smtp.yml
    production:
      address: smtp.gmail.com
      port: 587
      tls: true
      domain: foo.com
      authentication: :plain
      user_name: someone@foo.com
      password: secret
    
    # production.rb
    smtp = YAML.load_file("#{RAILS_ROOT}/config/smtp.yml") # load_file also closes the file
    ActionMailer::Base.smtp_settings = smtp[Rails.env]
    

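    The problem was the hash returned by YAML. The keys were strings, whereas ActionMailer was expecting the keys to be symbols. As far as I can tell, that meant my settings were ignored and ActionMailer fell back to its default of localhost on port 25, where nothing was listening. The fix was to make the hash use indifferent access: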

    ...
    ActionMailer::Base.smtp_settings = smtp[Rails.env].with_indifferent_access
    

    That cleared up the “connection refused” problem. Now my app is sending email like a champ.
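
    To see the string-versus-symbol mismatch outside of Rails, here’s a small plain-Ruby sketch. The YAML is a trimmed-down stand-in for smtp.yml with placeholder values:

```ruby
require "yaml"

# A trimmed-down stand-in for config/smtp.yml (values are placeholders).
yaml = <<~YML
  production:
    address: smtp.gmail.com
    port: 587
YML

settings = YAML.load(yaml)["production"]

settings["address"] # the value is there under the string key
settings[:address]  # nil, because the keys are strings, not symbols

# Converting the keys by hand gives a hash that answers to symbols.
symbolized = settings.each_with_object({}) do |(key, value), hash|
  hash[key.to_sym] = value
end
```

    In a Rails app, symbolize_keys does the same conversion, so it should work as an alternative to with_indifferent_access here.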

  • How to block ads on Facebook

    The ads they run on Facebook are getting downright annoying. I’m confident I’m not alone in this feeling. Here’s a beginner’s tutorial describing how you can prevent Facebook ads from being displayed. You’ll also pick up a few other nice features in the process.

    1. Download and install Firefox

    If you’re already using Firefox, you can skip this step. If you’re not, consider this your wake-up call. Internet Explorer just doesn’t cut it anymore. Aside from being completely open source, Firefox allows installation of scripts that enhance your browsing experience. You won’t be able to block Facebook ads without using Firefox.

    Visit this page, click on the big green download link, save the installer to your hard drive, and run it. Proceed through the installation…

    2. Install Greasemonkey

    Greasemonkey is an add-on for Firefox that allows for customization of the way a web page is displayed. It relies on JavaScript, but that’s not important to know for what we’re doing (unless you’re a geek).

    First, bookmark this post. You’re going to restart Firefox after this step. You’ll want to get back to this post so you can pick up where you left off.

    Next, visit this page and click the green “Add to Firefox” button. A dialog will pop up. Wait for the countdown to finish, then click the “Install” button. A new dialog will ask if you want to restart Firefox. Yes, you want to, so do it.

    3. Install Facebook Companion

    This is a Greasemonkey script that does three nice things to Facebook:

    • Removes ads
    • Adds an “Ignore All Requests” button (useful if you don’t want to see new app requests)
    • Adds a plus over all thumbnails that, when clicked, pops up a large version of the image

    To install, visit this page and click on the small, gray button on the right side titled “Install this script” (it’s just below the search box). Again, you will be presented with a dialog and an “Install” button. Click it…

    4. Visit Facebook and enjoy

    Now it’s time to visit Facebook and do some ad-free social networking. Enjoy!

    Extra credit: if you’re interested in browsing for Greasemonkey scripts that do other cool things, check out Userscripts.org

  • Setting speed dial numbers on a Sprint RAZR V3m

    I’ve been generally displeased with my Motorola RAZR. Sprint gave it to me over a year ago and aside from terribly poor battery life, it has one of the worst user interfaces I’ve ever seen on a phone. Despite that, it’s very compact and since my Sprint plan doesn’t expire until October of this year I’ve stuck with it.

    Something I couldn’t figure out was how to set the speed dial numbers. Turns out that there isn’t a way to do this through the main “Contacts” list (seems like that would be the best place for it). After Google failed to turn up anything, I began randomly clicking through my settings menu in frustration, attempting to locate the speed dial settings. I finally found them. Finally.

    Go to the main settings pane, then select the “Contacts” button (orange book with a phone icon on the front). There will be an entry on this menu titled “Speed Dial #s” which will let you configure everything you need. Why this wasn’t included on the main contact list I’ll never know, but there you have it.