Category Archives: Code

Why your PDF should be HTML

Over the last few months the issue of formats has come up a few times, when librarians, educators and marketeers have all wanted to use PDFs to deliver information to the user. I thought now would be an opportune time to state why, in many cases, this is a bad idea (even if done with the best of intentions).

What the experts say

Jakob Nielsen’s Alertbox, July 14, 2003 – PDF: Unfit for Human Consumption
Usability guru Jakob Nielsen is forthright in his appraisal of PDF as a format on web sites.

Joe Clark’s 2005 article about PDF accessibility is included here for the section where Joe clearly explains why most things should be HTML and, even better, gives a thorough list of exceptions. If your information doesn’t fall into one of these categories then you really should be using HTML.

Where we are going wrong

Of the many PDFs that are currently available on our various web sites very few can really justify the format that they are in if we use the criteria laid out in the articles linked to previously. I believe that a combination of overstating the role of a particular visual style and understating the inconvenience to the user leads to a situation where uploading a document suffices. I don’t believe this is the case. If we want to provide the best experience for users then we need to be making that extra (small) effort to put the information in the right format. It’s not that hard, and everyone benefits.

Issues with ISSUU

A beta subject guides page has been created by the proactive librarians that we have at Glamorgan. It uses ISSUU, a service for hosting PDFs that wraps them up in Flash and adds various user interface features like page-turning animations, zooming and various views, along with useful social features like commenting and sharing. I think it’s unfortunate that the useful features have been mingled with user interface fluff that actually makes the information harder to retrieve.

Putting it into practice

To show what’s possible I downloaded a PDF of Lighting Design and Technology & Live Event Technology from a Subject guides page, and spent an hour or two copying and pasting to create an HTML version. The PDF is 115k to download. The HTML is 41.5k. In addition to the smaller file size, the user does not need to wait for the PDF reader to open up, can navigate via a table of contents and, most usefully, can click on the many URLs to go straight to the info. The HTML format enables the user to interact directly rather than read, then copy the links.

It may not have the visual impact of the ISSUU PDF version, but it is more functional, and it sits in the browser window that people are used to. Also, none of this precludes making the PDF available for those people that wish to download it.


Hope that people find this a useful position statement, and would love to see some response in the comments.

MySQL Replication

The situation

We’ve got a web application that uses a MySQL database as its backend. Some of the data held in that application needed to be used in another web application, but only read, never written to. So we had to come up with a method of using that data.


1. Use the Rails ActiveResource class

Using this method we could read restful data from the remote system directly, and use the ruby objects we pull into our site to display the necessary data.
This is a relatively easy method of achieving our goal, requiring minimal amounts of coding on the remote application and we keep one source of data.

One downside to this method is that if the remote system goes away (network failure, MySQL crash and so on), our new web app falls over. Also, the remote system will always have to do most of the grunt work: running the MySQL queries, creating the XML and so on. If it’s a busy remote system this may have a negative impact on how both systems run.

2. Use the Rails ActiveRecord class

The downsides to this method are the same as for the ActiveResource method, with a couple of additional problems. We’d have direct access to all the database tables, and some of those tables contain sensitive data, so that would require some additional work. Also, a Rails application is normally configured with a single database, so if we wanted to add any tables to our new web application, we’d have to add them to the database that powers the remote application as well. That could prove very difficult to manage.

3. XML

This is really a less efficient ActiveResource-type solution with the same pros and cons.

4. Master/Slave MySQL Replication

The only real downside for this method is that we’d never done it before. After a bit of testing we quickly found out that we could eliminate all the problems of the other methods.

We can synchronise only the tables we want, so we wouldn’t run into problems with sensitive data.
If the remote system goes away then our system will remain unaffected and the data changes will “catch up” once the remote comes back online.
There is almost no additional query load on the remote system, as the new system queries its own copy of the data.
If we need to add any tables to our new application then that will have no effect on the remote application at all.

MySQL Replication

Replication follows a Master/Slave methodology. In our case the master is the application that is already in place, and the slave will be the new application.

I’ve left out how to create the user for replication (slave_user) and give it the necessary permissions; you can find out how to do that in one of the pages in the article above if you don’t know how to. Also don’t forget to set the bind-address on the master to an address the slave can reach (the master’s own network IP rather than 127.0.0.1), and open up the firewall on the master on port 3306 for the IP of the slave.

First off we need to enable binary logging on the master, set the server-id, the database we want to replicate, (we’ll specify the tables we want in the slave), and some other variables to keep the system from getting out of hand.

sudo vi /etc/mysql/my.cnf

Either add these lines to the [mysqld] section or uncomment and edit them if they are already there.
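As a rough guide, the master settings described above might look something like the sketch below. It assumes the stock Ubuntu log location and uses illustrative values (server-id 1, a ten day log expiry, a 100M log size limit), so adjust them to suit your own setup. It writes to a scratch file rather than touching my.cnf, so you can review the lines before copying them across.

```shell
# Sketch only: write candidate master settings to a scratch file for review.
# The values are assumptions, not the ones from our servers.
master_cnf=$(mktemp)
cat > "$master_cnf" <<'EOF'
[mysqld]
# Unique id for this server within the replication setup
server-id        = 1
# Turn on binary logging (stock Ubuntu log location assumed)
log_bin          = /var/log/mysql/mysql-bin.log
# Only log changes for the database we want to replicate
binlog_do_db     = database_name
# Housekeeping so the binary logs don't get out of hand
expire_logs_days = 10
max_binlog_size  = 100M
EOF
cat "$master_cnf"
```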


restart the mysql server

sudo /etc/init.d/mysql restart

Now we need to tell the slave mysql server that it is the slave, and also which tables to replicate. You can also set some other management options here.

sudo vi /etc/mysql/my.cnf

Either add these lines to the [mysqld] section or uncomment and edit them if they are already there.
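A minimal sketch of the slave side, again with assumed values: the server-id only has to differ from the master’s, and database_name.table_name is a placeholder for each table you want copied. As before it writes to a scratch file for review rather than editing my.cnf directly.

```shell
# Sketch only: write candidate slave settings to a scratch file for review.
slave_cnf=$(mktemp)
cat > "$slave_cnf" <<'EOF'
[mysqld]
# Must be different from the master's server-id
server-id          = 2
# Replicate just this table; repeat the line for each table you need
replicate-do-table = database_name.table_name
EOF
cat "$slave_cnf"
```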


restart the mysql server

sudo /etc/init.d/mysql restart

As we already have data in our master system we’re using the mysqldump method to get the current data(via Sequel Pro). In order for the data in the master to be the same as that in the slave initially, we need to stop any commits on tables, or LOCK the tables of the master database, prior to taking the data dump for the slave.

mysql -uusername -ppassword
mysql>use database_name;
mysql>FLUSH TABLES WITH READ LOCK;

This will block all commits until you close that mysql session. Export the SQL dump from the master now and import it into the slave. I did this with Sequel Pro and the import/export tools available in that.

We now need to grab some details from the master that we’ll need later on when we tell the slave to start replicating.

mysql>SHOW MASTER STATUS;

| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
| mysql-bin.000004 | 20060 | database_name | |

Now back on the slave we need to use these variables and some others to setup the replication.

mysql -uusername -ppassword
mysql>STOP SLAVE; (check if the slave is already running or not)
mysql>CHANGE MASTER TO MASTER_HOST='IP of master', MASTER_USER='slave_user', MASTER_PASSWORD='slave_users password', MASTER_LOG_FILE='mysql-bin.000004', MASTER_LOG_POS=20060;
mysql>START SLAVE;

Go back to the master and close the mysql session you started earlier. This will release the lock on the tables and allow updates and commits on the master again.


That’s it, any changes in database_name.table_name on the master will be replicated over to the slave. If the master goes away for any reason, the slave system will still function and will ‘catch up’ when the master comes back online.

What we’ve now got working is a central data storage point that pushes any changes in its own data out to a remote system as and when the data changes. It’s pretty easy to add more slaves if you need to for scalability.

If you had a lot of people/systems all working on the same dataset, and you wanted to make certain that they were all using the same data, then the remote systems that only read data could be the slaves, and any systems that need to write data could use the master. You can also set up master to master replication, so that if some remote systems needed to write data as well, then they could.

My little bundler of Joy

Bit of Background

One of the new bits of technology coming into Rails 3.0 is the all-singing, all-dancing Gem Bundler. If you’ve ever found yourself in config.gem hell, or have pushed a perfectly fine project from development to production and it’s all gone kaput due to gem problems, then this is the medicine that you need. Here’s the blurb:

Bundler is a tool that manages gem dependencies for your ruby application. It takes a gem manifest file and is able to fetch, download, and install the gems and all child dependencies specified in this manifest. It can manage any update to the gem manifest file and update the bundled gems accordingly. It also lets you run any ruby code in context of the bundled gem environment.

We’re not running Rails 3.0 yet, but that’s no reason not to start using bundler.

Let’s install it, I’ve settled on version 0.7.1 as I was having odd problems with any of the newer versions.

sudo gem install bundler -v 0.7.1

Next step is to create a file called Gemfile in your application’s root directory, and then fill it with the gems you need, and the environments you need them in. Here’s the one I’m using.

Sample Gemfile

Now add these lines to your .gitignore file, this will make deploys easier and faster


Now it’s time to start a bundling. From inside your application run

gem bundle

That will now do all its magical stuff for you, creating the bin executables and the necessary files and folders in the vendor/bundler_gems folder.

Will it work yet? Not quite 🙁

To get it working with rails 2.3.4 I had to do the following things.

1. Create a file called config/preinitializer.rb with this code in

require "#{File.dirname(__FILE__)}/../vendor/bundler_gems/environment"

2. Add the following line to all environment files, ie, config/environments/*.rb

Bundler.require_env RAILS_ENV

Now when I start my app using script/server, or nginx and passenger it all works as it should. I even removed all my local gems just to check. (stupid idea as not all my rails projects are bundled yet :-))

Deploying the medicine

I’ve unashamedly nicked the deploy stuff from numerous people. I love you all, even though I can’t remember your names.

Capistrano Deploy Script

Watch out for….

It might sound obvious, but some gems can’t be handled in this way. Have a good think about that prior to using bundler; one for us was passenger. I had a hell of a time trying to work out how to use it with bundler, then I got some sane advice to just install passenger as a module of nginx. Problem gone. The unicorn is stalking me at the moment though, so we may have a passenger who wants to get off shortly.

We were using REE in production. In my experience REE and bundler didn’t get on, so I had to remove all of REE, install the ruby that comes down the pipe with aptitude, and then rebuild nginx with the passenger module. After that it all worked OK.

I now deploy from the application like so

./bin/cap staging/production deploy

That way I know what version of capistrano I’m using per app

Don’t delete all your local gems until all your projects are bundled, and even then be careful.

If you have rake tasks that are being run by cronjobs, make sure that the rake you are accessing is present.
RAILS_ENV=production rake some:jobname will not work from a cron job now, as rake is not available like that any more. You’ll need to edit it to point at the rake in your project bin.

RAILS_ENV=production /path/to/project/bin/rake some:jobname

Or add the bin folder in your project path to your $PATH, that might be a better solution.
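The $PATH option could be sketched as a small helper for whatever wrapper script your cron job calls. The function name use_project_bin is made up for this example, and /path/to/project stands for your own application directory.

```shell
# Hypothetical helper: put a project's bin directory at the front of PATH
# so that a bare `rake` resolves to the bundled copy in bin/.
use_project_bin() {
  project_dir="$1"
  PATH="$project_dir/bin:$PATH"
  export PATH
}
```

A cron wrapper would then call use_project_bin /path/to/project before running RAILS_ENV=production rake some:jobname, and command -v rake will tell you which rake is actually being picked up.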

Happy bundling everyone.

Our Current Technology

ubuntu 8.04, rails 2.3.4, ruby 1.8.6, nginx 0.7.64 (with passenger 2.2.8 installed as a module)

People who’ve helped immeasurably

#carlhuda on freenode

What’s long, hard and… green

Don’t be so disgusting, it’s obviously a cucumber.

Or Cucumber with a capital C, perhaps, which was frequently long, hard and red (steady!) until our shiny new testing regime was unveiled!

As part of our commitment to use Cucumber more effectively, here are a few of the things I have learned about its use, and if you learn them too (from me) you will enjoy using Cucumber just as much as this guy enjoys being brutally murdered:

Background, the new GivenScenario

Remember GivenScenario? If you don’t, I don’t blame you. It died in infancy, and in fact was boxed and buried before I even had the chance to use it.

The idea was that you could define some sort of ‘set up’ scenarios that saved you from having the same code repeated all over your feature spec. There were a number of problems with this solution though, as you ended up with scenarios that had no value other than as leg-ups for other scenarios, and it could get pretty confusing if your GivenScenario had a GivenScenario itself, and so many other reasons. It makes you wonder why they ever thought it was a good idea. IDIOTS.

So anyway, you write ugly, repetitive features like this:

Scenario: view preferences page
  Given I am logged into cas as "Peter" "Portal" with an username of "00700001"
  And I am a Business School student
  And I am enrolled on a course
  And there are campuses
  And I am on the preferences page

Scenario: should show user personal details
  Given I am logged into cas as "Peter" "Portal" with an username of "00700001"
  And I am a Business School student
  And I am enrolled on a course
  And there are campuses
  Then I should see "Peter Portal"
  And I should see ""

There is a newer, and better way to do this though. Introducing Background, which you can use in pretty much the exact same way, but it doesn’t count as a scenario itself. Joy.

Background:
  Given I am logged into cas as "Peter" "Portal" with an username of "00700001"
  And I am a Business School student
  And I am enrolled on a course
  And there are campuses
  And I am on the preferences page

Scenario: view preferences page
  Then I should see "Your Profile and Preferences"

Scenario: should show user personal details
  Then I should see "Peter Portal"
  And I should see ""

Lovely (this gets much more lovely than you can see here, as there are a lot of scenarios with the same background).


As I was writing some particularly awkward Cucumber steps involving assigning something to something else and then assigning it to something else (see, pretty confusing), and having been told assigning stuff was much harder than it actually turned out to be, I was really struggling to figure out what was going on.

I ended up putting in a load of print statements to see what my variables were doing.

There is a better way though. I was pointed to this article, which describes how you can use breakpoint in Cucumber steps.

However, you shouldn’t do this, because breakpoint is deprecated, and it’s now called debugger. You can put this in anywhere and when you run the test you get an interactive prompt and you can inspect any variable you wish (and probably even set things if you’re trying something out). The most convenient way to do it, as described in the article, is to put it in a step by itself:

Then /^I debug$/ do
  debugger
end

That way you can simply drop in the step whenever you want to know what’s happening:

Scenario: list users
  Given I am logged into cas as "Peter" "Portal" with an username of "00700001"
  And I am on the glamlife users index page
  And I debug
  Then I should see "Glamlife Users"

When you’re finished with the debugger, just press Ctrl+D, and Cucumber will continue on its merry green way.

You can even put it in a Background and debug all your scenarios at once if you really want to!

Honourable mention goes to save_and_open_page, which will open the page in a browser so you can see visually what’s wrong with it, instead of having to interpret the HTML. See the Technical Pickles blog for more on this.


As usual there’s a Railscast covering much of this stuff, and more (there’s some particularly good stuff in there for people with many similar features with just a couple of variables), and for anyone who is just here to look at the pictures, there’s also a good one with an introduction to Cucumber.

Don’t be an idiot

All this won’t help you though, like it didn’t help me, if you miss something obvious like the fact that if your ‘visit’ line doesn’t come after your other Givens, your data won’t be there no matter how much you debug. 🙁

Liquid Refreshment

This week I have been experimenting with Liquid.

Not an excuse for boozing at work, but a method of allowing content editors to incorporate dynamic content without having access to all the wonderful and dangerous things that Ruby allows.

In short, you can use tags which look like {{ this }} to output variables or {% this %} to do simple computations. If you’re interested (and who wouldn’t be?), you can read more about Liquid here.

The object of this investigation was to allow content editors to include polls in their content that users could vote on. Polls are a feature that we’ve been thinking about adding to Glamlife for a while, and it was decided that giving editors the option to add them into Glamlife’s ‘chunks’ would be the ideal way, to give them as much flexibility as possible.

So my first task was to try and dig up as much information about Liquid as I could find. As is so often the case with these cutting-edge technologies, there isn’t much out there. The Liquid developers’ wiki was useful, but not exactly comprehensive. I was also able to find a few blog posts about it, for example this one about custom tags, and this overview of Liquid’s features.

If a picture is worth a thousand words, how much is a video worth? I found that the most useful resource by far was Ryan Bates’s ‘Railscast’ video tutorial on using Liquid. After watching this and having a little play with it in the Rails console environment, I felt pretty good about Liquid and set about trying to shoehorn the poll form into it.

As many of you will be aware, however, life has a nasty habit of crushing your dreams just when you think they might be coming to fruition. Imagine my despair, then, when I found that the Rails helper methods (which are required for the form) are unavailable in the lib files which you are required to code your custom tags in. As yet, with my relatively limited knowledge of Rails, and despite watching another Railscast video, I have been unable to solve this problem.

An interim solution will be to simply hard-code the form into a view. This is inflexible, and will only show the latest poll created, but it will at least get the feature out into the wild and satisfy the clamour for exciting new features. Keep your eyes peeled on Glamlife (if you have access to it!) to see if we have more success with this!

Cucumber Stories

Cucumber is a replacement for Storyrunner. David Chelimsky has a good writeup of the history and the plans.

UPDATE: There seems to be some work on creating a repository of common scenarios and stories at the aptly named Cucumber Stories.

Why is this interesting?

Well, the theory goes that once the feature requirements are written in the featurename.feature files then the coders write steps that correspond in featurename_steps.rb files. This is where code that actually does the things being described lives. By defining features in this way the coders are clear what exactly is required and the time consuming back and forth working out what people mean is reduced. By writing down what is meant it gives people a solid starting point. If the stories need to change then they can, but this aspect of the project is a clear way to get clarity from the clients on what they want.

That’s the theory.

To get started you can see that Craig has some instructions and background on how to get it up and running.

And once you’ve done those things you’ll be itching to get cracking and write some features and scenarios. Craig also helps us out here

What it feels like to write them

I’ve been doing this for a couple of days now and, to be honest, am still struggling a little with how much detail I need to go into in the scenarios. So I decided to describe the behaviour in broad terms. If that is acceptable to the coders in the team, that is fine; if they think more detail is needed, then they, or any other member of the team, can add it.

Here is an example of what I’m talking about.

Feature: Provide News
In order that news is relevant and timely
As an editor
I want ability to add news

Scenario: Add person to system
Given They are allowed
And they have been trained
When an editor is added
Then they have ability to add, create and edit news

In this example, the business value is to provide news from my area. The role is defined as an editor and the action is the ability to add news.

So, I then go into a scenario, which details how the feature will actually work. In the example we describe the behaviour needed to add a person to the system. Given is a reserved word which sets the context. In this example it assumes that the user only arrives at this step once approved. Note that it doesn’t specify how that is done, just that it should have been done. (We will come to that in another post.)

And is also a reserved word, which allows us to add further behaviours. In the example above I am requiring that they have been trained. Again, how we verify this is for elsewhere; all we need here is that it has taken place. When those steps have taken place we define that the next step is to add the editor, and to verify that this has taken place we use the reserved word Then to check that the previous step has been completed.

The next stage

This is the first part of the process. The really clever bit is that steps are written in corresponding step files to make the behaviours we defined here a reality. Steps are beyond the scope of this article, but there are some good examples and explanations.

I’ll let you know how we get on with this method.

Git for Designers

Git is a version control system that is getting a lot of good reviews lately, and we’ve started to use it. I thought I’d relate a methodology that we’ve arrived at in very simple terms, that even a designer can understand.

We all have GitHub repositories, and we are using local branches to work on our respective development tasks. So, the dilemma is how to share work in our branches and repositories with each other without having to commit to the master repository.

An answer comes in the form of remote branches and pull requests.

The theory goes like this…

  • Make a copy of the latest version of the code in one’s own repository, get a local copy of that code, and develop to your heart’s content – including creating branches for different aspects of the work.
  • Periodically check the original to make sure one’s repos and local code is up to date.
  • Commit changes to local and own repositories – including any branches that you think you might want to share.
  • Get those changes into the master repository.

Ok, so let’s make it happen.

First thing to do is make a fork of the original repository, by pressing the fork option on the github home page.

This creates a repository that you can then get onto your machine with the git clone command.

git clone

cd into the folder that has been created.

Have a little check of all your branches with

git branch -a

So, the next thing we need to do is track the remote branch so that we get any changes from the original repos that we forked from.

git remote add branchname

branchname is the name you give to the remote, i.e. your nickname for the original repos. Nothing new shows up when you run git branch -a yet, but some lines have been added to your config – open up .git/config and you will see that branchname points to the original repos.

[remote "branchname"]
url =
fetch = +refs/heads/*:refs/remotes/branchname/*

So now, I’d like to get something into this branch to check it’s tracking correctly. I can do that by running

git fetch branchname

Which gets some files from the repos. So, this time when you run

git branch -a

Your remote branch called branchname appears in your list of branches.

So now you are tracking this branch, but according to the Git manual you cannot check out a remote tracking branch. Instead you need to create a local branch, which you do with the following command.

git checkout --track -b newlocalbranchname branchname/master

Just to be sure, have a look with git branch -a and you should see the new branch with an asterisk, indicating the branch you are currently on.

So, if you remember the theory, we’ve now got a local branch with the changes from the original repository, and a local master branch with changes from our forked repos. This gives us the mechanism to get any changes from the original repos.

We run git pull whilst in our newlocalbranchname branch. This fetches and merges. We can then switch to our master branch with git checkout master, from where we run git merge newlocalbranchname to pull the changes over.

The next bit of the theory was to put our changes up. That is pretty easy – after our minutes of productive work you decide you’re happy with your changes.

git commit -a

And, then git push to get it onto your github repository
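If you’d like to try the round trip without risking a real repository, here is a minimal sketch that fakes it all in a scratch directory. The folders original, fork.git and local are stand-ins for the original repos, your GitHub fork and your own clone, and the demo identity and README contents are made up for the example.

```shell
set -e

# Work in a scratch directory so no real repository is touched.
work=$(mktemp -d)
cd "$work"

# Placeholder identity for the throwaway commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "original" stands in for the repository you forked from.
git init -q original
git -C original symbolic-ref HEAD refs/heads/master
echo "v1" > original/README
git -C original add README
git -C original commit -q -m "first commit"

# "fork.git" stands in for your GitHub fork, "local" for your clone of it.
git clone -q --bare original fork.git
git clone -q fork.git local
cd local

# Add the original as a remote and fetch it.
git remote add branchname "$work/original"
git fetch -q branchname

# Create a local branch that tracks the original's master.
git checkout -q --track -b newlocalbranchname branchname/master

# Simulate somebody else committing to the original.
echo "v2" > "$work/original/README"
git -C "$work/original" commit -q -am "update from original"

# Pull that change into the tracking branch, then merge it into master.
git pull -q
git checkout -q master
git merge -q newlocalbranchname

# Push the result up to the fork.
git push -q origin master
```

Because everything lives under the scratch directory, you can delete it afterwards and nothing else is affected.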

The final piece of the jigsaw is to get your changes into the original repository. You can do this by sending a pull request to the admin for the repository. The button is on the github homepage of the repository.

And there you have it.

Click where?

When placing hyperlinks on a site, there can be a temptation to use ‘click here’ as the link text, assuming that the context of a link is immediately apparent.

The rest of this article contains some help and advice on this issue.

One of the issues that we face as developers of various CMS is what to do when the people writing the content write in a way that is contrary to the WCAG. Point 6.1 of the guidelines explains, in a rather technical way, the problem. For a more informal discussion and some real world examples I’ve linked to the article why ‘click here’ is bad practice.

I’ve selected a few highlights from the page –

“Click here” is device-dependent. There are several ways to follow a link, with or without a mouse. Users probably recognize what you mean, but you are still conveying the message that you think in a device-dependent way.

There’s usually a fairly simple way to do things better. Instead of the text “For information on pneumonia, click here”, you could simply write “pneumonia information”.

Accessibility isn’t something that can be left to developers to worry about.