Building a Facebook App Using Amazon Web Services.
Friday, February 8th, 2008
Since this fall, Synthesis has been working on a Facebook application that lets users create groups and use text messaging to hold threaded conversations privately within the group. We used several different technologies while developing this that made the architecture decidedly different from the norm for web applications.
For starters, simply being a Facebook app means there’s another layer between the server and the browser. Facebook apps appear as part of the Facebook site, providing hooks to lots of features within Facebook itself, but at the cost of needing to work around under-documented features. The application is built using canvas mode, meaning that our server returns snippets of HTML and FBML (Facebook Markup Language) to Facebook’s servers whenever the user makes a request, and Facebook then renders it for the user as if it were a Facebook page. Facebook provides a ton of help if you’re writing your app in PHP, but we went with Grails in order to leverage the scalability of Java. This meant that documentation was sparse in a few key areas; we found that the easiest way to go was often porting code from Facebook’s PHP libraries. There is, however, a great third-party alternative to Facebook’s Java API that’s open source, has javadocs, and surprisingly enough, provides better compatibility with Facebook’s API.
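To make the canvas-mode flow concrete, here is a minimal sketch of a handler that returns an FBML fragment instead of a full HTML page. It is written in Python purely for illustration (our app itself is on Grails), and the function name and the markup it returns are hypothetical. It assumes Facebook passes the viewing user's ID in the `fb_sig_user` request parameter and that the `fb:name` tag is expanded on Facebook's side.

```python
from html import escape
from typing import Mapping


def render_canvas_fragment(params: Mapping[str, str]) -> str:
    """Return an FBML snippet for Facebook to render inside its own page.

    `params` stands in for the request parameters Facebook forwards to
    the app server (assumed here to include `fb_sig_user`).
    """
    uid = params.get("fb_sig_user", "")
    if not uid.isdigit():
        # No usable user ID: return a plain HTML fragment instead.
        return "<p>Please log in to Facebook to use this application.</p>"
    # Note there is no <html> or <body>: Facebook wraps the fragment itself.
    return (
        f'<h2>Welcome, <fb:name uid="{escape(uid)}" useyou="false"/>!</h2>\n'
        "<p>Your group conversations will appear here.</p>"
    )
```

The point is only that the server emits a fragment and leaves the surrounding page, and the expansion of tags like `fb:name`, to Facebook.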
For hosting and dynamic storage, we chose to build it on top of Amazon’s Elastic Compute Cloud (EC2) and Simple Storage Service (S3). The biggest reasons for going with them are that they provide lots of configurability, ease of setup, competitive pricing, and the ability to scale easily if and when the application takes off virally.
EC2 is great because it allowed us to upload a custom disk image configured to have exactly the software configuration we need. So, when the number of users starts increasing, we can turn on more instances and configure them to share the load. The details make this much more complex, of course, since the application itself needs to be built in a way that makes such scalability possible. However, the ease and flexibility of having custom images ready and waiting to be turned on when needed is a huge plus.
However, there are two big downsides to using EC2.
First is that the instances have no permanent storage. Each instance offers 150 GB of local storage, but it’s tied to that particular instance. When the instance is shut down (or crashes), that storage goes away with it. Storing data on S3 helps, but S3 stores whole objects addressed by key, which is pretty different from a typical file system, making it difficult to simply treat it as an infinitely sized NFS mount.
Second is that the IP address assigned to the instance isn’t permanent. This leads to a similar problem: if anything references that IP address (either through DNS or any other third party service) and the instance goes down, those references will be broken. (Update 4/3/08: Cool! Amazon fixed this; it’s now possible to reserve a static block of IPs and dynamically assign them to instances.)
We solved the first of these problems by implementing an aggressive backup strategy using S3. S3 is effectively an infinitely large storage space with all the replication and high availability features that Amazon’s own web site is built on. In the future the “correct” answer may well be to go with Amazon’s SimpleDB service. It’s currently in closed beta though, and the model it’s built around isn’t ideal for real-time updates, since there’s a lag between when a request is made to store data and when the request is actually performed.
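As a rough illustration of what a backup job like this can look like, the sketch below builds timestamped object keys, uploads a dump, and works out which old backups to prune. It is not our actual code: the key layout, the `.sql.gz` suffix, and the `dump` and `put` callables (standing in for a database dump and an S3 upload) are all assumptions made for the example.

```python
from datetime import datetime
from typing import Callable, Iterable, List


def backup_key(prefix: str, when: datetime) -> str:
    """Build a key that sorts chronologically, e.g. 'db/2008/02/08/153000.sql.gz'."""
    return f"{prefix}/{when:%Y/%m/%d/%H%M%S}.sql.gz"


def keys_to_prune(keys: Iterable[str], keep: int) -> List[str]:
    """Return the oldest keys, leaving the `keep` most recent untouched."""
    ordered = sorted(keys)  # chronological, thanks to the key layout
    return ordered[:-keep] if keep > 0 else ordered


def run_backup(
    dump: Callable[[], bytes],          # hypothetical: produces the data to save
    put: Callable[[str, bytes], None],  # hypothetical: uploads one object to S3
    prefix: str,
    now: datetime,
) -> str:
    """Take one backup and return the key it was stored under."""
    key = backup_key(prefix, now)
    put(key, dump())
    return key
```

Running `run_backup` on a short timer and periodically deleting whatever `keys_to_prune` returns keeps the window of possible data loss small without letting old backups pile up.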
The second problem, of impermanent IP addresses, is much more straightforward. Frequently changing IP addresses are common enough to have a standard solution: dynamic DNS.
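A minimal sketch of the dynamic DNS idea: when an instance comes up with a new address, it tells the DNS provider to repoint the hostname. The example assumes a provider that accepts the common `/nic/update?hostname=...&myip=...` style of update request; the provider URL and hostname are placeholders, and `send` is a hypothetical callable that performs the actual HTTP request with whatever credentials the provider needs.

```python
from typing import Callable, Optional
from urllib.parse import urlencode


def build_update_url(base: str, hostname: str, ip: str) -> str:
    """Build the update request URL for the (assumed) provider endpoint."""
    return f"{base}/nic/update?" + urlencode({"hostname": hostname, "myip": ip})


def sync_dns(
    current_ip: str,
    last_ip: Optional[str],
    hostname: str,
    send: Callable[[str], None],  # hypothetical: issues the HTTP GET
    base: str = "https://dyndns.example.com",  # placeholder provider
) -> str:
    """Send an update only when the address changed; return the IP now on record."""
    if current_ip == last_ip:
        return current_ip  # nothing to do, avoid needless updates
    send(build_update_url(base, hostname, current_ip))
    return current_ip
```

Calling `sync_dns` from a startup script means a replacement instance repoints the hostname as soon as it boots. How the instance learns its own public address is left to the caller here.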
This application highlights how web apps are evolving: the underlying infrastructure is becoming both commoditized and far more customizable, while individual sites like Facebook are offering APIs that turn them into proprietary platforms of their own.