Online Information Storage: Completing the Web as Platform
May 12, 2006
I've written several times in the past (most particularly here) about online storage and its importance to the next generation of the Web. The developments in this space over the last year have been fascinating indeed. For true Web 2.0 software (which I'll define here simply as networked applications that explicitly leverage network effects) or plain old Software as a Service (SaaS), key platform services are still not available on the Web as they would be in a traditional computing system. In this particular case I'm referring to the fact that unlike regular software that runs on your PC, most online software today will not use the storage location of your choice. Instead, these applications tend to prefer storing things in a location of the online software provider's choice, usually on their servers. This can be far from desirable.
Now, while there certainly are some online applications that don't require you to store your data on the software provider's servers (Basecamp comes to mind), this brings up a related set of crucial issues that haven't been addressed very well by the SaaS and Web 2.0 community yet. Specifically, these are 1) the ability to choose your own online storage provider in conjunction with an online software provider, 2) support for portable information formats, and 3) the resulting freedom to switch providers, including control of the hosting domain/URL. Here's a quick run-down of what these possibilities would (and should) provide to end users:
Online Software Support for Open Storage Providers: Whether I use Writely or Writeboard, I should be able to choose to store files created by my online software on Omnidrive or Openomy or whatever other storage provider I choose, including myself. This brings up a related but separate issue of standard APIs, which is another discussion I don't cover here. The bottom line: Users must be able to choose the location of a trusted, 3rd party storage vendor or storage devices under their own control. Not having this will continue to affect online software adoption, since data storage has numerous implications for trust, privacy, long-term security, etc.
Adopting Portable Information Formats: Even with the advent of XML and the countless XML standards for vertical and horizontal data formats of almost every description (OpenDocument being just one tiny but important example), online software generally doesn't use open methods today and rarely supports storage of data in widely recognized formats (there are exceptions here too; Writely does a pretty good job writing files out in Word format, probably a major success factor for the service). Online software suppliers that support open storage but not portable formats provide limited value to their customers, since it's difficult to convert proprietary or poorly known formats into something useful, or simply to take the data and use it elsewhere (such as easily being able to use your Salesforce data in NetSuite, for example).
Freedom to Provider Switch: Without the first two, provider switching is difficult, because users must have both access to and control of the data created by their online software, and that data has to be in a format that's relatively useful. Blogs are one good example of this increasingly common issue. Between well-publicized downtime and outright denial of service attacks, customers should be able to switch to a working provider quickly and painlessly, and then keep right on working. Disruption of service to a lot of people dependent on the data and online software is costly, and the lack of control in today's online software can be a major source of frustration. Provider switching has implications for URL provisioning as well. It makes little sense to be able to switch providers without having URL portability. While structured URL standards for online software are a long way off, the ability to control a URL, the domain name of which at least belongs to the provider itself, is a non-trivial problem. This implies that online software in the future will be provisioned off of URLs with domains owned and controlled by the users of the software themselves.
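To make the three points above a little more concrete, here is a minimal sketch of what coding against a storage contract, rather than against one vendor, might look like. Everything in it is an assumption made for illustration: the StorageProvider interface, the two stand-in backends, and the helper functions are hypothetical names, not the API of Omnidrive, Openomy, S3, or any other real service. Documents are written as JSON simply as one example of a widely readable format, and keys are assumed to be flat file names.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, Protocol


class StorageProvider(Protocol):
    """Hypothetical minimal contract an online application would code against."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> Iterable[str]: ...


class InMemoryProvider:
    """Stand-in for a third-party storage service (illustrative only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def keys(self) -> Iterable[str]:
        return sorted(self._objects)


class LocalDirectoryProvider:
    """Stand-in for storage under the user's own control (flat keys assumed)."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        (self._root / key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()

    def keys(self) -> Iterable[str]:
        return sorted(p.name for p in self._root.iterdir() if p.is_file())


def save_document(store: StorageProvider, key: str, title: str, body: str) -> None:
    """Serialize a document to JSON so any other tool can read it back."""
    payload = json.dumps({"title": title, "body": body}).encode("utf-8")
    store.put(key, payload)


def load_document(store: StorageProvider, key: str) -> dict:
    """Read a JSON document back from whichever provider holds it."""
    return json.loads(store.get(key).decode("utf-8"))


def migrate(source: StorageProvider, target: StorageProvider) -> int:
    """Copy every object from one provider to another; return the count."""
    count = 0
    for key in source.keys():
        target.put(key, source.get(key))
        count += 1
    return count


if __name__ == "__main__":
    import tempfile

    hosted = InMemoryProvider()
    save_document(hosted, "notes.json", "Notes", "Hello")
    with tempfile.TemporaryDirectory() as tmp:
        mine = LocalDirectoryProvider(Path(tmp))
        migrate(hosted, mine)
        print(load_document(mine, "notes.json")["title"])
```

The application code (save_document, load_document) never names a vendor, so switching providers reduces to a copy loop like migrate. URL portability is not modeled here; that part of the problem lives in DNS and hosting rather than in the storage interface.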
I beat the drum of maturing open, 3rd party online storage every few months, even though the market isn't quite ready to tackle these issues yet. And though the online storage market is still barely emerging, I do this because failing to address these issues by the time the market is ready will impede the adoption of online software for so-called "serious" uses (enterprise use, critical private uses, etc.).
In any case, the state of online information is rapidly changing with the likes of Google's forthcoming GDrive, Microsoft's rumoured Live Drive, and Amazon's already available online storage service, S3. Never mind the almost infinite selection of smaller providers that already exist today, many of which can even integrate with your local operating system desktop and look exactly like a hard drive attached to your computer.
A Brief Survey of Online Storage Services
It was actually pretty hard to find a good list of online storage services, particularly pure services that don't specialize in a particular type of data. Some services like Ofoto and Flickr prefer a limited set of choices related to a particular media type. What I'm referring to here is general purpose online storage that can be used for storage of data of arbitrary types and in any volume (from 1 byte to multi-gigabyte). While S3 is one of the few big players already shipping, here's a rundown of what's available today.
AllMyData: The free version offers unlimited backup, supports backup and sharing, but actually requires sharing or payment to increase storage size. Could not ascertain whether an API is available.
Box.net: Online storage with 1GB free and more available for purchase. Offers an API, file backup and synchronization, and sub-accounts and RSS feeds for groups.
esnips: Holds everything from bits and pieces of Web content to entire media files and documents. Supports sharing, tagging, 1GB of free storage, and has desktop integration. No API information was available.
IBackup: A variety of options and services are provided by IBackup including IDrive which provides 128-bit encryption, proxy support, caching, collaborative access and much more.
Mozy: Offers automatic backup, strong encryption, and integration with PCs. While free, it is primarily aimed at backup and not as much at generic online storage.
Omnidrive: This service is in beta, but provides online storage with Web and some desktop integration. Notably, Omnidrive makes much of security including compressing and encrypting all data. Omnidrive supports several different popular API formats including SOAP, REST, RSS, and more.
Openomy: A completely free online storage service with 1GB of storage. Openomy offers an API and tag-based organization.
Amazon's S3: Web-scale online storage for developers. End-user interfaces are created by others on top of S3. S3 can store objects by key up to 5GB in size, is relatively secure, supports REST and SOAP. Interestingly, multiple open protocols are supported including HTTP and BitTorrent. Highly innovative. Some extensions for developers and end-users include:
S3Ajax: An Ajax wrapper around S3 for use in RIA applications.
PHP integration with S3
TechnoAg S3: A Windows desktop application for viewing, uploading, modifying, and downloading Amazon S3 objects. Available as source or binary.
Streamload: An online service for storing media and non-media files with up to 25GB of storage capacity for free, and virtually unlimited storage for a price. Streamload does not appear to support an API.
Strongspace: Provides a secure online location to store any type of file. Sharing is supported along with SFTP, and the service supports Basecamp storage. Does not provide encrypted storage and does not appear to support an API beyond SFTP.
Xdrive: Provides 10GB of storage for a small fee, has both Web and desktop integration, and offers file backup services.
Of course, the premise of all this is that more and more of our information will continue to flow online along with the software that we use. Our personal data, work documents, images, audio, video, and everything else is more useful to us and others if we move it online, making it accessible anywhere, taking advantage of professional data center backup, and getting off of the hard drive upgrade treadmill. Particularly as our personal archives of digital content become large and highly valuable as we aggregate them over the years, they become irreplaceable assets that require better care than we might be able to give them without professional help.
As always, please add any online storage sites that I missed in the comments below. I'll add them to the list if they have any general purpose utility.