This is the Message Centre for Pastey

Offline First – Is It Fundamentally Flawed?

Post 1

Pastey

A few years ago I started developing apps for mobile phones using something called PhoneGap. This was great because it allowed me to build them in HTML, JavaScript and CSS, technologies I already knew, without having to learn Java. The problem, though, was how to keep the data in the app up to date. The app in question was to be used inside museums, with their big thick walls that totally block out any signal, so it couldn’t just keep an open connection. The data had to be stored in the app, and be able to be updated whenever there was a signal.

So I started looking into offline storage. This is actually a lot more interesting than it sounds, as there were so many different ways of doing it, badly. However, there were also some methods that when used together did it right. They provided a really nice way to solve the problems of a website (which the app really was) being needed when offline.

And these are the basics of what’s now known as Offline First: being able to use a website even if there is no signal.

So what can we do for Offline First?

A website is made up of several things, and by looking at them individually we can see ways to make them available offline.

Firstly there’s the HTML, and by this I’m referring to the page templates rather than the content. This is what tells your browser what sits where, and unless you have a major redesign, it rarely changes. The content changes, but the layout doesn’t. A good way to store this offline is using a Cache Manifest, a file that tells your browser which files to store long-term. You can create all your layout files, list them in the Cache Manifest, and then the browser will grab them off the server, put them into storage and use them whenever you’re offline. It gets better though: when you are online, the browser compares its stored copy of the Cache Manifest with the latest one on the server (a version number in a comment is the usual way to mark a change), and if they’re the same it doesn’t bother downloading the files again; it keeps using the ones you’ve already got. This reduces data and bandwidth usage and speeds up page loading time.

Next up there’s the furniture. These are the images that are used on the pages, rather than in the content. Things like header banner images, icons, logos, etc. They’re there across the site, and again rarely change. And again, these can be put into the Cache Manifest.

Thirdly, there are the style sheets. The HTML tells your browser how to lay out the content; the style sheets tell it how to make the content look: what fonts to use, what colours things should be, that sort of thing. And again, these rarely change, so you can put them into the Cache Manifest.

The last building block is the JavaScript, the code that tells the content how to behave. How to make menus appear and hide, how to make the page interactive basically. And once again, this itself doesn’t change much so can be put into the Cache Manifest.

By using the Cache Manifest, it’s possible to have your entire website, bar the content, sit offline in the device’s long-term storage and load very quickly without having to be online and go to the server for the files.
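As a rough illustration of the four building blocks above sitting in one manifest, here is a minimal sketch in JavaScript that builds the text of a manifest file. The `CACHE MANIFEST` first line and the `CACHE:` and `NETWORK:` section headers come from the manifest format itself; the function name `buildManifest`, the file paths and the version string are made up for the example. Browsers re-download the listed files only when the manifest file itself changes, which is why a version comment is the usual trick. Application Cache has since been deprecated in favour of service workers, so read this as a picture of the technique rather than a recommendation.

```javascript
// Hypothetical helper: builds the text of a cache manifest file.
// The file names and the version string are illustrative assumptions.
function buildManifest(version, files) {
  const lines = [
    'CACHE MANIFEST',        // required first line of the format
    '# version ' + version,  // only a comment, but changing it changes the file
    '',
    'CACHE:',                // files to keep in long-term storage
    ...files,
    '',
    'NETWORK:',              // anything not listed above needs a connection
    '*',
  ];
  return lines.join('\n') + '\n';
}

const manifest = buildManifest('v1', [
  '/templates/page.html',  // layout
  '/img/logo.png',         // furniture
  '/css/site.css',         // style sheets
  '/js/site.js',           // behaviour
]);
console.log(manifest);
```

Bumping the version string is enough to make the browser fetch everything again, even if the list of files is unchanged.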

Then we come to the trickier part, the content. This does change, so you can’t just stick it in the Cache Manifest and leave it there. Thankfully, the way that websites have come to be developed over the years has given us the answer; we just have to change how it’s done.

Most websites store the content in a database. When you ask for a page, the server gets the framework together, then queries the database to get the content to put into it. What we can do to make this work offline is use SQLite and store the content in a database on the user’s device. Now instead of requesting it from the server, we can use JavaScript to request it from local storage. This is again quicker than accessing the server, and uses less data allowance or bandwidth.
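A minimal sketch of that idea, assuming a store with the same `getItem`/`setItem` shape as the browser's `localStorage`. The in-memory stand-in `createMemoryStorage` and the helpers `savePage` and `loadPage` are hypothetical names, included only so the example runs outside a browser. An SQLite-backed version would follow the same pattern with queries in place of key lookups.

```javascript
// In-memory stand-in with the same getItem/setItem shape as the browser's
// localStorage, so the sketch also runs outside a browser (an assumption).
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); },
  };
}

// Store one page's content on the device as JSON text.
function savePage(storage, id, page) {
  storage.setItem('page:' + id, JSON.stringify(page));
}

// Read a page back from the device; returns null if it was never stored.
function loadPage(storage, id) {
  const raw = storage.getItem('page:' + id);
  return raw === null ? null : JSON.parse(raw);
}

const storage = createMemoryStorage(); // in a browser: window.localStorage
savePage(storage, 'opening-hours', { title: 'Opening hours', body: 'Open daily.' });
const page = loadPage(storage, 'opening-hours');
console.log(page.title);
```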

In effect, by using both the Cache Manifest and SQLite/Local Storage we’re able to turn the device into its own webserver with the website all stored offline. This is great; this is the ideal of Offline First. It doesn’t matter if you’ve got signal or not, you’ve got the website there on your phone or tablet. The problem is that it’s a snapshot of the website: if the website gets updated, your version will be out of date until you’re next online and able to resynchronise it.
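The resynchronise step could look something like the sketch below, which merges newer server records into the local snapshot. The record shape (`id`, `updatedAt`, `body`) and the function name `resync` are assumptions for illustration, and the sketch ignores deletions and conflicts, which a real sync would have to handle.

```javascript
// Merge server records into the local snapshot.
// Assumed record shape: { id, updatedAt (ms timestamp), body }.
function resync(localById, serverRecords) {
  let updated = 0;
  for (const record of serverRecords) {
    const mine = localById.get(record.id);
    // Take the server copy if we have none, or if ours is older.
    if (!mine || mine.updatedAt < record.updatedAt) {
      localById.set(record.id, record);
      updated += 1;
    }
  }
  return updated; // number of records added or refreshed
}

const local = new Map([
  ['a', { id: 'a', updatedAt: 100, body: 'old text' }],
  ['b', { id: 'b', updatedAt: 200, body: 'unchanged' }],
]);
const changed = resync(local, [
  { id: 'a', updatedAt: 150, body: 'new text' },
  { id: 'b', updatedAt: 200, body: 'unchanged' },
  { id: 'c', updatedAt: 120, body: 'brand new' },
]);
console.log(changed, local.get('a').body);
```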

Generally though, this out-of-sync issue isn’t really a problem. Most websites are pretty much static anyway; it’s only the community-based ones that need to be synchronised and updated more regularly. For instance, Twitter would be kinda dull if you were the only person on it for an hour and then got all the updates at once. It doesn’t really encourage user interaction. But most sites aren’t like that. Most sites get updated once a day, if that. Even those that do require a more constant connection and more frequent updates can take advantage of storing the assets offline, and of using the local version as a backup if there is no connection.
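For the sites that do want fresh data, using the local version as a backup might be sketched like this: try the network first, and only fall back to the stored copy when the request fails. `loadContent`, `fetchFresh` and the `Map` used as the local cache are all hypothetical stand-ins, not any particular library's API.

```javascript
// Try the network first; if that fails, fall back to the stored copy.
// fetchFresh is any function returning a Promise of the latest content
// (for example a wrapper around fetch); it is an assumption of this sketch.
async function loadContent(key, fetchFresh, localCache) {
  try {
    const fresh = await fetchFresh(key);
    localCache.set(key, fresh); // keep the backup up to date
    return { content: fresh, source: 'network' };
  } catch (err) {
    if (localCache.has(key)) {
      return { content: localCache.get(key), source: 'local' };
    }
    throw err; // nothing stored, so there is nothing to show
  }
}

const cache = new Map([['timeline', 'posts from an hour ago']]);
const noSignal = () => Promise.reject(new Error('no signal'));
const demo = loadContent('timeline', noSignal, cache);
demo.then((result) => console.log(result.source, result.content));
```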

So Offline First is great, it gives us websites when we have no signal and makes loading things much faster. The problem is though, I can see two fundamental flaws in it.

The first is the storage itself. Mobile device storage isn’t growing much, certainly not fast enough for the uses we put it to. The operating systems are growing and leaving less free space to store things on. And then the cameras are getting better and better, and the images take up more space, and we’re taking more and more images. Almost every social app now not only has the ability to take and upload photos, but positively encourages you to. Plus we’re using the devices for more and more, like watching TV and videos, which we download to local storage to watch later. All of these things take up more and more of the diminishing amount of storage available. So if we suddenly start storing all the websites we look at offline, we’ll run out of space very quickly. Websites we want to be able to check when we’re in a tunnel or on a fast-moving train will be pushed out because the space was used up by a website we loaded by accident.

The second flaw is one I think is more serious: laziness. Without the constant lack of decent signal reminding users that their coverage sucks, they’ll stop complaining to their providers, and the infrastructure problem that brought about the need for Offline First will be seen to have diminished.

The ability for a website to store itself locally should be a choice that the user makes. There should be a button or a user preference that says “Yes, please use up my storage”, and websites that don’t have that user consent should just not do it. But more importantly, we should not use this as a second-best solution for a lack of infrastructure. We should not become complacent about a lack of progress, or allow the providers to become complacent.
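A consent check along those lines could be as small as the sketch below. Offline storage is off by default and a site is allowed only after the user has said yes. The preference shape, the function names and the site names are all invented for the example.

```javascript
// Offline storage is off by default; a site is allowed only after the
// user says yes. The prefs shape and function names are assumptions.
function createPrefs() {
  return { offlineAllowed: new Set() };
}

function setOfflineConsent(prefs, site, allowed) {
  if (allowed) prefs.offlineAllowed.add(site);
  else prefs.offlineAllowed.delete(site);
}

function mayStoreOffline(prefs, site) {
  return prefs.offlineAllowed.has(site);
}

const prefs = createPrefs();
setOfflineConsent(prefs, 'blog.example', true); // user chose "Yes, please use up my storage"
console.log(mayStoreOffline(prefs, 'blog.example')); // consent was given
console.log(mayStoreOffline(prefs, 'news.example')); // never asked, so no storing
```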

At best, Offline First must be a method for reducing server load and speeding up websites, and only a temporary stop-gap solution for a lack of real-time data access.


Post 2

Icy North

Fascinating topic, thanks Pastey.

I'm trying to get my head around which websites would have content databases small enough to be downloaded into local phone storage. I don't think you'd get the Edited Guide into one; a small subset, maybe.

You also stirred up memories of a past life in which I ran Oracle databases in a non-networked environment. I would run around with floppy disks whenever I needed to upgrade the software.

Now they'd have worked in a library smiley - smiley


Post 3

Pastey

Mostly I think it's blogs that'd work. You subscribe to a blog, it chucks the latest x MB into your device. If I remember rightly, there's a 5MB limit to a single SQLite database.
But that wouldn't stop, say, the BBC News website from setting the latest 5MB of news stories as a synchronised download/update. And it's near pure text we're looking at here, downloaded in JSON format and then put into the SQLite database on the device using JavaScript. A separate cache manifest could tell the browser what images to download and have ready offline for those articles.
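Choosing "the latest 5MB" could be sketched roughly as below: sort the stories newest first and keep adding them until the next one would go over the budget. The story shape, the function name `pickLatestWithinBudget` and the sample data are assumptions. The size measured is that of the JSON text, so whatever the database adds on top is not counted.

```javascript
// Pick the newest stories whose JSON text fits inside a byte budget.
// Assumed story shape: { id, published (ms timestamp), text }.
function pickLatestWithinBudget(stories, budgetBytes) {
  const encoder = new TextEncoder(); // measures size as UTF-8 bytes
  const newestFirst = [...stories].sort((a, b) => b.published - a.published);
  const chosen = [];
  let used = 0;
  for (const story of newestFirst) {
    const size = encoder.encode(JSON.stringify(story)).length;
    if (used + size > budgetBytes) break; // stop at the first story that doesn't fit
    chosen.push(story);
    used += size;
  }
  return { chosen, used };
}

const FIVE_MB = 5 * 1024 * 1024; // the limit as remembered in the post
const stories = [
  { id: 1, published: 1000, text: 'x'.repeat(3 * 1024 * 1024) },
  { id: 2, published: 3000, text: 'y'.repeat(1 * 1024 * 1024) },
  { id: 3, published: 2000, text: 'z'.repeat(2 * 1024 * 1024) },
];
const { chosen, used } = pickLatestWithinBudget(stories, FIVE_MB);
console.log(chosen.map((s) => s.id), used);
```

Stopping at the first story that doesn't fit keeps the download an unbroken run of the most recent stories, rather than squeezing in an older, smaller one out of order.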


Post 4

Nosebagbadger {Ace}

Out of interest, what is the (rough) memory size of the EG (or of h2g2)? Is it possible to even get a rough guess for the former?


Post 5

Pastey

At the moment we don't know; the database is set up to be optimised in a different way. But I can tell you the database is in GBs rather than MBs.


Post 6

Asteroid Lil - Offstage Presence

Thank you for such an informative post, Pastey. You've given me at least 3 new topics to research. smiley - run


Post 7

Pastey

Glad to be of service smiley - winkeye


Post 8

2legs - Hey, babe, take a walk on the wild side...

Mobile phone storage seems (to me anyhow) to have grown hugely over the last few years... Well, from my ancient Nokias, with a few tens of MB, to err, however much my S3 mini has, plus the err, huge SD/MMC card I bought for next to nothing for it off eBay... although I have managed to fill a lot of it with most of my CD collection smiley - laugh smiley - musicalnote I'm sure if I deleted all my music off it, which I rarely play anyhow, I could fit the guide on smiley - snork smiley - silly Mind, of course, that would just be the offline copy for one website, and conceivably one might want more than a single website to access 'offline' smiley - alienfrown You don't have to travel far, even from a city, to suddenly find ropy 3G signals near here. I notice this the most when at William's, and the battery life just vanishes, as it is searching for a HSDC or 3G or whatever it is signal constantly, and losing it just as often, though it usually just about remains good enough for general internet/phone use smiley - erm


Post 9

2legs - Hey, babe, take a walk on the wild side...

Hmmm. smiley - zen
If, say, and this might be a really stupid idea....
All the stuff that doesn't change much (CSS, templates, HTML, JavaScript, site graphics) sits in the site manifest. That's the first four or five bits, everything 'cept the 'content' in the first post here. OK, stick that in the err, cash manifold thing or whatever it's called smiley - winkeye
Say that's done for the dozen or two dozen sites one wants to use a lot on a given mobile device.... that's all on the phone/device, taking up not vast amounts of space...
So, that just leaves the big bit, the 'content' for each site: the 4242 GB of Edited Guide entries (OK, whatever it is),
the 69 GB of some otter site,
etc...
Your phone has a link between the manifold cash thing and a remote 'cloud' thing.
The remote cloud thing updates this 'content' bit from the relevant websites you've told it to do it for (by having the cash manifold thingy for those sites on your phone).
Then, rather than having all the content for all these sites on your device/phone all the time, you could, like, decide 'this week', or 'today', to make the cloud storage thingy put the Edited Guide and the otter site on your phone, as you're heading off on a train journey to some back-of-beyond place that might not have WiFi/3G/4G, and so have access to them?
Hmm. Either that's a really stupid idea because it doesn't work, or my sleep-deprived brain sort of misses what the problem is entirely smiley - laugh smiley - erm smiley - erm smiley - sleepy


Post 10

Pastey

That's the ideal we should work towards, as long as the user chooses which sites get to store that offline data.


Post 11

2legs - Hey, babe, take a walk on the wild side...

Oh definitely... I must admit I'm not a huge fan of some of this cloud stuff, so I've avoided it myself... I like to know where my actual 0s and 1s all are... and at least if they're on my hard drives... Mind, I don't have a huge need for any of it, really. If I did, I'd probably just set up a remote server of my own at home to have stuff on, which I could access from my otter devices. Err, no idea how to do it, but I know it's possible, so as always I assume I could just figure it out and learn how to set it up, securely, as and when I had any need to do so... Admittedly I do use Dropbox, which I guess is a sort of cloud thingy... but that's just to exchange a few files, mainly photos, and nothing I'm overly protective of really smiley - blush

