Node.js: Instagram feed (B2C) on website (Part Two)

Feb 17, 2020

Here, I’m continuing from the post that previously handled the Facebook & Instagram setup and querying of Instagram’s new API for feeds.

With the previous post giving a nod to using the API for the Instagram feed, I’ve done a quick implementation on the website. Yet it hasn’t been very performant. Each cold page load requires approximately 12 service calls (11 to Instagram and 1 to my server). See below for the Instagram feed load speed: the big green bar in the network tab. I’ve already implemented front-end caching with a store on the SPA, and request caching is built in with the service worker. Of the 11 calls to Instagram, 1 gets the ids and the other 10 run in parallel. What’s left is to cache my Instagram feed so that each user only hits the cache, and the service subsequently checks whether the payload is outdated.

The igFeedFetch takes about 1.6s, but it happens after the page has rendered.

Some considerations:

I considered pulling the latest data for each response, but that would bring me back to square one in terms of speed.
Also, the feed doesn’t update that many times a week. Given that there are sufficient visits, the majority of users would get the updated feed.
I’m also toying with the idea of pulling the feed with realtime database, so it would update whenever the database changes. What’s stopping me is the frequency isn’t high and neither is the impact. #YAGNI
I’m thinking that the most important conclusion here is to decouple the get from cache and the update cache actions. Once I’ve isolated these actions discretely, I could tweak the final behaviour as a combination of such.
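
A minimal sketch of that decoupling, assuming two hypothetical functions, `getFromCache` and `updateCache`, that are passed in rather than hard-wired. Neither name comes from the actual service; the point is only that the two actions can be combined in different orders.

```javascript
// Hypothetical sketch: reading the cache and refreshing it are separate
// functions, so the endpoint behaviour is a choice of how to combine them.

// Serve whatever is cached right away, then start a refresh for later visitors.
async function serveThenRefresh(getFromCache, updateCache) {
  const feed = await getFromCache();
  // The refresh is started but not awaited here; a failure is only logged.
  const refresh = updateCache().catch((err) =>
    console.error("cache refresh failed", err)
  );
  return { feed, refresh };
}

// Alternative combination: refresh first, then serve (slower, always fresh).
async function refreshThenServe(getFromCache, updateCache) {
  await updateCache();
  return getFromCache();
}
```

On Cloud Functions, work still running after the response has been sent is not guaranteed to finish, so the returned `refresh` promise may need to be awaited before the function returns.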

Tasks for this spike:

Set up Firestore, then
Create a collection for stored documents.
Modify the endpoint to return the cache instead of querying.
For the cache, query the Instagram API to check whether it is outdated, then
Update the cache with any new items, and
Invalidate the cache by removing old items that have expired.

1. Set up Cloud Firestore

As I’m using the Firebase ecosystem, Cloud Firestore seems like a logical choice. I log in to the Firebase console and add Cloud Firestore to the project. I have the choice of setting it up with Production or Test database rules. Given that it will be accessed through the Firebase Admin SDK, it shouldn’t be affected by restricted production access; however, I might need to test it from Node.js locally, so I opt for Test rules.

2. Create a collection for stored documents

Knowing that I’ll need to do some sort of ordering, I scanned the response data from the media API (step 3 of the last post) to see if the ids were in some sort of sequence. Bad news: the Instagram post ids don’t seem to be reliably comparable. If anything, they look to be in descending order!

To handle sorting I’ll add the timestamp field in the Instagram media request. Then I’ll use it to sort the feed when displaying.

The data structure looks like this:
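
A minimal sketch of what such a document could look like and how the timestamp would drive the sorting. The field names are assumed to follow the Instagram Graph API media fields (`id`, `media_url`, `permalink`, `timestamp`), the sample values are made up, and `sortNewestFirst` is an illustrative name.

```javascript
// Hypothetical shape of one cached document. Field names follow the
// Instagram Graph API media fields; the values are invented.
const exampleMedia = {
  id: "17900000000000001",
  media_url: "https://example.com/photo-1.jpg",
  permalink: "https://www.instagram.com/p/EXAMPLE1/",
  timestamp: "2020-02-10T08:30:00+0000",
};

// Newest first, without mutating the input array. Comparing the strings
// directly is only valid on the assumption that every timestamp uses this
// same ISO 8601 format and the same UTC offset.
function sortNewestFirst(mediaList) {
  return [...mediaList].sort((a, b) =>
    a.timestamp < b.timestamp ? 1 : a.timestamp > b.timestamp ? -1 : 0
  );
}
```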

3. Modify endpoint to return the cache, instead of querying

The first modification to the endpoint is to return the cached media directly from Firestore and to stop calling external APIs.

As I’ll be going back to fetching the Instagram API in the next step, I decide to leave the old tests as-is. I let the endpoint use a new function just to get from cache.

One of the gotchas I hit is that the Firestore SDK doesn’t return an array of results; it returns a snapshot, and that snapshot is not iterable. Also, don’t be misled by the forEach method shown in the documentation: map and sort don’t work on the snapshot out of the box. Instead, use docs to access the array of documents, and I can resume the usual functional ways of processing the query results:
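
A minimal sketch of reading through `docs`, assuming `db` is a Firestore instance and that each document has a `timestamp` field. The function name `getFeedFromCache` is illustrative and not taken from the actual service.

```javascript
// `db` is assumed to be a Firestore instance, for example from firebase-admin:
//   const admin = require("firebase-admin");
//   admin.initializeApp();
//   const db = admin.firestore();
async function getFeedFromCache(db) {
  const snapshot = await db.collection("instagram-feed").get();
  // snapshot.docs is a plain array of document snapshots,
  // so map, filter and sort work as usual.
  return snapshot.docs
    .map((doc) => ({ id: doc.id, ...doc.data() }))
    .sort((a, b) =>
      // Newest first; assumes all timestamps share one ISO 8601 format.
      a.timestamp < b.timestamp ? 1 : a.timestamp > b.timestamp ? -1 : 0
    );
}
```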

4. On to caching — querying the Instagram API to check outdated

Now that I’m able to read from the cache, it’s time to plan how data is going to be loaded from Instagram. In other words, I’m only going to load what is not already in the cache. Time for a sequence diagram.

Highlighted in blue: Scope for this step. Highlighted in purple: the new flows.

As I was typing the sequence diagram I noticed that I’ll be able to reuse most of the original flows. The major change is the use of Firestore to get the cache and filter.
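
The filter step can be sketched as a pure function. This assumes the check compares the ids Instagram currently returns with the ids already in the cache, and that "expired" means an id is cached but no longer in the latest feed. `diffFeed`, `toFetch` and `toRemove` are illustrative names.

```javascript
// Compare the ids Instagram currently returns with the ids already cached.
function diffFeed(latestIds, cachedIds) {
  const latest = new Set(latestIds);
  const cached = new Set(cachedIds);
  return {
    // In the latest feed but not cached yet: fetch the details and store them.
    toFetch: latestIds.filter((id) => !cached.has(id)),
    // Cached but no longer in the latest feed: candidates for removal.
    toRemove: cachedIds.filter((id) => !latest.has(id)),
  };
}
```

With this split, only the ids in `toFetch` need a detail request to Instagram, instead of all 10 on every load.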

5. Update cache with any new items

With the data flowing in, I need to update the cache with the new items. Collection.add does that with an auto-generated id, but it is much more useful to use the actual Instagram media id as the document id so that I can reference it later:

// Use the Instagram media id as the document id, rather than
// db.collection("instagram-feed").add(media), which auto-generates one.
db.collection("instagram-feed").doc(id).set(media)
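
For several new items at once, the same call can be mapped over the list. This is a sketch that assumes `db` is a Firestore instance and that `newItems` is an array of objects that each carry an `id` field; `addToCache` is an illustrative name.

```javascript
// Store each new media item under its Instagram media id.
function addToCache(db, newItems) {
  const collection = db.collection("instagram-feed");
  // set() overwrites any existing document with the same id,
  // so running this twice with the same items does not create duplicates.
  return Promise.all(
    newItems.map((media) => collection.doc(media.id).set(media))
  );
}
```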

6. Invalidate cache by removing old items that are expired

Finally, I clean up the cache by removing the expired items with DocumentReference.delete:

Promise.all(
  outDatedRefs.map((documentRef) => documentRef.delete())
)
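
One possible way to build `outDatedRefs`, assuming the expired ids are already known and `db` is a Firestore instance. `removeFromCache` and `idsToRemove` are illustrative names.

```javascript
// Delete the cached documents whose ids are in `idsToRemove`.
async function removeFromCache(db, idsToRemove) {
  const expired = new Set(idsToRemove);
  const snapshot = await db.collection("instagram-feed").get();
  // Each document snapshot exposes its DocumentReference as `ref`.
  const outDatedRefs = snapshot.docs
    .filter((doc) => expired.has(doc.id))
    .map((doc) => doc.ref);
  await Promise.all(outDatedRefs.map((documentRef) => documentRef.delete()));
  return outDatedRefs.length; // number of documents removed
}
```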

Woohoo, I’ve deployed the cached service and it’s up and running.

Cloud Functions reports a ~100ms response time when reading the feed from Firestore!

In real life, the delay is about half a second for the round trip, not too bad. 😎