Whenever we talk about caching in an application, the first thought that come to our mind is building an end to end caching mechanism. This can be done by using an in-memory memcached solution or a separately hosted Redis instance. There is nothing wrong with these approaches but if our use case is simple enough then we can get by with something far less complicated.
This is where Materialized Views comes into picture and can help us in our caching needs for a set of use cases. Using this approach we can bypass the complexities involved in understanding a new caching technology and the caching can be achieved with existing database which we are using.
What are materialized view and how do they work?
A materialized view is an in-memory database object that contains the result for a query. It differs from a view as a view is not stored in-memory and is always updated whenever there is an underlying change in the table. Whereas in case of materialized view, the results of the query are stored in-memory and they do not change with changes in the table until a refresh is triggered. So now once we have formed a materialized view, we don’t need to query the original database and we can retrieve the results from in-memory materialized view.
That sounds lot like a typical cache solution such as Redis. The advantage here is that instead of using a new technology such as Redis, this improve in query performance is provided by the database itself. We also can perform more queries on top of the materialized view which might not be so straightforward in a key-value based caching system such as Redis.
Materialized view also has a refresh frequency based upon which it runs the query on the database and updates its result. How frequently we perform this refresh depends upon our use case. Refresh can also happen manually, through triggers or by listening for an event change. For example: refresh the materialized view when we have performed 100 updates on the table.
A typical use case
Let us consider the use case described as part of this blog post. Here the use case is to retrieve most charted sound tracks based upon a timeframe such as past one hour, past one week etc. In such a case if we store the results for queries which compute the top charted tracks in a materialized view then we would avoid the trip to database and will be able to retrieve the results from the materialized view. As in this use case, we want most charted for last one day, we can choose to refresh the view every day which means we will end up querying the database only once instead of querying it for each request.
The refresh mechanism can be more fine-tuned using custom triggers to provide more accuracy. Just note that this fine-grained accuracy comes at a cost. Refreshing the materialized view acquires an exclusive lock on the table. Based upon the size of the table, this operation can end up becoming time-consuming and in turn block your normal database operations. Though with databases such as Postgres, this issue can be overcome by setting the keyword CONCURRENTLY
to true. (Fun fact: I once brought down a production application as I executed a create index command without setting CONCURRENTLY
to true and in turn locked the primary table).
There ain’t no such thing as a free lunch
Now that we have seen how a materialized view might be a replacement for well-known cache solutions, there must be question about what is the catch. As it’s said in Mr.Robot
“It’s good. So good, it scratched that part of my mind. The part that doesn’t allow good to exist without a condition.”
And there is a condition for sure. Note that when you are using a materialized view, you are now tied to your database for your caching needs. Each database solution handles the materialized view slightly differently and if you plan to switch over to a different database solution, you will need to migrate all the materialized view and there can be off-chance that the new database won’t suffice to your current usage of materialized view.
Another compromise we make while creating a materialized view is that we don’t have the most updated data as part of our view. If our use case slowly migrates to a point where we can’t work with eventually consistent results then materialized view might end up becoming an incorrect choice for our use case.
So looking into these arguments, you can keep materialized views in your arsenal and choose to use it if you find your use case fitting the requirements.