Upgrading our Google Lighthouse Service

We are always looking for ways to improve our products and processes at RapidSpike. We’ve recently completed an upgrade of our Google Lighthouse service to Google’s latest version, 9.6.7. This required us to almost entirely rearchitect how we run the tests, which in turn allowed us to take advantage of the latest AWS technologies. This blog post explains what the change involved.

What is Lighthouse?

Lighthouse is a tool provided by Google for auditing how websites are performing in a number of key areas. It provides developers with valuable insights into how their web pages are performing and suggests opportunities for things to fix or improve. The goal is to have a positive impact on Web Vitals metrics and the usability of a website.

The audits are split up into five categories: Performance, Accessibility, Best Practices, SEO and Progressive Web App (PWA). Each category is given a final score out of 100, and the scores are built from hundreds of granular audits. Our Lighthouse monitoring service has been a popular choice among our users ever since we launched it, but we knew it was time to bring it up to date with the latest version from Google.
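As a rough illustration of how those category scores can be read, here is a minimal Python sketch. It assumes a Lighthouse JSON report in which each category holds a `score` between 0 and 1 (or null when it could not be scored); the sample report is a hypothetical, heavily trimmed example rather than real output from our service.

```python
import json

# Category ids as they appear in a Lighthouse 9 JSON report.
CATEGORY_IDS = ["performance", "accessibility", "best-practices", "seo", "pwa"]


def category_scores(report: dict) -> dict:
    """Return each category's score on a 0-100 scale (None if not scored)."""
    scores = {}
    for category_id in CATEGORY_IDS:
        category = report.get("categories", {}).get(category_id)
        raw = category.get("score") if category else None
        # Lighthouse stores scores as 0-1; the familiar figure is that value x 100.
        scores[category_id] = None if raw is None else round(raw * 100)
    return scores


# Hypothetical, trimmed report used only to show the shape of the data.
sample_report = json.loads(
    '{"categories": {"performance": {"score": 0.92}, "seo": {"score": 1}}}'
)
print(category_scores(sample_report))
```

Categories missing from the report are returned as `None` rather than 0, so an unscored category is not mistaken for a failing one.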

Why are we rearchitecting?

Earlier this year, we made the decision to rearchitect our Lighthouse monitoring software so that it would run in containers within AWS’s ECS. We had been using Lambda (AWS’s serverless offering) but this became problematic. The latest versions of Lighthouse require a newer version of Chromium that doesn’t run reliably in Lambda’s virtualised environment. The Lambda solution was also expensive to run.

Lambda is designed for short-running, low-resource workloads. It lends itself to simple and concise tasks that don’t require a lot of resources to complete. Examples of this would be sending emails, calculating financial data or processing images.

Unfortunately, this means that Lambda is not as well suited to testing websites inside real browsers (which is how Lighthouse works). This type of task is long-running and requires more resources, both of which are costly within Lambda. The resources and runtime available to the functions are also not guaranteed, and have therefore become incompatible with newer versions of certain dependencies.

Thanks to ECS Fargate’s fully managed compute, ECS containers would allow us to run real browsers and access far more resources – meaning that we don’t have to worry about the same constraints that Lambda imposes on the runtime environment.

How did we do it?

Migrating from Lambda to ECS was a complex process on the face of it. It included numerous rearchitecting steps and development efforts in order to ensure the continuity of the service during – and after – the transition. One of the biggest changes required was how we would replace the part of the Lambda function that could not be upgraded with ECS. The existing function handled everything except test scheduling: test messages were picked up from SQS queues using a function trigger, the function then ran the Lighthouse test, and finally it processed and stored the artefacts and resulting data. The only part of this that actually needed changing was the middle section that ran the Lighthouse test itself.

So we split the workload into three parts: a queue ‘handler’, the container to run the Lighthouse test and a results ‘post-processor’. The queue handler and post-processor would still be able to live in Lambda functions. Taking this path meant that scheduling, queuing and much of the original Lambda function could be reused.

The queue handler now simply accepts the URL of the page to test and the test config meta-data, via an SQS queue. It parses the message and fires off the ECS command accordingly. ECS Fargate handles hosting the container and running the Lighthouse test itself, which uploads output artefacts to S3. The post-processor is triggered by the presence of the artefacts in S3 and then processes them into the form we require. The post-processor also inserts metrics and references to our databases for future consumption and manipulation.
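To make that flow a little more concrete, here is a minimal Python sketch of the two Lambda-side pieces. It is not our production code: the cluster, task definition, container and subnet names, the `url`/`config` message fields and the `TEST_URL`/`TEST_CONFIG` environment variables are all hypothetical placeholders chosen for illustration. The only AWS call used is boto3’s ECS `run_task`, and the event shapes are the standard SQS and S3 Lambda trigger formats.

```python
import json
from urllib.parse import unquote_plus

# Hypothetical placeholders: substitute your own resource names.
CLUSTER = "lighthouse-cluster"
TASK_DEFINITION = "lighthouse-runner"
CONTAINER_NAME = "lighthouse"
SUBNETS = ["subnet-EXAMPLE"]


def build_run_task_request(sqs_record: dict) -> dict:
    """Turn one SQS record into keyword arguments for ECS RunTask."""
    message = json.loads(sqs_record["body"])  # assumed shape: {"url": ..., "config": {...}}
    return {
        "cluster": CLUSTER,
        "taskDefinition": TASK_DEFINITION,
        "launchType": "FARGATE",
        "networkConfiguration": {"awsvpcConfiguration": {"subnets": SUBNETS}},
        "overrides": {
            "containerOverrides": [
                {
                    "name": CONTAINER_NAME,
                    # Pass the page URL and test config to the container.
                    "environment": [
                        {"name": "TEST_URL", "value": message["url"]},
                        {"name": "TEST_CONFIG", "value": json.dumps(message.get("config", {}))},
                    ],
                }
            ]
        },
    }


def queue_handler(event, context):
    """Lambda entry point triggered by SQS: start one Fargate task per message."""
    import boto3  # imported lazily so the helpers can be exercised without AWS access

    ecs = boto3.client("ecs")
    for record in event["Records"]:
        ecs.run_task(**build_run_task_request(record))


def artefact_locations(s3_event: dict) -> list:
    """Post-processor side: list (bucket, key) pairs from an S3 event notification."""
    return [
        # Object keys arrive URL-encoded in S3 event notifications.
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in s3_event["Records"]
    ]
```

Keeping the request-building logic in a pure function means the handler can be unit tested without touching AWS, while the container itself stays unaware of queues and databases.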

This new workflow makes the service much easier to upgrade in the future, whilst being more cost-effective now. The Lambda functions that remain are not resource-intensive, so they can run on much lower specifications. This is where Lambda becomes much cheaper: with low-resource, concise, short-running tasks. We’ve even found that the ECS containers themselves do not need as much resource as we allocated to the previous Lambda function, whilst still maintaining the same throughput and consistency of output findings.

Conclusion

Whilst time-consuming and frustrating to begin with, refactoring and updating the existing Lighthouse testing mechanism became a straightforward decoupling exercise and migration to a new AWS service. Much of the original functionality remained, so we were able to reuse a lot of code and maintain most of the existing mechanism with minimal fuss.

Once we’d got Lighthouse tests running accurately in the new ECS environments, it became much easier to see the benefits of the new architecture in terms of reduced costs, improved scalability, upgradability and output accuracy.

Although there have been some growing pains with this upgrade, it has generally been a very positive experience that has taught us a lot about the challenges of operating in a space that is constantly evolving and improving.

Ultimately, what this means for our customers is that we’re going to be able to provide the latest version of Google Lighthouse going forward, and we are committed to staying as up to date as possible with Google’s releases. This means our users will always have access to Google’s latest recommendations.