âIf a picture is worth a thousand words, a replay is worth a thousand picturesâ - Zack Rosen, Pantheon CEO
Summary
Challenges:
- Growing Engineering team from 60 to 200 with a limited understanding of how to scale platform and address performance issues
- 10-12 second initial load time on new dashboard with no clear path to address issues
- Determining tech stack tradeoffs whilst adding new code on top of an old codebase
Approach/Solutions:
- Adopting Replay to reproduce difficult bugs and performance issues
- Using Replay to see what was happening in the code when performance drags occurred
- CEO and FE Leadership re-prioritized mid-sprint and set clear direction for its dashboard
Why Replay?
- The only developer tool that actually gave a deep understanding of perf bottlenecks
- Enabling CEO and Engineering Leadership to make its Dashboard a more performant site
- Accelerate onboarding of new engineers
Results
- Within 6 months, improved platform load time 500% (10-12 second to 2 second average)
- Higher confidence from C-Suite in understanding performance issues and setting priorities
- Mean time to resolution on bug fixes reduced significantly (weeks â days)
- Engineers can start contributing to bug fixes within days on the job
Big Mission, Big Challenges
âWe are building software for difficult processes, so our software is inherently complex. And we are doing all this new stuff with our platform on top of old software and legacy systems. We are building the plane while itâs in flight.â - Shane Duff, Front End Lead
Pantheonâs mission is to make the web a first-class citizen that delivers results. Pantheon is building the worldâs best WebOps (Website Operations) Platformâ one that empowers marketing and development teams to take control of their websites, while giving them the agility to win in the dynamic world of digital marketing.
With Pantheonâs platform, teams can build best-in-class WordPress and Drupal sites with agile workflows, scalable infrastructure, and a lightning-fast content delivery network. Pantheon powers over 300,000 sites and is trusted by thousands of marketing and development teams around the world.
Shane Duff, Front End Engineering Lead, joined Pantheon because the company was solving a real problem for him as an engineering leader and a freelance developer. After joining, he became frustrated with performance issues and in trying to reproduce them in order to share internally; especially issues felt by their most important customers.
âThe tools that we were using before were barely better than useless,â said Shane. âYouâd say thanks for the console log screenshot but I donât know how to help you. So youâd spend 2-3 days trying to recreate the issues and finding it in the code. With Replay, all that wasted time has been eliminated. We are a way faster development team and we can actually scale.â
These issues were compounded by the complexity of their application and number of users. The CMS dashboard can have unlimited users working on it at a given time, and Agencies and EDU customers have dozens of stakeholders to manage.
âSince we are building software for difficult processes, our software is inherently complex,â said Shane. âAnd, we are doing all this new stuff with our platform on top of old software and legacy systems. We are building the plane while itâs in flight.â
With all that came slower performance for their core platform, which was a major impediment to growth and delivering on promises to their most important customers.
Debugging Performance at Scale
âIf a picture is worth a thousand words, a replay is worth a thousand picturesâ - Zack Rosen, Pantheon CEO
At the same time, Pantheon was rapidly growing their team from 60 to 200 engineers. As Pantheon was preparing the new dashboard for scale, the team struggled to pin down performance issues and the executive team was frustrated. In particular, Pantheon CEO Zack Rosen was feeling the pain when using their product himself.
Shane gave Replay to Rosen, who recorded about 7-8 replays of the major performance problems. It was a real âa-haâ moment for the CEO, who was able to discover the root cause of the problems that all of their largest customers were also seeing. âIf a picture is worth a thousand words, a replay is worth a thousand picturesâ, said Rosen.
Before Replay, Pantheon was seeing 10-12 second loading time. After using Replayâs collaboration features to diagnose the issues and build shared understanding, they improved loading time by 500%.
âWe found multiple performance issues watching the timeline as things came down the Network Panel,â said Shane. âFor example, static file caching was a bigger issue than we thought, and we had misconfigured our setup. Primarily, it was about seeing the path that the CEO was taking to recreate these severe performance problems and seeing what was happening in the code and that exact moment in time.â
A Clear Path Forward
âIt all started because we had those replays. [We knew] where exactly in the code the problems were showing up and had a path to solving them. Now, from PM, Engineering, Support to our CEO, we use Replay to track down bugs with so much more ease and save so much time diagnosing what went wrong.â - Shane Duff, Front End Lead
Before Replay, Pantheon struggled with identifying main causes for performance issues and how to address them.
Pantheonâs use of Replay started with its CEO and Front End Leadership creating recordings that served as assets so everyone could see the performance bottlenecks. Pantheon was able to reset priorities mid-sprint and chart a path to address the performance bottlenecks and accelerate product development.
âIt all started because we had those replays and we knew where to go,â said Shane. â [We knew] where exactly in the code the problems were showing up and had a path to solving them. Now, from PMs to Engineers, to Customer Support and our CEO, we use Replay to track down bugs with so much more ease and save so much time diagnosing what went wrong.â
By getting up to speed quickly and spending less time debugging, engineers could stay focused on the Pantheon mission.
âNo one likes tracking down bugs,â said Shane. âAnything you can do to make that process simpler so engineers can get time back to spend on actually building is extremely valuable.
Faster Development, Faster Apps
âThe biggest value for us was in identifying and reshaping our roadmap for the quarter mid-sprint. It made our CEO very happy. Without Replay, I donât know how we wouldâve been able to take our new dashboard down from 10-12 seconds to 2 seconds.â - Shane Duff, Front End Lead
With Replay, Pantheon now has a tool to diagnose issues happening in the browser. Pantheonâs leadership team can prioritize work as the team scales against specific problems, and has more confidence in understanding and addressing performance issues. Not only that but it has helped Pantheon prioritize tech stack decisions like server-side rendering and caching strategy.
While Pantheon does not measure actual mean time to resolution on bug fixes and development, the team says the difference is striking.
Pantheon talks in terms of days, even weeks being wasted before Replay. And QA has embraced Replay with open arms when prioritizing and estimating bug fixes.
âBugs without a replay and no clear answer naturally tend to stay open longer,â said Shane. âBugs with a replay are often greeted with âYeah of courseâ by an engineer and these naturally get estimated lower because there is no uncertainty.â
Another major value driver for Pantheon is accelerating Engineering onboarding. Pantheon can introduce its entire stack to new engineers on the first day with Replay so they can better understand the codebase quickly.
âNow engineers are empowered to actually get inside the front end and fix issues within days, rather than weeks or months,â said Shane.
Whatâs Next?
âWeâd love to make it way easier for customers to share Replays. Most of our top clients would be willing to install Replay and use it to share issues. Our Customers have invested a lot in our platform and want to help us make it better.â - Shane Duff, Front End Lead
For Pantheonâs Customer Experience team, everyone gets replays as soon as they ask for them, but building that muscle memory is a work-in-progress.
âWeâd love to make it way easier for customers to share replays,â said Shane âMost of our top clients would be willing to install Replay and use it to share issues.â
Pantheon is also excited to use Replayâs experimental Node recorder to understand their back-end issues better.
âSo hurry up :) â said Shane.