In addition to the End-User Board seat, Netflix is a Gold member of the OpenJS Foundation. Each Platinum member is entitled to appoint one Director to the board, and Gold and Silver members vote to select their representatives. The board also includes community representation, with two Community Director positions elected.
Alex is the Engineering Manager for the Node.js Platform team at Netflix, responsible for curating the Node.js development experience for hundreds of engineers across the company. His team builds on the shoulders of the incredible open source communities that have found a home in the OpenJS Foundation and advocates for the continued support and sustainability of the vibrant communities that have made today’s ecosystem possible.
Using a serverless Node.js platform, you and your team at Netflix curate a complete end-to-end development environment for Netflix engineers who are creating, developing, and deploying Node.js services. How does being on the OpenJS Foundation board help your work at Netflix?
Being a part of the OpenJS Foundation has opened opportunities for collaboration and information sharing. We now have a better understanding of how Node.js is growing within the industry and how Netflix can play a meaningful part in that story. The deeper engagement goes two ways; it allows us to learn from the community and enables us to share unique challenges with our peers.
Node.js v18 will be released April 18. How does Netflix evaluate new releases of Node.js?
Similar to how AWS raised the abstraction bar for hardware management in the early cloud days, we’re raising the abstraction bar for service development.
My team owns and operates a managed platform, “NodeQuark”, for Node.js services at Netflix. The mission of the platform is to provide not only the Node.js runtime, but all the integrations needed to hit the ground running in the Netflix ecosystem in a transparent manner. NodeQuark tracks Node.js LTS releases, and the managed architecture allows us to seamlessly test, validate, canary, and ship updates to both the Node.js runtime and our ecosystem integrations without having to involve our customers. Our platform customers need only bring their business logic.
We do have scenarios requiring more traditional services that are not built on NodeQuark, where service owners are responsible for their entire stack, including evaluating new Node.js releases on their own cadence. In the future we will continue to explore these use cases more thoroughly to support them within NodeQuark.
As part of the OpenJS board, what do you hope to accomplish in 2022?
I’ve been discussing goals for the board with Robin Ginn, our Executive Director. Fundamentally, I want to build better relationships with people engaged with the foundation. Why are you showing up? What keeps you engaged? How is that benefiting you and your company? Better understanding these types of benefits and motivations will help the Foundation better support our members. And that will lead to eventual growth and further investment in the success of the Foundation.
What advice would you offer others wanting to get involved with the Foundation?
Information technology services and application development with offices in India, the United Kingdom, and North America
“NodeXperts is using Node.js as a key component in developing innovative IT solutions around the globe. This is great to see. They have expertise in design, DevOps, and cloud services and recently won awards for work in their field,” said Robin Ginn, executive director, OpenJS Foundation. “We are excited to welcome NodeXperts as an OpenJS Foundation member and look forward to adding their invaluable experience and knowledge to our community.”
NodeXperts has over 100 clients in 10 countries building apps and pipelines that ensure quality products and services. NodeXperts has worked on more than 250 projects to create technical solutions to business and enterprise problems. They focus on improving performance of new applications for their clients.
Ryder’s low-code, screen-scraping solution was effective for a long time, yet as their customers’ expectations evolved, they had an opportunity to upgrade.
To keep up with consumer demand, they implemented Profound Logic’s Node.js development products to create RyderView. Their new web-based solution helped transform usability for their customers and optimize internal business processes for an overall better experience.
Third-party freight carriers across North America rely on Ryder’s Last Mile legacy systems to successfully deliver packages. Constantly adding features to the legacy system made for a monolithic application that was neither intuitive nor scalable.
The Ryder team, led by Barnabus Muthu, IT & Application Architect, wanted to develop an intuitive web application that provided real-time access to critical information. Muthu wanted to balance the need for new development with his legacy programs’ extensive business logic.
Profound Logic’s Node.js development solutions were a great fit and allowed Muthu to expose his IBM i databases via API to push and pull data from external cloud systems in real-time. He was also able to drive efficiency on dev time by using npm packages. Using Node.js, Ryder was able to build a modern, web-based application that no longer relied on green screens, while leveraging its existing RPG business logic.
This new solution was named RyderView and it transformed usability for its customers, translating to faster onboarding and reduced training costs for Ryder.
For third-party users, it led to improved productivity as entire time-consuming processes were made obsolete. Previously, Ryder’s third-party agents used paper-based templates to capture information while in the field. Once Ryder’s new application used microservices to push and pull data from iDB2, end users were upgraded to a mobile application. These advancements benefited Ryder as well, allowing them to eliminate paperwork, printing costs, and the licensing of a document processing software.
The Weather Company uses Node.js to power their weather.com website, a multinational weather information and news website available in 230+ locales and localized in about 60 languages. As an industry leader in audience reach and accuracy, weather.com delivers weather data, forecasts, observations, historical data, news articles, and video.
Because weather.com offers a location-based service that is used throughout the world, its infrastructure must support consistent uptime, speed, and precise data delivery. Scaling the solution to billions of unique locations has created multiple technical challenges and opportunities for the technical team. In this blog post, we cover some of the unique challenges we had to overcome when building weather.com and discuss how we ended up using Node.js to power our internationalized weather application.
Drupal ‘n Angular (DNA): The early days
In 2015, we were a Drupal ‘n Angular (DNA) shop. We unofficially pioneered the industry by marrying Drupal and Angular together to build a modular, content-based website. We used Drupal as our CMS to control content and page configuration, and we used Angular to code front-end modules.
Front-end modules were small blocks of user interfaces that had data and some interactive elements. Content editors would move around the modules to visually create a page and use Drupal to create articles about weather and publish them on the website.
DNA was successful in rapidly expanding the website’s content and giving editors the flexibility to create page content on the fly.
As our usage of DNA grew, we faced many technical issues which ultimately boiled down to three main themes:
Poor performance
Instability
Slower time for developers to fix, enhance, and deploy code (also known as velocity)
Our site suffered from poor performance, with sluggish load times and unreliable availability. This, in turn, directly impacted our ad revenue since a faster page translated into faster ad viewability and more revenue generation.
To address some of our performance concerns, we conducted different front-end experiments.
We analyzed and evaluated modules to determine what we could change. For example, we evaluated getting rid of some modules that were not used all the time, and we rewrote others so they wouldn’t use giant JS libraries.
We evaluated our usage of a tag manager in reference to ad serving performance.
Because of the fragile deployment process of using Drupal with Angular, our site suffered from too much downtime. The deployment process was a matter of taking the name of a git branch and entering it into a UI to get released into different environments. There was no real build process, only version control.
Ultimately, this led to many bad practices that impacted developers, including the lack of a version control methodology, non-reproducible builds, and the like.
Slower developer velocity
The majority of our developers had front-end experience, but very few of them were knowledgeable about the inner workings of Drupal and PHP. As such, features and bug fixes related to PHP were not addressed as quickly due to knowledge gaps.
Large deployments contributed to slower velocity as well as stability issues, where small changes could break the entire site. Since a deployment was the entire codebase (Drupal, Drupal plugins/modules, front-end code, PHP scripts, etc.), small code changes in a release could easily get overlooked and not be properly tested, breaking the deployment.
Overall, while we had a few quick wins with DNA, the constant regressions due to the setup forced us to consider alternative paths for our architecture.
Rethinking our architecture to include Node.js
Stakeholders were happy with the lite experience, commenting on the nearly instantaneous page loads. Analyzing this proof-of-concept was important in determining our next steps in our architectural overhaul.
Differing from DNA, the lite experience:
Rendered pages on the server side only
We used what we learned with the lite experience to help us serve our website more performantly. This started with rethinking our DNA architecture.
Metrics to measure success
Before we worked on a new architecture, we had to show our business that a re-architecture was needed. The first thing we had to determine was what to measure to show success.
We consulted with the Google Ad team to understand how exactly a high-performing webpage impacts business results. Google showed us proof that improving page speed increases ad viewability which translates to revenue.
With that in hand, each day we conducted tests across a set of pages to measure:
We used a variety of tools to collect our metrics: WebPageTest, Lighthouse, sitespeed.io.
As we compiled a list of these metrics, we were able to judge whether certain experiments were beneficial or not. We used our analysis to determine what needed to change in our architecture to make the site more successful.
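A minimal sketch of how an experiment could be judged against a baseline from daily measurements. The sample values, the choice of the median, and the 5% threshold are illustrative assumptions, not real measurements or the actual decision rule.

```javascript
// Hypothetical daily Speed Index samples in milliseconds; lower is better.
const control = [5200, 4900, 5100, 5600, 5000];
const experiment = [4100, 4300, 3900, 4200, 4600];

// The median is less sensitive than the mean to a single slow test run.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Relative change of the experiment against the control.
// A negative number means the experiment was faster.
function relativeChange(controlSamples, experimentSamples) {
  const base = median(controlSamples);
  return (median(experimentSamples) - base) / base;
}

// Assumed rule of thumb: beneficial if the median improves by at least 5%.
function isBeneficial(controlSamples, experimentSamples, threshold = 0.05) {
  return relativeChange(controlSamples, experimentSamples) <= -threshold;
}

console.log(isBeneficial(control, experiment)); // true
```

The same comparison can be repeated per metric and per page, so each experiment gets a consistent verdict.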
While we intended to completely rewrite our DNA website, we acknowledged that we needed to stair-step our approach to experimenting with a newer architecture. Using the above methodology, we created a beta page and A/B tested it to verify its success.
From Shark Tank to a beta of our architecture
Recognizing the performance of our original Node.js proof of concept, we held a “Shark Tank” session where we presented and defended different ideal architectures. We evaluated whole frameworks or combinations of libraries like Angular, React, Redux, Ember, lodash, and more.
From this experiment, we collectively agreed to move from our monolithic architecture to a Node.js backend and newer React frontend. Our timeline for this migration was between nine months and a year.
Ultimately, we decided to use a pattern of small JS libraries and tools, similar to that of a UNIX operating system’s tool chain of commands. This pattern gives us the flexibility to swap out one component from the whole application instead of having to refactor large amounts of code to include a new feature.
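The pattern can be sketched as a pipeline of small, single-purpose functions. Every function name and the data shape below are hypothetical and chosen only for illustration.

```javascript
// Compose small single-purpose functions left to right, like a shell pipeline.
const pipe = (...steps) => (input) =>
  steps.reduce((value, step) => step(value), input);

// Each step does one thing and knows nothing about the others.
const toCelsius = (reading) => ({
  ...reading,
  temp: Math.round(((reading.temp - 32) * 5) / 9),
});
const setUnit = (unit) => (reading) => ({ ...reading, unit });
const addLabel = (reading) => ({
  ...reading,
  label: `${reading.temp}°${reading.unit}`,
});

// Swapping one step changes the behavior without touching the rest.
const formatMetric = pipe(toCelsius, setUnit('C'), addLabel);
const formatImperial = pipe(setUnit('F'), addLabel);

console.log(formatMetric({ city: 'Atlanta', temp: 68 }).label); // 20°C
console.log(formatImperial({ city: 'Atlanta', temp: 68 }).label); // 68°F
```

Replacing `toCelsius` with another converter, or dropping it entirely, leaves every other step unchanged, which is the flexibility the pattern is meant to provide.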
On the backend, we needed to decouple page creation and page serving. We kept Drupal as a CMS and created a way for documents to be published out to more scalable systems which can be read by other services. We followed the pattern of Backends for Frontends (BFF), which allowed us to decouple our page frontends and allow for more autonomy of our backend downstream systems. We use the documents published by the CMS to deliver pages with content (instead of the traditional method of the CMS monolith serving the pages).
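A rough sketch of the Backends for Frontends idea, assuming the CMS publishes page documents to a store that other services can read. The in-memory `Map`, the document shape, and the module names are stand-ins invented for this example.

```javascript
// Stand-in for the store the CMS publishes page documents to.
const publishedDocuments = new Map([
  ['/today', { title: 'Today', modules: ['current-conditions', 'hourly'] }],
]);

// Stand-ins for downstream backend services, keyed by module name.
const moduleData = {
  'current-conditions': async (location) => ({ location, temp: 72 }),
  hourly: async (location) => ({ location, hours: [71, 73, 74] }),
};

// The BFF reads the published document, fetches only what that page's
// modules need, and returns a payload shaped for this one frontend.
async function buildPage(path, location) {
  const doc = publishedDocuments.get(path);
  if (!doc) return { status: 404, body: null };

  const entries = await Promise.all(
    doc.modules.map(async (name) => [name, await moduleData[name](location)])
  );
  return {
    status: 200,
    body: { title: doc.title, data: Object.fromEntries(entries) },
  };
}

buildPage('/today', 'Atlanta').then((page) =>
  console.log(JSON.stringify(page))
);
```

Because the page service only reads published documents, the CMS can be unavailable or redeployed without affecting page serving.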
Over time, we implemented and evolved our usage from our first project. After developing our first few pages, we decided to move away from ExpressJS to Koa to use newer JS standards like async/await. We started with pure React but switched to the React-like Inferno.js.
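To illustrate why async/await matters here, the sketch below runs middleware in the style Koa uses, where each middleware is an async function receiving `ctx` and `next`. The `compose` function is a hand-rolled stand-in so the example runs without dependencies; it is not Koa's implementation, and the middleware themselves are hypothetical.

```javascript
// Minimal Koa-style middleware runner (hand-rolled for illustration).
function compose(middleware) {
  return function run(ctx) {
    const dispatch = (i) => {
      if (i === middleware.length) return Promise.resolve();
      // Each middleware gets ctx and a next() that runs the rest of the chain.
      return Promise.resolve(middleware[i](ctx, () => dispatch(i + 1)));
    };
    return dispatch(0);
  };
}

// Timing middleware: awaits everything downstream, then records elapsed time.
const timing = async (ctx, next) => {
  const start = Date.now();
  await next();
  ctx.headers['x-response-time'] = `${Date.now() - start}ms`;
};

// Handler: awaits a simulated data fetch without nested callbacks.
const forecast = async (ctx) => {
  const data = await Promise.resolve({ temp: 72 }); // stand-in for a real request
  ctx.body = data;
};

const app = compose([timing, forecast]);
const ctx = { headers: {}, body: null };
app(ctx).then(() => console.log(ctx.body, ctx.headers['x-response-time']));
```

Code before `await next()` runs on the way in and code after it runs on the way out, which keeps cross-cutting concerns such as timing in one readable function.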
After evaluating many different build systems (gulp, grunt, browserify, systemjs, etc), we decided to use Webpack to facilitate our build process. We saw Webpack’s growing maturity in a fast-paced ecosystem, as well as the pitfalls of its competitors (or lack thereof).
Webpack solved our core issue of DNA’s JS aggregation and minification. With a centralized build process, we could build JS code using a standardized module system, take advantage of the npm ecosystem, and minify the bundles (all during the build process and not during runtime).
Moving from client-side to server-side rendering of the application improved our speed index and got information to the user faster. React helped us in this aspect of universal rendering: being able to share code on both the frontend and backend was crucial to us for server-side rendering and code reuse.
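The idea of universal rendering can be sketched without any framework: one view function produces the markup, and both the server and the browser call it. Plain template strings stand in for React components here, and all function names and the data shape are assumptions made for the example.

```javascript
// Escape text before placing it in HTML.
function escapeHtml(text) {
  const replacements = {
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;',
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Shared view: the same function can run in Node.js and in the browser.
function renderConditions({ city, temp }) {
  return `<h1>${escapeHtml(city)}</h1><p>${Number(temp)}°</p>`;
}

// Server side: render the markup up front and embed the state so the
// client can pick up where the server left off.
function renderPage(state) {
  const serialized = escapeHtml(JSON.stringify(state));
  return `<div id="app" data-state="${serialized}">${renderConditions(state)}</div>`;
}

// Client side (would run in the browser): reuse the same view on updates.
// const app = document.getElementById('app');
// const state = JSON.parse(app.dataset.state);
// app.innerHTML = renderConditions({ ...state, temp: 75 });

console.log(renderPage({ city: 'Atlanta', temp: 72 }));
```

The user receives readable content in the first response, and the client reuses the same view code instead of maintaining a second implementation.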
Our first launch of our beta page was a Single Page App (SPA). Traditionally, we had to render each page and location as a hit back to the origin server. With the SPA, we were able to reduce our hits back to the origin server and improve the speed of rendering the next view thanks to universal rendering.
The following image shows how much faster the webpage response was after the SPA was introduced.
As our solution included more Node.js, we were able to take advantage of a lot of the tooling associated with the Node.js ecosystem, including ESLint for linting, Jest for testing, and eventually Yarn for package management.
Linting and testing, as well as a more refined CI/CD pipeline, helped reduce bugs in production. This led to a more mature and stable platform as a whole, higher engineering velocity, and increased developer happiness.
Changing deployment strategies
Recognizing our problems with our DNA deployments, we knew we needed a better solution for delivering code to infrastructure. With our DNA setup, we used a managed system to deploy Drupal. For our new solution, we decided to take advantage of newer, container-based deployment and infrastructure methodologies.
By moving to Docker and Kubernetes, we achieved many best practices:
Separating out disparate pages into different services reduces failures
Building stateless services allows for less complexity, ease of testing, and scalability
Builds are repeatable (Docker images ensure the right artifacts are deployed and consistent)
Our Kubernetes deployment allowed us to be truly distributed across four regions and seven clusters, with dozens of services scaled from 3 to 100+ replicas running on 400+ worker nodes, all on IBM Cloud.
Addressing a familiar set of performance issues
After running a successful beta experiment, we continued down the path of migrating pages into our new architecture. Over time, some familiar issues cropped up:
Pages became heavier
Build times were slower
Developer velocity decreased
We had to evolve our architecture to address these issues.
Beta v2: Creating a more performant page
Our second evolution of the architecture was a renaissance (rebirth). We had to go back to the basics and revisit our lite experience and see why it was successful. We analyzed our performance issues and came to the conclusion that the SPA was becoming a performance bottleneck. Although a SPA benefits second page visits, we came to understand that the majority of our users visit the website and leave once they get their information.
We designed and built the solution without a SPA, but kept React hydration in order to keep code reuse across the server and client-side. We paid more attention to the tooling during development by ensuring that code coverage (the percentage of JS client code used vs delivered) was more efficient.
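The coverage figure described above (client JS used versus delivered) reduces to simple arithmetic. The sketch below assumes each script is reported as its full source text plus the character ranges that executed; that entry shape and the sample numbers are assumptions for illustration.

```javascript
// Percentage of delivered JS that actually executed.
// Each entry: { url, text, ranges: [{ start, end }] } (assumed shape).
function usedPercentage(entries) {
  let total = 0;
  let used = 0;
  for (const { text, ranges } of entries) {
    total += text.length;
    for (const { start, end } of ranges) used += end - start;
  }
  return total === 0 ? 0 : (used / total) * 100;
}

// Hypothetical page: a small page bundle and a large, mostly unused vendor bundle.
const sample = [
  { url: '/js/page.js', text: 'x'.repeat(1000), ranges: [{ start: 0, end: 600 }] },
  {
    url: '/js/vendor.js',
    text: 'x'.repeat(3000),
    ranges: [{ start: 0, end: 200 }, { start: 500, end: 700 }],
  },
];

console.log(usedPercentage(sample).toFixed(1)); // 25.0
```

A low percentage points at bundles that ship far more code than the page executes, which makes them candidates for splitting or removal.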
Removing the SPA overall was key to reducing build times as well. Since a page was no longer stitched together from a singular entry point, we split the Webpack builds so that individual pages can have their own set of JS and assets.
We were able to reduce our page weight even more compared to the Beta site. Reducing page weight had an overall impact on page load times. The graph below shows how speed index decreased.
Note: Some data was lost between January and October of 2019.
This architecture is now our foundation for any and all pages on weather.com.
weather.com was not transformed overnight and it took a lot of work to get where we are today. Adding Node.js to our ecosystem required some amount of trial and error.
As we continue to architect, evolve, and expand our solution, we are always looking for ways to improve. Check out weather.com on your desktop, or for our newer/more performant version, check out our mobile web version on your mobile device.