Biting the Bullet of Technical Debt

Posted by MozCTO

Rand has talked about the technical debt that is impacting our ability to grow and deliver new products. We knew we’d have to bite that bullet at some point, but sometimes it’s not a clean bite…you’ve got to gnaw away at it until you finally break through.

To that end, we created an 18-month roadmap to pay back that technical debt, and have worked out the stepping stones needed for each team to chip away at that proverbial bullet. It’s going to take a lot of hard work and some of our funding to help get us there, with the ultimate goals of giving you, our customers, greater value, enabling further growth, and getting to 99.9% uptime. We’ll update you as we take each step along the way. But for now, take a look at the roadmap as we see it.  

Get to 99.9% Uptime

The first step on the road to success is upgrading system operations. We’re focusing our efforts here on hardening our network infrastructure and increasing system redundancy and monitoring, with the following key goals:

  • Better and redundant equipment: We’re building the network at our own colocation facility in a way that allows us to grow and is less vulnerable to equipment failures. We are also moving off hosted servers, load balancers, and switches in favor of our own equipment. The new equipment is much higher quality, and will be duplicated here in Seattle and at our colocation site in Herndon, Virginia.
  • Rigorous monitoring: I love that we have enthusiastic customers willing to tweet when one of our systems is down, but that is not the normal way to monitor systems! Our system administrators are implementing monitoring not only on our servers, but also on the jobs, queues, and a plethora of other things that keep our service running. Increased monitoring will help us catch problems before the servers go down, and hopefully head off problems like the latest rankings outage before they affect our customers.
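As a rough illustration of what monitoring jobs and queues (and not only servers) can look like, here is a minimal Python sketch. The metric names, the threshold values, and the `check_health` function are all hypothetical, picked for the example; this is not a description of our actual monitoring setup. The idea is simply to raise a warning while a queue is backing up or a job is running late, before anything actually goes down.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Warning and critical limits for one metric (hypothetical values)."""
    warn: float
    crit: float


# Hypothetical metrics and limits, for illustration only.
THRESHOLDS = {
    "queue_depth": Threshold(warn=10_000, crit=50_000),   # pending messages
    "job_age_minutes": Threshold(warn=30, crit=120),      # time since last successful run
    "disk_used_percent": Threshold(warn=80, crit=95),     # server-level metric
}


def check_health(metrics: dict[str, float]) -> dict[str, str]:
    """Classify each known metric as 'ok', 'warn', or 'crit'."""
    status = {}
    for name, value in metrics.items():
        limit = THRESHOLDS.get(name)
        if limit is None:
            continue  # no threshold defined for this metric, so skip it
        if value >= limit.crit:
            status[name] = "crit"
        elif value >= limit.warn:
            status[name] = "warn"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    sample = {"queue_depth": 12_500, "job_age_minutes": 5, "disk_used_percent": 97}
    for metric, state in check_health(sample).items():
        print(f"{metric}: {state}")
```

The "warn" level is the useful part: it flags a growing backlog while the service is still up, which is the difference between catching a problem early and hearing about it on Twitter.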

The Tech Ops Team

  • Mark: Sr. Director
  • David J: Principal Engineer
  • Stephen: Sys Admin
  • Jacob: Sys Admin
  • Nicholas: Tech Writer
  • Fay: Database Architect
  • Dave K: Office Admin
  • New System/Network Engineer
  • New DBA

The Tech Ops Stepping Stones

Deliver Our Largest, Freshest, Most Reliable Index

In parallel with this systems work, we are also working on the reliability and scalability of our applications. The Big Data team’s work includes:

  • More reliable data processing: We’re moving our processing out of the cloud and onto our own hardware.
  • Fix things right: We now have the luxury of time and a little cash in the bank to do things right. We’re not going to cobble together a hack that will get us over the hump today, but will come back to bite us tomorrow.
  • Improve the index: Our goal is to triple our index size and release more frequently, getting back to our May 2012 index size while also increasing freshness. Ultimately, we want to create an index every 7-10 working days.

The Big Data Team

  • Carin: Senior Manager
  • Phil: Principal Engineer
  • Brandon: Principal Engineer
  • Martin: Principal Engineer
  • Doug: Senior Engineer
  • Dan: Engineer
  • Maura: Senior Engineer
  • Sarfraz: TPM
  • Kenny: Web Dev
  • Brad K: Senior Engineer
  • David B: Engineer

The Big Data Stepping Stones

Make Everything Bullet-proof

The Production Engineering Team (PE) is knee-deep in the bowels of the production systems: reviewing code, suggesting where new or more hardware could be used, and making things more maintainable and bullet-proof in general. PE has already implemented code changes to our core systems over the last few weeks to address some of the current sticking points. Some of the things this team is working on:

  • New servers: We’re in the process of standing up over 200 new servers.
  • Reducing complexity: We’re reducing the number of database and queuing system types we run on. We’re picking systems that we can either support ourselves or that come with dependable support, to help us reach our goal of 99.9% uptime. Between data storage/retrieval and queuing, we have seven (that I know of) different types of systems. We aim to get down to one queuing system and two or three different database types.
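To see why fewer system types helps the uptime goal, here is a back-of-the-envelope Python sketch. It assumes, purely for illustration, that the service needs every backing system to be up, that failures are independent, and that each system is available 99.95% of the time; none of those figures are measurements of our systems. For reference, 99.9% uptime leaves roughly 43 minutes of downtime in a 30-day month.

```python
def serial_availability(availabilities):
    """Availability of a service that needs every listed system to be up.

    Assumes failures are independent, which is a simplification.
    """
    total = 1.0
    for availability in availabilities:
        total *= availability
    return total


def downtime_minutes_per_30_days(availability):
    """Expected downtime, in minutes, over a 30-day month."""
    return (1.0 - availability) * 30 * 24 * 60


if __name__ == "__main__":
    per_system = 0.9995  # hypothetical availability of each backing system
    # 7 system types today versus 4 (one queue plus three database types).
    for count in (7, 4):
        combined = serial_availability([per_system] * count)
        minutes = downtime_minutes_per_30_days(combined)
        print(f"{count} systems: {combined:.4%} available, "
              f"about {minutes:.0f} minutes down per 30 days")
    target = downtime_minutes_per_30_days(0.999)
    print(f"99.9% target: about {target:.1f} minutes per 30 days")
```

Under these assumptions, every additional system the service depends on multiplies in another chance to fail, so cutting the count moves the combined figure closer to the target even before any single system gets more reliable.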

For more information on these recent fixes, check out the blog post Where are My Rankings?

The Production Engineering Team

  • Shawn: Senior Manager
  • Thomas: Senior Engineer
  • David W: Engineer
  • Evan: Engineer
  • Ben: Engineer
  • Ethel: SDET
  • Shelly: TPM
  • New Ruby Engineer
  • New Ruby Engineer

The Production Engineering Stepping Stones

Net New Development

The Net New Development Team is working on implementing new product features. Shhhhh!

The Net New Development Team

  • Walt: Sr. Software Manager
  • Chris: TPM
  • Andrew: SDET
  • Myron: Senior Engineer
  • Marty: Engineer
  • Patrick: Engineer
  • Brandon R: Engineer
  • Ben K: Engineer
  • Wes: Principal Engineer
  • John: Senior Engineer
  • AK: Engineer
  • Jason: Engineer
  • Koos: Engineer

Net New Stepping Stones

Top Secret!

Rock the Marketing Website

Inbound Engineering is the team focused on the Marketing website. The team goals are:

  • Create new services: Create the Common Email service, the new Moz Authorization service, and the front end for Q&A.
  • Upgrade billing: Upgrade our billing infrastructure for more reliable payment processing.
  • Upgrade the website: Build additional functionality into the marketing website.

Inbound Engineering Team

  • Casey: Senior Web Manager
  • Dudley: Senior Director
  • Devin: PHP Engineer
  • New PHP Engineer
  • New PHP Engineer
  • New PHP Engineer

Inbound Stepping Stones

Make Tweets Sing

The Followerwonk team is working on advancing the customer experience and digging deeper into Twitter and what makes Tweets sing. We’re going to use split testing against specific goals to measure customer experience, which will help us decide on the designs and features that our customers like best.
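For readers curious about how a split test gets judged, here is a minimal Python sketch of one common approach, a two-proportion z-test on conversion counts. It is an illustration only, not necessarily the method the Followerwonk team uses, and the function name and the sample numbers are made up for the example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a is the control design, conv_b / n_b is the variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up numbers: 1,000 visitors per design, 100 versus 150 reaching the goal.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two designs is unlikely to be chance alone. Deciding on the goal and the sample size before the test starts keeps the comparison honest.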

Followerwonk Team

  • Peter: Followerwonk Founder
  • Galen: Software Engineer
  • Marc: Software Engineer
  • Amy: TPM

Followerwonk Stepping Stones

Test and Document

In lockstep with these teams, our test and doc folks are adding testing and documentation that will improve quality and communication across the company. These teams are still small, but are already having a big impact. We saw an improvement with our last index release, where testing helped it go out with no issues.

Test and Docs Team

  • Lisa: Technical Writer
  • Nicholas: Technical Writer
  • Ethel: SDET
  • Andrew: SDET

Docs Roadmap

Test Roadmap

Sharing Our Success

As we take each step along our technical roadmap we will share our accomplishments, turning these planned stepping stones green over the next 18 months. As we gnaw away at our technical debt, we hope you’ll start seeing benefits from the changes along the way. Stay tuned!
