By Patrick Calahan and Scott Nyberg
As new developer productivity technologies emerge, small and nimble enterprises with newer codebases swiftly embrace innovation. Conversely, larger organizations, rooted in larger and aging codebases, face obstacles replacing legacy technologies.
Salesforce faced such a challenge with its primary Source Code Management (SCM) system. For nearly two decades, the codebase for Salesforce’s central ‘Core’ application server was stored in the Perforce SCM. In 2021, Salesforce committed to enhancing the efficiency of its internal developers by migrating the Core codebase to a modern SCM powerhouse: Git.
Just two years later, the migration is nearly complete, signifying a major stride toward implementing a highly agile and efficient development environment.
How was the Git migration challenging?
Despite a long-standing desire to bring Git’s advantages to internal developers, the sheer size and complexity of the Core codebase historically posed significant barriers including:
Many files. With more than 30 million files and a footprint approaching 30 gigabytes, the codebase made it challenging to keep common operations like ‘git status’ fast.
Binary files. The Core codebase includes hundreds of thousands of binary files. That introduced complexities because Git handles such files poorly, creating storage and management issues.
A long commit history. Core’s changelist history spans decades. This sparked concerns about how well particular Git operations would scale with millions of changes in the history.
Legacy automation. Core development depended on dozens of internal tools and systems that were tightly coupled to Perforce. Migrating them simultaneously would have been impractical.
Numerous developers. The Core codebase’s collaborative environment involved around 5,000 Salesforce employees. Ensuring a seamless transition for this expansive and diverse development community was key.
How were Git migration challenges overcome?
During the migration process, several of the above challenges had relatively straightforward technical solutions. For example, Git Large File Storage effectively handled binary files, while recent improvements to Git itself, such as the fsmonitor file watcher, significantly improved Git’s ability to manage a large number of files.
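As a rough illustration of those two fixes, the sketch below builds the Git commands that turn on the built-in file system monitor and route binary files to Git LFS. The function name and the file patterns are hypothetical, and the post does not describe Salesforce's actual configuration; `core.fsmonitor`, `core.untrackedCache`, `git lfs install`, and `git lfs track` are standard Git and Git LFS features.

```python
from typing import List


def large_repo_setup_commands(binary_patterns: List[str]) -> List[List[str]]:
    """Return Git commands that enable fsmonitor and send binaries to Git LFS.

    The function only builds argument lists; it does not run anything.
    """
    commands = [
        # Built-in file system monitor, available in recent Git versions
        # on supported platforms; speeds up 'git status' in large trees.
        ["git", "config", "core.fsmonitor", "true"],
        # Cache untracked-file results so status rescans fewer directories.
        ["git", "config", "core.untrackedCache", "true"],
        # Register the LFS filters for this repository only.
        ["git", "lfs", "install", "--local"],
    ]
    for pattern in binary_patterns:
        # Each tracked pattern is recorded in .gitattributes.
        commands.append(["git", "lfs", "track", pattern])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from inside a repository; the patterns (for example `"*.jar"`) are placeholders to be replaced with whatever binary types a codebase actually contains.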
On the other hand, contending with the specific issues posed by the army of developers and the sweeping scale of Core’s supporting infrastructure necessitated a unique approach. Instead of tackling those challenges all at once, Salesforce developed an incremental migration strategy. A small Salesforce team initiated the process by building a simple prototype to prove the basic feasibility of transitioning Core to Git.
The crux of this strategy was to apply a variant of the ‘Strangler Pattern’. In this framework, the Git repository becomes the preferred SCM going forward while Perforce remains in place and is kept in sync with Git. Through a periodic sync process, Git commits were reflected as Perforce changelists, allowing the legacy automation to operate undisturbed until its eventual migration to Git. To further facilitate the transition, a proxy layer was developed to enable much of the legacy tooling to interact with Git and Perforce in an SCM-agnostic manner.
This approach accelerated developer productivity right away, avoiding the delays that a protracted migration of all automation processes would have caused.
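The periodic sync at the heart of this strategy can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `FakeGit`, `FakePerforce`, and `GitToPerforceSync` are hypothetical stand-ins invented for illustration, and the post does not describe how Salesforce's real sync process or proxy layer is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Commit:
    sha: str
    message: str


@dataclass
class FakeGit:
    """In-memory stand-in for the Git repository (hypothetical)."""
    commits: List[Commit] = field(default_factory=list)

    def commits_after(self, sha: Optional[str]) -> List[Commit]:
        # With no sync point yet, every commit is pending.
        if sha is None:
            return list(self.commits)
        shas = [c.sha for c in self.commits]
        return self.commits[shas.index(sha) + 1:]


@dataclass
class FakePerforce:
    """In-memory stand-in for the Perforce server (hypothetical)."""
    changelists: List[str] = field(default_factory=list)

    def submit(self, description: str) -> int:
        self.changelists.append(description)
        return len(self.changelists)  # changelist numbers start at 1


@dataclass
class GitToPerforceSync:
    git: FakeGit
    perforce: FakePerforce
    last_synced: Optional[str] = None
    sha_to_changelist: Dict[str, int] = field(default_factory=dict)

    def run_once(self) -> int:
        """Mirror every unsynced Git commit as one changelist, oldest first."""
        pending = self.git.commits_after(self.last_synced)
        for commit in pending:
            number = self.perforce.submit(f"{commit.message} [git {commit.sha}]")
            self.sha_to_changelist[commit.sha] = number
            self.last_synced = commit.sha
        return len(pending)
```

Calling `run_once()` on a schedule keeps the legacy side current: tools that only understand changelist numbers keep working, while the `sha_to_changelist` mapping is the kind of lookup an SCM-agnostic proxy could use to translate between the two systems.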
How was the migration’s success measured?
The incremental approach afforded opportunities for iterative enhancement, necessitating a tight feedback loop — evaluating the new system’s performance, understanding developer satisfaction, and measuring actual productivity gains.
Recognizing the critical role of metrics to the project’s success, strategic investments were made early on:
Fine-grained metrics: Git’s built-in, low-level ‘trace2’ diagnostic framework logs data on every Git command run on developers’ machines. Salesforce anonymized and aggregated this log data to get an overview of developers’ Git usage and performance.
Aggregate metrics: To measure progress, Salesforce improved its internal bug tracking system to help track aggregate improvements in its Git-based developers’ velocity. Comprehensive reports and dashboards revealed an impressive 8-10% increase in overall developer productivity with Git versus Perforce.
Qualitative metrics: Surveys were conducted to gain insights into developers’ general satisfaction and detect areas for improvement. Feedback consistently rated the Git experience ‘good’ to ‘great’.
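To make the fine-grained metrics above more concrete, here is a minimal sketch of aggregating trace2 data. It assumes the logs were captured in trace2's JSON event format (the output Git writes when `GIT_TRACE2_EVENT` is set), where a `cmd_name` event names the command and an `exit` event reports elapsed seconds in `t_abs`. The function name and the aggregation choices are hypothetical; the post does not describe Salesforce's actual pipeline.

```python
import json
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def aggregate_trace2(lines: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Return {git command: (run count, mean seconds)} from trace2 event lines."""
    name_by_sid: Dict[str, str] = {}
    durations: Dict[str, List[float]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        sid = event.get("sid")
        if event.get("event") == "cmd_name":
            # Remember which command this session is running.
            name_by_sid[sid] = event["name"]
        elif event.get("event") == "exit" and sid in name_by_sid:
            # 't_abs' is seconds elapsed since the process started.
            durations[name_by_sid.pop(sid)].append(event["t_abs"])
    # Session IDs are dropped here, so the summary carries no per-run identity.
    return {name: (len(times), mean(times)) for name, times in durations.items()}
```

Feeding the function the lines of a collected event log yields a per-command summary, for example how often ‘git status’ ran and how long it took on average, without retaining which session produced each measurement.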
How is Git advancing developer productivity?
Git offers numerous advantages over Perforce, introducing tremendous flexibility for developers and automation systems — enabling the quick creation of temporary codeline branches and seamless switching between them. Git is a game-changer for developer productivity at Salesforce, powering several key productivity enhancements including:
Safe multitasking: Local branching empowers developers to easily transition between building features and fixing critical bugs.
Tighter inner loop: With Git, developers can commit as frequently as they want to. Frequent checkpointing helps ensure that work is never lost and that experimental drafts can always be retrieved. This also encourages delivery of smaller, self-contained units that are easier to manage and review.
Collaboration and creativity: Git’s flexible branching significantly enhances teamwork, enabling developers to easily share work, collaborate on features, and prototype novel solutions to difficult problems.
Left-shifted automation: Flexible branching also helped modernize Core’s continuous integration systems. By extensively testing proposed branches prior to merging, the stability of development codelines has improved — helping developers take their efficiency to new heights.
How does Git fit into engineering’s future plans?
Once all internal developers transition to Git, attention will focus on updating the existing infrastructure to align with Git, enabling operations under a unified SCM framework.
The migration may also drive improvement of the release process. Git’s flexible branches will help transform the current manual and error-prone procedures into an automated, continuous system. This will minimize time-consuming tasks, driving increased agility at Salesforce in response to customers’ evolving needs.
The post Explaining Salesforce’s Large-Scale Migration to Git: How We Enhanced Developer Productivity appeared first on Salesforce Engineering Blog.