Why all the drama?

This is the first of two blog posts to try to summarize recent discussions around Drupal core direction. First I want to look at some of the history leading up to the discussion and where things currently stand.

While I worked a lot on the Drupal 7 release, and overall am very happy with how it turned out, there have also been a lot of problems - especially apparent during the latter half of the release cycle.

It took 15 months from the code freeze of Drupal 7 in September 2009 until 7.0 was released in January 2011. Contrast this with the complete Drupal 6 cycle, which lasted just over a year (although that had a similar ratio of thaw to freeze).

It has taken a further nine months since 7.0 to bring down the number of critical and major bugs to a somewhat manageable level.

Combined, this is a two-year period where no major refactoring or new features have been committed to Drupal core (in theory; in practice, loads of stuff went in after code freeze).

Something obviously went wrong here compared to the original plan.

Workload

Drupal 7 increased the number of long-tail code contributors to core (people with two commit mentions or fewer) by several hundred.

There is an even longer tail of contributors who posted bug reports, tested patches etc. but did not get mentioned in commit messages. We do not have statistics for those (yet), but we do know that for each ‘core developer’, the number of issues posted per day increased by a factor of ten compared to Drupal 6.

The short/medium tail of contributors (people mentioned in three or more commit messages) stayed relatively static - dozens rather than hundreds.

The number of core committers actually decreased from 3 to 2 compared to most of the Drupal 6 cycle. Due to the superhuman efforts of Angela Byron, commits per day increased despite this.

For the three months after Drupal 8 was opened, there was only one branch maintainer, Dries. Due to our backport policy, this meant a single bottleneck for Drupal 7 fixes too. Angie was given commit rights to Drupal 8 for bug fixes to rectify this situation and has been working through the backlog that built up. The spike in Drupal 7 installs (after 4-5 months of being more or less static) directly coincides with this change. However, there is still no co-maintainer for Drupal 8.

This spreadsheet has raw stats (in progress, and thanks eaton and xjm!)

More people sometimes means less work, but it also sometimes means more work.

During Drupal 5 development and earlier, issues would often be committed after 5-6 comments and a single review. Now it’s quite common for issues to reach 50 or 100 comments and go through multiple iterations before commit. Often several months, if not years, pass between an issue getting opened and a fix getting committed.

Combine this with two years of bug fixing and a lot of people are feeling very burned out. Many core contributors (and bug reporters) have also made the shift from part time or hobbyist developers to working full time on Drupal during the same period, which reflects the growth of Drupal in general. This is a separate issue in itself which is not covered here but was discussed very well in Eaton’s DrupalCon talk. It has some good and bad effects, but overall the stakes are higher now.

Refactoring vs. adding features vs. maintaining legacy features

During ‘code thaw’, the main focus is on refactoring code and adding features (whether developer or user facing). At the same time, core has accumulated a lot of legacy features during its 11-year history. Many of these are showing their age.

During Drupal 7, we added some new features (Field API for example), but failed to find the time to refactor legacy features (profile module) to use them. At the very end of the release cycle, we were so embarrassed by profile module that we actually hid it from the UI for new installs, and Dries removed it without a viable upgrade path this week (although Drupal 8 release is now blocked on that being available so don’t panic if you’re using it now).

Drupal 7 also added new features that were built on legacy APIs or features that had not been fully modernized. For example, Field API relies partially on the form API, which in some cases has not been fully up to the task and didn’t get a full overhaul in Drupal 7. At the same time, Field API allowed us to massively update some legacy APIs like node, comment, user, file and taxonomy (although there wasn't time to finish that job by building out a full Entity API). Overall it was a positive trade-off, but with many critical issues along the way and many outstanding tasks to fully apply those concepts to core and develop them further.

Also, the Dashboard module was based on Block module which has not had a major update for several years and has numerous competitors in contributed modules. Many people consider the Dashboard to have been only a negative trade-off.

Maintaining legacy features such as Poll or profile module means that every wide ranging initiative (API refactoring, test coverage) needs to touch that many extra lines of code. Building new features on top of outdated APIs that aren’t up to the task adds a lot of code debt and overhead to later refactoring and bugfixing. Both of these have been identified as part of the reason for the increasing workload - in terms of difficulty refactoring, and the vast number of issues opened compared to the number of people who consistently work on those.

During the Drupal 7 cycle, some old features were removed from core, either entirely or by moving them to contributed projects (ping module, throttle module, user access rules). These are outweighed by the functionality of over 100 contrib modules that was added to core during the same period (see Upgrade Status module).

In the past, Dries has been extremely resistant to removing some features from core, despite a lot of community pressure to do so in some cases and those features being relatively unmaintained. Discussions over the past couple of weeks have been trying to approach this from a high level, in terms of which classes of features, and which specific features, are maintainable. Discussions are still very much ongoing, and no, we're not going to rip out everything above includes. More on this in the next blog post.

Letting things get out of hand

While we introduced automated testing early in the Drupal 7 release cycle, it took many, many months before we were able to run automated tests against every patch. qa.drupal.org assumes a 100% pass rate for tests, and while some patches might go in working towards that, at the same time others would go in breaking different tests. This created a chicken and egg situation where we needed a 100% pass rate so we could block patches that failed, but we couldn't block patches that failed until we had a 100% pass rate.

As part of the effort to build out test coverage of core and fill in the gaps, several people (me included, mea culpa) opened dozens if not hundreds of critical bugs against Drupal 7 where test coverage was missing. This seemed like a good idea at the time especially since we had a promise of a longer code thaw if we got to 100% test coverage.

However, as time went on, while we ended up with a lot of tests, they tended to reflect the already tightly coupled nature of core. That also left some areas untestable for a long time due to complexity and architecture, and those areas began to accumulate hidden bugs. Most notable was the upgrade path from 6.x to 7.x.

At the same time, the number of reported critical bugs continued to grow (reaching around 400 at one point), but while this was happening, brand new features continued to be committed (which in turn added more critical bugs to the list).

We are still fixing critical bugs in the core upgrade path now, more than two years later, due to this imbalance.

Getting out of the rut

I'll discuss technical issues and forward planning in the next blog post, but I wanted to highlight some changes already being made (or in progress) that try to deal with some of these issues, especially from a social standpoint.

Many of these were already being put in place months before the recent flurry of activity on Planet and have not been the main focus of those discussions. However, they’re closely linked to the problems we’re facing.

Release management

In terms of managing the overall release cycle, two things are already being put into place.

Drupal 8 initiatives mean that particular areas of feature development have official blessing and communication channels as well as identified leads.

Issue queue thresholds mean that core is only allowed to reach a certain level of brokenness before we take a break from committing major end user and API features, and move back to clean-up and bug fixes. The idea here is to never be more than a few months from a viable release at any time during the cycle - at least in terms of stability.
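As a rough illustration of how such a threshold rule could work, here is a minimal Python sketch. The limits, the `QueueThresholds` class and the `features_allowed` function are all hypothetical names and numbers made up for this example; the post does not state the actual thresholds or how they are enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueThresholds:
    """Hypothetical limits; the actual numbers are not given in this post."""
    max_critical_bugs: int = 15
    max_major_bugs: int = 100


def features_allowed(critical_bugs: int, major_bugs: int,
                     limits: QueueThresholds = QueueThresholds()) -> bool:
    """Return True while both open-bug counts are at or below the limits.

    When this returns False, the idea is that committers pause major
    feature commits and go back to clean-up and bug fixes.
    """
    return (critical_bugs <= limits.max_critical_bugs
            and major_bugs <= limits.max_major_bugs)


if __name__ == "__main__":
    # Queue in reasonable shape: feature work can continue.
    print(features_allowed(critical_bugs=10, major_bugs=80))
    # Too many criticals: back to bug fixing.
    print(features_allowed(critical_bugs=40, major_bugs=80))
```

The point of checking both counts together is that exceeding either limit is enough to pause feature commits, which keeps core close to a releasable state throughout the cycle.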

Quality

Several ‘gates’ are being put into place (for usability, accessibility, performance, testing), to make it clearer what the requirements are for getting patches committed to core, and to provide a framework for patch authors, reviewers and committers to be able to decide whether particular issues are covered in those areas.

Core office hours

This is an attempt to make it easier for new people to get involved in core, and to do so in a way that’s effective - experienced core contributors make themselves available at specific times to answer questions, co-review patches etc. See http://drupal.org/node/1242856 for the announcement.

Together, these changes will hopefully help to better focus efforts around the work we're already doing, and make getting involved more accessible.

Read on to part 2.

Comments

Just a heads up, the "spike in Drupal 7 installs" link in the 6th paragraph is malformed, causing content to be cut off.