In the community of media and journalism innovators, it is commonly accepted that releasing software with an open-source license is the best way to maximize the chance that others will use your code. Yet, by any measure, the vast majority of open-source software goes nowhere.
That’s why we’ve spent some time at Knight Lab trying to understand the dynamics of software adoption — especially the factors that cause open-source software to be widely adopted. After all, our mission is to develop software for journalists, publishers and media consumers — software that gets used.
In researching the topic, I discovered that two faculty members at the University of Massachusetts literally wrote the book on it. In “Internet Success: A Study of Open-Source Software Commons,” Charles Schweik and Robert English studied more than 174,000 open-source projects shared on SourceForge, one of the largest and oldest (founded in 1999) repositories for open-source code.
“From 1998 to 2005, there was a lot of hype about high-profile success stories like Linux or Apache, that were really anomalies,” Schweik said. “We were trying to get a better handle on the broader population of open-source projects, and at the time SourceForge was arguably the dominant open-source hosting site.”
I’ll summarize their key findings below, but first I think it is important to understand a little about how they approached their research.
The researchers based their work on an analysis of projects hosted on SourceForge as of 2009. They classified projects based on what “stage” the software had reached as of that time:
- Initiation stage — the period of time before the first public code release
- Growth stage — the period after the first release until work on the software appeared to have stopped.
A successful project, as defined by Schweik and English, achieves at least three software releases and has value for “at least a few users.” As evidence of user value, they looked at project attributes such as the number of downloads and installations, the presence of continued development activity, posts to project discussion boards or email lists, and the addition of one or more developers.
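To make that definition concrete, here is a minimal sketch in Python of the two stages and the success test as summarized above. The `Project` record and its field names are hypothetical, not the researchers' actual data model, and the `has_user_value` flag is a stand-in for the evidence they weighed (downloads, continued activity, discussion posts, added developers).

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical record; field names are illustrative assumptions.
    releases: int          # number of public code releases so far
    has_user_value: bool   # stand-in for evidence of value to "at least a few users"


def stage(project: Project) -> str:
    """Initiation is the period before the first public release; growth follows it."""
    return "initiation" if project.releases == 0 else "growth"


def is_successful(project: Project) -> bool:
    """Success as summarized here: at least three releases plus evidence of user value."""
    return project.releases >= 3 and project.has_user_value
```

Under these assumptions, a project with five releases but no sign of user value would not count as successful, and neither would a well-used project that has shipped only twice.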
Schweik and English supplemented their analysis of the SourceForge data with a survey of 1,400 open-source developers. This enabled them to ask questions about a subset of open-source projects that could not be answered just by looking at SourceForge data. Using these data, they were able to identify characteristics of open-source projects that correlate statistically with a greater likelihood of success.
Here’s a summary of what they found:
1. Most open-source projects are not successful
Among the 174,333 projects they reviewed, Schweik and English found they could assess success or abandonment for 145,475. Of that total, only one in six — 17 percent — were successful. Almost half the projects (46 percent) were abandoned in the initiation stage — before the first software release. More than a third (37 percent) were abandoned after the initial release.
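The shares above can be turned back into rough project counts with a few lines of Python. This is only a sketch: the published percentages are rounded, so the counts are approximations rather than figures reported by the researchers, and the variable names are my own.

```python
ASSESSED = 145_475  # projects whose outcome could be assessed

# Published shares, in whole percent.
shares = {
    "successful": 17,
    "abandoned before first release": 46,
    "abandoned after initial release": 37,
}

# The three outcomes should account for every assessed project.
total_percent = sum(shares.values())

# Approximate counts implied by the rounded percentages.
counts = {outcome: round(ASSESSED * pct / 100) for outcome, pct in shares.items()}

# "One in six" expressed as a percentage, for comparison with the 17 percent figure.
one_in_six = 100 / 6

for outcome, n in counts.items():
    print(f"{outcome}: about {n:,} projects")
```

Because the inputs are rounded, the three counts may not sum exactly to the number of assessed projects.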
2. Successful projects have some common characteristics
- A “relatively clearly defined vision and a mechanism to communicate the vision early in the project’s life”;
- A clearly defined set of users who have a need that can be met by the software;
- Well-articulated and clear goals established by the project’s leaders;
- Good project communication — a quality website, good documentation, a bug-tracking system and a communication system such as an email list or forum;
- Once a project has achieved its initial release, a software architecture that is modular — so future development tasks can be carved out at different levels of complexity for other developers to work on. (Modular architecture alone isn’t enough — many abandoned projects were also modular, Schweik said.)
Most of these characteristics come down to effective leadership, Schweik told me in an interview. “There is someone who was more diligent, who was trying both to describe their project to the world and to provide a clear vision of where the project was going, and articulating those goals,” Schweik said.
3. Open-source projects flourish when developers are also users of the software
“What we found across the board was the motivation of a user-centered need,” Schweik said. “I need this software, so I want to work on it.”
This is consistent with evaluations done by the Knight Foundation for software projects funded under the Knight News Challenge in 2007-08 and 2009. Successful projects, such as DocumentCloud and Ushahidi, had developers who had reason to use the software after its initial release.
4. The global Internet has made it easier to find software collaborators
While open-source projects can be successful without building a large community of code collaborators, Schweik and English were surprised to discover where additional developers were coming from. When a project added a developer to the team, 58 percent of those developers came from a different continent than the original lead developer — most commonly, Europe. Schweik said that for many successful projects, developers from multiple continents had never met in person.
The Internet has enabled “intellectual matchmaking,” Schweik said. “By having a hub where people go looking for things, it’s allowing people with a passion and interest to go connect with each other, and it’s driven by this user-centered need.”
Adding one or more developers was an indicator of software success, the research found. The addition of even one developer was meaningful, since most open-source projects are relatively small, Schweik said.
5. Some characteristics thought to be important in the spread of open-source software turn out not to matter
Schweik and English looked at many different characteristics that researchers had suggested could be important for open-source software success. Here are some of the characteristics they found do not matter:
- which operating system (Apple, Windows, Unix) the code was written for;
- how many developers were involved;
- whether the project has a formalized system of governance (Schweik suggested this doesn’t matter much because most open-source projects are so small);
- which type of open-source license was used;
- whether the project has a source of funding.
“In our research, 75 percent of the open-source projects had no funding of any kind,” Schweik said. “Investment doesn’t appear to drive open-source projects. Rather, the need for the software drives development.”
That’s not to say funding is unimportant — “projects that are funded have higher success rates,” Schweik said. But he said it may well be that good projects attract funding, rather than the other way around.
6. Success doesn’t have to mean large-scale adoption
Much of the attention to open-source software has gone to massive projects such as Linux, which has had more than 1,000 code contributors and is used throughout the technology industry and businesses large and small. But a project can also be successful, Schweik said, if it meets the ongoing needs of a small number of users.
“A clearly identifiable need, even without a big user base, can be successful,” Schweik said.
What should funders look for?
I asked Schweik to put himself in the position of a funder, such as the Knight Foundation, that wants to invest in making open-source projects successful. What criteria should such a funder use to determine which projects to support financially?
For projects in the initiation stage (no software has yet been released), Schweik said, a funder should look for:
- how well-defined the vision for the project is;
- whether the project’s developers have a track record of contributing productively to open-source projects or “leading through doing”;
- how professional and up to date the project’s Web presence is;
- evidence that the project leaders have thought about a marketing plan;
- evidence that the people involved in starting the project have a need for it that will extend beyond the first release.
For projects in the growth stage (after the first code release), Schweik said, a funder should consider:
- whether the project has a well-defined set of users;
- whether there is a natural constituency of developers interested in continuing to use the software;
- whether the developers have prior open-source experience;
- as with the initiation stage, whether the people leading the project have demonstrated leadership by articulating a clear vision, having a professional web presence and maintaining an active bug-tracking system or other communication platform for interacting with the user community.
Rich Gordon is a professor and director of digital innovation at the Medill School of Journalism. At Medill, he launched the school’s graduate program in new media journalism. He has spent most of his career exploring the areas where journalism and technology intersect. Prof. Gordon was an early adopter of desktop analytical tools (spreadsheets and databases) to analyze data for journalistic purposes. At The Miami Herald, he was among the first generation of journalists to lead online publishing efforts at newspapers. At Medill, he has developed innovative courses through which students have explored digital content and communities and developed new forms of storytelling that take advantage of the unique capabilities of interactive media. In addition to teaching and writing about digital journalism, he is director of new communities for the Northwestern Media Management Center, where he is responsible for a research initiative focusing on the impact of online communities, including social networks, on journalism and publishing.
This post originally appeared on the Knight Lab blog.
The Knight Lab is a team of technologists, journalists, designers and educators working to advance news media innovation through exploration and experimentation. Straddling the sciences and the humanities, the Lab develops projects, prototypes and innovative bits of code that help make information meaningful and promote quality journalism, storytelling and content on the internet. The Knight Lab is a joint initiative of Northwestern University’s Robert R. McCormick School of Engineering and Applied Science and the Medill School of Journalism. The Lab was launched and is sustained by a grant from the John S. and James L. Knight Foundation, with additional support from the Robert R. McCormick Foundation and the National Science Foundation.
I agree with most of the analysis, but IMHO many points apply equally to any software project released under any license, especially the first point: there is just not enough room for everyone in most markets, and most projects will end prematurely, notwithstanding the flavor of the license!