September 23rd 2016

By Chris White

TAGS:
Open Source

Software Evaluation, Part Two: A Closer Look

Adding software dependencies to a project can be good or bad. On the one hand, they can save us the time of implementing things that have already been built for us. On the other hand, they can sometimes cause problems.

In part one of this miniseries, we looked at how to make a basic assessment of a third-party software project with a view to adding it as a dependency in one of our own. We looked at how the project handles breaking changes, testing, release branches, and licensing.

In this post, we’re going to look at how to scrutinise a project more closely.

Making Modifications

With some dependencies, we might not need to make any modifications. It might be as easy as adding a line to our Gemfile and moving on.

But other times, we might want to make changes, particularly with large dependencies or dependencies that provide key functionality. This could mean adding a feature from a fork, adding an unreleased feature, backporting a bug fix, applying a security patch, or even making our own modifications.

If you think this applies to you, it’s important to consider how easy making a change would be.

How large is the codebase? How well organised is it? Is it documented for developers? Is it written in a language you know how to edit? Or will you have to pay for someone to make changes for you?

You could actually try to make a small change to see how easy you find it. This will give you a rough idea of what you’re dealing with.

Contributions

Let’s say you’ve made a change yourself and you’d like to get it merged into the upstream project. How easy is that going to be?

Some projects have long, drawn-out processes that require many code reviews. You might even be dealing with a community that likes to bikeshed every last change. Other projects might be much more laid back, quickly merging pull requests without any fuss at all.

This isn’t as simple as saying that longer processes are bad and shorter processes are good. In fact, for large complex projects (especially projects like databases, where data corruption could be a huge issue) a longer process can actually be a good sign. It means they more carefully scrutinise every last change.

On the other hand, a community that likes to make it hard for you to land contributions could end up being very frustrating if you need to make a lot of changes further down the line. (It might also be an indication of poor community health.)

Either way, you probably want to look at whether they have continuous integration set up, or other methods of automatically testing code before merging it in.

Second-Order Dependencies

Software projects you add as dependencies for your code sometimes come with dependencies of their own! We can call these second-order dependencies.

Of course, these second-order dependencies might come with their own third-order dependencies, and so on. But how far you want to go with this is entirely up to you…
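To make the idea of dependency "orders" concrete, here is a minimal sketch in Python. It assumes you have already extracted a dependency graph from your package manager into a plain dictionary. The package names and the `dependency_orders` helper are made up for illustration and do not come from any real tool.

```python
from collections import deque

# Hypothetical dependency graph: each key is a package and each value
# lists the packages it depends on directly. All names are made up.
DEPENDENCIES = {
    "our_app": ["web_framework", "json_parser"],
    "web_framework": ["template_engine", "http_server"],
    "json_parser": [],
    "template_engine": ["string_utils"],
    "http_server": ["string_utils"],
    "string_utils": [],
}


def dependency_orders(graph, root):
    """Map each reachable package to its order relative to root.

    Order 1 is a direct dependency, order 2 is a dependency of a
    dependency, and so on. A breadth-first walk records the shallowest
    order at which each package appears and copes with cycles.
    """
    orders = {}
    queue = deque([(root, 0)])
    while queue:
        package, depth = queue.popleft()
        for dep in graph.get(package, []):
            if dep != root and dep not in orders:
                orders[dep] = depth + 1
                queue.append((dep, depth + 1))
    return orders


if __name__ == "__main__":
    found = dependency_orders(DEPENDENCIES, "our_app")
    for name, order in sorted(found.items(), key=lambda item: (item[1], item[0])):
        print(f"order {order}: {name}")
```

A listing like this can help you decide where to stop: you might review everything at order one and two and only skim anything deeper.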

For most projects, you don’t need to get too involved. But a few quick checks might save you a lot of headaches further down the line.

Do any of those second-order dependencies regularly introduce breaking changes? That sort of thing is going to bubble up easily and could disrupt you.

Another thing you can look at is implementation language. Are the second-order dependencies written in popular languages? Or are they written in young or uncommon languages with regular changes and poor OS packaging support?

Community Health

Open source software projects often live and die by the health of their communities. So this is one important thing to look at. How big is the community? How diverse is it? How long has it been around? Is it growing or shrinking?

You can get an idea of this by checking out the project IRC channels, mailing lists, bug trackers, forums, user groups, social media, and so on. How many posts are there? How often do experienced community members reply? Does the project have a code of conduct or is it otherwise welcoming and inclusive?

The makeup of the contributor team is important too. How many core contributors are there? If there’s only one and they don’t seem to be interested in adding more people, that’s a risk. That person could stop working on the project at any time.

The size of the core contributor team, the diversity of their backgrounds and reasons for contributing, and the project’s willingness to train up and add new contributors are all factors to consider when assessing how long the project is going to be around.
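One rough way to see how concentrated a project’s contributions are is to count commits per author. The sketch below assumes you have a list with one author name per commit, which you could get by splitting the output of `git log --format=%an` into lines. The `commit_share` helper and the sample names are hypothetical, and commit counts are only a crude proxy for who the core contributors really are.

```python
from collections import Counter


def commit_share(authors):
    """Return (author, share) pairs, most active author first.

    `authors` holds one entry per commit. The share is the fraction
    of all commits in the list made by that author.
    """
    counts = Counter(authors)
    total = sum(counts.values())
    return [(name, count / total) for name, count in counts.most_common()]


if __name__ == "__main__":
    # Made-up sample data standing in for real commit history.
    sample = ["alice"] * 8 + ["bob"] + ["carol"]
    for name, share in commit_share(sample):
        print(f"{name}: {share:.0%}")
```

If one person accounts for nearly all of the commits, that is a prompt to look more closely at whether the project is bringing on new contributors.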

It’s also important to look at where the project has been and where it’s going. Does it look like it’s on a trajectory (in terms of feature set, community size, popularity, and so on) that is going to be suitable for your project in six months? How about five years?

Support

Most healthy open source projects will respond to bug reports and support queries in a reasonable timeframe. But the responsiveness and helpfulness you’ll experience via IRC, mailing lists, and bug trackers is going to vary a lot. This is probably something you’ll want to check out beforehand.

Documentation is also an important form of support. And one that a lot of open source projects could do better with. So you’ll want to check out what the state of the documentation is, especially for larger and more complex projects.

You can also find documentation away from the official project websites. You could check to see how often people blog about the project. Or you could check question and answer sites like StackOverflow to see how regularly the project is covered.

Don’t forget paid support too. A lot of larger open source projects are backed by commercial entities who often pay people to work on the project. Many of these companies also offer paid support plans. And even if they don’t, you might be able to hire freelancers or consultants from the community.

Also don’t forget sites like BountySource, which provide an easy way to pay to get people working on issues you care about.

Security

Security is an ever-increasing concern for many people. And it’s almost certainly something you want to look at while evaluating software.

One quick way to get a sense of how seriously a project takes security is whether they have a security contact. Even better if they have a dedicated security address and some sort of public encryption key advertised so you can communicate securely.

If the project expects you to file security issues in the public issue tracker along with everything else, this should be a red flag.

Another thing to look at is how the project actually handles security issues. Does it consider them separate from regular bug reports? Do they get CVE numbers? Does the project prioritise security fixes? Does it release security patches for older releases?

Aside from how the project itself handles security issues, you can also take a look at how secure the code is. Does the project make use of industry best practices? Can the code be sandboxed? How does it use the network? How are authentication and authorization handled?

You can also run automated security testing and network analysis to get a more comprehensive overview.

Conclusion

In this post, we looked at a number of ways to do a more in-depth review of a software project before adding it as a dependency.

Specifically, we looked at:

How easy is it to make changes to the code?
How easy is it to contribute your change upstream?
Does the software have any dependencies of its own you need to consider?
How healthy is the community?
What sort of support is available?
How does the project approach security?

It’s probably not necessary to look at all of this for all of the projects you add as a dependency. But certainly, the larger and more crucial the dependency, and the larger and more crucial the project you’re working on, the more of this you’re going to want to look at.

Be prepared to make compromises. The goal of this effort is to evaluate, minimise, and understand the risk you’re introducing to your project. You’ll never be able to completely remove risk.

Finally, because software moves at such a rapid pace, it’s also worth considering introducing semi-regular re-evaluations. A project you add as a dependency today might seem okay, but perhaps one year down the line the situation has changed.

About Chris White

Chris White is a Distribution team member at Engine Yard and works on the automation of many of the virtualization solutions used at the company. For more than 10 years, he has enjoyed hacking away on the Gentoo Linux Distribution. While not checking out what’s under the hood of widely used technologies, Chris also enjoys brushing up on his Japanese.