Software versioning is an important part of the development cycle of any program, since it is what allows users and other developers to track the changes made to the product at a glance. It's something that is taken for granted, and most software projects probably don't put much thought into it. However, to avoid some caveats of the traditional versioning system explained below, and to make your versioning more consistent and user-friendly, it is worth adopting (or creating) a standard for versioning your software.
After a bit of research into the subject, I have put quite a bit of thought into my own versioning system that I will present here. But first let me explain what potential problems led me to making a system like that (if you're here for the specification only, feel free to skip the next section).
I have looked up several versioning systems in search of one that would be good to use for my Arcomage project - a rather small program, but with extensible features, which necessitated a versioning system to begin with. In the end, each of the systems I looked up was lacking in one way or another. Probably the most complete specification for general use I have seen so far is SemVer, or Semantic Versioning. If you haven't read about it, you can do so now by going to their website. Now, this versioning suggestion is pretty nice, and I highly recommend using it for libraries, which need fine control more than user readability. It would work for programs as well - for the first release, that is. Before and after that, the specification is unfortunately rather vague. What is more, it doesn't work for small projects and it is not very informative to the users, or it is limiting the developers' freedom. Consider these situations:
- You have started a small project, and wish to use a versioning specification from the get-go. You are including new features to your program (SemVer calls it an API, but the term is not applicable for non-libraries; I will also use it here as an alias for multiple features), but you don't think it is stable enough for the public. Say, you create two functions that handle two newly introduced features, and you have fixed a bug after implementing both. Then you changed one of the features to be more extensive. With SemVer, it is not defined at all what should be done in that case. You can't go over 0.X.X, even if the API has changed. So you are limited to changing the minor and patch numbers. At that point, it would probably mean that the program is version 0.3.0. If you were to take the traditional versioning model (the one that is not written, but depends solely on the developer's common sense, and is expected by the end-user), it would probably be more similar to 0.1.3.
- Now the developer believes that the program is good enough for public testing, and he wants to reflect that in the version. This is when the 'alpha', 'beta' and 'RC' terms come into play, and that's one of the most vague things described by SemVer and the most confusing for both developers and users alike. In this case, the developer wants to announce a beta release. With SemVer, you probably have to note it as 1.0.0-beta.1. With the traditional model, it would be completely undefined, since it could be 0.1.4, 0.2.0 or 1.0.0beta1, with the two former not indicating the beta status at all, and the latter option being severely limiting and confusing for the users.
- Oh my, the developer got a lot of criticism after the first public release, so he wants to revise his whole program! That's unfortunate - it means that the developer is effectively going back from the beta stage to the alpha stage. Again, we have a lot of issues. SemVer doesn't say what happens in that case. You can't leave the beta stage, since you would effectively be changing the earlier version, but you can't stay in the beta version either, since you are breaking the API. And you can't skip a version, since SemVer regards 1.0.0 as the milestone, even though skipping is probably the preferred method there. With the traditional approach, it is just as big a problem. If he chose the first option in the previous scenario, the version would be 0.1.5; with the second option it would be 0.2.1. Once again, neither indicates that it was an alpha or a beta, and with the first option it would seem to the users as if the alpha version is as stable as the previous beta. If he picked the last option, the developer may choose to go back and create a 0.1.4 version and, if all goes well, release 1.0.0-beta2 after that, or skip an internal version altogether.
- Now consider that the developer has improved the program significantly, and has released a stable version 1.0.0. And then, he has a brilliant idea about a feature he wants to implement, but the new feature obsoletes two of the older features. Uh oh! With SemVer, you have to either give up on that idea, or leave the old features in as deprecated functions, or bump the major version to 2.0.0, even if that's for a small revision. With the traditional model, it would probably mean using 1.0.1 instead.
- The program has grown, and the developer is going for the second version of the software. The stable version of the first edition is, say, 1.5.10. Now the developer wants to target the second edition with a lot of new and exciting features that are bound to break the API. He has two choices with SemVer - either continue with the version 1.X.X adding new features, but retaining compatibility as long as possible, or go to 2.0.0-alpha.1. Both routes are going to cause severe problems. If the former option is picked, users will believe that the alpha version is as stable as the final release was (technically they would be right, if it was a library, but it's not nearly like that if we are talking about an application). So they would upgrade, just to see a whole lot of segmentation faults and crashes. If the second option was picked, the developer would be limited to changing the number after 'alpha', which means that the minor and patch versions won't matter at all, so the version will end up being 2.0.0-alpha.6268 before it goes to beta. With the traditional approach, it is again very undefined, since every person would find their own scheme to use in that case.
Now, all of those issues could probably be avoided by simply releasing less often, not numbering internal releases at all, and not noting the 'alpha' or 'beta' development stages. But that means that SemVer (and even the traditional method) doesn't fit software guided by the "release early, release often" philosophy, let alone perpetual beta software. SemVer mainly fits large and established libraries, like libpurple. It also has a few other flaws. For example, developers will probably forget to increment the version number after some changes, since it is never stated when exactly the version should be incremented. When the change has just begun? When it is finished? What if you want to fix a few things after you implement the changes?
I am proposing a slightly different versioning system that is more aimed at applications and being user-friendly, while removing as much ambiguity as possible and still being flexible. I will call it GemVer for now, and I will use it for my software from now on.
Gem Versioning Specification
The basic structure of GemVer is this:
major.status.minor.patch
While this is very similar to SemVer, there are in fact rather big changes not reflected here. Let me go into more detail and give you an actual version example:
1.beta.03.09
- The major version shows the number of the stable release the version is aiming for. That is the target for all of the beta versions. Unlike in SemVer, it doesn't count API changes, but instead the development cycle the software is currently in. All software starts at cycle #1, and this number increases only after a final version of the software is released and work on a second version starts.
- The status version is written in words to prevent confusion. Available words are:
- prealpha: Indicates that the software is not ready for internal testing, that is, it doesn't compile or segfaults immediately after launch due to the framework still being built. This status is only needed when the project is just starting or if the software is being rewritten from scratch.
- alpha: Indicates that the software is ready for testing, but is not yet feature-complete. Features can be added or removed at will. This is the longest state in the development cycle. If it extends for a long time and there are no plans for releasing a more stable version, you have a perpetual beta (technically it should be a perpetual alpha).
- beta: Indicates that the software is under a feature freeze. No new features can be introduced, only bug fixes. Users know that this, minus the bugs, is what they can expect out of the finished product.
- rc: Release candidate stage. The developer has solved all of the bugs in the beta version and the software is nearly ready to be used in production; the only changes allowed in this stage are if new bugs are found. This stage is timed; if after a given time no new bug reports are filed, the release candidate becomes a final release without any changes. If there are any new reports, the timer is reset.
- final: Also known as Gold master, Gold release, Release to Manufacturing or Release to Web. This is a stable release candidate for which no new bugs were found during the given timeframe. That means that you can't mark something as final if you have changed the code of the program in the same commit - an RC version gets promoted to final automatically.
- The minor version indicates the number of new features implemented and large bugs fixed. It is incremented every time a feature has been fully implemented, that is, when the developer is fairly sure the new function(s) will not change significantly, or when a large bug is truly fixed. The minor version has a fixed number of digits, so that there is no confusion between versions such as 1.rc.1.11 and 1.rc.11.1. The number of digits should be kept as low as is reasonable (if you are using a bug tracker, it would probably be the number of digits in the count of open issues, possibly plus one digit). Two digits is enough for a small project, since it is unlikely to go over 99 in a single release cycle. Large projects might need three or even four digits. The minor version is not reset within a single development cycle.
- The patch version indicates a small change. It could be a new feature that has just begun being implemented but is not complete yet, a bug whose fix is still being worked on, a small bug that was fixed, or anything else small you can do in a commit. The patch number gets reset on every minor version increment in order to keep the number small. For readability, it makes the most sense to give it the same number of digits as the minor version.
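To make the structure above concrete, here is a minimal Python sketch of how a GemVer string could be parsed and printed back. The names `GemVer`, `parse` and `STATUSES` are my own illustration and not part of any existing tool, and the sketch assumes the minor and patch numbers are padded to the same width, as recommended above.

```python
import re
from typing import NamedTuple

# The five status words, in the order a development cycle moves through them.
STATUSES = ("prealpha", "alpha", "beta", "rc", "final")


class GemVer(NamedTuple):
    major: int
    status: str
    minor: int
    patch: int
    width: int = 2  # digits used when printing minor and patch

    def __str__(self) -> str:
        w = self.width
        return f"{self.major}.{self.status}.{self.minor:0{w}d}.{self.patch:0{w}d}"


_PATTERN = re.compile(r"(\d+)\.(" + "|".join(STATUSES) + r")\.(\d+)\.(\d+)")


def parse(text: str) -> GemVer:
    """Split a 'major.status.minor.patch' string into its four parts."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a GemVer string: {text!r}")
    major, status, minor, patch = match.groups()
    # Assumption: remember the wider of the two fields as the padding width.
    width = max(len(minor), len(patch))
    return GemVer(int(major), status, int(minor), int(patch), width)


version = parse("1.beta.03.09")
print(version.status, version.minor, version.patch)
print(version)
```

Keeping the padding width next to the numbers is just one possible design; it lets the version print back exactly as it was written.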
One very important thing to remember is that you basically need to use a version control system to make this work (after all, it's version control for a reason!). This system, unlike SemVer, covers internal versions just like release versions. Every commit must increment one of the version numbers - that way, you will never accidentally forget to increase the version number. To know which version to increment, you just have to ask yourself a simple question or two after each commit: "Did this commit introduce any new major functionality or fix a major bug? If so, is it the last one required to advance the development cycle?" If you answered "No" to the first question, you need to increase the patch version only. If you answered "Yes" and "No", you need to increase the minor version and reset the patch version to 0. If you answered "Yes" to both, you need to increase the status to the next iteration, increase the minor version and reset the patch version to 0. You only reset the minor version when you increment the major version. You increase the major version once you have a list of features to implement in the next edition of the software and have started working on it.
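The two questions can be written down as a small decision function. This is only a sketch: the function name `bump` and the tuple layout are made up for the example, and capping the status at "final" is my assumption, since the rule above doesn't say what comes after it.

```python
STATUSES = ["prealpha", "alpha", "beta", "rc", "final"]


def bump(version, major_change, completes_stage):
    """Return the next version for one commit.

    version         -- (major, status, minor, patch) tuple
    major_change    -- did the commit add major functionality or fix a major bug?
    completes_stage -- if so, was it the last one needed to advance the cycle?
    """
    major, status, minor, patch = version
    if not major_change:
        # "No" to the first question: only the patch version grows.
        return (major, status, minor, patch + 1)
    if completes_stage:
        # "Yes" to both: move to the next status.
        # Assumption: "final" has no successor, so the status stays there.
        index = STATUSES.index(status)
        status = STATUSES[min(index + 1, len(STATUSES) - 1)]
    # "Yes" to the first question: minor grows, patch is reset.
    return (major, status, minor + 1, 0)


current = (1, "alpha", 10, 18)
print(bump(current, False, False))  # (1, 'alpha', 10, 19)
print(bump(current, True, False))   # (1, 'alpha', 11, 0)
print(bump(current, True, True))    # (1, 'beta', 11, 0)
```

Incrementing the major version is left out on purpose, since that is a planning decision rather than something you can read off a single commit.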
Another way of thinking, especially if you are familiar with Git, is to regard the minor version as the number of branches merged into the project, and the patch version as the number of commits submitted since the last branch was merged. This could even be automated fairly easily. Of course, this works only if you branch when fixing every larger bug and implementing every new feature, but since Git allows for very easy branching, using this model is even encouraged. You could even have separate branches for different statuses, like alpha and beta, to have the opportunity to go from one to another when needed. In that case the beta branch would have all the well-tested features and alpha would have all the bleeding-edge features.
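As a rough illustration of the branch-counting idea, the sketch below derives the minor and patch numbers from a simplified history. It doesn't talk to Git at all: the history is assumed to be a plain list of flags, oldest commit first, where True marks a merge commit. The function name is hypothetical.

```python
def counts_from_history(is_merge_flags):
    """Return (minor, patch) for a history given oldest commit first.

    Each item is True for a merge commit (a finished feature or large fix)
    and False for an ordinary commit.
    """
    minor = 0
    patch = 0
    for is_merge in is_merge_flags:
        if is_merge:
            minor += 1   # one more branch merged into the project
            patch = 0    # patch counts commits since the last merge
        else:
            patch += 1
    return minor, patch


# Two ordinary commits, a merge, one commit, another merge, three commits.
history = [False, False, True, False, True, False, False, False]
print(counts_from_history(history))  # (2, 3)
```

In a real setup the flags would have to come from the repository itself, which is the part that could be automated.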
In addition to a version control system, it would be very beneficial to have a bug tracker. That makes it easy to see whether you have fixed a bug or implemented a feature or not. GitHub is very useful for this, since it provides both an issue tracker and a specialised Git version control hosting space.
Also note that GemVer does away with the traditional 0.X.X system for starting projects altogether. Under GemVer, all projects start at 1.prealpha.00.00 automatically, since the software is in the first development cycle. The first commit should already bump the project to 1.prealpha.00.01.
This shows that GemVer makes it very easy for users to tell what changed in a new version and whether they should update or not. If the application version is 1.alpha.10.18, the program has had 10 features implemented or large bugs fixed since the beginning of the project (pre-alpha) and 18 minor changes since the last feature was implemented. If it is 1.beta.052.125, then the software is in feature freeze and has had 52 features implemented and bugs fixed since the start of the project, and 125 miscellaneous commits since the last major feature was implemented. This way, there is no longer any need for "beta 1" or "beta 2": you indicate that directly by the number of bugs fixed, so users know just how different the two are. What is more, you can stop the beta phase and go back to alpha at will, since the minor version is never reset during a single cycle. With this system, it is obvious that 1.alpha.086.015 is newer and has more features than 1.beta.025.152, but the latter is guaranteed to be more stable than the former, while they are both similar in concept (since the major version hasn't changed). 1.rc.087.502 means that the user can use the software with nearly perfect stability and expect a final release in a short while. 1.final.089.001 means that compared to 1.final.088.000, there has been one major bug fixed and one minor alteration.
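The "newer" and "more stable" comparisons are two separate questions, which a short sketch can show. The helper names here are invented for the example. Newness is read from the numbers only, and stability from the status word only (which is only meaningful within the same major version).

```python
STATUS_RANK = {"prealpha": 0, "alpha": 1, "beta": 2, "rc": 3, "final": 4}


def split_version(text):
    """Turn 'major.status.minor.patch' into (major, status, minor, patch)."""
    major, status, minor, patch = text.split(".")
    return int(major), status, int(minor), int(patch)


def is_newer(a, b):
    """True if a comes after b, ignoring the status word entirely."""
    a_major, _, a_minor, a_patch = split_version(a)
    b_major, _, b_minor, b_patch = split_version(b)
    return (a_major, a_minor, a_patch) > (b_major, b_minor, b_patch)


def is_more_stable(a, b):
    """True if a is in a later status than b."""
    return STATUS_RANK[split_version(a)[1]] > STATUS_RANK[split_version(b)[1]]


print(is_newer("1.alpha.086.015", "1.beta.025.152"))        # True
print(is_more_stable("1.beta.025.152", "1.alpha.086.015"))  # True
```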
Since I don't have all the specifics figured out, the form is still open for discussion, and perhaps flexible. For instance, instead of version 1.beta.56.08, you could use these ways of typing the same version:
- 1.56.08-beta
- The downside of it is that it is not obvious that it is a beta of version 1, and not a beta of version 1.56.08. However, it is more traditional and more compliant when writing a library, since linking against 1.beta is not a good idea - after all, you want to link against the final release as well. But when writing a library, you should probably stick with SemVer and your own internal versioning system.
- 1/beta/56/8
- 1;beta;56;8
- 1-beta-56-8
- On the positive side, this method lets you omit the preceding zeroes, since there no longer can be any confusion with decimal characters. On the negative side, it is unusual and may not look good for users.
- 056.008
- You can use this for software in a perpetual beta (technically a perpetual alpha), since neither the status nor the release cycle number ever changes. However, it requires a lot more digits.
- 1.beta.10.46.08
- If you wish to be even more precise, you can separate the minor version into two, one for features and one for bugs, and retain the patch number at the end.
- 1.beta.56.1058
- You can change the patch number into a build number (by not resetting it on every minor version change, so that the build number is actually the revision/commit number). The positive side is that both of the last numbers will always grow, and users will be able to tell how many commits are needed for a usual major feature to be implemented. The negative side is that the numbers no longer have a consistent length, and the last number will be too high for the user to remember.
The downside of using any of these templates as a whole is that it requires you to use words. You might think, why not substitute the status with a number, say, pre-alpha is 0 and final is 4? The problem with that system would be that you could get a paradox, since you may be forced to release versions in this order: 1.2.11.05, 1.2.12.08, 1.1.13.01 (if you go back from the beta phase to the alpha phase). Users will always think that the beta version is newer, while it is not true. Also, your software will always go to final at .4, which doesn't match the usual .0 being the final.
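The paradox is easy to reproduce. In the sketch below the status is replaced by a number as described, and the versions are sorted the way a typical numeric comparison would sort them. The list and the function name are only for illustration.

```python
# Versions in the order they were actually released
# (status as a number: 1 = alpha, 2 = beta).
releases = ["1.2.11.05", "1.2.12.08", "1.1.13.01"]


def numeric_key(version):
    """Compare versions field by field as plain integers."""
    return tuple(int(part) for part in version.split("."))


sorted_releases = sorted(releases, key=numeric_key)
print(sorted_releases)
# ['1.1.13.01', '1.2.11.05', '1.2.12.08']
```

The most recent release, 1.1.13.01, ends up sorted as the oldest one, which is exactly the confusion the status word avoids.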
Now you might wonder, what if you are using a system that only accepts integers as version numbers? What to do then? Well, just skip the status. You will lose that information, but it won't disrupt the readability of the version: 1.12.08 is less than 1.13.01, even if you can't see which one is a beta and which one is an alpha. Unfortunately, that means that the stability of the release is not reflected in the version in this case.
Another problem could be with ordering, as some auto-ordering scripts might guess that, given the alphabetic order, rc > prealpha > final > beta > alpha, which is completely untrue. In that case, you might alias prealpha = aleph, alpha = alpha, beta = beta, rc = gamma and final = omega (as delta is used for delta patches; aleph is the Phoenician first letter, from which alpha was derived, and it happens that e < p in the Latin alphabet).
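A quick check of both orderings, as a sketch (the dictionary name is my own):

```python
# Status words in their real order within a development cycle.
statuses = ["prealpha", "alpha", "beta", "rc", "final"]

# The aliases suggested above, chosen so that alphabetical order is correct.
ALIASES = {
    "prealpha": "aleph",
    "alpha": "alpha",
    "beta": "beta",
    "rc": "gamma",
    "final": "omega",
}

# Plain alphabetical sorting scrambles the real words...
print(sorted(statuses))
# ['alpha', 'beta', 'final', 'prealpha', 'rc']

# ...but keeps the aliases in development order.
aliased = [ALIASES[status] for status in statuses]
print(sorted(aliased) == aliased)  # True
```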
So, that's my idea so far. I have already tested this a bit on Arcomage, and so far I haven't had a lot of problems. If you have any suggestions, feel free to post a comment!