Let’s talk about some good habits to form when organizing the code of your personal projects. This post is aimed at code newbies who have never worked in a team environment before. I want to share some of the source control strategies you might learn in the future, when you work on a big project with a well-functioning team.
Let’s say you have some website you are making. Right now I’m working on a project I literally call EveryDamnWebsite because it’s an example of all the stuff I hate about most modern websites. I maintain the code in a folder on my local computer and also on GitHub, here: https://github.com/xerocross/every-damn-website-2019. GitHub is not the only option, but it’s what I use. Also, I am currently hosting this site on Heroku at everydamnwebsite.herokuapp.com. I will use this project as an example.
When people start using Git, I’ll bet $5 they tend to do everything on the master branch and never even use any other branches. The first thing I want to suggest to you is that you never (or almost never) make changes directly to the “master” branch. We will introduce branches into our workflow. For now I assume you have a “master” branch. Whatever the main body of your code is at this moment, that’s your master.
As a preliminary step, from master create two new branches, one called “stage” and one called “production”. Remember that when you create a new branch with git checkout -b stage, it duplicates whatever branch you are on at the time. So for both of these, make sure you are on the master branch before you create the new one.
We will come back to stage and production in a moment.
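The preliminary step above might look like the following. This is only a sketch: the throwaway repository, the file index.html, and the name and email are assumptions made so the commands can run anywhere. In your own project you would skip the setup and run just the checkout commands.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch can run anywhere (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > index.html
git add index.html
git commit -q -m "Initial commit"

# Create stage from master, then return to master before creating production,
# so that both new branches start out as copies of master.
git checkout -q master
git checkout -q -b stage
git checkout -q master
git checkout -q -b production
git checkout -q master

# List the branches: master, production, and stage.
git branch --list
```

Going back to master between the two checkout -b commands is the important part. Otherwise production would be created as a copy of stage, which happens to be identical here but will not be later.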
Instead of directly changing master, from now on any time you want to change something, create a new git branch, starting from master. So, first you make sure you are on master. Then you run git checkout -b newBranch. You have to be on master when you make the new one because newBranch starts out as a copy of master.
Make your changes on the new branch. Ideally, the changes you make in a given branch should collectively represent one cohesive unit of change. It should not be a grab-bag of various unrelated changes you want to make. (I admit, though, that sometimes I do include unrelated updates and changes in a branch.) There is nothing stopping you from having more than one branch in progress at a time. Just make sure you start each one from master when you create it. Be warned that merging the changes later can be a nightmare, but that’s a topic covered well enough in various tutorials.
Commit changes on your new branch early and often, and push your branch and its commits to the remote repository.
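Here is a rough sketch of that branch, commit, and push cycle. The bare repository standing in for the remote, the file contents, and the commit messages are all made up for illustration. With a real project, origin would already point at your GitHub repository and you would only need the last few commands.

```shell
#!/bin/sh
set -e

# Throwaway repository plus a bare repository standing in for the remote
# (both hypothetical; in real life "origin" would be your GitHub repository).
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"
echo "hello" > index.html
git add index.html
git commit -q -m "Initial commit"
git push -q -u origin master

# Start the new branch from master, so it begins as a copy of master.
git checkout -q master
git checkout -q -b newBranch

# Commit early and often: small commits that belong to the same unit of change.
echo "<button>Close</button>" >> index.html
git commit -q -am "Add close button markup"
echo "<!-- close button wired up -->" >> index.html
git commit -q -am "Wire up close button"

# Push the branch and its commits; -u records origin/newBranch as the upstream.
git push -q -u origin newBranch
```

After the first push with -u, a plain git push from newBranch is enough to send later commits to the remote.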
When you are satisfied that the code on the new branch is finished, push the branch and make sure your local version and the remote version are in sync. Now we go to the repository website. In my example, I go to the GitHub website and view my repository. There, I will do what we call “raising a PR”, or creating a PR. “PR” stands for pull request. We will create a PR to pull the new branch into the stage branch. For specifics on exactly how to do that, it’s best if you Google that or search for help within your code repository website.
Creating a PR for this is not strictly necessary, but let me explain why I think it’s a good idea. Typically a PR is for when you don’t have write privileges on the branch you want your code merged into. Say I don’t have the authority to merge code into stage myself. I need my team lead’s approval for that. In that case, when my code is ready I raise a PR and send it for his attention. My team lead does have write privileges for the stage branch. Code repository websites like GitHub also offer a nice interface for viewing the actual changes made by my code. He will be able to see the differences, side-by-side, between the existing code and the new code I’m submitting. If he approves the PR, then the code is merged into the stage branch. All he has to do is click a button and this closes the PR and merges the code. The PR remains as a record, but it’s not “open” anymore.
The PR is a useful tool for elevating code to the attention of someone who actually has the proper privilege to write to a repository or branch. But aside from that, creating a PR has other uses. A PR is a record. You can keep the PR as a piece of historical data indefinitely. It’s a record of what the code was before, what was merged in, and exactly how it was changed, who submitted it, and who approved it.
Now, if this entire project is your own baby all alone, then you are never required to use PRs at all. You can just merge one branch into another using the git merge command. And I’m not saying that’s bad! It’s certainly quick and simple, and those are virtues. What I’m saying is that a paper trail can be useful, and the paperwork involved in a PR helps you to be more careful about reviewing and approving code, especially when it is your own code because you are biased (either for it or against).
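For comparison, a PR-free merge into stage could look roughly like this. The repository, file, and commit messages are invented for the sketch. The --no-ff flag is my own choice here, not something the workflow requires: it forces a merge commit, which leaves at least a small trace in the history of when the branch went in.

```shell
#!/bin/sh
set -e

# Throwaway repository so the sketch can run anywhere (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > index.html
git add index.html
git commit -q -m "Initial commit"
git checkout -q -b stage
git checkout -q master

# A working branch with one change on it.
git checkout -q -b newBranch
echo "a change" >> index.html
git commit -q -am "Make a change on newBranch"

# Merge without a PR: switch to the receiving branch, then merge the working branch.
git checkout -q stage
git merge -q --no-ff -m "Merge newBranch into stage" newBranch

# Show the resulting history of stage, including the merge commit.
git log --oneline --graph stage
```

Note that you check out the branch that receives the code (stage) and name the branch that supplies it (newBranch), which is the reverse of how people often picture it at first.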
Now let me explain why you merged into stage.
Stage is a testing environment. It is intended to be as much like the production code environment as possible without actually being production. You need to actually host the stage branch. It cannot be allowed to alter production databases or trigger external hooks like credit card transactions that might happen in production. But as much as you can possibly make it, stage looks like production and it is a real, hosted website—not just something you serve and look at on localhost. You should host it in the same way you host prod. So in my example I have stage-everydamnwebsite.herokuapp.com. It’s an actual, live website. Users are never intended to view my stage environment. It’s for internal use only. But I don’t take any special precautions against them seeing it. If you think it’s necessary to put your stage environment behind a password, then by all means do so.
Stage isn’t always the same as prod because now stage contains the new code we wrote from our new branch. We build it and update the host. We run our tests, do whatever QA we deem appropriate.
A very common scenario is that things in stage don’t work out quite as we wanted. So we might have to go back to our working branch and make some changes. Either we find that problem ourselves, or possibly some very helpful QA person finds it for us. (I’m being very sincere there. QA is an important job, and I’ve had the privilege of working with some very excellent QA people.)
In my own example right now I’m working on a branch called enable-close-embedded-vid. I thought I was done. I raised a PR to stage and approved that PR myself. Then I let stage build so I could view it live, and I discovered a problem I had not anticipated. It was an easy fix. I made that change to the same branch as before, enable-close-embedded-vid, and then pushed the new change to the remote.
I already merged that previous PR, so that one is done. I can’t edit that one to include this new small change. That previous PR is now a historical record. You cannot piggyback new changes onto something that has already been approved by the boss and merged. I have to raise another PR. So, yes, in the real world this would mean going back to your boss or team lead and telling him/her you need to make another small change. This is very, very common.
Thus, I create the new second PR and approve it myself, so it gets merged into stage.
Now I update the host again. In my case, I have my deployment system set up so that the stage host automatically updates and builds and deploys whenever this stage branch is changed. It’s connected to GitHub so that it all just happens automatically, and it only takes like a minute or three.
When stage is done re-building, I do the QA again and in particular I make sure I fixed the error I noticed before. For a big important project like a website with millions of visitors per month, you probably have a whole QA team that will run a complete battery of tests on the stage environment to make sure nothing is broken.
Also, on a real-world team your changes from your branch are being combined with other people’s changes from other branches—code written by teammates with some different but adjacent goal at the moment.
At this point in my own EveryDamnWebsite example I can verify that stage looks good. That problem I noticed before is fixed, and the goal of the original work is done. That is: remember I had some reason for creating the new branch in the first place. Here I verify that the new code is functioning as expected. Obviously I did check that it worked on my own local machine before raising a PR for stage, but it’s important to also verify that it works live on stage.
I am satisfied now that the code is ready to be merged into master. I go to the GitHub website again now to raise a PR. I look at the enable-close-embedded-vid branch and this time I raise a PR to merge it into master. People don’t usually care so much about PRs into stage, but PRs for master are the ones people hold onto forever as a record of the event. This kind of paper trail can be essential for things like financial websites that have to be audited. For that reason, you typically have to follow some specific naming scheme for the PR. In my case, I’m going to name the PR the same as the branch: enable-close-embedded-vid. In a work environment, both the branch and the PR would probably be named after some numeric identifier for the story the code was meant to resolve.
The person you raise the new PR to may or may not be the same person as before. Raising a PR to master might require both your boss and his boss to approve. Let’s assume that this time you get the approvals, and somebody with write privileges merges your code into master. In my case in this example, that person is me. I have write privileges, so I approve the PR.
Now pay attention because this next thing is important. The code actually deployed and hosted live—the code my users see in action—that is not master. That is the production branch. The process of updating the production branch from master is what we call “deployment”. How often new code is deployed is a matter of policy that changes from one place to another. Weekly deployment is a typical situation.
Let me try to clarify this. Now that my code is in master, the production branch is outdated. Exactly how you go about switching over to new code in a serious-business production environment is the realm of what we call “devops”, a field of engineering I don’t know much about. It needs to be seamless, and devops engineers have ways of accomplishing that. It’s a topic for someone else to teach you because I am not a devops engineer. From my point of view, all that needs to happen is we delete the existing production branch and create a new one from master, then we let the changes propagate through to the hosted production environment.
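To accomplish this, I’m going to execute git branch -D production, followed by git checkout master, and then git checkout -b production. (Git refuses to delete the branch you currently have checked out, so if you are sitting on production, run git checkout master first.) That updates production on my local machine. To update it on the remote, I will do a forced push, because I want to overwrite the remote: git push -f does a forced update. Since the recreated local branch has no upstream configured yet, you will likely need to name the remote and branch explicitly, for example git push -f origin production if your remote is called origin. (By the way, on this topic in particular other devs and dev teams might have very different procedures. But the end result is the same. The production branch has to be updated to match master, and there shouldn’t be anything in production that is not in master. They should be identical.)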
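Put together, the delete-and-recreate step might look like the sketch below. Everything in the setup portion (the temporary directories, the bare repository playing the role of the remote, the file contents) is an assumption for illustration, and the remote name origin is just the usual default.

```shell
#!/bin/sh
set -e

# Throwaway repository plus a bare repository standing in for the remote
# (hypothetical setup; "origin" would normally be your GitHub repository).
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"
echo "v1" > index.html
git add index.html
git commit -q -m "Initial commit"
git checkout -q -b production
git push -q origin master production
git checkout -q master

# master moves ahead, so production is now outdated.
echo "v2" >> index.html
git commit -q -am "New feature merged into master"
git push -q origin master

# Recreate production from master. We are on master here, so deleting
# production is allowed; git will not delete the checked-out branch.
git branch -D production
git checkout -q master
git checkout -q -b production

# Overwrite the remote production branch with the new local one.
git push -q -f origin production

# Sanity check: prints nothing if production and master are identical.
git diff master production
```

The final git diff is a cheap way to confirm the "they should be identical" rule before you let the change propagate to the host.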
More likely, for a big website with millions of visitors, what you would do is force-push master to overwrite stage first. Then QA would do some kind of test to make sure everything still looks good. Then, we would force-push master to overwrite production. And that’s when the changes actually propagate through until the user sees the updated software. In my case, I can verify now that my updates are live on the production website, https://everydamnwebsite.herokuapp.com/.
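One way to express "force-push master to overwrite stage, then production" is with a source:destination refspec, which pushes your local master onto a differently named remote branch. This is a sketch of that idea, not necessarily how any particular team does it. The repositories and commits are again made up so the commands can run on their own.

```shell
#!/bin/sh
set -e

# Throwaway repository plus a bare repository standing in for the remote
# (hypothetical setup; "origin" would normally be your GitHub repository).
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"
echo "v1" > index.html
git add index.html
git commit -q -m "Initial commit"
git push -q origin master
git checkout -q -b production
git push -q origin production

# stage has picked up work that never made it into master.
git checkout -q master
git checkout -q -b stage
echo "experiment" >> index.html
git commit -q -am "Work merged into stage only"
git push -q origin stage

# Meanwhile master receives the approved change.
git checkout -q master
echo "v2" >> index.html
git commit -q -am "Approved change merged into master"
git push -q origin master

# Step 1: overwrite the remote stage branch with master.
git push -q -f origin master:stage
# (QA would check the stage site here before continuing.)

# Step 2: overwrite the remote production branch with master.
git push -q -f origin master:production

# Show what each remote branch points at now; all three should match.
git ls-remote --heads origin
```

Be aware that this only changes the branches on the remote. Your local stage and production branches, if you have them, still point at their old commits until you reset or recreate them.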
Deployment is a big deal. It’s not the kind of thing where your boss just looks over the code and approves it and pushes to production. In my experience working on a team, we decide as a team if and when we will deploy, and several key people need to be on hand and ready for emergencies when the code goes live. Senior developers, senior QA people, senior database people, etc., are on hand or on call during a deployment. If some key person is sick that day, the team probably just won’t deploy, or they’ll arrange to have someone of nigh-equivalent skill on hand.
After a deployment, in a typical setting, QA will then do a battery of tests on the live code in production. Just how thorough these tests are is a matter of company policy and how good their QA is. But usually you do some kind of QA on production deploys.
This is a lot of paperwork and verification to endure if you are one man working alone and nobody is forcing you to do this. It’s so much easier to just commit changes to master and deploy master. And, again, I’m not saying that’s wrong—for a small project where nobody is really watching. But I worked at LendingTree. If there was a bug in production or if, God forbid, the website went down in production even for a few minutes, the company lost money. A lot of money. So we put these checks and double-checks and triple-checks in place, and we left paper trails we could follow later to prove those checks were done.
Also, I want to be clear on this: this isn’t how every team does it. Most professional teams probably do something fairly similar to what I describe here, but there might be important differences. This is not a textbook guide on source control and code deployment practices. It’s a collection of one man’s observations. I hope it proves helpful.
In my own little one-man projects, I work through a procedure like this… sometimes. It’s not because I think it’s necessary. Certainly nothing I’ve done is that mission critical. But I like to keep this kind of procedure fresh in my mind. Good practices, you know? It’s important not to forget them. And if you are a newbie, self-taught, never worked on a big project with a team, then this is something you can practice on your own just to prepare so you are more familiar with real world workflows when the time comes. That said, when you do get your first job, remember to do things the way your team does things. Good luck!