SW Project Development

More: Interview & Tips from Neil Jones, Software Engineer, YouTube, Google

"Interview" with Neil 2018 as part of Google Faculty in Residence Program

DESIGN

1) In your varied work experience, can you share how the design process is performed?

-- What information should be in a design document?

When you write a piece of software of more than about 100 lines, you have to make choices. A design document is simply a record of what choices you make, and why. In particular, any time you come across multiple ways of doing something where there isn’t an “obvious only real choice”, you should jot it down. Write down the alternatives. Write down the pros and cons of each alternative. Make a decision and write down why you picked it. Especially highlight open questions, or issues where you’re not sure what the best way of doing something is. The key insight here is that you are designing a piece of software right now, but you -- or someone like you -- will be maintaining it forever. The design document helps future you remember what current you was thinking (you will forget) and it also provides a way for other people to comment on your approach. Maybe you missed something; you’re not perfect.

For instance, here is the sort of thing you might expect in a design document for, say, a web server that keeps track of a shopping list.

We need to have the ability to share a list between more than one person. How are we going to grant access to another account?

Option 1: On the shopping list, we can hold a list of user ids that represent all the users that can edit it.

Pros

Makes it easy to write an admin page for a shopping list

Cons

Makes it hard to find all the shopping lists a particular user can edit, eg to show in their profile

Option 2: On a user, we can hold a list of “shopping list ids” that represent all the shopping lists that a user can edit.

Pros

Makes it easy to write an admin page for a user

Cons

Makes it hard to find all the users that can edit a particular list

Option 3: We can have another table in the database that holds access permissions on a per-user/per-list basis.

Pros

Solves both the per-user and per-shopping list queries

Cons

Takes longer to implement

Requires a new data storage table and access class

We have chosen the third option because it will give us more flexibility in the future at an acceptable cost in time. (or: We have chosen the first option since that represents 90% of our expected usage and we are under a very strict deadline.)

Open question: how do we deal with caching of shopping list state? To make the Shopping List page render quickly, we store the results of the last database query in memory. Do we also need to keep the permissions object from option 3 stored in memory too, as well as in the database?
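To make the third option concrete, here is a minimal sketch of a separate permissions table using Python's built-in sqlite3 module. The table names, column names, and helper functions are all made up for illustration; a real design document would only need the idea, not the code.

```python
import sqlite3


def create_schema(conn):
    """Create users, shopping lists, and the separate permissions table."""
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE shopping_lists (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        -- One row per (user, list) pair that has edit access.
        CREATE TABLE list_permissions (
            user_id INTEGER NOT NULL REFERENCES users(id),
            list_id INTEGER NOT NULL REFERENCES shopping_lists(id),
            PRIMARY KEY (user_id, list_id)
        );
        """
    )


def grant_edit(conn, user_id, list_id):
    """Give a user edit access to a list; granting twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO list_permissions (user_id, list_id) VALUES (?, ?)",
        (user_id, list_id),
    )


def lists_for_user(conn, user_id):
    """The per-user query: every list this user can edit."""
    rows = conn.execute(
        "SELECT list_id FROM list_permissions WHERE user_id = ? ORDER BY list_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def users_for_list(conn, list_id):
    """The per-list query: every user who can edit this list."""
    rows = conn.execute(
        "SELECT user_id FROM list_permissions WHERE list_id = ? ORDER BY user_id",
        (list_id,),
    )
    return [row[0] for row in rows]


# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", [(1, "ana"), (2, "ben")])
conn.executemany(
    "INSERT INTO shopping_lists (id, title) VALUES (?, ?)",
    [(10, "groceries"), (11, "hardware")],
)
grant_edit(conn, 1, 10)
grant_edit(conn, 1, 11)
grant_edit(conn, 2, 10)
print(lists_for_user(conn, 1))   # [10, 11]
print(users_for_list(conn, 10))  # [1, 2]
```

Both queries are a single lookup on the same table, which is the "pro" listed for this option; the extra table and the helper functions around it are the "con".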

The general problem of “how to design software” is of course not something I can answer in finite time. But I would outline it roughly this way:

1. Understand what problem you are trying to solve, and why

2. Break it down into smaller pieces

3. Keep breaking it down into smaller pieces until you feel comfortable that you know how it all fits together and how the smaller pieces work

4. Document this process

5. Talk it over with someone else

6. Think it through again

7. Start coding

The problem with an outline like this is that it might give you the impression that design is a linear process. It isn’t. Instead, you’ll definitely start and do 1 and 2, but you might want to do a quick prototype or just doodle around a bit in code, so you’ll jump to 7, at which point you’ll learn something new about the problem and then you go back to 3 and break the pieces down further or rearrange them. All of this is normal, and not wrong! However, you should understand that conceptually, documentation and discussion of the problem domain comes before the code is finalized, and that breaking a big problem you can’t solve down into small problems that you can comes before all that.

TOOLS

2) What tools or kinds of tools are critical for SW Developers to have had practice with? (IDEs, Version Control, Issue Tracking, Project Boarding-Milestone lists, etc)

I would broadly classify tools for building code into two categories: things that make your job possible and things that make your job easier. A third category, tools that you’ll probably have to use but that don’t directly affect your work, is important too, but those tools will be obvious to you when you come across them.

Things you must master in order to be any good as a software engineer

A simple text editor.

Code is just text files. You do not need a large IDE in order to write code, and there will be some scenarios where you cannot run an IDE at all (eg, changing a configuration file on a remote server). There are other scenarios, like changing your terminal’s prompt, for which the IDE will probably cause problems, since IDEs tend to only work in some notion of a “project” that doesn’t apply to isolated files. So you have to have an editor that you are good at in order to do those tasks, but it turns out that it’s not unreasonable to write even large products entirely in a text editor with no IDE-like features. For instance, I wrote the first version of the YouTube Kids Android app using vi, a text editor that nobody would ever accuse of having...coding features...not out of some masochistic urge but because I frequently commuted by bus at the time and vi was just plain easier to deal with on a flaky network connection.

In other words, you can do your job without an IDE. You cannot do it without a basic text editor, even if you have an IDE.

Some basic editors worth learning (pick only one and get really good at it):

Vim

Emacs

Atom

Nano

Pico

Notepad or Notepad++

Depending on your work environment, Visual SlickEdit, BBEdit, or TextMate may be options too. For instance, if you work at a place that only uses Microsoft Windows, BBEdit isn’t going to help you since it’s a Mac-only editor. On the other hand, if you are working mostly in a cloud or Linux environment, Notepad isn’t going to help much. You’ll know you’re competent with an editor when you can take a list of 1000 mailing addresses in comma-separated-value format, switch the order of the first and last names and make it tab-delimited, without going through the file line by line.
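In an editor, that last exercise is usually a single regular-expression search and replace. As a rough sketch of the same transformation in Python, assuming the first two columns are first name and last name and that no field contains an embedded comma or quoting (both are assumptions, and the function name is made up for illustration):

```python
import re


def swap_names_to_tsv(text):
    """Swap the first two comma-separated fields on every line, then use tabs."""
    # ^ anchors at the start of each line because of re.MULTILINE;
    # the two groups capture the first and second fields.
    swapped = re.sub(r"^([^,\n]*),([^,\n]*)", r"\2,\1", text, flags=re.MULTILINE)
    # With the names swapped, turn every remaining delimiter into a tab.
    return swapped.replace(",", "\t")


sample = "Ada,Lovelace,12 St James Sq\nAlan,Turing,Hut 8\n"
print(swap_names_to_tsv(sample))
```

Most editors accept the same pattern and replacement in their own find-and-replace syntax, which is the skill being described: one command for the whole file instead of a thousand manual edits.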

The terminal, the filesystem, and ssh.

No matter what role you have as a software engineer, you will eventually need to issue commands to a computer that does not have a traditional UI. To do this, you will need to be able to use a command line, even if your primary platform is Microsoft Windows or ChromeOS. Becoming familiar with the terminal and what a “shell variable” is, knowing how to navigate the filesystem from the terminal, and knowing how to use ssh to connect to a remote computer and copy files back and forth is paramount. Combined with an editor, ssh and access to a terminal should -- technically -- be all you need in order to start working on something. It won’t necessarily be pleasant, but it is possible.

Version control.

You work hard on that code, so why not save it? There is absolutely no reason to not use version control. Pick one system (git is popular) and learn to use it well. Then, if you have to use a different version control system for some other project, just map it back to git. For instance, maybe you get a job at a company that uses Mercurial instead of git. Ok, not a problem, just learn how Mercurial handles branching, and relate that back to your understanding of git. It’s like going to Texas after living in Oregon your whole life: mostly the same but a few things are different. You’re adaptable.

Like dealing with text files, this is bread and butter use-it-all-day-every-day kind of stuff.

A build system.

You need people to be able to reliably compile or run your program, even if it’s in a scripting language. Moreover, you need everybody else to be able to compile and run your program the same way you are. This is where the build system comes in. All build systems are different, and learning one won’t really help you too much with the others -- but what you need to be in the habit of doing is using a build system to compile and test your code, not your IDE. In fact, most IDEs now integrate directly with many different build systems to figure out how to set up a project, so it’ll make your life easier anyway. The build system will help you manage the dependencies within your code and on other external modules. Good examples of popular build systems these days include Gradle, Maven, and Bazel. There are many others, but woe betide the person who uses GNU make so try to avoid that one unless you hate yourself. Lives have been squandered trying to figure out Makefiles.

At least one computer language, in detail.

Obviously, to write computer code, you need to know a language. But you shouldn’t just know how to write a basic for loop or how methods are expressed in Java -- you should know one language deeply. Let’s suppose you choose Java. How do closures relate to Runnables? What is the new streams API all about? What is “final”, when it is on a method versus when it is on a member variable? What’s the difference between those two things, anyway, when you use “synchronized” instead of “final”? And what happens with final parameters in functions? What is an inner class, and why do people keep using “static” on them?

Like a versioning system, knowing one “base tool” will help you understand all others of the same kind. In the case of a computer language, knowing one language deeply will help you express your logical ideas clearly and efficiently in that language. But it’s also true that computer languages are not written to express logic to computers, but to people (assembly or bytecode is used to express logic to computers). Understanding the nitty-gritty of that will set you up to understand other software engineers’ work.

Things you want to get good at in order to go home early

An IDE.

You might get the impression from the above that I am against IDEs, but I am not. You should use an IDE to do things that you would be able to do yourself in a text editor, but would be labor-intensive. For instance, many IDEs have the ability to rename a method in a Java class. That’s great! You right click, scroll to “refactor”, click “rename method” then fill in a little dialog box and then boom, it’s renamed. If you did it in a text editor, you would need to change the name of the method, and then you’d need to hunt down every instance of that class and change the methods called on that instance too. And didja remember the unit tests? Nobody ever remembers the unit tests until they fail. It’s really great for that to happen automatically and saves you a possibly large amount of work, but you don’t want to get to the point where you don’t understand that this is what the IDE is doing -- because then you don’t actually understand your own code as well as you think you do, and you become afraid of your code, and then it’s just game over at that point.

A lot depends on what language you intend to use for any particular job, but generally speaking JetBrains IDEs are excellent. Eclipse is also somewhat popular. Xcode is mostly only usable for macOS and iOS projects in Swift or Objective-C, but some people have success with it on Java or C++.

A debugger.

You can get surprisingly far -- like I’m talking about “having an entire career and then retiring” far -- by sprinkling print statements throughout your code when you are debugging. That, of course, is the first thing you should master when you’re debugging, because it will always work and it’s always right and there’s no configuration. However, it is very slow and painful, like a root canal. A symbolic debugger is truly amazing and will make debugging problems in your own code substantially easier and can cut down on the number of iterations you have to do when writing a unit test (see below, about unit tests) by about a factor of 10.

The drawback of a debugger is the configuration you need to go through in order to get it to work. And this is why you have to have that plan B in your back pocket, of just printing “Hey! Got here! i=%d” in the middle of a for loop.
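As a small illustration of that plan B, here is a sketch in Python; the function and its data are hypothetical, and the only point is the print statement inside the loop.

```python
import sys


def running_total(values):
    """Add up values, reporting progress on every pass through the loop."""
    total = 0
    for i, value in enumerate(values):
        # Plan B debugging: say where we are and what we can see.
        # Writing to stderr keeps the noise out of the program's real output.
        print("Hey! Got here! i=%d value=%r total=%r" % (i, value, total), file=sys.stderr)
        total += value
    return total


print(running_total([3, 4, 5]))  # 12
```

The debugger counterpart in Python is to replace the print with the built-in breakpoint() call, which pauses the program at that line and drops into pdb so you can inspect the same variables interactively.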

Sticky notes.

It’s amazing how useful these things can be in order to keep track of tasks, manage a project, even doodle out some simple class designs or an entire database. Plus you can leave them on your laptop screen when you shut the lid to go home.

A pair of noise cancelling headphones.

This is perhaps a bit tongue-in-cheek, but the fact of the industry now is that most companies have switched to an “open office plan”. Or maybe you’re a consultant that works out of coffee shops and on airplanes. Either way, coding will require a lot of concentration, and that can be hard with audible noise and conversation happening around you. Rather than drown out the noise with an even louder one, getting noise cancelling headphones can be the difference between thinking clearly and quickly and rat-holing into a pointless diversion.

Other tools you’ll have to use but you can pick up as you go

For these, you shouldn’t bother learning them in depth, or even at all, yet. Instead, just understand why they are important. For the most part, they will help other people understand your work, so you should just bite the bullet and agree to use them when you are asked to. When I was starting out as an engineer, I’d sometimes push back, thinking that these things are “red tape,” but I’ve slowly come to understand that I’m not the only person in the world and that my job isn’t the only one that needs to be done. Don’t make things hard for other people -- it isn’t nice, and it can get you fired.

A bug tracker

This is a tool that can keep track of work you need to do, and things you didn’t do right the first, second, or third time. It remembers all those bugs for you so that you can focus on only one at a time and not worry about forgetting something.

Calendaring and project management tools

If you are getting paid to build something, then the person who is paying you sees value in what you do. It’s not unreasonable for that person to ask you when something will be done, and for you to be able to give an educated answer. These tools help you do that. Somewhat surprisingly, getting an accurate answer for “when will this be done” is virtually impossible so there are a lot of different approaches people have for project estimation. Just understand and accept that you may have to provide answers to “how hard is it to…” questions that you’re not comfortable with.

CODE STYLE & VERSION CONTROL & CODE REVIEW

3) Is Code Styling (commenting, naming conventions, modularization) important? Is it important for students to learn about Software Version Control? What about Code Review?

Yes, this is all important.

Code style is important so that you can pick up any piece of code in your organization and read it without cringing. The thing is, code style is entirely arbitrary. Look, I don’t feel strongly about tabs or spaces, or whether your curly braces are on the same line as the “if” or on the next line, or if you use Hungarian notation for your variables, or use an underscore (_) before or after your member variable names in C++. What I do care about is that it is consistent. Someone, somewhere, needs to just state for the record where those dumb curly braces go, when it’s ok to leave them out, whether or not lpszWinName is an ok variable or if it’s _lpszWinName or win_name_ or winName or WinName or _WindowName or what. Because I don’t care! I really don’t, and forcing me to make those decisions all day long instead of letting me focus on the decisions that matter -- like, how are we going to clear the cookie cache at the same time on two web servers that are on different sides of the continent? -- is a total waste of my time and your money. So. Code style matters, not because there’s a religiously right/wrong answer, but because if you can just agree to agree on the style guide then you can set up your editor to enforce it automatically and you can never have to think of it again because you have better things to do. Even taking a nap has more value than dealing with code style issues.

Version control I discussed earlier. Not using version control is like driving without brakes. Technically it’s possible, but you’re a fool to do it.

Code review is a touchy topic. I think code review is important in most software engineering companies, but it really depends outside of that. At Google it’s almost reached the status of Religious Virtue, which I think can be taken too far if applied outside of a large software firm. The value of a code review is that you get a second (or third) set of eyes on your work, and peer reviewed work is almost always better. It distributes knowledge, makes everybody feel ownership of the code, and usually leads to higher quality code with fewer bugs: you are more likely to write thorough unit tests when you know someone is watching. However, it only works if you have enough people that are dedicated and reliable. For instance, in an open source project with no reliable volunteers, you can’t just not ever submit any code because nobody reviewed it. Or if you are in a company that is not a software engineering company (eg, a pharmaceutical company) then you may not actually have someone who can review your code in a timely way. So I’d put this in the “if you can, you should” bucket.

It also puts everybody’s vulnerabilities on display, so that you’ll feel less like an imposter when you see dumb mistakes that you didn’t make; you are absolved for not being perfect, because everybody makes those same dumb mistakes. The code review is more about how open you are to correcting them than it is about how you avoid ever making them.

TEAM COLLABORATION

4) What tools are useful for communications between software developers on projects? (email, instant messaging such as Slack, face-to-face meetings, issue tracking, etc?)

The good news is that there’s nothing about software engineering that requires different communication tools than any other kind of work. You need to be able to talk to people, keep track of decisions, etc. It all depends on the project or company in which you are working, but overall the thing to remember is that building software involves an awful lot of working with people, probably more than you’d expect. You’ll need something to handle docs, calendars, spreadsheets, instant messaging, email, group emails, possibly phone calls, video conferencing.

TESTING

5) Is testing important and are there any special tools students should know about?

Testing is critical, and tools depend on the language. It comes in different flavors: unit tests, integration tests, etc. but the one to focus on now is unit tests. Just about all code you write should ideally have unit tests. You use these to ensure that the code you wrote compiles cleanly and does roughly what you said it would, and to protect yourself against future code changes. They will also protect you against unplanned changes to external modules, like if you upgrade apache-commons to version 2.37 from 2.35 and it breaks your FooWidget, then running your tests would show that. And with a build system, running tests is easy. These should run and pass before you commit code into your version control system.
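For a sense of scale, here is a minimal sketch of a unit test using Python's standard unittest module. The function under test, list_total, is a made-up example; the shape of the test class is what matters.

```python
import unittest


def list_total(prices):
    """Sum item prices given in cents; negative prices are rejected."""
    if any(price < 0 for price in prices):
        raise ValueError("prices must be non-negative")
    return sum(prices)


class ListTotalTest(unittest.TestCase):
    def test_empty_list_totals_zero(self):
        self.assertEqual(list_total([]), 0)

    def test_sums_all_prices(self):
        self.assertEqual(list_total([199, 250, 1]), 450)

    def test_negative_price_is_rejected(self):
        with self.assertRaises(ValueError):
            list_total([100, -5])
```

Saved in a file, these run with "python -m unittest" from the same directory. If a later change to list_total, or to something it depends on, breaks one of these expectations, the failing test tells you before the commit does.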

I say “should ideally have” in the above because sometimes it’s not practical due to time, budget, or complexity. However, that needs to be the rare case, and one of the hardest parts of software engineering is swallowing that bitter pill of writing unit tests. They are ugly. They are not fun to write for most people, and nobody ever gets a promotion just for writing comprehensive tests. The thing that will help get you through those long wintery nights of dealing with mocks is that you can get a promotion for writing high quality code that doesn’t break often, and that’s nearly impossible to do unless you write thorough unit tests.

Interestingly, ensuring clear unit test coverage is a great way to improve the design and interface of your code. How exactly do you get dependencies into your class, for instance? Requiring unit tests forces you to rely on “weak coupling” instead of “strong coupling”, which ends up making your software design cleaner. It was never intended to work this way, but it’s a happy side-effect. As an example, let’s say you have a class “FooWidget” that needs another class “BarManager” in order to work. FooWidget could instantiate its own BarManager. That will work, but now it’s hard for you to test the interaction of just FooWidget in a unit test. A better approach is to take a BarManager as an input for the construction of a FooWidget. If we could do that, then it would be easier to test it. But it turns out that it then makes it possible to share the same BarManager in all your FooWidgets, which can make your code very fast and memory efficient. This is sometimes called “refactor for test”, which is to take your initial code and change it to be easier to test.
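The FooWidget and BarManager example can be sketched like this in Python. The class names come from the text above; the methods (lookup, label) and the fake are invented for illustration.

```python
class BarManager:
    """Hypothetical collaborator; imagine it is expensive to create."""

    def lookup(self, key):
        return "real:" + key


class FooWidget:
    def __init__(self, bar_manager):
        # The dependency is passed in rather than constructed here,
        # so a test can hand over a stand-in.
        self._bar_manager = bar_manager

    def label(self, key):
        return self._bar_manager.lookup(key).upper()


class FakeBarManager:
    """Test stand-in that records how FooWidget used it."""

    def __init__(self):
        self.keys_seen = []

    def lookup(self, key):
        self.keys_seen.append(key)
        return "fake:" + key


# In a test: exercise FooWidget alone, with no real BarManager involved.
fake = FakeBarManager()
widget = FooWidget(fake)
assert widget.label("a") == "FAKE:A"
assert fake.keys_seen == ["a"]

# In production: one BarManager can be shared by many FooWidgets.
shared = BarManager()
widgets = [FooWidget(shared) for _ in range(3)]
assert all(w.label("x") == "REAL:X" for w in widgets)
```

Had FooWidget created its own BarManager inside the constructor, neither the fake nor the sharing would have been possible without changing FooWidget itself.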

For a Java developer, it is critical to know and have experience with JUnit and Mockito. For C++ it’s a bit trickier, but gmock can fill that void. Python has PyUnit, which ships in the standard library as the unittest module. The reason it is good to understand at least one language’s unit testing facilities is so that you can pick up others quickly and see the role that each part plays. You’ll understand this when you see where and how to use a mock object.
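As one illustration of where a mock object fits, here is a sketch using unittest.mock from the Python standard library, which plays roughly the role that Mockito plays for Java. The ListSharer class and its notifier are hypothetical names, not part of any real library.

```python
import unittest
from unittest import mock


class ListSharer:
    """Hypothetical class that shares a shopping list and notifies the invitee."""

    def __init__(self, notifier):
        self._notifier = notifier

    def share(self, list_title, email):
        if "@" not in email:
            return False
        self._notifier.send(email, "You can now edit: " + list_title)
        return True


class ListSharerTest(unittest.TestCase):
    def test_share_sends_one_notification(self):
        notifier = mock.Mock()  # stands in for a real mail or messaging client
        sharer = ListSharer(notifier)
        self.assertTrue(sharer.share("groceries", "ana@example.com"))
        # The mock recorded the call, so the test can check the interaction.
        notifier.send.assert_called_once_with(
            "ana@example.com", "You can now edit: groceries"
        )

    def test_invalid_email_sends_nothing(self):
        notifier = mock.Mock()
        self.assertFalse(ListSharer(notifier).share("groceries", "not-an-email"))
        notifier.send.assert_not_called()
```

The mock goes where the real dependency would have gone, and the test asserts on what the code under test asked it to do. No message is ever sent, and the test does not depend on a network or a mail server.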

GENERAL

6) Is there any other advice for students that you would like to add regarding Best Practices for Project Development?

Understand that best practices were born out of trials and tribulations of engineers who have suffered from bad choices more than you need to. Be flexible, relentless, and inquisitive. Know that in software engineering, more problems surrender to persistence than are conquered by brilliance...you wear them down, rather than solve them in a flash of insight.

And if you are going to write Java code, read “Effective Java” by Josh Bloch.

And if you are going to write code professionally, read “Cracking the Coding Interview” to put you on the same footing as all the other people trying to write code professionally.