CS 346 (W23)

Implementation

The implementation phase is focused on developing features and/or making changes to software. At this stage, you should have an understanding of what your user needs from this feature (requirements) and high-level technical decisions have been made (analysis & design).

This step consists of a set of related activities:

Objective: Implement your design
Activities: Document low-level/component design; Implement features; Produce unit tests
Outcome: Source code and unit tests

Implementation consists of all activities required to realize a feature. This includes any remaining design, coding, and writing appropriate unit tests. It may also include other supporting activities: writing documentation for other users, or scripts that may be needed to deploy it properly.

The Goals of Software Implementation

In A Philosophy of Software Design, John Ousterhout suggests that software design is all about:

  • Abstraction: picking the appropriate abstractions to represent the domain.
  • Problem decomposition: breaking a task into small, comprehensible problems.
  • Reducing complexity: simplifying the problem enough that you can build an appropriate mental model.

Good design focuses on reducing complexity in software; decomposing problems in a way that minimizes complexity. In other words, we’re fundamentally trying to break down a complex problem into smaller, manageable chunks. You need to be able to form a conceptual mental model of what you’re building (“holding the system in your head”).

Courtesy of Ousterhout [1], here are some suggestions for avoiding complexity in your designs:

1. Classes Should be Deep

This is another way of talking about information hiding [Parnas 1972]. We hide information not due to distrust or security concerns, but as a way of reducing complexity for the programmer. We want to package up functionality in a way that reduces the programmer’s need to understand how it works.

The interface of a class is the complexity cost of the implementation (again, complexity refers to the programmer’s mental model - what they have to know to use the class properly). We want an interface that hides a LOT of complexity to make the cost of using the class worth the benefit.

Shallow vs. Deep Classes [Ousterhout 2018]

Here’s an example of a shallow method that has a negative benefit: it takes more mental effort to work out what it does than to write the underlying code yourself (and writing it yourself would take fewer keystrokes!)

	private void addNullValueToAttribute(String attribute) {
		data.put(attribute, null);
	}

You cannot always eliminate shallow classes, but a shallow class doesn’t help you much in the fight against complexity.

“Classes and Methods Should Be Small” [Many CS textbooks].

I absolutely disagree. Classes and methods should be as large as they need to be to appropriately abstract their functionality.

Example: shallow functionality, complex interfaces

Look at the Java SDK for the consequences of class “explosion”, where each class adds an almost trivial amount of functionality. The class hierarchy is far too broad and deep, and requires a huge amount of effort to figure out.

For example, take Java's file I/O libraries. Opening a file with buffering and serialization requires multiple classes: the common case requires three (!). Compare Kotlin, which wraps all of this in a single class.
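As a rough illustration of the layering described above, the sketch below saves and reloads one serializable object using the three stream classes the common case needs. The class name LayeredFileIo and its helper methods are hypothetical; only the java.io and java.nio.file calls are real library APIs.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the name and helpers are not from the course notes.
class LayeredFileIo {

    // Writing one object takes three cooperating stream classes:
    // file access, buffering, and object serialization.
    static void save(Path path, Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(
                        new FileOutputStream(path.toFile())))) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reading it back needs the mirror-image trio.
    static Object load(Path path) {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(
                        new FileInputStream(path.toFile())))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Saves a value to a temporary file, loads it again, then cleans up.
    static Object roundTrip(Object value) {
        try {
            Path tmp = Files.createTempFile("layered-io", ".bin");
            try {
                save(tmp, value);
                return load(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Each layer adds a single capability, and the caller has to know about all three and assemble them in the right order.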

Example: deep functionality, simple interface

Unix File I/O handles the same problem much more simply and elegantly.

int open(const char* path, int flags, mode_t permissions);
int close(int fd);
ssize_t read(int fd, void* buffer, size_t count);
ssize_t write(int fd, const void* buffer, size_t count);
off_t lseek(int fd, off_t offset, int referencePosition);

This abstracts: on-disk representation, disk block allocation, directory management, permissions, caching, device independence.

2. Define Errors Out Of Existence

Exceptions are a huge source of complexity.

  • Common wisdom: detect and throw as many errors as possible.
  • Better approach: define semantics to eliminate exceptions.
  • Goal: minimize the number of places where exceptions must be handled.

Often we can redefine the semantics so that there is no error.

Examples of poor design choices

  • Tcl: calling unset on a variable that doesn’t exist throws an error.
  • Windows: you cannot delete a file while it is open.
  • Java: the String substring method throws an exception for out-of-range indices.

The common case should be simple and just work. Save exceptions for runtime behaviour that you cannot otherwise manage.
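One way to apply this to the substring case is to define the result as the overlap between the requested range and the string, so that no index is ever out of range. The helper below is a minimal sketch of that idea; LenientStrings is a hypothetical class, not part of the Java standard library.

```java
// Hypothetical helper; not part of the Java standard library.
class LenientStrings {

    // Returns the characters of s whose indices fall in [begin, end).
    // Both bounds are clamped to the valid range instead of throwing,
    // and a reversed range yields the empty string.
    static String substring(String s, int begin, int end) {
        int from = Math.max(0, Math.min(begin, s.length()));
        int to = Math.max(from, Math.min(end, s.length()));
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        // In range: behaves like String.substring.
        System.out.println(substring("hello", 1, 3));
        // Out of range on both sides: clamped to the whole string.
        System.out.println(substring("hello", -2, 100));
        // Reversed bounds: empty result rather than an exception.
        System.out.println(substring("hello", 4, 2).isEmpty());
    }
}
```

Callers no longer need a try/catch or a bounds check. An out-of-range request simply yields fewer characters, possibly none.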

3. Practice Strategic Programming

Most of us are driven to make tactical programming decisions:

  • Goal: get next feature working or bug fixed ASAP
  • A few shortcuts are taken, hacks are put in-place.
  • Result: bad design, high complexity.

Complexity is incremental. Kludges, hacks and mistakes accumulate over time, until our code ends up in a poor state: highly complex and poorly designed.

In the long-term, thinking tactically harms code quality. We need to be disciplined and think strategically.

Strategic programming

  • Goal: produce a great design (while solving today’s problem)
  • Simplify future development
  • Minimize complexity
  • “Sweat the small stuff”

Tactical vs. Strategic Programming [Ousterhout 2018]

We need an investment mindset: extra time taken now will pay off in the long-term with higher-quality code. Yes, it’s slower at first, but usually worth the investment [2].

Examples

Most startups are purely tactical.

  • Pressure to get products out quickly/first.
  • “We’ll clean this up later”.
  • Code quickly turns to spaghetti.
  • Extremely difficult and expensive to repair later.

Facebook

  • Culture of “Move fast and break things”.
  • They’ve since changed their motto to “Move fast with stable infrastructure”.

Google and Apple

  • Both companies have a strong design culture, which attracts the best engineers.
  • Their products tend to work very well.

Continuous Integration (CI)

Test-Driven Development addresses the issue of doing local, small-scope testing as part of implementation. However, it doesn’t address issues related to the system as a whole, or that might only occur when components are integrated.

Martin Fowler popularized the term continuous integration to describe a system where we also perform integration testing at least once per day.

The fundamental benefit of continuous integration is that it removes sessions where people spend time hunting bugs where one person’s work has stepped on someone else’s work without either person realizing what happened. These bugs are hard to find because the problem isn’t in one person’s area, it is in the interaction between two pieces of work.

– Fowler, 2000.

A system that supports continuous integration needs, at a minimum, the following capabilities:

  • It requires a revision control system, with a centralized mainline that all work is integrated into.
  • The build process should be automated so that anyone can launch it manually. It should also support builds and tests triggered automatically by other events, like integrating a branch in the source tree.
  • Tests should be automated so that they can be launched manually as well.
  • The system should produce a final distribution.

CI Systems

Continuous Integration Systems are software systems that provide these capabilities. Early standalone systems include Jenkins (Open Source), and CircleCI. Many source control platforms also provide CI functionality, including Bitbucket, GitHub and GitLab.

For example, you can configure GitLab so that it will build and run your tests any time a specific action is performed, like committing to a branch or merging a merge request. This is managed through the CI/CD section of the project configuration.

GitLab CI

The GitLab configuration and terminology is pretty standard:

  • A pipeline represents all of the work that will be done; it is made up of jobs, grouped into stages.
  • A stage represents a set of jobs that need to be executed together.
  • Jobs are executed by runners, which determine how and where a job will be executed.

These all represent actions that will be taken against your source code repository at specific times. The examples that they provide include:

  • A build stage, with a job called compile.
  • A test stage, with two jobs called test1 and test2.
  • A staging stage, with a job called deploy-to-stage.
  • A production stage, with a job called deploy-to-prod.

In this way, you can set up your source code repository to build, test, stage and deploy your software automatically one or more times per day, as a result of some key event, or when manually executed.
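The stages and jobs listed above could be written in a .gitlab-ci.yml file roughly as follows. This is a configuration sketch rather than a working pipeline: the stage and job names come from the list above, but every script line is a placeholder echo command, and making the production deploy manual is an assumption added for illustration.

```yaml
# Stages run in the order listed; jobs in the same stage can run in parallel.
stages:
  - build
  - test
  - staging
  - production

compile:
  stage: build
  script:
    - echo "compile the project here"      # placeholder command

test1:
  stage: test
  script:
    - echo "run the first test suite"      # placeholder command

test2:
  stage: test
  script:
    - echo "run the second test suite"     # placeholder command

deploy-to-stage:
  stage: staging
  script:
    - echo "deploy to the staging environment"   # placeholder command

deploy-to-prod:
  stage: production
  script:
    - echo "deploy to production"          # placeholder command
  when: manual                             # assumption: require a manual trigger
```

A pipeline built from this file runs compile first, then test1 and test2, and only moves on to the deployment stages if the earlier stages succeed.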

Although we have a GitLab instance running, we do not have access to a cluster that can run jobs for us. In other words, we cannot do this in production using our current setup – at least not without gaining access to a Kubernetes cluster somewhere.

  1. John Ousterhout. 2018. A Philosophy of Software Design.

  2. This assumes that we will pass the point of intersection of these curves. Most software will live long enough to justify the time investment.