In my last two posts, I started to break down the areas where an organization might need to deploy data tools and infrastructure along two axes: the categories of common use cases and the stages that you’ll encounter in most of those use cases. You can think of these as defining a grid of functionality. Depending on the use cases your organization will encounter, you’ll want one or more options for each square. In this post, I want to start getting into the high-level design decisions you should consider for each block in the grid, whether you’re purchasing an off-the-shelf solution, building from scratch, or somewhere in between.
Before getting into the detailed functionality that you need, there are three general design decisions, which you can think of as defining the three layers that make up the tool or system:
- Control: How will the user interact with the system and tell the computer what they want?
- Compute: Where will the computation be done that allows the computer to carry out the user’s instructions?
- Storage: How will the intermediate data be stored before, during and after the stage?
For each of these decisions, you can very quickly get into the weeds with all the different options and nuances. But at the high level, there are some general categories of options that can be understood even without a deep software engineering background. So in this and the next two posts, I’m going to walk through them with one post for each of the layers. We’ll start out closest to the user with the Control layer.
Mouse or Keyboard
While technical details and context have a large influence over which options are ideal for the control layer of each block, user preferences ultimately play the largest role, particularly when it comes to the main divide between the different options: graphical interface vs. text. This essentially comes down to whether the user spends more time with their hands on the mouse or the keyboard.
Graphical interfaces are typically easier to get started with because they immediately present the most common options. Less technical users tend to be most comfortable with a graphical interface while more technical users often find that once they’ve learned a text-based interface they’re able to use it much faster and more effectively. These users often get frustrated if they have to use graphical interfaces for common, repetitive tasks that could be done faster, or even be automated with a text-based interface.
On either side of the mouse/keyboard divide there are a number of options that we’ll explore below. In some cases, it’s possible to provide the user more than one option for the control layer, e.g. both a graphical and a text interface. So when you make these decisions it’s important to think about who will be responsible for that particular box in the grid and what kind of interface all the different users are likely to prefer.
Graphical Native/Desktop Apps
A native/desktop app is a program that runs directly on the user’s laptop or desktop computer. Before browser applications started to become common around 2005, this was essentially the only type of graphical interface. Native applications offer the most flexibility and control because they’re custom-tailored to the computer they’re running on. They’re also faster for large-scale data visualizations, and may be the only viable option for the most graphics-intensive applications.
The drawback of a native application is that you need to maintain different versions for each type of computer that it will run on – Mac, Windows, Linux, etc. Some applications are only available for one or two systems, which is a problem if members of your organization want to use a different one. The application will also need to be installed individually on each user’s computer and then maintained to ensure consistency. So this can turn into a much larger IT headache than just running a web app on a single server.
While many native apps will run entirely on the user’s computer, it’s also possible for them to connect to external services to either store data or kick off large-scale compute jobs. I’ll write about these storage and compute layers in the next two posts, but it’s important to note that going with a native app for the control layer won’t prevent you from using most options for the other layers.
Web Apps
A web application is a program that runs in your web browser using reactive/interactive HTML components, usually based on the HTML5 standard. It may be loaded from files on your hard drive and completely self-contained on your computer, but more often it’s loaded from an external server and continues to communicate with that server as you use it. Some also offer a hybrid “offline” mode that can function without an internet connection, at least for short periods.
The main benefit of a web app is that you can write one version that runs on many different computers, as long as they have a recent version of a major web browser. Moreover, for apps that are loaded from an external server there’s no installation process: You just go to the URL and it works. This also means there’s no individual upgrade process. When a new version comes out, upgrading the server will ensure that every user is also up to date.
Early web apps were very limited in functionality and much slower compared to native applications. However, they’ve improved drastically in the last decade, to the point where you often won’t notice the difference. A number of off-the-shelf programs that started out as native apps have begun offering a web-based version, though it sometimes has limited functionality. Some of these, such as MS Office 365, even allow you to open the same document in either the native or web version of the application. In this case, both versions of the control layer communicate with the same external compute layer.
Command Line Utility
A Command Line Utility (CLI) is a program that is run by entering a command into the command line of a terminal. For multi-step processes, this often involves running the same program once for each step. The parameters that define the step are typically included in the line when you invoke the program rather than the program prompting the user while it’s running.
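As a concrete sketch of this pattern, here is a minimal single-step tool written with Python’s standard argparse module. The tool and flag names (`loadtool`, `--input`, `--table`) are invented for illustration; the point is that every parameter arrives on the invocation line, so the program never prompts the user while running.

```python
import argparse

# Hypothetical single-step data-loading tool. All parameters arrive on the
# command line when the program is invoked; it never prompts the user.
def build_parser():
    parser = argparse.ArgumentParser(
        prog="loadtool",
        description="Load a file into a table (names here are invented).",
    )
    parser.add_argument("--input", required=True, help="path to the input file")
    parser.add_argument("--table", required=True, help="destination table name")
    parser.add_argument("--overwrite", action="store_true",
                        help="replace the table if it already exists")
    return parser

# A user would type: loadtool --input data.csv --table events --overwrite
args = build_parser().parse_args(
    ["--input", "data.csv", "--table", "events", "--overwrite"]
)
```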
Each step in a multi-step process is defined by a line of text, which makes it easy to repeat common processes, either by copy/paste or by writing a shell script that enters the commands for you. This also makes sharing processes with others relatively easy: you can just document the list of commands, without needing to include screenshots and descriptions of where to find buttons, text boxes, etc.
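A minimal sketch of this kind of automation, using Python’s standard subprocess module. The three tool names below are hypothetical; because each step is just a line of text, the same list can be printed as documentation or replayed to run the process.

```python
import subprocess

# A hypothetical three-step process, one command line per step. Because each
# step is plain text, the whole process can be documented, shared, or
# replayed exactly -- no screenshots required.
STEPS = [
    ["loadtool", "--input", "data.csv", "--table", "raw_events"],
    ["cleantool", "--table", "raw_events", "--output", "clean_events"],
    ["reporttool", "--table", "clean_events", "--format", "pdf"],
]

def run_steps(steps, dry_run=False):
    """Run each command in order; with dry_run=True, just echo the lines."""
    lines = [" ".join(cmd) for cmd in steps]
    for cmd, line in zip(steps, lines):
        if dry_run:
            print(line)
        else:
            subprocess.run(cmd, check=True)  # stop the process if a step fails
    return lines

# Echo the commands without running them (the tools above don't exist):
script_lines = run_steps(STEPS, dry_run=True)
```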
For users who aren’t used to working with CLIs, they can be very intimidating and confusing, particularly if something goes wrong. Even for more experienced users, learning a new CLI can be tricky if it isn’t well documented. But once an experienced user has figured out the important commands, it will often be the fastest and most reliable way to get their work done.
The user can run the CLI on their desktop/laptop, or on a remote system through an SSH connection that allows them to type into the command line on the remote computer. In either case, as with a graphical native application a CLI can run entirely on the same computer or communicate with a remote system for the compute and storage layers.
Code Library
A code library is a collection of source code that allows different programmers to use the same pre-defined functionality across multiple programs/scripts/etc. As with the other types of control layers, a library may define algorithms that run entirely on the user’s computer, or may communicate with an external compute or storage layer.
Each library is written for a single programming language, though there are ways to “wrap” a library from one language in a shell that creates a library in a different language. A number of off-the-shelf applications provide versions of their client library in multiple languages. For it to be usable by someone in your organization, the available languages must include one that your users are working in, or could potentially work in.
Note that the user will be doing the programming in an application that could be a native app (a text editor or IDE) or a web app such as a Jupyter notebook, or a text-based application like vim or emacs. The application they use for this is mostly independent of the application that the client library is interacting with, but the high-level design may need to consider what editors or types of editors the user can write their code in.
For technical users who are familiar with, or able to work in one of the available languages, a client library provides the most flexibility for controlling and automating processes and can allow them to connect and coordinate multiple applications in the same program/script. The client will often have more fine-grained options than a CLI. In fact, an application’s CLI will often be written using the same client library that is also made available directly to users. As with the CLI, a library may run entirely on the user’s computer or communicate with a remote compute or storage layer.
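The relationship between a client library and a CLI built on top of it can be sketched as follows. Every class, method, and path name here is hypothetical; the in-memory dictionary stands in for a remote storage layer.

```python
# Sketch of how a CLI can be a thin wrapper over the same client library
# that users import directly. All names here are hypothetical.

class StorageClient:
    """Stand-in client library, with finer-grained options (e.g. overwrite)
    than a CLI might choose to expose."""

    def __init__(self, endpoint):
        self.endpoint = endpoint
        self._files = {}  # in-memory stand-in for the remote storage layer

    def upload(self, remote_path, data, overwrite=False):
        if remote_path in self._files and not overwrite:
            raise FileExistsError(remote_path)
        self._files[remote_path] = data

    def list_paths(self, prefix=""):
        return sorted(p for p in self._files if p.startswith(prefix))


def cli_main(argv, client):
    """The CLI just translates a command line into library calls."""
    if argv[0] == "upload":
        client.upload(argv[1], argv[2], overwrite="--overwrite" in argv)
    elif argv[0] == "ls":
        for path in client.list_paths():
            print(path)


# A programmer can mix both paths in one script and coordinate multiple calls:
client = StorageClient("https://storage.example.internal")
cli_main(["upload", "raw/events.csv", "col1,col2"], client)  # via the "CLI"
client.upload("raw/users.csv", "id,name")                    # direct library call
```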
On the other hand, the technical bar for using a client library is significantly higher because it relies on the user being comfortable with programming. While many semi-technical users can learn to be comfortable with a CLI, programming is often a non-starter. So if any of your target users for a given block of the use-case grid are not programmers, you’ll need to have at least one other option for the control layer.
REST API
A REST API is typically a behind-the-scenes part of one of the other types of control layers, rather than its own stand-alone option. But it’s worth mentioning because some applications make it directly accessible to users, while others don’t.
An Application Programming Interface (API) is a way for a program/system to allow other programs/systems to connect and communicate with it across a network using a pre-defined language. (The term API is often used in a more general sense, but we’ll use this narrower meaning here.) REST is a standard framework for defining such APIs and is currently the most common by far, though there are other frameworks, such as SOAP, that were more common in the past.
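As a concrete sketch, that “pre-defined language” usually amounts to an HTTP verb, a resource URL, and a JSON body. The Python example below builds such a request with the standard urllib module; the endpoint and payload fields are hypothetical, and the request is never actually sent.

```python
import json
import urllib.request

# Minimal sketch of a REST call: an HTTP verb, a resource URL, and a JSON
# body. The endpoint and fields are hypothetical; urllib.request is the
# Python standard library.
def build_create_job_request(base_url, table, output_format):
    payload = {"table": table, "format": output_format}
    return urllib.request.Request(
        url=f"{base_url}/v1/jobs",          # the resource being created
        data=json.dumps(payload).encode(),  # parameters travel as JSON
        headers={"Content-Type": "application/json"},
        method="POST",                      # POST = create a new resource
    )

req = build_create_job_request("https://data.example.com", "events", "parquet")
# urllib.request.urlopen(req) would actually send it; omitted here.
```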
Any control layer that communicates with an external compute or storage layer uses some sort of API provided by that layer. If you’re building or managing those layers yourself, you can connect any type of control layer you want to the API. But off-the-shelf solutions that provide their own control layers may or may not allow users to connect other applications to the API, or may provide a separate API that is more limited.
Being able to communicate directly with an application’s REST API both gives you more options for defining your own control layer and allows you to have other existing applications communicate and share data with it. So when assessing off-the-shelf software it’s important to evaluate not only the options it provides for control layers, but also what it offers for a REST API.
In the grid of capabilities defined by the categories of data use cases and the common workflow stages, each block requires one or more options for the control layer that allows users to specify what they want the computer to do. Some of these blocks fall more along the lines of development/coding, and some are more operational. But for all of them, it’s important to think about what types of users will be responsible for them, and what type of control layer will allow them to work most efficiently and effectively.