Building Your Data Governance Toolbox

Photo by Cesar Carlevarino Aragon on Unsplash

When you first start learning about data governance, it often seems like a hairball of tightly knit ideas where you can’t understand any one piece until you’ve studied and learned the whole thing. I’m not an expert by any stretch, but I’ve wrestled with learning about data governance… Continue reading “Building Your Data Governance Toolbox”

Layers of Data Infrastructure 3: Storage

Photo by Cobro on Unsplash

In my last two posts I’ve explored the high-level design decisions related to two of the three layers that define each pipeline stage of each category of data use cases: Control and Compute. The Control layer defines how the user interacts with the system, while the Compute layer defines how… Continue reading “Layers of Data Infrastructure 3: Storage”

Data Infrastructure Layers 2: Compute

Photo by Noah Negishi on Unsplash

In my last post I described how you can think of your organization’s data infrastructure as a grid of blocks defined by category of use case and stage of the pipeline. Each block can be further broken down into three layers: Control, Compute and Storage. Last time I briefly… Continue reading “Data Infrastructure Layers 2: Compute”

Data Infrastructure Layers 1: Control

Photo by CHUTTERSNAP on Unsplash

In my last two posts, I started to break down the types of areas where an organization might need to deploy data tools/infrastructure along two axes: the categories of common use cases and the stages that you’ll encounter in most of these use cases. You can think of these as… Continue reading “Data Infrastructure Layers 1: Control”

Common Stages of Data Workflows

Photo by tian kuan on Unsplash

I want to start going into more detail on the categories of data use cases that I introduced in my last post. When you think of each use case, it’s easy to focus on a fairly narrow piece of it, typically the most interesting parts. But within each… Continue reading “Common Stages of Data Workflows”

Categories of Data Use Cases

Photo by Martin Woortman on Unsplash

As the head of software engineering at a small startup with ambitions to grow much larger, I think a lot about how to design data infrastructure that will both address our immediate needs and adapt to future needs. I’ve seen what happens at large companies when each team has… Continue reading “Categories of Data Use Cases”

Requirement Diameters and Abstraction

In my last post, I discussed an idea called Requirement Diameters – the distance between all the lines of code that enforce a given software requirement – and the coding principle that these diameters should be kept as small as possible, particularly for requirements that are more likely to change. In this post, I will… Continue reading “Requirement Diameters and Abstraction”

Code Factoring and Requirement Diameters

Experienced software teams know that to agree on the design of a project, you must first clearly define and communicate its requirements. But even when this is done well, disagreements over code design often persist due to different understandings of how the requirements are likely to change as the project evolves, and how to prepare… Continue reading “Code Factoring and Requirement Diameters”

Writing Software from the Outside In

Whenever you’re developing software, a certain amount of refactoring and rewriting is inevitable. This is sometimes due to a new idea that will simplify the design or a change to the project requirements. But unfortunately, it often also happens because of a misunderstanding about how the software will connect and interact with its external environment. Continue reading “Writing Software from the Outside In”