This tutorial is out of date, but we’re leaving it for posterity. We’ve written a new tutorial that demonstrates this same functionality with different data using the Flow web app. You can find it here.
Estuary Flow lets you synchronize data across all the systems you care about, with millisecond latency. What does that actually look like in practice? Here’s a simple — but powerful — example.
Materialized views are a ubiquitous part of data integration. With Flow, you can keep your materialized views not only synchronized across your different systems, but also continuously updated, even in systems that don’t support continuous materialized views natively.
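To make "continuous" concrete, here is a minimal conceptual sketch in Python. It is not Flow's implementation and uses no Flow API. The class `ContinuousWordCounts` and the function `batch_word_counts` are hypothetical names invented for illustration, and word counting is simply an assumed example workload. The sketch contrasts a view that is updated incrementally as each document arrives with one that is recomputed from scratch over all documents.

```python
from collections import Counter


class ContinuousWordCounts:
    """Hypothetical in-memory stand-in for a continuously updated view."""

    def __init__(self) -> None:
        self.view: Counter = Counter()

    def apply(self, document: str) -> None:
        # Fold one new document into the existing view; earlier
        # documents are never re-read.
        for word in document.lower().split():
            self.view[word] += 1


def batch_word_counts(documents: list[str]) -> Counter:
    # Traditional approach: rebuild the whole view from every document.
    counts: Counter = Counter()
    for document in documents:
        counts.update(document.lower().split())
    return counts


documents = ["the quick brown fox", "the lazy dog", "The fox again"]

continuous = ContinuousWordCounts()
for document in documents:
    continuous.apply(document)  # the view is current after every event

# Both approaches agree on the final result; they differ in when the
# work happens and how fresh the view is between full refreshes.
assert continuous.view == batch_word_counts(documents)
```

The incremental version does a small amount of work per event, so the view is never stale. A real system would also have to persist the view durably and handle failures, which this sketch leaves out.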
With our updated Flow template on GitHub, we demonstrate how to create a continuous materialized view in PostgreSQL. We also provide a (truly) quick and easy path to try it yourself.
Far-reaching potential, compact containers
Flow is an enterprise-level tool designed to tie into all your data systems. By definition, production deployments are complex, unique, and hard to replicate for ad-hoc tutorial or testing purposes.
But learning and testing are never optional. For these purposes, you use local (or virtualized) development environments.
The key? We use VSCode devcontainers to package up the entire environment. In this example, you get Flow, Flow’s dependencies, and a ready-to-use PostgreSQL database.
Still, setting up locally takes time, and no one has much of that to spare. That’s why Flow is moving towards VM-backed, portable development environments. You’ll notice this tutorial encourages you to use GitHub Codespaces (if you have access — it’s new and currently being rolled out progressively). In Codespaces, the development environment is set up from your browser in moments.
Whether your environment is local or virtual, you still benefit from the devcontainer, and the workflow looks the same. Looking forward, we plan to support all users with quick, VM-based testing and learning.
But wait… that was easy
You might’ve thought that this tutorial was almost suspiciously easy to run, especially if you were able to use Codespaces.
In a way, that’s kind of the point — Flow aims to make formerly challenging tasks (like continuous materialized views in PostgreSQL) painless.
Also, yes, in this introductory example we did configure the data flow for you. Every Flow pipeline is defined and configured by one or more YAML files, called the catalog specification. The catalog spec in this example is called word-counts.flow.yaml. You can find it in the template repository.
In an upcoming post, we’ll walk through that catalog spec and peel back the conceptual layers that make up this example, so stay tuned.
In the meantime, you can learn more about how Flow works from these resources:
- Engineer-written blog posts about Flow materialization and how Flow is powered by continuous MapReduce
- Learn exactly what Flow is (and isn’t) by comparing it to systems you know well
Think Flow could help your stack? Our full GA release date is still TBD, but you can stay in the know via our monthly newsletter, or sign up for the private beta waitlist.