More about Estuary and related technologies, straight from the team.
Our blog breaks down basic concepts and takes you into the minds of our engineers. We also dig into the business principles that guide our company and allow us to build great solutions for yours.
The problem with credit-based pricing for data platforms
When you choose the platforms and systems that comprise your modern data infrastructure, the pricing model is a major factor to consider. Your data warehouse and data pipelines handle a ton of data every day, and they can quickly make or break your budget. Evaluating exactly how and why you’ll be charged is a crucial …
How to create a real-time materialized view in PostgreSQL
PostgreSQL is a powerful open-source database that supports materialized views. Traditional materialized views are database objects that contain the results of a query — usually, a focused subset of a large dataset. They’re powerful optimization tools. But what about materialized views of a real-time data feed? When you run a query at a set interval, …
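The article covers the PostgreSQL specifics, but the core idea of a continuously maintained materialized view can be sketched in a few lines: instead of re-running a query on a schedule, each incoming event updates the stored result incrementally. This is a minimal, illustrative Python sketch (the `RunningAggregate` class and event shape are assumptions for illustration, not part of any particular tool):

```python
from collections import defaultdict

class RunningAggregate:
    """Toy stand-in for a continuously maintained materialized view:
    each incoming event updates the stored result incrementally,
    rather than re-running a GROUP BY query at a set interval."""

    def __init__(self):
        self.totals = defaultdict(float)  # the "view": running sum per key
        self.counts = defaultdict(int)    # running row count per key

    def apply(self, event):
        # event is a dict like {"user": "alice", "amount": 9.5}
        key = event["user"]
        self.totals[key] += event["amount"]
        self.counts[key] += 1

    def view(self):
        # Equivalent of: SELECT user, SUM(amount), COUNT(*) ... GROUP BY user
        return {k: (self.totals[k], self.counts[k]) for k in self.totals}

view = RunningAggregate()
for e in [{"user": "alice", "amount": 10.0},
          {"user": "bob", "amount": 4.0},
          {"user": "alice", "amount": 2.5}]:
    view.apply(e)
```

The key property: the view is always current after each event, with no refresh interval to tune.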
Everything you actually need to know to use Estuary Flow
There are only a few core concepts you need to know to start using Estuary Flow. We explain them with illustrations.
Four software engineering best practices to improve your data pipelines
Software and data engineering are very different, but you can apply the same best practices to both – if you’re smart about the context.
Why ELT won’t fix your data problems
If we’re not careful, the modern data stack and ELT can cause new incarnations of problems that have been plaguing us for years.
Announcing Estuary’s partnership with Rockset
As we expand our beta program, Estuary is thrilled to announce our partnership with Rockset.
DataOps for business: A comprehensive introduction
DataOps is a holistic approach that recognizes the far-reaching impact data has on a business, and addresses common problems on several fronts.
Why you don’t actually need a reverse ETL platform
If you set up your architecture right, you just need one robust data pipeline system.

Data for all: Why data democratization matters at every scale
It’s vital to avoid data-related power imbalances among individuals, teams, and entire businesses. But data democracy takes work.
Understanding the modern data stack, and why open-source matters
The “modern data stack” is a framework used to conceptualize how different data tools work together. But there’s more to it than that.
The complete change data capture guide for MySQL
A comprehensive guide to change data capture in MySQL for cross-functional data teams.
The complete change data capture guide for PostgreSQL
A comprehensive guide to change data capture in PostgreSQL for cross-functional data teams.
3 reasons to rethink your approach to change data capture
As we navigate this rapidly evolving space, we should be familiar with the challenges of change data capture as it currently exists. But we should not expect them to remain the same for long.
A comprehensive introduction to change data capture (CDC)
Change data capture, or CDC, is the process of recognizing a change in a source data system so that a downstream system can act on that change, usually by updating a target system to reflect new information.
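The definition above can be made concrete with a small sketch: a target system consumes a stream of change events and applies each one to stay in sync with the source. This Python example uses a hypothetical event shape (`{"op", "key", "row"}`), not any specific CDC tool's wire format:

```python
def apply_change(target, event):
    """Apply one change event to a target table modeled as a dict
    keyed by primary key. The event format here is illustrative only."""
    op = event["op"]
    if op in ("insert", "update"):
        # Upsert: new rows are added, changed rows are overwritten.
        target[event["key"]] = event["row"]
    elif op == "delete":
        # Remove the row if present; ignore deletes for unknown keys.
        target.pop(event["key"], None)
    else:
        raise ValueError(f"unknown op: {op}")
    return target

table = {}
for ev in [
    {"op": "insert", "key": 1, "row": {"name": "ada"}},
    {"op": "update", "key": 1, "row": {"name": "ada l."}},
    {"op": "insert", "key": 2, "row": {"name": "grace"}},
    {"op": "delete", "key": 2, "row": None},
]:
    apply_change(table, ev)
```

Real CDC systems add ordering guarantees, schema handling, and fault tolerance on top of this basic replay loop.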
How new pipeline tools are changing data engineering in the 2020s
For the past several years, the workforce has been chronically short on data engineers. But what will happen as vendors build services that automate much of their daily work?
Try it yourself: Continuous materialized views in PostgreSQL
Estuary Flow lets you synchronize data across all the systems you care about, with millisecond latency. What does that actually look like in practice? Here’s a simple — but powerful — example.
This article won’t tell you how to build a data mesh
Or, how to take an organization-first approach to modern data architecture
The power and implications of data materialization
The simple concept of materialization can tell us a lot about how (and why) data is stored and represented. In this post: logs and tables; data loading and queries; exports, and more.
Connector stories: Snowflake and BigQuery
Data warehouses like Snowflake and BigQuery are essential to your data stack. Today we’re talking with the two software engineers who built Estuary’s Snowflake and BigQuery connectors to get an inside look at their process.
The Estuary story and guiding principles
Our mission and guiding principles are defined not just by the common challenges of our field, but also by our personal experiences. Here’s how we got to where we are, the beliefs we hold as a result, and the vision we work towards.
Database vs data warehouse vs data lake: Key differences and usage
We hear these terms used a lot, and to the uninitiated, they can sometimes seem interchangeable. So, what’s the difference between these types of data storage systems?
Three data scaling pitfalls and how to avoid them
Being prepared for data scalability challenges and staying aware of best practices can help you avoid common issues.
Connector stories: Apache Kafka
Apache Kafka is an extremely popular open-source event streaming platform. We talk to Estuary developer Alex about his process and the insights he gained building the Kafka connector.
Introducing Estuary’s open-source connector repository
We believe that an ecosystem of open-source connectors will be critical to the future of data integration. That’s why Estuary is excited to announce our open-source connector repository.
The costs of data integration explained, and how to minimize them
There is a cost associated with putting your data to work, and the benefits you gain depend on the systems you put into place. To maximize net value, you need to strike a balance between minimizing costs and maximizing gain.
Re-evaluating Kafka: issues and alternatives for real-time
Kafka’s challenges have exhausted many an engineer on the path to successful data streaming. What if there was an easier way?
5 example use-cases for real-time data processing
The applications for real-time data processing are diverse and far-reaching. This article highlights just a few.
Real-time vs batch data pipelines: a comprehensive introduction
Real-time and batch are two broad categories of data processing. Though they handle data differently, both are vital to the systems that make our businesses and society run.
Why MapReduce is making a comeback
What if we could adapt MapReduce to real-time data processing? Spoiler: it’s awesome and we’re building a next-gen data platform based on it!
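For readers unfamiliar with the pattern the teaser refers to, the classic MapReduce word count can be sketched in a few lines of Python (the function names are illustrative, not a specific framework's API):

```python
from collections import defaultdict

def map_phase(docs):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in docs:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    # Shuffle: group emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

counts = reduce_phase(shuffle(map_phase(["to be or not to be", "be"])))
```

Because the map and reduce steps are independent per key, the same decomposition can be applied incrementally as new documents arrive, which is the adaptation to real-time processing the article explores.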
ETL vs ELT: Breaking down the split paradigm
At first glance, ETL and ELT seem rigidly defined and mutually exclusive. But to understand our data integration options, we need to look closer.
What’s a data pipeline? The business essentials.
The term “data pipeline” evokes a mental image that speaks for itself. But their significance for your data goes far beyond that.
Putting an End to Unreliable Analytics
When building a product or service, it’s imperative to know that input data will be as expected. If that information comes…
Lead With Vision, Not Metrics: How OKRs Can Be a Dangerous Tool
Data, if used properly, is key to success in any organization. It’s impossible to improve without metrics which help us understand…
Data Democracy Unlocks Value for Organizations. Here’s How to Start
Keep Privacy and Governance in Mind When Developing or Updating Your Systems
How Kubernetes Will Enable a New Genre of Vendor to Enterprise
Safely Sharing Data Between Companies
How to take advantage of the data revolution in your business
Data is exploding and it takes planning to get in front of that trend.
A Unified Data Foundation for Real-time and Batch
Batch and streaming usually require building different architectures and infrastructure, but they shouldn’t.