Snowflake to Parquet in minutes
Snowflake is an industry-leading, cloud-native data warehouse. Data tables are arranged in a multidimensional database, resulting in a user-friendly, flexible, scalable, and responsive storage solution that provides a strong alternative to traditional on-prem data storage. Using the Snowflake driver, this connector integrates your Snowflake data without affecting the source schema.
Apache Parquet is an open-source, column-oriented data storage format from the Hadoop ecosystem, designed for fast querying on large datasets. Parquet is routinely used to build very large data lakes that remain efficiently queryable, and it is similar to other columnar file formats available in Hadoop, such as ORC.
Estuary helps move data from Snowflake to Parquet in minutes with millisecond latency.
Estuary integrates with an ecosystem of free, open-source connectors to extract data from Snowflake with low latency, allowing you to replicate that data to various systems for both analytic and operational purposes. This data can be organized into a data lake or loaded into other data warehouses or streaming systems.
Data can then be directed to Parquet using materializations, which are also open-source. Connectors can push data as quickly as the destination can handle it. Parquet performs best with files of around 1 GB each, so if you have high data volumes, Flow can keep your data lake up to date in near real time.