Stars: 5
Forks: 0
Language: Jupyter Notebook
Last Updated: Dec 22, 2023
Similar Repos
Language | Stars | Description | Updated At | |
---|---|---|---|---|
Java | 6 | Apache Airavata Data Lake | Jun 28, 2022 | |
Scala | 11 | Apache Hudi Demo | Apr 27, 2023 | |
Python | 5 | Build Glue (Spark) Streaming pipeline for clickstreams and power data lake with Apache Hudi and Query … | Apr 10, 2023 | |
Python | 2 | Power your Down Stream Elastic Search Stack From Apache Hudi Transaction Datalake with CDC | Apr 03, 2024 | |
Dockerfile | 56 | Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi | Aug 21, 2022 | |
Shell | 19 | Demonstration Oracle CDC Source Connector with Kafka Connect | Sep 08, 2022 | |
Java | 66 | A collection of Apache Hudi demos to help beginners get started quickly with Apache Hudi | Apr 11, 2023 | |
None | 2 | Learn About Apache hudi + Flink and Kinesis | Sep 05, 2023 | |
Python | 6 | Maximizing Efficiency in Data Lake (Hudi) Glue ETL Jobs with a Templated Approach and Serverless … | May 23, 2023 | |
Python | 2 | getting started with pyspark and apache hudi glue | Mar 06, 2023 | |
Matlab | 11 | MATLAB source code for Lake Analyzer | Dec 08, 2022 | |
Jupyter Notebook | 33 | Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work | Apr 04, 2023 | |
Java | 26 | Datastax CDC for Apache Cassandra | Jul 28, 2022 | |
C | 6 | AVT9152 demonstration source code and documentation | Mar 01, 2023 | |
TypeScript | 115 | Data Lake as Code, featuring ChEMBL and OpenTargets | Jul 28, 2022 | |
JavaScript | 5 | WebID demonstration source code | Mar 22, 2020 | |
Java | 2 | Support querying Apache Hudi delete rows cross timeline | Jul 12, 2022 | |
None | 446 | A collection of Apache Hudi-related resources | May 27, 2023 | |
None | 13 | Data Lake ETL Code for GHCrawler | Mar 13, 2022 | |
Jupyter Notebook | 3 | Build Data Lake using Open Source tools | Mar 10, 2023 | |
None | 2 | Efficient Data Ingestion with Glue Concurrency: Using a Single Template for Multiple S3 Tables into … | Apr 14, 2023 | |
None | 3 | Project : Using Apache Hudi Deltastreamer and AWS DMS Hands on Labs | Apr 16, 2023 | |
Java | 2933 | CDC Connectors for Apache Flink® | Aug 21, 2022 | |
None | 2 | Samples to make Debezium CDC events' data available to Apache Hop | Jun 17, 2022 | |
None | 4 | Source Code for 'Practical Enterprise Data Lake Insights' by Saurabh Gupta and Vonkayala Venkata Giri | Nov 08, 2021 | |
Jupyter Notebook | 2 | Source Code for 'Data Lake Analytics on Microsoft Azure' by Harsh Chawla and Pankaj Khattar | Jun 01, 2022 | |
Jupyter Notebook | 4 | Sample Python app for Data Lake Analytics and Data Lake Store, built upon the Data … | Jan 14, 2021 | |
Java | 7 | Apache Flink/Apache Kafka streaming data analytics demonstration using Streaming Synthetic Sales Data Generator | Apr 12, 2023 | |
C# | 11 | Sample .NET client library for Data Lake Analytics and Data Lake Store, built upon the … | Jan 14, 2021 | |
C++ | 14 | Sample source code for Demonstration, Experiment and Test | Jul 13, 2022 | |
Java | 9 | Data Lake Engine | Jun 08, 2022 | |
Scala | 2 | Data lake fun! | Aug 15, 2019 | |
Java | 17 | Demonstration Project : Fast Data Analytic platform with Clickhouse, Apache Kafka and ksqlDB | May 03, 2023 | |
Java | 83 | Replicates database CDC events to Apache Iceberg Tables | Dec 02, 2022 | |
Java | 15 | An incubating Debezium CDC connector for Apache Cassandra | Aug 02, 2022 | |
Java | 2 | An incubating Debezium CDC connector for Apache Cassandra | Mar 19, 2024 | |
C++ | 159 | Open Source Oracle database CDC | May 10, 2023 | |
None | 3 | Demonstrate the Dremio Data Lake engine accessing Apache Iceberg tables stored in HDFS | Nov 10, 2022 | |
C++ | 41 | Source code for Lizard NES game (demonstration) | Jun 22, 2022 | |
None | 103 | Apache Pulsar Source code analysis | Aug 12, 2022 | |
C | 986 | Source code for Apache/mod_wsgi. | May 22, 2023 | |
Scala | 29 | PostgreSQL and GreenPlum Data Source for Apache Spark | Mar 25, 2022 | |
Java | 20 | Elasticsearch-cdc plugin, which supports capture data changes in elasticsearch, and sink the cdc data into … | Sep 08, 2022 | |
R | 2 | R code for data import, analysis, and visualization for global lake color. | Jul 07, 2022 | |
C# | 2 | Demonstration code for displaying access and request token data | Aug 25, 2019 | |
Java | 3 | Apache Kafka Streams streaming data analytics demonstration using Streaming Synthetic Sales Data Generator | Nov 03, 2022 | |
Java | 388 | Change Data Capture (CDC) service | Jul 25, 2022 | |
TSQL | 46 | Data Engineering with Spark and Delta Lake | Aug 02, 2022 | |
R | 2 | Serve COVID19 data and analyses from Maine CDC | Apr 22, 2020 | |
None | 2 | Maine CDC Data and likely some R bits | Nov 22, 2018 |