Stars
9
Forks
1
Language
Jupyter Notebook
Last Updated
Mar 27, 2022
Similar Repos
Language | Stars | Description | Updated At | |
---|---|---|---|---|
None | 2 | Hadoop Tutorial Project to demonstrate Hadoop map, reduce and jobs using a simple word count. | Dec 07, 2016 | |
Java | 50 | CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop | Jun 29, 2022 | |
Java | 36 | CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop | Apr 26, 2023 | |
Java | 102 | Big Data ETL and Utilities for Hadoop Map Reduce, Spark and Storm | Nov 16, 2022 | |
Python | 138 | This small utility retrieves from the CommonCrawl data set unique subdomains for a given domain … | Aug 13, 2022 | |
Java | 9 | Java example of analyzing twitter data with hadoop map reduce. | Feb 17, 2023 | |
Java | 2 | Java example of analyzing wikipedia data with hadoop map reduce. | Jun 24, 2018 | |
Perl | 5 | Text auto-summarizer, provides scored lists summary sentences, important phrase-fragments, and frequently occurring words | Aug 28, 2019 | |
Python | 5 | Tutorial how to use Hadoop Map Reduce using Openstack SWIFT to read and write data | Aug 24, 2018 | |
Python | 3 | Big data analysis and application, including word segmentation and word frequency calculation through jieba, word … | Oct 01, 2023 | |
Jupyter Notebook | 2 | This coding repository showcases various features present in airline twitter sentiment dataset by performing exploratory … | Jan 14, 2024 | |
JavaScript | 5 | SignalK Node Server Plugin that updates and retrieves data from a SignalK cloud server | Aug 15, 2022 | |
Python | 2 | most frequently occurring words in kdramas turned into flashcards to help me learn korean and … | Jun 20, 2022 | |
Python | 28 | A command-line utility program for automating the trivial, frequently occurring data preparation tasks: missing value … | Jun 03, 2022 | |
JavaScript | 21 | Retrieves sports data from a popular sports website as well as from the NCAA website, … | Sep 05, 2022 | |
Ruby | 6 | Fantasy sports scrapers, tools and data. | Feb 08, 2016 | |
Python | 2 | Scan Wikipedia for the particular word and calculate the frequency and percentage, how frequently … | Apr 30, 2019 | |
Python | 18 | Retrieves data from GitHub and returns JSON styled data. | Jun 25, 2022 | |
None | 305 | Metrica Sports sample tracking and event data | May 01, 2023 | |
Java | 5 | High performance upload of data to cloud storage via Hadoop FS clients | Jun 21, 2018 | |
Java | 5 | Schema-free SQL for Hadoop, NoSQL and Cloud Storage | Dec 07, 2021 | |
Python | 8 | The Auditree data gathering and reporting tool. | Jan 18, 2022 | |
Python | 3 | Example for realtime data gathering and execution | Dec 15, 2021 | |
Python | 80 | Streamlining phylogenomic data gathering, processing and visualization | Jun 02, 2022 | |
Java | 4 | Sports news app that shows latest sports news using Rest API call, SQLite database to … | Oct 09, 2021 | |
Python | 4 | Tools and libraries for gathering and analyzing data | Aug 16, 2022 | |
None | 2 | Getting Started with Hadoop and Big Data | Jul 08, 2022 | |
C# | 4 | Worker that loads and retrieves data from "slow" endpoints. | Jan 24, 2022 | |
PHP | 8 | Retrieves data from Harvest and prepares it for Geckoboard | Feb 20, 2021 | |
Python | 2 | Retrieves Revit data and writes it to MS Excel. | Aug 01, 2022 | |
Python | 2 | export telegram group statistics and generate word cloud | Jun 13, 2022 | |
JavaScript | 3 | Word cloud of academic papers and their authors | Mar 01, 2019 | |
Jupyter Notebook | 3 | Scrape a Twitter account and create a word cloud | May 19, 2021 | |
R | 12 | OCR an image and get a word cloud | Sep 13, 2021 | |
Java | 2 | ES-Hadoop bidirectional data read/write: share data between ES and Hadoop based on ES-Hadoop | Jun 07, 2022 | |
JavaScript | 10 | Word cloud generator is a domain hosted web application for generating word clouds on accepting … | Mar 10, 2023 | |
TeX | 5 | Data and code for "NOPE: A Corpus of Naturally-Occurring Presuppositions in English." | Oct 31, 2022 | |
Java | 33 | MessagePack-Hadoop integration provides an efficient schema-free data representation for Hadoop and Hive. | Nov 10, 2021 | |
Java | 33 | Riak data as input to hadoop m/r and output of hadoop m/r | Aug 13, 2019 | |
None | 9 | Corpus of domain names scraped from Common Crawl and manually annotated to add word boundaries … | Jan 06, 2022 | |
Shell | 7 | A Docker sandbox with Hadoop 0.0 (aka Nutch 0.8-dev) and word count example. | Apr 15, 2021 | |
Go | 2 | Flash Boys 2; frontrun.me web code, data gathering, and public data. | Apr 05, 2021 | |
R | 4 | Mine PDFs for word frequencies, create a word cloud and merge with PDF metadata | May 12, 2021 | |
Java | 338 | The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data … | Aug 10, 2022 | |
Java | 2 | The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data … | Jun 12, 2013 | |
Rust | 2 | Magic the Gathering data types and card database library | Oct 14, 2020 | |
None | 6 | Wiki for planning, gathering data and miscellaneous note-taking | Jul 30, 2022 | |
Jupyter Notebook | 9 | Analyzing crime reported in the U.S. using data derived from commoncrawl, New York Times api … | Jul 02, 2022 | |
Python | 2 | Tracker repository for Sports Viz Sunday. Combining two of the greatest things in the world: … | Feb 28, 2023 | |
None | 2 | The Frequently Asked Questions of data, stats, careers and more. | May 06, 2021 |