Projects by Year

2023

NUWE: EcoForecast - 🏆 5th Place in Schneider Electric Hackathon

A solo entry that ranked 5th among 236 teams. It aims to predict Europe's next-hour renewable energy surplus using XGBoost, LightGBM, and LSTM models trained on 2022 data and tested on early 2023 data, giving insight into renewable energy forecasting trends.

Unsupervised Text Summarizer: Multi-Document Compression

Implementation of Lamsiyah et al. (2021) for unsupervised multi-document extractive summarization, employing transformer and compression-based embeddings, supported by a three-dimensional scoring system for document/sentence relevance.

Semantic Data Management for Scientific Publications

A series of projects covering property graphs (Neo4j), distributed graphs (Pregel), and knowledge graphs (GraphDB), followed by a culminating project that integrates these graph databases for machine learning applications.

Extended Micrograd with RNN Support

An extension of Andrej Karpathy's Micrograd to include support for Recurrent Neural Networks (RNNs) in addition to standard neural networks. This project provides a minimalistic yet powerful way to understand and experiment with RNNs.
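As a rough illustration of the idea, and not the project's actual code, the sketch below pairs a Micrograd-style scalar autograd value with a one-unit recurrent cell. The names `Value` and `rnn_forward`, the scalar hidden state, and the `tanh` activation are all assumptions chosen to keep the example small.

```python
import math


class Value:
    """Scalar with reverse-mode autodiff, in the spirit of Micrograd (illustrative)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Visit nodes in topological order so each gradient is complete
        # before it is propagated to its parents.
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()


def rnn_forward(xs, w_x, w_h, b):
    """Unroll h_t = tanh(w_x * x_t + w_h * h_{t-1} + b) over a scalar sequence."""
    h = Value(0.0)
    for x in xs:
        h = (w_x * x + w_h * h + b).tanh()
    return h


w_x, w_h, b = Value(0.5), Value(-0.3), Value(0.1)
h_last = rnn_forward([1.0, -0.5, 2.0], w_x, w_h, b)
h_last.backward()  # backpropagation through time falls out of the shared weights
```

Because the same `w_h` node is reused at every time step, its gradient accumulates contributions from the whole unrolled sequence, which is the core of what an RNN adds over a feed-forward network in this kind of engine.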

Extractive Summarization using BERT & HuggingFace

Custom fine-tuned BERT model using HuggingFace for extractive summarization, along with a utility to convert abstractive to extractive data, aimed at enhancing model training and interpretability.

Kindle <Highlights and Notes> to Notion

An automated parser for transcribing Kindle notes and highlights directly into a Notion page, simplifying the process of consolidating and revisiting key takeaways from readings.
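A minimal sketch of the parsing step, assuming the input is a Kindle `My Clippings.txt` export in which entries are separated by a line of ten `=` characters, with a title line, a metadata line, and then the text. That layout, along with the names `Clipping`, `parse_clippings`, and `group_by_book`, is an assumption for illustration and is not taken from the project. Uploading to Notion is left out.

```python
from dataclasses import dataclass
from typing import Dict, List

SEPARATOR = "=========="  # assumed entry delimiter in the clippings export


@dataclass
class Clipping:
    book: str
    kind: str  # "Highlight", "Note", or "Other"
    text: str


def parse_clippings(raw: str) -> List[Clipping]:
    """Split a clippings export into entries, keeping only those with text."""
    clippings = []
    for entry in raw.lstrip("\ufeff").split(SEPARATOR):
        lines = [ln.strip() for ln in entry.strip().splitlines()]
        lines = [ln for ln in lines if ln]
        if len(lines) < 3:
            continue  # bookmarks and empty entries carry no text
        book, meta, text = lines[0], lines[1], " ".join(lines[2:])
        if "Highlight" in meta:
            kind = "Highlight"
        elif "Note" in meta:
            kind = "Note"
        else:
            kind = "Other"
        clippings.append(Clipping(book=book, kind=kind, text=text))
    return clippings


def group_by_book(clippings: List[Clipping]) -> Dict[str, List[Clipping]]:
    """Group entries per book, e.g. one group per destination page."""
    grouped: Dict[str, List[Clipping]] = {}
    for clip in clippings:
        grouped.setdefault(clip.book, []).append(clip)
    return grouped
```

Each group could then be written to its own page, with highlights and notes rendered as separate blocks.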

2022

USA Flu Time-Series Prediction

A time-series analysis to predict the number of flu cases in the USA. The framework is based on ARIMA models. It consists of pattern recognition, model estimation, statistical validation, predictive analysis, and outlier management.

Big Data Management Backbone for Real Estate

An end-to-end big data pipeline for real estate data. It features raw data ingestion into Apache Hadoop, storage in MongoDB, and data transformation, machine learning, and real-time data streaming with Apache Spark. The final KPIs are visualized in Tableau.

Data Engineering Pipeline to Cluster Countries

This project develops a data management and analysis pipeline to cluster countries based on multiple indicators, such as economic, geopolitical, and governmental data. SQL is used for the data warehouse and Python for the data analysis.
