Jhames Mejia (@codingjhames) Bsky

Final Peer Reviews for #DEZoomcamp complete! ✅

Just evaluated a complex SkyPulse Streaming Pipeline. Impressed by the real-time flight data visualization and the use of Apache Flink + Redpanda. Reviewing well-designed architectures like this is sharpening my own engineering judgment.

2 weeks ago 0 0 0 0

GitHub - CodingJhames/secop-ii-data-pipeline-aws Contribute to CodingJhames/secop-ii-data-pipeline-aws development by creating an account on GitHub.

Final Project for the Data Engineering Zoomcamp complete! 🚀

Built an end-to-end pipeline to audit Colombian public procurement (SECOP II), bridging law and tech. Used AWS (S3, Glue, Athena), dbt for transformations, and Looker Studio for the dashboard.

Check my repo & dashboard: github.com/CodingJhames...

3 weeks ago 0 0 0 0

The Data Warehouse is officially online! 🌩️

Mapped 100k raw SECOP II records in S3 using AWS Glue and queried them with Athena. The best part? Connecting my local James-T-850 via #dbt to run the first staging models directly in the cloud. 🚀

github.com/CodingJhames...
#DataEngineering #AWS #dbt

3 weeks ago 0 0 0 0

Data Ingestion milestone reached! 🏗️

Successfully deployed #AWS S3 with #Terraform and ingested 100k records from the SECOP II API using Python on my James-T-850. Everything is now stored as Parquet.

Repo: github.com/CodingJhames...

#DataEngineering #Python #Cloud #DEZoomcamp

3 weeks ago 1 0 0 0

GitHub - CodingJhames/de-zoomcamp-james Contribute to CodingJhames/de-zoomcamp-james development by creating an account on GitHub.

Week 7 #DataEngineering Zoomcamp 🏎️
Streamed 4.4M records via #Redpanda & #PySpark on my James-T-850. Speed is nothing without logic!

Results: 📍
📏 Dist: 9506
🏙️ Zone: 74
⏳ Session: 31m
💰 Peak Tip: 10-16 18:00

Progress: github.com/CodingJhames...

#Streaming #Python #BigData #DataTalksClub

1 month ago 0 0 0 0

GitHub - DataTalksClub/data-engineering-zoomcamp: Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

Week 6 Spark Module of #DataEngineeringZoomcamp completed! 🛠️

Bridging legal logic with large-scale batch processing.
🔹 PySpark & DataFrames
🔹 Parquet optimization
🔹 AWS EC2 deployment
🔹 Spark UI monitoring (4040)

Solution: github.com/CodingJhames...

Course: github.com/DataTalksClu...

1 month ago 0 0 0 0

Totally agree! Schema drift was a great lesson today. dlt handles schema versioning out of the box, storing it as YAML in the destination. It definitely gives me peace of mind knowing the evolution is tracked even if the API changes. Thanks for the insight!

1 month ago 0 0 0 0

#DEZoomcamp W6: dlt & DuckDB! 🏁

Schema drift: API changed tip_amount to tip_amt. ⚠️
DuckDB's error logs pointed out the fix instantly.

Running #dlt on an AWS t3.micro (1GB RAM).
0.2666 Credit prop | 6063.41 Tips.

Repo: github.com/CodingJhames/de-zoomcamp-james

#DataEngineering #AWS #BuildInPublic

1 month ago 2 0 1 0
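
A minimal sketch of one way to absorb a rename like tip_amount → tip_amt before loading. This is not how dlt or DuckDB handle it internally; the alias map, the function name, and the sample rows are all hypothetical and only illustrate the idea of mapping drifted names back to a canonical one.

```python
# Hypothetical alias map: drifted column name -> canonical column name.
COLUMN_ALIASES = {"tip_amt": "tip_amount"}


def normalize_record(record, aliases=COLUMN_ALIASES):
    """Return a copy of `record` with drifted keys renamed to canonical ones."""
    normalized = {}
    for key, value in record.items():
        canonical = aliases.get(key, key)
        # Refuse to guess if a record carries both the old and the new name.
        if canonical in normalized:
            raise ValueError(f"both old and new names present for {canonical!r}")
        normalized[canonical] = value
    return normalized


# Made-up rows: one from before the API change, one from after.
old_row = {"trip_distance": 2.5, "tip_amount": 3.0}
new_row = {"trip_distance": 1.2, "tip_amt": 1.5}

rows = [normalize_record(r) for r in (old_row, new_row)]
total_tips = sum(r["tip_amount"] for r in rows)
```

With the rename applied up front, downstream queries can keep using tip_amount regardless of which API version produced the row.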

Thanks, Jeremy! Performance took a hit due to disk I/O, but the goal was to prove that logic beats a tight budget. I used the zone data for a join, and you're right: it's the perfect case for broadcast joins to avoid blowing past that 1GB RAM limit!

1 month ago 0 0 0 0
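
To illustrate why a broadcast join suits a small zone table, here is a plain-Python sketch of the idea: hold the small side in memory as a lookup and stream the large side past it. It is not Spark code and not the code from the repo; the field names and sample rows are hypothetical. In PySpark, the matching hint is wrapping the small DataFrame with `pyspark.sql.functions.broadcast`.

```python
def broadcast_hash_join(trips, zones):
    """Inner-join each trip to its zone name via an in-memory lookup."""
    # The small side is materialized once as a dict (the "broadcast" copy).
    zone_lookup = {z["zone_id"]: z["zone_name"] for z in zones}
    # The large side is streamed row by row and never sorted or shuffled.
    for trip in trips:
        name = zone_lookup.get(trip["zone_id"])
        if name is not None:  # inner-join semantics: drop unmatched trips
            yield {**trip, "zone_name": name}


# Hypothetical data: a tiny zone table and a few trips.
zones = [
    {"zone_id": 1, "zone_name": "Zone A"},
    {"zone_id": 2, "zone_name": "Zone B"},
]
trips = [
    {"trip_id": 1, "zone_id": 1},
    {"trip_id": 2, "zone_id": 2},
    {"trip_id": 3, "zone_id": 99},  # no matching zone, so it is dropped
]

joined = list(broadcast_hash_join(trips, zones))
```

Memory use is bounded by the size of the small table, which is what makes the approach attractive on a machine with very little RAM.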

Data Engineering Week 5: Done! 🏁

Pivoted to #AWS from GCP. ☁️
Ran #PySpark on a t3.micro (1GB RAM) using a 4GB Swapfile. Processed NYC Taxi data smoothly without crashes. 🧠

Adaptability > Tools. 🦾

Code: github.com/CodingJhames...

#DataEngineering #Spark #AWS #OpenSource

2 months ago 2 0 3 0

Week 4 Analytics Engineering: Complete! 👨‍💻

Mastered dbt with NYC Taxi data in AWS Athena. Overcoming those Parquet type mismatches was the final boss! 🚀

Project link:
github.com/CodingJhames...

#DataEngineering #dbt #BuildInPublic

2 months ago 2 1 0 0

Week 3 #DataZoomcamp done! 🚀

Migrated the DW logic to AWS Athena & S3 ☁️

🔹 20M+ records with #Kestra
🔹 Optimized queries: 310MB → 26MB scan using Partitioning & Clustering
🔹 Mastered cloud-agnostic DW concepts

Check my repo: 🔗 github.com/CodingJhames...

#DataEngineering #AWS #Athena

2 months ago 2 0 0 0
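
A rough sketch of why partitioning shrinks the scanned bytes: when the filter column is part of a Hive-style path (`column=value`), a query engine can skip whole folders without opening them. This is a plain-Python illustration, not Athena's implementation; the object keys and sizes below are hypothetical and are not the figures from the post.

```python
def prune_partitions(object_keys, column, wanted_values):
    """Keep only keys whose Hive-style `column=value` segment matches."""
    wanted = {str(v) for v in wanted_values}
    kept = []
    for key in object_keys:
        # Parse path segments such as "month=02" into {"month": "02"}.
        segments = dict(s.split("=", 1) for s in key.split("/") if "=" in s)
        if segments.get(column) in wanted:
            kept.append(key)
    return kept


# Hypothetical layout: object key -> size in MB.
objects = {
    "trips/month=01/part-0.parquet": 10,
    "trips/month=02/part-0.parquet": 10,
    "trips/month=03/part-0.parquet": 10,
}

to_scan = prune_partitions(objects, "month", ["02"])
scanned_mb = sum(objects[key] for key in to_scan)
total_mb = sum(objects.values())
```

Here a filter on a single month touches one of three files, so only a third of the stored data is read.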

Posts by Jhames Mejia