
Posts by Jhames Mejia


Final Peer Reviews for #DEZoomcamp complete! ✅

Just evaluated a complex SkyPulse Streaming Pipeline. Impressed by the real-time flight data visualization and the use of Apache Flink + Redpanda. Reviewing expert architectures like this is a game-changer for my engineering criteria.

2 weeks ago 0 0 0 0
GitHub - CodingJhames/secop-ii-data-pipeline-aws Contribute to CodingJhames/secop-ii-data-pipeline-aws development by creating an account on GitHub.

Final Project for the Data Engineering Zoomcamp complete! 🚀

Built an end-to-end pipeline to audit Colombian public procurement (SECOP II), bridging law and tech. Used AWS (S3, Glue, Athena), dbt for transformations, and Looker Studio for the dashboard.

Check my repo & dashboard: github.com/CodingJhames...

3 weeks ago 0 0 0 0

The Data Warehouse is officially online! 🌩️

Mapped 100k raw SECOP II records in S3 using AWS Glue and queried them with Athena. The best part? Connecting my local James-T-850 via #dbt to run the first staging models directly in the cloud. 🚀

github.com/CodingJhames...
#DataEngineering #AWS #dbt

3 weeks ago 0 0 0 0

Data Ingestion milestone reached! 🏗️

Successfully deployed #AWS S3 with #Terraform and ingested 100k records from the SECOP II API using Python on my James-T-850. Everything is now stored as Parquet.

Repo: github.com/CodingJhames...

#DataEngineering #Python #Cloud #DEZoomcamp

3 weeks ago 1 0 0 0
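
A minimal sketch of how a paginated pull of 100k records like the one in the post above might be planned. It assumes the API accepts Socrata-style `$limit` and `$offset` query parameters; the base URL, the page size, and the `page_urls` helper are hypothetical placeholders rather than anything taken from the repo, and no request is actually sent.

```python
from urllib.parse import urlencode


def page_urls(base_url, total_records, page_size):
    """Yield one URL per page needed to cover total_records rows."""
    for offset in range(0, total_records, page_size):
        # The last page may be smaller than page_size.
        limit = min(page_size, total_records - offset)
        # safe="$" keeps the parameter names readable instead of %24-escaped.
        query = urlencode({"$limit": limit, "$offset": offset}, safe="$")
        yield f"{base_url}?{query}"


# Hypothetical endpoint and page size, for illustration only.
urls = list(page_urls("https://example.org/resource/dataset.json", 100_000, 25_000))
for url in urls:
    print(url)
```

Each URL could then be fetched in turn and the combined rows written out as Parquet with whichever library the project uses.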
GitHub - CodingJhames/de-zoomcamp-james Contribute to CodingJhames/de-zoomcamp-james development by creating an account on GitHub.

Week 7 #DataEngineering Zoomcamp 🏎️
Streamed 4.4M records via #Redpanda & #PySpark on my James-T-850. Speed is nothing without logic!

Results:
Dist: 9506
🏙️ Zone: 74
⏳ Session: 31m
💰 Peak Tip: 10-16 18:00

Progress: github.com/CodingJhames...

#Streaming #Python #BigData #DataTalksClub

1 month ago 0 0 0 0
GitHub - DataTalksClub/data-engineering-zoomcamp: Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

Week 6 Spark Module of #DataEngineeringZoomcamp completed! 🛠️

Bridging legal logic with large-scale batch processing.
🔹 PySpark & DataFrames
🔹 Parquet optimization
🔹 AWS EC2 deployment
🔹 Spark UI monitoring (4040)

Solution: github.com/CodingJhames...

Course: github.com/DataTalksClu...

1 month ago 0 0 0 0

Totally agree! Schema drift was a great lesson today. dlt handles schema versioning out of the box, storing it as YAML in the destination. It definitely gives me peace of mind knowing the evolution is tracked even if the API changes. Thanks for the insight!

1 month ago 0 0 0 0

#DEZoomcamp W6: dlt & DuckDB! 🏁

Schema drift: API changed tip_amount to tip_amt. ⚠️
DuckDB's error logs pointed out the fix instantly.

Running #dlt on an AWS t3.micro (1GB RAM).
0.2666 Credit prop | 6063.41 Tips.

Repo: github.com/CodingJhames/de-zoomcamp-james

#DataEngineering #AWS #BuildInPublic

1 month ago 2 0 1 0
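
A minimal sketch of one way to absorb a rename like `tip_amount` to `tip_amt` before aggregating. This is plain Python, not how dlt or DuckDB handle drift internally; the alias map, the `normalize_record` helper, and the sample rows are all hypothetical.

```python
# Hypothetical alias map: drifted column name -> canonical column name.
COLUMN_ALIASES = {"tip_amt": "tip_amount"}


def normalize_record(record, aliases=COLUMN_ALIASES):
    """Return a copy of record with drifted keys renamed to canonical ones."""
    normalized = {}
    for key, value in record.items():
        canonical = aliases.get(key, key)
        # Refuse to silently overwrite a value that is already present.
        if canonical in normalized:
            raise ValueError(f"conflicting values for column {canonical!r}")
        normalized[canonical] = value
    return normalized


# Made-up rows: one from before the API change, one from after.
old_row = {"trip_id": 1, "tip_amount": 2.5}
new_row = {"trip_id": 2, "tip_amt": 3.0}

rows = [normalize_record(r) for r in (old_row, new_row)]
total_tips = sum(r["tip_amount"] for r in rows)
print(total_tips)
```

Keeping the alias map in one place means that a later rename only needs one new entry, and the conflict check surfaces rows that carry both the old and the new column.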

Thanks, Jeremy! Performance took a hit due to disk I/O, but the goal was to prove that logic beats a tight budget. I used the zone data for a join, and you're right: it's the perfect case for a broadcast join to avoid blowing past that 1GB RAM limit!

1 month ago 0 0 0 0
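
A minimal sketch of the idea behind the broadcast join mentioned in the reply above, written in plain Python rather than Spark: the small side is held in memory as a lookup table while the large side is streamed row by row, so the large side is never fully materialized. The `broadcast_join` helper and the sample data are hypothetical; in PySpark the equivalent hint is `pyspark.sql.functions.broadcast` applied to the small DataFrame.

```python
def broadcast_join(big_rows, small_rows, key):
    """Inner-join a streamed large side against a small in-memory side."""
    # The small side becomes a dict: this is the part that gets "broadcast".
    lookup = {row[key]: row for row in small_rows}
    for row in big_rows:
        match = lookup.get(row[key])
        if match is not None:  # inner join: rows without a match are dropped
            yield {**match, **row}


# Made-up data: a tiny zone lookup and a generator standing in for a large trip table.
zones = [
    {"zone_id": 74, "zone_name": "Zone A"},
    {"zone_id": 75, "zone_name": "Zone B"},
]
trips = ({"trip_id": i, "zone_id": z} for i, z in enumerate([74, 75, 74, 99]))

joined = list(broadcast_join(trips, zones, key="zone_id"))
for row in joined:
    print(row)
```

Only the lookup dict stays resident in memory, which is why this shape suits a small dimension table joined against a much larger fact table.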

Data Engineering Week 5: Done! 🏁

Pivoted to #AWS from GCP. ☁️
Ran #PySpark on a t3.micro (1GB RAM) using a 4GB Swapfile. Processed NYC Taxi data smoothly without crashes. 🧠

Adaptability > Tools. 🦾

Code: github.com/CodingJhames...

#DataEngineering #Spark #AWS #OpenSource

2 months ago 2 0 3 0

Week 4 Analytics Engineering: Complete! 👨‍💻

Mastered dbt with NYC Taxi data in AWS Athena. Overcoming those Parquet type mismatches was the final boss! 🚀

Project link:
github.com/CodingJhames...

#DataEngineering #dbt #BuildInPublic

2 months ago 2 1 0 0

Week 3 #DataZoomcamp done! 🚀

Migrated the DW logic to AWS Athena & S3 ☁️

🔹 20M+ records with #Kestra
🔹 Optimized queries: 310MB → 26MB scan using Partitioning & Clustering
🔹 Mastered cloud-agnostic DW concepts

Check my repo: 🔗 github.com/CodingJhames...

#DataEngineering #AWS #Athena

2 months ago 2 0 0 0
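
A small worked calculation for the 310MB to 26MB scan figure in the Week 3 post above, to put the reduction in relative terms. The `scan_reduction` helper is hypothetical; only the two sizes come from the post.

```python
def scan_reduction(before_mb, after_mb):
    """Return (fraction of scanned data saved, how many times smaller the scan is)."""
    saved_fraction = 1 - after_mb / before_mb
    shrink_factor = before_mb / after_mb
    return saved_fraction, shrink_factor


saved, factor = scan_reduction(310, 26)
print(f"{saved:.1%} less data scanned, {factor:.1f}x smaller")
# 91.6% less data scanned, 11.9x smaller
```

Because Athena bills by the amount of data scanned, a reduction of this size translates roughly proportionally into query cost for the same query.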