I will build an automated ETL data pipeline using Python and Airflow
Data Engineer and Advanced Web Scraping Specialist
About this Gig
Stop making business decisions on messy, unreliable data.
I am a Data Engineer specializing in the Modern Data Stack. I build robust, idempotent, and fully automated data pipelines that transform raw, unstructured inputs into clean, analytics-ready data.
Whether you need a simple script to move API data or a complete "Medallion Architecture" data lake, I design systems that scale.
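To make "idempotent" concrete: a load step is idempotent when running it twice on the same batch leaves the table exactly as one run would. The snippet below is a minimal sketch of that idea, not code from a client project. It assumes a hypothetical `orders` table keyed on `order_id` and uses Python's built-in SQLite so it runs anywhere; PostgreSQL accepts the same `ON CONFLICT ... DO UPDATE` form.

```python
import sqlite3


def load_orders(conn, rows):
    """Upsert rows keyed on order_id so a re-run never duplicates data."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, amount REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders (order_id, amount) VALUES (?, ?) "
        "ON CONFLICT(order_id) DO UPDATE SET amount = excluded.amount",
        rows,
    )
    conn.commit()


def count_orders(conn):
    """Return the number of rows currently in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


# Hypothetical batch; the second call simulates a retry of the same run.
conn = sqlite3.connect(":memory:")
batch = [(1, 19.99), (2, 5.00)]
load_orders(conn, batch)
load_orders(conn, batch)
print(count_orders(conn))  # prints 2, not 4
```

Because the write is keyed on `order_id`, a failed run can simply be started again without a cleanup step first.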
My Expertise & Tech Stack:
- Orchestration: Apache Airflow
- Real-Time Streaming: Apache Kafka
- Transformations & Quality: dbt Core (automated testing & data modeling)
- Storage: PostgreSQL, AWS S3, MinIO
- Infrastructure: Docker Compose, Terraform (AWS EC2, RDS)
- Visualization: Metabase integrations
What you can expect:
- Reliability: Pipelines that handle failures gracefully with automated retries.
- Data Quality: Built-in dbt tests (not-null and uniqueness checks) that flag missing or duplicate records before they reach your reports.
- Clean Delivery: Fully containerized code (Docker) with thorough documentation (README.md) for easy deployment on your own servers.
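The reliability and data quality points above can be sketched in plain Python. This is an illustration under stated assumptions, not the delivered code: in a real project Airflow handles retries through the task-level `retries` and `retry_delay` settings, and dbt declares `not_null` and `unique` tests in YAML. The function names and the `flaky_extract` source below are hypothetical.

```python
import time


def run_with_retries(task, retries=3, delay_seconds=0.0):
    """Call task(); on an exception, wait and try again, up to `retries` extra attempts."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the failure
            time.sleep(delay_seconds * (2 ** attempt))  # exponential backoff


def not_null_failures(rows, column):
    """Rows where `column` is missing or None (the idea behind dbt's not_null test)."""
    return [r for r in rows if r.get(column) is None]


def unique_failures(rows, column):
    """Non-null values of `column` that appear more than once (the idea behind dbt's unique test)."""
    seen, dupes = set(), []
    for r in rows:
        value = r.get(column)
        if value is None:
            continue
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes


# Hypothetical source that fails twice before returning data.
attempts = {"count": 0}


def flaky_extract():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary API failure")
    return [{"id": 1, "email": "a@example.com"}, {"id": 1, "email": None}]


rows = run_with_retries(flaky_extract, retries=3)
print(attempts["count"])                 # 3: two failures, then success
print(not_null_failures(rows, "email"))  # the row with the missing email
print(unique_failures(rows, "id"))       # [1]: the duplicated id
```

A pipeline would stop or alert when either check returns a non-empty list, so bad rows are caught before anyone queries them.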
Please message me before placing an order so we can discuss your specific data sources and business requirements!
FAQ
Do you deploy the pipeline to my cloud environment?
Yes! For the Premium package, I provide Terraform scripts (Infrastructure as Code) to automatically provision the necessary AWS resources (EC2, RDS, S3) and deploy the Dockerized pipeline.

