Looking to automate your data workflows? I build scalable, cost-effective ETL pipelines with Python and AWS that turn your raw data into actionable insights.
What I can do for you:
- AWS Glue Jobs: Developing robust ETL scripts using PySpark for large-scale data processing or Python Shell for lightweight integrations.
- Serverless Pipelines: Building event-driven workflows with AWS Lambda and S3 triggers.
- Data Orchestration: Setting up and managing workflows with AWS Step Functions or Glue Workflows.
- Data Loading: Efficiently loading data into Amazon Redshift, S3-based data lakes, or Amazon RDS.
- API Integration: Extracting data from third-party APIs using Python and storing it securely in AWS.
- Optimization: Tuning existing Glue jobs to cut DPU (Data Processing Unit) hours and lower your costs.