surendra_dataen

Surendra R

@surendra_dataen

Data Engineer

India
English, Kannada, Hindi
About me
I'm a Data Engineer with 3+ years of production experience building scalable data pipelines on AWS and Azure, processing 50M+ records daily and cutting pipeline execution time by 40%. I specialize in:
- ETL/ELT pipelines using Python, PySpark & Databricks
- Real-time streaming with Apache Kafka
- Medallion Architecture (Bronze/Silver/Gold) Lakehouses
- REST APIs for pipeline monitoring (Flask, FastAPI)
- CI/CD automation with Docker & Bitbucket Pipelines
I deliver clean, documented, production-ready work, on time.

Skills

See my services

Data ETLs
I will build a Python ETL data pipeline for your business

Portfolio

Work experience

Data Engineer

Petrabytes India Pvt Ltd • Full-time

May 2023 - Present • 3 yrs

At Petrabytes India Pvt Ltd, I work as a Data Engineer and Python Developer, designing and deploying production-grade data systems on AWS and Azure that process 50M+ records daily.

Key contributions:
- Designed scalable batch and real-time data pipelines on AWS, reducing execution time by 40% through PySpark and distributed processing optimizations.
- Built end-to-end ETL/ELT pipelines using Databricks Lakehouse with Medallion Architecture (Bronze/Silver/Gold), enforcing data quality, completeness, and reliability across all data domains.
- Integrated Apache Kafka for real-time event-driven data streaming, enabling low-latency ingestion and reliable message processing across distributed microservices.
- Implemented RESTful APIs in Python, Flask, and FastAPI for pipeline monitoring, health checks, and AWS infrastructure automation via Boto3, improving operational observability.
- Established CI/CD DevOps workflows using Git and Bitbucket Pipelines to containerize and deploy microservices via Docker across dev and production environments using Infrastructure-as-Code practices.
- Contributed to data architecture and Dimensional Modeling (Star Schema) design discussions, collaborating with Product Managers and Application Engineers in Agile/Scrum sprints.
- Monitored live site reliability, resolved performance bottlenecks proactively, and maintained zero-downtime deployments to meet service SLAs.

Tech Stack: Python, PySpark, Apache Spark, Databricks, Apache Kafka, AWS (S3, EMR, Lambda, EC2), Flask, FastAPI, Boto3, Docker, Git, SQL, Medallion Architecture, Star Schema.