I will build a Python ETL script to clean, merge, and consolidate your CSV data

India

I speak English, Japanese, French

1 order completed

Data and Software

I'm a Python data engineer specializing in ETL pipelines, data cleaning, and CSV/Excel consolidation. I turn messy, inconsistent exports from multiple sources into one clean, validated, reporting-ready...

About this Gig

Do you have spreadsheets from different teams, tools, or departments, each with different column names, date formats, duplicate records, and dirty values? Manually cleaning and merging them is slow and error-prone. I'll automate the whole thing in Python and pandas.

What I do

I build a reusable ETL workflow that:

  • Extracts data from all your CSV/Excel files in one run
  • Maps different source column names into one standard schema
  • Cleans & standardizes: trims whitespace, fixes title case, converts all dates to YYYY-MM-DD, strips $/units, and converts amounts & quantities to clean numbers
  • Standardizes categories (e.g. maps status values to one consistent set)
  • Validates records and drops rows missing required fields
  • Removes duplicates so each record appears once
  • Consolidates everything into a single, UTF-8, reporting-ready master file
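
To give a sense of how these steps fit together, here is a minimal sketch in pandas. It is an illustration only, not the delivered script: the column map, status map, required fields, and sample rows are all hypothetical, and it assumes each source file uses one consistent date format.

```python
import io
import pandas as pd

# Hypothetical mapping from each source's column names to one standard schema.
COLUMN_MAP = {
    "Customer Name": "customer", "client": "customer",
    "Order Date": "order_date", "date": "order_date",
    "Amount ($)": "amount", "total": "amount",
    "Status": "status", "state": "status",
}
# Hypothetical status vocabulary: every variant maps to one consistent value.
STATUS_MAP = {"done": "completed", "complete": "completed",
              "completed": "completed", "open": "pending", "pending": "pending"}
REQUIRED = ["customer", "order_date", "amount"]


def clean_frame(df):
    """Rename columns to the standard schema and normalize each field."""
    df = df.rename(columns=COLUMN_MAP)
    # Trim whitespace and apply title case to names.
    df["customer"] = df["customer"].str.strip().str.title()
    # Parse dates per source, then format as YYYY-MM-DD (bad dates become NaN).
    parsed = pd.to_datetime(df["order_date"], errors="coerce")
    df["order_date"] = parsed.dt.strftime("%Y-%m-%d")
    # Strip currency symbols, thousands separators, and units, then convert.
    digits = df["amount"].str.replace(r"[^0-9.\-]", "", regex=True)
    df["amount"] = pd.to_numeric(digits, errors="coerce")
    # Map status variants onto the standard set.
    df["status"] = df["status"].str.strip().str.lower().map(STATUS_MAP)
    return df


def consolidate(sources):
    """Extract every source, clean it, validate, and de-duplicate."""
    frames = [clean_frame(pd.read_csv(src, dtype=str)) for src in sources]
    merged = pd.concat(frames, ignore_index=True)
    merged = merged.dropna(subset=REQUIRED)   # drop rows missing required fields
    return merged.drop_duplicates().reset_index(drop=True)


# Made-up sample exports standing in for two real files.
SOURCE_A = (
    "Customer Name,Order Date,Amount ($),Status\n"
    '  alice smith ,2024-01-05,"$1,200.50",Done\n'
    "bob jones,2024-01-06,$300,open\n"
    ",2024-01-07,$50,open\n"
)
SOURCE_B = (
    "client,date,total,state\n"
    "ALICE SMITH,01/05/2024,1200.50 USD,complete\n"
    "Carol White,03/15/2024,75 usd,pending\n"
)

master = consolidate([io.StringIO(SOURCE_A), io.StringIO(SOURCE_B)])
csv_text = master.to_csv(index=False)  # pass a path and encoding="utf-8" to write a file
```

In this sample, the row with no customer is dropped by validation, and the two differently formatted Alice Smith rows collapse into one after cleaning. With real files, you would pass file paths to `consolidate` instead of the in-memory strings.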

What you get

  • A clean, well-documented Python script you fully own
  • Your consolidated output file (CSV/Excel)
  • A README with install + run instructions
  • Code that's reusable on next month's files, with no rework

Why me

  • Specialist in data engineering & ETL, not a generalist
  • Clean, readable, commented code (no black boxes)
  • Consistent, repeatable results every run
  • Fast replies, on-ti

Technology:

Amazon Redshift

Apache Spark

Excel

MATLAB

Python

Expertise:

Classification

Data extraction

Data flow
