I will do large language model projects

India

I speak Marathi, Hindi, English

Machine Learning, Quantitative Finance, Data

Hi, I'm Aniket! I specialize in Machine Learning, Deep Learning, and Computer Vision, offering expert solutions for complex AI tasks. My expertise includes: Core AI: Training ML models and LLMs from ...

About this Gig

I will train custom language models from scratch or fine-tune open-weight LLMs on your data. I build GPT-style transformer models from scratch in PyTorch, ranging from small 10M-parameter demos up to 50M-parameter models. I also fine-tune existing models such as Llama, Phi-3, and Mistral on your dataset using LoRA/QLoRA.
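To give a feel for what "10M" and "50M" parameters mean in practice, here is a rough sketch that counts the weights of a GPT-style decoder. It assumes a GPT-2-like layout (learned positional embeddings, biased linear layers, a 4x MLP width, tied input/output embeddings); the function name and the two example configurations are hypothetical and are not the exact architectures delivered in this gig.

```python
def gpt_param_count(vocab_size, d_model, n_layers, context_len, tie_embeddings=True):
    """Approximate trainable parameters in a GPT-2-style decoder (assumed layout)."""
    token_emb = vocab_size * d_model            # token embedding table
    pos_emb = context_len * d_model             # learned positional embeddings
    attn = 4 * d_model * d_model + 4 * d_model  # Q, K, V and output projections, with biases
    mlp = 8 * d_model * d_model + 5 * d_model   # two linear layers, hidden width 4 * d_model
    norms = 4 * d_model                         # two LayerNorms per block (scale and shift)
    final_norm = 2 * d_model                    # LayerNorm after the last block
    lm_head = 0 if tie_embeddings else vocab_size * d_model
    return token_emb + pos_emb + n_layers * (attn + mlp + norms) + final_norm + lm_head


# Hypothetical configurations, chosen only to land near the sizes named above.
small = gpt_param_count(vocab_size=8_000, d_model=320, n_layers=6, context_len=256)
large = gpt_param_count(vocab_size=16_000, d_model=512, n_layers=12, context_len=512)
print(f"small: {small / 1e6:.1f}M parameters")
print(f"large: {large / 1e6:.1f}M parameters")
```

With these assumed settings the two configurations come out at roughly 10M and 46M parameters. The actual layer count, width, and vocabulary size depend on your data and budget.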

What you get:

Fully trained model weights and tokenizer tailored to your data
Complete source code with comments for training and inference
Text generation script + setup instructions
Training logs, loss curves, and sample outputs
Full commercial rights

I handle data preprocessing, tokenizer training, model architecture, and the training pipeline. You just provide your text dataset in .txt, .csv, or PDF format, or I can use open-source data from Hugging Face, Kaggle, and other sources.

Important: Models at this scale (10M to 50M parameters) are designed for demos, educational use, and learning the style of your specific data. They demonstrate how LLMs work but will not have broad knowledge like ChatGPT.

Expertise: Feature learning, Predictive analysis, Other

Frameworks: Scikit-learn, Keras, PyTorch, Pandas

Data type: Text

Programming language: Python, SQL, Colab, NoSQL

Tools: Jupyter Notebook, OpenCV, OpenNN, TensorFlow, Excel, Colab, Other

Other Data Science & ML Services I Offer

Machine Learning
Starting at $100

FAQ

What exactly do I receive?

You get: 1) Trained model weights (.safetensors) 2) Custom tokenizer 3) Full Python source code for training and inference 4) requirements.txt and setup guide 5) Training logs with loss/perplexity plots 6) Sample text generations 7) Full commercial rights.
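If you want to read the loss/perplexity plots yourself, the two numbers are directly related: perplexity is the exponential of the average per-token cross-entropy loss. The sketch below assumes the loss is logged in nats (natural log), which is the default for PyTorch's cross-entropy; the function name and the sample loss values are made up for illustration.

```python
import math


def perplexity(token_losses):
    """Perplexity from per-token cross-entropy losses, assumed to be in nats."""
    if not token_losses:
        raise ValueError("need at least one loss value")
    mean_loss = sum(token_losses) / len(token_losses)
    return math.exp(mean_loss)


# Made-up loss values, only to show the call; real values come from the training logs.
example_losses = [3.2, 2.9, 3.1, 3.0]
print(f"mean loss {sum(example_losses) / len(example_losses):.2f} "
      f"-> perplexity {perplexity(example_losses):.1f}")
```

A lower perplexity means the model is less "surprised" by held-out text from your domain. A model that guessed uniformly among 4 tokens at every step would score a perplexity of exactly 4.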

Do you provide the training data?

If you have a custom dataset, you can provide it, and I handle cleaning, formatting, tokenization, and training. Accepted formats: .txt, .csv, .json, or PDF. If you don't, I can use open-source data of your choice from sites like Hugging Face, Kaggle, and others to train your model.

Will my 10M or 50M model be like ChatGPT?

No. Models under 100M parameters are for demos, proof-of-concepts, and learning specific styles/patterns from your data. They will generate text in your domain style but won't have broad knowledge, reasoning, or instruction-following like ChatGPT. For that you need 7B+ models with massive datasets.

How much data do I need to provide?

For 10M models: 10MB-100MB of text. For 50M models: 50MB-500MB of text. More data = better results. 1MB ≈ 200k tokens. If you're unsure, send me your dataset and I'll check if it's sufficient before we start.
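The size guidance above can be turned into a quick back-of-the-envelope check. This sketch simply applies the 1MB ≈ 200k tokens rule of thumb quoted in this answer; the real ratio depends on the language of your text and on the tokenizer, and the function names here are hypothetical.

```python
TOKENS_PER_MB = 200_000  # rule of thumb quoted above, not a measured value


def estimate_tokens(size_mb, tokens_per_mb=TOKENS_PER_MB):
    """Rough token count for a plain-text dataset of the given size in MB."""
    return int(size_mb * tokens_per_mb)


def tokens_per_parameter(size_mb, n_params):
    """How many training tokens the dataset offers per model parameter."""
    return estimate_tokens(size_mb) / n_params


# The two ranges named in the answer above.
for label, n_params, (low_mb, high_mb) in [
    ("10M", 10_000_000, (10, 100)),
    ("50M", 50_000_000, (50, 500)),
]:
    print(f"{label} model: {estimate_tokens(low_mb):,} to {estimate_tokens(high_mb):,} tokens "
          f"({tokens_per_parameter(low_mb, n_params):.1f} to "
          f"{tokens_per_parameter(high_mb, n_params):.1f} tokens per parameter)")
```

Both ranges work out to the same ratio, about 0.2 to 2 tokens per parameter, which is why the recommended data size scales with the model size.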
