Machine Learning & Data Science Tutor

Who This Page Is For

The CS / data science student

You're in an ML or data mining course and the math is moving faster than your linear algebra background. We bridge the gap and make the lectures click.

The career switcher

You're learning ML to break into data science or research. You've been watching tutorials but want a real teacher to fill the gaps and accelerate the learning curve.

The practitioner with a real project

You have data and a deadline. You need someone to think through model choice, debugging, evaluation, and interpretation with you — not just point at docs.

What We Cover

Classical ML

Linear and logistic regression
Decision trees, random forests
Gradient boosting (XGBoost, LightGBM)
SVMs and kernels
k-NN, naive Bayes
k-means, hierarchical clustering
PCA and dimensionality reduction

Deep Learning

Neural network fundamentals
Backpropagation intuition
CNNs for image tasks
RNNs, LSTMs, attention
Transformer architecture
Transfer learning, fine-tuning
PyTorch hands-on

Math Behind ML

Vectors, matrices, projections
Eigenvalues and SVD (PCA, embeddings)
Gradients and the chain rule
Probability and distributions
Bayes' theorem applications
Loss functions and convexity
Gradient descent variants

Evaluation & Workflow

Train / validation / test splits
Cross-validation strategies
Precision, recall, F1, ROC, AUC
Confusion matrices, calibration
Bias-variance tradeoff
Regularization (L1, L2, dropout)
Hyperparameter tuning

Data Science Workflow

Exploratory data analysis
Feature engineering
Handling missing data and outliers
Encoding categoricals
Scaling and normalization
Pipelines and reproducibility
Communicating results

Modern Topics

Large language model (LLM) basics
Embeddings & semantic search
Hugging Face basics
Prompt engineering fundamentals
Retrieval-augmented generation (RAG) concepts
MLOps overview

A 60-Second Sample

How I'd Explain Why Your Model Has 99% Accuracy and Still Sucks

This is the most common "my first ML model" trap. The fix is mostly about thinking, not code.

Scenario: You're predicting whether a credit-card transaction is fraud. You train a model, get 99.8% accuracy on test data, and feel great. Your reviewer says it's worthless. Why?

Check the class balance. Fraud is rare — maybe 0.2% of transactions. A model that always predicts "not fraud" would hit 99.8% accuracy automatically. Accuracy lied to you.
Switch metrics. Look at the confusion matrix. You probably have 0 true positives, meaning you never catch any fraud. Accuracy is the wrong metric here.
Pick the metric that matches the problem. For fraud you care about recall (catch as much fraud as possible) and precision (don't alarm on too many legit charges). Combine them into F1 to score a single threshold, or use ROC-AUC to compare models across all thresholds (with classes this imbalanced, the precision-recall curve is often more informative).
Address the imbalance. Options: class weights (class_weight='balanced' in scikit-learn), oversampling (e.g., SMOTE), undersampling, or threshold tuning. Each has tradeoffs; we'd pick based on which the business cares about more: false positives or false negatives.
Re-evaluate. Your "99.8% accurate" model might now have 70% recall with 60% precision. The headline accuracy barely moves (and with more false alarms it can even drop), but the model is dramatically more useful.
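To make the steps above concrete, here is a minimal sketch in plain Python. The data is made up for illustration: 10,000 transactions with 20 frauds (0.2%), an "always not fraud" model, and a hypothetical second model that catches 14 of the 20 frauds at the cost of 26 false alarms. The helper names (confusion_counts, metrics) are mine, not from any library.

```python
# Toy labels, invented for illustration: 1 = fraud, 0 = legitimate.
# 20 frauds out of 10,000 transactions matches the 0.2% rate in the scenario.
y_true = [1] * 20 + [0] * 9980


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (fraud) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when the model never predicts fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Model A: always predicts "not fraud".
always_legit = [0] * len(y_true)
baseline = metrics(y_true, always_legit)

# Model B (hypothetical): flags 14 of the 20 frauds plus 26 legitimate charges.
flagging_model = [1] * 14 + [0] * 6 + [1] * 26 + [0] * 9954
model_b = metrics(y_true, flagging_model)

for name, m in [("always 'not fraud'", baseline), ("flagging model", model_b)]:
    print(f"{name}: accuracy={m['accuracy']:.4f} "
          f"precision={m['precision']:.2f} recall={m['recall']:.2f} "
          f"f1={m['f1']:.2f} (tp={m['tp']}, fp={m['fp']}, fn={m['fn']})")
```

The two accuracies differ only in the third decimal place, while recall goes from zero to 70%. That gap is the whole argument for reading the confusion matrix before the headline number.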

The real lesson: The hardest skill in ML isn't fitting models, it's choosing the metric that matches reality. The best ML engineers spend more time defining "good" than they spend training.

Where Students Usually Get Stuck

"My model trains but I have no idea if it's good"

We build a workflow: split data correctly, pick a baseline, choose metrics that match the problem, interpret the numbers. After one session, evaluation stops being voodoo.
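As a rough sketch of the first two steps of that workflow (split correctly, pick a baseline), here is one way to do a stratified split and a majority-class baseline in plain Python. The function names, the 95/5 class mix, and the 20% test fraction are assumptions for illustration; in a real project we would more likely reach for a library splitter.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split at about test_fraction.

    Note: a class with a single example ends up entirely in the test set here.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within the class before cutting
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def majority_baseline(train_labels):
    """Return the most common training label: the 'do nothing clever' model."""
    return Counter(train_labels).most_common(1)[0][0]


# Made-up labels: 950 negatives and 50 positives.
labels = [0] * 950 + [1] * 50
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)

train_labels = [labels[i] for i in train_idx]
test_labels = [labels[i] for i in test_idx]

baseline_label = majority_baseline(train_labels)
baseline_accuracy = (
    sum(1 for y in test_labels if y == baseline_label) / len(test_labels)
)
print(f"test size={len(test_labels)}, positives in test={sum(test_labels)}")
print(f"majority baseline predicts {baseline_label}, "
      f"accuracy={baseline_accuracy:.2f}")
```

Any model we train afterward has to beat that baseline on a metric that matters for the problem. Otherwise it has not learned anything useful.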

"The math in the textbook is unreadable"

I translate. Every formula has a picture, every picture has a story, and you walk away knowing what each symbol is doing.

"My loss isn't going down"

Most "broken training" comes from a handful of common issues: a poorly chosen learning rate, the wrong loss function, exploding gradients, leakage between train and test, and label bugs. We diagnose with a checklist instead of guessing.
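Two items from that checklist are easy to show in a few lines. The sketch below is a toy, not a real training loop: it runs plain gradient descent on loss(w) = w**2 to show how a learning rate that is too large makes the loss grow instead of shrink, and it includes a simple exact-duplicate check for rows shared between train and test. The function names and sample values are hypothetical.

```python
def run_gradient_descent(learning_rate, steps=20, w0=1.0):
    """Minimize loss(w) = w**2 with plain gradient descent.

    Returns the loss before training and after every step.
    """
    w = w0
    losses = [w ** 2]
    for _ in range(steps):
        grad = 2 * w               # derivative of w**2
        w -= learning_rate * grad  # each step multiplies w by (1 - 2 * lr)
        losses.append(w ** 2)
    return losses


def count_leaked_rows(train_rows, test_rows):
    """Count test rows that also appear verbatim in the training set."""
    seen = set(train_rows)
    return sum(1 for row in test_rows if row in seen)


# Check 1: learning rate. With lr=0.1 the loss shrinks every step;
# with lr=1.1 each step overshoots the minimum and the loss grows.
good = run_gradient_descent(learning_rate=0.1)
bad = run_gradient_descent(learning_rate=1.1)
print(f"lr=0.1: loss {good[0]:.3f} -> {good[-1]:.6f}")
print(f"lr=1.1: loss {bad[0]:.3f} -> {bad[-1]:.1f}")

# Check 2: leakage. Rows are tuples so they can be hashed; values are made up.
train_rows = [(1, 2), (3, 4), (5, 6)]
test_rows = [(3, 4), (7, 8)]
print(f"rows shared by train and test: "
      f"{count_leaked_rows(train_rows, test_rows)}")
```

In a real model the same symptoms show up as a loss curve that climbs or turns into NaN, or as a test score that looks too good to be true.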

"I don't know how to pick a model"

We make a flowchart: data size, feature types, interpretability needs → model family. After we run the flowchart on a few projects, you'll do it automatically.
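A flowchart like that can be written down as a small function. The version below is purely illustrative: the branch order, the 1,000-row cutoff, and the function name are assumptions I made up for this sketch, not fixed rules. The actual flowchart depends on your project.

```python
def suggest_model_family(n_rows, feature_type, needs_interpretability):
    """Map rough project facts to a starting model family.

    feature_type is one of 'tabular', 'image', or 'text'.
    Every rule and threshold here is a hypothetical starting point.
    """
    if feature_type not in {"tabular", "image", "text"}:
        raise ValueError(f"unknown feature_type: {feature_type!r}")

    if feature_type == "image":
        return "CNN, usually starting from a pretrained model"
    if feature_type == "text":
        return "pretrained transformer, fine-tuned"

    # Tabular data from here on.
    if needs_interpretability:
        return "linear or logistic regression, or a small decision tree"
    if n_rows < 1000:  # illustrative cutoff for "small data"
        return "regularized linear model or random forest"
    return "gradient boosting (XGBoost, LightGBM)"


print(suggest_model_family(500, "tabular", needs_interpretability=False))
print(suggest_model_family(200_000, "tabular", needs_interpretability=False))
print(suggest_model_family(200_000, "tabular", needs_interpretability=True))
print(suggest_model_family(5_000, "image", needs_interpretability=False))
```

The value is less in the specific answers than in being forced to state the data size, feature types, and interpretability needs before training anything.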

Frequently Asked Questions

What ML topics can you tutor?

Supervised and unsupervised learning, linear and logistic regression, decision trees, random forests, gradient boosting, SVMs, k-means, PCA, neural networks, CNNs, RNNs, transformer basics, model evaluation, regularization, and the math behind all of it.

Do I need to know calculus and linear algebra first?

No — we cover the math you need as we go, at exactly the depth required. If you want to go deeper on linear algebra or calculus for ML, we can do that too.

Can you help with my Coursera / Andrew Ng / fast.ai course?

Yes. We'll work through assignments together and fill in the gaps between what the course assumes and what you actually know.

Can you help me build a portfolio project?

Absolutely. We can scope a project, pick a dataset, work through model selection and evaluation, and write it up so it's portfolio-ready.

What about PyTorch and deep learning?

Yes — PyTorch is a core tool I work with. We cover tensors, autograd, building models, training loops, debugging weird losses, and modern architectures from MLPs to transformers.