Your First Experiment

This guide walks you through training your first machine learning model with KLearn, from uploading data to getting predictions.

Prerequisites

Before starting, ensure you have:

KLearn installed and running (Installation Guide)
Access to the KLearn dashboard at http://localhost:3000
A CSV dataset ready (we'll use the Iris dataset as an example)

Step 1: Prepare Your Data

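For this tutorial, we'll use the classic Iris dataset. You can download the full 150-row dataset or create a small sample in the same format. The results shown later in this guide assume the full dataset.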

sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica

Save this as iris.csv.
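If you prefer to generate the sample file from a script, the following is a minimal sketch using only the Python standard library. The function name write_sample is illustrative and not part of KLearn; the rows are the six sample rows listed above.

```python
import csv
from pathlib import Path

# Column names and rows copied from the sample shown in this step.
HEADER = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
ROWS = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]


def write_sample(path="iris.csv"):
    """Write the header and sample rows to `path`; return the number of data rows."""
    with Path(path).open("w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(HEADER)
        writer.writerows(ROWS)
    return len(ROWS)
```

Calling write_sample() with no arguments creates iris.csv in the current directory.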

Step 2: Upload Dataset

Using the Dashboard

Navigate to Datasets in the sidebar
Click Upload Dataset
Drag and drop your iris.csv file
Enter a name: iris-classification
Click Upload

Using the API

curl -X POST http://localhost:8000/api/v1/datasets \
  -F "file=@iris.csv" \
  -F "name=iris-classification"

After upload, you'll see:

Row count: 150 (for the full Iris dataset; the six-row sample from Step 1 shows 6)
Column count: 5
Column types detected automatically
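To cross-check the numbers the dashboard reports, you can compute the same summary locally. The sketch below is a naive heuristic written for this guide; it is not how KLearn detects column types, and the names summarize_csv and infer_type are hypothetical.

```python
import csv
import io


def infer_type(values):
    """Return 'numeric' if every non-empty value parses as a float, else 'categorical'."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    try:
        for value in non_empty:
            float(value)
    except ValueError:
        return "categorical"
    return "numeric"


def summarize_csv(text):
    """Return (row_count, column_count, {column: inferred_type}) for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = reader.fieldnames or []
    # A short row yields None for its missing cells, so treat None as empty.
    types = {c: infer_type([row[c] or "" for row in rows]) for c in columns}
    return len(rows), len(columns), types
```

For example, summarize_csv(open("iris.csv").read()) should report five columns, with species as the only categorical one.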

Step 3: Create an Experiment

Using the Dashboard

Navigate to Experiments in the sidebar
Click New Experiment
Fill in the form:
- Name: iris-species-classifier
- Dataset: Select iris-classification
- Task Type: Classification
- Target Column: species
- Time Budget: 120 seconds (for quick testing)
Click Start Training

Using the API

The example below assumes the uploaded dataset received ID 1; substitute the ID of your dataset if it differs.

curl -X POST http://localhost:8000/api/v1/experiments \
  -H "Content-Type: application/json" \
  -d '{
    "name": "iris-species-classifier",
    "dataset_id": 1,
    "task_type": "classification",
    "target_column": "species",
    "time_budget": 120
  }'
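The same request can be built from Python. This is a sketch using only the standard library; the URL and field names are taken from the curl example above, while the helper names and the shape of the response are assumptions, since this guide does not document them.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000/api/v1"  # base URL used by the curl examples


def build_experiment_request(name, dataset_id, target_column, time_budget=120,
                             task_type="classification", base_url=API_BASE):
    """Build (but do not send) the POST request that creates an experiment."""
    if time_budget <= 0:
        raise ValueError("time_budget must be a positive number of seconds")
    payload = {
        "name": name,
        "dataset_id": dataset_id,
        "task_type": task_type,
        "target_column": target_column,
        "time_budget": time_budget,
    }
    return urllib.request.Request(
        f"{base_url}/experiments",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON body (its shape is not documented here)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes the payload easy to inspect before anything reaches the server, for example send(build_experiment_request("iris-species-classifier", 1, "species")).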

Step 4: Monitor Training

Once training starts, you can monitor progress:

Real-time Dashboard

The experiment detail page shows:

Progress bar: Time elapsed vs. budget
Current trial: Which model is being tested
Best score: Current best accuracy
Live logs: Real-time training output

Using kubectl

# Watch the KLearnJob
kubectl get klearnjob -n klearn -w

# Check pod logs (replace xxxxx with the suffix of the job name shown by the previous command)
kubectl logs -n klearn -l job-name=iris-species-classifier-xxxxx -f
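If you want a script to wait for training instead of watching the dashboard, a generic polling loop is enough. The sketch below deliberately takes the status lookup as a function argument, because this guide does not document a status endpoint; the state names "completed" and "failed" are placeholders, not confirmed KLearn values.

```python
import time


def wait_until_done(fetch_status, done_states=("completed", "failed"),
                    timeout=300.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() every `interval` seconds until it returns a state
    listed in done_states; raise TimeoutError once `timeout` would be exceeded."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in done_states:
            return status
        # Stop if another sleep would take us past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)
```

With a 120-second time budget, a timeout somewhat above the budget leaves room for pod startup before the loop gives up.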

What FLAML does during training

Data preprocessing: Handles missing values, encodes categories
Model selection: Tests RandomForest, XGBoost, LightGBM, etc.
Hyperparameter tuning: Optimizes each model's parameters
Cross-validation: Estimates how well each model generalizes to unseen data
Best model selection: Picks the highest-scoring model
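The last two steps, cross-validation and best model selection, can be illustrated in a few lines. The sketch below is not FLAML's implementation and does not use its search strategy or its model families; it swaps in two toy classifiers (a majority-class baseline and a nearest-centroid rule) purely to show how candidates are scored on held-out folds and the best one is kept.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k folds of near-equal size."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def majority_class(train_X, train_y):
    """Baseline: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return lambda x: label


def nearest_centroid(train_X, train_y):
    """Predict the label whose mean feature vector is closest to x."""
    counts, sums = Counter(train_y), {}
    for x, y in zip(train_X, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return predict


def cv_accuracy(fit, X, y, k=3, seed=0):
    """Mean held-out accuracy of `fit` across k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(sum(predict(X[i]) == y[i] for i in held_out) / len(held_out))
    return sum(scores) / len(scores)


def select_best(candidates, X, y, k=3):
    """Return (name, score) of the candidate with the best cross-validated accuracy."""
    scores = {name: cv_accuracy(fit, X, y, k) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Each candidate is trained only on the folds it is not scored on, which is why the resulting accuracy is a fairer estimate than accuracy on the training rows.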

Step 5: Review Results

When training completes (usually in 1-2 minutes for Iris):

Metrics Tab

Best Score: ~0.97 accuracy
Best Model: Usually RandomForest or LightGBM
Training Time: Actual time taken
Total Trials: Number of models tested

Configuration Tab

Task type and target column
Time budget used
Kubernetes job name (useful for debugging)

Logs Tab

Complete training logs
FLAML output with trial details
Any warnings or errors

Step 6: Deploy the Model

Using the Dashboard

From the experiment page, click Deploy Model (or navigate to Models, find your model, and click its Deploy button)
Choose deployment type:
- KLearn: Lightweight, good for testing
- KServe: Production-grade with autoscaling
Set replicas (default: 2)
Click Deploy

Using the API

# Get the model name
curl http://localhost:8000/api/v1/models

# Deploy it
curl -X POST http://localhost:8000/api/v1/models/iris-species-classifier-model/deploy \
  -H "Content-Type: application/json" \
  -d '{
    "deployment_type": "klearn",
    "replicas": 1
  }'
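When scripting deployments, it can help to validate the options before sending them. The sketch below builds the URL and body used by the curl example; "klearn" comes from that example, whereas "kserve" is an assumed API spelling of the KServe option, and both helper names are hypothetical.

```python
from urllib.parse import quote

API_BASE = "http://localhost:8000/api/v1"  # base URL used by the curl examples
# "klearn" appears in the curl example; "kserve" is an assumption for the KServe option.
DEPLOYMENT_TYPES = ("klearn", "kserve")


def deploy_url(model_name, base_url=API_BASE):
    """Return the deploy endpoint for a model, percent-encoding the model name."""
    return f"{base_url}/models/{quote(model_name, safe='')}/deploy"


def build_deploy_payload(deployment_type="klearn", replicas=1):
    """Validate the deployment options and return the JSON-serializable body."""
    if deployment_type not in DEPLOYMENT_TYPES:
        raise ValueError(f"deployment_type must be one of {DEPLOYMENT_TYPES}")
    if not isinstance(replicas, int) or replicas < 1:
        raise ValueError("replicas must be a positive integer")
    return {"deployment_type": deployment_type, "replicas": replicas}
```

Percent-encoding the model name keeps the path valid even if a name contains spaces or slashes.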

Step 7: Make Predictions

Once deployed, your model is ready for predictions:

Find the endpoint

# Check deployment status
curl http://localhost:8000/api/v1/deployments

# Get the endpoint URL (the request below assumes it is reachable at http://localhost:8080)
kubectl get httproute -n klearn

Send a prediction request

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [
      {
        "sepal_length": 5.1,
        "sepal_width": 3.5,
        "petal_length": 1.4,
        "petal_width": 0.2
      }
    ]
  }'

Response

{
  "predictions": ["setosa"],
  "probabilities": [[0.98, 0.01, 0.01]]
}
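
A client usually wants the predicted label together with its confidence. The sketch below builds the request body and reads a response of the shape shown above. The order of the probability columns is an assumption (alphabetical class names); the response itself does not state it, so verify it against your deployment before relying on argmax_label.

```python
import json

# Assumed order of the probability columns; not confirmed by the response format.
CLASS_NAMES = ["setosa", "versicolor", "virginica"]


def build_predict_payload(rows):
    """Wrap feature dictionaries in the {"instances": [...]} envelope."""
    return json.dumps({"instances": list(rows)})


def summarize(response_text, class_names=CLASS_NAMES):
    """Pair each predicted label with its highest class probability."""
    body = json.loads(response_text)
    results = []
    for label, probs in zip(body["predictions"], body["probabilities"]):
        if len(probs) != len(class_names):
            raise ValueError("probability vector does not match the class list")
        top = max(range(len(probs)), key=probs.__getitem__)
        results.append({
            "label": label,
            "confidence": probs[top],
            "argmax_label": class_names[top],
        })
    return results
```

If label and argmax_label ever disagree, the assumed class order is wrong for your model.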

Troubleshooting

Training stuck at "Pending"

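A job that stays in "Pending" usually means its pod cannot be scheduled or the trainer image cannot be pulled. Inspect the pod events to find out which: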

kubectl describe pod -n klearn -l klearn.dev/job-name=iris-species-classifier

Low accuracy

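Increase the time budget so more models and hyperparameters can be tried
Check for data quality issues such as missing values or mislabeled rows
Ensure the target column is the column you want to predict (species in this guide)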
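
Two of these checks are easy to run locally before retraining. The sketch below counts rows per class and missing cells per column; the function name quality_report is hypothetical, and the check is a simple heuristic rather than anything KLearn runs itself.

```python
import csv
import io
from collections import Counter


def quality_report(text, target):
    """Return ({label: row_count}, {column: missing_cell_count}) for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    if target not in (reader.fieldnames or []):
        raise KeyError(f"target column {target!r} not found in header")
    class_counts, missing = Counter(), Counter()
    for row in reader:
        for column, value in row.items():
            if column is None:  # extra cells beyond the header; ignore them
                continue
            if value is None or not value.strip():
                missing[column] += 1
        label = (row[target] or "").strip()
        if label:
            class_counts[label] += 1
    return dict(class_counts), dict(missing)
```

A class with very few rows, or a column with many missing cells, is a likely cause of a low score.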

Deployment not starting

Inspect the deployment's events, then read the logs of its pods:

kubectl describe deployment -n klearn iris-species-classifier-serving
kubectl logs -n klearn -l app=iris-species-classifier-serving

Next Steps

Congratulations! 🎉 You've trained and deployed your first model with KLearn.

Deploying Models - Advanced deployment options
Using the Chat Interface - Natural language model building
API Reference - Complete API documentation