Enhance task scheduler #13

Open
@jserv

Description

Executive Summary

In conjunction with pull request #6, this proposal outlines a comprehensive enhancement of Linmo's task scheduler to address current scalability limitations and prepare for multi-hart RISC-V systems. The enhancement transforms the current O(n) linear-search scheduler into an O(1) bitmap-based system while preserving Linmo's weighted round-robin fairness model and adding fairness guarantees suitable for SMP operation.

Key Objectives:

  • Performance: O(n) → O(1) scheduling complexity
  • Fairness: Multi-dimensional fairness across priority, time, and load
  • Scalability: Single-hart → Multi-hart ready architecture
  • Compatibility: 100% API preservation with existing applications

Current State Analysis

Existing Scheduler Limitations

1. Algorithmic Complexity Issues

/* Current O(n) scheduler search - scales poorly */
static list_node_t *find_next_ready_task(void)
{
    list_node_t *node = kcb->task_current;
    int itcnt = 0;

    while (itcnt++ < SCHED_IMAX) { /* Up to 500 iterations! */
        node = list_cnext(kcb->tasks, node);
        tcb_t *task = node->data;
        if (task->state == TASK_READY && (task->prio & 0xFF) == 0) {
            return node; /* Found after O(n) search */
        }
    }
    return NULL; /* Gave up after SCHED_IMAX iterations */
}

Problems:

  • O(n) Complexity: Search time grows linearly with task count
  • Worst-Case Bound: SCHED_IMAX = 500 iterations maximum
  • Cache Misses: Scattered task nodes hurt memory performance
  • Unpredictable Latency: Scheduling time varies with system load

2. Memory Layout Inefficiencies

/* Current single mixed list - poor cache locality */
typedef struct {
    list_t *tasks;              /* All tasks mixed regardless of priority */
    list_node_t *task_current;  /* Linear search starting point */
    list_node_t *last_ready_hint; /* Band-aid for performance */
} kcb_t;

Issues:

  • Poor Locality: Tasks of different priorities mixed in single list
  • Cache Thrashing: Frequent cache misses during priority searches
  • No Priority Ordering: High-priority tasks buried among low-priority ones

3. Fairness Limitations

/* Current fairness is rudimentary */
typedef struct {
    uint16_t prio;    /* Only basic priority counter */
    uint16_t delay;   /* Simple delay mechanism */
    /* No starvation prevention */
    /* No fairness tracking */
    /* No load balancing support */
} tcb_t;

Deficiencies:

  • No Starvation Prevention: Low-priority tasks can be starved indefinitely
  • Limited Fairness Metrics: No tracking of actual vs. expected CPU time
  • No Load Awareness: No consideration of system-wide load distribution

4. Single-Hart Architecture Constraints

  • No SMP Support: Cannot utilize multiple RISC-V harts
  • No Load Balancing: No mechanism for distributing work across harts
  • Global Bottlenecks: Single scheduler becomes bottleneck

Enhancement Objectives & Requirements

Primary Objectives

  1. O(1) Scheduling Complexity

    • Constant-time task selection regardless of system size
    • Bounded worst-case scheduling latency < 10 μs
    • Eliminate SCHED_IMAX iteration limits
  2. Comprehensive Fairness

    • Priority Fairness: Strict priority ordering preservation
    • Temporal Fairness: Proportional CPU time allocation
    • Load Fairness: Even distribution across harts
    • Starvation Prevention: Guaranteed progress for all tasks
  3. Multi-Hart Scalability

    • Support for 2-8 RISC-V harts initially
    • Per-hart local scheduling with global coordination
    • Load balancing and work migration capabilities
    • NUMA-awareness preparation
  4. Backward Compatibility

    • 100% API preservation for existing applications
    • Configurable enhancement levels
    • Graceful fallback mechanisms

Technical Design Architecture

1. O(1) Bitmap-Based Priority Selection

Enhanced Priority Queue Design

#define PRIO_LEVELS 8           /* Optimized for RISC-V cache lines */
#define MAX_HARTS 8             /* Support up to 8 RISC-V harts */

/* Per-hart scheduler state for cache locality */
typedef struct {
    /* O(1) Priority Selection Infrastructure */
    volatile uint32_t ready_bitmap;         /* 8-bit priority bitmap */
    list_t *ready_queues[PRIO_LEVELS];      /* Separate queue per priority */
    uint16_t queue_counts[PRIO_LEVELS];     /* O(1) size tracking */
    
    /* Weighted Round-Robin State per Priority Level */
    list_node_t *rr_cursors[PRIO_LEVELS];   /* Round-robin position */
    uint32_t quantum_cycles[PRIO_LEVELS];   /* Scheduling cycles per level */
    
    /* Performance Optimization */
    uint8_t last_selected_prio;             /* Cache hint for next selection */
    uint32_t local_switches;                /* Context switch count */
    
    /* Hart-Specific Data */
    tcb_t *current_task;                    /* Currently running task */
    uint8_t hart_id;                        /* RISC-V hart identifier */
    
} hart_sched_t;

/* Global multi-hart coordination */
typedef struct {
    hart_sched_t per_hart[MAX_HARTS];       /* Per-hart local state */
    
    /* Global Priority Coordination */
    volatile uint32_t global_ready_bitmap;   /* All harts combined */
    volatile uint32_t hart_active_mask;      /* Which harts are online */
    
    /* Fairness & Load Balancing */
    fairness_engine_t fairness_engine;      /* Global fairness coordinator */
    load_balancer_t load_balancer;          /* Multi-hart load balancer */
    
    /* Performance Metrics */
    uint64_t total_context_switches;        /* System-wide switches */
    uint64_t total_migrations;              /* Inter-hart migrations */
    
} multi_hart_sched_t;

RISC-V Optimized Bit Operations

/* RISC-V optimized priority finding using De Bruijn sequence */
static const uint8_t debruijn_lut[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* O(1) priority selection optimized for RISC-V */
static inline uint8_t find_highest_ready_priority_rv(uint32_t bitmap)
{
    if (unlikely(bitmap == 0))
        return IDLE_PRIORITY_LEVEL;
    
    /* Isolate rightmost set bit (highest priority) */
    uint32_t isolated = bitmap & (-bitmap);
    
    /* De Bruijn multiplication for O(1) bit position finding */
    uint32_t hash = (isolated * 0x077CB531U) >> 27;
    
    return debruijn_lut[hash & 0x1F];
}
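
As a worked example: for bitmap = 0x28 (priorities 3 and 5 ready), bitmap & (-bitmap) isolates bit 3 (0x08), the De Bruijn multiply-shift hashes it to LUT index 7, and debruijn_lut[7] = 3 — the numerically lowest, and therefore highest-priority, ready level.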

/* Atomic bitmap operations for multi-hart safety */
static inline void mark_priority_ready_atomic(hart_sched_t *hart, uint8_t prio)
{
    /* Hart-local update */
    hart->ready_bitmap |= (1U << prio);
    
    /* Global coordination */
    __atomic_or_fetch(&multi_sched.global_ready_bitmap, 
                      (1U << prio), __ATOMIC_RELAXED);
}

static inline void mark_priority_empty_atomic(hart_sched_t *hart, uint8_t prio)
{
    /* Hart-local update */
    hart->ready_bitmap &= ~(1U << prio);
    
    /* Check if priority is empty across all harts */
    uint32_t global_prio_mask = 0;
    for (int h = 0; h < MAX_HARTS; h++) {
        if (multi_sched.per_hart[h].ready_bitmap & (1U << prio)) {
            global_prio_mask |= (1U << prio);
            break;
        }
    }
    
    if (!global_prio_mask) {
        __atomic_and_fetch(&multi_sched.global_ready_bitmap,
                          ~(1U << prio), __ATOMIC_RELAXED);
    }
}
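
For symmetry, below is a minimal sketch of the corresponding enqueue path, which must keep the queue, its count, and both bitmaps consistent. list_pushback() is an assumed list helper, and priority_to_queue_index() is the hypothetical 16-bit-to-8-level mapping sketched under Phase 1 below.

/* Sketch: make a task ready on a given hart */
static void enqueue_ready_task_local(hart_sched_t *hart, tcb_t *task)
{
    uint8_t prio = priority_to_queue_index(task->prio); /* Assumed mapping */

    task->state = TASK_READY;
    list_pushback(hart->ready_queues[prio], task); /* Assumed list helper */
    hart->queue_counts[prio]++;

    /* Publish readiness only after the queue update, so no other hart
     * can observe a set bitmap bit while the queue is still empty */
    mark_priority_ready_atomic(hart, prio);
}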

2. Comprehensive Fairness Framework

Multi-Dimensional Fairness Tracking

/* Comprehensive fairness metrics per task */
typedef struct {
    /* Temporal Fairness */
    uint64_t total_runtime_us;          /* Actual CPU time consumed */
    uint64_t expected_runtime_us;       /* Fair share based on priority */
    int32_t fairness_deficit_us;        /* Expected - Actual (signed) */
    uint64_t last_start_time;           /* When current execution started */
    
    /* Starvation Prevention */
    uint32_t wait_cycles;               /* Consecutive scheduling cycles waiting */
    uint8_t starvation_level;           /* Escalating boost level (0-7) */
    uint32_t boost_deadline;            /* When to remove temporary boost */
    
    /* Multi-Hart Fairness */
    uint16_t hart_affinity_mask;        /* Preferred hart assignment */
    uint8_t last_hart;                  /* Last hart where task executed */
    uint16_t migration_count;           /* Number of inter-hart moves */
    uint32_t cache_warmth_score;        /* Cache locality metric */
    
    /* Load Distribution */
    uint32_t load_weight;               /* Task's contribution to hart load */
    uint64_t total_wait_time_us;        /* Total time spent waiting */
    
} task_fairness_t;

/* Hart load metrics for balancing decisions */
typedef struct {
    uint32_t total_tasks;               /* Tasks assigned to this hart */
    uint32_t ready_tasks;               /* Currently ready tasks */
    uint64_t weighted_load;             /* Priority-weighted load value */
    uint32_t utilization_percent;       /* Recent CPU utilization */
    
    /* Migration Tracking */
    uint32_t migrations_in;             /* Tasks migrated to this hart */
    uint32_t migrations_out;            /* Tasks migrated from this hart */
    uint32_t migration_cost_total;      /* Cumulative migration overhead */
    
    /* Performance Metrics */
    uint64_t idle_time_us;              /* Time spent idle */
    uint32_t context_switches;          /* Local context switches */
    
} hart_load_t;

Proportional Fair Scheduling Algorithm

/* Calculate task's fair share of CPU time */
static uint64_t calculate_fair_share_us(tcb_t *task, uint64_t period_us)
{
    /* Convert priority to weight using exponential mapping */
    uint8_t prio_level = priority_to_queue_index(task->prio);
    uint32_t task_weight = 1U << (PRIO_LEVELS - 1 - prio_level);
    
    /* Calculate system-wide total weight */
    uint32_t total_weight = 0;
    for (int h = 0; h < MAX_HARTS; h++) {
        hart_sched_t *hart = &multi_sched.per_hart[h];
        for (int p = 0; p < PRIO_LEVELS - 1; p++) { /* Exclude idle level */
            uint32_t tasks_at_level = hart->queue_counts[p];
            uint32_t level_weight = 1U << (PRIO_LEVELS - 1 - p);
            total_weight += tasks_at_level * level_weight;
        }
    }
    
    if (total_weight == 0)
        return 0;
    
    /* Proportional fair share calculation */
    return (period_us * task_weight) / total_weight;
}
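
As a worked example: one task at level 0 (weight 1 << 7 = 128) plus two tasks at level 4 (weight 8 each) gives total_weight = 144, so over a 10,000 μs period the level-0 task's fair share is 10000 × 128 / 144 ≈ 8889 μs, while each level-4 task receives ≈ 556 μs.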

/* Update fairness metrics after task execution */
static void update_fairness_metrics(tcb_t *task, uint64_t runtime_us)
{
    task_fairness_t *fairness = &task->fairness;
    
    /* Update actual runtime */
    fairness->total_runtime_us += runtime_us;
    
    /* Calculate expected runtime based on fair share */
    static uint64_t last_fairness_update = 0;
    uint64_t current_time = _read_us();
    
    if (last_fairness_update > 0) {
        uint64_t period = current_time - last_fairness_update;
        uint64_t fair_share = calculate_fair_share_us(task, period);
        
        fairness->expected_runtime_us += fair_share;
        
        /* Update fairness deficit (negative = behind, positive = ahead) */
        fairness->fairness_deficit_us = 
            (int32_t)(fairness->expected_runtime_us - fairness->total_runtime_us);
    }
    
    /* Reset wait cycles since task just ran */
    fairness->wait_cycles = 0;
    
    /* Global fairness update every 100ms */
    if (current_time - last_fairness_update > 100000) {
        last_fairness_update = current_time;
        global_fairness_reconciliation();
    }
}

Anti-Starvation Mechanisms

/* Starvation detection and prevention */
#define STARVATION_THRESHOLD_CYCLES 50    /* Cycles before boost */
#define MAX_STARVATION_LEVEL 7           /* Maximum boost level */

static void check_and_prevent_starvation(hart_sched_t *hart)
{
    for (int prio = PRIO_LEVELS - 2; prio >= 0; prio--) { /* Skip idle */
        list_t *queue = hart->ready_queues[prio];
        if (!queue) continue;
        
        list_node_t *node = queue->head->next;
        while (node != queue->tail) {
            tcb_t *task = (tcb_t *)node->data;
            node = node->next;

            if (!task || task->state != TASK_READY)
                continue;

            task_fairness_t *fairness = &task->fairness;
            
            /* Increment wait cycles */
            fairness->wait_cycles++;
            
            /* Check for starvation condition */
            if (fairness->wait_cycles > STARVATION_THRESHOLD_CYCLES) {
                apply_starvation_boost(task, hart, prio);
            }
            /* Check for severe fairness deficit */
            else if (fairness->fairness_deficit_us < -10000) { /* 10ms behind */
                fairness->starvation_level = 
                    MIN(fairness->starvation_level + 1, MAX_STARVATION_LEVEL);
            }
            /* Reset boost if fairness restored */
            else if (fairness->fairness_deficit_us > -1000) { /* Within 1ms */
                fairness->starvation_level = 0;
            }
        }
    }
}

/* Apply temporary priority boost to starved task */
static void apply_starvation_boost(tcb_t *task, hart_sched_t *hart, uint8_t current_prio)
{
    if (current_prio == 0) /* Already highest priority */
        return;
    
    task_fairness_t *fairness = &task->fairness;
    uint8_t boost_prio = MAX(0, current_prio - 2); /* Boost by 2 levels */
    
    printf("FAIRNESS: Boosting starved task %u from prio %u to %u (deficit: %d μs)\n",
           task->id, current_prio, boost_prio, fairness->fairness_deficit_us);
    
    /* Move task to higher priority queue */
    migrate_task_to_priority_level(task, hart, current_prio, boost_prio);
    
    /* Set boost removal deadline */
    fairness->boost_deadline = multi_sched.global_ticks + 
                              (F_TIMER * fairness->starvation_level); /* More time for higher starvation */
    fairness->wait_cycles = 0;
    fairness->starvation_level++;
    
    /* Mark task for boost removal after execution */
    task->flags |= TASK_FLAG_PRIORITY_BOOSTED;
}
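
The TASK_FLAG_PRIORITY_BOOSTED flag set above needs a matching removal path. A hedged sketch follows; orig_prio_level is an assumed extra task_fairness_t field that would record the pre-boost level at boost time.

/* Sketch: undo a starvation boost once its deadline has passed */
static void maybe_remove_starvation_boost(tcb_t *task, hart_sched_t *hart,
                                          uint8_t current_prio)
{
    task_fairness_t *fairness = &task->fairness;

    if (!(task->flags & TASK_FLAG_PRIORITY_BOOSTED))
        return;

    if (multi_sched.global_ticks < fairness->boost_deadline)
        return; /* Boost still active - let it run its course */

    /* Return the task to its pre-boost level (assumed field) */
    migrate_task_to_priority_level(task, hart, current_prio,
                                   fairness->orig_prio_level);
    fairness->starvation_level = 0;
    task->flags &= ~TASK_FLAG_PRIORITY_BOOSTED;
}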

3. Multi-Hart Architecture Design

Hart-Local Scheduling with Global Coordination

/* Per-hart O(1) scheduler */
static tcb_t *select_next_task_local(uint8_t hart_id)
{
    hart_sched_t *hart = &multi_sched.per_hart[hart_id];
    
    /* O(1) priority selection using local bitmap */
    uint8_t highest_prio = find_highest_ready_priority_rv(hart->ready_bitmap);
    
    if (highest_prio == IDLE_PRIORITY_LEVEL)
        return get_idle_task(hart_id);
    
    /* Select from highest priority queue using weighted round-robin */
    tcb_t *selected = select_from_priority_queue_wrr(hart, highest_prio);
    
    if (!selected) {
        /* Queue became empty, update bitmaps */
        mark_priority_empty_atomic(hart, highest_prio);
        return select_next_task_local(hart_id); /* Retry */
    }
    
    return selected;
}

/* Weighted round-robin within priority queue */
static tcb_t *select_from_priority_queue_wrr(hart_sched_t *hart, uint8_t prio)
{
    list_t *queue = hart->ready_queues[prio];
    if (!queue || queue->size == 0)
        return NULL;
    
    list_node_t *cursor = hart->rr_cursors[prio];
    if (!cursor || cursor == queue->tail)
        cursor = queue->head->next; /* Start from beginning */
    
    /* Linmo's weighted round-robin: find task with counter = 0 */
    list_node_t *start = cursor;
    do {
        if (cursor == queue->tail) {
            cursor = queue->head->next;
            continue;
        }
        
        tcb_t *task = (tcb_t *)cursor->data;
        if (!task || task->state != TASK_READY) {
            cursor = cursor->next;
            continue;
        }
        
        /* Starvation boost: right-shifting the counter by the boost level
         * makes starved tasks reach zero (and get selected) sooner */
        uint8_t effective_counter = (task->prio & 0xFF) >> task->fairness.starvation_level;
        
        if (effective_counter == 0) {
            /* Task ready to run - reload counter and select */
            uint8_t base_prio = (task->prio >> 8) & 0xFF;
            task->prio = (task->prio & 0xFF00) | base_prio;
            
            /* Remove from queue and update cursor */
            hart->rr_cursors[prio] = cursor->next;
            list_remove(queue, cursor);
            hart->queue_counts[prio]--;
            
            /* Update bitmap if queue became empty */
            if (hart->queue_counts[prio] == 0)
                mark_priority_empty_atomic(hart, prio);
            
            return task;
        }
        
        /* Decrement counter */
        task->prio = (task->prio & 0xFF00) | (effective_counter - 1);
        cursor = cursor->next;
        
    } while (cursor != start);
    
    return NULL; /* No task ready in this priority level */
}

Global Load Balancing Engine

/* Global load balancing coordinator */
typedef struct {
    volatile uint32_t balance_generation;    /* Prevent race conditions */
    uint64_t last_balance_time;             /* When balancing last ran */
    uint32_t balance_interval_us;           /* How often to balance */
    
    /* Load metrics */
    hart_load_t hart_loads[MAX_HARTS];      /* Per-hart load tracking */
    uint32_t system_wide_load;              /* Total system load */
    
    /* Migration policies */
    uint32_t migration_cost_threshold;      /* Minimum benefit for migration */
    uint32_t max_migrations_per_balance;    /* Rate limiting */
    
} load_balancer_t;

/* Calculate comprehensive hart load metric */
static uint32_t calculate_hart_load(uint8_t hart_id)
{
    hart_sched_t *hart = &multi_sched.per_hart[hart_id];
    hart_load_t *load = &multi_sched.load_balancer.hart_loads[hart_id];
    
    uint32_t priority_weighted_load = 0;
    uint32_t task_count_load = 0;
    
    /* Weight by priority levels - higher priority = exponentially more load */
    for (int prio = 0; prio < PRIO_LEVELS - 1; prio++) { /* Exclude idle */
        uint32_t tasks = hart->queue_counts[prio];
        uint32_t weight = 1U << (PRIO_LEVELS - 1 - prio); /* Exponential weight */
        
        priority_weighted_load += tasks * weight * weight; /* Square for emphasis */
        task_count_load += tasks;
    }
    
    /* Combine multiple load factors */
    uint32_t utilization_load = load->utilization_percent * 10; /* Busier hart = higher load */
    uint32_t total_load = priority_weighted_load * 100 + 
                         task_count_load * 50 + 
                         utilization_load;
    
    return total_load;
}
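
As a quick sanity check of the weighting: a hart running two level-0 tasks (weight 128) at 80% utilization scores priority_weighted_load = 2 × 128² = 32768, task_count_load = 2, and utilization_load = 800, giving total_load = 32768 × 100 + 2 × 50 + 800 = 3,277,700. The squared priority term dominates by design, so high-priority work outweighs raw task count.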

/* Smart load balancing with migration cost analysis */
static void global_load_balance(void)
{
    load_balancer_t *balancer = &multi_sched.load_balancer;
    uint64_t current_time = _read_us();
    
    /* Rate limiting */
    if (current_time - balancer->last_balance_time < balancer->balance_interval_us)
        return;
    
    balancer->last_balance_time = current_time;
    balancer->balance_generation++;
    
    /* Calculate current load distribution */
    uint32_t hart_loads[MAX_HARTS];
    uint32_t total_load = 0;
    uint32_t min_load = UINT32_MAX, max_load = 0;
    uint8_t min_hart = 0, max_hart = 0;
    
    for (uint8_t h = 0; h < MAX_HARTS; h++) {
        hart_loads[h] = calculate_hart_load(h);
        total_load += hart_loads[h];
        
        if (hart_loads[h] < min_load) {
            min_load = hart_loads[h];
            min_hart = h;
        }
        if (hart_loads[h] > max_load) {
            max_load = hart_loads[h];
            max_hart = h;
        }
    }
    
    uint32_t avg_load = total_load / MAX_HARTS;
    
    /* Check if rebalancing needed (>30% imbalance) */
    if (max_load > avg_load + (avg_load * 30 / 100)) {
        uint32_t migrations = 0;
        
        /* Migrate tasks from overloaded to underloaded harts */
        while (migrations < balancer->max_migrations_per_balance &&
               calculate_hart_load(max_hart) > avg_load + (avg_load / 10)) {
            
            if (migrate_task_for_balance(max_hart, min_hart)) {
                migrations++;
                /* Recalculate min_hart after migration */
                min_hart = find_least_loaded_hart();
            } else {
                break; /* No suitable tasks to migrate */
            }
        }
        
        printf("BALANCE: Migrated %u tasks from hart %u to hart %u\n",
               migrations, max_hart, min_hart);
    }
}

Work-Stealing Protocol

/* Work-stealing for idle harts */
static tcb_t *attempt_work_stealing(uint8_t idle_hart)
{
    /* Try to steal from the most loaded hart */
    uint8_t victim_hart = find_most_loaded_hart();
    
    if (victim_hart == idle_hart)
        return NULL; /* No other harts available */
    
    /* Only steal if victim has significantly more work */
    uint32_t victim_load = calculate_hart_load(victim_hart);
    uint32_t idle_load = calculate_hart_load(idle_hart);
    
    if (victim_load <= idle_load * 2) /* Less than 2x load difference */
        return NULL;
    
    /* Prefer to steal lower priority tasks (less disruptive) */
    for (int prio = PRIO_LEVELS - 2; prio >= 4; prio--) { /* Skip idle and high prio */
        hart_sched_t *victim = &multi_sched.per_hart[victim_hart];
        list_t *queue = victim->ready_queues[prio];
        
        if (!queue || queue->size <= 1) /* Leave at least one task */
            continue;
        
        /* Steal from tail (oldest waiting task) */
        list_node_t *node = list_popback(queue);
        if (!node)
            continue;
        
        tcb_t *stolen_task = (tcb_t *)node->data;
        if (!stolen_task)
            continue;
        
        /* Update victim hart state */
        victim->queue_counts[prio]--;
        if (victim->queue_counts[prio] == 0) {
            mark_priority_empty_atomic(victim, prio);
        }
        
        /* Update task migration tracking */
        stolen_task->fairness.migration_count++;
        stolen_task->fairness.last_hart = idle_hart;
        stolen_task->fairness.cache_warmth_score = 0; /* Cache is cold */
        
        printf("STEAL: Task %u from hart %u to hart %u (prio %u)\n",
               stolen_task->id, victim_hart, idle_hart, prio);
        
        return stolen_task;
    }
    
    return NULL; /* No suitable work to steal */
}

4. Enhanced Main Scheduler Integration

/* Main O(1) multi-hart scheduler */
static uint16_t schedule_next_task_o1_multihart(uint8_t hart_id)
{
    tcb_t *next_task = NULL;
    hart_sched_t *hart = &multi_sched.per_hart[hart_id];
    
    /* Handle current task state transition */
    handle_current_task_transition_multihart(hart_id);
    
    /* 1. Try local hart scheduling first (cache locality) */
    next_task = select_next_task_local(hart_id);
    
    /* 2. Check for global priority violations */
    if (!next_task || should_check_global_priority()) {
        tcb_t *global_candidate = check_global_priority_enforcement(hart_id);
        if (global_candidate && 
            (!next_task || task_has_higher_priority(global_candidate, next_task))) {
            next_task = global_candidate;
        }
    }
    
    /* 3. Try work-stealing if still no local work */
    if (!next_task) {
        next_task = attempt_work_stealing(hart_id);
    }
    
    /* 4. Check starvation and apply boosts */
    check_and_prevent_starvation(hart);
    
    /* 5. Fallback to idle task */
    if (!next_task) {
        next_task = get_idle_task(hart_id);
    }
    
    /* Update scheduler state and metrics */
    if (next_task) {
        next_task->state = TASK_RUNNING;
        next_task->fairness.last_start_time = _read_us();
        next_task->fairness.wait_cycles = 0;
        hart->current_task = next_task;
        hart->last_selected_prio = priority_to_queue_index(next_task->prio);
        
        /* Update performance counters */
        hart->local_switches++;
        multi_sched.total_context_switches++;
    }
    
    /* Periodic global maintenance */
    static uint32_t balance_counter = 0;
    if ((++balance_counter % 100) == 0) { /* Every 100 schedules */
        global_load_balance();
    }
    
    return next_task ? next_task->id : 0;
}

/* Global priority enforcement across harts */
static tcb_t *check_global_priority_enforcement(uint8_t current_hart)
{
    /* Find highest global priority across all harts */
    uint8_t global_highest = find_highest_ready_priority_rv(multi_sched.global_ready_bitmap);
    
    /* Check if current hart's highest priority matches global */
    hart_sched_t *hart = &multi_sched.per_hart[current_hart];
    uint8_t local_highest = find_highest_ready_priority_rv(hart->ready_bitmap);
    
    if (local_highest <= global_highest)
        return NULL; /* Local hart has appropriate priority work */
    
    /* Look for higher priority work on other harts */
    for (uint8_t h = 0; h < MAX_HARTS; h++) {
        if (h == current_hart)
            continue;
        
        hart_sched_t *other_hart = &multi_sched.per_hart[h];
        uint8_t other_highest = find_highest_ready_priority_rv(other_hart->ready_bitmap);
        
        if (other_highest < local_highest) { /* Strictly better than our local best */
            /* Found higher priority work - attempt to steal it */
            tcb_t *high_prio_task = steal_high_priority_task(h, current_hart, other_highest);
            if (high_prio_task) {
                printf("GLOBAL: Migrated high-priority task %u from hart %u to hart %u\n",
                       high_prio_task->id, h, current_hart);
                return high_prio_task;
            }
        }
    }
    
    return NULL;
}

Implementation Roadmap

Phase 1: Single-Hart O(1) Enhancement

Tasks:

  • Implement priority queue data structures
  • Add bitmap manipulation functions with RISC-V optimization
  • Create priority mapping from 16-bit Linmo priorities to 8 levels (a sketch follows this phase's deliverables)
  • Add performance measurement infrastructure
  • Unit tests for bitmap operations
  • Implement O(1) priority selection algorithm
  • Add weighted round-robin within priority levels
  • Update task state transition logic
  • Integrate with existing HAL context switching
  • Performance benchmarking vs. current scheduler

Deliverables: Single-hart O(1) scheduler.
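
Below is a minimal sketch of the priority-mapping task above, assuming the packed layout used throughout this proposal (base priority in the high byte, round-robin counter in the low byte); the equal-width bands are an illustrative choice, not a final design.

/* Sketch: map a 16-bit packed Linmo priority onto the 8 queue levels */
static inline uint8_t priority_to_queue_index(uint16_t packed_prio)
{
    uint8_t base = (packed_prio >> 8) & 0xFF; /* Lower value = more urgent */

    /* Eight equal 32-wide bands: base 0-31 -> level 0 (highest),
     * base 224-255 -> level 7 (idle) */
    return base >> 5;
}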

Phase 2: Fairness Framework

Tasks:

  • Add fairness tracking fields to TCB
  • Implement proportional fair scheduling
  • Add basic starvation prevention
  • Create fairness measurement framework (see the metric sketch after this phase's deliverables)
  • Fairness validation tests
  • Implement deficit-based fairness tracking
  • Add dynamic priority boosting for starved tasks
  • Create fairness reconciliation algorithms
  • Add comprehensive fairness metrics
  • Long-running fairness stress tests

Deliverables: Comprehensive fairness framework with <5% deviation from ideal.
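
One candidate metric for the <5% target, sketched under the assumption that fairness is judged on weight-normalized runtime: Jain's fairness index, which is 1.0 for perfect fairness and degrades toward 1/n as the distribution skews.

/* Sketch: Jain's fairness index over weight-normalized runtimes */
static float jain_fairness_index(const uint64_t *runtime_us,
                                 const uint32_t *weights, uint32_t n)
{
    float sum = 0.0f, sum_sq = 0.0f;

    for (uint32_t i = 0; i < n; i++) {
        float x = (float)runtime_us[i] / (float)weights[i];
        sum += x;
        sum_sq += x * x;
    }

    /* (sum x_i)^2 / (n * sum x_i^2), guarding the empty/zero case */
    return (n > 0 && sum_sq > 0.0f) ? (sum * sum) / ((float)n * sum_sq) : 1.0f;
}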

Phase 3: Multi-Hart Architecture

Tasks:

  • Design per-hart scheduler state structures
  • Implement hart-local ready queues
  • Add global coordination mechanisms
  • Basic multi-hart task spawning
  • Hart identification and management (see the sketch after this phase's deliverables)
  • Implement load calculation algorithms
  • Add task migration mechanisms
  • Create work-stealing protocol
  • Add migration cost analysis
  • Load balancing effectiveness testing
  • Global priority enforcement
  • Cross-hart fairness coordination
  • Advanced migration policies
  • Multi-hart stress testing

Deliverables: Full multi-hart scheduler with >85% SMP efficiency.
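
For the hart-identification item above, a minimal sketch assuming M-mode execution (where the standard mhartid CSR is readable) and hart IDs contiguous from zero:

/* Sketch: identify the executing hart via the mhartid CSR (M-mode) */
static inline uint8_t current_hart_id(void)
{
    uint32_t id;
    __asm__ volatile("csrr %0, mhartid" : "=r"(id));
    return (uint8_t)(id & (MAX_HARTS - 1)); /* MAX_HARTS is a power of 2 */
}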

Performance Evaluation Framework

Comprehensive Benchmark Suite

1. Scheduling Performance Benchmarks

/* Scheduler performance test suite */
typedef struct {
    uint32_t task_count;              /* Number of tasks in test */
    uint64_t avg_schedule_time_us;    /* Average scheduling overhead */
    uint64_t max_schedule_time_us;    /* Worst-case scheduling time */
    uint64_t total_context_switches;  /* Total switches during test */
    uint32_t cache_misses;            /* Cache miss count */
} sched_perf_result_t;

/* Performance test scenarios */
static const struct {
    const char *name;
    uint32_t task_count;
    uint32_t priority_mix[8];  /* Tasks per priority level */
    uint32_t duration_seconds;
} perf_tests[] = {
    {"Minimal Load",     10,  {1,1,2,2,2,1,1,0}, 60},
    {"Moderate Load",    50,  {2,3,8,10,15,8,4,0}, 120},
    {"Heavy Load",      100,  {5,5,15,20,25,15,10,5}, 180},
    {"Stress Test",     200,  {10,10,30,40,50,30,20,10}, 300},
    {"RT Mixed",         30,  {5,10,8,4,2,1,0,0}, 90},
};

/* Run comprehensive performance evaluation */
void run_scheduler_performance_evaluation(void)
{
    printf("=== Linmo Enhanced Scheduler Performance Evaluation ===\n");
    
    for (size_t i = 0; i < ARRAY_SIZE(perf_tests); i++) {
        printf("\nRunning test: %s\n", perf_tests[i].name);
        
        sched_perf_result_t result = run_perf_test(&perf_tests[i]);
        
        printf("Results:\n");
        printf("  Avg scheduling time: %llu μs\n", result.avg_schedule_time_us);
        printf("  Max scheduling time: %llu μs\n", result.max_schedule_time_us);
        printf("  Total context switches: %llu\n", result.total_context_switches);
        printf("  Scheduling efficiency: %.2f switches/ms\n", 
               (float)result.total_context_switches / (perf_tests[i].duration_seconds * 1000));
        
        /* Compare with O(n) baseline */
        uint64_t baseline_time = estimate_linear_scheduler_time(perf_tests[i].task_count);
        float improvement = (float)baseline_time / result.avg_schedule_time_us;
        printf("  Performance improvement: %.1fx faster than O(n)\n", improvement);
    }
}

2. Fairness Validation Tests

/* Fairness measurement framework */
typedef struct {
    uint32_t task_id;
    uint64_t actual_runtime_us;
    uint64_t expected_runtime_us;
    float fairness_ratio;           /* actual/expected */
    uint32_t starvation_events;
    uint32_t priority_boosts;
} task_fairness_result_t;

/* System-wide fairness metrics */
typedef struct {
    float overall_fairness_score;      /* 0.0-1.0, 1.0 = perfect fairness */
    uint32_t total_starvation_events;  /* Count of starvation incidents */
    float max_unfairness_ratio;        /* Worst fairness deviation */
    uint64_t total_test_duration_us;   /* Test duration */
    uint32_t priority_inversions;      /* Priority ordering violations */
} system_fairness_result_t;

/* Fairness test scenarios */
void run_fairness_validation_tests(void)
{
    printf("=== Fairness Validation Test Suite ===\n");
    
    /* Test 1: Equal priority fairness */
    system_fairness_result_t equal_prio = test_equal_priority_fairness(20, 300); /* 20 tasks, 5 minutes */
    printf("Equal Priority Test: Fairness Score = %.3f\n", equal_prio.overall_fairness_score);
    
    /* Test 2: Mixed priority fairness */
    system_fairness_result_t mixed_prio = test_mixed_priority_fairness(50, 600); /* 50 tasks, 10 minutes */
    printf("Mixed Priority Test: Fairness Score = %.3f\n", mixed_prio.overall_fairness_score);
    
    /* Test 3: Starvation prevention */
    system_fairness_result_t starvation = test_starvation_prevention(30, 180); /* 30 tasks, 3 minutes */
    printf("Starvation Prevention: Events = %u (should be 0)\n", starvation.total_starvation_events);
    
    /* Test 4: Load burst handling */
    system_fairness_result_t burst = test_load_burst_fairness(100, 120); /* 100 tasks, 2 minutes */
    printf("Load Burst Test: Max Unfairness = %.2f%% (should be <5%%)\n", 
           (burst.max_unfairness_ratio - 1.0) * 100);
}

3. Multi-Hart Scalability Tests

/* Multi-hart performance and fairness evaluation */
typedef struct {
    uint8_t hart_count;
    float smp_efficiency;              /* 0.0-1.0, 1.0 = perfect scaling */
    uint32_t total_migrations;         /* Task migrations between harts */
    float migration_efficiency;        /* Successful/beneficial migrations */
    uint32_t load_imbalance_events;    /* Times load became imbalanced */
    float avg_load_distribution;       /* How evenly work was distributed */
} multihart_result_t;

/* Multi-hart test scenarios */
void run_multihart_scalability_tests(void)
{
    printf("=== Multi-Hart Scalability Tests ===\n");
    
    for (uint8_t harts = 2; harts <= MAX_HARTS; harts++) {
        printf("\nTesting with %u harts:\n", harts);
        
        multihart_result_t result = run_multihart_test(harts, 100, 300); /* 100 tasks, 5 min */
        
        printf("  SMP Efficiency: %.1f%%\n", result.smp_efficiency * 100);
        printf("  Task Migrations: %u\n", result.total_migrations);
        printf("  Migration Efficiency: %.1f%%\n", result.migration_efficiency * 100);
        printf("  Load Distribution: %.1f%% even\n", result.avg_load_distribution * 100);
        
        /* Theoretical vs actual performance */
        float theoretical_speedup = (float)harts;
        float actual_speedup = result.smp_efficiency * harts;
        printf("  Speedup: %.2fx (theoretical: %.2fx)\n", actual_speedup, theoretical_speedup);
    }
}

Implementation Guidelines and Best Practices

Code Quality Standards

1. Performance-Critical Code

All scheduler hot paths must meet these requirements:

  • Maximum 50 instructions for O(1) operations
  • No dynamic memory allocation in the scheduler core
  • Atomic operations must be lock-free where possible
  • Cache-line alignment for frequently accessed structures

/* Example: Performance-critical task selection */
static inline tcb_t *fast_task_select(hart_sched_t *hart)
{
    /* This function must complete in <10 μs */
    uint8_t prio = find_highest_ready_priority_rv(hart->ready_bitmap);
    return (prio < PRIO_LEVELS) ? dequeue_ready_task(hart, prio) : NULL;
}

2. Multi-Hart Safety

/* All shared data must be properly synchronized */
#define HART_LOCAL    /* Data accessed only by one hart */
#define HART_SHARED   /* Data shared between harts - needs atomics */
#define HART_GLOBAL   /* Data accessed by all harts - needs coordination */

/* Example: Safe bitmap updates */
static inline void safe_bitmap_update(volatile uint32_t *bitmap, uint8_t bit, bool set)
{
    if (set) {
        __atomic_or_fetch(bitmap, (1U << bit), __ATOMIC_RELAXED);
    } else {
        __atomic_and_fetch(bitmap, ~(1U << bit), __ATOMIC_RELAXED);
    }
}

3. Fairness Invariants

/* Fairness constraints that must always hold */
assert(task->fairness.total_runtime_us <= task->fairness.expected_runtime_us + FAIRNESS_TOLERANCE);
assert(task->fairness.wait_cycles < MAX_WAIT_CYCLES);
assert(task->fairness.starvation_level <= MAX_STARVATION_LEVEL);

/* Fairness verification in debug builds */
#ifdef CONFIG_FAIRNESS_DEBUG
#define VERIFY_FAIRNESS(task) verify_task_fairness_invariants(task)
#else
#define VERIFY_FAIRNESS(task) do {} while(0)
#endif
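
A hedged sketch of the checker behind VERIFY_FAIRNESS(); the panic() failure hook is an assumption about the kernel's error path.

#ifdef CONFIG_FAIRNESS_DEBUG
/* Sketch: enforce the three invariants above in debug builds */
static void verify_task_fairness_invariants(const tcb_t *task)
{
    const task_fairness_t *f = &task->fairness;

    if (f->total_runtime_us > f->expected_runtime_us + FAIRNESS_TOLERANCE)
        panic("fairness: task %u over budget", task->id); /* Assumed hook */

    if (f->wait_cycles >= MAX_WAIT_CYCLES)
        panic("fairness: task %u starving", task->id);

    if (f->starvation_level > MAX_STARVATION_LEVEL)
        panic("fairness: task %u boost overflow", task->id);
}
#endif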

Configuration Options

Build-Time Configuration

/* Enhanced scheduler feature levels */
#define CONFIG_ENHANCED_SCHEDULER_LEVEL_BASIC    1  /* O(1) only */
#define CONFIG_ENHANCED_SCHEDULER_LEVEL_FAIR     2  /* + fairness */
#define CONFIG_ENHANCED_SCHEDULER_LEVEL_SMP      3  /* + multi-hart */

/* Memory usage control */
#define CONFIG_MAX_HARTS                         4  /* Hart limit for this build (<= MAX_HARTS) */
#define CONFIG_PRIORITY_LEVELS                   8  /* Number of priority levels */
#define CONFIG_FAIRNESS_TRACKING                 1  /* Enable fairness metrics */
#define CONFIG_MIGRATION_SUPPORT                 1  /* Enable task migration */

/* Performance tuning */
#define CONFIG_LOAD_BALANCE_INTERVAL_MS        100  /* Load balancing frequency */
#define CONFIG_FAIRNESS_UPDATE_INTERVAL_MS     100  /* Fairness reconciliation */
#define CONFIG_STARVATION_THRESHOLD_CYCLES      50  /* Starvation detection */

Runtime Configuration

/* Runtime tunable parameters */
typedef struct {
    bool enhanced_scheduler_enabled;        /* Enable enhanced scheduler */
    uint8_t active_hart_count;             /* Number of harts to use */
    uint32_t migration_cost_threshold;     /* Minimum benefit for migration */
    uint32_t fairness_tolerance_us;        /* Acceptable fairness deviation */
    bool strict_priority_enforcement;      /* Global priority consistency */
    bool load_balancing_enabled;           /* Enable automatic load balancing */
} scheduler_config_t;

/* Configuration API */
int32_t mo_sched_configure(const scheduler_config_t *config);
int32_t mo_sched_get_stats(scheduler_stats_t *stats);
int32_t mo_sched_reset_stats(void);
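
A hedged usage sketch of the configuration API above; the field values are illustrative, not recommended defaults.

/* Sketch: enable the enhanced scheduler on two harts at boot */
static void enable_enhanced_scheduling(void)
{
    scheduler_config_t cfg = {
        .enhanced_scheduler_enabled = true,
        .active_hart_count = 2,
        .migration_cost_threshold = 100, /* Minimum benefit, in μs */
        .fairness_tolerance_us = 5000,   /* Accept up to 5 ms deviation */
        .strict_priority_enforcement = true,
        .load_balancing_enabled = true,
    };

    if (mo_sched_configure(&cfg) < 0) {
        /* Configuration rejected - keep running the legacy scheduler */
    }
}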
