Running Binary Classifier Models
🧪 Preprocessing: Crafting the Vector
Text Input
Start with your string (e.g., "This is a positive test").
TF-IDF Magic
Map words to a 5000-feature space using a pre-trained vocabulary and IDF weights (exported as model_vocab.json).
Scaling
Normalize the features with mean and scale values (from model_scaler.json) to keep the brew balanced.
Output
A 5000-element float32 array, ready to pour into the ONNX model.
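As a quick reference, here is the whole pipeline as a minimal NumPy sketch. It assumes the JSON layouts shown in the guides below (a "vocab" term-to-index map with matching "idf" weights, and per-feature "mean"/"scale" arrays); the fully commented version appears in the Python guide.

import json
import numpy as np

def text_to_vector(text):
    # Load vocabulary/IDF weights and scaling parameters
    with open('model_vocab.json') as f:
        vocab_data = json.load(f)
    with open('model_scaler.json') as f:
        scaler = json.load(f)
    # Term frequency over the 5000-term vocabulary
    tf = np.zeros(len(vocab_data['vocab']), dtype=np.float32)
    for word in text.lower().split():
        idx = vocab_data['vocab'].get(word)
        if idx is not None:
            tf[idx] += 1.0
    if tf.sum() > 0:
        tf /= tf.sum()
    # TF-IDF, then standardize with the stored mean/scale
    tfidf = tf * np.asarray(vocab_data['idf'], dtype=np.float32)
    mean = np.asarray(scaler['mean'], dtype=np.float32)
    scale = np.asarray(scaler['scale'], dtype=np.float32)
    return ((tfidf - mean) / scale).astype(np.float32)  # shape (5000,)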
💻 Running Guide
🧠 How to Run a Binary Classifier ONNX Model with Python: Full Beginner-Friendly Guide
📘 What is this?
This guide walks you through running a binary classifier ONNX model using Python, starting from scratch: Python installation, setting up dependencies, and running the model for binary classification tasks.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Install Python
🔷 Windows
- Go to: https://www.python.org/downloads/windows
- Download the latest Python 3.11+ installer
- During installation, check ✅ Add Python to PATH
- After installation, check if it worked:
python --version
🍎 macOS
You have two options to install Python:
- Option 1 - Official Website:
- Visit: https://www.python.org/downloads/macos/
- Download the latest Python 3.11+ installer for macOS
- Run the installer package and follow the installation wizard
- Option 2 - Homebrew:
- Install Homebrew (if you don't have it):
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Then install Python:
brew install python@3.11
After installation, check if it worked:
python3 --version
Note: macOS uses python3, not python.
🐧 Linux (Ubuntu/Debian)
You have two options to install Python:
- Option 1 - Package Manager:
sudo apt update
sudo apt install python3 python3-pip
- Option 2 - Official Website:
- Visit: https://www.python.org/downloads/source/
- Download the latest Python 3.11+ source tarball
- Extract and build from source:
tar -xf Python-3.11.x.tar.xz
cd Python-3.11.x
./configure
make
sudo make install
After installation, check if it worked:
python3 --version
✅ 2. Get Your Model
🔷 Download the Repository
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
✅ 3. Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy model.onnx, model_vocab.json, and model_scaler.json into your project directory
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train your custom binary classifier
  - Export your model to ONNX format
✅ 4. Project Setup
mkdir binary_classifier_demo
cd binary_classifier_demo
Folder structure:
binary_classifier_demo/
├── model.onnx
├── model_vocab.json
├── model_scaler.json
└── run_onnx.py
✅ 5. Install Required Python Libraries
pip install onnxruntime numpy
On macOS or Linux, you might need to run pip3 install instead.
✅ 6. Prepare Supporting Files
🔹 model_vocab.json (TF-IDF vocabulary; a truncated sample, the real file covers all 5000 terms)
{
  "vocab": {
    "the": 0,
    "government": 1,
    "announced": 2,
    "new": 3,
    "policies": 4
  },
  "idf": [1.2, 2.1, 1.8, 1.5, 2.3]
}
🔹 model_scaler.json (normalization parameters; likewise truncated to five features here)
{
  "mean": [0.1, 0.2, 0.3, 0.4, 0.5],
  "scale": [1.1, 1.2, 1.3, 1.4, 1.5]
}
These values are used to normalize the TF-IDF features.
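To make the arithmetic concrete, here is what the sample values above produce for the text "the government" (five features instead of 5000, purely for illustration):

import numpy as np

idf   = np.array([1.2, 2.1, 1.8, 1.5, 2.3], dtype=np.float32)
mean  = np.array([0.1, 0.2, 0.3, 0.4, 0.5], dtype=np.float32)
scale = np.array([1.1, 1.2, 1.3, 1.4, 1.5], dtype=np.float32)

tf = np.array([1, 1, 0, 0, 0], dtype=np.float32)  # "the" and "government" each appear once
tf /= tf.sum()                                    # [0.5, 0.5, 0, 0, 0]
tfidf = tf * idf                                  # [0.6, 1.05, 0, 0, 0]
scaled = (tfidf - mean) / scale                   # e.g. (0.6 - 0.1) / 1.1 = 0.4545
print(scaled)  # [ 0.4545  0.7083 -0.2308 -0.2857 -0.3333]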
🔹 model.onnx
Place your trained binary classifier ONNX model here. It should accept a (1, 5000) input tensor of float32.
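If you are not sure your exported model matches this contract, you can sanity-check it with a few lines of onnxruntime before wiring up the full script (the printed names and shapes below are examples, not guaranteed values):

import onnxruntime as ort

session = ort.InferenceSession('model.onnx')
inp = session.get_inputs()[0]
out = session.get_outputs()[0]
print(inp.name, inp.shape, inp.type)  # e.g. float_input [1, 5000] tensor(float)
print(out.name, out.shape)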
✅ 7. Create the Python Script run_onnx.py
Use the code example below:
import json
import numpy as np
import onnxruntime as ort

# --- Preprocessing: TF-IDF + Scaling ---
def preprocess_text(text, vocab_file, scaler_file):
    # Load vocabulary and IDF weights
    with open(vocab_file, 'r') as f:
        vocab = json.load(f)
    with open(scaler_file, 'r') as f:
        scaler = json.load(f)
    idf = vocab['idf']
    word2idx = vocab['vocab']
    mean = np.array(scaler['mean'], dtype=np.float32)
    scale = np.array(scaler['scale'], dtype=np.float32)

    # Compute term frequency (TF)
    tf = np.zeros(len(word2idx), dtype=np.float32)
    words = text.lower().split()
    for word in words:
        idx = word2idx.get(word)
        if idx is not None:
            tf[idx] += 1
    if tf.sum() > 0:
        tf = tf / tf.sum()  # Normalize TF

    # TF-IDF
    tfidf = tf * np.array(idf, dtype=np.float32)

    # Standardize
    tfidf_scaled = (tfidf - mean) / scale
    return tfidf_scaled.astype(np.float32)

# Example usage
text = "This is a positive test"
vector = preprocess_text(text, 'model_vocab.json', 'model_scaler.json')  # 5000-dim float32

# --- ONNX Inference ---
session = ort.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

input_data = vector.reshape(1, -1)
outputs = session.run([output_name], {input_name: input_data})
probability = outputs[0][0][0]  # Probability of positive class
print(f'Python ONNX output: Probability = {probability:.4f}')
✅ 8. Run the Script
🔷 Windows
python run_onnx.py
🍎 macOS / 🐧 Linux
python3 run_onnx.py
✅ 9. Expected Output
Python ONNX output: Probability = 0.9123
The output shows the probability of the positive class. In this example, the model predicted a 91.23% probability of the text belonging to the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
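If you want to turn that probability into a label programmatically, a small helper does it. This sketch reuses session, input_name, output_name, and preprocess_text from the script in step 7; the 0.5 threshold is just the conventional default and can be tuned for your precision/recall needs:

def classify(text, threshold=0.5):
    # Preprocess, run the model, and map the probability to a label
    vec = preprocess_text(text, 'model_vocab.json', 'model_scaler.json').reshape(1, -1)
    prob = float(session.run([output_name], {input_name: vec})[0][0][0])
    label = 'Positive' if prob > threshold else 'Negative'
    return label, prob

for sample in ['This is a positive test', 'This is awful']:
    label, prob = classify(sample)
    print(f'{sample!r} -> {label} ({prob:.4f})')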
🧠 How to Run a Binary Classifier ONNX Model with JavaScript (Browser or Node.js)
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using JavaScript and ONNX Runtime Web, covering both browser and Node.js environments.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Choose Your Runtime
You can run ONNX models in JavaScript in two ways:
| Environment | Description | Recommended For |
|---|---|---|
| ✅ Browser | Uses WebAssembly or WebGL | Web apps, frontend demos |
| ✅ Node.js | Uses Node runtime (CPU only) | Backend/CLI usage |
✅ 2. Requirements
🔷 For browser
No install needed; just include the library from a CDN or bundle via npm.
🟩 For Node.js
Install Node.js:
- Download from: https://nodejs.org/
- Check installation:
node -v
npm -v
Then install ONNX Runtime (onnxruntime-web for the browser path; add onnxruntime-node if you plan to follow the Node.js path in step 9):
npm install onnxruntime-web
✅ 3. Get Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files to your project: model.onnx (the ONNX model file), model_vocab.json (TF-IDF vocabulary), model_scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 4. Folder Setup
mkdir binary_classifier_demo
cd binary_classifier_demo
Files you'll need:
binary_classifier_demo/
├── index.html          # For browser use
├── run.js              # Main logic
├── model.onnx
├── model_vocab.json
└── model_scaler.json
✅ 5. Sample model_vocab.json
{
  "vocab": {
    "the": 0,
    "government": 1,
    "announced": 2,
    "new": 3,
    "policies": 4
  },
  "idf": [1.2, 2.1, 1.8, 1.5, 2.3]
}
✅ 6. Sample model_scaler.json
{
  "mean": [0.1, 0.2, 0.3, 0.4, 0.5],
  "scale": [1.1, 1.2, 1.3, 1.4, 1.5]
}
✅ 7. JavaScript Code (run.js)
Works in both browser and Node.js (with minor changes)
async function preprocessText(text, vocabUrl, scalerUrl) {
    // Load vocabulary/IDF weights and scaling parameters
    const tfidfResp = await fetch(vocabUrl);
    const tfidfData = await tfidfResp.json();
    const vocab = tfidfData.vocab;
    const idf = tfidfData.idf;
    const scalerResp = await fetch(scalerUrl);
    const scalerData = await scalerResp.json();
    const mean = scalerData.mean;
    const scale = scalerData.scale;

    // TF-IDF
    const vector = new Float32Array(5000).fill(0);
    const words = text.toLowerCase().split(/\s+/);
    const wordCounts = {};
    words.forEach(word => wordCounts[word] = (wordCounts[word] || 0) + 1);
    for (const word in wordCounts) {
        if (vocab[word] !== undefined) {
            vector[vocab[word]] = wordCounts[word] * idf[vocab[word]];
        }
    }

    // Scale
    for (let i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - mean[i]) / scale[i];
    }
    return vector;
}

async function runModel(text) {
    const session = await ort.InferenceSession.create("model.onnx");
    const vector = await preprocessText(text, "model_vocab.json", "model_scaler.json");
    const tensor = new ort.Tensor("float32", vector, [1, 5000]);
    // Use the model's actual input name rather than hard-coding it
    const feeds = { [session.inputNames[0]]: tensor };
    const output = await session.run(feeds);
    const probability = output[Object.keys(output)[0]].data[0];
    console.log(`JS ONNX output: Probability = ${probability.toFixed(4)}`);
}

runModel("This is a positive test string");
✅ 8. Run in Browser (option A)
🔹 index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8" />
    <title>ONNX JS Inference</title>
</head>
<body>
    <h1>Running ONNX Model...</h1>
    <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
    <script type="module" src="run.js"></script>
</body>
</html>
📦 Start a local server (required because fetch cannot load files over file://):
npx serve .
# OR
python3 -m http.server
Visit http://localhost:3000 (npx serve) or http://localhost:8000 (http.server).
✅ 9. Run with Node.js (option B)
🔹 Modify run.js for Node
First install the Node binding (npm install onnxruntime-node), then swap the browser globals for imports and read the JSON files from disk:
import * as ort from 'onnxruntime-node';
import fs from 'fs/promises';

async function loadJSON(path) {
    const data = await fs.readFile(path, 'utf-8');
    return JSON.parse(data);
}

// The rest of the logic is the same as the browser example, except
// preprocessText loads the JSON files with loadJSON instead of fetch.
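For reference, here is the preprocessText function adapted for Node under that assumption; the TF-IDF and scaling logic is unchanged from the browser version:

async function preprocessText(text, vocabPath, scalerPath) {
    // Read vocabulary/IDF and scaler parameters from local files
    const { vocab, idf } = await loadJSON(vocabPath);
    const { mean, scale } = await loadJSON(scalerPath);

    // Count words, then weight counts by IDF
    const vector = new Float32Array(5000).fill(0);
    const wordCounts = {};
    for (const word of text.toLowerCase().split(/\s+/)) {
        wordCounts[word] = (wordCounts[word] || 0) + 1;
    }
    for (const word in wordCounts) {
        if (vocab[word] !== undefined) {
            vector[vocab[word]] = wordCounts[word] * idf[vocab[word]];
        }
    }

    // Standardize each feature
    for (let i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - mean[i]) / scale[i];
    }
    return vector;
}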
📦 Run it:
node run.js
✅ 10. Expected Output
JS ONNX output: Probability = 0.9123
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run an ONNX Model with C using ONNX Runtime and cJSON
📘 What is this?
This guide explains how to load and run ONNX models using C, the ONNX Runtime C API, and cJSON for JSON parsing.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 C Compiler
macOS: clang comes with Xcode Command Line Tools
xcode-select --install
Linux: install gcc
sudo apt install build-essential
🟩 ONNX Runtime C Library
Download the ONNX Runtime C API from the official releases page:
🔗 https://github.com/microsoft/onnxruntime/releases
Choose:
onnxruntime-osx-universal2-<version>.tgz # For macOS
onnxruntime-linux-x64-<version>.tgz # For Linux
📦 Install cJSON
macOS:
brew install cjson
Linux:
sudo apt install libcjson-dev
✅ 2. Choose Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files to your project: model.onnx (the ONNX model file), vocab.json (TF-IDF vocabulary), scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 3. Folder Structure
project/
├── ONNX_test.c ← your C code
├── vocab.json ← TF-IDF vocabulary
├── scaler.json ← scaling parameters
├── model.onnx ← ONNX model
└── onnxruntime-osx-universal2-1.22.0/
    ├── include/
    └── lib/
✅ 4. Build Command
🔷 macOS
gcc ONNX_test.c \
-I./onnxruntime-osx-universal2-1.22.0/include \
-L./onnxruntime-osx-universal2-1.22.0/lib \
-lonnxruntime \
-lcjson \
-o onnx_test
🐧 Linux
Replace the onnxruntime-osx-... path with the matching onnxruntime-linux-x64-... directory.
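For example, with version 1.22.0 of the Linux package (adjust the version number to whatever you downloaded):

gcc ONNX_test.c \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -lonnxruntime \
  -lcjson \
  -o onnx_test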
✅ 5. Run the Executable
🔷 macOS
Important: You must set the library path.
export DYLD_LIBRARY_PATH=./onnxruntime-osx-universal2-1.22.0/lib:$DYLD_LIBRARY_PATH
./onnx_test
🐧 Linux
export LD_LIBRARY_PATH=./onnxruntime-linux-x64-1.22.0/lib:$LD_LIBRARY_PATH
./onnx_test
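Optionally, on Linux you can avoid exporting LD_LIBRARY_PATH on every run by embedding the library search path into the binary at link time. This uses the standard linker rpath mechanism (on macOS, use @loader_path in place of $ORIGIN):

gcc ONNX_test.c \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -Wl,-rpath,'$ORIGIN/onnxruntime-linux-x64-1.22.0/lib' \
  -lonnxruntime -lcjson -o onnx_test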
✅ 6. C Code Example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <onnxruntime_c_api.h>   // resolved via the -I flag in the build command
#include <cjson/cJSON.h>

const OrtApi* g_ort = NULL;

// Build the 5000-dim TF-IDF feature vector for `text`.
float* preprocess_text(const char* text, const char* vocab_file, const char* scaler_file) {
    float* vector = calloc(5000, sizeof(float));
    if (!vector) return NULL;

    // Read the vocabulary/IDF file into memory
    FILE* f = fopen(vocab_file, "r");
    if (!f) { free(vector); return NULL; }
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    char* json_str = malloc(len + 1);
    fread(json_str, 1, len, f);
    json_str[len] = 0;
    fclose(f);

    cJSON* tfidf_data = cJSON_Parse(json_str);
    if (!tfidf_data) {
        free(json_str);
        free(vector);
        return NULL;
    }
    cJSON* vocab = cJSON_GetObjectItem(tfidf_data, "vocab");
    cJSON* idf = cJSON_GetObjectItem(tfidf_data, "idf");
    if (!vocab || !idf) {
        free(json_str);
        cJSON_Delete(tfidf_data);
        free(vector);
        return NULL;
    }

    // Read the scaler file into memory
    f = fopen(scaler_file, "r");
    if (!f) {
        free(json_str);
        cJSON_Delete(tfidf_data);
        free(vector);
        return NULL;
    }
    fseek(f, 0, SEEK_END);
    len = ftell(f);
    fseek(f, 0, SEEK_SET);
    char* scaler_str = malloc(len + 1);
    fread(scaler_str, 1, len, f);
    scaler_str[len] = 0;
    fclose(f);

    cJSON* scaler_data = cJSON_Parse(scaler_str);
    if (!scaler_data) {
        free(json_str);
        free(scaler_str);
        cJSON_Delete(tfidf_data);
        free(vector);
        return NULL;
    }
    cJSON* mean = cJSON_GetObjectItem(scaler_data, "mean");
    cJSON* scale = cJSON_GetObjectItem(scaler_data, "scale");
    if (!mean || !scale) {
        free(json_str);
        free(scaler_str);
        cJSON_Delete(tfidf_data);
        cJSON_Delete(scaler_data);
        free(vector);
        return NULL;
    }

    // Lowercase, tokenize on whitespace, and accumulate IDF weight per occurrence
    char* text_copy = strdup(text);
    for (char* p = text_copy; *p; p++) *p = tolower(*p);
    char* word = strtok(text_copy, " \t\n");
    while (word) {
        cJSON* idx = cJSON_GetObjectItem(vocab, word);
        if (idx) {
            int i = idx->valueint;
            if (i < 5000) {
                vector[i] += cJSON_GetArrayItem(idf, i)->valuedouble;
            }
        }
        word = strtok(NULL, " \t\n");
    }

    // Standardize each feature with the stored mean and scale
    for (int i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - cJSON_GetArrayItem(mean, i)->valuedouble) /
                    cJSON_GetArrayItem(scale, i)->valuedouble;
    }

    free(text_copy);
    free(json_str);
    free(scaler_str);
    cJSON_Delete(tfidf_data);
    cJSON_Delete(scaler_data);
    return vector;
}

int main() {
    g_ort = OrtGetApiBase()->GetApi(ORT_API_VERSION);
    if (!g_ort) return 1;

    const char* text = "Earn $5000 a week from home - no experience required!";
    float* vector = preprocess_text(text, "vocab.json", "scaler.json");
    if (!vector) return 1;

    // Create environment, session options, and session
    OrtEnv* env;
    OrtStatus* status = g_ort->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "test", &env);
    if (status) return 1;
    OrtSessionOptions* session_options;
    status = g_ort->CreateSessionOptions(&session_options);
    if (status) return 1;
    OrtSession* session;
    status = g_ort->CreateSession(env, "model.onnx", session_options, &session);
    if (status) return 1;

    // Wrap the feature vector in a (1, 5000) float tensor
    OrtMemoryInfo* memory_info;
    status = g_ort->CreateCpuMemoryInfo(OrtArenaAllocator, OrtMemTypeDefault, &memory_info);
    if (status) return 1;
    int64_t input_shape[] = {1, 5000};
    OrtValue* input_tensor;
    status = g_ort->CreateTensorWithDataAsOrtValue(memory_info, vector, 5000 * sizeof(float),
                                                   input_shape, 2, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
                                                   &input_tensor);
    if (status) return 1;

    // Input/output names must match the exported model
    const char* input_names[] = {"float_input"};
    const char* output_names[] = {"output"};
    OrtValue* output_tensor = NULL;
    status = g_ort->Run(session, NULL, input_names, (const OrtValue* const*)&input_tensor, 1,
                        output_names, 1, &output_tensor);
    if (status) return 1;

    float* output_data;
    status = g_ort->GetTensorMutableData(output_tensor, (void**)&output_data);
    if (status) return 1;
    printf("C ONNX output: %s (Score: %.4f)\n",
           output_data[0] > 0.5 ? "Spam" : "Not Spam",
           output_data[0]);

    g_ort->ReleaseValue(input_tensor);
    g_ort->ReleaseValue(output_tensor);
    g_ort->ReleaseMemoryInfo(memory_info);
    g_ort->ReleaseSession(session);
    g_ort->ReleaseSessionOptions(session_options);
    g_ort->ReleaseEnv(env);
    free(vector);
    return 0;
}
✅ 7. Expected Output
C ONNX output: Spam (Score: 0.9123)
🧠 How to Run a Binary Classifier ONNX Model with C++
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using C++ and the ONNX Runtime C++ API, with a focus on efficient text preprocessing and model inference.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 C++ Compiler
macOS: clang++ comes with Xcode Command Line Tools
xcode-select --install
Linux: install g++
sudo apt install build-essential
🟩 ONNX Runtime C++ Library
Download the ONNX Runtime C++ API from the official releases page:
🔗 https://github.com/microsoft/onnxruntime/releases
Choose:
onnxruntime-osx-universal2-<version>.tgz # For macOS
onnxruntime-linux-x64-<version>.tgz # For Linux
📦 Install nlohmann/json
macOS:
brew install nlohmann-json
Linux:
sudo apt install nlohmann-json3-dev
✅ 2. Choose Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files to your project: model.onnx (the ONNX model file), model_vocab.json (TF-IDF vocabulary), model_scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 3. Folder Structure
project/
├── main.cpp ← your C++ code
├── model_vocab.json ← TF-IDF vocabulary
├── model_scaler.json ← scaling parameters
├── model.onnx ← ONNX model
└── onnxruntime-osx-universal2-1.22.0/
    ├── include/
    └── lib/
✅ 4. Build Command
🔷 macOS
g++ -std=c++17 main.cpp \
-I./onnxruntime-osx-universal2-1.22.0/include \
-L./onnxruntime-osx-universal2-1.22.0/lib \
-lonnxruntime \
-o binary_classifier
🐧 Linux
Replace the onnxruntime-osx-... path with the matching onnxruntime-linux-x64-... directory.
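For example, with version 1.22.0 of the Linux package:

g++ -std=c++17 main.cpp \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -lonnxruntime \
  -o binary_classifier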
✅ 5. Run the Executable
🔷 macOS
Important: You must set the library path.
export DYLD_LIBRARY_PATH=./onnxruntime-osx-universal2-1.22.0/lib:$DYLD_LIBRARY_PATH
./binary_classifier
🐧 Linux
export LD_LIBRARY_PATH=./onnxruntime-linux-x64-1.22.0/lib:$LD_LIBRARY_PATH
./binary_classifier
✅ 6. C++ Code Example
#include <onnxruntime_cxx_api.h>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Build the 5000-dim TF-IDF feature vector for `text`.
std::vector<float> preprocess_text(const std::string& text, const std::string& vocab_file, const std::string& scaler_file) {
    std::vector<float> vector(5000, 0.0f);

    std::ifstream vf(vocab_file);
    json tfidf_data; vf >> tfidf_data;
    auto vocab = tfidf_data["vocab"];
    std::vector<float> idf = tfidf_data["idf"];

    std::ifstream sf(scaler_file);
    json scaler_data; sf >> scaler_data;
    std::vector<float> mean = scaler_data["mean"];
    std::vector<float> scale = scaler_data["scale"];

    // TF-IDF: lowercase, split on spaces, count words
    std::string text_lower = text;
    std::transform(text_lower.begin(), text_lower.end(), text_lower.begin(), ::tolower);
    std::map<std::string, int> word_counts;
    size_t start = 0, end;
    while ((end = text_lower.find(' ', start)) != std::string::npos) {
        if (end > start) word_counts[text_lower.substr(start, end - start)]++;
        start = end + 1;
    }
    if (start < text_lower.length()) word_counts[text_lower.substr(start)]++;
    for (const auto& [word, count] : word_counts) {
        if (vocab.contains(word)) {
            int idx = vocab[word];
            vector[idx] = count * idf[idx];
        }
    }

    // Scale
    for (int i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - mean[i]) / scale[i];
    }
    return vector;
}

int main() {
    std::string text = "This is a positive test string";
    auto vector = preprocess_text(text, "model_vocab.json", "model_scaler.json");

    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "test");
    Ort::SessionOptions session_options;
    Ort::Session session(env, "model.onnx", session_options);

    std::vector<int64_t> input_shape = {1, 5000};
    Ort::MemoryInfo memory_info("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, vector.data(), vector.size(), input_shape.data(), input_shape.size());

    // Input/output names must match the exported model
    std::vector<const char*> input_names = {"input"};
    std::vector<const char*> output_names = {"output"};
    auto output_tensors = session.Run(Ort::RunOptions{nullptr}, input_names.data(), &input_tensor, 1, output_names.data(), 1);

    float* output_data = output_tensors[0].GetTensorMutableData<float>();
    std::cout << "C++ ONNX output: Probability = " << output_data[0] << std::endl;
    return 0;
}
✅ 7. Expected Output
C++ ONNX output: Probability = 0.9123
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run a Binary Classifier ONNX Model with Rust
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using Rust and ONNX Runtime, with a focus on efficient text preprocessing and model inference.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 Rust Toolchain
Install Rust using rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Verify installation:
rustc --version
cargo --version
✅ 2. Get Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files into resources/: model.onnx (the ONNX model file), model_vocab.json (TF-IDF vocabulary), model_scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 3. Create a New Project
cargo new binary_classifier
cd binary_classifier
✅ 4. Add Dependencies
🔹 Cargo.toml
[package]
name = "binary_classifier"
version = "0.1.0"
edition = "2021"
[dependencies]
ort = "1.16.0"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
anyhow = "1.0"
thiserror = "1.0"
ndarray = "0.15"
✅ 5. Project Structure
binary_classifier/
├── src/
│   └── main.rs
├── resources/
│   ├── model.onnx
│   ├── model_vocab.json
│   └── model_scaler.json
└── Cargo.toml
✅ 6. Rust Code Example
use anyhow::Result;
use ort::{Environment, Session, SessionBuilder, Value};
use serde_json::Value as JsonValue;
use std::collections::HashMap;
use std::fs::File;
use std::io::BufReader;
use std::sync::Arc;
use ndarray::Array2;

struct BinaryClassifier {
    vocab: HashMap<String, usize>,
    idf: Vec<f32>,
    mean: Vec<f32>,
    scale: Vec<f32>,
    session: Session,
}

impl BinaryClassifier {
    fn new(model_path: &str, vocab_path: &str, scaler_path: &str) -> Result<Self> {
        // Load vocabulary and IDF weights
        let vocab_file = File::open(vocab_path)?;
        let vocab_reader = BufReader::new(vocab_file);
        let vocab_data: JsonValue = serde_json::from_reader(vocab_reader)?;
        let mut vocab = HashMap::new();
        let vocab_obj = vocab_data["vocab"].as_object().unwrap();
        for (key, value) in vocab_obj {
            vocab.insert(key.clone(), value.as_u64().unwrap() as usize);
        }
        let idf: Vec<f32> = vocab_data["idf"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();

        // Load scaling parameters
        let scaler_file = File::open(scaler_path)?;
        let scaler_reader = BufReader::new(scaler_file);
        let scaler_data: JsonValue = serde_json::from_reader(scaler_reader)?;
        let mean: Vec<f32> = scaler_data["mean"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();
        let scale: Vec<f32> = scaler_data["scale"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();

        // Initialize the ONNX Runtime session
        let environment = Arc::new(Environment::builder()
            .with_name("binary_classifier")
            .build()?);
        let session = SessionBuilder::new(&environment)?
            .with_model_from_file(model_path)?;

        Ok(BinaryClassifier { vocab, idf, mean, scale, session })
    }

    fn preprocess_text(&self, text: &str) -> Vec<f32> {
        let mut vector = vec![0.0; 5000];
        let mut word_counts: HashMap<&str, usize> = HashMap::new();
        let text_lower = text.to_lowercase();
        for word in text_lower.split_whitespace() {
            *word_counts.entry(word).or_insert(0) += 1;
        }
        // TF-IDF
        for (word, count) in word_counts {
            if let Some(&idx) = self.vocab.get(word) {
                vector[idx] = count as f32 * self.idf[idx];
            }
        }
        // Standardize
        for i in 0..5000 {
            vector[i] = (vector[i] - self.mean[i]) / self.scale[i];
        }
        vector
    }

    fn predict(&self, text: &str) -> Result<f32> {
        let input_data = self.preprocess_text(text);
        let input_array = Array2::from_shape_vec((1, 5000), input_data)?;
        let input_dyn = input_array.into_dyn();
        let input_cow = ndarray::CowArray::from(input_dyn.view());
        let input_tensor = Value::from_array(self.session.allocator(), &input_cow)?;
        let outputs = self.session.run(vec![input_tensor])?;
        let output_view = outputs[0].try_extract::<f32>()?;
        let output_data = output_view.view();
        Ok(output_data[[0, 0]])
    }
}

fn main() -> Result<()> {
    let classifier = BinaryClassifier::new(
        "resources/model.onnx",
        "resources/model_vocab.json",
        "resources/model_scaler.json",
    )?;
    let text = "Act now! Get 70% off on all products. Visit our site today!";
    let probability = classifier.predict(text)?;
    println!("Rust ONNX output: Probability = {:.4}", probability);
    println!("Classification: {}",
        if probability > 0.5 { "Positive" } else { "Negative" }
    );
    Ok(())
}
✅ 7. Build and Run
🔷 Build the Project
cargo build --release
🟩 Run the Classifier
./target/release/binary_classifier
✅ 8. Expected Output
Rust ONNX output: Probability = 0.9123
Classification: Positive
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run a Binary Classifier ONNX Model with Java
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using Java and the ONNX Runtime Java API, with a focus on efficient text preprocessing and model inference.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 Java Development Kit (JDK)
Install JDK 17 or later:
🐧 Linux
✅ Installation via package manager (Ubuntu/Debian):
sudo apt update
sudo apt install openjdk-17-jdk -y
📦 Download from Oracle website:
- .tar.gz archive: jdk-17.0.15_linux-x64_bin.tar.gz
- .deb package: jdk-17.0.15_linux-x64_bin.deb
- .rpm package: jdk-17.0.15_linux-x64_bin.rpm
🔗 Alternative sources:
- Adoptium (Temurin)
- OpenLogic
- Liberica JDK
🪟 Windows
📥 Download from Oracle:
- .exe installer: jdk-17.0.15_windows-x64_bin.exe
- .msi installer: jdk-17.0.15_windows-x64_bin.msi
- .zip archive: jdk-17.0.15_windows-x64_bin.zip
🔗 Alternative sources:
- Adoptium (Temurin)
- Microsoft Build of OpenJDK
🍎 macOS
📥 Download from Oracle:
For Intel (x64):
- .dmg installer: jdk-17.0.15_macos-x64_bin.dmg
- .tar.gz archive: jdk-17.0.15_macos-x64_bin.tar.gz
For Apple Silicon (ARM64):
- .dmg installer: jdk-17.0.15_macos-aarch64_bin.dmg
- .tar.gz archive: jdk-17.0.15_macos-aarch64_bin.tar.gz
🔗 Alternative sources:
- Adoptium (Temurin)
- Liberica JDK
Verify installation:
java -version
javac -version
🟩 Maven
Install Maven for dependency management:
- Download from: https://maven.apache.org/download.cgi
- Verify installation:
mvn -version
✅ 2. Project Setup
🚀 Create Maven Project
mvn archetype:generate -DgroupId=com.example -DartifactId=binary-classifier -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
✅ 3. Add Dependencies
🔹 pom.xml
<dependencies>
    <dependency>
        <groupId>com.microsoft.onnxruntime</groupId>
        <artifactId>onnxruntime</artifactId>
        <version>1.16.3</version>
    </dependency>
    <dependency>
        <groupId>org.json</groupId>
        <artifactId>json</artifactId>
        <version>20231013</version>
    </dependency>
</dependencies>
✅ 4. Project Structure
binary-classifier/
├── src/
│   ├── main/
│   │   ├── java/
│   │   │   └── com/
│   │   │       └── example/
│   │   │           └── BinaryClassifier.java
│   │   └── resources/
│   │       ├── model.onnx
│   │       ├── model_vocab.json
│   │       └── model_scaler.json
│   └── test/
└── pom.xml
✅ 5. Java Code Example
import ai.onnxruntime.*;
import org.json.JSONObject;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;

public class BinaryClassifier {
    private Map<String, Integer> vocab;
    private List<Float> idf;
    private List<Float> mean;
    private List<Float> scale;
    private OrtSession session;

    public BinaryClassifier(String modelPath, String vocabPath, String scalerPath) throws Exception {
        // Load vocabulary and IDF weights
        String vocabJson = new String(Files.readAllBytes(Paths.get(vocabPath)));
        JSONObject vocabData = new JSONObject(vocabJson);
        this.vocab = new HashMap<>();
        JSONObject vocabObj = vocabData.getJSONObject("vocab");
        for (String key : vocabObj.keySet()) {
            this.vocab.put(key, vocabObj.getInt(key));
        }
        this.idf = new ArrayList<>();
        vocabData.getJSONArray("idf").forEach(item -> this.idf.add(((Number) item).floatValue()));

        // Load scaling parameters
        String scalerJson = new String(Files.readAllBytes(Paths.get(scalerPath)));
        JSONObject scalerData = new JSONObject(scalerJson);
        this.mean = new ArrayList<>();
        this.scale = new ArrayList<>();
        scalerData.getJSONArray("mean").forEach(item -> this.mean.add(((Number) item).floatValue()));
        scalerData.getJSONArray("scale").forEach(item -> this.scale.add(((Number) item).floatValue()));

        // Initialize ONNX Runtime session
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        this.session = env.createSession(modelPath, new OrtSession.SessionOptions());
    }

    private float[] preprocessText(String text) {
        float[] vector = new float[5000];
        Map<String, Integer> wordCounts = new HashMap<>();
        // Count word frequencies
        for (String word : text.toLowerCase().split("\\s+")) {
            wordCounts.put(word, wordCounts.getOrDefault(word, 0) + 1);
        }
        // Compute TF-IDF
        for (Map.Entry<String, Integer> entry : wordCounts.entrySet()) {
            Integer idx = vocab.get(entry.getKey());
            if (idx != null) {
                vector[idx] = entry.getValue() * idf.get(idx);
            }
        }
        // Scale features
        for (int i = 0; i < 5000; i++) {
            vector[i] = (vector[i] - mean.get(i)) / scale.get(i);
        }
        return vector;
    }

    public float predict(String text) throws OrtException {
        float[] inputData = preprocessText(text);
        float[][] inputArray = new float[1][5000];
        inputArray[0] = inputData;
        OnnxTensor inputTensor = OnnxTensor.createTensor(OrtEnvironment.getEnvironment(), inputArray);
        String inputName = session.getInputNames().iterator().next();
        OrtSession.Result result = session.run(Collections.singletonMap(inputName, inputTensor));
        float[][] outputArray = (float[][]) result.get(0).getValue();
        return outputArray[0][0];
    }

    public static void main(String[] args) {
        try {
            BinaryClassifier classifier = new BinaryClassifier(
                "src/main/resources/model.onnx",
                "src/main/resources/model_vocab.json",
                "src/main/resources/model_scaler.json"
            );
            String text = "This is a positive test string";
            float probability = classifier.predict(text);
            System.out.printf("Java ONNX output: Probability = %.4f%n", probability);
            System.out.println("Classification: " + (probability > 0.5 ? "Positive" : "Negative"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
✅ 6. Build and Run
🔷 Build the Project
mvn clean package
🟩 Run the Classifier
The quickstart jar does not bundle its dependencies, so the simplest way to run is through Maven (or put the dependency jars on the classpath yourself):
mvn exec:java -Dexec.mainClass="com.example.BinaryClassifier"
✅ 7. Expected Output
Java ONNX output: Probability = 0.9123
Classification: Positive
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run a Binary Classifier ONNX Model with Dart
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using Dart, with comprehensive text preprocessing, system monitoring, and performance analysis.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 Dart SDK
The example below is a Flutter app, so you will also need the Flutter SDK (which bundles Dart). To install the standalone Dart SDK:
🪟 Windows
choco install dart-sdk
Or download from: https://dart.dev/get-dart
🍎 macOS
brew tap dart-lang/dart
brew install dart
🐧 Linux
sudo apt-get update
sudo apt-get install apt-transport-https
wget -qO- https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
wget -qO- https://storage.googleapis.com/download.dartlang.org/linux/debian/dart_stable.list | sudo tee /etc/apt/sources.list.d/dart_stable.list
sudo apt-get update
sudo apt-get install dart
Verify installation:
dart --version
✅ 2. Project Setup
🚀 Create Flutter Project
flutter create binary_classifier_dart
cd binary_classifier_dart
✅ 3. Project Structure
binary_classifier_dart/
├── lib/
│   └── main.dart
├── assets/
│   ├── model.onnx
│   ├── vocab.json
│   └── scaler.json
└── pubspec.yaml
✅ 4. Add Dependencies
🔹 pubspec.yaml
name: binary_classifier_dart
description: "Binary Classifier ONNX Test"
publish_to: 'none'
version: 1.0.0+1

environment:
  sdk: '>=3.0.0 <4.0.0'

dependencies:
  flutter:
    sdk: flutter
  onnxruntime: ^1.4.1
  path_provider: ^2.1.2
  image_picker: ^1.0.7
  image: ^4.1.7
  collection: ^1.18.0
  http: ^1.2.0
  shared_preferences: ^2.2.2
  fl_chart: ^0.60.0
  cupertino_icons: ^1.0.8

dev_dependencies:
  flutter_test:
    sdk: flutter
  flutter_lints: ^4.0.0

flutter:
  uses-material-design: true
  assets:
    - assets/model.onnx
    - assets/vocab.json
    - assets/scaler.json
✅ 5. Get Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files into the assets/ directory: model.onnx (the ONNX model file), vocab.json (TF-IDF vocabulary), scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 6. Dart Code Example
Create lib/main.dart:
import 'dart:convert';
import 'dart:typed_data';
import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:onnxruntime/onnxruntime.dart';

class BinaryClassifier {
  late OrtSession _session;
  late Map<String, int> _vocab;
  late List<double> _idf;
  late List<double> _mean;
  late List<double> _scale;

  Future<void> initialize() async {
    // Load model
    final modelBytes = await rootBundle.load('assets/model.onnx');
    final sessionOptions = OrtSessionOptions();
    _session = OrtSession.fromBuffer(modelBytes.buffer.asUint8List(), sessionOptions);

    // Load vocabulary
    final vocabString = await rootBundle.loadString('assets/vocab.json');
    final vocabData = json.decode(vocabString);
    _vocab = Map<String, int>.from(vocabData['vocab']);
    _idf = (vocabData['idf'] as List).map((e) => (e as num).toDouble()).toList();

    // Load scaler
    final scalerString = await rootBundle.loadString('assets/scaler.json');
    final scalerData = json.decode(scalerString);
    _mean = (scalerData['mean'] as List).map((e) => (e as num).toDouble()).toList();
    _scale = (scalerData['scale'] as List).map((e) => (e as num).toDouble()).toList();
  }

  Float32List _preprocessText(String text) {
    // Term frequency over the vocabulary
    final tf = List<double>.filled(_vocab.length, 0.0);
    final words = text.toLowerCase().split(RegExp(r'\s+'));
    for (final word in words) {
      final idx = _vocab[word];
      if (idx != null) {
        tf[idx] += 1.0;
      }
    }
    final tfSum = tf.reduce((a, b) => a + b);
    if (tfSum > 0) {
      for (int i = 0; i < tf.length; i++) {
        tf[i] = tf[i] / tfSum;
      }
    }
    // TF-IDF, then standardize with the stored mean/scale
    final tfidf = List<double>.generate(tf.length, (i) => tf[i] * _idf[i]);
    final tfidfScaled = List<double>.generate(tfidf.length, (i) => (tfidf[i] - _mean[i]) / _scale[i]);
    return Float32List.fromList(tfidfScaled);
  }

  Future<double> predict(String text) async {
    final inputVector = _preprocessText(text);
    final inputOrt = OrtValueTensor.createTensorWithDataAsFloat32List(
      [1, inputVector.length],
      inputVector
    );
    final inputs = {'input': inputOrt};
    final runOptions = OrtRunOptions();
    final outputs = await _session.runAsync(runOptions, inputs);
    final output = outputs[0]?.value as List<List<double>>;
    return output[0][0];
  }

  void dispose() {
    _session.release();
  }
}

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Binary Classifier',
      theme: ThemeData(primarySwatch: Colors.blue),
      home: BinaryClassifierPage(),
    );
  }
}

class BinaryClassifierPage extends StatefulWidget {
  @override
  _BinaryClassifierPageState createState() => _BinaryClassifierPageState();
}

class _BinaryClassifierPageState extends State<BinaryClassifierPage> {
  final _classifier = BinaryClassifier();
  final _textController = TextEditingController();
  String _result = '';
  bool _isLoading = false;

  @override
  void initState() {
    super.initState();
    _initializeClassifier();
  }

  Future<void> _initializeClassifier() async {
    await _classifier.initialize();
  }

  Future<void> _classifyText() async {
    setState(() {
      _isLoading = true;
    });
    try {
      final probability = await _classifier.predict(_textController.text);
      final sentiment = probability > 0.5 ? 'Positive' : 'Negative';
      setState(() {
        _result = 'Prediction: $sentiment\nProbability: ${probability.toStringAsFixed(4)}';
      });
    } catch (e) {
      setState(() {
        _result = 'Error: $e';
      });
    } finally {
      setState(() {
        _isLoading = false;
      });
    }
  }

  @override
  void dispose() {
    _classifier.dispose();
    _textController.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Binary Classifier')),
      body: Padding(
        padding: EdgeInsets.all(16.0),
        child: Column(
          children: [
            TextField(
              controller: _textController,
              decoration: InputDecoration(
                hintText: 'Enter text to classify',
                border: OutlineInputBorder(),
              ),
              maxLines: 3,
            ),
            SizedBox(height: 16),
            ElevatedButton(
              onPressed: _isLoading ? null : _classifyText,
              child: _isLoading ? CircularProgressIndicator() : Text('Classify'),
            ),
            SizedBox(height: 16),
            Text(_result, style: TextStyle(fontSize: 16)),
          ],
        ),
      ),
    );
  }
}
✅ 7. Run the Application
🔷 Build and Run
Since this is a Flutter app, launch it on a connected device, simulator, or desktop target:
flutter run
Enter text in the app's input field and tap Classify.
✅ 8. Expected Output
🤖 ONNX BINARY CLASSIFIER - DART IMPLEMENTATION
📊 RESULTS:
Predicted Sentiment: Positive
Confidence: 99.98% (0.9998)
(45.0ms total - Target: <100ms)
This sample output comes from the full example in the repository, which adds detailed preprocessing reporting, system information gathering, and performance benchmarking with real-time metrics on top of the preprocessing and inference steps shown above.