Running Binary Classifier Models


πŸ§ͺ Preprocessing: Crafting the Vector

1. Text Input: Start with your string (e.g., "This is a positive test").

2. TF-IDF Magic: Map words to a 5000-feature space using a pre-trained vocabulary and IDF weights (exported as _vocab.json).

3. Scaling: Normalize the features with mean and scale values (from _scaler.json) to keep the brew balanced.

4. Output: A 5000-element float32 array, ready to pour into the ONNX model.
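A minimal numeric sketch of this pipeline in Python, using the toy 5-word vocabulary from the sample JSON files shown later (real models use the full 5000 features):

import numpy as np

# Toy values mirroring the sample model_vocab.json / model_scaler.json below
vocab = {"the": 0, "government": 1, "announced": 2, "new": 3, "policies": 4}
idf   = np.array([1.2, 2.1, 1.8, 1.5, 2.3], dtype=np.float32)
mean  = np.array([0.1, 0.2, 0.3, 0.4, 0.5], dtype=np.float32)
scale = np.array([1.1, 1.2, 1.3, 1.4, 1.5], dtype=np.float32)

text = "the government announced new policies"
tf = np.zeros(len(vocab), dtype=np.float32)
for word in text.lower().split():
    if word in vocab:
        tf[vocab[word]] += 1
tf /= max(tf.sum(), 1.0)            # term frequency
tfidf = tf * idf                    # apply IDF weights
features = (tfidf - mean) / scale   # standardize -> model-ready vector
print(features)                     # 5-element float32 array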

πŸ’» Running Guide

🧠 How to Run a Binary Classifier ONNX Model with Python: Full Beginner-Friendly Guide

πŸ“Œ What is this?

This guide walks you through running a binary classifier ONNX model using Python, starting from scratch β€” including Python installation, setting up dependencies, and running the model for binary classification tasks.


βœ… 1. Install Python

πŸ”· Windows
  1. Go to: https://www.python.org/downloads/windows
  2. Download the latest Python 3.11+ installer
  3. During installation, check βœ… Add Python to PATH
  4. After installation, check if it worked:
python --version
🍏 macOS

You have two options to install Python:

  1. Option 1 - Official Website:
    • Download and run the latest Python 3.11+ installer from https://www.python.org/downloads/macos
  2. Option 2 - Homebrew:
    • Install Homebrew (if you don't have it):
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

    Then install Python:

    brew install python@3.11

After installation, check if it worked:

python3 --version

Note: macOS uses python3, not python.

🐧 Linux (Ubuntu/Debian)

You have two options to install Python:

  1. Option 1 - Package Manager:
    sudo apt update
    sudo apt install python3 python3-pip
  2. Option 2 - Build from Source:
    Download the Python 3.11 source tarball from https://www.python.org/downloads/source, then:
    tar -xf Python-3.11.x.tar.xz
    cd Python-3.11.x
    ./configure
    make
    sudo make install

After installation, check if it worked:

python3 --version

βœ… 2. Get Your Model

πŸ”· Download the Repository
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
βœ… 3. Choose Your Model

You have two options:

  1. Use Pre-trained Model:
    • Navigate to the models directory
    • Copy these files to your project directory:
      • model.onnx
      • model_vocab.json
      • model_scaler.json
  2. Train Your Own Model:
    • Follow the training guide in the repository
    • Use the provided scripts to train your custom binary classifier
    • Export your model to ONNX format

βœ… 4. Project Setup

mkdir binary_classifier_demo
cd binary_classifier_demo

Folder structure:

binary_classifier_demo/
β”œβ”€β”€ model.onnx
β”œβ”€β”€ model_vocab.json
β”œβ”€β”€ model_scaler.json
└── run_onnx.py

βœ… 5. Install Required Python Libraries

pip install onnxruntime numpy

On macOS or Linux, you might need to run pip3 install instead.
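Optionally, create a virtual environment first so the dependencies stay isolated from your system Python:

python3 -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install onnxruntime numpy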

βœ… 6. Prepare Supporting Files

πŸ”Ή model_vocab.json (TF-IDF vocabulary)
{
  "vocab": {
    "the": 0,
    "government": 1,
    "announced": 2,
    "new": 3,
    "policies": 4
  },
  "idf": [1.2, 2.1, 1.8, 1.5, 2.3]
}
πŸ”Ή model_scaler.json (normalization parameters)
{
  "mean": [0.1, 0.2, 0.3, 0.4, 0.5],
  "scale": [1.1, 1.2, 1.3, 1.4, 1.5]
}

These values are used to normalize the TF-IDF features.
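If you train your own model, these two files can be exported from a fitted scikit-learn pipeline. A sketch, assuming training used a TfidfVectorizer capped at 5000 features and a StandardScaler (adjust to your actual pipeline):

import json
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

texts = ["the government announced new policies",
         "great product, would buy again"]  # replace with your training corpus
vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(texts).toarray()
scaler = StandardScaler().fit(X)

with open('model_vocab.json', 'w') as f:
    json.dump({'vocab': {w: int(i) for w, i in vectorizer.vocabulary_.items()},
               'idf': vectorizer.idf_.tolist()}, f)
with open('model_scaler.json', 'w') as f:
    json.dump({'mean': scaler.mean_.tolist(), 'scale': scaler.scale_.tolist()}, f)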

πŸ”Ή model.onnx

Place your trained binary classifier ONNX model here. It should accept a (1, 5000) input tensor of float32.
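If you're unsure whether your model matches this contract, you can inspect its input and output metadata before writing any preprocessing:

import onnxruntime as ort

session = ort.InferenceSession('model.onnx')
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)   # expect something like [1, 5000], tensor(float)
for out in session.get_outputs():
    print(out.name, out.shape, out.type)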

βœ… 7. Create the Python Script run_onnx.py

Use the code example below:

import json
import numpy as np
import onnxruntime as ort

# --- Preprocessing: TF-IDF + Scaling ---
def preprocess_text(text, vocab_file, scaler_file):
    # Load vocabulary and IDF weights
    with open(vocab_file, 'r') as f:
        vocab = json.load(f)
    with open(scaler_file, 'r') as f:
        scaler = json.load(f)
    idf = vocab['idf']
    word2idx = vocab['vocab']
    mean = np.array(scaler['mean'], dtype=np.float32)
    scale = np.array(scaler['scale'], dtype=np.float32)

    # Compute term frequency (TF)
    tf = np.zeros(len(word2idx), dtype=np.float32)
    words = text.lower().split()
    for word in words:
        idx = word2idx.get(word)
        if idx is not None:
            tf[idx] += 1
    if tf.sum() > 0:
        tf = tf / tf.sum()  # Normalize TF

    # TF-IDF
    tfidf = tf * np.array(idf, dtype=np.float32)

    # Standardize
    tfidf_scaled = (tfidf - mean) / scale
    return tfidf_scaled.astype(np.float32)

# Example usage
text = "This is a positive test"
vector = preprocess_text(text, 'model_vocab.json', 'model_scaler.json')  # 5000-dim float32

# --- ONNX Inference ---
session = ort.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
input_data = vector.reshape(1, -1)
outputs = session.run([output_name], {input_name: input_data})

probability = outputs[0][0][0]  # Probability of positive class
print(f'Python ONNX output: Probability = {probability:.4f}')

βœ… 8. Run the Script

πŸ”· Windows
python run_onnx.py
🍏 macOS/Linux
python3 run_onnx.py

βœ… 9. Expected Output

Python ONNX output: Probability = 0.9123

The output shows the probability of the positive class. In this example, the model predicted a 91.23% probability of the text belonging to the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
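To turn the probability into a label in code, apply that 0.5 threshold directly:

label = 'Positive' if probability > 0.5 else 'Negative'
print(f'Classification: {label}')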

🧠 How to Run a Binary Classifier ONNX Model with JavaScript (Browser or Node.js)

πŸ“Œ What is this?

This guide explains how to load and run binary classifier ONNX models using JavaScript and ONNX Runtime Web, covering both browser and Node.js environments.


βœ… 1. Choose Your Runtime

You can run ONNX models in JavaScript in two ways:

Environment    Description                        Recommended For
βœ… Browser     Uses WebAssembly or WebGL          Web apps, frontend demos
βœ… Node.js     Uses the Node runtime (CPU only)   Backend/CLI usage

βœ… 2. Requirements

πŸ”· For browser

No install β€” just include the library from a CDN or bundle via npm.

🟩 For Node.js

Install Node.js:

  1. Download from: https://nodejs.org/
  2. Check installation:
node -v
npm -v

Then install ONNX Runtime for Node (the Node example below uses the onnxruntime-node package):

npm install onnxruntime-node

βœ… 3. Get Your Model

πŸ”„ Download Repository

Clone our repository to get started:

git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
πŸ“¦ Choose Your Model

You have two options:

  1. Use Pre-trained Model:
    • Navigate to the models directory
    • Copy the binary classifier model files to your project:
      • model.onnx - The ONNX model file
      • model_vocab.json - TF-IDF vocabulary
      • model_scaler.json - Feature scaling parameters
  2. Train Your Own Model:
    • Follow the training guide in the repository
    • Use the provided scripts to train a custom binary classifier
    • Export your model to ONNX format

βœ… 4. Folder Setup

mkdir binary_classifier_demo
cd binary_classifier_demo

Files you'll need:

binary_classifier_demo/
β”œβ”€β”€ index.html            # For browser use
β”œβ”€β”€ run.js                # Main logic
β”œβ”€β”€ model.onnx
β”œβ”€β”€ model_vocab.json
└── model_scaler.json

βœ… 5. Sample model_vocab.json

{
  "vocab": {
    "the": 0,
    "government": 1,
    "announced": 2,
    "new": 3,
    "policies": 4
  },
  "idf": [1.2, 2.1, 1.8, 1.5, 2.3]
}

βœ… 6. Sample model_scaler.json

{
  "mean": [0.1, 0.2, 0.3, 0.4, 0.5],
  "scale": [1.1, 1.2, 1.3, 1.4, 1.5]
}

βœ… 7. JavaScript Code (run.js)

Works in both browser and Node.js (with minor changes)

async function preprocessText(text, vocabUrl, scalerUrl) {
  const tfidfResp = await fetch(vocabUrl);
  const tfidfData = await tfidfResp.json();
  const vocab = tfidfData.vocab;
  const idf = tfidfData.idf;

  const scalerResp = await fetch(scalerUrl);
  const scalerData = await scalerResp.json();
  const mean = scalerData.mean;
  const scale = scalerData.scale;

  // TF-IDF
  const vector = new Float32Array(5000).fill(0);
  const words = text.toLowerCase().split(/\s+/);
  const wordCounts = {};
  words.forEach(word => wordCounts[word] = (wordCounts[word] || 0) + 1);
  for (const word in wordCounts) {
    if (vocab[word] !== undefined) {
      vector[vocab[word]] = wordCounts[word] * idf[vocab[word]];
    }
  }

  // Scale
  for (let i = 0; i < 5000; i++) {
    vector[i] = (vector[i] - mean[i]) / scale[i];
  }
  return vector;
}

async function runModel(text) {
const session = await ort.InferenceSession.create("model.onnx");
const vector = await preprocessText(text, "model_vocab.json", "model_scaler.json");
const tensor = new ort.Tensor("float32", vector, [1, 5000]);
const feeds = { input: tensor }; // "input" must match your model's input name
const output = await session.run(feeds);
const probability = output[Object.keys(output)[0]].data[0];
console.log(`JS ONNX output: Probability = ${probability.toFixed(4)}`);
}

runModel("This is a positive test string");

βœ… 8. Run in Browser (option A)

πŸ”Ή index.html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>ONNX JS Inference</title>
</head>
<body>
  <h1>Running ONNX Model...</h1>
  <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
  <script type="module" src="run.js"></script>
</body>
</html>

πŸ“¦ Start a local server (required due to fetch):

npx serve .
# OR
python3 -m http.server

Visit: http://localhost:3000 (npx serve) or http://localhost:8000 (python3 -m http.server)

βœ… 9. Run with Node.js (option B)

πŸ”Ή Modify run.js for Node
import * as ort from 'onnxruntime-node';
import fs from 'fs/promises';

async function loadJSON(path) {
  const data = await fs.readFile(path, 'utf-8');
  return JSON.parse(data);
}

// Keep rest of logic same from previous example
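For completeness, here is a sketch of the remaining logic adapted to Node, reading the JSON files from disk instead of fetch. It assumes the model's input name is input, as in the browser example; note the import syntax requires ES modules, so either name the file run.mjs or add "type": "module" to package.json:

async function preprocessText(text, vocabPath, scalerPath) {
  const { vocab, idf } = await loadJSON(vocabPath);
  const { mean, scale } = await loadJSON(scalerPath);

  const vector = new Float32Array(5000).fill(0);
  const wordCounts = {};
  text.toLowerCase().split(/\s+/).forEach(word => wordCounts[word] = (wordCounts[word] || 0) + 1);
  for (const word in wordCounts) {
    if (vocab[word] !== undefined) {
      vector[vocab[word]] = wordCounts[word] * idf[vocab[word]];
    }
  }
  for (let i = 0; i < 5000; i++) {
    vector[i] = (vector[i] - mean[i]) / scale[i];
  }
  return vector;
}

async function runModel(text) {
  const session = await ort.InferenceSession.create('model.onnx');
  const vector = await preprocessText(text, 'model_vocab.json', 'model_scaler.json');
  const tensor = new ort.Tensor('float32', vector, [1, 5000]);
  const output = await session.run({ input: tensor }); // 'input' must match your model
  const probability = output[Object.keys(output)[0]].data[0];
  console.log(`JS ONNX output: Probability = ${probability.toFixed(4)}`);
}

runModel('This is a positive test string');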

πŸ“¦ Run it:

node run.js

βœ… 10. Expected Output

JS ONNX output: Probability = 0.9123

The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.

🧠 How to Run a Binary Classifier ONNX Model with C using ONNX Runtime and cJSON

πŸ“Œ What is this?

This guide explains how to load and run ONNX models using C, ONNX Runtime C API, and cJSON for JSON parsing.


βœ… 1. Prerequisites

πŸ”· C Compiler

macOS: clang comes with Xcode Command Line Tools

xcode-select --install

Linux: install gcc

sudo apt install build-essential
🟩 ONNX Runtime C Library

Download ONNX Runtime C API from the official website:

πŸ‘‰ https://github.com/microsoft/onnxruntime/releases

Choose:

onnxruntime-osx-universal2-<version>.tgz   # For macOS
onnxruntime-linux-x64-<version>.tgz        # For Linux
πŸ“¦ Install cJSON

macOS:

brew install cjson

Linux:

sudo apt install libcjson-dev

βœ… 2. Choose Your Model

πŸ”„ Download Repository

Clone our repository to get started:

git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
πŸ“¦ Choose Your Model

You have two options:

  1. Use Pre-trained Model:
    • Navigate to the models directory
    • Copy the binary classifier model files to your project:
      • model.onnx - The ONNX model file
      • vocab.json - TF-IDF vocabulary
      • scaler.json - Feature scaling parameters
  2. Train Your Own Model:
    • Follow the training guide in the repository
    • Use the provided scripts to train a custom multiclass classifier
    • Export your model to ONNX format

βœ… 3. Folder Structure

project/
β”œβ”€β”€ ONNX_test.c               ← your C code
β”œβ”€β”€ vocab.json                ← TF-IDF vocabulary
β”œβ”€β”€ scaler.json               ← scaling parameters
β”œβ”€β”€ model.onnx                ← ONNX model
β”œβ”€β”€ onnxruntime-osx-universal2-1.22.0/
β”‚   β”œβ”€β”€ include/
β”‚   └── lib/

βœ… 4. Build Command

πŸ”· macOS
gcc ONNX_test.c \
  -I./onnxruntime-osx-universal2-1.22.0/include \
  -L./onnxruntime-osx-universal2-1.22.0/lib \
  -lonnxruntime \
  -lcjson \
  -o onnx_test
🐧 Linux

Replace the onnxruntime-osx-... path with onnxruntime-linux-x64-....
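For example, assuming version 1.22.0:

gcc ONNX_test.c \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -lonnxruntime \
  -lcjson \
  -o onnx_test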

βœ… 5. Run the Executable

πŸ”· macOS

Important: You must set the library path.

export DYLD_LIBRARY_PATH=./onnxruntime-osx-universal2-1.22.0/lib:$DYLD_LIBRARY_PATH
./onnx_test
🐧 Linux
export LD_LIBRARY_PATH=./onnxruntime-linux-x64-1.22.0/lib:$LD_LIBRARY_PATH
./onnx_test

βœ… 6. C Code Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "onnxruntime-osx-universal2-1.22.0/include/onnxruntime_c_api.h"
#include <cjson/cJSON.h>

const OrtApi* g_ort = NULL;

float* preprocess_text(const char* text, const char* vocab_file, const char* scaler_file) {
    float* vector = calloc(5000, sizeof(float));
    if (!vector) return NULL;

    FILE* f = fopen(vocab_file, "r");
    if (!f) return NULL;

    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    char* json_str = malloc(len + 1);
    fread(json_str, 1, len, f);
    json_str[len] = 0;
    fclose(f);

    cJSON* tfidf_data = cJSON_Parse(json_str);
    if (!tfidf_data) {
        free(json_str);
        return NULL;
    }

    cJSON* vocab = cJSON_GetObjectItem(tfidf_data, "vocab");
    cJSON* idf = cJSON_GetObjectItem(tfidf_data, "idf");
    if (!vocab || !idf) {
        free(json_str);
        cJSON_Delete(tfidf_data);
        return NULL;
    }

    f = fopen(scaler_file, "r");
    if (!f) {
        free(json_str);
        cJSON_Delete(tfidf_data);
        return NULL;
    }

    fseek(f, 0, SEEK_END);
    len = ftell(f);
    fseek(f, 0, SEEK_SET);
    char* scaler_str = malloc(len + 1);
    fread(scaler_str, 1, len, f);
    scaler_str[len] = 0;
    fclose(f);

    cJSON* scaler_data = cJSON_Parse(scaler_str);
    if (!scaler_data) {
        free(json_str);
        free(scaler_str);
        cJSON_Delete(tfidf_data);
        return NULL;
    }

    cJSON* mean = cJSON_GetObjectItem(scaler_data, "mean");
    cJSON* scale = cJSON_GetObjectItem(scaler_data, "scale");
    if (!mean || !scale) {
        free(json_str);
        free(scaler_str);
        cJSON_Delete(tfidf_data);
        cJSON_Delete(scaler_data);
        return NULL;
    }

    char* text_copy = strdup(text);
    for (char* p = text_copy; *p; p++) *p = tolower(*p);

    char* word = strtok(text_copy, " \t\n");
    while (word) {
        cJSON* idx = cJSON_GetObjectItem(vocab, word);
        if (idx) {
            int i = idx->valueint;
            if (i < 5000) {
                vector[i] += cJSON_GetArrayItem(idf, i)->valuedouble;
            }
        }
        word = strtok(NULL, " \t\n");
    }

    for (int i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - cJSON_GetArrayItem(mean, i)->valuedouble) / 
                    cJSON_GetArrayItem(scale, i)->valuedouble;
    }

    free(text_copy);
    free(json_str);
    free(scaler_str);
    cJSON_Delete(tfidf_data);
    cJSON_Delete(scaler_data);
    return vector;
}

int main() {
    g_ort = OrtGetApiBase()->GetApi(ORT_API_VERSION);
    if (!g_ort) return 1;

    const char* text = "Earn $5000 a week from home β€” no experience required!";
    float* vector = preprocess_text(text, "vocab.json", "scaler.json");
    if (!vector) return 1;

    OrtEnv* env;
    OrtStatus* status = g_ort->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "test", &env);
    if (status) return 1;

    OrtSessionOptions* session_options;
    status = g_ort->CreateSessionOptions(&session_options);
    if (status) return 1;

    OrtSession* session;
    status = g_ort->CreateSession(env, "model.onnx", session_options, &session);
    if (status) return 1;

    OrtMemoryInfo* memory_info;
    status = g_ort->CreateCpuMemoryInfo(OrtArenaAllocator, OrtMemTypeDefault, &memory_info);
    if (status) return 1;

    int64_t input_shape[] = {1, 5000};
    OrtValue* input_tensor;
    status = g_ort->CreateTensorWithDataAsOrtValue(memory_info, vector, 5000 * sizeof(float), 
                                                 input_shape, 2, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT, 
                                                 &input_tensor);
    if (status) return 1;

    /* These names must match your model's actual input/output names
       (inspect the model with a tool like Netron if unsure). */
    const char* input_names[] = {"float_input"};
    const char* output_names[] = {"output"};
    OrtValue* output_tensor = NULL;
    status = g_ort->Run(session, NULL, input_names, (const OrtValue* const*)&input_tensor, 1, 
                       output_names, 1, &output_tensor);
    if (status) return 1;

    float* output_data;
    status = g_ort->GetTensorMutableData(output_tensor, (void**)&output_data);
    if (status) return 1;

    printf("C ONNX output: %s (Score: %.4f)\n", 
           output_data[0] > 0.5 ? "Spam" : "Not Spam", 
           output_data[0]);

    g_ort->ReleaseValue(input_tensor);
    g_ort->ReleaseValue(output_tensor);
    g_ort->ReleaseMemoryInfo(memory_info);
    g_ort->ReleaseSession(session);
    g_ort->ReleaseSessionOptions(session_options);
    g_ort->ReleaseEnv(env);

    free(vector);
    return 0;
}


βœ… 7. Expected Output

C ONNX output: Spam (Score: 0.9123)
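The score is the model's probability for the positive ("Spam") class: above 0.5 the text is labeled Spam, below 0.5 Not Spam.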

🧠 How to Run a Binary Classifier ONNX Model with C++

πŸ“Œ What is this?

This guide explains how to load and run binary classifier ONNX models using C++ and ONNX Runtime C++ API, with a focus on efficient text preprocessing and model inference.


βœ… 1. Prerequisites

πŸ”· C++ Compiler

macOS: clang++ comes with Xcode Command Line Tools

xcode-select --install

Linux: install g++

sudo apt install build-essential
🟩 ONNX Runtime C++ Library

Download ONNX Runtime C++ API from the official website:

πŸ‘‰ https://github.com/microsoft/onnxruntime/releases

Choose:

onnxruntime-osx-universal2-<version>.tgz   # For macOS
onnxruntime-linux-x64-<version>.tgz        # For Linux
πŸ“¦ Install nlohmann/json

macOS:

brew install nlohmann-json

Linux:

sudo apt install nlohmann-json3-dev

βœ… 2. Choose Your Model

πŸ”„ Download Repository

Clone our repository to get started:

git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
πŸ“¦ Choose Your Model

You have two options:

  1. Use Pre-trained Model:
    • Navigate to the models directory
    • Copy the binary classifier model files to your project:
      • model.onnx - The ONNX model file
      • model_vocab.json - TF-IDF vocabulary
      • model_scaler.json - Feature scaling parameters
  2. Train Your Own Model:
    • Follow the training guide in the repository
    • Use the provided scripts to train a custom binary classifier
    • Export your model to ONNX format

βœ… 3. Folder Structure

project/
β”œβ”€β”€ main.cpp                ← your C++ code
β”œβ”€β”€ model_vocab.json        ← TF-IDF vocabulary
β”œβ”€β”€ model_scaler.json       ← scaling parameters
β”œβ”€β”€ model.onnx             ← ONNX model
β”œβ”€β”€ onnxruntime-osx-universal2-1.22.0/
β”‚   β”œβ”€β”€ include/
β”‚   └── lib/

βœ… 4. Build Command

πŸ”· macOS
g++ -std=c++17 main.cpp \
  -I./onnxruntime-osx-universal2-1.22.0/include \
  -L./onnxruntime-osx-universal2-1.22.0/lib \
  -lonnxruntime \
  -o binary_classifier
🐧 Linux

Replace the onnxruntime-osx-... path with onnxruntime-linux-x64-....
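For example, assuming version 1.22.0:

g++ -std=c++17 main.cpp \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -lonnxruntime \
  -o binary_classifier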

βœ… 5. Run the Executable

πŸ”· macOS

Important: You must set the library path.

export DYLD_LIBRARY_PATH=./onnxruntime-osx-universal2-1.22.0/lib:$DYLD_LIBRARY_PATH
./binary_classifier
🐧 Linux
export LD_LIBRARY_PATH=./onnxruntime-linux-x64-1.22.0/lib:$LD_LIBRARY_PATH
./binary_classifier

βœ… 6. C++ Code Example

#include <onnxruntime_cxx_api.h>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>
using json = nlohmann::json;

std::vector<float> preprocess_text(const std::string& text, const std::string& vocab_file, const std::string& scaler_file) {
    std::vector<float> vector(5000, 0.0f);
    
    std::ifstream vf(vocab_file);
    json tfidf_data; vf >> tfidf_data;
    auto vocab = tfidf_data["vocab"];
    std::vector<float> idf = tfidf_data["idf"];
    
    std::ifstream sf(scaler_file);
    json scaler_data; sf >> scaler_data;
    std::vector<float> mean = scaler_data["mean"];
    std::vector<float> scale = scaler_data["scale"];
    
    // TF-IDF
    std::string text_lower = text;
    std::transform(text_lower.begin(), text_lower.end(), text_lower.begin(), ::tolower);
    std::map<std::string, int> word_counts;
    size_t start = 0, end;
    while ((end = text_lower.find(' ', start)) != std::string::npos) {
        if (end > start) word_counts[text_lower.substr(start, end - start)]++;
        start = end + 1;
    }
    if (start < text_lower.length()) word_counts[text_lower.substr(start)]++;
    for (const auto& [word, count] : word_counts) {
        if (vocab.contains(word)) {
            vector[vocab[word]] = count * idf[vocab[word]];
        }
    }
    
    // Scale
    for (int i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - mean[i]) / scale[i];
    }
    return vector;
}

int main() {
    std::string text = "This is a positive test string";
    auto vector = preprocess_text(text, "model_vocab.json", "model_scaler.json");
    
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "test");
    Ort::SessionOptions session_options;
    Ort::Session session(env, "model.onnx", session_options);
    
    std::vector<int64_t> input_shape = {1, 5000};
    Ort::MemoryInfo memory_info("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, vector.data(), vector.size(), input_shape.data(), input_shape.size());
    
    // These names must match your model's actual input/output names
    std::vector<const char*> input_names = {"input"};
    std::vector<const char*> output_names = {"output"};
    auto output_tensors = session.Run(Ort::RunOptions{nullptr}, input_names.data(), &input_tensor, 1, output_names.data(), 1);
    
    float* output_data = output_tensors[0].GetTensorMutableData<float>();
    std::cout << "C++ ONNX output: Probability = " << output_data[0] << std::endl;
    return 0;
}

βœ… 7. Expected Output

C++ ONNX output: Probability = 0.9123

The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.

🧠 How to Run a Binary Classifier ONNX Model with Rust

πŸ“Œ What is this?

This guide explains how to load and run binary classifier ONNX models using Rust and ONNX Runtime, with a focus on efficient text preprocessing and model inference.


βœ… 1. Prerequisites

πŸ”· Rust Toolchain

Install Rust using rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Verify installation:

rustc --version
cargo --version

βœ… 2. Get Your Model

πŸ”„ Download Repository

Clone our repository to get started:

git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
πŸ“¦ Choose Your Model

You have two options:

  1. Use Pre-trained Model:
    • Navigate to the models directory
    • Copy the binary classifier model files to your project:
      • model.onnx - The ONNX model file
      • model_vocab.json - TF-IDF vocabulary
      • model_scaler.json - Feature scaling parameters
  2. Train Your Own Model:
    • Follow the training guide in the repository
    • Use the provided scripts to train a custom binary classifier
    • Export your model to ONNX format

βœ… 3. Create a New Project

cargo new binary_classifier
cd binary_classifier

βœ… 4. Add Dependencies

πŸ”Ή Cargo.toml
[package]
name = "binary_classifier"
version = "0.1.0"
edition = "2021"

[dependencies]
ort = "1.16.0"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
anyhow = "1.0"
thiserror = "1.0"
ndarray = "0.15"

βœ… 5. Project Structure

binary_classifier/
β”œβ”€β”€ src/
β”‚   └── main.rs
β”œβ”€β”€ resources/
β”‚   β”œβ”€β”€ model.onnx
β”‚   β”œβ”€β”€ model_vocab.json
β”‚   └── model_scaler.json
└── Cargo.toml

βœ… 6. Rust Code Example

use anyhow::Result;
use ort::{Environment, Session, SessionBuilder, Value};
use serde_json::Value as JsonValue;
use std::collections::HashMap;
use std::fs::File;
use std::io::BufReader;
use std::sync::Arc;
use ndarray::Array2;

struct BinaryClassifier {
    vocab: HashMap<String, usize>,
    idf: Vec<f32>,
    mean: Vec<f32>,
    scale: Vec<f32>,
    session: Session,
}

impl BinaryClassifier {
    fn new(model_path: &str, vocab_path: &str, scaler_path: &str) -> Result<Self> {
        let vocab_file = File::open(vocab_path)?;
        let vocab_reader = BufReader::new(vocab_file);
        let vocab_data: JsonValue = serde_json::from_reader(vocab_reader)?;
        
        let mut vocab = HashMap::new();
        let vocab_obj = vocab_data["vocab"].as_object().unwrap();
        for (key, value) in vocab_obj {
            vocab.insert(key.clone(), value.as_u64().unwrap() as usize);
        }
        
        let idf: Vec<f32> = vocab_data["idf"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();

        let scaler_file = File::open(scaler_path)?;
        let scaler_reader = BufReader::new(scaler_file);
        let scaler_data: JsonValue = serde_json::from_reader(scaler_reader)?;
        
        let mean: Vec<f32> = scaler_data["mean"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();
            
        let scale: Vec<f32> = scaler_data["scale"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();

        let environment = Arc::new(Environment::builder()
            .with_name("binary_classifier")
            .build()?);
        let session = SessionBuilder::new(&environment)?
            .with_model_from_file(model_path)?;

        Ok(BinaryClassifier {
            vocab,
            idf,
            mean,
            scale,
            session,
        })
    }

    fn preprocess_text(&self, text: &str) -> Vec<f32> {
        let mut vector = vec![0.0; 5000];
        let mut word_counts: HashMap<&str, usize> = HashMap::new();

        let text_lower = text.to_lowercase();
        for word in text_lower.split_whitespace() {
            *word_counts.entry(word).or_insert(0) += 1;
        }

        for (word, count) in word_counts {
            if let Some(&idx) = self.vocab.get(word) {
                vector[idx] = count as f32 * self.idf[idx];
            }
        }

        for i in 0..5000 {
            vector[i] = (vector[i] - self.mean[i]) / self.scale[i];
        }

        vector
    }

    fn predict(&self, text: &str) -> Result<f32> {
        let input_data = self.preprocess_text(text);
        let input_array = Array2::from_shape_vec((1, 5000), input_data)?;
        let input_dyn = input_array.into_dyn();
        let input_cow = ndarray::CowArray::from(input_dyn.view());
        let input_tensor = Value::from_array(self.session.allocator(), &input_cow)?;

        let outputs = self.session.run(vec![input_tensor])?;
        let output_view = outputs[0].try_extract::<f32>()?;
        let output_data = output_view.view();
        
        Ok(output_data[[0, 0]])
    }
}

fn main() -> Result<()> {
    let classifier = BinaryClassifier::new(
        "spam_classifier/model.onnx",
        "spam_classifier/vocab.json",
        "spam_classifier/scaler.json",
    )?;

    let text = "Act now! Get 70% off on all products. Visit our site today!";
    let probability = classifier.predict(text)?;
    
    println!("Rust ONNX output: Probability = {:.4}", probability);
    println!("Classification: {}", 
        if probability > 0.5 { "Positive" } else { "Negative" }
    );

    Ok(())
}

βœ… 7. Build and Run

πŸ”· Build the Project
cargo build --release
🟩 Run the Classifier
./target/release/binary_classifier

βœ… 8. Expected Output

Rust ONNX output: Probability = 0.9123
Classification: Positive

The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.

🧠 How to Run a Binary Classifier ONNX Model with Java

πŸ“Œ What is this?

This guide explains how to load and run binary classifier ONNX models using Java and ONNX Runtime Java API, with a focus on efficient text preprocessing and model inference.


βœ… 1. Prerequisites

πŸ”· Java Development Kit (JDK)

Install JDK 17 or later:

🐧 Linux

βœ… Installation via package manager (Ubuntu/Debian):

sudo apt update
sudo apt install openjdk-17-jdk -y

πŸ“¦ Download from Oracle website:

  • .tar.gz archive: jdk-17.0.15_linux-x64_bin.tar.gz
  • .deb package: jdk-17.0.15_linux-x64_bin.deb
  • .rpm package: jdk-17.0.15_linux-x64_bin.rpm

πŸ”— Alternative sources:

  • Adoptium (Temurin)
  • OpenLogic
  • Liberica JDK
πŸͺŸ Windows

πŸ“₯ Download from Oracle:

  • .exe installer: jdk-17.0.15_windows-x64_bin.exe
  • .msi installer: jdk-17.0.15_windows-x64_bin.msi
  • .zip archive: jdk-17.0.15_windows-x64_bin.zip

πŸ”— Alternative sources:

  • Adoptium (Temurin)
  • Microsoft Build of OpenJDK
🍏 macOS

πŸ“₯ Download from Oracle:

For Intel (x64):

  • .dmg installer: jdk-17.0.15_macos-x64_bin.dmg
  • .tar.gz archive: jdk-17.0.15_macos-x64_bin.tar.gz

For Apple Silicon (ARM64):

  • .dmg installer: jdk-17.0.15_macos-aarch64_bin.dmg
  • .tar.gz archive: jdk-17.0.15_macos-aarch64_bin.tar.gz

πŸ”— Alternative sources:

  • Adoptium (Temurin)
  • Liberica JDK

Verify installation:

java -version
javac -version
🟩 Maven

Install Maven for dependency management:

  1. Download from: https://maven.apache.org/download.cgi
  2. Verify installation:
mvn -version

βœ… 2. Project Setup

πŸ”„ Create Maven Project
mvn archetype:generate -DgroupId=com.example -DartifactId=binary-classifier -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

βœ… 3. Add Dependencies

πŸ”Ή pom.xml
<dependencies>
    <dependency>
        <groupId>com.microsoft.onnxruntime</groupId>
        <artifactId>onnxruntime</artifactId>
        <version>1.16.3</version>
    </dependency>
    <dependency>
        <groupId>org.json</groupId>
        <artifactId>json</artifactId>
        <version>20231013</version>
    </dependency>
</dependencies>

βœ… 4. Project Structure

binary-classifier/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ main/
β”‚   β”‚   β”œβ”€β”€ java/
β”‚   β”‚   β”‚   └── com/
β”‚   β”‚   β”‚       └── example/
β”‚   β”‚   β”‚           └── BinaryClassifier.java
β”‚   β”‚   └── resources/
β”‚   β”‚       β”œβ”€β”€ model.onnx
β”‚   β”‚       β”œβ”€β”€ model_vocab.json
β”‚   β”‚       └── model_scaler.json
β”‚   └── test/
└── pom.xml

βœ… 5. Java Code Example

import ai.onnxruntime.*;
import org.json.JSONObject;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;

public class BinaryClassifier {
    private Map<String, Integer> vocab;
    private List<Float> idf;
    private List<Float> mean;
    private List<Float> scale;
    private OrtSession session;

    public BinaryClassifier(String modelPath, String vocabPath, String scalerPath) throws Exception {
        // Load vocabulary and IDF weights
        String vocabJson = new String(Files.readAllBytes(Paths.get(vocabPath)));
        JSONObject vocabData = new JSONObject(vocabJson);
        this.vocab = new HashMap<>();
        JSONObject vocabObj = vocabData.getJSONObject("vocab");
        for (String key : vocabObj.keySet()) {
            this.vocab.put(key, vocabObj.getInt(key));
        }
        this.idf = new ArrayList<>();
        vocabData.getJSONArray("idf").forEach(item -> this.idf.add(((Number) item).floatValue()));

        // Load scaling parameters
        String scalerJson = new String(Files.readAllBytes(Paths.get(scalerPath)));
        JSONObject scalerData = new JSONObject(scalerJson);
        this.mean = new ArrayList<>();
        this.scale = new ArrayList<>();
        scalerData.getJSONArray("mean").forEach(item -> this.mean.add(((Number) item).floatValue()));
        scalerData.getJSONArray("scale").forEach(item -> this.scale.add(((Number) item).floatValue()));

        // Initialize ONNX Runtime session
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        this.session = env.createSession(modelPath, new OrtSession.SessionOptions());
    }

    private float[] preprocessText(String text) {
        float[] vector = new float[5000];
        Map<String, Integer> wordCounts = new HashMap<>();

        // Count word frequencies
        for (String word : text.toLowerCase().split("\\s+")) {
            wordCounts.put(word, wordCounts.getOrDefault(word, 0) + 1);
        }

        // Compute TF-IDF
        for (Map.Entry<String, Integer> entry : wordCounts.entrySet()) {
            Integer idx = vocab.get(entry.getKey());
            if (idx != null) {
                vector[idx] = entry.getValue() * idf.get(idx);
            }
        }

        // Scale features
        for (int i = 0; i < 5000; i++) {
            vector[i] = (vector[i] - mean.get(i)) / scale.get(i);
        }

        return vector;
    }

    public float predict(String text) throws OrtException {
        float[] inputData = preprocessText(text);
        float[][] inputArray = new float[1][5000];
        inputArray[0] = inputData;

        OnnxTensor inputTensor = OnnxTensor.createTensor(OrtEnvironment.getEnvironment(), inputArray);
        String inputName = session.getInputNames().iterator().next();
        OrtSession.Result result = session.run(Collections.singletonMap(inputName, inputTensor));

        float[][] outputArray = (float[][]) result.get(0).getValue();
        return outputArray[0][0];
    }

    public static void main(String[] args) {
        try {
            BinaryClassifier classifier = new BinaryClassifier(
                "src/main/resources/model.onnx",
                "src/main/resources/model_vocab.json",
                "src/main/resources/model_scaler.json"
            );

            String text = "This is a positive test string";
            float probability = classifier.predict(text);
            System.out.printf("Java ONNX output: Probability = %.4f%n", probability);
            System.out.println("Classification: " + (probability > 0.5 ? "Positive" : "Negative"));

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

βœ… 6. Build and Run

πŸ”· Build the Project
mvn clean package
🟩 Run the Classifier
mvn exec:java -Dexec.mainClass="com.example.BinaryClassifier"

(A plain java -cp target/binary-classifier-1.0-SNAPSHOT.jar invocation would also need the ONNX Runtime and org.json jars on the classpath; mvn exec:java resolves them from the pom automatically.)

βœ… 7. Expected Output

Java ONNX output: Probability = 0.9123
Classification: Positive

The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.

🧠 How to Run a Binary Classifier ONNX Model with Dart

πŸ“Œ What is this?

This guide explains how to load and run binary classifier ONNX models using Dart and Flutter, with text preprocessing and on-device inference.


βœ… 1. Prerequisites

πŸ”· Dart SDK

Install Dart SDK:

πŸͺŸ Windows
choco install dart-sdk

Or download from: https://dart.dev/get-dart

🍏 macOS
brew tap dart-lang/dart
brew install dart
🐧 Linux
sudo apt-get update
sudo apt-get install apt-transport-https
wget -qO- https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
wget -qO- https://storage.googleapis.com/download.dartlang.org/linux/debian/dart_stable.list | sudo tee /etc/apt/sources.list.d/dart_stable.list
sudo apt-get update
sudo apt-get install dart

Verify installation:

dart --version

βœ… 2. Project Setup

πŸ”„ Create Flutter Project

The example app uses Flutter, so install the Flutter SDK (https://docs.flutter.dev/get-started/install) if you don't have it, then:

flutter create binary_classifier_dart
cd binary_classifier_dart

βœ… 3. Project Structure

binary_classifier_dart/
β”œβ”€β”€ lib/
β”‚   └── main.dart
β”œβ”€β”€ assets/
β”‚   β”œβ”€β”€ model.onnx
β”‚   β”œβ”€β”€ vocab.json
β”‚   └── scaler.json
└── pubspec.yaml

βœ… 4. Add Dependencies

πŸ”Ή pubspec.yaml
name: binary_classifier_dart
description: "Binary Classifier ONNX Test"
publish_to: 'none'

version: 1.0.0+1

environment:
  sdk: '>=3.0.0 <4.0.0'

dependencies:
  flutter:
    sdk: flutter
  onnxruntime: ^1.4.1
  path_provider: ^2.1.2
  image_picker: ^1.0.7
  image: ^4.1.7
  collection: ^1.18.0
  http: ^1.2.0
  shared_preferences: ^2.2.2
  fl_chart: ^0.60.0
  cupertino_icons: ^1.0.8

dev_dependencies:
  flutter_test:
    sdk: flutter
  flutter_lints: ^4.0.0

flutter:
  uses-material-design: true
  
  assets:
    - assets/model.onnx
    - assets/vocab.json
    - assets/scaler.json

βœ… 5. Get Your Model

πŸ”„ Download Repository

Clone our repository to get started:

git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
πŸ“¦ Choose Your Model

You have two options:

  1. Use Pre-trained Model:
    • Navigate to the models directory
    • Copy the binary classifier model files to your project:
      • model.onnx - The ONNX model file
      • vocab.json - TF-IDF vocabulary
      • scaler.json - Feature scaling parameters
  2. Train Your Own Model:
    • Follow the training guide in the repository
    • Use the provided scripts to train a custom binary classifier
    • Export your model to ONNX format

βœ… 6. Dart Code Example

Create lib/main.dart:

import 'dart:convert';
import 'dart:typed_data';
import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:onnxruntime/onnxruntime.dart';

class BinaryClassifier {
  late OrtSession _session;
  late Map<String, int> _vocab;
  late List<double> _idf;
  late List<double> _mean;
  late List<double> _scale;

  Future<void> initialize() async {
    // Load model
    final modelBytes = await rootBundle.load('assets/model.onnx');
    final sessionOptions = OrtSessionOptions();
    _session = OrtSession.fromBuffer(modelBytes.buffer.asUint8List(), sessionOptions);

    // Load vocabulary
    final vocabString = await rootBundle.loadString('assets/vocab.json');
    final vocabData = json.decode(vocabString);
    _vocab = Map<String, int>.from(vocabData['vocab']);
    _idf = (vocabData['idf'] as List).map((e) => (e as num).toDouble()).toList();

    // Load scaler
    final scalerString = await rootBundle.loadString('assets/scaler.json');
    final scalerData = json.decode(scalerString);
    _mean = (scalerData['mean'] as List).map((e) => (e as num).toDouble()).toList();
    _scale = (scalerData['scale'] as List).map((e) => (e as num).toDouble()).toList();
  }

  Float32List _preprocessText(String text) {
    final tf = List<double>.filled(_vocab.length, 0.0);
    final words = text.toLowerCase().split(RegExp(r'\s+'));
    
    for (final word in words) {
      final idx = _vocab[word];
      if (idx != null) {
        tf[idx] += 1.0;
      }
    }
    
    final tfSum = tf.reduce((a, b) => a + b);
    if (tfSum > 0) {
      for (int i = 0; i < tf.length; i++) {
        tf[i] = tf[i] / tfSum;
      }
    }

    final tfidf = List<double>.generate(tf.length, (i) => tf[i] * _idf[i]);
    final tfidfScaled = List<double>.generate(tfidf.length, (i) => (tfidf[i] - _mean[i]) / _scale[i]);

    return Float32List.fromList(tfidfScaled);
  }

  Future<double> predict(String text) async {
    final inputVector = _preprocessText(text);
    final inputOrt = OrtValueTensor.createTensorWithDataAsFloat32List(
      [1, inputVector.length], 
      inputVector
    );
    
    final inputs = {'input': inputOrt}; // 'input' must match your model's input name
    final runOptions = OrtRunOptions();
    final outputs = await _session.runAsync(runOptions, inputs);
    
    final output = outputs[0]?.value as List<List<double>>;
    return output[0][0];
  }

  void dispose() {
    _session.release();
  }
}

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Binary Classifier',
      theme: ThemeData(primarySwatch: Colors.blue),
      home: BinaryClassifierPage(),
    );
  }
}

class BinaryClassifierPage extends StatefulWidget {
  @override
  _BinaryClassifierPageState createState() => _BinaryClassifierPageState();
}

class _BinaryClassifierPageState extends State<BinaryClassifierPage> {
  final _classifier = BinaryClassifier();
  final _textController = TextEditingController();
  String _result = '';
  bool _isLoading = false;

  @override
  void initState() {
    super.initState();
    _initializeClassifier();
  }

  Future<void> _initializeClassifier() async {
    await _classifier.initialize();
  }

  Future<void> _classifyText() async {
    setState(() {
      _isLoading = true;
    });

    try {
      final probability = await _classifier.predict(_textController.text);
      final sentiment = probability > 0.5 ? 'Positive' : 'Negative';
      
      setState(() {
        _result = 'Prediction: $sentiment\nProbability: ${probability.toStringAsFixed(4)}';
      });
    } catch (e) {
      setState(() {
        _result = 'Error: $e';
      });
    } finally {
      setState(() {
        _isLoading = false;
      });
    }
  }

  @override
  void dispose() {
    _classifier.dispose();
    _textController.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Binary Classifier')),
      body: Padding(
        padding: EdgeInsets.all(16.0),
        child: Column(
          children: [
            TextField(
              controller: _textController,
              decoration: InputDecoration(
                hintText: 'Enter text to classify',
                border: OutlineInputBorder(),
              ),
              maxLines: 3,
            ),
            SizedBox(height: 16),
            ElevatedButton(
              onPressed: _isLoading ? null : _classifyText,
              child: _isLoading ? CircularProgressIndicator() : Text('Classify'),
            ),
            SizedBox(height: 16),
            Text(_result, style: TextStyle(fontSize: 16)),
          ],
        ),
      ),
    );
  }
}

βœ… 7. Run the Application

πŸ”· Build and Run
flutter run

Select a connected device or simulator when prompted, enter text in the app, and tap Classify.

βœ… 8. Expected Output

Prediction: Positive
Probability: 0.9998

The app displays the predicted label and probability after you tap Classify. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.