Running Binary Classifier Models
🧪 Preprocessing: Crafting the Vector
Text Input
Start with your string (e.g., "This is a positive test").
TF-IDF Magic
Map words to a 5000-feature space using a pre-trained vocabulary and IDF weights (exported as model_vocab.json).
Scaling
Normalize the features with mean and scale values (from model_scaler.json) to keep the brew balanced.
Output
A 5000-element float32 array, ready to pour into the ONNX model.
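As a quick reference, here is the whole pipeline as a minimal NumPy sketch. It assumes the JSON layouts shown in the guides below (a "vocab" term-to-index map with matching "idf" weights, and per-feature "mean"/"scale" arrays); the fully commented version appears in the Python guide.

import json
import numpy as np

def text_to_vector(text):
    # Load vocabulary/IDF weights and scaling parameters
    with open('model_vocab.json') as f:
        vocab_data = json.load(f)
    with open('model_scaler.json') as f:
        scaler = json.load(f)
    # Term frequency over the 5000-term vocabulary
    tf = np.zeros(len(vocab_data['vocab']), dtype=np.float32)
    for word in text.lower().split():
        idx = vocab_data['vocab'].get(word)
        if idx is not None:
            tf[idx] += 1.0
    if tf.sum() > 0:
        tf /= tf.sum()
    # TF-IDF, then standardize with the stored mean/scale
    tfidf = tf * np.asarray(vocab_data['idf'], dtype=np.float32)
    mean = np.asarray(scaler['mean'], dtype=np.float32)
    scale = np.asarray(scaler['scale'], dtype=np.float32)
    return ((tfidf - mean) / scale).astype(np.float32)  # shape (5000,)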
💻 Running Guide
🧠 How to Run a Binary Classifier ONNX Model with Python: Full Beginner-Friendly Guide
📘 What is this?
This guide walks you through running a binary classifier ONNX model using Python, starting from scratch: Python installation, setting up dependencies, and running the model for binary classification tasks.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Install Python
🔷 Windows
- Go to: https://www.python.org/downloads/windows
- Download the latest Python 3.11+ installer
- During installation, check ✅ Add Python to PATH
- After installation, check if it worked:
python --version
🍎 macOS
You have two options to install Python:
- Option 1 - Official Website:
- Visit: https://www.python.org/downloads/macos/
- Download the latest Python 3.11+ installer for macOS
- Run the installer package and follow the installation wizard
- Option 2 - Homebrew:
- Install Homebrew (if you don't have it):
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Then install Python:
brew install python@3.11
After installation, check if it worked:
python3 --version
Note: macOS uses python3, not python.
🐧 Linux (Ubuntu/Debian)
You have two options to install Python:
- Option 1 - Package Manager:
sudo apt update
sudo apt install python3 python3-pip
- Option 2 - Official Website:
- Visit: https://www.python.org/downloads/source/
- Download the latest Python 3.11+ source tarball
- Extract and build from source:
tar -xf Python-3.11.x.tar.xz
cd Python-3.11.x
./configure
make
sudo make install
After installation, check if it worked:
python3 --version
✅ 2. Get Your Model
🔷 Download the Repository
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
✅ 3. Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy model.onnx, model_vocab.json, and model_scaler.json into your project directory
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train your custom binary classifier
  - Export your model to ONNX format
✅ 4. Project Setup
mkdir binary_classifier_demo
cd binary_classifier_demo
Folder structure:
binary_classifier_demo/
├── model.onnx
├── model_vocab.json
├── model_scaler.json
└── run_onnx.py
✅ 5. Install Required Python Libraries
pip install onnxruntime numpy
On macOS or Linux, you might need to run pip3 install instead.
✅ 6. Prepare Supporting Files
🔹 model_vocab.json (TF-IDF vocabulary; a truncated sample, the real file covers all 5000 terms)
{
  "vocab": {
    "the": 0,
    "government": 1,
    "announced": 2,
    "new": 3,
    "policies": 4
  },
  "idf": [1.2, 2.1, 1.8, 1.5, 2.3]
}
🔹 model_scaler.json (normalization parameters; likewise truncated to five features here)
{
  "mean": [0.1, 0.2, 0.3, 0.4, 0.5],
  "scale": [1.1, 1.2, 1.3, 1.4, 1.5]
}
These values are used to normalize the TF-IDF features.
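To make the arithmetic concrete, here is what the sample values above produce for the text "the government" (five features instead of 5000, purely for illustration):

import numpy as np

idf   = np.array([1.2, 2.1, 1.8, 1.5, 2.3], dtype=np.float32)
mean  = np.array([0.1, 0.2, 0.3, 0.4, 0.5], dtype=np.float32)
scale = np.array([1.1, 1.2, 1.3, 1.4, 1.5], dtype=np.float32)

tf = np.array([1, 1, 0, 0, 0], dtype=np.float32)  # "the" and "government" each appear once
tf /= tf.sum()                                    # [0.5, 0.5, 0, 0, 0]
tfidf = tf * idf                                  # [0.6, 1.05, 0, 0, 0]
scaled = (tfidf - mean) / scale                   # e.g. (0.6 - 0.1) / 1.1 = 0.4545
print(scaled)  # [ 0.4545  0.7083 -0.2308 -0.2857 -0.3333]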
🔹 model.onnx
Place your trained binary classifier ONNX model here. It should accept a (1, 5000) input tensor of float32.
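If you are not sure your exported model matches this contract, you can sanity-check it with a few lines of onnxruntime before wiring up the full script (the printed names and shapes below are examples, not guaranteed values):

import onnxruntime as ort

session = ort.InferenceSession('model.onnx')
inp = session.get_inputs()[0]
out = session.get_outputs()[0]
print(inp.name, inp.shape, inp.type)  # e.g. float_input [1, 5000] tensor(float)
print(out.name, out.shape)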
✅ 7. Create the Python Script run_onnx.py
Use the code example below:
import json
import numpy as np
import onnxruntime as ort

# --- Preprocessing: TF-IDF + Scaling ---
def preprocess_text(text, vocab_file, scaler_file):
    # Load vocabulary and IDF weights
    with open(vocab_file, 'r') as f:
        vocab = json.load(f)
    with open(scaler_file, 'r') as f:
        scaler = json.load(f)
    idf = vocab['idf']
    word2idx = vocab['vocab']
    mean = np.array(scaler['mean'], dtype=np.float32)
    scale = np.array(scaler['scale'], dtype=np.float32)

    # Compute term frequency (TF)
    tf = np.zeros(len(word2idx), dtype=np.float32)
    words = text.lower().split()
    for word in words:
        idx = word2idx.get(word)
        if idx is not None:
            tf[idx] += 1
    if tf.sum() > 0:
        tf = tf / tf.sum()  # Normalize TF

    # TF-IDF
    tfidf = tf * np.array(idf, dtype=np.float32)

    # Standardize
    tfidf_scaled = (tfidf - mean) / scale
    return tfidf_scaled.astype(np.float32)

# Example usage
text = "This is a positive test"
vector = preprocess_text(text, 'model_vocab.json', 'model_scaler.json')  # 5000-dim float32

# --- ONNX Inference ---
session = ort.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

input_data = vector.reshape(1, -1)
outputs = session.run([output_name], {input_name: input_data})
probability = outputs[0][0][0]  # Probability of positive class
print(f'Python ONNX output: Probability = {probability:.4f}')
✅ 8. Run the Script
🔷 Windows
python run_onnx.py
🍎 macOS / 🐧 Linux
python3 run_onnx.py
✅ 9. Expected Output
Python ONNX output: Probability = 0.9123
The output shows the probability of the positive class. In this example, the model predicted a 91.23% probability of the text belonging to the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
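If you want to turn that probability into a label programmatically, a small helper does it. This sketch reuses session, input_name, output_name, and preprocess_text from the script in step 7; the 0.5 threshold is just the conventional default and can be tuned for your precision/recall needs:

def classify(text, threshold=0.5):
    # Preprocess, run the model, and map the probability to a label
    vec = preprocess_text(text, 'model_vocab.json', 'model_scaler.json').reshape(1, -1)
    prob = float(session.run([output_name], {input_name: vec})[0][0][0])
    label = 'Positive' if prob > threshold else 'Negative'
    return label, prob

for sample in ['This is a positive test', 'This is awful']:
    label, prob = classify(sample)
    print(f'{sample!r} -> {label} ({prob:.4f})')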
🧠 How to Run a Binary Classifier ONNX Model with JavaScript (Browser or Node.js)
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using JavaScript and ONNX Runtime Web, covering both browser and Node.js environments.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Choose Your Runtime
You can run ONNX models in JavaScript in two ways:
| Environment | Description | Recommended For |
|---|---|---|
| ✅ Browser | Uses WebAssembly or WebGL | Web apps, frontend demos |
| ✅ Node.js | Uses Node runtime (CPU only) | Backend/CLI usage |
✅ 2. Requirements
🔷 For browser
No install needed; just include the library from a CDN or bundle via npm.
🟩 For Node.js
Install Node.js:
- Download from: https://nodejs.org/
- Check installation:
node -v
npm -v
Then install ONNX Runtime (onnxruntime-web for the browser path; add onnxruntime-node if you plan to follow the Node.js path in step 9):
npm install onnxruntime-web
✅ 3. Get Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files to your project: model.onnx (the ONNX model file), model_vocab.json (TF-IDF vocabulary), model_scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 4. Folder Setup
mkdir binary_classifier_demo
cd binary_classifier_demo
Files you'll need:
binary_classifier_demo/
├── index.html          # For browser use
├── run.js              # Main logic
├── model.onnx
├── model_vocab.json
└── model_scaler.json
✅ 5. Sample model_vocab.json
{
  "vocab": {
    "the": 0,
    "government": 1,
    "announced": 2,
    "new": 3,
    "policies": 4
  },
  "idf": [1.2, 2.1, 1.8, 1.5, 2.3]
}
✅ 6. Sample model_scaler.json
{
  "mean": [0.1, 0.2, 0.3, 0.4, 0.5],
  "scale": [1.1, 1.2, 1.3, 1.4, 1.5]
}
✅ 7. JavaScript Code (run.js)
Works in both browser and Node.js (with minor changes)
async function preprocessText(text, vocabUrl, scalerUrl) {
    // Load vocabulary/IDF weights and scaling parameters
    const tfidfResp = await fetch(vocabUrl);
    const tfidfData = await tfidfResp.json();
    const vocab = tfidfData.vocab;
    const idf = tfidfData.idf;
    const scalerResp = await fetch(scalerUrl);
    const scalerData = await scalerResp.json();
    const mean = scalerData.mean;
    const scale = scalerData.scale;

    // TF-IDF
    const vector = new Float32Array(5000).fill(0);
    const words = text.toLowerCase().split(/\s+/);
    const wordCounts = {};
    words.forEach(word => wordCounts[word] = (wordCounts[word] || 0) + 1);
    for (const word in wordCounts) {
        if (vocab[word] !== undefined) {
            vector[vocab[word]] = wordCounts[word] * idf[vocab[word]];
        }
    }

    // Scale
    for (let i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - mean[i]) / scale[i];
    }
    return vector;
}

async function runModel(text) {
    const session = await ort.InferenceSession.create("model.onnx");
    const vector = await preprocessText(text, "model_vocab.json", "model_scaler.json");
    const tensor = new ort.Tensor("float32", vector, [1, 5000]);
    // Use the model's actual input name rather than hard-coding it
    const feeds = { [session.inputNames[0]]: tensor };
    const output = await session.run(feeds);
    const probability = output[Object.keys(output)[0]].data[0];
    console.log(`JS ONNX output: Probability = ${probability.toFixed(4)}`);
}

runModel("This is a positive test string");
✅ 8. Run in Browser (option A)
🔹 index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8" />
    <title>ONNX JS Inference</title>
</head>
<body>
    <h1>Running ONNX Model...</h1>
    <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
    <script type="module" src="run.js"></script>
</body>
</html>
📦 Start a local server (required because fetch cannot load files over file://):
npx serve .
# OR
python3 -m http.server
Visit http://localhost:3000 (npx serve) or http://localhost:8000 (http.server).
✅ 9. Run with Node.js (option B)
🔹 Modify run.js for Node
First install the Node binding (npm install onnxruntime-node), then swap the browser globals for imports and read the JSON files from disk:
import * as ort from 'onnxruntime-node';
import fs from 'fs/promises';

async function loadJSON(path) {
    const data = await fs.readFile(path, 'utf-8');
    return JSON.parse(data);
}

// The rest of the logic is the same as the browser example, except
// preprocessText loads the JSON files with loadJSON instead of fetch.
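For reference, here is the preprocessText function adapted for Node under that assumption; the TF-IDF and scaling logic is unchanged from the browser version:

async function preprocessText(text, vocabPath, scalerPath) {
    // Read vocabulary/IDF and scaler parameters from local files
    const { vocab, idf } = await loadJSON(vocabPath);
    const { mean, scale } = await loadJSON(scalerPath);

    // Count words, then weight counts by IDF
    const vector = new Float32Array(5000).fill(0);
    const wordCounts = {};
    for (const word of text.toLowerCase().split(/\s+/)) {
        wordCounts[word] = (wordCounts[word] || 0) + 1;
    }
    for (const word in wordCounts) {
        if (vocab[word] !== undefined) {
            vector[vocab[word]] = wordCounts[word] * idf[vocab[word]];
        }
    }

    // Standardize each feature
    for (let i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - mean[i]) / scale[i];
    }
    return vector;
}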
📦 Run it:
node run.js
✅ 10. Expected Output
JS ONNX output: Probability = 0.9123
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run an ONNX Model with C using ONNX Runtime and cJSON
📘 What is this?
This guide explains how to load and run ONNX models using C, the ONNX Runtime C API, and cJSON for JSON parsing.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 C Compiler
macOS: clang comes with Xcode Command Line Tools
xcode-select --install
Linux: install gcc
sudo apt install build-essential
🟩 ONNX Runtime C Library
Download the ONNX Runtime C API from the official releases page:
🔗 https://github.com/microsoft/onnxruntime/releases
Choose:
onnxruntime-osx-universal2-<version>.tgz # For macOS
onnxruntime-linux-x64-<version>.tgz # For Linux
📦 Install cJSON
macOS:
brew install cjson
Linux:
sudo apt install libcjson-dev
✅ 2. Choose Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files to your project: model.onnx (the ONNX model file), vocab.json (TF-IDF vocabulary), scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 3. Folder Structure
project/
├── ONNX_test.c ← your C code
├── vocab.json ← TF-IDF vocabulary
├── scaler.json ← scaling parameters
├── model.onnx ← ONNX model
└── onnxruntime-osx-universal2-1.22.0/
    ├── include/
    └── lib/
✅ 4. Build Command
🔷 macOS
gcc ONNX_test.c \
-I./onnxruntime-osx-universal2-1.22.0/include \
-L./onnxruntime-osx-universal2-1.22.0/lib \
-lonnxruntime \
-lcjson \
-o onnx_test
🐧 Linux
Replace the onnxruntime-osx-... path with the matching onnxruntime-linux-x64-... directory.
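For example, with version 1.22.0 of the Linux package (adjust the version number to whatever you downloaded):

gcc ONNX_test.c \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -lonnxruntime \
  -lcjson \
  -o onnx_test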
✅ 5. Run the Executable
🔷 macOS
Important: You must set the library path.
export DYLD_LIBRARY_PATH=./onnxruntime-osx-universal2-1.22.0/lib:$DYLD_LIBRARY_PATH
./onnx_test
🐧 Linux
export LD_LIBRARY_PATH=./onnxruntime-linux-x64-1.22.0/lib:$LD_LIBRARY_PATH
./onnx_test
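Optionally, on Linux you can avoid exporting LD_LIBRARY_PATH on every run by embedding the library search path into the binary at link time. This uses the standard linker rpath mechanism (on macOS, use @loader_path in place of $ORIGIN):

gcc ONNX_test.c \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -Wl,-rpath,'$ORIGIN/onnxruntime-linux-x64-1.22.0/lib' \
  -lonnxruntime -lcjson -o onnx_test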
✅ 6. C Code Example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <onnxruntime_c_api.h>   // resolved via the -I flag in the build command
#include <cjson/cJSON.h>

const OrtApi* g_ort = NULL;

// Build the 5000-dim TF-IDF feature vector for `text`.
float* preprocess_text(const char* text, const char* vocab_file, const char* scaler_file) {
    float* vector = calloc(5000, sizeof(float));
    if (!vector) return NULL;

    // Read the vocabulary/IDF file into memory
    FILE* f = fopen(vocab_file, "r");
    if (!f) { free(vector); return NULL; }
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    fseek(f, 0, SEEK_SET);
    char* json_str = malloc(len + 1);
    fread(json_str, 1, len, f);
    json_str[len] = 0;
    fclose(f);

    cJSON* tfidf_data = cJSON_Parse(json_str);
    if (!tfidf_data) {
        free(json_str);
        free(vector);
        return NULL;
    }
    cJSON* vocab = cJSON_GetObjectItem(tfidf_data, "vocab");
    cJSON* idf = cJSON_GetObjectItem(tfidf_data, "idf");
    if (!vocab || !idf) {
        free(json_str);
        cJSON_Delete(tfidf_data);
        free(vector);
        return NULL;
    }

    // Read the scaler file into memory
    f = fopen(scaler_file, "r");
    if (!f) {
        free(json_str);
        cJSON_Delete(tfidf_data);
        free(vector);
        return NULL;
    }
    fseek(f, 0, SEEK_END);
    len = ftell(f);
    fseek(f, 0, SEEK_SET);
    char* scaler_str = malloc(len + 1);
    fread(scaler_str, 1, len, f);
    scaler_str[len] = 0;
    fclose(f);

    cJSON* scaler_data = cJSON_Parse(scaler_str);
    if (!scaler_data) {
        free(json_str);
        free(scaler_str);
        cJSON_Delete(tfidf_data);
        free(vector);
        return NULL;
    }
    cJSON* mean = cJSON_GetObjectItem(scaler_data, "mean");
    cJSON* scale = cJSON_GetObjectItem(scaler_data, "scale");
    if (!mean || !scale) {
        free(json_str);
        free(scaler_str);
        cJSON_Delete(tfidf_data);
        cJSON_Delete(scaler_data);
        free(vector);
        return NULL;
    }

    // Lowercase, tokenize on whitespace, and accumulate IDF weight per occurrence
    char* text_copy = strdup(text);
    for (char* p = text_copy; *p; p++) *p = tolower(*p);
    char* word = strtok(text_copy, " \t\n");
    while (word) {
        cJSON* idx = cJSON_GetObjectItem(vocab, word);
        if (idx) {
            int i = idx->valueint;
            if (i < 5000) {
                vector[i] += cJSON_GetArrayItem(idf, i)->valuedouble;
            }
        }
        word = strtok(NULL, " \t\n");
    }

    // Standardize each feature with the stored mean and scale
    for (int i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - cJSON_GetArrayItem(mean, i)->valuedouble) /
                    cJSON_GetArrayItem(scale, i)->valuedouble;
    }

    free(text_copy);
    free(json_str);
    free(scaler_str);
    cJSON_Delete(tfidf_data);
    cJSON_Delete(scaler_data);
    return vector;
}

int main() {
    g_ort = OrtGetApiBase()->GetApi(ORT_API_VERSION);
    if (!g_ort) return 1;

    const char* text = "Earn $5000 a week from home - no experience required!";
    float* vector = preprocess_text(text, "vocab.json", "scaler.json");
    if (!vector) return 1;

    // Create environment, session options, and session
    OrtEnv* env;
    OrtStatus* status = g_ort->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "test", &env);
    if (status) return 1;
    OrtSessionOptions* session_options;
    status = g_ort->CreateSessionOptions(&session_options);
    if (status) return 1;
    OrtSession* session;
    status = g_ort->CreateSession(env, "model.onnx", session_options, &session);
    if (status) return 1;

    // Wrap the feature vector in a (1, 5000) float tensor
    OrtMemoryInfo* memory_info;
    status = g_ort->CreateCpuMemoryInfo(OrtArenaAllocator, OrtMemTypeDefault, &memory_info);
    if (status) return 1;
    int64_t input_shape[] = {1, 5000};
    OrtValue* input_tensor;
    status = g_ort->CreateTensorWithDataAsOrtValue(memory_info, vector, 5000 * sizeof(float),
                                                   input_shape, 2, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
                                                   &input_tensor);
    if (status) return 1;

    // Input/output names must match the exported model
    const char* input_names[] = {"float_input"};
    const char* output_names[] = {"output"};
    OrtValue* output_tensor = NULL;
    status = g_ort->Run(session, NULL, input_names, (const OrtValue* const*)&input_tensor, 1,
                        output_names, 1, &output_tensor);
    if (status) return 1;

    float* output_data;
    status = g_ort->GetTensorMutableData(output_tensor, (void**)&output_data);
    if (status) return 1;
    printf("C ONNX output: %s (Score: %.4f)\n",
           output_data[0] > 0.5 ? "Spam" : "Not Spam",
           output_data[0]);

    g_ort->ReleaseValue(input_tensor);
    g_ort->ReleaseValue(output_tensor);
    g_ort->ReleaseMemoryInfo(memory_info);
    g_ort->ReleaseSession(session);
    g_ort->ReleaseSessionOptions(session_options);
    g_ort->ReleaseEnv(env);
    free(vector);
    return 0;
}
✅ 7. Expected Output
C ONNX output: Spam (Score: 0.9123)
🧠 How to Run a Binary Classifier ONNX Model with C++
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using C++ and the ONNX Runtime C++ API, with a focus on efficient text preprocessing and model inference.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 C++ Compiler
macOS: clang++ comes with Xcode Command Line Tools
xcode-select --install
Linux: install g++
sudo apt install build-essential
🟩 ONNX Runtime C++ Library
Download the ONNX Runtime C++ API from the official releases page:
🔗 https://github.com/microsoft/onnxruntime/releases
Choose:
onnxruntime-osx-universal2-<version>.tgz # For macOS
onnxruntime-linux-x64-<version>.tgz # For Linux
📦 Install nlohmann/json
macOS:
brew install nlohmann-json
Linux:
sudo apt install nlohmann-json3-dev
✅ 2. Choose Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files to your project: model.onnx (the ONNX model file), model_vocab.json (TF-IDF vocabulary), model_scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 3. Folder Structure
project/
├── main.cpp ← your C++ code
├── model_vocab.json ← TF-IDF vocabulary
├── model_scaler.json ← scaling parameters
├── model.onnx ← ONNX model
└── onnxruntime-osx-universal2-1.22.0/
    ├── include/
    └── lib/
✅ 4. Build Command
🔷 macOS
g++ -std=c++17 main.cpp \
-I./onnxruntime-osx-universal2-1.22.0/include \
-L./onnxruntime-osx-universal2-1.22.0/lib \
-lonnxruntime \
-o binary_classifier
🐧 Linux
Replace the onnxruntime-osx-... path with the matching onnxruntime-linux-x64-... directory.
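For example, with version 1.22.0 of the Linux package:

g++ -std=c++17 main.cpp \
  -I./onnxruntime-linux-x64-1.22.0/include \
  -L./onnxruntime-linux-x64-1.22.0/lib \
  -lonnxruntime \
  -o binary_classifier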
✅ 5. Run the Executable
🔷 macOS
Important: You must set the library path.
export DYLD_LIBRARY_PATH=./onnxruntime-osx-universal2-1.22.0/lib:$DYLD_LIBRARY_PATH
./binary_classifier
🐧 Linux
export LD_LIBRARY_PATH=./onnxruntime-linux-x64-1.22.0/lib:$LD_LIBRARY_PATH
./binary_classifier
✅ 6. C++ Code Example
#include <onnxruntime_cxx_api.h>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Build the 5000-dim TF-IDF feature vector for `text`.
std::vector<float> preprocess_text(const std::string& text, const std::string& vocab_file, const std::string& scaler_file) {
    std::vector<float> vector(5000, 0.0f);

    std::ifstream vf(vocab_file);
    json tfidf_data; vf >> tfidf_data;
    auto vocab = tfidf_data["vocab"];
    std::vector<float> idf = tfidf_data["idf"];

    std::ifstream sf(scaler_file);
    json scaler_data; sf >> scaler_data;
    std::vector<float> mean = scaler_data["mean"];
    std::vector<float> scale = scaler_data["scale"];

    // TF-IDF: lowercase, split on spaces, count words
    std::string text_lower = text;
    std::transform(text_lower.begin(), text_lower.end(), text_lower.begin(), ::tolower);
    std::map<std::string, int> word_counts;
    size_t start = 0, end;
    while ((end = text_lower.find(' ', start)) != std::string::npos) {
        if (end > start) word_counts[text_lower.substr(start, end - start)]++;
        start = end + 1;
    }
    if (start < text_lower.length()) word_counts[text_lower.substr(start)]++;
    for (const auto& [word, count] : word_counts) {
        if (vocab.contains(word)) {
            int idx = vocab[word];
            vector[idx] = count * idf[idx];
        }
    }

    // Scale
    for (int i = 0; i < 5000; i++) {
        vector[i] = (vector[i] - mean[i]) / scale[i];
    }
    return vector;
}

int main() {
    std::string text = "This is a positive test string";
    auto vector = preprocess_text(text, "model_vocab.json", "model_scaler.json");

    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "test");
    Ort::SessionOptions session_options;
    Ort::Session session(env, "model.onnx", session_options);

    std::vector<int64_t> input_shape = {1, 5000};
    Ort::MemoryInfo memory_info("Cpu", OrtDeviceAllocator, 0, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, vector.data(), vector.size(), input_shape.data(), input_shape.size());

    // Input/output names must match the exported model
    std::vector<const char*> input_names = {"input"};
    std::vector<const char*> output_names = {"output"};
    auto output_tensors = session.Run(Ort::RunOptions{nullptr}, input_names.data(), &input_tensor, 1, output_names.data(), 1);

    float* output_data = output_tensors[0].GetTensorMutableData<float>();
    std::cout << "C++ ONNX output: Probability = " << output_data[0] << std::endl;
    return 0;
}
✅ 7. Expected Output
C++ ONNX output: Probability = 0.9123
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run a Binary Classifier ONNX Model with Rust
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using Rust and ONNX Runtime, with a focus on efficient text preprocessing and model inference.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 Rust Toolchain
Install Rust using rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Verify installation:
rustc --version
cargo --version
✅ 2. Get Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files into resources/: model.onnx (the ONNX model file), model_vocab.json (TF-IDF vocabulary), model_scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 3. Create a New Project
cargo new binary_classifier
cd binary_classifier
✅ 4. Add Dependencies
🔹 Cargo.toml
[package]
name = "binary_classifier"
version = "0.1.0"
edition = "2021"
[dependencies]
ort = "1.16.0"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
anyhow = "1.0"
thiserror = "1.0"
ndarray = "0.15"
✅ 5. Project Structure
binary_classifier/
├── src/
│   └── main.rs
├── resources/
│   ├── model.onnx
│   ├── model_vocab.json
│   └── model_scaler.json
└── Cargo.toml
✅ 6. Rust Code Example
use anyhow::Result;
use ort::{Environment, Session, SessionBuilder, Value};
use serde_json::Value as JsonValue;
use std::collections::HashMap;
use std::fs::File;
use std::io::BufReader;
use std::sync::Arc;
use ndarray::Array2;

struct BinaryClassifier {
    vocab: HashMap<String, usize>,
    idf: Vec<f32>,
    mean: Vec<f32>,
    scale: Vec<f32>,
    session: Session,
}

impl BinaryClassifier {
    fn new(model_path: &str, vocab_path: &str, scaler_path: &str) -> Result<Self> {
        // Load vocabulary and IDF weights
        let vocab_file = File::open(vocab_path)?;
        let vocab_reader = BufReader::new(vocab_file);
        let vocab_data: JsonValue = serde_json::from_reader(vocab_reader)?;
        let mut vocab = HashMap::new();
        let vocab_obj = vocab_data["vocab"].as_object().unwrap();
        for (key, value) in vocab_obj {
            vocab.insert(key.clone(), value.as_u64().unwrap() as usize);
        }
        let idf: Vec<f32> = vocab_data["idf"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();

        // Load scaling parameters
        let scaler_file = File::open(scaler_path)?;
        let scaler_reader = BufReader::new(scaler_file);
        let scaler_data: JsonValue = serde_json::from_reader(scaler_reader)?;
        let mean: Vec<f32> = scaler_data["mean"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();
        let scale: Vec<f32> = scaler_data["scale"]
            .as_array()
            .unwrap()
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();

        // Initialize the ONNX Runtime session
        let environment = Arc::new(Environment::builder()
            .with_name("binary_classifier")
            .build()?);
        let session = SessionBuilder::new(&environment)?
            .with_model_from_file(model_path)?;

        Ok(BinaryClassifier { vocab, idf, mean, scale, session })
    }

    fn preprocess_text(&self, text: &str) -> Vec<f32> {
        let mut vector = vec![0.0; 5000];
        let mut word_counts: HashMap<&str, usize> = HashMap::new();
        let text_lower = text.to_lowercase();
        for word in text_lower.split_whitespace() {
            *word_counts.entry(word).or_insert(0) += 1;
        }
        // TF-IDF
        for (word, count) in word_counts {
            if let Some(&idx) = self.vocab.get(word) {
                vector[idx] = count as f32 * self.idf[idx];
            }
        }
        // Standardize
        for i in 0..5000 {
            vector[i] = (vector[i] - self.mean[i]) / self.scale[i];
        }
        vector
    }

    fn predict(&self, text: &str) -> Result<f32> {
        let input_data = self.preprocess_text(text);
        let input_array = Array2::from_shape_vec((1, 5000), input_data)?;
        let input_dyn = input_array.into_dyn();
        let input_cow = ndarray::CowArray::from(input_dyn.view());
        let input_tensor = Value::from_array(self.session.allocator(), &input_cow)?;
        let outputs = self.session.run(vec![input_tensor])?;
        let output_view = outputs[0].try_extract::<f32>()?;
        let output_data = output_view.view();
        Ok(output_data[[0, 0]])
    }
}

fn main() -> Result<()> {
    let classifier = BinaryClassifier::new(
        "resources/model.onnx",
        "resources/model_vocab.json",
        "resources/model_scaler.json",
    )?;
    let text = "Act now! Get 70% off on all products. Visit our site today!";
    let probability = classifier.predict(text)?;
    println!("Rust ONNX output: Probability = {:.4}", probability);
    println!("Classification: {}",
        if probability > 0.5 { "Positive" } else { "Negative" }
    );
    Ok(())
}
✅ 7. Build and Run
🔷 Build the Project
cargo build --release
🟩 Run the Classifier
./target/release/binary_classifier
✅ 8. Expected Output
Rust ONNX output: Probability = 0.9123
Classification: Positive
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run a Binary Classifier ONNX Model with Java
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using Java and the ONNX Runtime Java API, with a focus on efficient text preprocessing and model inference.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 Java Development Kit (JDK)
Install JDK 17 or later:
🐧 Linux
✅ Installation via package manager (Ubuntu/Debian):
sudo apt update
sudo apt install openjdk-17-jdk -y
📦 Download from Oracle website:
- .tar.gz archive: jdk-17.0.15_linux-x64_bin.tar.gz
- .deb package: jdk-17.0.15_linux-x64_bin.deb
- .rpm package: jdk-17.0.15_linux-x64_bin.rpm
🔗 Alternative sources:
- Adoptium (Temurin)
- OpenLogic
- Liberica JDK
🪟 Windows
📥 Download from Oracle:
- .exe installer: jdk-17.0.15_windows-x64_bin.exe
- .msi installer: jdk-17.0.15_windows-x64_bin.msi
- .zip archive: jdk-17.0.15_windows-x64_bin.zip
🔗 Alternative sources:
- Adoptium (Temurin)
- Microsoft Build of OpenJDK
🍎 macOS
📥 Download from Oracle:
For Intel (x64):
- .dmg installer: jdk-17.0.15_macos-x64_bin.dmg
- .tar.gz archive: jdk-17.0.15_macos-x64_bin.tar.gz
For Apple Silicon (ARM64):
- .dmg installer: jdk-17.0.15_macos-aarch64_bin.dmg
- .tar.gz archive: jdk-17.0.15_macos-aarch64_bin.tar.gz
🔗 Alternative sources:
- Adoptium (Temurin)
- Liberica JDK
Verify installation:
java -version
javac -version
🟩 Maven
Install Maven for dependency management:
- Download from: https://maven.apache.org/download.cgi
- Verify installation:
mvn -version
✅ 2. Project Setup
🚀 Create Maven Project
mvn archetype:generate -DgroupId=com.example -DartifactId=binary-classifier -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
✅ 3. Add Dependencies
🔹 pom.xml
<dependencies>
    <dependency>
        <groupId>com.microsoft.onnxruntime</groupId>
        <artifactId>onnxruntime</artifactId>
        <version>1.16.3</version>
    </dependency>
    <dependency>
        <groupId>org.json</groupId>
        <artifactId>json</artifactId>
        <version>20231013</version>
    </dependency>
</dependencies>
✅ 4. Project Structure
binary-classifier/
├── src/
│   ├── main/
│   │   ├── java/
│   │   │   └── com/
│   │   │       └── example/
│   │   │           └── BinaryClassifier.java
│   │   └── resources/
│   │       ├── model.onnx
│   │       ├── model_vocab.json
│   │       └── model_scaler.json
│   └── test/
└── pom.xml
✅ 5. Java Code Example
import ai.onnxruntime.*;
import org.json.JSONObject;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;

public class BinaryClassifier {
    private Map<String, Integer> vocab;
    private List<Float> idf;
    private List<Float> mean;
    private List<Float> scale;
    private OrtSession session;

    public BinaryClassifier(String modelPath, String vocabPath, String scalerPath) throws Exception {
        // Load vocabulary and IDF weights
        String vocabJson = new String(Files.readAllBytes(Paths.get(vocabPath)));
        JSONObject vocabData = new JSONObject(vocabJson);
        this.vocab = new HashMap<>();
        JSONObject vocabObj = vocabData.getJSONObject("vocab");
        for (String key : vocabObj.keySet()) {
            this.vocab.put(key, vocabObj.getInt(key));
        }
        this.idf = new ArrayList<>();
        vocabData.getJSONArray("idf").forEach(item -> this.idf.add(((Number) item).floatValue()));

        // Load scaling parameters
        String scalerJson = new String(Files.readAllBytes(Paths.get(scalerPath)));
        JSONObject scalerData = new JSONObject(scalerJson);
        this.mean = new ArrayList<>();
        this.scale = new ArrayList<>();
        scalerData.getJSONArray("mean").forEach(item -> this.mean.add(((Number) item).floatValue()));
        scalerData.getJSONArray("scale").forEach(item -> this.scale.add(((Number) item).floatValue()));

        // Initialize ONNX Runtime session
        OrtEnvironment env = OrtEnvironment.getEnvironment();
        this.session = env.createSession(modelPath, new OrtSession.SessionOptions());
    }

    private float[] preprocessText(String text) {
        float[] vector = new float[5000];
        Map<String, Integer> wordCounts = new HashMap<>();
        // Count word frequencies
        for (String word : text.toLowerCase().split("\\s+")) {
            wordCounts.put(word, wordCounts.getOrDefault(word, 0) + 1);
        }
        // Compute TF-IDF
        for (Map.Entry<String, Integer> entry : wordCounts.entrySet()) {
            Integer idx = vocab.get(entry.getKey());
            if (idx != null) {
                vector[idx] = entry.getValue() * idf.get(idx);
            }
        }
        // Scale features
        for (int i = 0; i < 5000; i++) {
            vector[i] = (vector[i] - mean.get(i)) / scale.get(i);
        }
        return vector;
    }

    public float predict(String text) throws OrtException {
        float[] inputData = preprocessText(text);
        float[][] inputArray = new float[1][5000];
        inputArray[0] = inputData;
        OnnxTensor inputTensor = OnnxTensor.createTensor(OrtEnvironment.getEnvironment(), inputArray);
        String inputName = session.getInputNames().iterator().next();
        OrtSession.Result result = session.run(Collections.singletonMap(inputName, inputTensor));
        float[][] outputArray = (float[][]) result.get(0).getValue();
        return outputArray[0][0];
    }

    public static void main(String[] args) {
        try {
            BinaryClassifier classifier = new BinaryClassifier(
                "src/main/resources/model.onnx",
                "src/main/resources/model_vocab.json",
                "src/main/resources/model_scaler.json"
            );
            String text = "This is a positive test string";
            float probability = classifier.predict(text);
            System.out.printf("Java ONNX output: Probability = %.4f%n", probability);
            System.out.println("Classification: " + (probability > 0.5 ? "Positive" : "Negative"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
✅ 6. Build and Run
🔷 Build the Project
mvn clean package
🟩 Run the Classifier
The quickstart jar does not bundle its dependencies, so the simplest way to run is through Maven (or put the dependency jars on the classpath yourself):
mvn exec:java -Dexec.mainClass="com.example.BinaryClassifier"
✅ 7. Expected Output
Java ONNX output: Probability = 0.9123
Classification: Positive
The output shows the probability of the positive class. A probability above 0.5 indicates a positive classification, while below 0.5 indicates a negative classification.
🧠 How to Run a Binary Classifier ONNX Model with Dart
📘 What is this?
This guide explains how to load and run binary classifier ONNX models using Dart, with comprehensive text preprocessing, system monitoring, and performance analysis.
Choose your path:
- 🚀 Quick Guide: Follow the step-by-step tutorial below
- 🔗 Full Example: See complete implementation with tests on GitHub
✅ 1. Prerequisites
🔷 Dart SDK
The example below is a Flutter app, so you will also need the Flutter SDK (which bundles Dart). To install the standalone Dart SDK:
🪟 Windows
choco install dart-sdk
Or download from: https://dart.dev/get-dart
🍎 macOS
brew tap dart-lang/dart
brew install dart
🐧 Linux
sudo apt-get update
sudo apt-get install apt-transport-https
wget -qO- https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
wget -qO- https://storage.googleapis.com/download.dartlang.org/linux/debian/dart_stable.list | sudo tee /etc/apt/sources.list.d/dart_stable.list
sudo apt-get update
sudo apt-get install dart
Verify installation:
dart --version
✅ 2. Project Setup
🚀 Create Flutter Project
flutter create binary_classifier_dart
cd binary_classifier_dart
✅ 3. Project Structure
binary_classifier_dart/
├── lib/
│   └── main.dart
├── assets/
│   ├── model.onnx
│   ├── vocab.json
│   └── scaler.json
└── pubspec.yaml
✅ 4. Add Dependencies
🔹 pubspec.yaml
name: binary_classifier_dart
description: "Binary Classifier ONNX Test"
publish_to: 'none'
version: 1.0.0+1

environment:
  sdk: '>=3.0.0 <4.0.0'

dependencies:
  flutter:
    sdk: flutter
  onnxruntime: ^1.4.1
  path_provider: ^2.1.2
  image_picker: ^1.0.7
  image: ^4.1.7
  collection: ^1.18.0
  http: ^1.2.0
  shared_preferences: ^2.2.2
  fl_chart: ^0.60.0
  cupertino_icons: ^1.0.8

dev_dependencies:
  flutter_test:
    sdk: flutter
  flutter_lints: ^4.0.0

flutter:
  uses-material-design: true
  assets:
    - assets/model.onnx
    - assets/vocab.json
    - assets/scaler.json
✅ 5. Get Your Model
📥 Download Repository
Clone our repository to get started:
git clone https://github.com/whitelightning-ai/whitelightning.git
cd whitelightning
📦 Choose Your Model
You have two options:
- Use Pre-trained Model:
  - Navigate to the models directory
  - Copy the binary classifier model files into the assets/ directory: model.onnx (the ONNX model file), vocab.json (TF-IDF vocabulary), scaler.json (feature scaling parameters)
- Train Your Own Model:
  - Follow the training guide in the repository
  - Use the provided scripts to train a custom binary classifier
  - Export your model to ONNX format
✅ 6. Dart Code Example
Create lib/main.dart:
import 'dart:convert';
import 'dart:typed_data';
import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:onnxruntime/onnxruntime.dart';

class BinaryClassifier {
  late OrtSession _session;
  late Map<String, int> _vocab;
  late List<double> _idf;
  late List<double> _mean;
  late List<double> _scale;

  Future<void> initialize() async {
    // Load model
    final modelBytes = await rootBundle.load('assets/model.onnx');
    final sessionOptions = OrtSessionOptions();
    _session = OrtSession.fromBuffer(modelBytes.buffer.asUint8List(), sessionOptions);

    // Load vocabulary
    final vocabString = await rootBundle.loadString('assets/vocab.json');
    final vocabData = json.decode(vocabString);
    _vocab = Map<String, int>.from(vocabData['vocab']);
    _idf = (vocabData['idf'] as List).map((e) => (e as num).toDouble()).toList();

    // Load scaler
    final scalerString = await rootBundle.loadString('assets/scaler.json');
    final scalerData = json.decode(scalerString);
    _mean = (scalerData['mean'] as List).map((e) => (e as num).toDouble()).toList();
    _scale = (scalerData['scale'] as List).map((e) => (e as num).toDouble()).toList();
  }

  Float32List _preprocessText(String text) {
    // Term frequency over the vocabulary
    final tf = List<double>.filled(_vocab.length, 0.0);
    final words = text.toLowerCase().split(RegExp(r'\s+'));
    for (final word in words) {
      final idx = _vocab[word];
      if (idx != null) {
        tf[idx] += 1.0;
      }
    }
    final tfSum = tf.reduce((a, b) => a + b);
    if (tfSum > 0) {
      for (int i = 0; i < tf.length; i++) {
        tf[i] = tf[i] / tfSum;
      }
    }
    // TF-IDF, then standardize with the stored mean/scale
    final tfidf = List<double>.generate(tf.length, (i) => tf[i] * _idf[i]);
    final tfidfScaled = List<double>.generate(tfidf.length, (i) => (tfidf[i] - _mean[i]) / _scale[i]);
    return Float32List.fromList(tfidfScaled);
  }

  Future<double> predict(String text) async {
    final inputVector = _preprocessText(text);
    final inputOrt = OrtValueTensor.createTensorWithDataAsFloat32List(
      [1, inputVector.length],
      inputVector
    );
    final inputs = {'input': inputOrt};
    final runOptions = OrtRunOptions();
    final outputs = await _session.runAsync(runOptions, inputs);
    final output = outputs[0]?.value as List<List<double>>;
    return output[0][0];
  }

  void dispose() {
    _session.release();
  }
}

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Binary Classifier',
      theme: ThemeData(primarySwatch: Colors.blue),
      home: BinaryClassifierPage(),
    );
  }
}

class BinaryClassifierPage extends StatefulWidget {
  @override
  _BinaryClassifierPageState createState() => _BinaryClassifierPageState();
}

class _BinaryClassifierPageState extends State<BinaryClassifierPage> {
  final _classifier = BinaryClassifier();
  final _textController = TextEditingController();
  String _result = '';
  bool _isLoading = false;

  @override
  void initState() {
    super.initState();
    _initializeClassifier();
  }

  Future<void> _initializeClassifier() async {
    await _classifier.initialize();
  }

  Future<void> _classifyText() async {
    setState(() {
      _isLoading = true;
    });
    try {
      final probability = await _classifier.predict(_textController.text);
      final sentiment = probability > 0.5 ? 'Positive' : 'Negative';
      setState(() {
        _result = 'Prediction: $sentiment\nProbability: ${probability.toStringAsFixed(4)}';
      });
    } catch (e) {
      setState(() {
        _result = 'Error: $e';
      });
    } finally {
      setState(() {
        _isLoading = false;
      });
    }
  }

  @override
  void dispose() {
    _classifier.dispose();
    _textController.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text('Binary Classifier')),
      body: Padding(
        padding: EdgeInsets.all(16.0),
        child: Column(
          children: [
            TextField(
              controller: _textController,
              decoration: InputDecoration(
                hintText: 'Enter text to classify',
                border: OutlineInputBorder(),
              ),
              maxLines: 3,
            ),
            SizedBox(height: 16),
            ElevatedButton(
              onPressed: _isLoading ? null : _classifyText,
              child: _isLoading ? CircularProgressIndicator() : Text('Classify'),
            ),
            SizedBox(height: 16),
            Text(_result, style: TextStyle(fontSize: 16)),
          ],
        ),
      ),
    );
  }
}
✅ 7. Run the Application
🔷 Build and Run
Since this is a Flutter app, launch it on a connected device, simulator, or desktop target:
flutter run
Enter text in the app's input field and tap Classify.
✅ 8. Expected Output
🤖 ONNX BINARY CLASSIFIER - DART IMPLEMENTATION
📊 RESULTS:
Predicted Sentiment: Positive
Confidence: 99.98% (0.9998)
(45.0ms total - Target: <100ms)
This sample output comes from the full example in the repository, which adds detailed preprocessing reporting, system information gathering, and performance benchmarking with real-time metrics on top of the preprocessing and inference steps shown above.