Powered by OpenAIRE graph
ZENODO
Preprint
Data sources: ZENODO

Orchid 1.0: A Reproducible Recipe for Aligned Ternary-Weight Language Models on Consumer Hardware

Authors: Romero Chisco, Michelangelo


Abstract

We present Orchid 1.0, a 2-billion-parameter ternary-weight language model aligned through a three-stage LoRA pipeline (reasoning SFT, identity-and-knowledge SFT, and Odds-Ratio Preference Optimization) on a single RTX 3050 laptop with 4 GB of VRAM. We document each design decision, memory-management technique, and recovery procedure that made the training feasible on this hardware. We then describe and resolve the ternary merge problem, the destructive interaction between LoRA deltas and ternary weight quantization. This problem motivated the construction of ternative.cpp, a purpose-built C++ inference engine that loads a base I2_S GGUF and a separate LoRA adapter GGUF and merges them at full precision at load time. The engine supports CPU (AVX2, OpenMP) and GPU (CUDA 12.6) execution and includes an OpenAI-compatible HTTP server. We evaluate Orchid 1.0 on four standard benchmarks: ARC-Challenge 56.0% (+6.1 pp over the BitNet base), HellaSwag 52.0%, WinoGrande 74.0%, and MMLU 38.6%.

All artifacts are openly available:
- Model: https://huggingface.co/MicheRomChis/orchid-1.0
- Inference engine: https://github.com/michelangeloromerochisco/ternative
- Training code: https://github.com/michelangeloromerochisco/orchid-1.0
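The ternary merge problem named in the abstract can be illustrated with a toy example. The sketch below is not taken from ternative.cpp or the Orchid training code. It assumes absmean ternarization in the style of BitNet b1.58 (scale = mean absolute weight, codes rounded and clipped to {-1, 0, +1}), and every name and number in it (`ternarize`, `base_codes`, `lora_a`, `lora_b`, the 3x4 shape) is hypothetical. It merges a small rank-1 LoRA delta into dequantized ternary weights and then re-ternarizes the result.

```python
def matmul(x, y):
    """Plain-Python matrix product of x (m x k) and y (k x n)."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def ternarize(w):
    """Assumed absmean scheme: scale = mean(|w|), codes = clip(round(w / scale), -1, 1)."""
    mags = [abs(v) for row in w for v in row]
    scale = sum(mags) / len(mags)
    codes = [[max(-1, min(1, round(v / scale))) for v in row] for row in w]
    return codes, scale

def dequantize(codes, scale):
    """Map ternary codes back to floating-point weights."""
    return [[c * scale for c in row] for row in codes]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Hypothetical 3x4 ternary base layer and a rank-1 LoRA adapter (illustrative values).
base_codes = [[1, 0, -1, 1], [0, -1, 1, 0], [-1, 1, 0, -1]]
base_scale = 0.05
lora_b = [[0.01], [-0.02], [0.015]]   # 3 x rank
lora_a = [[0.2, -0.1, 0.05, 0.1]]     # rank x 4
alpha, rank = 1.0, 1

# Full-precision merge: W + (alpha / rank) * B @ A
base = dequantize(base_codes, base_scale)
delta = [[(alpha / rank) * v for v in row] for row in matmul(lora_b, lora_a)]
merged = [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(base, delta)]

# Re-ternarizing the merged weights snaps every entry back to a ternary code.
requant_codes, requant_scale = ternarize(merged)
requant = dequantize(requant_codes, requant_scale)

print("codes unchanged after re-ternarizing:", requant_codes == base_codes)
print("max error of re-ternarized merge:", max_abs_diff(merged, requant))
```

In this toy setting the delta is small relative to the quantization step, so the re-ternarized codes are identical to the base codes and the adapter's update does not survive. Keeping the merged weights at full precision, as the abstract describes for load-time merging, avoids that re-quantization step.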
