PRISM: PARALLEL RESPONSE INTELLIGENCE SYSTEM FOR MULTI-LLMS

Rapid progress in large language models (LLMs) has expanded what modern AI systems can do, but it has also created practical difficulties for developers and researchers who need to compare model outputs reliably and efficiently. Most existing LLM interfaces operate in isolation from one another, which makes side-by-side testing across models tedious and error-prone. This paper presents PRISM, a web-based multi-LLM playground that supports parallel querying and qualitative comparison of model responses in a single interface. The current prototype integrates Google’s Gemini 2.0 Flash and Groq-hosted Llama models, including Llama 3.3 70B and Llama 3.1 8B Instant. PRISM provides practical features such as local API key management, automatic retries to handle provider rate limits, and export of comparison results and session history. The paper describes the system architecture, design considerations, implementation methodology, and qualitative observations from testing. It closes with planned enhancements aimed at extending PRISM into a more general platform for LLM evaluation and benchmarking.

Index Terms: Multi-LLM, Gemini, Llama, Groq, Model Comparison, PRISM, Web UI, Streamlit.
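
As an illustration of the parallel querying and automatic retry behaviour summarized above, the following is a minimal sketch and not PRISM's actual implementation. It assumes each model is exposed as a plain Python callable that takes a prompt and returns text. The names `RateLimitError`, `call_with_retry`, and `query_all` are hypothetical, and exponential backoff is an assumed retry policy that the abstract does not specify.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error (e.g. HTTP 429)."""


def call_with_retry(fn, prompt, max_attempts=3, base_delay=0.01):
    """Call fn(prompt); on RateLimitError, wait and retry with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn(prompt)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))


def query_all(models, prompt):
    """Send one prompt to every model concurrently.

    `models` maps a display name to a callable (assumed interface).
    Returns {name: response text, or an "error: ..." string on failure}.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {
            name: pool.submit(call_with_retry, fn, prompt)
            for name, fn in models.items()
        }
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # one failing model must not hide the others
                results[name] = f"error: {exc}"
    return results
```

In a real deployment the callables would wrap the providers' client libraries, and the caught exception type would be whatever each client raises when a rate limit is hit. Collecting errors per model, rather than aborting, keeps the side-by-side view usable when a single provider is unavailable.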
