CWV × AI: The First Systematic Measurement of Client-Side Neural Network Inference Impact on Core Web Vitals

Srikar Phani Kumar, Marti

Publication type: Preprint (under curation). Language: English. Publisher: Zenodo

doi: 10.5281/zenodo.20381364

Abstract

This paper presents the first systematic benchmark of the impact of client-side neural network inference on Core Web Vitals (CWV) proxies, specifically Interaction to Next Paint (INP), a Core Web Vital included in Google’s page experience signals. Browser-native machine learning libraries such as Transformers.js have enabled inference without server round-trips, but the cost of that inference to user-perceived performance has never been systematically measured. We benchmark four quantized models (DistilBERT, BERT-base, Whisper Tiny, and MobileViT-S) on two real devices (Apple MacBook Pro M1 Max and Samsung Galaxy Z Tri Fold) and under two simulated mobile profiles (4X and 6X CPU throttle), measuring a lab-based INP-equivalent responsiveness proxy, memory pressure, and bundle cost over 10 iterations per configuration. On a high-performance desktop, the measured INP-equivalent ranges from 27.2 ms (DistilBERT, “Good”) to 500.3 ms (Whisper Tiny, “Poor”). On a premium Android device without throttling, the same models produce 57.1 ms to 947.4 ms, roughly a 2X degradation at both ends of the range. On the Galaxy Z Tri Fold with simulated 6X CPU slowdown, Whisper Tiny reaches 6,535 ms. Critically, DistilBERT is the only model that maintains a “Good” INP-equivalent classification across all profiles tested on the M1 Max. These findings establish that model architecture, not parameter count, is the primary predictor of browser inference cost, and they provide the first empirical basis for model selection decisions in interaction-critical web applications.
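The “Good” and “Poor” labels quoted above can be reproduced with a short sketch. It assumes the rating bands Google publishes for INP (at most 200 ms is Good, above 500 ms is Poor), which the abstract does not restate. The function and constant names are hypothetical, and this is not the authors' measurement harness, only a classification of the figures the abstract reports.

```typescript
// Assumption: rating bands follow Google's published INP thresholds;
// the abstract itself does not state them.
type InpRating = "Good" | "Needs Improvement" | "Poor";

const GOOD_MAX_MS = 200; // at or below this value: "Good"
const POOR_MIN_MS = 500; // strictly above this value: "Poor"

// Hypothetical helper: map a latency in milliseconds to an INP rating band.
function classifyInp(latencyMs: number): InpRating {
  if (latencyMs <= GOOD_MAX_MS) return "Good";
  if (latencyMs <= POOR_MIN_MS) return "Needs Improvement";
  return "Poor";
}

// INP-equivalent values quoted in the abstract (labels are descriptive only).
const reported: Array<[string, number]> = [
  ["M1 Max, DistilBERT", 27.2],
  ["M1 Max, Whisper Tiny", 500.3],
  ["Android unthrottled, low end of range", 57.1],
  ["Android unthrottled, high end of range", 947.4],
  ["Galaxy Z Tri Fold at 6X throttle, Whisper Tiny", 6535],
];

for (const [label, ms] of reported) {
  console.log(`${label}: ${ms} ms -> ${classifyInp(ms)}`);
}

// Desktop-to-Android slowdown at each end of the reported range.
const slowdownLow = 57.1 / 27.2;
const slowdownHigh = 947.4 / 500.3;
console.log(`slowdown: ${slowdownLow.toFixed(2)}x to ${slowdownHigh.toFixed(2)}x`);
```

Under these assumed bands, 500.3 ms sits just past the Poor boundary, and the two slowdown ratios bracket the 2X figure from either side.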