
We introduce entity-conditioned probing with resampling, a simple, reproducible method to measure intrinsic brand/site associations in large language models. A single schema-constrained prompt produces top-N lists for each category–locale cell; we collect k independent samples per cell and aggregate with a Plackett–Luce (PL) model to obtain latent worth scores and ranks with 95% bootstrap confidence intervals. In a study of 52 categories × 4 locales (US/GB/DE/JP) totaling 15,600 prompt iterations, we report PL scores alongside frequency baselines (@1/@3) and find strong split-half stability at top-3 (median Spearman = 1.00; mean = 0.876, 95% CI 0.806–0.932; overlap@3 mean = 0.962, 95% CI 0.936–0.985). The method is model-agnostic, emphasizes structured outputs and alias canonicalization, and separates intrinsic association from first-turn prompting effects; when forecasting first-turn outcomes is required, a small stratified panel can be used for monotonic calibration. Code and processed aggregates are openly available (see Related Works). (Preprint v0.6.1.)
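The aggregation step described above can be illustrated with a minimal sketch. It is not the authors' released code: it assumes alias canonicalization has already been applied, treats any item missing from a top-N list as ranked below every listed item, and fits Plackett–Luce worths with the standard minorization-maximization (MM) updates. The function name `fit_plackett_luce` and the toy brand labels are hypothetical.

```python
import numpy as np


def fit_plackett_luce(rankings, n_iter=500, tol=1e-9):
    """Estimate Plackett-Luce worths from top-N lists via MM updates.

    rankings: list of lists, best item first, no repeats within a list.
    Items absent from a list are assumed to rank below all listed items.
    Returns {item: worth}, with worths normalised to sum to 1.
    """
    items = sorted({x for r in rankings for x in r})
    n = len(items)
    if n < 2:
        raise ValueError("need at least two distinct items")
    idx = {x: i for i, x in enumerate(items)}
    # Keep only informative stages: the last pick of a full-length list is forced.
    picks = [[idx[x] for x in r][: min(len(r), n - 1)] for r in rankings]

    # wins[i] = number of stages at which item i was the one picked.
    wins = np.zeros(n)
    for p in picks:
        for i in p:
            wins[i] += 1

    w = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        denom = np.zeros(n)
        for p in picks:
            remaining = np.ones(n, dtype=bool)
            for i in p:
                # Every item still available at this stage shares the same term.
                denom[remaining] += 1.0 / w[remaining].sum()
                remaining[i] = False
        new_w = wins / denom
        new_w /= new_w.sum()
        converged = np.abs(new_w - w).max() < tol
        w = new_w
        if converged:
            break
    return dict(zip(items, w.tolist()))


# Toy example: ten top-2 samples for one category-locale cell (made-up labels).
samples = [["A", "B"]] * 5 + [["B", "A"]] * 2 + [["A", "C"]] * 2 + [["C", "D"]]
worth = fit_plackett_luce(samples)
ranked = sorted(worth, key=worth.get, reverse=True)
print(ranked)  # ['A', 'B', 'C', 'D']
```

A bootstrap interval of the kind reported above could plausibly be approximated by resampling a cell's k lists with replacement, refitting, and taking percentiles of each item's worth. The maximum-likelihood fit is only well defined when every item is ranked above some other item at least once, so sparse cells may need pooling or regularization.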
Computer Science — Computation and Language (cs.CL), reliability, self-consistency, Statistics — Machine Learning (stat.ML), Bradley-Terry, brand recommendations, LLM Evaluation, rank aggregation, locales, entities, SEO, Plackett-Luce, resampling, JSON schema, GPT-5, entity-conditioned probing, split-half, bootstrap, structured outputs, Information Retrieval (cs.IR), confidence intervals
