Powered by OpenAIRE graph
ZENODO
Dataset
Data sources: ZENODO

The QuoteSweep Stated-Appetite Corpus, v1.0

Authors: Shrestha, Ankurman


Abstract

A public dataset of 509 U.S. commercial Property & Casualty (P&C) insurance carriers' publicly available stated-appetite materials, normalized into a single machine-readable JSON schema. Companion to the working paper *Observed Appetite: A Computational Framework for Measuring Commercial Insurance Carrier Underwriting Behavior at Distribution Scale* (Shrestha, 2026).

What this is

For each of 509 commercial P&C carriers, the corpus captures the carrier's publicly available appetite documentation, either a carrier-published PDF appetite guide (n=201) or a carrier-website appetite-page text scrape (n=308), and parses it into a uniform JSON schema covering industry classes, lines of business, state availability, size thresholds, exclusions, and underwriting notes.

The corpus is the empirical substrate for two studies reported in §5 of the paper:

- Analysis B (granularity gap): the structural granularity of stated appetite across six coding dimensions (industry, state, size, exclusions, interactions, dates).
- Analysis A (inter-source agreement): the within-carrier agreement between a carrier's published PDF and its own appetite web page on which NAICS-2 sectors the carrier writes.

Sample and scope

- 509 carriers, 2,031 line-of-business rows, 9,526 appetite class rows
- Scrape window: 2026-03-23 to 2026-04-06 (15 days)
- Coverage: U.S. commercial P&C carriers with any publicly available appetite documentation

Headline findings derived from this corpus

- Only 2.2% of carriers publicly disclose any industry × state × size interaction (95% CI 1.0–3.5%)
- Only 1.2% annotate appetite at six-digit NAICS resolution
- Only 4.7% of line-of-business commitments disclose a revenue threshold
- Same-carrier PDF guides assert 2.14× more sector inclusions than the carrier's own appetite web page
- Cohen's κ = +0.25 [95% CI 0.22, 0.28] between same-carrier PDF and web-page sources on NAICS-2 sector availability ("fair agreement" under Landis & Koch, 1977; n = 189 carriers, 3,780 cells)

Contents

The `appetite-corpus-v1.zip` archive preserves a nested directory structure. Top-level files contain the dataset (`carriers.json`, `sources.csv`, `codebook.md`, `corpus-schema.md`); the `reproduce/` subdirectory contains the reproducibility scripts and analytical audit trail (`compute_analysis_b_v2.py`, `compute_analysis_a_v2.py`, headline JSONs, raw input vectors, and verbatim D5 evidence quotes). Random seed for all bootstraps: `20260518`. See the standalone `README.md` for the full file inventory.

Limitations

The 509-carrier sample over-represents carriers that publish *any* appetite documentation; carriers absent from this corpus are absent because they publish nothing locatable. Headline disclosure rates should be interpreted as ceilings on the U.S. P&C population, not central estimates. Coding was performed by one LLM-assisted agent in a single session; verbatim evidence quotes are included in `reproduce/analysis-b-coding-v2.csv` to enable second-coder validation.

Citation

Shrestha, A. (2026). *The QuoteSweep Stated-Appetite Corpus, v1.0* [Data set]. Zenodo. https://doi.org/10.5281/zenodo.20280436
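
Illustration: inter-source agreement

The inter-source agreement statistic in the headline findings (Cohen's κ between a carrier's PDF guide and its web page, with a bootstrap confidence interval) can be illustrated with a minimal sketch. This is not the code in `compute_analysis_a_v2.py`, which is not reproduced here. The function names `cohen_kappa` and `bootstrap_kappa_ci`, the toy 0/1 matrices, and the choice to resample whole carriers rather than individual cells are all assumptions made for illustration. Only the seed `20260518` and the statistic itself come from the description above. In the real data, 3,780 cells over 189 carriers works out to 20 NAICS-2 sector cells per carrier; the toy rows below use 4 cells each to stay readable.

```python
import random


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of 0/1 labels."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    p_obs = sum(x == y for x, y in zip(a, b)) / n      # observed agreement
    pa1, pb1 = sum(a) / n, sum(b) / n                  # marginal rates of "1"
    p_exp = pa1 * pb1 + (1 - pa1) * (1 - pb1)          # chance agreement
    if p_exp == 1.0:
        return float("nan")                            # kappa is undefined here
    return (p_obs - p_exp) / (1 - p_exp)


def bootstrap_kappa_ci(pdf_rows, web_rows, n_boot=2000, seed=20260518, alpha=0.05):
    """Percentile bootstrap CI for kappa.

    Assumption (not confirmed by the source): carriers are the resampling
    unit, so each draw keeps all of a carrier's sector cells together.
    """
    rng = random.Random(seed)
    n = len(pdf_rows)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a = [v for i in idx for v in pdf_rows[i]]
        b = [v for i in idx for v in web_rows[i]]
        k = cohen_kappa(a, b)
        if k == k:                                     # drop NaN replicates
            stats.append(k)
    stats.sort()
    m = len(stats)
    low = stats[int((alpha / 2) * m)]
    high = stats[min(m - 1, int((1 - alpha / 2) * m))]
    return low, high


# Hypothetical toy data: 3 carriers x 4 sectors, 1 = source says "in appetite".
pdf_rows = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0]]
web_rows = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]

flat_pdf = [v for row in pdf_rows for v in row]
flat_web = [v for row in web_rows for v in row]
kappa = cohen_kappa(flat_pdf, flat_web)
ci_low, ci_high = bootstrap_kappa_ci(pdf_rows, web_rows)
print(round(kappa, 3))  # 0.526 for the toy data above
```

The toy PDF rows assert more inclusions than the toy web rows, mirroring the direction of the 2.14× finding, but the numbers themselves carry no empirical meaning. Fixing the seed makes the interval reproducible across runs.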
