
Understanding job titles, career trajectories, and promotions provides valuable insight into labor market dynamics and professional mobility. We present Career Map (CMap), a novel dataset spanning 24 industry sectors, systematically structured to study job specialization, sector concentration, and career advancement. Using natural language processing techniques and large language models, we standardize 6.2 million job titles into 109 thousand unique titles and introduce a Specialization Index that quantifies how specialized a title is within its sector. The dataset includes both a structured job titles dataset and a set of identified promotions: 30 thousand validated promotions from the United States and the United Kingdom, and 72 thousand inferred promotions from a global context. It enables research on job hierarchies, workforce mobility, and systemic inequalities in professional advancement. By providing insights into career progression patterns, labor market structures, and the impact of education and experience, the dataset serves as a resource for economists, sociologists, and computational researchers studying employment trends across industries and regions.

This repository contains the code necessary to recreate Figure 4 and Table 4 from the original manuscript.
