
The expansion of the online labour market presents new opportunities to analyse job trends with Machine Learning (ML). However, the effectiveness of ML depends on access to well-labelled job advertisement datasets, which are often limited and require labour-intensive manual annotation. Our proposed solution, JobGen, leverages Large Language Models (LLMs) to generate synthetic Online Job Advertisements (OJAs), using real data and the ESCO taxonomy to ensure that the generated advertisements accurately represent job market distributions. JobGen enhances data diversity and semantic alignment, two common weaknesses of synthetic data generation. The resulting dataset, JobSet, provides a resource for tasks such as skill extraction and job matching and is openly available to the community.
Large Language Models, Natural language processing, Machine learning, Labour Market
