ZENODO
Dataset . 2025
License: CC BY
Data sources: ZENODO

LSST light curves for constant and variable sources, and for point-like and extended objects microlensing

Authors: Crispim Romão, Miguel; Croon, Djuna; Godines, Daniel


Abstract

This repository contains the dataset that accompanies the paper "Anomaly Detection to Identify Transients in LSST Time Series Data", which should be consulted for further details, along with the artefacts of the trained machine learning models.

The dataset was generated using simulated LSST light curves for the Vera C. Rubin Observatory cadence and observational conditions via rubin-sim. It comprises approximately 600 000 light curves covering various transient events, including microlensing signals and variable stars, as well as non-variable signal-less sources used to train the anomaly detection model.

The dataset includes six distinct classes: Constant (non-variable signal-less sources), RR Lyrae variables, Point-like Microlensing (ML), Binary Microlensing (Binary ML), Boson Stars (BS), and NFW Subhalos (NFW). The total number of simulated light curves for each class is as follows:

  • BS: 320 494
  • Binary ML: 84 022
  • ML: 53 565
  • RR Lyrae: 49 573
  • NFW: 47 837
  • Constant: 41 522

The light curves incorporate rubin-sim noise simulation and the LSST 10-year baseline cadence strategy (v2.0). Light curves for Constant, variable, and point-like microlensing events were simulated using MicroLIA, while binary microlensing events were generated using pyLIMA. Light curves for the BS and NFW objects were simulated using the code from this work.

The dataset contains 182 columns covering simulation and generation parameters, observable time series features, the time series itself, and the predictions from the machine learning models used in the paper. The columns are organised by type using prefixes and suffixes:

  • 'timestamps', 'mag', 'magerr': Light curve data.
  • 'gen': Generation parameters (metadata).
  • 'sim': Simulation parameters (metadata).
  • 'feature_' prefix: Features extracted from the light curve and from its derivative; the latter are marked with the suffix 'deriv'.
  • 'iforest_output': iForest anomaly score.
  • 'pred_': Probabilities and class prediction for the multiclass classifier.
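The column naming convention described above lends itself to a small grouping helper. The following is a minimal sketch, not code from the dataset authors: the function name `group_columns` and the group labels are hypothetical, and it assumes that 'gen' and 'sim' appear at the start or end of a column name, which the description does not specify.

```python
from collections import defaultdict

# Column names given explicitly in the dataset description.
LIGHT_CURVE_COLUMNS = ("timestamps", "mag", "magerr")


def group_columns(columns):
    """Group column names by the prefix/suffix convention of the dataset.

    Assumption: 'gen' and 'sim' mark metadata columns as either a prefix
    or a suffix; the exact placement is not stated in the description.
    """
    groups = defaultdict(list)
    for name in columns:
        if name in LIGHT_CURVE_COLUMNS:
            groups["light_curve"].append(name)
        elif name.startswith("feature_"):
            # Features of the derivative carry the 'deriv' suffix.
            key = "feature_deriv" if name.endswith("deriv") else "feature"
            groups[key].append(name)
        elif name == "iforest_output":
            groups["iforest"].append(name)
        elif name.startswith("pred_"):
            groups["prediction"].append(name)
        elif name.startswith("gen") or name.endswith("gen"):
            groups["generation"].append(name)
        elif name.startswith("sim") or name.endswith("sim"):
            groups["simulation"].append(name)
        else:
            # Anything that does not match the documented convention.
            groups["other"].append(name)
    return dict(groups)
```

Passing the column index of the loaded table to `group_columns` gives a quick overview of how the 182 columns split across the categories; any names that land in "other" indicate that the assumed convention needs adjusting.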
The dataset is provided in 'parquet' format, accessible in Python via 'pandas' by installing the 'parquet' optional dependency (i.e., pip install pandas[parquet]).

The artefacts were generated in Python 3.9.21 using scikit-learn 1.4.1. The imputer_train.pkl file is required to impute missing values before predicting with the iForest model (final_isolation_forest_model.pkl), as the iForest model does not handle missing or nan values. The multiclass classifier (classifier.pck) handles missing and nan values directly and was trained without imputed data.

Please cite the paper alongside the Zenodo entry if you use this dataset:

@article{CrispimRomao:2025pyl,
    author = "Crispim Romao, Miguel and Croon, Djuna and Godines, Daniel",
    title = "{Anomaly Detection to identify Transients in LSST Time Series Data}",
    eprint = "2503.09699",
    archivePrefix = "arXiv",
    primaryClass = "astro-ph.SR",
    reportNumber = "IPPP/25/15",
    month = "3",
    year = "2025"
}
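The impute-then-score order required by the iForest artefact can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the helper names are hypothetical, the .pkl files are assumed to be standard pickles, the model inputs are assumed to be the 'feature_' columns unless the fitted imputer records its own column names, and the entry does not state whether `score_samples` or `decision_function` produced the 'iforest_output' column.

```python
import pickle
from pathlib import Path


def load_artefact(path):
    """Unpickle a released artefact. Only unpickle files you trust."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def select_feature_columns(columns, estimator=None):
    """Choose the model input columns.

    Prefers the names scikit-learn records at fit time
    (`feature_names_in_`); otherwise falls back to the 'feature_'
    prefix, which is an assumption about the model inputs.
    """
    names = getattr(estimator, "feature_names_in_", None)
    if names is not None:
        return list(names)
    return [name for name in columns if name.startswith("feature_")]


def anomaly_scores(features, imputer, iforest):
    """Impute missing values first, then score with the isolation forest."""
    imputed = imputer.transform(features)
    # For scikit-learn's IsolationForest, lower scores are more anomalous.
    return iforest.score_samples(imputed)


def score_dataset(parquet_path, imputer_path, iforest_path):
    """Load the table and artefacts, then return one score per light curve."""
    import pandas as pd  # lazy import; requires pandas[parquet]

    frame = pd.read_parquet(parquet_path)
    imputer = load_artefact(imputer_path)
    iforest = load_artefact(iforest_path)
    columns = select_feature_columns(frame.columns, imputer)
    return anomaly_scores(frame[columns], imputer, iforest)
```

A call would look like `score_dataset(parquet_path, "imputer_train.pkl", "final_isolation_forest_model.pkl")`, where `parquet_path` is the path of the downloaded parquet file. Because the artefacts were pickled with scikit-learn 1.4.1, loading them in an environment with a matching version avoids version-mismatch warnings or errors.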

Keywords

LSST, Extended Objects, Microlensing, Dark Matter, Anomaly Detection, iForest, NFW Subhalos, Light curves, Boson Stars
