
# Evaluation Datasets for ChatTS

We provide the two evaluation datasets we collected, as mentioned in the paper. Each dataset sample contains the following fields: `timeseries`, `question`, `answer` (standard answers in text format, for reference only), `attributes` (structured labels used for result evaluation), and `ability_types` (the tasks included in the question).

To reduce the evaluation cost, we have merged the different questions for the same time series into a single `question`, using numbering to distinguish between them. The actual number of questions in the evaluation dataset may therefore be greater than the number of `timeseries` entries. Some inductive reasoning and alignment tasks are also grouped into the same question, because the inductive reasoning tasks involve explaining the physical meanings of time series attributes.

The `MCQ2` dataset is a third-party open-source dataset, so we do not provide it in this repository. Please download it directly from https://github.com/behavioral-data/TSandLanguage.

## Data Source

Dataset A includes real-world time series data collected from the following sources:

- NAB (https://github.com/numenta/NAB)
- Weather (https://www.bgc-jena.mpg.de/wetter/)
- Oracle (https://zenodo.org/records/6955909)
- AIOps (https://github.com/netmanaiops/kpi-anomaly-detection)

## License

Our datasets use the Creative Commons Attribution 4.0 International License (CC BY 4.0). If you use this dataset, you must:

1. Give Attribution: When using the dataset, clearly mention the dataset name and the sources NAB, Weather, Oracle, and AIOps.
2. Share Alike: If you make something new based on this dataset, share it under the same CC BY 4.0 license.
3. Follow Source Restrictions: If any source dataset has special rules (like some parts of the Weather dataset under CC-BY-4.0), follow those too.

By using this dataset, you agree to follow all laws and the license terms of each source dataset and this overall license.
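
Because several numbered questions are merged into one `question` per time series, the number of `timeseries` entries understates the number of evaluated tasks. The following is a minimal sketch of how the two counts could be compared. It assumes each sample is a dictionary with the five fields listed in this README and that `ability_types` is a list with one entry per merged question. The sample values, the task names, and the helper `count_tasks` are hypothetical and are not taken from the datasets.

```python
from collections import Counter

# Hypothetical samples. The field names follow this README; the values and
# the exact types (nested lists, dicts, task names) are assumptions.
samples = [
    {
        "timeseries": [[1.0, 1.2, 5.0, 1.1]],
        "question": "1. Is there a spike? 2. What is the overall trend?",
        "answer": "1. Yes. 2. Steady.",
        "attributes": {"spike": True, "trend": "steady"},
        "ability_types": ["local", "trend"],
    },
    {
        "timeseries": [[0.0, 0.5, 1.0, 1.5]],
        "question": "1. What is the overall trend?",
        "answer": "1. Increasing.",
        "attributes": {"trend": "increasing"},
        "ability_types": ["trend"],
    },
]


def count_tasks(samples):
    """Count how often each ability type occurs across all samples."""
    counts = Counter()
    for sample in samples:
        counts.update(sample["ability_types"])
    return counts


task_counts = count_tasks(samples)
num_entries = len(samples)              # one entry per time series
num_tasks = sum(task_counts.values())   # one count per merged question

print(f"{num_entries} timeseries entries, {num_tasks} task instances")
print(dict(task_counts))
```

If the real files store `ability_types` in a different shape, only the body of `count_tasks` would need to change.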
