
Conceptual annotations and provenance can provide contextual information to inform a range of data processing activities. In this workstream we will be utilising the metadata generated in the earlier workstreams – the questions and response domains from the metadata extraction phase and the concorded variables from the conceptual comparison phase – to identify key variables, those that although are not sensitive in of themselves, have the potential to be disclosive if used in combination. This identification will be achieved using state-of-the-art text classification methods, which we will also use to identify such metadata as identifiers and weight variables. Rule-based classifiers will further interrogate the variable metadata to determine its classification hierarchy and level, e.g., a socio-economic variable may be coded using the ONS NS-SeC classification hierarchy at the 8-class analytic level. This enhanced metadata can then be combined with the data itself to provide an enhanced curation platform – one which allows our data curators to evaluate and mitigate the disclosure risk of a dataset with relative ease. The resulting platform will be powered by metadata and microdata stored using the DDI-CDI schema, utilising such aspects as its variable cascade.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
