
# Styloformer: Automatic Classification of Art Film Scenes

This repository contains the implementation of **Styloformer**, a multimodal transformer framework for **automatic classification of art film scenes** based on **image and audio deep features**. The project integrates **visual, auditory, textual, and curatorial signals** into a unified representation space, enabling both predictive performance and art-historical interpretability.

---

## ✨ Key Features

- **Multimodal Fusion**
  Cross-modal attention mechanism dynamically aligns visual and auditory features for robust scene understanding.
- **Styloformer Architecture**
  A transformer-based framework integrating:
  - Stylistic clustering
  - Canonicality estimation
  - Influence prediction
  - Historiographic navigation
- **Historiographic Navigation**
  Novel interpretive module embedding ontological priors and temporal logic for reasoning about artistic influence.
- **State-of-the-Art Performance**
  - **MovieNet dataset**: 91.85% accuracy, 94.31% AUC
  - Outperforms baselines like **CLIP**, **ViT**, and **PANDA**

---

## 📂 Datasets

Experiments were conducted on several benchmarks:

- **MovieNet** – narrative and stylistic structure in cinema
- **Hollywood2** – action and scene classification
- **MovieGraphs** – graph-based social interaction semantics
- **TACoS** – fine-grained visual-text alignment
- **CineArtSet (new)** – curated art film dataset (1,920 clips, 54 films, 9,458 labeled scenes)

---

## ⚙️ Installation

```bash
# Clone this repo
git clone https://github.com//styloformer.git
cd styloformer

# Create environment
conda create -n styloformer python=3.9
conda activate styloformer

# Install dependencies
pip install -r requirements.txt
```
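
## 🧩 Cross-Modal Attention Sketch

The multimodal fusion described under Key Features relies on cross-modal attention between visual and auditory features. The snippet below is a minimal, dependency-free sketch of that general idea and is not the repository's implementation: it assumes plain scaled dot-product attention with visual tokens as queries and audio tokens as keys and values, omits learned projections and multiple heads, and uses hypothetical names (`cross_modal_attention`, `visual_tokens`, `audio_tokens`) chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(visual, audio):
    """Let every visual token attend over all audio tokens.

    visual: list of query vectors, one per visual token
    audio:  list of key/value vectors, one per audio token
    Both modalities are assumed to share the same feature dimension.
    Returns (fused, weights), where fused[i] is the visual token plus
    its audio context and weights[i] is its attention distribution.
    """
    dim = len(visual[0])
    scale = math.sqrt(dim)
    fused, weights = [], []
    for query in visual:
        # Scaled dot-product similarity between this query and each audio key.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in audio]
        attn = softmax(scores)
        # Attention-weighted sum of the audio vectors (used here as values).
        context = [sum(w * value[i] for w, value in zip(attn, audio)) for i in range(dim)]
        # Residual connection: keep the visual signal and add the audio context.
        fused.append([q + c for q, c in zip(query, context)])
        weights.append(attn)
    return fused, weights


# Toy inputs: two visual tokens and three audio tokens, each of dimension 2.
visual_tokens = [[1.0, 0.0], [0.0, 1.0]]
audio_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_modal_attention(visual_tokens, audio_tokens)
```

Each row of `weights` is a probability distribution over the audio tokens, so inspecting it shows which audio segments a given visual token aligned with. A trained model would typically apply learned query, key, and value projections before this step.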
