Neural Networks
Article . 2025 . Peer-reviewed
License: Elsevier TDM
Data sources: Crossref
https://doi.org/10.2139/ssrn.4...
Article . 2024 . Peer-reviewed
Data sources: Crossref
DBLP
Article . 2025
Data sources: DBLP

Explicitly Diverse Visual Question Generation

Authors: Jiayuan Xie; Jiasheng Zheng; Wenhao Fang; Yi Cai; Qing Li

Abstract

Visual question generation is the task of generating meaningful questions about an image. Although significant progress has been made in automatically generating a single high-quality question for an image, existing methods often ignore the diversity and interpretability of the generated questions, both of which matter for everyday tasks that require clear question sources. In this paper, we propose an explicitly diverse visual question generation model that generates diverse questions from interpretable question sources. To make generation explicit, our model first extracts a scene graph from the image using an unbiased scene graph generation method, so that each question generated from the scene graph has an interpretable source. To ensure diversity, our model selects different subgraphs of the scene graph as question sources. Specifically, we employ a subgraph selector that learns how humans select multiple subgraphs suitable for question generation. Finally, our model generates diverse questions based on the different selected subgraphs. Extensive experiments on the VQA v2.0 and COCO-QA datasets show that the proposed model outperforms the baselines and generates diverse questions in an interpretable manner.
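To make the subgraph-selection step in the abstract more concrete, here is a minimal toy sketch. It is not the authors' method. It assumes a scene graph given as (subject, predicate, object) triples, enumerates small connected subgraphs, and greedily picks high-scoring ones with low mutual overlap. The names `candidate_subgraphs` and `select_diverse`, the Jaccard overlap rule, and the `score` callback are all hypothetical. In the paper the selector is learned from human question data; here any scoring function can stand in for it.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Subgraph = FrozenSet[Triple]


def is_connected(triples: Sequence[Triple]) -> bool:
    """True if the triples form one component linked by shared entities."""
    if not triples:
        return False
    seen = {triples[0][0], triples[0][2]}
    remaining = list(triples[1:])
    changed = True
    while remaining and changed:
        changed = False
        for t in list(remaining):
            if t[0] in seen or t[2] in seen:
                seen.update((t[0], t[2]))
                remaining.remove(t)
                changed = True
    return not remaining


def candidate_subgraphs(graph: List[Triple], max_size: int = 2) -> List[Subgraph]:
    """Enumerate connected subgraphs containing up to max_size triples."""
    out: List[Subgraph] = []
    for k in range(1, max_size + 1):
        for combo in combinations(graph, k):
            if is_connected(combo):
                out.append(frozenset(combo))
    return out


def jaccard(a: Subgraph, b: Subgraph) -> float:
    """Overlap between two non-empty subgraphs, in [0, 1]."""
    return len(a & b) / len(a | b)


def select_diverse(
    candidates: List[Subgraph],
    score: Callable[[Subgraph], float],  # hypothetical stand-in for a learned selector
    k: int,
    max_overlap: float = 0.5,
) -> List[Subgraph]:
    """Greedily keep the best-scoring subgraphs whose overlap with every
    already chosen subgraph does not exceed max_overlap."""
    chosen: List[Subgraph] = []
    for sg in sorted(candidates, key=score, reverse=True):
        if len(chosen) == k:
            break
        if all(jaccard(sg, c) <= max_overlap for c in chosen):
            chosen.append(sg)
    return chosen
```

Each selected subgraph would then be passed to a question generator, so every question can be traced back to the triples it was generated from. Setting `max_overlap=0.0` forces the selected subgraphs to be disjoint, which is the strictest form of diversity this toy rule allows.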

Keywords

Visual Perception; Humans; Neural Networks, Computer; Algorithms
