Learning Deep Representations, Embeddings and Codes from the Pixel Level of Natural and Medical Images

Authors: Kiros, Ryan J.

Abstract

Significant research has gone into engineering representations that can identify high-level semantic structure in images, such as objects, people, events and scenes. Recently there has been a shift towards learning representations of images either on top of dense features or directly from the pixel level. These features are often learned in hierarchies using large amounts of unlabeled data, with the goal of removing the need for hand-crafted representations. In this thesis we consider the task of learning two specific types of image representations from standard-size RGB images: a semi-supervised dense low-dimensional embedding and an unsupervised sparse binary code. We introduce a new algorithm called the deep matching pursuit network (DMP) that efficiently learns features layer by layer from the pixel level without the need for backpropagation fine-tuning. The DMP network can be seen as a generalization of the single-layer networks of Coates et al. to multiple layers and larger images. We apply our features to several tasks including object detection, scene and event recognition, image auto-annotation and retrieval. For auto-annotation, we achieve competitive performance against methods that use 15 distinct hand-crafted features. We also apply our features to handwritten digit recognition on MNIST, achieving the best reported error when no distortions are used for training. When our features are combined with t-SNE, we obtain highly discriminative two-dimensional image visualizations. Finally, we introduce the multi-scale DMP network for domain-independent multimodal segmentation of medical images. We obtain the top performance on the MICCAI lung vessel segmentation (VESSEL12) competition and competitive performance on the MICCAI multimodal brain tumor segmentation (BRATS2012) challenge. We conclude by discussing how the deep matching pursuit network can be applied to other modalities such as RGB-D images and spectrograms.
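
For illustration only, below is a minimal sketch of the kind of single-layer, patch-based pipeline that the deep matching pursuit network generalizes: sample image patches, whiten them, learn a small dictionary, and encode each patch with a greedy matching-pursuit step. All function names, parameter values and the toy data are hypothetical and are not taken from the thesis; the sketch simply shows the general technique in Python/NumPy.

    # Hypothetical sketch of single-layer patch feature learning with a
    # matching-pursuit encoder (names and settings are illustrative only).
    import numpy as np

    def extract_patches(images, patch_size=6, n_patches=2000, rng=None):
        """Sample random patch_size x patch_size patches from grayscale images."""
        rng = np.random.default_rng(rng)
        H, W = images.shape[1:3]
        patches = []
        for _ in range(n_patches):
            img = images[rng.integers(len(images))]
            y = rng.integers(H - patch_size + 1)
            x = rng.integers(W - patch_size + 1)
            patches.append(img[y:y + patch_size, x:x + patch_size].ravel())
        return np.asarray(patches, dtype=np.float64)

    def whiten(X, eps=0.1):
        """Contrast-normalize and ZCA-whiten patches (standard preprocessing)."""
        X = X - X.mean(axis=1, keepdims=True)
        X = X / (X.std(axis=1, keepdims=True) + 1e-8)
        cov = np.cov(X, rowvar=False)
        U, S, _ = np.linalg.svd(cov)
        W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T
        return X @ W, W

    def kmeans_dictionary(X, k=64, iters=10, seed=0):
        """Learn k unit-norm centroids (the 'dictionary') with spherical k-means."""
        rng = np.random.default_rng(seed)
        D = X[rng.choice(len(X), k, replace=False)]
        D /= np.linalg.norm(D, axis=1, keepdims=True) + 1e-8
        for _ in range(iters):
            assign = np.argmax(X @ D.T, axis=1)   # nearest centroid by dot product
            for j in range(k):
                members = X[assign == j]
                if len(members):
                    D[j] = members.mean(axis=0)
            D /= np.linalg.norm(D, axis=1, keepdims=True) + 1e-8
        return D

    def matching_pursuit_encode(x, D, n_nonzero=5):
        """Greedy matching pursuit: pick the best-matching atom, subtract, repeat."""
        residual = x.copy()
        code = np.zeros(len(D))
        for _ in range(n_nonzero):
            j = np.argmax(np.abs(D @ residual))
            coef = D[j] @ residual
            code[j] += coef
            residual -= coef * D[j]
        return code

    # Toy usage: 50 random 32x32 "images"
    images = np.random.rand(50, 32, 32)
    P = extract_patches(images)
    Pw, Wz = whiten(P)
    D = kmeans_dictionary(Pw)
    code = matching_pursuit_encode(Pw[0], D)   # sparse code for one patch
    print(np.count_nonzero(code))              # at most n_nonzero entries

Stacking such layers, with each layer encoding pooled outputs of the previous one, gives a rough picture of how features could be learned layer by layer without backpropagation fine-tuning; the actual DMP architecture is described in the thesis itself.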

Keywords

Machine Learning, Deep Learning, Computer Vision, Representation Learning
