Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

Mao, Junhua; Xu, Wei; Yang, Yi; Wang, Jiang; Huang, Zhiheng; Yuille, Alan

doi:10.48550/arxiv.1412.6632

Shared by NobleBlocks on Dec 20, 2014 • 12:00 AM UTC

Abstract

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by sampling from this distribution. The model consis...
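The generation procedure the abstract describes, drawing each word from a distribution conditioned on the previous words and the image, can be illustrated with a minimal sketch. This is not the m-RNN itself: `next_word_probs`, `VOCAB`, the `<end>` token, and the three-number `image_feat` vector are all hypothetical stand-ins chosen for illustration, and the toy scoring function only mimics the interface of a model that returns P(word | previous words, image).

```python
import math
import random

# Hypothetical toy vocabulary; "<end>" is an assumed stop token.
VOCAB = ["<end>", "a", "dog", "cat", "runs", "sits"]


def next_word_probs(prev_words, image_feat):
    """Hypothetical stand-in for a trained model.

    Returns a probability for every word in VOCAB, conditioned on the
    words generated so far and on an image feature vector.
    """
    # Toy scores that depend on the image features and the word index.
    scores = [image_feat[i % len(image_feat)] + 0.1 * i for i in range(len(VOCAB))]
    # Make the stop token more likely as the caption grows.
    scores[0] += 0.8 * len(prev_words)
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_caption(image_feat, max_len=10, seed=0):
    """Sample a caption one word at a time from the conditional distribution."""
    rng = random.Random(seed)
    words = []
    for _ in range(max_len):
        probs = next_word_probs(words, image_feat)
        word = rng.choices(VOCAB, weights=probs, k=1)[0]
        if word == "<end>":
            break
        words.append(word)
    return words


if __name__ == "__main__":
    print(sample_caption([0.2, 0.5, 0.1]))
```

Each loop iteration recomputes the distribution from the words sampled so far, which is what makes the process autoregressive; a real system would replace the toy scoring function with the trained network's output layer.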

Subject

Recurrent neural network

Computer science

Closed captioning
