In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by sampling from this distribution. The model consis...
Research Assistant
AI chat, annotations, notes & similar papers
No comments yet
Be the first to share your thoughts!