Matching images and sentences demands a fine understanding of both modalities. In this article, we propose a new system to discriminatively embed the image and text to a shared visual-textual space. In this field, most existing works apply the ranking loss to pull the positive image/text pairs close...