masked token modeling tagged posts

Computer Vision System Marries Image Recognition and Generation

Computer vision system marries image recognition and generation
A unified vision system known as Masked Generative Encoder (MAGE), developed by researchers at MIT and Google, could be useful for many things, like finding and classifying objects in an image, learning from just a few examples, generating images with specific conditions such as text or class, editing existing images, and more. Credit: Alex Shipps/MIT CSAIL via Midjourney

Computers possess two remarkable capabilities with respect to images: They can both identify them and generate them anew. Historically, these functions have stood separate, akin to the disparate acts of a chef who is good at creating dishes (generation), and a connoisseur who is good at tasting dishes (recognition).

Yet, one can’t help but wonder: What would it take to orchestrate a harmonious union between these t...

Read More