We present an instance-based attention model that predicts where humans are likely to look first when searching for an object instance, and we demonstrate its application to image synthesis. The proposed model learns configurational rules from a large collection of scene images described by global scene representations. These rules are then used to predict the focus of attention when searching for a given object instance with a specific scale and pose. Finally, the image synthesis results are obtained by placing the object instance into the scene at the position that attracts the most attention. Promising experimental results demonstrate the effectiveness of the proposed model.
© 2012 Optical Society of America
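The final synthesis step described in the abstract, placing the object instance at the position that attracts the most attention, can be illustrated with a minimal sketch. It assumes an attention map is already available as a 2D grid of scores; how the model learns configurational rules and produces that map is not specified in the abstract and is not shown here. The names `argmax_2d` and `place_object`, the toy data, and the simple paste-and-clip compositing are all hypothetical choices made for illustration, not the authors' implementation.

```python
def argmax_2d(attention):
    """Return (row, col) of the largest value in a 2D list of attention scores."""
    best = (0, 0)
    for r, row in enumerate(attention):
        for c, value in enumerate(row):
            if value > attention[best[0]][best[1]]:
                best = (r, c)
    return best


def place_object(scene, patch, attention):
    """Paste `patch` into a copy of `scene`, centered on the attention peak.

    Hypothetical helper: the compositing used in the paper is not described
    in the abstract. Patch pixels that fall outside the scene are dropped.
    """
    height, width = len(scene), len(scene[0])
    ph, pw = len(patch), len(patch[0])
    peak_r, peak_c = argmax_2d(attention)
    # Top-left corner so that the patch is (roughly) centered on the peak.
    top, left = peak_r - ph // 2, peak_c - pw // 2
    out = [row[:] for row in scene]  # copy, so the input scene is untouched
    for i in range(ph):
        for j in range(pw):
            r, c = top + i, left + j
            if 0 <= r < height and 0 <= c < width:
                out[r][c] = patch[i][j]
    return out, (peak_r, peak_c)


# Toy example: a blank 4x5 scene, a 2x2 object, and a made-up attention map.
scene = [[0] * 5 for _ in range(4)]
patch = [[9, 9], [9, 9]]
attention = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.2, 0.3, 0.1, 0.0],
    [0.0, 0.3, 0.8, 0.2, 0.0],
    [0.0, 0.1, 0.2, 0.1, 0.0],
]
result, peak = place_object(scene, patch, attention)
```

In this sketch the scale and pose of the object are taken as already fixed by the patch itself, matching the abstract's setting of searching for a given instance; a fuller version would presumably blend the patch with the background rather than overwrite pixels.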