Abstract: Recently, CLIP has been applied to pixel-level zero-shot learning tasks via a two-stage scheme. The general idea is to first generate class-agnostic region proposals and then feed the ...