ALEXANDRIA, Va., June 16 -- United States Patent No. 12,657,400, issued on June 16, was assigned to Salesforce Inc. (San Francisco).
"Systems and methods for vision-language model instruction tuning" was invented by Wenliang Dai (Singapore), Junnan Li (Singapore), Chu Hong Hoi (Singapore) and Dongxu Li (Singapore).
According to the abstract* released by the U.S. Patent & Trademark Office: "Embodiments described herein provide a method of generating a vision-language task output to a text instruction relating to an input image, the method comprising receiving, via a data interface, the input image and the text instruction comprising an instruction relating to the image. The method further includes encoding, via an image encoder, the image ...