SenseTime, a facial recognition and AI company, has joined forces with the Shanghai Artificial Intelligence Laboratory, the Chinese University of Hong Kong, and Shanghai Jiao Tong University to launch an open-source general vision platform that they hope will aid object detection and other applications in both industry and academia.
An unnamed assistant director of the Shanghai Artificial Intelligence Laboratory, quoted in a press release issued by SenseTime, says the platform, called OpenGVLab, will foster the development of the AI industry by “providing a platform to better explore and apply general vision AI technology, thus resolving bottlenecks amid project expansion and continuing our contribution to the progress of AI technology.”
According to SenseTime, OpenGVLab is based on a general vision model known as ‘INTERN,’ which the Shanghai Artificial Intelligence Laboratory, the Chinese University of Hong Kong, and Shanghai Jiao Tong University developed jointly last year. It is said to address a major barrier to general vision development by using a single model to perform multiple tasks. Face detection is one of the tasks examined in the INTERN paper, but facial recognition and biometrics in general are not.
One advantage of INTERN, according to SenseTime, is its strength in pre-training general vision models for classification, detection, segmentation, and depth estimation. The pre-trained models can then help developers build algorithm models for hundreds of visual tasks and scenes, thereby addressing the ‘long-tail’ problem of AI development and promoting large-scale AI application. SenseTime says that, besides offering “ultra-large-scale datasets” encompassing “tens of millions of datasets” and a labelling system provided by the Shanghai Artificial Intelligence Laboratory, it can reduce input costs and the cost of collecting downstream data, thus improving efficiency.
SenseTime will also launch what it calls the industry’s first benchmark for general vision model evaluation, which will allow developers to assess and optimise the performance of various general vision models.