SmilingRobo

Pollen-Vision

By Pollen Robotics

vision-model

Pollen-Vision: Unified interface for Zero-Shot vision models in robotics

Simple and unified interface to zero-shot computer vision models curated for robotics use cases.

Key Features of Pollen-Vision

Zero-Shot Capabilities: The library includes several state-of-the-art zero-shot vision models. These models are pre-trained, so they can detect and segment objects without any task-specific training or fine-tuning on your own data. This makes them instantly usable for a wide range of robotic tasks.

3D Object Detection: Pollen-Vision's initial focus is on 3D object detection, providing robots with the ability to perceive the spatial coordinates of objects in their environment. This is a crucial capability for tasks like robotic grasping.

Ease of Use: The library is designed for simplicity and ease of integration. Developers can quickly set up a 3D object detection pipeline using just a few lines of code.

Modular Architecture: Pollen-Vision is composed of independent modules that can be combined to create custom vision pipelines. This allows users to pick and choose the models that best suit their needs.
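
To make the 3D object detection idea concrete, here is a minimal sketch of one common way to turn a 2D detection into a 3D position: back-projecting the bounding box centre through a pinhole camera model using a depth image. This is an illustration under stated assumptions, not Pollen-Vision's actual implementation. The names Intrinsics and bbox_to_3d are hypothetical, and the depth image is assumed to be aligned with the RGB image and expressed in metres.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Intrinsics:
    """Pinhole camera parameters in pixels (hypothetical container)."""
    fx: float
    fy: float
    cx: float
    cy: float


def bbox_to_3d(bbox, depth, K):
    """Back-project the centre of a 2D box to a 3D point in the camera frame.

    bbox  -- (xmin, ymin, xmax, ymax) in pixels, max edges exclusive
    depth -- row-major grid of depths in metres; 0 marks a missing reading
    K     -- Intrinsics of the camera that produced the depth image
    Returns (X, Y, Z) in metres, or None if the box holds no valid depth.
    """
    xmin, ymin, xmax, ymax = bbox
    # The median over the box tolerates holes and stray background pixels
    # better than reading a single pixel at the centre.
    samples = [
        depth[v][u]
        for v in range(ymin, ymax)
        for u in range(xmin, xmax)
        if depth[v][u] > 0
    ]
    if not samples:
        return None
    z = median(samples)
    u = (xmin + xmax) / 2.0
    v = (ymin + ymax) / 2.0
    # Standard pinhole back-projection: X = (u - cx) * Z / fx, likewise for Y.
    x = (u - K.cx) * z / K.fx
    y = (v - K.cy) * z / K.fy
    return (x, y, z)
```

In a grasping setup, the returned point would still need to be transformed from the camera frame into the robot's base frame before it can be used as a reach target.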

Key Models in Pollen-Vision

The Pollen-Vision library currently includes the following cutting-edge vision models:

OWL-ViT: A zero-shot 2D object detection model from Google Research that can localize objects based on text descriptions.

MobileSAM: A lightweight version of the Segment Anything Model (SAM) from Meta AI, enabling zero-shot image segmentation.

RAM: The Recognize Anything Model from OPPO Research Institute, designed for zero-shot image tagging and classification.
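
The three models above cover complementary steps (tagging, detection, segmentation), so one plausible way to combine them is to chain them: tags become text prompts for the detector, and detected boxes become prompts for the segmenter. The sketch below shows that composition with stand-in functions. Everything in it (Detection, run_pipeline, the fake_* stages) is hypothetical and does not reflect the library's real class or function names; the stubs only stand in for RAM, OWL-ViT and MobileSAM.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels


@dataclass
class Detection:
    label: str
    box: Box
    score: float


# Each stage is a plain callable, so any one can be swapped independently.
Tagger = Callable[[object], List[str]]
Detector = Callable[[object, List[str]], List[Detection]]
Segmenter = Callable[[object, List[Box]], List[object]]


def run_pipeline(image, tagger: Tagger, detector: Detector,
                 segmenter: Segmenter, min_score: float = 0.1):
    """Tag the image, detect the tagged objects, then segment each box."""
    labels = tagger(image)
    detections = [d for d in detector(image, labels) if d.score >= min_score]
    masks = segmenter(image, [d.box for d in detections])
    return list(zip(detections, masks))


# Stand-in stages (assumptions for illustration only, no real models involved).
def fake_tagger(image):
    return ["mug", "spoon"]


def fake_detector(image, labels):
    scores = (0.9, 0.05)
    return [
        Detection(label, (i * 10, 0, i * 10 + 5, 5), score)
        for i, (label, score) in enumerate(zip(labels, scores))
    ]


def fake_segmenter(image, boxes):
    return [f"mask{box}" for box in boxes]


results = run_pipeline(None, fake_tagger, fake_detector, fake_segmenter)
```

With these stubs, the low-scoring "spoon" detection is filtered out before segmentation, so only the "mug" detection and its mask remain. Keeping each stage behind a simple callable is one way to get the pick-and-choose flexibility of a modular design.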

SmilingRobo

SmilingRobo, a platform for open-source robotics

An open-source robotics platform with open-source tools and resources. We are on a journey to advance and democratize robotics through open source.