Automated Dataset Generation with myCobot 280 PI

The myCobot 280 PI robotic arm automates dataset creation by mimicking human hand movements. Cameras capture images of labels moving on motorized rewinders, which simulates real-world motion. This setup reduces manual effort while ensuring consistency for object detection tasks. The companion iPhone app (demo below) runs on-device vision in the field: not only 3D QR authentication, but also live detection of screens and displays (laptops and monitors), where glare, moiré, and weak box-like boundaries made a standard detector a poor fit.

Automated Dataset Generation, Object Detection, and Mobile Deployment

A demo video showcasing the app's functionality and results.

Overview

As a Machine Learning Engineer at Zortag, I automated dataset generation, optimized object detection models, and shipped an iPhone app for real-time inference. Zortag’s 3D QR stack needs high-accuracy authentication; the same product flow also needed the phone to know when a laptop or monitor screen was in view. That second problem sounds simple, but in practice screens are not “nice” detection targets: they cover large regions and have uneven texture, glare, and moiré, so the YOLO-style bounding-box model we tried first kept missing screens or producing jittery detections on real hardware.

I addressed that with a CLIP ViT-L/14@336 image encoder feeding a small MLP for a stable yes/no screen decision. Frames were streamed over WebSockets and scored with multi-crop inference at test time, followed by EMA smoothing and a short hysteresis window so the UI did not flicker. End to end, the pipeline responded in roughly under a second on live video. The robotics and QR pieces are what most people see first; the screen pathway was the parallel engineering thread that made the whole authentication story usable outside a lab.
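The smoothing step is easier to follow with a concrete example. The following is a minimal sketch, not the production code (which is proprietary): it assumes the classifier emits one screen probability per frame, and the class name, thresholds, EMA weight, and hold length are all illustrative values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreenDecisionSmoother:
    """Smooth per-frame screen probabilities and debounce the yes/no output.

    All default values are assumptions for illustration only.
    """

    alpha: float = 0.3          # EMA weight given to the newest frame
    on_threshold: float = 0.6   # smoothed score needed to switch to "screen"
    off_threshold: float = 0.4  # smoothed score needed to switch to "no screen"
    hold_frames: int = 3        # consecutive agreeing frames required to flip
    ema: Optional[float] = None
    state: bool = False
    _streak: int = 0

    def update(self, prob: float) -> bool:
        # Exponential moving average; the first frame initializes it.
        if self.ema is None:
            self.ema = prob
        else:
            self.ema = self.alpha * prob + (1.0 - self.alpha) * self.ema

        # Hysteresis: the threshold to leave a state differs from the one to enter it.
        if self.state:
            wants_flip = self.ema <= self.off_threshold
        else:
            wants_flip = self.ema >= self.on_threshold

        # Only flip after several consecutive frames agree.
        self._streak = self._streak + 1 if wants_flip else 0
        if self._streak >= self.hold_frames:
            self.state = not self.state
            self._streak = 0
        return self.state
```

With two separate thresholds, a single glare-corrupted frame moves the average but does not flip the decision; only a sustained change does.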

Key Components:

Robotic arm automation for scalable data collection
YOLOv8 fine-tuning (99.84% accuracy) for QR / label tasks
Synthetic data generation
iPhone app: CoreML + SwiftUI, QR inference plus the CLIP + MLP screen head above (video demo at the beginning)

Note: Code and datasets are proprietary.

Key Contributions

1. Robotic Automation & Dataset Generation

Programmed myCobot 280 PI to capture multi-angle images of 3D QR codes via human-like motions.
Motorized label rewinders simulated real-world movement, reducing manual effort by 90%.
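As a rough illustration of what "human-like" motion between capture poses could look like, here is a hedged sketch that eases between two joint-angle poses and adds a small random wobble. The function names, jitter size, and example angles are hypothetical; the sketch only generates waypoints and does not talk to the arm. Sending each waypoint to hardware (for example with pymycobot's send_angles) is assumed to happen elsewhere.

```python
import random


def smoothstep(t):
    """Ease-in/ease-out curve on [0, 1]: slow start, slow stop."""
    return t * t * (3.0 - 2.0 * t)


def humanlike_waypoints(start, end, steps=10, jitter_deg=0.5, seed=None):
    """Interpolate joint angles (degrees) from start to end.

    Intermediate poses get a small random offset to imitate hand tremor;
    the first and last poses are exact. All parameters are illustrative.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(start) != len(end):
        raise ValueError("start and end must have the same number of joints")

    rng = random.Random(seed)
    waypoints = []
    for i in range(steps):
        t = smoothstep(i / (steps - 1))
        is_endpoint = i in (0, steps - 1)
        pose = []
        for a, b in zip(start, end):
            angle = a + (b - a) * t
            if not is_endpoint:
                angle += rng.uniform(-jitter_deg, jitter_deg)
            pose.append(round(angle, 2))
        waypoints.append(pose)
    return waypoints


# Example: six joints, moving from a home pose to a hypothetical viewing pose.
path = humanlike_waypoints([0, 0, 0, 0, 0, 0], [30, -20, 10, 0, 45, 0], steps=5, seed=0)
```

Capturing one image per waypoint would yield a sequence of views with slightly irregular spacing, closer to a handheld sweep than to a perfectly uniform scan.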

2. Model Optimization

Fine-tuned YOLOv8 on hybrid real/synthetic data to detect fake 3D QR codes (anomaly detection), achieving 99.84% detection accuracy.
Automated labeling pipelines eliminated manual annotation errors.
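To make the labeling step concrete, the sketch below writes one annotation line in the text format that YOLO-family trainers read: a class index followed by the box center, width, and height, each normalized to the image size. It assumes some upstream step already knows the box in pixel coordinates; the function name and the example numbers are hypothetical.

```python
def to_yolo_label(class_id, box_xyxy, image_w, image_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line.

    Output format: "<class> <x_center> <y_center> <width> <height>",
    with all four box values normalized to the range [0, 1].
    """
    x_min, y_min, x_max, y_max = box_xyxy
    if not (0 <= x_min < x_max <= image_w and 0 <= y_min < y_max <= image_h):
        raise ValueError("box must lie inside the image and have positive area")

    x_center = (x_min + x_max) / 2.0 / image_w
    y_center = (y_min + y_max) / 2.0 / image_h
    width = (x_max - x_min) / image_w
    height = (y_max - y_min) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: a 200x200 px box in a 640x480 image, class 0.
line = to_yolo_label(0, (100, 200, 300, 400), 640, 480)
# line == "0 0.312500 0.625000 0.312500 0.416667"
```

Generating these lines programmatically, with the bounds check in place, is one way to rule out the off-by-a-few-pixels and wrong-class mistakes that hand annotation tends to introduce.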

3. iPhone App for Inference

Developed a native iOS app to run optimized YOLOv8 (CoreML) for 3D QR work on-device.
Added live screen / display detection for field use: CLIP + MLP with streaming, multi-crop evaluation, and temporal smoothing so the app could trust the signal in uneven lighting.
Offline-capable where the models allow it; SwiftUI UI aimed at operators in the field.
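Multi-crop evaluation can be sketched as follows. This is an assumed layout (four corners plus center, each crop sized to the encoder's 336 px input) with a simple average over crops; the actual crop scheme and aggregation used in the app are not published. The score_crop callable is a hypothetical stand-in for the CLIP + MLP classifier.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def five_crop_boxes(width: int, height: int, crop: int) -> List[Box]:
    """Return four corner crops and one center crop, all crop x crop pixels."""
    if crop > min(width, height):
        raise ValueError("crop size must fit inside the frame")
    right, bottom = width - crop, height - crop
    origins = [
        (0, 0),                      # top-left
        (right, 0),                  # top-right
        (0, bottom),                 # bottom-left
        (right, bottom),             # bottom-right
        (right // 2, bottom // 2),   # center
    ]
    return [(x, y, x + crop, y + crop) for x, y in origins]


def multi_crop_probability(boxes: List[Box], score_crop: Callable[[Box], float]) -> float:
    """Average the per-crop screen probabilities into one frame-level score."""
    probs = [score_crop(box) for box in boxes]
    return sum(probs) / len(probs)


# Example with a 1280x720 frame and a dummy scorer that only fires on the left edge.
boxes = five_crop_boxes(1280, 720, 336)
score = multi_crop_probability(boxes, lambda b: 1.0 if b[0] == 0 else 0.0)
```

Averaging over several views means a glare patch that ruins one crop only shifts the frame score by a fraction, which is the property that matters in uneven lighting.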

Methodology

Robotics: Python/ROS-controlled arm trajectories with OpenCV-based image capture.
ML Pipeline:
- Synthetic data augmentation (varied lighting/backgrounds).
- YOLOv8 fine-tuning via PyTorch and CoreML conversion for iOS.
Mobile Deployment:
- Optimized model for edge performance (CoreML).
- SwiftUI frontend with camera integration.
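As one small, hedged example of the lighting side of synthetic augmentation, the sketch below applies a random gain and bias to 8-bit grayscale pixel values. It uses plain Python lists to stay dependency-free; a real pipeline would more likely operate on image arrays. The function names and the sampling ranges are assumptions for illustration.

```python
import random


def jitter_lighting(pixels, gain, bias):
    """Apply out = clamp(gain * p + bias) to a 2D list of 8-bit pixel values."""
    return [
        [min(255, max(0, round(gain * p + bias))) for p in row]
        for row in pixels
    ]


def random_lighting_variants(pixels, n, seed=None):
    """Produce n re-lit copies of an image with randomly sampled gain and bias."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        gain = rng.uniform(0.6, 1.4)   # contrast-like scaling (assumed range)
        bias = rng.uniform(-30, 30)    # brightness offset (assumed range)
        variants.append(jitter_lighting(pixels, gain, bias))
    return variants


# Example: one tiny 1x3 "image" brightened and clipped at 255.
brighter = jitter_lighting([[0, 100, 200]], gain=1.5, bias=10)
# brighter == [[10, 160, 255]]
```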

Results & Impact

Efficiency: 90% faster data generation vs. manual methods.
Accuracy: 99.84% detection under real-world conditions.
Deployment: iPhone app enables field authentication without cloud dependency.

Future Work

Expand the training dataset to cover more real-world cases.
Port to Android for broader adoption.