MediaPipe for Raspberry Pi and iOS — Google for Builders Weblog

August 20, 2023

1

Posted by Paul Ruiz, Developer Relations Engineer

Again in Might we launched MediaPipe Options, a set of instruments for no-code and low-code options to widespread on-device machine studying duties, for Android, internet, and Python. As we speak we’re joyful to announce that the preliminary model of the iOS SDK, plus an replace for the Python SDK to assist the Raspberry Pi, can be found. These embrace assist for audio classification, face landmark detection, and numerous pure language processing duties. Let’s check out how you should use these instruments for the brand new platforms.

Object Detection for Raspberry Pi

Other than establishing your Raspberry Pi {hardware} with a digital camera, you can begin by putting in the MediaPipe dependency, together with OpenCV and NumPy when you don’t have them already.

python -m pip set up mediapipe

From there you possibly can create a brand new Python file and add your imports to the highest.

import mediapipe as mp

from mediapipe.duties import python

from mediapipe.duties.python import imaginative and prescient

import cv2

import numpy as np

Additionally, you will need to ensure you have an object detection mannequin saved regionally in your Raspberry Pi. In your comfort, we’ve offered a default mannequin, EfficientDet-Lite0, that you may retrieve with the next command.

wget -q -O efficientdet.tflite -q https://storage.googleapis.com/mediapipe-models/object_detector/efficientdet_lite0/int8/1/efficientdet_lite0.tflite

Upon getting your mannequin downloaded, you can begin creating your new ObjectDetector, together with some customizations, just like the max outcomes that you just need to obtain, or the boldness threshold that have to be exceeded earlier than a consequence could be returned.



base_options = python.BaseOptions(model_asset_path=mannequin)

choices = imaginative and prescient.ObjectDetectorOptions( base_options=base_options,

                                   running_mode=imaginative and prescient.RunningMode.LIVE_STREAM,

                                   max_results=max_results,                                                       score_threshold=score_threshold,

                                   result_callback=save_result)

detector = imaginative and prescient.ObjectDetector.create_from_options(choices)

After creating the ObjectDetector, you will have to open the Raspberry Pi digital camera to learn the continual frames. There are a number of preprocessing steps that will probably be omitted right here, however can be found in our pattern on GitHub.

Inside that loop you possibly can convert the processed digital camera picture into a brand new MediaPipe.Picture, then run detection on that new MediaPipe.Picture earlier than displaying the outcomes which are obtained in an related listener.

mp_image = mp.Picture(image_format=mp.ImageFormat.SRGB, information=rgb_image)

detector.detect_async(mp_image, time.time_ns())

When you draw out these outcomes and detected bounding bins, it’s best to be capable to see one thing like this:

Moving image of a person holidng up a cup and a phone, and detected bounded boxes identifying these items in real time

You’ll find the whole Raspberry Pi instance proven above on GitHub, or see the official documentation right here.

Textual content Classification on iOS

Whereas textual content classification is among the extra direct examples, the core concepts will nonetheless apply to the remainder of the out there iOS Duties. Much like the Raspberry Pi, you’ll begin by creating a brand new MediaPipe Duties object, which on this case is a TextClassifier.

var textClassifier: TextClassifier?

textClassifier = TextClassifier(modelPath: mannequin.modelPath)

Now that you’ve your TextClassifier, you simply must move a String to it to get a TextClassifierResult.

func classify(textual content: String) -> TextClassifierResult? {

    guard let textClassifier = textClassifier else {

        return nil

    }

return attempt? textClassifier.classify(textual content: textual content) }

You are able to do this from elsewhere in your app, reminiscent of a ViewController DispatchQueue, earlier than displaying the outcomes.

let consequence = self?.textClassifier.classify(textual content: inputText)

let classes = consequence?.classificationResult.classifications.first?.classes?? []

You’ll find the remainder of the code for this challenge on GitHub, in addition to see the total documentation on builders.google.com/mediapipe.