Gemini 1.5 Professional Now Accessible in 180+ Nations; With Native Audio Understanding, System Directions, JSON Mode and Extra

April 9, 2024

1

Posted by Jaclyn Konzelmann and Megan Li – Google Labs

Seize an API key in Google AI Studio, and get began with the Gemini API Cookbook

Lower than two months in the past, we made our next-generation Gemini 1.5 Professional mannequin accessible in Google AI Studio for builders to check out. We’ve been amazed by what the group has been in a position to debug, create and be taught utilizing our groundbreaking 1 million context window.

Immediately, we’re making Gemini 1.5 Professional accessible in 180+ international locations through the Gemini API in public preview, with a first-ever native audio (speech) understanding functionality and a brand new File API to make it straightforward to deal with information. We’re additionally launching new options like system directions and JSON mode to offer builders extra management over the mannequin’s output. Lastly, we’re releasing our subsequent technology textual content embedding mannequin that outperforms comparable fashions. Go to Google AI Studio to create or entry your API key, and begin constructing.

Unlock new use circumstances with audio and video modalities

We’re increasing the enter modalities for Gemini 1.5 Professional to incorporate audio (speech) understanding in each the Gemini API and Google AI Studio. Moreover, Gemini 1.5 Professional is now in a position to purpose throughout each picture (frames) and audio (speech) for movies uploaded in Google AI Studio, and we stay up for including API help for this quickly.

screen grab of a clooege professor using Gemini 1.5 Pro to create a quiz based on their latest lecture video in Google AI Studio

You’ll be able to add a recording of a lecture, like this 117,000+ token lecture from Jeff Dean, and Gemini 1.5 Professional can flip it right into a quiz with a solution key. [Video sped up for demo purposes]

Gemini API Enhancements

Immediately, we’re addressing a lot of high developer requests:

1. System directions: Information the mannequin’s responses with system directions, now accessible in Google AI Studio and the Gemini API. Outline roles, codecs, objectives, and guidelines to steer the mannequin’s conduct on your particular use case.

2. JSON mode: Instruct the mannequin to solely output JSON objects. This mode allows structured knowledge extraction from textual content or photographs. You will get began with cURL, and Python SDK help is coming quickly.

3. Enhancements to perform calling: Now you can choose modes to restrict the mannequin’s outputs, enhancing reliability. Select textual content, perform name, or simply the perform itself.

A brand new embedding mannequin with improved efficiency

Beginning in the present day, builders will be capable to entry our subsequent technology textual content embedding mannequin through the Gemini API. The brand new mannequin, text-embedding-004, (text-embedding-preview-0409 in Vertex AI), achieves a stronger retrieval efficiency and outperforms current fashions with comparable dimensions, on the MTEB benchmarks.

table showing Gecko: Versativel Text Embeddings Distilled from Large Language Models

‘Textual content-embedding-004’ (aka Gecko) utilizing 256 dims output outperforms all bigger 768 dim output fashions on MTEB benchmarks

These are simply the primary of many enhancements coming to the Gemini API and Google AI Studio within the subsequent few weeks. We’re persevering with to work on making Google AI Studio and the Gemini API the simplest approach to construct with Gemini. Get began in the present day in Google AI Studio with Gemini 1.5 Professional, discover code examples and quickstarts in our new Gemini API Cookbook, and be a part of our group channel on Discord.

Supply hyperlink

Previous articleRogers and Shaw Prolong 5G Community in Western Canada

Next articlereact native – Geofence on ios when IOS app is absolutely terminated

Gemini 1.5 Professional Now Accessible in 180+ Nations; With Native Audio Understanding, System Directions, JSON Mode and Extra

Unlock new use circumstances with audio and video modalities

Gemini API Enhancements

A brand new embedding mannequin with improved efficiency

Google declares two new variants of Gemma: CodeGemma and RecurrentGemma

Meet the inaugural cohort of our Google for Startups Accelerator: AI First North America

Synopsys hopes to mitigate upstream dangers in software program provide chains with new SCA instrument

LEAVE A REPLY Cancel reply

Most Popular

Google declares two new variants of Gemma: CodeGemma and RecurrentGemma

Airtel Pay as you go Plans with Amazon Prime Video and Disney+ Hotstar

Foundational Instruments in Android | Kodeco

Taking AI to the subsequent degree in manufacturing

Recent Comments

ABOUT US

POPULAR POSTS

Google declares two new variants of Gemma: CodeGemma and RecurrentGemma

Airtel Pay as you go Plans with Amazon Prime Video and Disney+ Hotstar

Foundational Instruments in Android | Kodeco

POPULAR CATEGORY