Google Gemini 1.5, first impressions of a huge 1M token window

You can now get access to Google’s Gemini 1.5 language model with an insane 1 million token context window.

gemini 1 5 million token conteext window

What does 1 million token context window mean? Well, you could feed the AI hundreds of thousands of words to pre-train it on your data. There are tons of use cases.

First, how to get access to Gemini 1.5?

  1. Go to Google DeepMind page here https://deepmind.google/technologies/gemini/#gemini-1.5
  2. Click “Gemini 1.5” or scroll to where it says “Introducing Gemini 1.5”
  3. Click “Try Gemini 1.5” and sign up with your Gmail account.
  4. Done! You should get “Get Started” on your screen that takes you directly to Google Gemini 1.5 chatbot.

Optional steps

  1. Connect your Google Drive to save conversations.
  2. You may need VPN depending on what country you are in. Opera browser has a free VPN feature.

First experiences with Gemini 1.5

First thing I tested was the context window size. I have a transcript of my 11-hour AI course that is 160 thousand words and about 220 thousand tokens. More than Claude 3 or ChatGPT can handle.

google gemini 1 5

I fed it to Gemini 1.5 and asked to make a schedule based on the transcript. It correctly detected that it’s meant to be a 3-day seminar and broke it up into days and subsections. I also asked AI to use my style it detected from the transcript.

The results were great, spot on schedule covering the content of the course and pointing out the value to the person taking the course. Then I asked it to write Linkedin posts based on the transcript and schedule. One covering the whole course and 3 more for each day. As it knew my style, then the resulting posts were not over the top marketing hype but really useful posts I could virtually copy and paste without editing.

But then it failed. I have a large block of notes I have made over the last 7 years. I tried to feed it to Gemini 1.5 in the prompt. The text is 1.6 million characters or 466 thousand tokens. I got a “No content” error in the prompt. So, I saved everything as a plain txt file, uploaded it to Drive to avoid any connectivity issues, and tried again. Again, “No content” error.

google gemini 1 5 context window error

This means that for me the token limit for Gemini 1.5 is somewhere between 220 and 466 thousand tokens. This massive context window could be valuable when having long conversations, for example, creating a marketing persona with AI.

Save conversations to Drive

When you connect Gemini 1.5 to Google Drive then it saves all the conversations into the drive as files. You can also give it access to specific files that you want it to use as a knowledge base in your conversations.

google gemini 1.5 drive

Load data from Drive

As I connected Gemini to Drive then I could load all needed information from the Drive making sure I don’t have any bandwidth problems uploading large pieces of data. I used it for text files and videos.

Video prompts

Next, I wanted to experiment with Gemini’s video capabilities. I fed it a 12-second clip of my wife on a ski slope and asked it to describe the facts in the video. Gemini 1.5 got almost everything right.

google gemini 1.5 video

I also wanted to know if it understands the sequence of events happening in the video. I asked it to tell me when the skier is closest to the camera. I got the right answer on the first try.

Other languages

Gemini 1.5 is pretty good with languages other than English. For all the experiments above I use my native Estonian and English. It’s a small language. So, if it works in Estonian it will work with languages that have tens of millions of speakers.

Gemini 1.5 API keys

You can create a new project if you don’t have one already or add API keys to an existing project. All projects are subject to the Google Cloud Platform terms and conditions.

Different prompt options

Freeform prompts – These prompts offer an open-ended prompting experience for generating content and responses to instructions. You can use both images and text data for your prompts.

Structured prompts – This prompting technique lets you guide model output by providing a set of example requests and replies. Use this approach when you need more control over the structure of model output.

Chat prompts – Use chat prompts to build conversational experiences. This prompting technique allows for multiple input and response turns to generate output.

Is OpenAI dropping the ball?

Gemini 1.5 answers are on par with ChatGPT paid version. I haven’t tested it extensively. But everything I got out of it in the first couple of hours was OK. But considering the context window there’s no competition, ChatGPT just doesn’t have it. Video is also something that ChatGPT can’t work with.

Up till now Google PR has been “We are almost as good as ChatGPT!!!” Maybe they are catching up? Maybe GOOG stock is becoming the best AI investment. And this is Gemini 1.5 Pro the Ultra version should be even better. In a month or so OpenAI will probably release GPT4.5, but right now Claude 3 and Gemini are getting more attention even if not market share.