Chicago | September 18, 2023 – Google’s answer to ChatGPT has arrived in the form of Google Gemini, a family of large language models (LLMs) that pair GPT-4-class language capabilities with techniques inspired by AlphaGo. The move is aimed squarely at challenging ChatGPT’s dominance in generative AI. With versatile capabilities and the potential to tap Google’s extensive proprietary training data from its many services, Gemini is poised to shake up the generative AI landscape. It also underscores Google’s commitment to innovation and competition in a generative AI market expected to reach $1.3 trillion by 2032.
The arrival of ChatGPT in November 2022 sent shockwaves through Google, prompting the company to take decisive action to close the generative AI gap. That effort led to the development of Google Bard and, most notably, Gemini.
What Is Google Gemini?
Gemini pairs GPT-4-class language modeling with AlphaGo-inspired training methodologies, including reinforcement learning and tree search. These innovations could unseat ChatGPT as the preeminent generative AI solution on a global scale.
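To make the AlphaGo reference concrete: the "tree search" technique in question is Monte Carlo tree search (MCTS), which AlphaGo combined with learned policies. The sketch below runs MCTS on a toy counting game (players alternately add 1 or 2; whoever reaches 10 wins). It is purely illustrative — Gemini's actual training methods are not public, and every name here is invented for the example.

```python
# Toy Monte Carlo tree search (MCTS), the search technique behind AlphaGo.
# Illustrative sketch only; nothing here reflects Gemini's real pipeline.
import math
import random

TARGET = 10  # players alternately add 1 or 2; whoever reaches 10 first wins


class Node:
    def __init__(self, total, parent=None):
        self.total = total      # running count at this state
        self.parent = parent
        self.children = {}      # move (1 or 2) -> child Node
        self.visits = 0
        self.wins = 0.0         # reward accumulated for the player who moved INTO this node


def ucb(child, parent_visits, c=1.4):
    """Upper-confidence bound: balances exploiting good moves with exploring rare ones."""
    if child.visits == 0:
        return float("inf")
    return child.wins / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(total):
    """Random playout; returns +1 if the player to move from `total` wins."""
    sign = 1
    while True:
        total += random.choice([1, 2])
        if total >= TARGET:
            return sign          # the player who just moved wins
        sign = -sign             # turn passes to the opponent


def mcts(root_total, iterations=300):
    root = Node(root_total)
    for _ in range(iterations):
        # 1. Selection: descend via UCB while fully expanded and non-terminal.
        node = root
        while node.total < TARGET and len(node.children) == 2:
            node = max(node.children.values(), key=lambda ch: ucb(ch, node.visits))
        # 2. Expansion: add one untried move.
        if node.total < TARGET:
            move = random.choice([m for m in (1, 2) if m not in node.children])
            node.children[move] = Node(node.total + move, parent=node)
            node = node.children[move]
        # 3. Simulation: estimate value for the player to move at `node`.
        value = -1 if node.total >= TARGET else rollout(node.total)
        # 4. Backpropagation: flip the sign at each level (negamax convention).
        while node is not None:
            node.visits += 1
            node.wins += -value   # reward for the player who moved into `node`
            value = -value
            node = node.parent
    # Recommend the most-visited move from the root.
    return max(root.children, key=lambda m: root.children[m].visits)
```

From a running total of 8, the winning move is to add 2 and reach 10 directly, and the search converges on it. AlphaGo's insight, in miniature, is that statistics gathered from many such simulated playouts steer the search toward stronger moves without an exhaustive game-tree analysis.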
This development follows Google’s consolidation of its Brain and DeepMind AI research laboratories into the newly formed research team Google DeepMind, along with the introduction of Bard and the next-generation PaLM 2 LLM. Clearly, Google is investing heavily in AI to maintain its leadership position in this rapidly evolving field.
All We Know About Gemini So Far
While Google Gemini is expected to launch in the autumn of 2023, detailed insight into its capabilities remains limited. In a May blog post, Sundar Pichai, CEO of Google and Alphabet, gave a high-level overview of the LLM: Gemini was designed to be multimodal, to integrate well with a wide range of tools and APIs, and to support forthcoming innovations such as memory and planning. Pichai also hinted at impressive multimodal capabilities.
Nevertheless, official information about the release has been scarce. Demis Hassabis, CEO of Google DeepMind, said in an interview that Gemini would blend AlphaGo-like strengths with the language capabilities of large models. Android Police, citing an anonymous source close to the project, reported that Gemini will be able to generate both text and contextual images, and that its training data includes sources such as YouTube video transcripts.
Can Gemini Outshine ChatGPT?
A pivotal question surrounding Gemini’s release is whether it can surpass ChatGPT, which reached more than 100 million monthly active users this year.
At first glance, Gemini’s capacity to generate text and images endows it with a substantial advantage over GPT-4 in terms of content diversity. However, the true differentiator might lie in Google’s expansive repository of proprietary training data. Gemini can process data from various services, encompassing Google Search, YouTube, Google Books, and Google Scholar. Harnessing this proprietary data in Gemini’s training regimen could bestow upon it a distinct edge in generating sophisticated insights and inferences, particularly if reports of it being trained on twice as many tokens as GPT-4 prove accurate.
Furthermore, the merger of the Google DeepMind and Brain teams is a formidable factor in itself. It pits OpenAI against a deep bench of world-class AI researchers, including Google co-founder Sergey Brin and veteran machine learning researcher Paul Barham. This seasoned team has a profound understanding of applying techniques such as reinforcement learning and tree search to build AI systems that continually improve their own problem-solving, as AlphaGo demonstrated by defeating a Go world champion in 2016.
Gemini’s multimodal capabilities, reinforcement learning pedigree, text and image generation prowess, and access to Google’s proprietary data collectively position it to potentially outperform GPT-4. The pivotal role of training data suggests that the victor in the LLM arms race will be whoever trains their models on the largest and richest dataset.