The Quest to Understand Human Brains Took Him from Philosophy to AI

Ed Grefenstette spoke to AIM about his shift from big tech to startup.
Cohere’s Command Beta model took the top spot on Stanford’s HELM (Holistic Evaluation of Language Models) leaderboard earlier this month. The startup’s generative model, which is conditioned to respond well to single-statement commands, stood out among 36 LLMs, including Meta’s Galactica, OpenAI’s Davinci, Google’s Flan, Bloom and others.

Despite the accolade, Ed Grefenstette, the Head of Machine Learning at Cohere, remained admirably modest, describing the achievement as a “nice marketing moment”.

“Leaderboards are always things that you should take with a grain of salt. We do not want to be complacent and imagine that just because we topped this leaderboard, we will be better than other models close behind us,” the NLP pundit said. He is more excited about the week-on-week progress of their models than the leading position.

Additionally, he noted that OpenAI’s latest model, GPT-4, undeniably an extremely strong one, had not been benchmarked at the time. He is under no illusion, and expects the result to be different in the next run if Stanford benchmarks the current models against GPT-4.

The Philosopher’s Toolkit

“My interest in artificial intelligence was piqued through studying philosophy of mind during my undergraduate studies, and reading science fiction and cyberpunk novels in my teens. I decided to pursue computer science due to the lack of job opportunities in philosophy,” said the AI stalwart.

As a budding philosopher, Grefenstette was fascinated by the question of what makes humans and other species intelligent. He wanted to understand how we use intelligence to reason about not just the physical world but also about concepts and metaphysics. However, he soon realised that this was a difficult task and began to think about how we could use artificial intelligence to aid in reasoning.

While completing his doctoral work in natural language processing at Oxford, Grefenstette discovered that the approach he and his colleagues were pioneering was similar to rudimentary neural networks. In 2014, they formed ‘Dark Blue Labs’ to commercialise their ideas, but the company was acquired by Google within a few months.

Six months prior to the acquisition, the tech giant had also purchased the British AI company ‘DeepMind’, so Grefenstette merged his team into DeepMind and helped establish the NLP group, as well as a program synthesis and understanding group.

Quenching the Startup Spirit

During his time at DeepMind, the NLP expert witnessed the exponential growth of the company firsthand, from 80 employees to over 1,000 in only four years. This growth, he notes, can make it difficult for individual voices to be heard and can change the culture and dynamics of a workplace.

Despite considering the idea of launching a startup, he ultimately decided to join Facebook AI Research in London, where he helped build a new research lab. “That seemed like a nice compromise between the complete control of entrepreneurship and the idea to start something small and grow it into something big,” he said.

However, after three years, he was once again drawn to “the earlier stages of an organisation or smaller groups because that’s usually where there’s the most sort of potential to push the shape, the direction of things”. Cohere, with its focus on conversational AI and potential for growth, proved to be the right fit for Grefenstette’s entrepreneurial spirit.

OpenAI Steals The Thunder

Grefenstette lauded OpenAI’s fantastic progress. However, he also said, “With no disrespect intended to OpenAI, who have set a new technical ceiling, a common trope in ML is that no one is more than a few weeks behind anyone else (when methods are published), or perhaps a few months behind (when they are not). A lot of stakeholders in the field are now looking to close the gap created by OpenAI’s head-start”.

Meanwhile, he says he’s interested in thinking about the next ChatGPT moment: something that is disruptive, delightful, and helpful. “If I give you the specifics, we will give our competitors an edge at this point,” he said with a chuckle.

We, as humans, use language not only to communicate but also to plan, transact, negotiate and perform several activities that characterise intelligence. It’s how we explain concepts and reason with ourselves. Giving computers even fragments of these abilities would make it possible to integrate them into a significantly wider range of activities, Grefenstette believes.

This feat was becoming more feasible. However, it wasn’t immediately clear to Cohere how to make this technology connect to the broader market. “Fortunately, OpenAI did that for the sector with ChatGPT,” he said. “The important thing was connecting to the non-technical users and that was supremely beneficial for OpenAI. Now, we have South Park episodes about ChatGPT. We wouldn’t have guessed that a year ago but it was a fantastic moment for both them and their competitors,” Grefenstette added.

Ethical Consequences

Engineering and science are not domains where ethics is orthogonal. Considering the ethical and societal implications of technology is not a dialogue that should be happening in silos, he believes. “These issues are also a matter of education for the broader population, and for regulation from the government. It cannot be left entirely in the hands of the technologists to resolve these issues,” Grefenstette added.

Pinpointing the concern of language models being used for medical applications, he said, “People should be educated that the risk of hallucination is an intrinsic aspect. These models are trained not to tell the truth, but rather to say something that’s plausible given the data it was trained on. However, plausibility is different from truth, although sometimes these are one and the same. This occasional overlap has created a vulnerability in users, who come to expect truth as a result, but understanding that that is not a guarantee, that’s fundamental,” he added. “Healthy scepticism needs to be ingrained within the broader population”.

“I qualify myself as a sceptic,” said Grefenstette, when asked about his views on the race among AI companies to achieve human-level intelligence, commonly known as artificial general intelligence (AGI). “Humans are very general, but not completely. We’re tailored to what has helped us survive under the constraints of the environment. So, we’re not the ceiling of what is possible in physics and biology,” he explained.

Tasmia Ansari
Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.
