What BloombergGPT Brings to the Finance Table

Bloomberg's latest LLM, trained on a roughly 700-billion-token dataset, is an 'ingredient' model said to boost the Bloomberg Terminal service

Last week, Bloomberg released a research paper on its large language model, BloombergGPT. With over 50 billion parameters, the LLM is pitched as a first-of-its-kind generative AI model catering to the finance industry. While the move may set a precedent for other companies, for now, the announcement sounds like a push for the data and news company to seem relevant in the AI space.

Interestingly, Bloomberg already has the Bloomberg Terminal, which employs NLP- and ML-trained models to offer financial data. So, naturally, the question that arises is: how much of a value-add is BloombergGPT, and where does it stand in comparison to other GPT models?

Training and Parameters

Bloomberg’s vast repository of financial data, accumulated over the past forty years, has been used to train the GPT model. The model was trained on a 363-billion-token proprietary dataset of Bloomberg financial documents, supplemented by 345 billion tokens of public data, for a total of roughly 700 billion training tokens.
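The arithmetic behind that mix is easy to verify; the short sketch below, using the figures reported above, computes the total and the share of proprietary financial text:

```python
# Back-of-the-envelope check of the reported training mix.
finpile_tokens = 363e9  # proprietary Bloomberg financial documents (FinPile)
public_tokens = 345e9   # public datasets

total = finpile_tokens + public_tokens
print(f"total tokens: {total / 1e9:.0f}B")               # 708B
print(f"financial share: {finpile_tokens / total:.1%}")  # 51.3%
```

In other words, just over half of the training data is Bloomberg's own financial text.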


The company claims that the new model will help improve its existing NLP tasks, such as sentiment analysis (a method that helps predict market prices), news classification, headline generation, question answering, and other query-related tasks.
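To illustrate how such a model might be applied to sentiment analysis: BloombergGPT itself is not publicly accessible, so in the sketch below `query_llm` is a hypothetical stand-in (a trivial keyword stub) for a real model call, and the prompt shape is likewise an assumption.

```python
# Illustrative only: query_llm is a keyword stub standing in for a real LLM.
def query_llm(prompt: str) -> str:
    text = prompt.lower()
    if any(w in text for w in ("beat", "surge", "record", "upgrade")):
        return "positive"
    if any(w in text for w in ("miss", "fell", "downgrade", "lawsuit")):
        return "negative"
    return "neutral"

def headline_sentiment(headline: str) -> str:
    # Frame the task as a text-completion prompt, as a decoder-only LLM sees it.
    prompt = (
        "Classify the sentiment of this financial headline as "
        f"bullish, bearish, or flat:\n{headline}\nSentiment:"
    )
    return query_llm(prompt)

print(headline_sentiment("ACME beats quarterly earnings estimates"))  # positive
```

In a production setting the stub would be replaced by an actual model call, but the overall shape — a task instruction plus the headline, with the model completing the answer — is how such classification is typically framed.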

On the face of it, the new LLM appears impressive, but it is still quite limited in its approach: it is not multilingual, it carries risks of bias and toxicity, and it is a closed model.

BloombergGPT, the 50-billion-parameter ‘decoder-only causal language model’, is not trained on multilingual data. Its training dataset, called FinPile, includes news, filings, press releases, web-scraped financial documents, and social media drawn from the Bloomberg archives, all in English. For instance, to train the model on data from press conferences, English-language transcripts produced through speech recognition were used. The absence of other languages limits the training data.
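For readers unfamiliar with the term, ‘decoder-only causal’ means each token is predicted only from the tokens before it. The mask that enforces this inside the attention layers is just a lower-triangular matrix; a minimal sketch:

```python
def causal_mask(seq_len: int) -> list:
    # Row i lists which positions token i may attend to:
    # 1 for positions 0..i (the past and itself), 0 for future positions.
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]

for row in causal_mask(4):
    print(row)
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 0]
# [1, 1, 1, 1]
```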

BLOOM, which shares its model architecture and software stack with BloombergGPT (though BLOOM is much larger, at 176 billion parameters), is multilingual. The same goes for GPT-3, a 175-billion-parameter model that also handles multiple languages.

Biases and Toxicity

Bloomberg has mentioned that the possibility of the “generation of harmful language remains an open question”. LLMs are known for their biases and hallucinations, a problem that widely used models such as ChatGPT are also combating. LLM bias can be highly detrimental in finance, where accurate, factual information underpins reliable predictions of market sentiment. However, BloombergGPT does not address this concern completely. The company is still evaluating the model and believes that “existing test procedures, risk and compliance controls” will help reduce the problem. Bloomberg also argues that its FinPile dataset contains fewer biases and less toxic language, which should help curb the generation of inappropriate content.

Closed Model

BloombergGPT is a closed model. Apart from the parameter count and general information, details such as the model’s weights are not disclosed in the research paper. Given that the model is built on decades of Bloomberg data, combined with the sensitive nature of that information, it is unlikely the LLM will be open-sourced. Besides, the model is set to target Bloomberg Terminal users, who already pay a subscription for the service. However, the company does plan to release the model’s training logs.

In a conversation with AIM, Anju Kambadur, head of AI Engineering at Bloomberg, said: “BloombergGPT is about empowering and augmenting human professionals in finance with new capabilities to deal with numerical and computational concepts in a more accessible way.” Bloomberg has been using AI, machine learning and NLP for more than a decade, but each application required a custom model. “With BloombergGPT, we will be able to develop new applications quicker and faster, some of which have been thought about for years and not developed yet,” he said.

“Conversational English can be used to post queries using Bloomberg Query Language (BQL) to pinpoint data, which can then be imported into data science and portfolio management tools.”

Kambadur clarified that BloombergGPT is not a chatbot. “It is an ingredient model that we are using internally for product development and feature enhancement.” The model will help power AI-enabled features of the Bloomberg Terminal as well as back-end workflows within the company’s data operations. Clients may not engage with the model directly but will use it through Terminal functions in the future.


Below is a comparison with two other models, GPT-NeoX (20 billion parameters) and FLAN-T5-XXL (11 billion parameters). BloombergGPT, trained on more up-to-date information, is able to answer the questions accurately when compared to other similarly trained LLMs.

Source: arxiv.org

BloombergGPT fared better on financial tasks than other open models of a similar size, and was also evaluated on ‘Bloomberg internal benchmarks’ as well as general-purpose NLP benchmarks such as BIG-bench Hard, knowledge assessments, reading comprehension, and linguistic tasks.
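Benchmark results of this kind generally reduce to per-example accuracy: the model’s predicted answer is compared against a gold label. A minimal sketch (the labels below are made up for illustration):

```python
def accuracy(predictions: list, gold: list) -> float:
    # Fraction of examples where the model's answer matches the gold label.
    assert len(predictions) == len(gold)
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

preds = ["positive", "neutral", "negative", "positive"]
gold = ["positive", "negative", "negative", "positive"]
print(accuracy(preds, gold))  # 0.75
```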


Vandana Nair

