Pandas 2.0 is Finally Here!

The new version of Pandas has added the ability to use any numpy numeric dtype in an Index, and removed Int64Index, UInt64Index, and Float64Index.

The much-awaited Pandas 2.0 is finally here. The release brings new features, bug fixes, and improved performance, alongside breaking changes. Close to 253 people contributed patches to this release.

Check out the GitHub repository here.

According to the release notes, users with existing code should first upgrade to pandas 1.5.3 and make sure their code does not generate FutureWarning or DeprecationWarning messages before they upgrade to Pandas 2.0. The release is available on conda-forge and PyPI.
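One way to check for such warnings is to run existing code with warnings recorded. The sketch below uses only the standard library; collect_deprecations and legacy_code are hypothetical names, and legacy_code merely stands in for real code that calls a deprecated pandas API.

```python
import warnings


def collect_deprecations(func):
    """Run func and return the FutureWarning/DeprecationWarning records it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" makes sure no warning is hidden by the default filters.
        warnings.simplefilter("always")
        func()
    return [
        w for w in caught
        if issubclass(w.category, (FutureWarning, DeprecationWarning))
    ]


def legacy_code():
    # Hypothetical stand-in for a call into a deprecated pandas API.
    warnings.warn("stand-in for a deprecated pandas call", FutureWarning)


found = collect_deprecations(legacy_code)
for record in found:
    print(record.category.__name__, record.message)
```

An empty list from a run on pandas 1.5.3 would suggest, though not prove, that the exercised code paths are ready for the upgrade.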


What’s new?

There have been significant improvements compared to previous versions:

Improved Performance

The new version of Pandas has added the ability to use any numpy numeric dtype in an Index, and removed Int64Index, UInt64Index, and Float64Index. Also, operations that previously forced the creation of 64-bit indexes can now create indexes with lower bit sizes, such as 32-bit indexes.


The ability for Index to hold numpy numeric dtypes has brought some changes in Pandas functionality. An Index instantiated from a numpy numeric array now takes the dtype of that array, where previous versions converted it to a 64-bit dtype.
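A minimal sketch of this behaviour, assuming pandas 2.0 or later is installed (the variable names are illustrative only):

```python
import numpy as np
import pandas as pd

# On pandas >= 2.0 the Index keeps the dtype of the numpy array it is built from.
# On pandas 1.x both of these would have been converted to 64-bit dtypes.
small_ints = pd.Index(np.array([1, 2, 3], dtype=np.int8))
floats32 = pd.Index(np.array([1.0, 2.0], dtype=np.float32))

print(small_ints.dtype)
print(floats32.dtype)
```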

Significant behaviour changes

The bug fixes in the latest version of Pandas have brought some notable behaviour changes. For instance, the DataFrameGroupBy.cumsum() and DataFrameGroupBy.cumprod() methods now keep the int64 dtype and overflow instead of lossily casting to float. Previous versions cast to float, which could give incorrect results even when the result could be held by the int64 dtype. The new behaviour is consistent with numpy and the regular DataFrame.cumprod() and DataFrame.cumsum() methods when the limit of int64 is reached.
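The difference can be seen with a grouped cumulative product that grows past the int64 limit. This is a sketch assuming pandas 2.0 or later; the column names and values are made up for illustration.

```python
import pandas as pd

# Seven rows in one group, each holding 625, so the running product is 625**k.
df = pd.DataFrame({"key": ["b"] * 7, "value": 625})
running = df.groupby("key")["value"].cumprod()

print(running.dtype)    # stays an integer dtype on pandas >= 2.0
print(running.iloc[5])  # 625**6 still fits in int64 and is exact
print(running.iloc[6])  # 625**7 exceeds int64, so this value has wrapped around
```

On pandas 1.x the same call returned float64, so the sixth value was already rounded even though it fits in int64.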

Further, the SeriesGroupBy.nth() and DataFrameGroupBy.nth() methods now behave as filtrations instead of aggregations. In other words, they may return either zero or multiple rows per group, and the index of the result is derived from the input by selecting the appropriate rows. For example, when n is larger than the group, no rows are returned instead of NaN.
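A short sketch of the filtering behaviour, assuming pandas 2.0 or later and an illustrative two-group frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [np.nan, 2.0, 3.0, 4.0, 5.0]})
gb = df.groupby("a")

# Second row of each group: the result keeps the original row labels.
second = gb.nth(1)
print(second)

# Group a=2 has only two rows, so asking for the third row yields one row in total.
third = gb.nth(2)

# No group has a fourth row, so the result is empty rather than filled with NaN.
fourth = gb.nth(3)
print(len(fourth))
```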

The release notes flag these as notable behaviour changes, so it is important to be aware of them when upgrading to Pandas 2.0.


There is more

The new version of Pandas also changes how unsupported datetime and timedelta resolutions are handled. In previous versions, when a Series or DataFrame was constructed with a datetime64 or timedelta64 dtype of an unsupported resolution (anything other than “ns”), Pandas would silently replace it with the nanosecond equivalent. Pandas 2.0 supports the “s”, “ms”, “us”, and “ns” resolutions: a supported dtype is kept exactly as requested, and an unsupported one now raises an error instead of being silently replaced with a supported one.
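As a rough illustration, assuming pandas 2.0 or later, constructing a Series with a second-resolution dtype keeps that dtype, while a day-resolution dtype is rejected. The exact exception class is treated as an assumption here, so the sketch catches both TypeError and ValueError.

```python
import pandas as pd

# Supported resolution: the requested dtype is kept as-is on pandas >= 2.0.
seconds = pd.Series(["2016-01-01"], dtype="datetime64[s]")
print(seconds.dtype)

# Unsupported resolution (days): expected to raise rather than fall back to ns.
try:
    pd.Series(["2016-01-01"], dtype="datetime64[D]")
    raised = False
except (TypeError, ValueError) as exc:
    raised = True
    print(type(exc).__name__, exc)
```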

In addition to this, Pandas 2.0 has changed the result name and index of the Series.value_counts() method. In previous versions, the result took the name of the original object and its index was unnamed, which caused confusion when resetting the index. In the new version, the result name will be ‘count’ (or ‘proportion’ if normalize=True was passed), and the index will be named after the original object.
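A small sketch of the new naming, assuming pandas 2.0 or later and an illustrative Series named "animal":

```python
import pandas as pd

animals = pd.Series(["quetzal", "quetzal", "elk"], name="animal")

counts = animals.value_counts()
shares = animals.value_counts(normalize=True)

print(counts.name, counts.index.name)  # result name and index name on pandas >= 2.0
print(shares.name)

# reset_index() now yields two clearly distinct column names.
print(list(counts.reset_index().columns))
```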

Pandas 2.0 also disallows astype conversion to unsupported datetime64/timedelta64 dtypes and raises an error. In previous versions, converting a Series or DataFrame from datetime64[ns] to a different datetime64[X] dtype returned datetime64[ns] instead of the requested dtype. In Pandas 2.0, converting to one of the supported resolutions returns exactly the requested dtype.
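The astype behaviour can be sketched as follows, assuming pandas 2.0 or later. As before, the exception class is an assumption, so both TypeError and ValueError are caught.

```python
import pandas as pd

ser = pd.Series(pd.date_range("2016-01-01", periods=3))

# Converting to a supported resolution returns exactly that dtype on pandas >= 2.0.
converted = ser.astype("datetime64[s]")
print(converted.dtype)

# Converting to an unsupported resolution (days) is expected to raise.
try:
    ser.astype("datetime64[D]")
    astype_raised = False
except (TypeError, ValueError) as exc:
    astype_raised = True
    print(type(exc).__name__, exc)
```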

For more details on the latest version of Pandas, click here.


Tasmia Ansari
Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.
