Rho Signal

Reading from S3 with Polars

Updated October 2023. In this post we see how to read from and write to a CSV or Parquet file in S3 with Polars. We also see how to filter the file on S3 before downloading it to reduce the amount of ...

Understanding the Polars nested column types

Polars has 4 native nested column types. These can be very helpful for solving problems such as: working with ML embeddings, splitting strings, working with nested JSON data, working with aggr...

Comparison of Matplotlib and Plotly in Polars

Updated July 2023. From Plotly v5.15.0 onwards, Plotly has native support for Polars 😊. So you can pass the DataFrame as the first argument and the column names as strings to the x and y encoding argu...

Filling time series gaps in lazy mode

Two major advantages of Polars over Pandas are that Polars has a lazy mode with query optimization and that Polars can scale to larger-than-memory datasets with its streaming mode. Taking advantage ...

Crucial parameters for streaming in Polars

In this post we see how Polars sets some crucial parameters that affect streaming mode. Understanding these concepts is important if you want to optimize the performance of a large streaming query ...

Exploding a Polars pivot for feature engineering

In my ML pipelines these days I find I replace some of the simpler scikit-learn metrics such as root-mean-squared-error with my own hand-rolled Polars expressions. This approach saves me from copyi...

Ordering of groupby and unique in Polars

Polars (and Apache Arrow) has been designed to be careful with your data so you don’t get surprises like the following Pandas code where the ints column has been cast to float because of the missin...

Polars, Altair and Vegafusion

Altair has been my favourite visualisation library for a long time. It allows me to make beautiful visualisations with an API that is concise and consistent. I was sad to find last year that I coul...

Concat, extend or vstack?

On the face of it the concat, extend and vstack functions in Polars can do the same job: they can take two initial DataFrames and turn them into a single DataFrame. In this post I show that they do ...

Filtering one df by another

One of the most common questions we get on the Polars Discord is how to filter rows in one dataframe by values in another. I think people don’t realise this is basically a join because they don’...