Sticky header and index for large data frames in Jupyter


Problem description

When displaying a large data frame in Jupyter the number of columns will be limited by max_cols set in the default setting, and all the rows will be displayed.

I would like to add an option to the default settings so that large data frames are displayed with a sticky header and index, letting the user scroll through the data frame.

Proof of concept solution

Following the solution for HTML tables from the Stack Overflow question "Table with fixed header and fixed column on pure css" (shown in action in the linked "HTML and CSS Solution" demo), I came up with the following solution, which uses <style scoped> in the same way as the _repr_html_ method:

import numpy as np
import pandas as pd
from IPython.display import HTML

# Dummy dataframe
columns = [chr(i) for i in range(ord('a'),ord('z')+1)]
data = np.random.rand(len(columns),len(columns))
df = pd.DataFrame(data, columns=columns)

# Solution
# Getting default html as string
df_html = df.to_html() 
# CSS styling 
style = """
<style scoped>
    .dataframe-div {
      max-height: 300px;
      overflow: auto;
      position: relative;
    }

    .dataframe thead th {
      position: -webkit-sticky; /* for Safari */
      position: sticky;
      top: 0;
      background: black;
      color: white;
    }

    .dataframe thead th:first-child {
      left: 0;
      z-index: 1;
    }

    .dataframe tbody tr th:only-of-type {
      vertical-align: middle;
    }

    .dataframe tbody tr th {
      position: -webkit-sticky; /* for Safari */
      position: sticky;
      left: 0;
      background: black;
      color: white;
      vertical-align: top;
    }
</style>
"""
# Concatenating to single string
df_html = style+'<div class="dataframe-div">'+df_html+"\n</div>"

# Displaying df with sticky header and index
HTML(df_html)

I would therefore like to know whether others would also like to have this feature in pandas. Otherwise, I guess I would turn it into an independent module that wraps the _repr_html_ method. I know that the general case takes more than just adding the new styling above, but the above is a minimal working solution. A related issue is #28091.
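As a rough sketch of the "independent module" idea, the string concatenation from the proof of concept could be packaged as one function. The names `wrap_sticky`, `STICKY_CSS` and the `max_height` parameter are hypothetical and not part of pandas. The function only manipulates strings, so it works on any HTML table that carries the `dataframe` class.

```python
# Hypothetical helper, not part of pandas: wraps an HTML table string
# in a scrollable div with the sticky CSS from the proof of concept.
STICKY_CSS = """
<style scoped>
    .dataframe-div {{
      max-height: {max_height}px;
      overflow: auto;
      position: relative;
    }}
    .dataframe thead th {{
      position: -webkit-sticky; /* for Safari */
      position: sticky;
      top: 0;
      background: black;
      color: white;
    }}
    .dataframe thead th:first-child {{
      left: 0;
      z-index: 1;
    }}
    .dataframe tbody tr th {{
      position: -webkit-sticky; /* for Safari */
      position: sticky;
      left: 0;
      background: black;
      color: white;
    }}
</style>
"""


def wrap_sticky(table_html: str, max_height: int = 300) -> str:
    """Return table_html inside a scrollable div, preceded by the sticky CSS."""
    style = STICKY_CSS.format(max_height=max_height)
    return f'{style}<div class="dataframe-div">{table_html}\n</div>'
```

In a notebook this would be used as `HTML(wrap_sticky(df.to_html()))`, with `HTML` imported from `IPython.display` as in the code above.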

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 5
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
attack68 commented, Jun 3, 2021

Then you need to define the sticky position for each index level so that the levels don't overlap, e.g.:

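# Fragment: assumes self is a Styler, index is the collection of index
# levels to pin, and index_width is the pixel width given to each level.
for i, level in enumerate(sorted(index)):
    self.set_table_styles(
        [
            {
                "selector": f"tbody th.level{level}",
                # Shift each level right by the widths of the levels before it.
                "props": f"position: sticky; "
                f"left: {i * index_width}px; "
                f"min-width: {index_width}px; "
                f"max-width: {index_width}px; "
                f"background-color: white;",
            }
        ],
        overwrite=False,  # keep styles that were set earlier
    )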
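The fragment above depends on `self`, `index` and `index_width` from its surrounding context. A standalone sketch of the same idea, assuming a hypothetical helper name `sticky_index_styles` and an arbitrary default width of 75 px, builds the list of style dictionaries first so it can be handed to `Styler.set_table_styles` in one call:

```python
def sticky_index_styles(levels, index_width=75):
    """Build one style dict per index level, each offset by the levels before it.

    Hypothetical helper: the name and the 75px default are illustrative only.
    """
    styles = []
    for i, level in enumerate(sorted(levels)):
        styles.append(
            {
                "selector": f"tbody th.level{level}",
                "props": (
                    "position: sticky; "
                    f"left: {i * index_width}px; "
                    f"min-width: {index_width}px; "
                    f"max-width: {index_width}px; "
                    "background-color: white;"
                ),
            }
        )
    return styles
```

With pandas 1.3.0 or later, something like `df.style.set_table_styles(sticky_index_styles(range(df.index.nlevels)), overwrite=False)` should then pin every index level. The fixed `min-width` and `max-width` are what make the `left` offsets predictable. pandas 1.3.0 also added `Styler.set_sticky`, which covers this use case directly.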

1 reaction
attack68 commented, Mar 9, 2021

@dsjstc sorry, I gave you the new 1.3.0 input format (due for release in June 2021), which is more CSS friendly. Yes, you need what you did for 1.2.2.
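The thread does not show what was done for 1.2.2. Before 1.3.0, `set_table_styles` expected `props` as a list of `(attribute, value)` tuples rather than a CSS string, so one plausible workaround is a small converter. The helper name `css_to_tuples` is hypothetical, and the parsing is deliberately naive: it assumes no semicolons inside values.

```python
def css_to_tuples(css: str):
    """Split 'attr: value; attr: value;' into [(attr, value), ...].

    Hypothetical helper; naive parsing that assumes values contain no ';'.
    """
    pairs = []
    for declaration in css.split(";"):
        if not declaration.strip():
            continue  # skip the empty piece after a trailing semicolon
        attr, _, value = declaration.partition(":")
        pairs.append((attr.strip(), value.strip()))
    return pairs
```

On an older pandas, a style entry would then look like `{"selector": "tbody th.level0", "props": css_to_tuples("position: sticky; left: 0px;")}`.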


