Performance issues when rendering large PDFs


This might be a better question for the pdf.js community itself, but how can rendering large PDFs be handled better with react-pdf?

pdf.js suggests not rendering more than 25 pages at a time: https://github.com/mozilla/pdf.js/wiki/Frequently-Asked-Questions#allthepages
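
As a rough illustration of that limit, here is a minimal sketch of a helper that picks at most 25 page numbers around the page currently in view. The name `pagesToRender` and its parameters are hypothetical; nothing like it ships with react-pdf or pdf.js.

```javascript
// Hypothetical helper (not part of react-pdf or pdf.js): returns the page
// numbers to mount around the current page, capped at maxPages.
function pagesToRender(currentPage, numPages, maxPages = 25) {
  if (!numPages || numPages < 1) return [];
  const count = Math.min(maxPages, numPages);
  // Center the window on the current page, then clamp it to the document.
  let first = currentPage - Math.floor(count / 2);
  first = Math.max(1, Math.min(first, numPages - count + 1));
  return Array.from({ length: count }, (_, i) => first + i);
}
```

The returned array could then be mapped to `<Page pageNumber={n} />` elements inside `<Document>`, so that only that window of pages is mounted at any time.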

I even had to add this to my component to keep React from trying to re-create the virtual DOM of the Document:

    shouldComponentUpdate(nextProps, nextState) {
        if(nextProps.file !== this.props.file
            || nextState.numPages !== this.state.numPages
            || nextState.width !== this.state.width){
            return true
        }
        return false
    }

The problem is that I also need to set the width of the document dynamically in response to user interaction, so I can’t avoid re-creating the virtual DOM when the width changes. Is there any way I can achieve this with your lib?

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 58 (12 by maintainers)

Top GitHub Comments

26 reactions
wojtekmaj commented, Nov 18, 2017

Hey everyone, I’d like to remind you that React-PDF was never intended to provide users with a fully fledged PDF reader. Instead, it is only a tool for building one. While I do plan to create a React-PDF-based PDF reader, I’m far from it. Mozilla has been working on theirs for years and it never seems to be done. I think mine would go a similar way 😉

There is some good news too, though. If I may suggest something: the onRenderSuccess callback, which you can define on <Page> components, can be a powerful friend. You can use it to, for example, force pages to be rendered one by one:

import React, { Component } from 'react';
import { Document, Page } from 'react-pdf/build/entry.webpack';

import './Sample.less';

export default class Sample extends Component {
  state = {
    file: './test.pdf',
    numPages: null,
    pagesRendered: null,
  }

  onDocumentLoadSuccess = ({ numPages }) =>
    this.setState({
      numPages,
      pagesRendered: 0,
    });

  onRenderSuccess = () =>
    this.setState(prevState => ({
      pagesRendered: prevState.pagesRendered + 1,
    }));

  render() {
    const { file, numPages, pagesRendered } = this.state;

    /**
     * The amount of pages we want to render now. Always 1 more than already rendered,
     * no more than total amount of pages in the document.
     */
    const pagesRenderedPlusOne = Math.min(pagesRendered + 1, numPages);

    return (
      <div className="Example">
        <header>
          <h1>react-pdf sample page</h1>
        </header>
        <div className="Example__container">
          <div className="Example__container__document">
            <Document
              file={file}
              onLoadSuccess={this.onDocumentLoadSuccess}
            >
              {
                Array.from(
                  new Array(pagesRenderedPlusOne),
                  (el, index) => {
                    const isCurrentlyRendering = pagesRenderedPlusOne === index + 1;
                    const isLastPage = numPages === index + 1;
                    const needsCallbackToRenderNextPage = isCurrentlyRendering && !isLastPage;

                    return (
                      <Page
                        key={`page_${index + 1}`}
                        onRenderSuccess={
                          needsCallbackToRenderNextPage ? this.onRenderSuccess : null
                        }
                        pageNumber={index + 1}
                      />
                    );
                  },
                )
              }
            </Document>
          </div>
        </div>
      </div>
    );
  }
}

Of course you can do much more - add placeholders, check on scroll which pages need rendering, keep info on whether all pages so far were rendered… I believe in your creativity 😉 And if I can be of any help regarding API, please let me know!
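
To make the "check on scroll which pages need rendering" idea concrete, here is a minimal sketch, assuming the rendered height of every page is already known and the pages are stacked vertically without gaps. The function name `visiblePages` and its parameters are hypothetical and not part of react-pdf.

```javascript
// Hypothetical helper: given each page's rendered height in pixels, the
// container's scrollTop and its viewport height, return the 1-based page
// numbers that overlap the visible area.
function visiblePages(pageHeights, scrollTop, viewportHeight) {
  const visible = [];
  let top = 0;
  pageHeights.forEach((height, index) => {
    const bottom = top + height;
    // A page is visible if it overlaps [scrollTop, scrollTop + viewportHeight).
    if (bottom > scrollTop && top < scrollTop + viewportHeight) {
      visible.push(index + 1);
    }
    top = bottom;
  });
  return visible;
}
```

A scroll handler on the container could call this with `container.scrollTop` and `container.clientHeight`, render real `<Page>` components for the returned numbers, and show fixed-height placeholders for the rest.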

9 reactions
stefanbugge commented, Jan 6, 2019

I’ve had success with rendering with react-pdf together with react-window. The implementation below is inspired by the react-virtualized implementation by @michaeldzjap above and the description provided by @nikonet. It’s still a work in progress but so far it seems to perform well. Any suggestions to improve the implementation would be greatly appreciated.

One thing that concerns me, however: by caching all page dimensions on document load, I would assume that you lose the ability of pdfjs to load pages in chunks with range requests. Any thoughts on this?

import React from 'react'
import PropTypes from 'prop-types'
import { debounce } from 'lodash'

import { VariableSizeList as List } from 'react-window'
import { Document } from 'react-pdf/dist/entry.webpack'

import PageRenderer from './PageRenderer'
import PlaceholderPageList from './PlaceholderPageList'

import { PAGE_SPAZING } from './../constants'

/* eslint-disable import/no-webpack-loader-syntax */
import testpdf from 'url-loader!./../testpdf.pdf'
import './../style.scss'

const file = {
  url: testpdf
}

const propTypes = {
  scale: PropTypes.number.isRequired
}

// PDFjs options
const options = {}

class DocumentViewer extends React.Component {
  static propTypes = propTypes

  constructor (props) {
    super(props)

    this.state = {
      containerWidth: undefined,
      containerHeight: undefined,
      numPages: undefined,
      currentPage: 1,
      cachedPageDimensions: null
    }

    this.viewerContainerRef = React.createRef()
    this.listRef = React.createRef()
  }

  componentDidMount () {
    this._mounted = true
    this.calculateContainerBounds()
    window.addEventListener('resize', this.handleWindowResize, true)
  }

  componentWillUnmount () {
    this._mounted = false
    window.removeEventListener('resize', this.handleWindowResize, true)
  }

  componentDidUpdate (prevProps) {
    if (prevProps.scale !== this.props.scale) {
      this.recomputeRowHeights()
    }
  }

  /**
   * Load all pages to cache all page dimensions.
   */
  cachePageDimensions (pdf) {
    const promises = Array.from({ length: pdf.numPages }, (v, i) => i + 1).map(
      pageNumber => pdf.getPage(pageNumber)
    )

    // Assuming all pages may have different heights. Otherwise we can just
    // load the first page and use its height for determining all the row
    // heights.
    Promise.all(promises).then(values => {
      if (!this._mounted) {
        return null
      }

      const pageDimensions = values.reduce((accPageDimensions, page) => {
        accPageDimensions.set(page.pageIndex + 1, [
          page.view[2],
          page.view[3] + PAGE_SPAZING
        ])
        return accPageDimensions
      }, new Map())

      this.setState({
        cachedPageDimensions: pageDimensions
      })
    })
  }

  calculateContainerBounds = () => {
    // The ref object itself is never null; check the DOM node it points to.
    if (this.viewerContainerRef.current == null) {
      return
    }
    const rect = this.viewerContainerRef.current.getBoundingClientRect()
    this.setState({
      containerWidth: rect.width,
      containerHeight: rect.height
    })
  }

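  recomputeRowHeights = () => {
    // The list is only mounted once page dimensions have been cached.
    if (this.listRef.current != null) {
      this.listRef.current.resetAfterIndex(0)
    }
  }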

  /*
    HANDLERS
  */

  onDocumentLoadSuccess = pdf => {
    this.setState({
      numPages: pdf.numPages
    })
    this.cachePageDimensions(pdf)
    this.calculateContainerBounds()
  }

  handleWindowResize = debounce(() => {
    this.calculateContainerBounds()
  }, 300)

  updateCurrentVisiblePage = ({ visibleStopIndex }) => {
    this.setState({
      currentPage: visibleStopIndex + 1
    })
  }

  /*
    GETTERS
  */

  getItemSize = index => {
    return this.state.cachedPageDimensions.get(index + 1)[1] * this.props.scale
  }

  /*
    RENDERERS
  */

  render () {
    const { 
      numPages, 
      cachedPageDimensions,
      containerHeight
    } = this.state

    const itemData = {
      scale: this.props.scale,
      cachedPageDimensions: cachedPageDimensions
    }
    return (
      <div className='dv' ref={this.viewerContainerRef}>
        <Document
          className='dv__document'
          file={file}
          onLoadSuccess={this.onDocumentLoadSuccess}
          options={options}
          loading={<PlaceholderPageList />}
        >
          {cachedPageDimensions != null && (
            <List
              height={containerHeight}
              itemCount={numPages}
              itemSize={this.getItemSize}
              itemData={itemData}
              overscanCount={2}
              onItemsRendered={this.updateCurrentVisiblePage}
              ref={this.listRef}
            >
              {PageRenderer}
            </List>
          )}
        </Document>
      </div>
    )
  }
}

export default DocumentViewer

//////////////////////////////////////////////////

import React from 'react'
import PropTypes from 'prop-types'

import { Page } from 'react-pdf/dist/entry.webpack'

const propTypes = {
  index: PropTypes.number.isRequired,
  style: PropTypes.object.isRequired,
  data: PropTypes.object.isRequired
}

export default class PageRenderer extends React.PureComponent {
  static propTypes = propTypes

  render () {
    const { index, data } = this.props
    const { cachedPageDimensions, scale } = data

    const pageNumber = index + 1
    const pageDimensions = cachedPageDimensions.get(pageNumber)

    const width = pageDimensions[0] * scale
    const style = {
      ...this.props.style,
      width,
      left: '50%',
      WebkitTransform: 'translateX(-50%)',
      transform: 'translateX(-50%)'
    }
    return (
      <div
        className='dv__page-wrapper'
        key={`page_${pageNumber}`}
        style={style}
      >
        <Page
          className='dv__page'
          pageNumber={pageNumber}
          scale={scale}
          renderAnnotationLayer={false}
        />
      </div>
    )
  }
}
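
The code comment in `cachePageDimensions` mentions that, if all pages share one size, the first page alone could determine every row height. A minimal sketch of that variant follows, assuming uniform page sizes (an assumption that does not hold for every PDF). The helper name `uniformPageDimensions` is hypothetical. Whether this keeps pdfjs range requests effective has not been verified here.

```javascript
// Hypothetical helper: builds the same Map shape as cachePageDimensions
// (pageNumber -> [width, height + spacing]) from a single page's view box,
// ASSUMING every page has the same size as the first one.
function uniformPageDimensions(firstPageView, numPages, spacing) {
  const width = firstPageView[2];
  const height = firstPageView[3] + spacing;
  const dimensions = new Map();
  for (let pageNumber = 1; pageNumber <= numPages; pageNumber += 1) {
    dimensions.set(pageNumber, [width, height]);
  }
  return dimensions;
}

// Possible use inside onDocumentLoadSuccess (only page 1 is requested):
// pdf.getPage(1).then(page =>
//   this.setState({
//     cachedPageDimensions: uniformPageDimensions(page.view, pdf.numPages, PAGE_SPAZING)
//   })
// )
```

Like the original, this reads the width and height from `page.view[2]` and `page.view[3]`, so it inherits the same assumption that the view box starts at the origin.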
