Puppeteer
Puppeteer is one of the most popular libraries that abstract the lower-level DevTools protocol away from developers. It provides a high-level API that you can use to instrument Chrome/Chromium and automate browsing sessions. Puppeteer is used for tasks like creating screenshots, crawling pages, and testing web applications.
Puppeteer typically connects to a local Chrome or Chromium browser using the DevTools port. Refer to the Puppeteer API documentation on the Puppeteer.connect() method for more information.
The Workers team forked a version of Puppeteer and patched it to connect to the Workers Browser Rendering API instead. The changes between the Workers Puppeteer fork and Puppeteer core are minimal. After connecting, developers can use the full Puppeteer API as they would on a standard setup.
Our version is open source and can be found in Cloudflare’s fork of Puppeteer. The package is published on npm as @cloudflare/puppeteer and can be installed with npm install @cloudflare/puppeteer.
Once the browser binding is configured and the @cloudflare/puppeteer library is installed, Puppeteer can be used in a Worker. A typical script launches the env.MYBROWSER browser, opens a new page, goes to https://example.com/, gets the page load metrics, closes the browser, and outputs the metrics as JSON.
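The flow described above might be sketched as follows. This is a minimal illustration, not the library's own example: the interfaces PageLike, BrowserLike, and LauncherLike and the function collectMetrics are hypothetical names introduced here so the logic can run without a browser binding. In a real Worker, the launcher would be the default export of @cloudflare/puppeteer and the binding would be env.MYBROWSER.

```typescript
// Hypothetical stand-ins covering only the calls used below. In a Worker the
// real objects come from: import puppeteer from "@cloudflare/puppeteer";
interface PageLike {
  goto(url: string): Promise<unknown>;
  metrics(): Promise<object>;
}
interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
interface LauncherLike<Binding> {
  launch(binding: Binding): Promise<BrowserLike>;
}

// Launch a browser, open a page, load the URL, read the page metrics,
// close the browser, and return the metrics serialized as JSON.
async function collectMetrics<Binding>(
  launcher: LauncherLike<Binding>,
  binding: Binding,
  url: string,
): Promise<string> {
  const browser = await launcher.launch(binding);
  try {
    const page = await browser.newPage();
    await page.goto(url);
    const metrics = await page.metrics();
    return JSON.stringify(metrics);
  } finally {
    // Close explicitly so the session does not linger until the idle timeout.
    await browser.close();
  }
}

// Intended use inside a Worker's fetch handler (not executed here):
//   const body = await collectMetrics(puppeteer, env.MYBROWSER, "https://example.com/");
//   return new Response(body, { headers: { "content-type": "application/json" } });
```

Wrapping the page work in try/finally is a design choice of this sketch: it ensures browser.close() runs even if navigation fails.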
If you omit the browser.close() statement, the browser stays open, ready to be connected to again and reused, but by default it closes automatically after 1 minute of inactivity. You can extend this idle time up to 10 minutes by using the keep_alive option, which is set in milliseconds. With keep_alive set to 600000, the browser stays open for up to 10 minutes, even if inactive.
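As a rough sketch of how the keep_alive value could be computed, the helper below converts minutes to milliseconds and keeps the result within the range stated above. The function keepAliveOptions and both constants are hypothetical names, and passing the options object as the second argument to puppeteer.launch is an assumption about the fork's API.

```typescript
// Idle-time bounds taken from the text: 1 minute by default, 10 minutes at most.
const DEFAULT_IDLE_MS = 60_000;
const MAX_KEEP_ALIVE_MS = 600_000;

// Build launch options for a desired idle time given in minutes.
// Clamping to the stated range is a choice made for this sketch.
function keepAliveOptions(minutes: number): { keep_alive: number } {
  const requestedMs = Math.round(minutes * 60_000);
  const clampedMs = Math.min(Math.max(requestedMs, DEFAULT_IDLE_MS), MAX_KEEP_ALIVE_MS);
  return { keep_alive: clampedMs };
}

// Intended use inside a Worker (assumed signature, not executed here):
//   const browser = await puppeteer.launch(env.MYBROWSER, keepAliveOptions(10));
```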
To facilitate browser session management, we’ve added new methods to puppeteer:
puppeteer.sessions() lists the currently running sessions. For example, the result might show that session 478f4d7d-e943-40f6-a414-837d3736a1dc has an active worker connection (connectionId=2a2246fa-e234-4dc1-8433-87e6cee80145), while session 565e05fb-4d2a-402b-869b-5b65b1381db7 is free. While a connection is active, no other Workers may connect to that session.
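Because only free sessions can accept a new connection, a Worker that wants to reuse a browser would first filter the session list. The sketch below assumes each entry has a sessionId field and an optional connectionId field; sessionId is an assumed name, as are the SessionInfo type and the freeSessionIds helper.

```typescript
// Assumed shape of one entry in the session list.
interface SessionInfo {
  sessionId: string; // assumed field name
  connectionId?: string; // present only while a Worker is connected
}

// Return the IDs of sessions that have no active connection.
function freeSessionIds(sessions: SessionInfo[]): string[] {
  return sessions
    .filter((session) => session.connectionId === undefined)
    .map((session) => session.sessionId);
}

// The two sessions discussed above.
const sampleSessions: SessionInfo[] = [
  {
    sessionId: "478f4d7d-e943-40f6-a414-837d3736a1dc",
    connectionId: "2a2246fa-e234-4dc1-8433-87e6cee80145",
  },
  { sessionId: "565e05fb-4d2a-402b-869b-5b65b1381db7" },
];

const reusable = freeSessionIds(sampleSessions);
console.log(reusable);

// Intended use inside a Worker (assumed signatures, not executed here):
//   const sessions = await puppeteer.sessions(env.MYBROWSER);
//   const [sessionId] = freeSessionIds(sessions);
//   const browser = sessionId
//     ? await puppeteer.connect(env.MYBROWSER, sessionId)
//     : await puppeteer.launch(env.MYBROWSER);
```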
puppeteer.history() lists recent sessions, both open and closed. It’s useful for getting a sense of your current usage. For example, the result might show that session 2be00a21-9fb6-4bb2-9861-8cd48e40e771 was closed explicitly by the client with browser.close(), while session 478f4d7d-e943-40f6-a414-837d3736a1dc was closed after reaching the maximum idle time (check limits).
You should also be able to access this information in the dashboard, albeit with a slight delay.
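To get a quick usage summary from the history, entries can be tallied by the reason they were closed. The sketch below is illustrative only: the HistoryEntry type, the closeReasonText field, and the labels "NormalClosure" and "BrowserIdle" are assumptions about the shape of the data, and "StillOpen" is a placeholder invented here for sessions that have no close reason yet.

```typescript
// Assumed shape of one history entry.
interface HistoryEntry {
  sessionId: string; // assumed field name
  closeReasonText?: string; // assumed field name; absent while the session is open
}

// Count how many sessions ended for each reason.
function countByCloseReason(history: HistoryEntry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of history) {
    const reason = entry.closeReasonText ?? "StillOpen";
    counts[reason] = (counts[reason] ?? 0) + 1;
  }
  return counts;
}

// The two closed sessions discussed above, with assumed reason labels.
const sampleHistory: HistoryEntry[] = [
  { sessionId: "2be00a21-9fb6-4bb2-9861-8cd48e40e771", closeReasonText: "NormalClosure" },
  { sessionId: "478f4d7d-e943-40f6-a414-837d3736a1dc", closeReasonText: "BrowserIdle" },
];

const closeCounts = countByCloseReason(sampleHistory);
console.log(closeCounts);
```

A high share of idle closures in such a tally may suggest adding explicit browser.close() calls or reusing sessions more aggressively.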
puppeteer.limits() lists your active limits:
activeSessions lists the IDs of the currently open sessions.
maxConcurrentSessions defines how many browsers can be open at the same time.
allowedBrowserAcquisitions specifies if a new browser session can be opened according to the rate limits in place.
timeUntilNextAllowedBrowserAcquisition defines the waiting period before a new browser can be launched.
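These four fields can be combined into a simple pre-launch check. The sketch below is one possible reading, not an official algorithm: it assumes allowedBrowserAcquisitions is a number where zero means no new session is allowed, treats activeSessions as a list whose length is the number of open browsers, and passes the waiting period through unchanged without assuming its unit. The BrowserLimits and LaunchDecision types and the canLaunchBrowser function are hypothetical names.

```typescript
// Assumed shape of the limits object; field names come from the list above.
interface BrowserLimits {
  activeSessions: unknown[];
  maxConcurrentSessions: number;
  allowedBrowserAcquisitions: number; // assumed numeric, 0 = none allowed
  timeUntilNextAllowedBrowserAcquisition: number;
}

interface LaunchDecision {
  canLaunch: boolean;
  reason?: "concurrency" | "rate";
  waitHint: number; // copied from timeUntilNextAllowedBrowserAcquisition
}

// Decide whether launching a new browser is likely to succeed right now.
function canLaunchBrowser(limits: BrowserLimits): LaunchDecision {
  if (limits.activeSessions.length >= limits.maxConcurrentSessions) {
    // Every concurrent slot is in use; a session must close first.
    return { canLaunch: false, reason: "concurrency", waitHint: 0 };
  }
  if (limits.allowedBrowserAcquisitions < 1) {
    // Rate limited; report how long the caller may need to wait.
    return {
      canLaunch: false,
      reason: "rate",
      waitHint: limits.timeUntilNextAllowedBrowserAcquisition,
    };
  }
  return { canLaunch: true, waitHint: 0 };
}

// Intended use inside a Worker (assumed signature, not executed here):
//   const decision = canLaunchBrowser(await puppeteer.limits(env.MYBROWSER));
```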
The full Puppeteer API can be found in Cloudflare’s fork of Puppeteer.