I’ve been dockerizing web scrapers so I can add them to a data pipeline. Web scraping is always a little bit of living on the edge, because you never know which elements might have changed and which errors will pop up. In this blog post I would like to elaborate on the “session deleted because of page crash” error, which is more of a Docker issue than a scraping issue.
When running the container, the following error popped up in the CLI:
selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
There aren’t a lot of causes for a webpage crash. And if you’re using Docker, Bayesian reasoning should tell you that the described error can probably be solved with the solution below.
Here’s what’s happening: there isn’t enough shared memory (/dev/shm) allocated to the Docker container to fully load the web page. Modern web applications tend to be very memory-intensive, so the default shared memory allocation of 64 MB might simply not be enough.
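If you want to confirm that shared memory is the bottleneck before changing anything, you can check how large /dev/shm actually is from inside the container. The snippet below is a minimal sketch using only the Python standard library; the function name `shm_report` and the 64 MB threshold parameter are my own choices for illustration, not part of Docker or Selenium.

```python
import shutil


def shm_report(path="/dev/shm", minimum_mb=64):
    """Return (total_mb, free_mb, above_minimum) for the mount at `path`.

    `minimum_mb` defaults to 64 because that is Docker's default size
    for /dev/shm; a total at or below it suggests the default is in use.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    total_mb = usage.total / (1024 * 1024)
    free_mb = usage.free / (1024 * 1024)
    return total_mb, free_mb, total_mb > minimum_mb


# Hypothetical usage inside the scraper container:
#   total, free, ok = shm_report()
#   print(f"/dev/shm: {total:.0f} MB total, {free:.0f} MB free, enlarged: {ok}")
```

If the reported total is around 64 MB, the container is most likely still running with the default, and the fix described in this post applies.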
You can adjust the shared memory when running/composing a Docker container.
Via the CLI:
docker run -it --shm-size=1gb oracle11g /bin/bash
Via the compose file (YAML):
version: '3.5'
services:
  browser:
    image: 'selenium/standalone-chrome'
    ports:
      - '4444:4444'
    shm_size: '1gb'
If you would like to know what the valid parameter values are:
Size of /dev/shm. The format is <number><unit>. number must be greater than 0. Unit is optional and can be b (bytes), k (kilobytes), m (megabytes), or g (gigabytes). If you omit the unit, the system uses bytes. If you omit the size entirely, the system uses 64m.
Docker run reference
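To make the quoted format rule concrete, here is a rough Python sketch that converts a size string into bytes. It is not Docker's actual parser: `parse_shm_size` is a hypothetical helper, the binary multipliers (1 k = 1024 bytes) are my assumption, and accepting a trailing "b" (so that "1gb" works, as used earlier in this post) goes slightly beyond the quoted wording.

```python
import re

# Assumption: binary multipliers, i.e. 1k = 1024 bytes.
_MULTIPLIERS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
_DEFAULT_BYTES = 64 * 1024 ** 2  # "If you omit the size entirely, the system uses 64m."


def parse_shm_size(value=None):
    """Translate a <number><unit> string into bytes (illustrative only)."""
    if value is None or value.strip() == "":
        return _DEFAULT_BYTES
    # Number, then either k/m/g with an optional trailing "b", or a bare "b".
    match = re.fullmatch(r"(\d+)(?:([kmg])b?|b)?", value.strip().lower())
    if match is None:
        raise ValueError(f"invalid size: {value!r}")
    number = int(match.group(1))
    if number <= 0:
        raise ValueError("number must be greater than 0")
    unit = match.group(2)  # None means bytes (no unit, or a bare "b")
    return number * _MULTIPLIERS.get(unit, 1)
```

Under these assumptions, `parse_shm_size("1gb")` and `parse_shm_size("1g")` give the same result, and a string without a unit is read as a byte count.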
And that’s it. Great success!