Whoogle: Google without advertisements and JavaScript

Published: 2022-09-28

In my experience, searching via Google is more productive than searching via other search engines. DuckDuckGo, Bing, Ecosia, Brave, and others are nice, but they often miss important results, especially for very specific technical searches. However, Google also scores lowest in privacy. It knows where you drive and walk from Google Maps, when you have meetings from Google Calendar, what videos you watch from YouTube, and what you search for via its search engine. The search engine even tracks how long you stay on a webpage by noting when you leave the results page and when you come back to it. According to uBlock Origin on Firefox, a Google Search page contains dozens of ads and trackers. Luckily, the situation can be improved by self-hosting an alternative front end such as Whoogle or searx. Here, we'll use Whoogle, which is very easy to set up.

To run Whoogle, make sure that you have a server available with a stable and, preferably, fast internet connection. Assuming that you'll be using Whoogle for yourself and maybe some friends and family, the server only needs about a hundred MB of memory and almost no CPU. I'll also assume that you have Docker installed. To set up Whoogle, add the following docker-compose.yml file to a folder called whoogle:

version: '3'

services:
  whoogle:
    image: 'benbusby/whoogle-search:0.8.4'
    container_name: 'whoogle'
    ports:
      - '5004:5000'
    logging:
      driver: 'json-file'
      options:
        max-size: '10m'
        max-file: '10'
    environment:
      WHOOGLE_CONFIG_DISABLE: '1'
      WHOOGLE_CONFIG_COUNTRY: 'US'
      WHOOGLE_CONFIG_LANGUAGE: 'lang_en'
      WHOOGLE_CONFIG_SEARCH_LANGUAGE: 'lang_en'
      WHOOGLE_CONFIG_URL: 'https://search.<your domain name>'
    restart: 'unless-stopped'

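As you can see, the Docker container will make port 5004 available to the outside world. Start the container by running docker compose up -d in the whoogle folder (or docker-compose up -d on older installations). To test that Whoogle works, go to http://<your server ip>:5004 in your browser and check that you get a running Whoogle instance.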
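The container can take a moment to come up, so the same test is handy from the command line too. The following is only a sketch: wait_for_http is a hypothetical helper name, it assumes curl is installed, and it assumes that Whoogle's start page answers with HTTP 200.

```shell
# Poll a URL until it answers with HTTP 200 or the attempts run out.
# $1 is the URL; $2 is the number of attempts (default 10), one per second.
wait_for_http() {
    url=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        # Print only the status code; curl reports 000 if it cannot connect.
        code=$(curl --silent --output /dev/null --write-out '%{http_code}' "$url")
        if [ "$code" = "200" ]; then
            echo "up: $url"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "no HTTP 200 from $url" >&2
    return 1
}

# Example call (placeholder IP), left commented out so that pasting the
# block does not start anything:
# docker compose up -d && wait_for_http "http://<your server ip>:5004"
```

If the function gives up, docker compose logs whoogle is probably the next place to look.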

However, for safety reasons, it's better to put the Whoogle instance behind a password and to use SSL. We can achieve both via a reverse proxy. I'd advise Caddy because it makes setting up SSL way easier than NGINX. To set up SSL, add A (and optionally AAAA) records to your domain that point search.<your domain name> to <your server ip>. These records will make https://search.<your domain name> point to your server. Once that is done, add the following to your Caddyfile:

search.<your domain name> {
    reverse_proxy 127.0.0.1:5004
}

and restart Caddy. After setting this, https://search.<your domain name> should make Whoogle available via SSL 🚀. Much better.
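If HTTPS doesn't come up, DNS is a likely culprit, since Caddy can only obtain a certificate once the domain actually resolves to your server. Here is a rough check, assuming that dig is installed (it is usually part of a dnsutils or bind-utils package); resolves_to is a hypothetical helper name and the arguments in the example are placeholders.

```shell
# Succeed if the A records of host $1 include the IPv4 address $2.
resolves_to() {
    # +short prints one address per line; -F -x -q asks grep for a quiet,
    # literal, whole-line match.
    dig +short "$1" A | grep -Fxq "$2"
}

# Example call with placeholders:
# resolves_to "search.<your domain name>" "<your server ip>" && echo "DNS is fine"
```

Freshly added records can take a while to show up, so a failed check right after creating the record doesn't necessarily mean that something is wrong.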

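Finally, malicious actors could now freely use your Whoogle instance, either to flood your server with work or to exploit a vulnerability in Whoogle to gain access to the Docker container and possibly even your server. At the time of writing, the Whoogle container has to run as root, which is not great from a security perspective. To solve these problems, we can add basic authentication to Whoogle. Whoogle provides this natively, but since more people use Caddy than Whoogle, it's probably safer to use the basic authentication from Caddy. Keep in mind that Caddy's login only protects requests that go through Caddy. As long as the port mapping is '5004:5000', the instance also remains reachable without a password at http://<your server ip>:5004, so once the reverse proxy works, change the mapping to '127.0.0.1:5004:5000' and recreate the container. To set up basic auth, update your Caddyfile as follows: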

search.<your domain name> {
    basicauth * {
        <your username> <your hashed password>
    }
    reverse_proxy 127.0.0.1:5004
}

This hashed password can be obtained by running

$ caddy hash-password

and typing your password in the console. Caddy will then return your hashed password which you can paste in the Caddyfile. If you don't have Caddy installed, you can simply run Caddy on your server via:

$ docker run -it --rm caddy sh

$ caddy hash-password

Or, if you have Caddy already running and have set container_name: 'caddy', then you can run:

$ docker exec -it caddy sh

$ caddy hash-password

After restarting Caddy, you should now have a Whoogle instance behind a password 🔑. If you have a password manager such as Bitwarden, adding the password to your safe will auto-fill the password each time you go to https://search.<your domain name>. So, that means that you won't even notice that Whoogle is behind a login. Furthermore, you can add Whoogle to your browser's search bar too; see the instructions in Whoogle's README. If the browser has trouble with logging in, that is, gives "not found" errors, then make sure that the browser looks at https://<your username>:<your password>@search.<your domain name>. Note that this is the password that you typed into caddy hash-password and not the hash from the Caddyfile. This way, you pass the credentials via the URL, which should ensure that your browser uses the credentials.
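To see what the browser actually sends, and to check the login from the command line, here is a small sketch. The function names and the alice/s3cret credentials are made up for illustration, and it assumes that curl and base64 are installed. With basic authentication, the client sends the username and the plain password, base64-encoded, and Caddy compares that password against the hash in the Caddyfile.

```shell
# Build the Authorization header value that a client sends for basic auth.
# It encodes "<username>:<password>" with the plain password, not the hash.
basic_auth_header() {
    printf 'Basic %s' "$(printf '%s:%s' "$1" "$2" | base64)"
}

# Print only the HTTP status code for a URL; extra arguments go to curl.
http_status() {
    curl --silent --output /dev/null --write-out '%{http_code}' "$@"
}

basic_auth_header 'alice' 's3cret'
# → Basic YWxpY2U6czNjcmV0

# Example calls with placeholders:
# http_status "https://search.<your domain name>"
# http_status --user '<your username>' "https://search.<your domain name>"
```

The first call should print 401 because no credentials are sent. For the second call, curl asks for the password since only a username was given, and a correct password should result in 200.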

The text is licensed under CC BY-NC-SA 4.0 and the code under Unlicense.