* Setting up

After reading the following on https://hub.docker.com/r/browserless/chrome

#+begin_quote
Getting Chrome running well in docker is also a challenge as there's quiet a few packages you need in order to get Chrome running. Once that's done then there's still missing fonts, getting libraries to work with it, and having limitations on service reliability.
#+end_quote

I thought twice about setting it up myself, so I just grabbed this image for now.

I realized soon enough that ws://localhost:3000 is browserless' own API, so I tried to figure out how to get the websocket for the Chrome DevTools. Turns out I need to launch an instance first. Browserless has an API for that, but after going through the documentation I quickly felt that using it probably defeats the purpose of the exercise, so I used this instead:
https://hub.docker.com/r/zenika/alpine-chrome

Perhaps the exercise is looking for me to actually build an image from scratch, but let's make progress on all the other tasks before tackling that.

I immediately hit bot detection when just running a normal websocket request against the Docker container, so I started researching what I would need to do to avoid detection.

Ok, so found this:
https://github.com/ultrafunkamsterdam/undetected-chromedriver/

This issue covers how to pass Brave to it:
https://github.com/ultrafunkamsterdam/undetected-chromedriver/issues/806

I could set this up in the Docker container, however, I'm not sure this is the right thing.

I found this resource:
https://bot.incolumitas.com/#botChallenge

Ok, so it works! I was able to scrape Google with the =driver.py= script! I could use this, but let's see if I can just build the Docker container myself.
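For reference, a script like =driver.py= could look roughly like the sketch below. This is not the actual script, only a minimal version under assumptions: the function names =parse_args= and =fetch_html= are made up for illustration, the two container flags are the usual ones for Chrome in Docker rather than anything confirmed here, and it relies on the =browser_executable_path= and =headless= arguments of =uc.Chrome=.

#+begin_src python
"""Hypothetical sketch of driver.py: fetch a page's HTML with undetected-chromedriver."""
import sys


def parse_args(argv):
    """Split argv into (browser_path, url); both arguments are required."""
    if len(argv) != 3:
        raise SystemExit(f"usage: {argv[0]} <browser-path> <url>")
    return argv[1], argv[2]


def fetch_html(browser_path, url):
    """Launch the given browser headlessly and return the rendered HTML."""
    # Imported lazily so the argument handling works without the package installed.
    import undetected_chromedriver as uc

    options = uc.ChromeOptions()
    # Assumption: flags commonly needed when Chrome runs inside a container.
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")

    driver = uc.Chrome(
        options=options,
        browser_executable_path=browser_path,
        headless=True,
    )
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always shut the browser down, even if the page load fails.
        driver.quit()
#+end_src

To use it as a script, the last line would be something like =print(fetch_html(*parse_args(sys.argv)))=, called as =python driver.py <browser-path> <url>=.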
https://hub.docker.com/r/ultrafunk/undetected-chromedriver

Setting this up with the underlying Dockerfile, but I'm hitting this issue:

#+begin_example
/app $ python driver.py /usr/bin/chromium-browser https://ferano.io
Traceback (most recent call last):
  File "/app/driver.py", line 12, in <module>
    driver = uc.Chrome(
             ^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/undetected_chromedriver/__init__.py", line 466, in __init__
    super(Chrome, self).__init__(
  File "/usr/lib/python3.11/site-packages/selenium/webdriver/chrome/webdriver.py", line 47, in __init__
    super().__init__(
  File "/usr/lib/python3.11/site-packages/selenium/webdriver/chromium/webdriver.py", line 69, in __init__
    super().__init__(command_executor=executor, options=options)
  File "/usr/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 261, in __init__
    self.start_session(capabilities)
  File "/usr/lib/python3.11/site-packages/undetected_chromedriver/__init__.py", line 724, in start_session
    super(selenium.webdriver.chrome.webdriver.WebDriver, self).start_session(
  File "/usr/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 362, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 454, in execute
    self.error_handler.check_response(response)
  File "/usr/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 232, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: cannot connect to chrome at 127.0.0.1:48747
from session not created: This version of ChromeDriver only supports Chrome version 138
Current browser version is 124.0.6367.78; For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors#sessionnotcreatedexception
#+end_example

The ChromeDriver in the image only supports Chrome 138, but the installed Chromium is 124. So now I need a
new Docker image:
https://hub.docker.com/r/selenium/standalone-chrome

Updated the Dockerfile. Now this works, I get back my website's HTML!

#+begin_src sh
docker exec -it search-api python driver.py /usr/bin/google-chrome https://ferano.io
#+end_src
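The =SessionNotCreatedException= above came down to a major-version mismatch (driver 138 vs. browser 124). A check like the sketch below could catch that before launching a session. The helper names are hypothetical, and it assumes that running the browser binary with =--version= prints a string containing a four-part version such as =124.0.6367.78=.

#+begin_src python
import re
import subprocess


def major_version(version_text):
    """Return the major number of the first four-part version in the text."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    if match is None:
        raise ValueError(f"no Chrome version found in {version_text!r}")
    return int(match.group(1))


def installed_major(browser_path):
    """Ask the browser binary for its version and return the major number."""
    # Assumption: the binary supports --version and prints e.g. "Chromium 124.0.6367.78".
    result = subprocess.run(
        [browser_path, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return major_version(result.stdout)


def versions_match(browser_version_text, driver_major):
    """True when the browser's major version equals the driver's supported major."""
    return major_version(browser_version_text) == driver_major
#+end_src

With the values from the traceback, =versions_match("Current browser version is 124.0.6367.78", 138)= is false. =uc.Chrome= also accepts a =version_main= argument that might be used to pin the driver to the installed browser's major version, though I did not try that here.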