* Setting up
I read the following on https://hub.docker.com/r/browserless/chrome:

#+begin_quote
Getting Chrome running well in docker is also a challenge as there's quiet a few packages you need in order to get Chrome running. Once that's done then there's still missing fonts, getting libraries to work with it, and having limitations on service reliability.
#+end_quote

It made me think twice about setting Chrome up myself, so I just grabbed this image for now.
- I realized soon enough that ws://localhost:3000 is browserless' own API, so I went
  and tried to figure out how to get the websocket for the Chrome DevTools. It
  turns out I need to launch an instance first.
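
The step above (launch a Chrome instance first, then ask it for its DevTools websocket) can be sketched roughly as follows. This is a minimal sketch under assumptions that are not in these notes: Chrome is running with the =--remote-debugging-port= flag set to 9222 and is reachable on =localhost=, and the helper names =extract_ws_url= and =fetch_ws_url= are made up for illustration. It relies on Chrome's DevTools HTTP endpoint =/json/version=, whose JSON response carries the browser-level websocket in the =webSocketDebuggerUrl= field.

#+begin_src python
import json
from urllib.request import urlopen


def extract_ws_url(payload: str) -> str:
    """Pull the browser websocket URL out of a /json/version response body."""
    info = json.loads(payload)
    return info["webSocketDebuggerUrl"]


def fetch_ws_url(host: str = "localhost", port: int = 9222) -> str:
    """Ask a running Chrome instance for its DevTools websocket URL.

    Assumes Chrome was launched with remote debugging enabled on `port`
    (9222 is an assumed value, not one taken from the notes).
    """
    with urlopen(f"http://{host}:{port}/json/version", timeout=5) as resp:
        return extract_ws_url(resp.read().decode("utf-8"))
#+end_src

Calling =fetch_ws_url()= only works once an instance is actually up; until then the request simply fails with a connection error, which matches the observation that an instance has to be launched first.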
Browserless has an API, but after going through the documentation I quickly felt
that using it would probably defeat the purpose of the exercise, so I used this
image instead:
https://hub.docker.com/r/zenika/alpine-chrome
Perhaps the exercise is looking for me to actually build an image from scratch,
but let's make progress on all the other tasks before tackling that.
OK, so I found this:
https://github.com/ultrafunkamsterdam/undetected-chromedriver/
This is how to pass Brave to the URL:
https://github.com/ultrafunkamsterdam/undetected-chromedriver/issues/806
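
One possible way to point undetected-chromedriver at Brave is sketched below. This is a sketch under assumptions, not necessarily the approach described in the linked issue: it relies on the =browser_executable_path= argument of =uc.Chrome=, the executable names searched are guesses that vary by platform, and the helpers =find_brave= and =make_brave_driver= are hypothetical names introduced here. The installed Brave and chromedriver versions may also need to match, which this sketch does not check.

#+begin_src python
import shutil
from typing import Optional, Sequence

# Guessed executable names; adjust for the platform or image in use.
BRAVE_NAMES = ("brave-browser", "brave", "brave-browser-stable")


def find_brave(names: Sequence[str] = BRAVE_NAMES) -> Optional[str]:
    """Return the first matching executable found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


def make_brave_driver(binary_path: str):
    """Start undetected-chromedriver against the given browser binary."""
    # Imported lazily so find_brave works without the package installed.
    import undetected_chromedriver as uc

    return uc.Chrome(browser_executable_path=binary_path)
#+end_src

Inside a container, =find_brave()= returning =None= would be a quick sign that the browser is not installed in the image.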
I could set this up in the Docker container; however, I'm not sure this is the
right thing to do.
I found this resource:
https://bot.incolumitas.com/#botChallenge
OK, so it works! I was able to scrape Google with the =driver.py= script!